SCHAUM’S
outlines
Linear Algebra
Fourth Edition
Seymour Lipschutz, Ph.D.
Temple University
Marc Lars Lipson, Ph.D.
University of Virginia
Schaum’s Outline Series
New York Chicago San Francisco Lisbon London Madrid
Mexico City Milan New Delhi San Juan
Seoul Singapore Sydney Toronto
Copyright © 2009, 2001, 1991, 1968 by The McGraw-Hill Companies, Inc. All rights reserved. Except as permitted under the United States Copyright Act of 1976, no part of this publication may be reproduced or distributed in any form or by any means, or stored in a database or retrieval system, without the prior written permission of the publisher.
ISBN: 978-0-07-154353-8
MHID: 0-07-154353-8
The material in this eBook also appears in the print version of this title: ISBN: 978-0-07-154352-1, MHID: 0-07-154352-X.
All trademarks are trademarks of their respective owners. Rather than put a trademark symbol after every occurrence of a trademarked name, we use names in an
editorial fashion only, and to the benefit of the trademark owner, with no intention of infringement of the trademark. Where such designations appear in this book,
they have been printed with initial caps.
McGraw-Hill eBooks are available at special quantity discounts to use as premiums and sales promotions, or for use in corporate training programs. To contact
a representative please e-mail us at bulksales@mcgraw-hill.com.
TERMS OF USE
This is a copyrighted work and The McGraw-Hill Companies, Inc. (“McGraw-Hill”) and its licensors reserve all rights in and to the work. Use of this work is
subject to these terms. Except as permitted under the Copyright Act of 1976 and the right to store and retrieve one copy of the work, you may not decompile,
disassemble, reverse engineer, reproduce, modify, create derivative works based upon, transmit, distribute, disseminate, sell, publish or sublicense the work or any
part of it without McGraw-Hill’s prior consent. You may use the work for your own noncommercial and personal use; any other use of the work is strictly
prohibited. Your right to use the work may be terminated if you fail to comply with these terms.
THE WORK IS PROVIDED “AS IS.” McGRAW-HILL AND ITS LICENSORS MAKE NO GUARANTEES OR WARRANTIES AS TO THE ACCURACY,
ADEQUACY OR COMPLETENESS OF OR RESULTS TO BE OBTAINED FROM USING THE WORK, INCLUDING ANY INFORMATION THAT CAN
BE ACCESSED THROUGH THE WORK VIA HYPERLINK OR OTHERWISE, AND EXPRESSLY DISCLAIM ANY WARRANTY, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
McGraw-Hill and its licensors do not warrant or guarantee that the functions contained in the work will meet your requirements or that its operation will be
uninterrupted or error free. Neither McGraw-Hill nor its licensors shall be liable to you or anyone else for any inaccuracy, error or omission, regardless of cause,
in the work or for any damages resulting therefrom. McGraw-Hill has no responsibility for the content of any information accessed through the work. Under no
circumstances shall McGraw-Hill and/or its licensors be liable for any indirect, incidental, special, punitive, consequential or similar damages that result from the
use of or inability to use the work, even if any of them has been advised of the possibility of such damages. This limitation of liability shall apply to any claim
or cause whatsoever whether such claim or cause arises in contract, tort or otherwise.
Preface
Linear algebra has in recent years become an essential part of the mathematical background required by
mathematicians and mathematics teachers, engineers, computer scientists, physicists, economists, and
statisticians, among others. This requirement reflects the importance and wide applications of the subject
matter.
This book is designed for use as a textbook for a formal course in linear algebra or as a supplement to all
current standard texts. It aims to present an introduction to linear algebra which will be found helpful to all
readers regardless of their fields of specialization. More material has been included than can be covered in most
first courses. This has been done to make the book more flexible, to provide a useful book of reference, and to
stimulate further interest in the subject.
Each chapter begins with clear statements of pertinent definitions, principles, and theorems together with
illustrative and other descriptive material. This is followed by graded sets of solved and supplementary
problems. The solved problems serve to illustrate and amplify the theory, and to provide the repetition of basic
principles so vital to effective learning. Numerous proofs, especially those of all essential theorems, are
included among the solved problems. The supplementary problems serve as a complete review of the material
of each chapter.
The first three chapters treat vectors in Euclidean space, matrix algebra, and systems of linear equations.
These chapters provide the motivation and basic computational tools for the abstract investigations of vector
spaces and linear mappings which follow. After chapters on inner product spaces and orthogonality and on
determinants, there is a detailed discussion of eigenvalues and eigenvectors giving conditions for representing
a linear operator by a diagonal matrix. This naturally leads to the study of various canonical forms,
specifically, the triangular, Jordan, and rational canonical forms. Later chapters cover linear functionals and
the dual space V*, and bilinear, quadratic, and Hermitian forms. The last chapter treats linear operators on
inner product spaces.
The main changes in the fourth edition have been in the appendices. First of all, we have expanded
Appendix A on the tensor and exterior products of vector spaces where we have now included proofs on the
existence and uniqueness of such products. We also added appendices covering algebraic structures, including
modules, and polynomials over a field. Appendix D, ‘‘Odds and Ends,’’ includes the Moore–Penrose
generalized inverse which appears in various applications, such as statistics. There are also many additional
solved and supplementary problems.
Finally, we wish to thank the staff of the McGraw-Hill Schaum’s Outline Series, especially Charles Wall,
for their unfailing cooperation.
SEYMOUR LIPSCHUTZ
MARC LARS LIPSON
Contents
CHAPTER 1  Vectors in R^n and C^n, Spatial Vectors
1.1 Introduction  1.2 Vectors in R^n  1.3 Vector Addition and Scalar Multiplication  1.4 Dot (Inner) Product  1.5 Located Vectors, Hyperplanes, Lines, Curves in R^n  1.6 Vectors in R^3 (Spatial Vectors), ijk Notation  1.7 Complex Numbers  1.8 Vectors in C^n
CHAPTER 2  Algebra of Matrices
2.1 Introduction  2.2 Matrices  2.3 Matrix Addition and Scalar Multiplication  2.4 Summation Symbol  2.5 Matrix Multiplication  2.6 Transpose of a Matrix  2.7 Square Matrices  2.8 Powers of Matrices, Polynomials in Matrices  2.9 Invertible (Nonsingular) Matrices  2.10 Special Types of Square Matrices  2.11 Complex Matrices  2.12 Block Matrices
CHAPTER 3  Systems of Linear Equations
3.1 Introduction  3.2 Basic Definitions, Solutions  3.3 Equivalent Systems, Elementary Operations  3.4 Small Square Systems of Linear Equations  3.5 Systems in Triangular and Echelon Forms  3.6 Gaussian Elimination  3.7 Echelon Matrices, Row Canonical Form, Row Equivalence  3.8 Gaussian Elimination, Matrix Formulation  3.9 Matrix Equation of a System of Linear Equations  3.10 Systems of Linear Equations and Linear Combinations of Vectors  3.11 Homogeneous Systems of Linear Equations  3.12 Elementary Matrices  3.13 LU Decomposition
CHAPTER 4  Vector Spaces
4.1 Introduction  4.2 Vector Spaces  4.3 Examples of Vector Spaces  4.4 Linear Combinations, Spanning Sets  4.5 Subspaces  4.6 Linear Spans, Row Space of a Matrix  4.7 Linear Dependence and Independence  4.8 Basis and Dimension  4.9 Application to Matrices, Rank of a Matrix  4.10 Sums and Direct Sums  4.11 Coordinates
CHAPTER 5  Linear Mappings
5.1 Introduction  5.2 Mappings, Functions  5.3 Linear Mappings (Linear Transformations)  5.4 Kernel and Image of a Linear Mapping  5.5 Singular and Nonsingular Linear Mappings, Isomorphisms  5.6 Operations with Linear Mappings  5.7 Algebra A(V) of Linear Operators
CHAPTER 6  Linear Mappings and Matrices
6.1 Introduction  6.2 Matrix Representation of a Linear Operator  6.3 Change of Basis  6.4 Similarity  6.5 Matrices and General Linear Mappings
CHAPTER 7  Inner Product Spaces, Orthogonality
7.1 Introduction  7.2 Inner Product Spaces  7.3 Examples of Inner Product Spaces  7.4 Cauchy–Schwarz Inequality, Applications  7.5 Orthogonality  7.6 Orthogonal Sets and Bases  7.7 Gram–Schmidt Orthogonalization Process  7.8 Orthogonal and Positive Definite Matrices  7.9 Complex Inner Product Spaces  7.10 Normed Vector Spaces (Optional)
CHAPTER 8  Determinants
8.1 Introduction  8.2 Determinants of Orders 1 and 2  8.3 Determinants of Order 3  8.4 Permutations  8.5 Determinants of Arbitrary Order  8.6 Properties of Determinants  8.7 Minors and Cofactors  8.8 Evaluation of Determinants  8.9 Classical Adjoint  8.10 Applications to Linear Equations, Cramer's Rule  8.11 Submatrices, Minors, Principal Minors  8.12 Block Matrices and Determinants  8.13 Determinants and Volume  8.14 Determinant of a Linear Operator  8.15 Multilinearity and Determinants
CHAPTER 9  Diagonalization: Eigenvalues and Eigenvectors
9.1 Introduction  9.2 Polynomials of Matrices  9.3 Characteristic Polynomial, Cayley–Hamilton Theorem  9.4 Diagonalization, Eigenvalues and Eigenvectors  9.5 Computing Eigenvalues and Eigenvectors, Diagonalizing Matrices  9.6 Diagonalizing Real Symmetric Matrices and Quadratic Forms  9.7 Minimal Polynomial  9.8 Characteristic and Minimal Polynomials of Block Matrices
CHAPTER 10  Canonical Forms
10.1 Introduction  10.2 Triangular Form  10.3 Invariance  10.4 Invariant Direct-Sum Decompositions  10.5 Primary Decomposition  10.6 Nilpotent Operators  10.7 Jordan Canonical Form  10.8 Cyclic Subspaces  10.9 Rational Canonical Form  10.10 Quotient Spaces
CHAPTER 11  Linear Functionals and the Dual Space
11.1 Introduction  11.2 Linear Functionals and the Dual Space  11.3 Dual Basis  11.4 Second Dual Space  11.5 Annihilators  11.6 Transpose of a Linear Mapping
CHAPTER 12  Bilinear, Quadratic, and Hermitian Forms
12.1 Introduction  12.2 Bilinear Forms  12.3 Bilinear Forms and Matrices  12.4 Alternating Bilinear Forms  12.5 Symmetric Bilinear Forms, Quadratic Forms  12.6 Real Symmetric Bilinear Forms, Law of Inertia  12.7 Hermitian Forms
CHAPTER 13  Linear Operators on Inner Product Spaces
13.1 Introduction  13.2 Adjoint Operators  13.3 Analogy Between A(V) and C, Special Linear Operators  13.4 Self-Adjoint Operators  13.5 Orthogonal and Unitary Operators  13.6 Orthogonal and Unitary Matrices  13.7 Change of Orthonormal Basis  13.8 Positive Definite and Positive Operators  13.9 Diagonalization and Canonical Forms in Inner Product Spaces  13.10 Spectral Theorem
APPENDIX A  Multilinear Products
APPENDIX B  Algebraic Structures
APPENDIX C  Polynomials over a Field
APPENDIX D  Odds and Ends
List of Symbols
Index
CHAPTER 1
Vectors in R^n and C^n, Spatial Vectors
1.1 Introduction
There are two ways to motivate the notion of a vector: one is by means of lists of numbers and subscripts,
and the other is by means of certain objects in physics. We discuss these two ways below.
Here we assume the reader is familiar with the elementary properties of the field of real numbers,
denoted by R. On the other hand, we will review properties of the field of complex numbers, denoted by
C. In the context of vectors, the elements of our number fields are called scalars.
Although we will restrict ourselves in this chapter to vectors whose elements come from R and then
from C, many of our operations also apply to vectors whose entries come from some arbitrary field K.
Lists of Numbers
Suppose the weights (in pounds) of eight students are listed as follows:
    156, 125, 145, 134, 178, 145, 162, 193
One can denote all the values in the list using only one symbol, say w, but with different subscripts; that is,
    w1, w2, w3, w4, w5, w6, w7, w8
Observe that each subscript denotes the position of the value in the list. For example,
    w1 = 156, the first number;   w2 = 125, the second number;   ...
Such a list of values,
    w = (w1, w2, w3, ..., w8)
is called a linear array or vector.
Vectors in Physics
Many physical quantities, such as temperature and speed, possess only ‘‘magnitude.’’ These quantities
can be represented by real numbers and are called scalars. On the other hand, there are also quantities,
such as force and velocity, that possess both ‘‘magnitude’’ and ‘‘direction.’’ These quantities, which can
be represented by arrows having appropriate lengths and directions and emanating from some given
reference point O, are called vectors.
Now we assume the reader is familiar with the space R^3, where all the points in space are represented by ordered triples of real numbers. Suppose the origin of the axes in R^3 is chosen as the reference point O for the vectors discussed above. Then every vector is uniquely determined by the coordinates of its endpoint, and vice versa.
There are two important operations, vector addition and scalar multiplication, associated with vectors
in physics. The definition of these operations and the relationship between these operations and the
endpoints of the vectors are as follows.
(i) Vector Addition: The resultant u + v of two vectors u and v is obtained by the parallelogram law; that is, u + v is the diagonal of the parallelogram formed by u and v. Furthermore, if (a, b, c) and (a', b', c') are the endpoints of the vectors u and v, then (a + a', b + b', c + c') is the endpoint of the vector u + v. These properties are pictured in Fig. 1-1(a).
(ii) Scalar Multiplication: The product ku of a vector u by a real number k is obtained by multiplying the magnitude of u by k and retaining the same direction if k > 0 or the opposite direction if k < 0. Also, if (a, b, c) is the endpoint of the vector u, then (ka, kb, kc) is the endpoint of the vector ku. These properties are pictured in Fig. 1-1(b).
Mathematically, we identify the vector u with its endpoint (a, b, c) and write u = (a, b, c). Moreover, we call the ordered triple (a, b, c) of real numbers a point or vector depending upon its interpretation. We generalize this notion and call an n-tuple (a1, a2, ..., an) of real numbers a vector. However, special notation may be used for the vectors in R^3 called spatial vectors (Section 1.6).
1.2 Vectors in R^n
The set of all n-tuples of real numbers, denoted by R^n, is called n-space. A particular n-tuple in R^n, say
    u = (a1, a2, ..., an)
is called a point or vector. The numbers ai are called the coordinates, components, entries, or elements of u. Moreover, when discussing the space R^n, we use the term scalar for the elements of R.
Two vectors, u and v, are equal, written u = v, if they have the same number of components and if the corresponding components are equal. Although the vectors (1, 2, 3) and (2, 3, 1) contain the same three numbers, these vectors are not equal because corresponding entries are not equal.
The vector (0, 0, ..., 0) whose entries are all 0 is called the zero vector and is usually denoted by 0.
EXAMPLE 1.1
(a) The following are vectors:
    (2, -5),   (7, 9),   (0, 0, 0),   (3, 4, 5)
The first two vectors belong to R^2, whereas the last two belong to R^3. The third is the zero vector in R^3.
(b) Find x, y, z such that (x - y, x + y, z - 1) = (4, 2, 3).
By definition of equality of vectors, corresponding entries must be equal. Thus,
    x - y = 4,   x + y = 2,   z - 1 = 3
Solving the above system of equations yields x = 3, y = -1, z = 4.
Figure 1-1

Column Vectors
Sometimes a vector in n-space R^n is written vertically rather than horizontally. Such a vector is called a column vector, and, in this context, the horizontally written vectors in Example 1.1 are called row vectors. For example, the following are column vectors with 2, 2, 3, and 3 components, respectively:
    [1]    [ 3]    [ 1]    [ 1.5]
    [2],   [-4],   [ 5],   [ 2/3]
                   [-6]    [-15 ]
We also note that any operation defined for row vectors is defined analogously for column vectors.
1.3 Vector Addition and Scalar Multiplication
Consider two vectors u and v in R^n, say
    u = (a1, a2, ..., an)   and   v = (b1, b2, ..., bn)
Their sum, written u + v, is the vector obtained by adding corresponding components from u and v. That is,
    u + v = (a1 + b1, a2 + b2, ..., an + bn)
The scalar product or, simply, product, of the vector u by a real number k, written ku, is the vector obtained by multiplying each component of u by k. That is,
    ku = k(a1, a2, ..., an) = (ka1, ka2, ..., kan)
Observe that u + v and ku are also vectors in R^n. The sum of vectors with different numbers of components is not defined.
Negatives and subtraction are defined in R^n as follows:
    -u = (-1)u   and   u - v = u + (-v)
The vector -u is called the negative of u, and u - v is called the difference of u and v.
Now suppose we are given vectors u1, u2, ..., um in R^n and scalars k1, k2, ..., km in R. We can multiply the vectors by the corresponding scalars and then add the resultant scalar products to form the vector
    v = k1u1 + k2u2 + k3u3 + ... + kmum
Such a vector v is called a linear combination of the vectors u1, u2, ..., um.
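These componentwise operations translate directly into code. The following is a minimal sketch in Python (the function names and the plain-list representation are illustrative choices, not from the text); the sample values match Example 1.2(a) below.

    def add(u, v):
        # Componentwise sum; defined only when u and v have the same length.
        assert len(u) == len(v), "sum of vectors with different numbers of components is not defined"
        return [a + b for a, b in zip(u, v)]

    def scalar_mult(k, u):
        # Multiply each component of u by the scalar k.
        return [k * a for a in u]

    def linear_combination(scalars, vectors):
        # k1*u1 + k2*u2 + ... + km*um
        result = [0] * len(vectors[0])
        for k, u in zip(scalars, vectors):
            result = add(result, scalar_mult(k, u))
        return result

    u, v = [2, 4, -5], [1, -6, 9]
    print(add(u, v))                              # [3, -2, 4]
    print(scalar_mult(7, u))                      # [14, 28, -35]
    print(linear_combination([3, -5], [u, v]))    # [1, 42, -60]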
EXAMPLE 1.2
(a) Let u = (2, 4, -5) and v = (1, -6, 9). Then
    u + v = (2 + 1, 4 + (-6), -5 + 9) = (3, -2, 4)
    7u = (7(2), 7(4), 7(-5)) = (14, 28, -35)
    -v = (-1)(1, -6, 9) = (-1, 6, -9)
    3u - 5v = (6, 12, -15) + (-5, 30, -45) = (1, 42, -60)
(b) The zero vector 0 = (0, 0, ..., 0) in R^n is similar to the scalar 0 in that, for any vector u = (a1, a2, ..., an),
    u + 0 = (a1 + 0, a2 + 0, ..., an + 0) = (a1, a2, ..., an) = u
(c) Let u and v be the column vectors with entries (2, 3, -4) and (3, -1, -2), respectively. Then
    2u - 3v = (4, 6, -8) + (-9, 3, 6) = (-5, 9, -2)   (written as columns)
Basic properties of vectors under the operations of vector addition and scalar multiplication are described in the following theorem.
THEOREM 1.1: For any vectors u, v, w in R^n and any scalars k, k' in R,
    (i) (u + v) + w = u + (v + w),      (v) k(u + v) = ku + kv,
    (ii) u + 0 = u,                     (vi) (k + k')u = ku + k'u,
    (iii) u + (-u) = 0,                 (vii) (kk')u = k(k'u),
    (iv) u + v = v + u,                 (viii) 1u = u.
We postpone the proof of Theorem 1.1 until Chapter 2, where it appears in the context of matrices (Problem 2.3).
Suppose u and v are vectors in R^n for which u = kv for some nonzero scalar k in R. Then u is called a multiple of v. Also, u is said to be in the same or opposite direction as v according to whether k > 0 or k < 0.
1.4 Dot (Inner) Product
Consider arbitrary vectors u and v in R^n; say,
    u = (a1, a2, ..., an)   and   v = (b1, b2, ..., bn)
The dot product or inner product or scalar product of u and v is denoted and defined by
    u · v = a1b1 + a2b2 + ... + anbn
That is, u · v is obtained by multiplying corresponding components and adding the resulting products. The vectors u and v are said to be orthogonal (or perpendicular) if their dot product is zero—that is, if u · v = 0.
EXAMPLE 1.3
(a) Let u = (1, -2, 3), v = (4, 5, -1), w = (2, 7, 4). Then
    u · v = 1(4) - 2(5) + 3(-1) = 4 - 10 - 3 = -9
    u · w = 2 - 14 + 12 = 0,     v · w = 8 + 35 - 4 = 39
Thus, u and w are orthogonal.
(b) Let u and v be the column vectors of Example 1.2(c), with entries (2, 3, -4) and (3, -1, -2). Then u · v = 6 - 3 + 8 = 11.
(c) Suppose u = (1, 2, 3, 4) and v = (6, k, -8, 2). Find k so that u and v are orthogonal.
First obtain u · v = 6 + 2k - 24 + 8 = -10 + 2k. Then set u · v = 0 and solve for k:
    -10 + 2k = 0   or   2k = 10   or   k = 5
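The dot product and the orthogonality test reduce to one-line computations. A minimal sketch in Python (function names are illustrative, not from the text), using the values of Example 1.3(a):

    def dot(u, v):
        # u . v = a1*b1 + a2*b2 + ... + an*bn
        assert len(u) == len(v)
        return sum(a * b for a, b in zip(u, v))

    def orthogonal(u, v):
        # u and v are orthogonal exactly when their dot product is zero.
        return dot(u, v) == 0

    u, v, w = [1, -2, 3], [4, 5, -1], [2, 7, 4]
    print(dot(u, v))          # -9
    print(orthogonal(u, w))   # True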
Basic properties of the dot product in R^n (proved in Problem 1.13) follow.
THEOREM 1.2: For any vectors u, v, w in R^n and any scalar k in R:
    (i) (u + v) · w = u · w + v · w,    (iii) u · v = v · u,
    (ii) (ku) · v = k(u · v),           (iv) u · u ≥ 0, and u · u = 0 iff u = 0.
Note that (ii) says that we can "take k out" from the first position in an inner product. By (iii) and (ii),
    u · (kv) = (kv) · u = k(v · u) = k(u · v)
That is, we can also "take k out" from the second position in an inner product.
The space R^n with the above operations of vector addition, scalar multiplication, and dot product is usually called Euclidean n-space.
Norm (Length) of a Vector
The norm or length of a vector u in R^n, denoted by ||u||, is defined to be the nonnegative square root of u · u. In particular, if u = (a1, a2, ..., an), then
    ||u|| = √(u · u) = √(a1^2 + a2^2 + ... + an^2)
That is, ||u|| is the square root of the sum of the squares of the components of u. Thus, ||u|| ≥ 0, and ||u|| = 0 if and only if u = 0.
A vector u is called a unit vector if ||u|| = 1 or, equivalently, if u · u = 1. For any nonzero vector v in R^n, the vector
    v̂ = (1/||v||)v = v/||v||
is the unique unit vector in the same direction as v. The process of finding v̂ from v is called normalizing v.
EXAMPLE 1.4
(a) Suppose u = (1, -2, -4, 5, 3). To find ||u||, we can first find ||u||^2 = u · u by squaring each component of u and adding, as follows:
    ||u||^2 = 1^2 + (-2)^2 + (-4)^2 + 5^2 + 3^2 = 1 + 4 + 16 + 25 + 9 = 55
Then ||u|| = √55.
(b) Let v = (1, -3, 4, 2) and w = (1/2, -1/6, 5/6, 1/6). Then
    ||v|| = √(1 + 9 + 16 + 4) = √30   and   ||w|| = √(9/36 + 1/36 + 25/36 + 1/36) = √(36/36) = √1 = 1
Thus w is a unit vector, but v is not a unit vector. However, we can normalize v as follows:
    v̂ = v/||v|| = (1/√30, -3/√30, 4/√30, 2/√30)
This is the unique unit vector in the same direction as v.
The following formula (proved in Problem 1.14) is known as the Schwarz inequality or Cauchy–Schwarz inequality. It is used in many branches of mathematics.
THEOREM 1.3 (Schwarz): For any vectors u, v in R^n, |u · v| ≤ ||u|| ||v||.
Using the above inequality, we also prove (Problem 1.15) the following result known as the "triangle inequality" or Minkowski's inequality.
THEOREM 1.4 (Minkowski): For any vectors u, v in R^n, ||u + v|| ≤ ||u|| + ||v||.

Distance, Angles, Projections
The distance between vectors u = (a1, a2, ..., an) and v = (b1, b2, ..., bn) in R^n is denoted and defined by
    d(u, v) = ||u - v|| = √((a1 - b1)^2 + (a2 - b2)^2 + ... + (an - bn)^2)
One can show that this definition agrees with the usual notion of distance in the Euclidean plane R^2 or space R^3.
The angle θ between nonzero vectors u, v in R^n is defined by
    cos θ = (u · v)/(||u|| ||v||)
This definition is well defined, because, by the Schwarz inequality (Theorem 1.3),
    -1 ≤ (u · v)/(||u|| ||v||) ≤ 1
Note that if u · v = 0, then θ = 90° (or θ = π/2). This then agrees with our previous definition of orthogonality.
The projection of a vector u onto a nonzero vector v is the vector denoted and defined by
    proj(u, v) = ((u · v)/||v||^2) v = ((u · v)/(v · v)) v
We show below that this agrees with the usual notion of vector projection in physics.
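The three formulas above are easy to check numerically. A minimal Python sketch (helper names are ours), using the vectors of Example 1.5(a) that follows:

    import math

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    def norm(u):
        return math.sqrt(dot(u, u))

    def distance(u, v):
        # d(u, v) = ||u - v||
        return norm([a - b for a, b in zip(u, v)])

    def angle(u, v):
        # cos(theta) = (u . v) / (||u|| ||v||); returns theta in radians
        return math.acos(dot(u, v) / (norm(u) * norm(v)))

    def proj(u, v):
        # proj(u, v) = ((u . v) / ||v||^2) v
        c = dot(u, v) / dot(v, v)
        return [c * b for b in v]

    u, v = [1, -2, 3], [2, 4, 5]
    print(distance(u, v))          # 6.403... = sqrt(41)
    print(math.cos(angle(u, v)))   # 0.3586... = 9 / (sqrt(14)*sqrt(45))
    print(proj(u, v))              # [0.4, 0.8, 1.0] = (2/5, 4/5, 1)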
EXAMPLE 1.5
(a) Suppose u = (1, -2, 3) and v = (2, 4, 5). Then
    d(u, v) = √((1 - 2)^2 + (-2 - 4)^2 + (3 - 5)^2) = √(1 + 36 + 4) = √41
To find cos θ, where θ is the angle between u and v, we first find
    u · v = 2 - 8 + 15 = 9,   ||u||^2 = 1 + 4 + 9 = 14,   ||v||^2 = 4 + 16 + 25 = 45
Then
    cos θ = (u · v)/(||u|| ||v||) = 9/(√14 √45)
Also,
    proj(u, v) = ((u · v)/||v||^2) v = (9/45)(2, 4, 5) = (1/5)(2, 4, 5) = (2/5, 4/5, 1)
(b) Consider the vectors u and v in Fig. 1-2(a) (with respective endpoints A and B). The (perpendicular) projection of u onto v is the vector u* with magnitude
    ||u*|| = ||u|| cos θ = ||u|| (u · v)/(||u|| ||v||) = (u · v)/||v||
To obtain u*, we multiply its magnitude by the unit vector in the direction of v, obtaining
    u* = ||u*|| (v/||v||) = ((u · v)/||v||^2) v
This is the same as the above definition of proj(u, v).
Figure 1-2  (a) The projection u* of u onto v; (b) the vector u = B - A from A(a1, a2, a3) to B(b1, b2, b3), with endpoint P(b1 - a1, b2 - a2, b3 - a3).
1.5 Located Vectors, Hyperplanes, Lines, Curves in R^n
This section distinguishes between an n-tuple P(ai) ≡ P(a1, a2, ..., an) viewed as a point in R^n and an n-tuple u = [c1, c2, ..., cn] viewed as a vector (arrow) from the origin O to the point C(c1, c2, ..., cn).

Located Vectors
Any pair of points A(ai) and B(bi) in R^n defines the located vector or directed line segment from A to B, written AB→. We identify AB→ with the vector
    u = B - A = [b1 - a1, b2 - a2, ..., bn - an]
because AB→ and u have the same magnitude and direction. This is pictured in Fig. 1-2(b) for the points A(a1, a2, a3) and B(b1, b2, b3) in R^3 and the vector u = B - A, which has the endpoint P(b1 - a1, b2 - a2, b3 - a3).
Hyperplanes
A hyperplane H in R^n is the set of points (x1, x2, ..., xn) that satisfy a linear equation
    a1x1 + a2x2 + ... + anxn = b
where the vector u = [a1, a2, ..., an] of coefficients is not zero. Thus a hyperplane H in R^2 is a line, and a hyperplane H in R^3 is a plane. We show below, as pictured in Fig. 1-3(a) for R^3, that u is orthogonal to any directed line segment PQ→, where P(pi) and Q(qi) are points in H. [For this reason, we say that u is normal to H and that H is normal to u.]
Because P(pi) and Q(qi) belong to H, they satisfy the above hyperplane equation—that is,
    a1p1 + a2p2 + ... + anpn = b   and   a1q1 + a2q2 + ... + anqn = b
Let
    v = PQ→ = Q - P = [q1 - p1, q2 - p2, ..., qn - pn]
Then
    u · v = a1(q1 - p1) + a2(q2 - p2) + ... + an(qn - pn)
          = (a1q1 + a2q2 + ... + anqn) - (a1p1 + a2p2 + ... + anpn) = b - b = 0
Thus v = PQ→ is orthogonal to u, as claimed.
Figure 1-3
Lines in R^n
The line L in R^n passing through the point P(b1, b2, ..., bn) and in the direction of a nonzero vector u = [a1, a2, ..., an] consists of the points X(x1, x2, ..., xn) that satisfy
    X = P + tu,   that is,   x1 = a1t + b1,  x2 = a2t + b2,  ...,  xn = ant + bn,   or   L(t) = (ait + bi)
where the parameter t takes on all real values. Such a line L in R^3 is pictured in Fig. 1-3(b).
EXAMPLE 1.6
(a) Let H be the plane in R^3 corresponding to the linear equation 2x - 5y + 7z = 4. Observe that P(1, 1, 1) and Q(5, 4, 2) are solutions of the equation. Thus P and Q and the directed line segment
    v = PQ→ = Q - P = [5 - 1, 4 - 1, 2 - 1] = [4, 3, 1]
lie on the plane H. The vector u = [2, -5, 7] is normal to H, and, as expected,
    u · v = [2, -5, 7] · [4, 3, 1] = 8 - 15 + 7 = 0
That is, u is orthogonal to v.
(b) Find an equation of the hyperplane H in R^4 that passes through the point P(1, 3, -4, 2) and is normal to the vector u = [4, -2, 5, 6].
The coefficients of the unknowns of an equation of H are the components of the normal vector u; hence, the equation of H must be of the form
    4x1 - 2x2 + 5x3 + 6x4 = k
Substituting P into this equation, we obtain
    4(1) - 2(3) + 5(-4) + 6(2) = k   or   4 - 6 - 20 + 12 = k   or   k = -10
Thus, 4x1 - 2x2 + 5x3 + 6x4 = -10 is the equation of H.
(c) Find the parametric representation of the line L in R^4 passing through the point P(1, 2, 3, -4) and in the direction of u = [5, 6, -7, 8]. Also, find the point Q on L when t = 1.
Substitution in the above equation for L yields the following parametric representation:
    x1 = 5t + 1,   x2 = 6t + 2,   x3 = -7t + 3,   x4 = 8t - 4
or, equivalently,
    L(t) = (5t + 1, 6t + 2, -7t + 3, 8t - 4)
Note that t = 0 yields the point P on L. Substitution of t = 1 yields the point Q(6, 8, -4, 4) on L.
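Both constructions in Example 1.6 are mechanical: the hyperplane through P with normal u has the components of u as coefficients and constant k = u · P, and the line through P in direction u is t ↦ P + tu. A short Python sketch under those assumptions (the helper names are ours, not the book's):

    def hyperplane_through(P, u):
        # Returns (coefficients, k) for u1*x1 + ... + un*xn = k passing through P with normal u.
        k = sum(a * p for a, p in zip(u, P))
        return u, k

    def line_point(P, u, t):
        # L(t) = P + t*u, the point on the line through P in direction u at parameter t.
        return [p + t * a for p, a in zip(P, u)]

    # Example 1.6(b): plane through P(1, 3, -4, 2) normal to u = [4, -2, 5, 6]
    print(hyperplane_through([1, 3, -4, 2], [4, -2, 5, 6]))   # ([4, -2, 5, 6], -10)

    # Example 1.6(c): line through P(1, 2, 3, -4) in direction u = [5, 6, -7, 8]
    print(line_point([1, 2, 3, -4], [5, 6, -7, 8], 1))        # [6, 8, -4, 4] = Q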
Curves in R^n
Let D be an interval (finite or infinite) on the real line R. A continuous function F: D → R^n is a curve in R^n. Thus, to each point t ∈ D there is assigned the following point in R^n:
    F(t) = [F1(t), F2(t), ..., Fn(t)]
Moreover, the derivative (if it exists) of F(t) yields the vector
    V(t) = dF(t)/dt = [dF1(t)/dt, dF2(t)/dt, ..., dFn(t)/dt]
which is tangent to the curve. Normalizing V(t) yields
    T(t) = V(t)/||V(t)||
Thus, T(t) is the unit tangent vector to the curve. (Unit vectors with geometrical significance are often presented in bold type.)
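When F(t) is available only as a black-box function, V(t) can be approximated by a finite-difference quotient and then normalized to get T(t). This is a rough numerical sketch, not the symbolic differentiation used in the text (the step size h is an arbitrary choice); Example 1.7 below treats the same curve F(t) = [sin t, cos t, t] analytically.

    import math

    def tangent_unit(F, t, h=1e-6):
        # Approximate V(t) = dF/dt by a central difference, then normalize to get T(t).
        V = [(a - b) / (2 * h) for a, b in zip(F(t + h), F(t - h))]
        n = math.sqrt(sum(x * x for x in V))
        return [x / n for x in V]

    F = lambda t: [math.sin(t), math.cos(t), t]
    print(tangent_unit(F, 1.0))
    # approximately [cos(1)/sqrt(2), -sin(1)/sqrt(2), 1/sqrt(2)] = [0.382, -0.595, 0.707]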
EXAMPLE 1.7 Consider the curve F(t) = [sin t, cos t, t] in R^3. Taking the derivative of F(t) [or each component of F(t)] yields
    V(t) = [cos t, -sin t, 1]
which is a vector tangent to the curve. We normalize V(t). First we obtain
    ||V(t)||^2 = cos^2 t + sin^2 t + 1 = 1 + 1 = 2
Then the unit tangent vector T(t) to the curve follows:
    T(t) = V(t)/||V(t)|| = [cos t/√2, -sin t/√2, 1/√2]
1.6 Vectors in R^3 (Spatial Vectors), ijk Notation
Vectors in R^3, called spatial vectors, appear in many applications, especially in physics. In fact, a special notation is frequently used for such vectors as follows:
    i = [1, 0, 0] denotes the unit vector in the x direction,
    j = [0, 1, 0] denotes the unit vector in the y direction,
    k = [0, 0, 1] denotes the unit vector in the z direction.
Then any vector u = [a, b, c] in R^3 can be expressed uniquely in the form
    u = [a, b, c] = ai + bj + ck
Because the vectors i, j, k are unit vectors and are mutually orthogonal, we obtain the following dot products:
    i · i = 1,  j · j = 1,  k · k = 1   and   i · j = 0,  i · k = 0,  j · k = 0
Furthermore, the vector operations discussed above may be expressed in the ijk notation as follows. Suppose
    u = a1i + a2j + a3k   and   v = b1i + b2j + b3k
Then
    u + v = (a1 + b1)i + (a2 + b2)j + (a3 + b3)k   and   cu = ca1i + ca2j + ca3k
where c is a scalar. Also,
    u · v = a1b1 + a2b2 + a3b3   and   ||u|| = √(u · u) = √(a1^2 + a2^2 + a3^2)
EXAMPLE 1.8 Suppose u = 3i + 5j - 2k and v = 4i - 8j + 7k.
(a) To find u + v, add corresponding components, obtaining u + v = 7i - 3j + 5k
(b) To find 3u - 2v, first multiply by the scalars and then add:
    3u - 2v = (9i + 15j - 6k) + (-8i + 16j - 14k) = i + 31j - 20k
(c) To find u · v, multiply corresponding components and then add:
    u · v = 12 - 40 - 14 = -42
(d) To find ||u||, take the square root of the sum of the squares of the components:
    ||u|| = √(9 + 25 + 4) = √38
Cross Product
There is a special operation for vectors u and v in R^3 that is not defined in R^n for n ≠ 3. This operation is called the cross product and is denoted by u × v. One way to easily remember the formula for u × v is to use the determinant (of order two) and its negative, which are denoted and defined as follows:
    | a  b |                      | a  b |
    | c  d | = ad - bc   and   - | c  d | = bc - ad
Here a and d are called the diagonal elements and b and c are the nondiagonal elements. Thus, the determinant is the product ad of the diagonal elements minus the product bc of the nondiagonal elements, but vice versa for the negative of the determinant.
Now suppose u = a1i + a2j + a3k and v = b1i + b2j + b3k. Then
    u × v = (a2b3 - a3b2)i + (a3b1 - a1b3)j + (a1b2 - a2b1)k
          = | a2  a3 |     | a1  a3 |     | a1  a2 |
            | b2  b3 | i - | b1  b3 | j + | b1  b2 | k
That is, the three components of u × v are obtained from the array
    [ a1  a2  a3 ]
    [ b1  b2  b3 ]
(which contains the components of u above the components of v) as follows:
(1) Cover the first column and take the determinant.
(2) Cover the second column and take the negative of the determinant.
(3) Cover the third column and take the determinant.
Note that u × v is a vector; hence, u × v is also called the vector product or outer product of u and v.
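The component formula above is short enough to code directly. A minimal Python sketch (names are ours), using the vectors of Example 1.9(a) below and checking the orthogonality property stated later in Theorem 1.5(a):

    def cross(u, v):
        # u x v = (a2*b3 - a3*b2, a3*b1 - a1*b3, a1*b2 - a2*b1); defined only in R^3.
        a1, a2, a3 = u
        b1, b2, b3 = v
        return [a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1]

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    u, v = [4, 3, 6], [2, 5, -3]
    w = cross(u, v)
    print(w)                        # [-39, 24, 14]
    print(dot(u, w), dot(v, w))     # 0 0  (u x v is orthogonal to both u and v)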
EXAMPLE 1.9 Find u × v where: (a) u = 4i + 3j + 6k, v = 2i + 5j - 3k, (b) u = [2, -1, 5], v = [3, 7, 6].
(a) Use the array
    [ 4  3   6 ]
    [ 2  5  -3 ]
to get u × v = (-9 - 30)i + (12 + 12)j + (20 - 6)k = -39i + 24j + 14k
(b) Use the array
    [ 2  -1  5 ]
    [ 3   7  6 ]
to get u × v = [-6 - 35, 15 - 12, 14 + 3] = [-41, 3, 17]
Remark: The cross products of the vectors i, j, k are as follows:
    i × j = k,    j × k = i,    k × i = j
    j × i = -k,   k × j = -i,   i × k = -j
Thus, if we view the triple (i, j, k) as a cyclic permutation, where i follows k and hence k precedes i, then the product of two of them in the given direction is the third one, but the product of two of them in the opposite direction is the negative of the third one.
Two important properties of the cross product are contained in the following theorem.
THEOREM 1.5: Let u, v, w be vectors in R^3.
(a) The vector u × v is orthogonal to both u and v.
(b) The absolute value of the "triple product" u · (v × w) represents the volume of the parallelepiped formed by the vectors u, v, w. [See Fig. 1-4(a).]
We note that the vectors u, v, u × v form a right-handed system, and that the following formula gives the magnitude of u × v:
    ||u × v|| = ||u|| ||v|| sin θ
where θ is the angle between u and v.
1.7 Complex Numbers
The set of complex numbers is denoted by C. Formally, a complex number is an ordered pair (a, b) of real numbers where equality, addition, and multiplication are defined as follows:
    (a, b) = (c, d)   if and only if a = c and b = d
    (a, b) + (c, d) = (a + c, b + d)
    (a, b)(c, d) = (ac - bd, ad + bc)
We identify the real number a with the complex number (a, 0); that is,
    a ↔ (a, 0)
This is possible because the operations of addition and multiplication of real numbers are preserved under the correspondence; that is,
    (a, 0) + (b, 0) = (a + b, 0)   and   (a, 0)(b, 0) = (ab, 0)
Thus we view R as a subset of C, and replace (a, 0) by a whenever convenient and possible.
We note that the set C of complex numbers with the above operations of addition and multiplication is a field of numbers, like the set R of real numbers and the set Q of rational numbers.
Figure 1-4
The complex number (0, 1) is denoted by i. It has the important property that
    i^2 = ii = (0, 1)(0, 1) = (-1, 0) = -1   or   i = √(-1)
Accordingly, any complex number z = (a, b) can be written in the form
    z = (a, b) = (a, 0) + (0, b) = (a, 0) + (b, 0)(0, 1) = a + bi
The above notation z = a + bi, where a ≡ Re z and b ≡ Im z are called, respectively, the real and imaginary parts of z, is more convenient than (a, b). In fact, the sum and product of complex numbers z = a + bi and w = c + di can be derived by simply using the commutative and distributive laws and i^2 = -1:
    z + w = (a + bi) + (c + di) = a + c + bi + di = (a + c) + (b + d)i
    zw = (a + bi)(c + di) = ac + bci + adi + bdi^2 = (ac - bd) + (bc + ad)i
We also define the negative of z and subtraction in C by
    -z = -1z   and   w - z = w + (-z)
Warning: The letter i representing √(-1) has no relationship whatsoever to the vector i = [1, 0, 0] in Section 1.6.
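The ordered-pair definitions translate directly into code, and Python's built-in complex type can serve as a check. A small sketch (the pair-based helper names are ours, not from the text):

    def c_add(z, w):
        # (a, b) + (c, d) = (a + c, b + d)
        return (z[0] + w[0], z[1] + w[1])

    def c_mult(z, w):
        # (a, b)(c, d) = (ac - bd, ad + bc)
        a, b = z
        c, d = w
        return (a * c - b * d, a * d + b * c)

    z, w = (2, 3), (5, -2)                    # z = 2 + 3i, w = 5 - 2i, as in Example 1.10
    print(c_add(z, w), c_mult(z, w))          # (7, 1) (16, 11)
    print(complex(2, 3) * complex(5, -2))     # (16+11j) -- Python writes j for i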
Complex Conjugate, Absolute Value
Consider a complex number z = a + bi. The conjugate of z is denoted and defined by
    z̄ = (a + bi)‾ = a - bi
Then z z̄ = (a + bi)(a - bi) = a^2 - b^2 i^2 = a^2 + b^2. Note that z is real if and only if z̄ = z.
The absolute value of z, denoted by |z|, is defined to be the nonnegative square root of z z̄. Namely,
    |z| = √(z z̄) = √(a^2 + b^2)
Note that |z| is equal to the norm of the vector (a, b) in R^2.
Suppose z ≠ 0. Then the inverse z^-1 of z and division in C of w by z are given, respectively, by
    z^-1 = z̄/(z z̄) = a/(a^2 + b^2) - (b/(a^2 + b^2))i   and   w/z = w z̄/(z z̄) = w z^-1
EXAMPLE 1.10 Suppose z = 2 + 3i and w = 5 - 2i. Then
    z + w = (2 + 3i) + (5 - 2i) = 2 + 5 + 3i - 2i = 7 + i
    zw = (2 + 3i)(5 - 2i) = 10 + 15i - 4i - 6i^2 = 16 + 11i
    z̄ = (2 + 3i)‾ = 2 - 3i   and   w̄ = (5 - 2i)‾ = 5 + 2i
    w/z = (5 - 2i)/(2 + 3i) = ((5 - 2i)(2 - 3i))/((2 + 3i)(2 - 3i)) = (4 - 19i)/13 = 4/13 - (19/13)i
    |z| = √(4 + 9) = √13   and   |w| = √(25 + 4) = √29
Complex Plane
Recall that the real numbers R can be represented by points on a line. Analogously, the complex numbers C can be represented by points in the plane. Specifically, we let the point (a, b) in the plane represent the complex number a + bi as shown in Fig. 1-4(b). In such a case, |z| is the distance from the origin O to the point z. The plane with this representation is called the complex plane, just as the line representing R is called the real line.

1.8 Vectors in C^n
The set of all n-tuples of complex numbers, denoted by C^n, is called complex n-space. Just as in the real case, the elements of C^n are called points or vectors, the elements of C are called scalars, and vector addition in C^n and scalar multiplication on C^n are given by
    [z1, z2, ..., zn] + [w1, w2, ..., wn] = [z1 + w1, z2 + w2, ..., zn + wn]
    z[z1, z2, ..., zn] = [zz1, zz2, ..., zzn]
where the zi, wi, and z belong to C.
EXAMPLE 1.11 Consider vectors u = [2 + 3i, 4 - i, 3] and v = [3 - 2i, 5i, 4 - 6i] in C^3. Then
    u + v = [2 + 3i, 4 - i, 3] + [3 - 2i, 5i, 4 - 6i] = [5 + i, 4 + 4i, 7 - 6i]
    (5 - 2i)u = [(5 - 2i)(2 + 3i), (5 - 2i)(4 - i), (5 - 2i)(3)] = [16 + 11i, 18 - 13i, 15 - 6i]
Dot (Inner) Product in C^n
Consider vectors u = [z1, z2, ..., zn] and v = [w1, w2, ..., wn] in C^n. The dot or inner product of u and v is denoted and defined by
    u · v = z1 w̄1 + z2 w̄2 + ... + zn w̄n
This definition reduces to the real case because w̄i = wi when wi is real. The norm of u is defined by
    ||u|| = √(u · u) = √(z1 z̄1 + z2 z̄2 + ... + zn z̄n) = √(|z1|^2 + |z2|^2 + ... + |zn|^2)
We emphasize that u · u and so ||u|| are real and positive when u ≠ 0 and 0 when u = 0.
EXAMPLE 1.12 Consider vectors u = [2 + 3i, 4 - i, 3 + 5i] and v = [3 - 4i, 5i, 4 - 2i] in C^3. Then
    u · v = (2 + 3i)(3 - 4i)‾ + (4 - i)(5i)‾ + (3 + 5i)(4 - 2i)‾
          = (2 + 3i)(3 + 4i) + (4 - i)(-5i) + (3 + 5i)(4 + 2i)
          = (-6 + 17i) + (-5 - 20i) + (2 + 26i) = -9 + 23i
    u · u = |2 + 3i|^2 + |4 - i|^2 + |3 + 5i|^2 = 4 + 9 + 16 + 1 + 9 + 25 = 64
    ||u|| = √64 = 8
The space C^n with the above operations of vector addition, scalar multiplication, and dot product, is called complex Euclidean n-space. Theorem 1.2 for R^n also holds for C^n if we replace u · v = v · u by
    u · v = (v · u)‾
On the other hand, the Schwarz inequality (Theorem 1.3) and Minkowski's inequality (Theorem 1.4) are true for C^n with no changes.
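The complex dot product conjugates the second factor, so with Python's complex type and its conjugate() method it is a one-liner. A sketch (helper names are ours), using the vectors of Example 1.12:

    import math

    def cdot(u, v):
        # u . v = z1*conj(w1) + ... + zn*conj(wn)
        return sum(z * w.conjugate() for z, w in zip(u, v))

    def cnorm(u):
        # ||u|| = sqrt(u . u); note that u . u is real and nonnegative.
        return math.sqrt(cdot(u, u).real)

    u = [2 + 3j, 4 - 1j, 3 + 5j]
    v = [3 - 4j, 5j, 4 - 2j]
    print(cdot(u, v))                             # (-9+23j)
    print(cnorm(u))                               # 8.0
    print(cdot(u, v) == cdot(v, u).conjugate())   # True: u . v is the conjugate of v . u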
SOLVED PROBLEMS
Vectors in R^n
1.1. Determine which of the following vectors are equal:
    u1 = (1, 2, 3),   u2 = (2, 3, 1),   u3 = (1, 3, 2),   u4 = (2, 3, 1)
Vectors are equal only when corresponding entries are equal; hence, only u2 = u4.
1.2. Let u = (2, -7, 1), v = (-3, 0, 4), w = (0, 5, -8). Find:
(a) 3u - 4v,
(b) 2u + 3v - 5w.
First perform the scalar multiplication and then the vector addition.
(a) 3u - 4v = 3(2, -7, 1) - 4(-3, 0, 4) = (6, -21, 3) + (12, 0, -16) = (18, -21, -13)
(b) 2u + 3v - 5w = (4, -14, 2) + (-9, 0, 12) + (0, -25, 40) = (-5, -39, 54)
1.3. Let u, v, w be the column vectors with entries (5, 3, -4), (-1, 5, 2), and (3, -1, -2), respectively. Find:
(a) 5u - 2v,
(b) -2u + 4v - 3w.
First perform the scalar multiplication and then the vector addition (the results are again column vectors):
(a) 5u - 2v = (25, 15, -20) + (2, -10, -4) = (27, 5, -24)
(b) -2u + 4v - 3w = (-10, -6, 8) + (-4, 20, 8) + (-9, 3, 6) = (-23, 17, 22)
1.4. Find x and y, where: (a) (x, 3) = (2, x + y), (b) (4, y) = x(2, 3).
(a) Because the vectors are equal, set the corresponding entries equal to each other, yielding
    x = 2,   3 = x + y
Solve the linear equations, obtaining x = 2, y = 1.
(b) First multiply by the scalar x to obtain (4, y) = (2x, 3x). Then set corresponding entries equal to each other to obtain
    4 = 2x,   y = 3x
Solve the equations to yield x = 2, y = 6.
1.5. Write the vector v = (1, -2, 5) as a linear combination of the vectors u1 = (1, 1, 1), u2 = (1, 2, 3), u3 = (2, -1, 1).
We want to express v in the form v = xu1 + yu2 + zu3 with x, y, z as yet unknown. First we have
    (1, -2, 5) = x(1, 1, 1) + y(1, 2, 3) + z(2, -1, 1) = (x + y + 2z, x + 2y - z, x + 3y + z)
(It is more convenient to write vectors as columns than as rows when forming linear combinations.) Set corresponding entries equal to each other to obtain
    x + y + 2z = 1          x + y + 2z = 1          x + y + 2z = 1
    x + 2y - z = -2   or        y - 3z = -3   or        y - 3z = -3
    x + 3y + z = 5             2y - z  =  4                 5z = 10
This unique solution of the triangular system is x = -6, y = 3, z = 2. Thus, v = -6u1 + 3u2 + 2u3.
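Problem 1.5 reduces the linear-combination question to a 3×3 linear system, so any linear-system solver reproduces the hand elimination. A sketch assuming NumPy is available (the book itself solves by elimination and back-substitution):

    import numpy as np

    # Columns of A are u1, u2, u3; we solve A @ [x, y, z] = v.
    A = np.array([[1, 1, 2],
                  [1, 2, -1],
                  [1, 3, 1]], dtype=float)
    v = np.array([1, -2, 5], dtype=float)

    x, y, z = np.linalg.solve(A, v)
    print(x, y, z)    # -6.0 3.0 2.0, so v = -6*u1 + 3*u2 + 2*u3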
1.6. Write v = (2, -5, 3) as a linear combination of
    u1 = (1, -3, 2),   u2 = (2, -4, -1),   u3 = (1, -5, 7).
Find the equivalent system of linear equations and then solve. First,
    (2, -5, 3) = x(1, -3, 2) + y(2, -4, -1) + z(1, -5, 7) = (x + 2y + z, -3x - 4y - 5z, 2x - y + 7z)
Set the corresponding entries equal to each other to obtain
     x + 2y +  z =  2          x + 2y + z = 2          x + 2y + z = 2
    -3x - 4y - 5z = -5   or       2y - 2z = 1   or        2y - 2z = 1
     2x -  y + 7z =  3          -5y + 5z = -1                  0 = 3
The third equation, 0x + 0y + 0z = 3, indicates that the system has no solution. Thus, v cannot be written as a linear combination of the vectors u1, u2, u3.
Dot (Inner) Product, Orthogonality, Norm in R^n
1.7. Find u · v where:
(a) u = (2, -5, 6) and v = (8, 2, -3),
(b) u = (4, 2, -3, 5, -1) and v = (2, 6, -1, -4, 8).
Multiply the corresponding components and add:
(a) u · v = 2(8) - 5(2) + 6(-3) = 16 - 10 - 18 = -12
(b) u · v = 8 + 12 + 3 - 20 - 8 = -5
1.8. Let u = (5, 4, 1), v = (3, -4, 1), w = (1, -2, 3). Which pair of vectors, if any, are perpendicular (orthogonal)?
Find the dot product of each pair of vectors:
    u · v = 15 - 16 + 1 = 0,   v · w = 3 + 8 + 3 = 14,   u · w = 5 - 8 + 3 = 0
Thus, u and v are orthogonal, u and w are orthogonal, but v and w are not.
1.9. Find k so that u and v are orthogonal, where:
(a) u = (1, k, -3) and v = (2, -5, 4),
(b) u = (2, 3k, -4, 1, 5) and v = (6, -1, 3, 7, 2k).
Compute u · v, set u · v equal to 0, and then solve for k:
(a) u · v = 1(2) + k(-5) - 3(4) = -5k - 10. Then -5k - 10 = 0, or k = -2.
(b) u · v = 12 - 3k - 12 + 7 + 10k = 7k + 7. Then 7k + 7 = 0, or k = -1.
1.10. Find ||u||, where: (a) u = (3, -12, -4), (b) u = (2, -3, 8, -7).
First find ||u||^2 = u · u by squaring the entries and adding. Then ||u|| = √(||u||^2).
(a) ||u||^2 = (3)^2 + (-12)^2 + (-4)^2 = 9 + 144 + 16 = 169. Then ||u|| = √169 = 13.
(b) ||u||^2 = 4 + 9 + 64 + 49 = 126. Then ||u|| = √126.
1.11. Recall that normalizing a nonzero vector v means finding the unique unit vector v̂ in the same direction as v, where
    v̂ = (1/||v||)v
Normalize: (a) u = (3, -4), (b) v = (4, -2, -3, 8), (c) w = (1/2, 2/3, -1/4).
(a) First find ||u|| = √(9 + 16) = √25 = 5. Then divide each entry of u by 5, obtaining û = (3/5, -4/5).
(b) Here ||v|| = √(16 + 4 + 9 + 64) = √93. Then
    v̂ = (4/√93, -2/√93, -3/√93, 8/√93)
(c) Note that w and any positive multiple of w will have the same normalized form. Hence, first multiply w by 12 to "clear fractions"—that is, first find w' = 12w = (6, 8, -3). Then
    ||w'|| = √(36 + 64 + 9) = √109   and   ŵ = ŵ' = (6/√109, 8/√109, -3/√109)
1.12. Let u = (1, -3, 4) and v = (3, 4, 7). Find:
(a) cos θ, where θ is the angle between u and v;
(b) proj(u, v), the projection of u onto v;
(c) d(u, v), the distance between u and v.
First find u · v = 3 - 12 + 28 = 19, ||u||^2 = 1 + 9 + 16 = 26, ||v||^2 = 9 + 16 + 49 = 74. Then
(a) cos θ = (u · v)/(||u|| ||v||) = 19/(√26 √74),
(b) proj(u, v) = ((u · v)/||v||^2) v = (19/74)(3, 4, 7) = (57/74, 76/74, 133/74) = (57/74, 38/37, 133/74),
(c) d(u, v) = ||u - v|| = ||(-2, -7, -3)|| = √(4 + 49 + 9) = √62.
1.13. Prove Theorem 1.2: For any u, v, w in R^n and k in R:
(i) (u + v) · w = u · w + v · w, (ii) (ku) · v = k(u · v), (iii) u · v = v · u,
(iv) u · u ≥ 0, and u · u = 0 iff u = 0.
Let u = (u1, u2, ..., un), v = (v1, v2, ..., vn), w = (w1, w2, ..., wn).
(i) Because u + v = (u1 + v1, u2 + v2, ..., un + vn),
    (u + v) · w = (u1 + v1)w1 + (u2 + v2)w2 + ... + (un + vn)wn
                = u1w1 + v1w1 + u2w2 + v2w2 + ... + unwn + vnwn
                = (u1w1 + u2w2 + ... + unwn) + (v1w1 + v2w2 + ... + vnwn)
                = u · w + v · w
(ii) Because ku = (ku1, ku2, ..., kun),
    (ku) · v = ku1v1 + ku2v2 + ... + kunvn = k(u1v1 + u2v2 + ... + unvn) = k(u · v)
(iii) u · v = u1v1 + u2v2 + ... + unvn = v1u1 + v2u2 + ... + vnun = v · u
(iv) Because ui^2 is nonnegative for each i, and because the sum of nonnegative real numbers is nonnegative,
    u · u = u1^2 + u2^2 + ... + un^2 ≥ 0
Furthermore, u · u = 0 iff ui = 0 for each i, that is, iff u = 0.
1.14. Prove Theorem 1.3 (Schwarz): |u · v| ≤ ||u|| ||v||.
For any real number t, and using Theorem 1.2, we have
    0 ≤ (tu + v) · (tu + v) = t^2(u · u) + 2t(u · v) + (v · v) = ||u||^2 t^2 + 2(u · v)t + ||v||^2
Let a = ||u||^2, b = 2(u · v), c = ||v||^2. Then, for every value of t, at^2 + bt + c ≥ 0. This means that the quadratic polynomial cannot have two distinct real roots. This implies that the discriminant D = b^2 - 4ac ≤ 0 or, equivalently, b^2 ≤ 4ac. Thus,
    4(u · v)^2 ≤ 4||u||^2 ||v||^2
Dividing by 4 gives us our result.
1.15. Prove Theorem 1.4 (Minkowski): ||u + v|| ≤ ||u|| + ||v||.
By the Schwarz inequality and other properties of the dot product,
    ||u + v||^2 = (u + v) · (u + v) = (u · u) + 2(u · v) + (v · v) ≤ ||u||^2 + 2||u|| ||v|| + ||v||^2 = (||u|| + ||v||)^2
Taking the square root of both sides yields the desired inequality.
Points, Lines, Hyperplanes in R^n
Here we distinguish between an n-tuple P(a1, a2, ..., an) viewed as a point in R^n and an n-tuple u = [c1, c2, ..., cn] viewed as a vector (arrow) from the origin O to the point C(c1, c2, ..., cn).
1.16. Find the vector u identified with the directed line segment PQ→ for the points:
(a) P(1, -2, 4) and Q(6, 1, -5) in R^3, (b) P(2, 3, -6, 5) and Q(7, 1, 4, -8) in R^4.
(a) u = PQ→ = Q - P = [6 - 1, 1 - (-2), -5 - 4] = [5, 3, -9]
(b) u = PQ→ = Q - P = [7 - 2, 1 - 3, 4 + 6, -8 - 5] = [5, -2, 10, -13]
1.17. Find an equation of the hyperplane H in R^4 that passes through P(3, 4, 1, -2) and is normal to u = [2, 5, -6, -3].
The coefficients of the unknowns of an equation of H are the components of the normal vector u. Thus, an equation of H is of the form 2x1 + 5x2 - 6x3 - 3x4 = k. Substitute P into this equation to obtain k = 26. Thus, an equation of H is 2x1 + 5x2 - 6x3 - 3x4 = 26.
1.18. Find an equation of the plane H in R^3 that contains P(1, -3, -4) and is parallel to the plane H' determined by the equation 3x - 6y + 5z = 2.
The planes H and H' are parallel if and only if their normal directions are parallel or antiparallel (opposite direction). Hence, an equation of H is of the form 3x - 6y + 5z = k. Substitute P into this equation to obtain k = 1. Then an equation of H is 3x - 6y + 5z = 1.
1.19. Find a parametric representation of the line L in R^4 passing through P(4, -2, 3, 1) in the direction of u = [2, 5, -7, 8].
Here L consists of the points X(xi) that satisfy
    X = P + tu   or   xi = ait + bi   or   L(t) = (ait + bi)
where the parameter t takes on all real values. Thus we obtain
    x1 = 4 + 2t,   x2 = -2 + 5t,   x3 = 3 - 7t,   x4 = 1 + 8t   or   L(t) = (4 + 2t, -2 + 5t, 3 - 7t, 1 + 8t)
1.20. Let C be the curve F(t) = (t^2, 3t - 2, t^3, t^2 + 5) in R^4, where 0 ≤ t ≤ 4.
(a) Find the point P on C corresponding to t = 2.
(b) Find the initial point Q and terminal point Q' of C.
(c) Find the unit tangent vector T to the curve C when t = 2.
(a) Substitute t = 2 into F(t) to get P = F(2) = (4, 4, 8, 9).
(b) The parameter t ranges from t = 0 to t = 4. Hence, Q = F(0) = (0, -2, 0, 5) and Q' = F(4) = (16, 10, 64, 21).
(c) Take the derivative of F(t)—that is, of each component of F(t)—to obtain a vector V that is tangent to the curve:
    V(t) = dF(t)/dt = [2t, 3, 3t^2, 2t]
Now find V when t = 2; that is, substitute t = 2 in the equation for V(t) to obtain V = V(2) = [4, 3, 12, 4]. Then normalize V to obtain the desired unit tangent vector T. We have
    ||V|| = √(16 + 9 + 144 + 16) = √185   and   T = (4/√185, 3/√185, 12/√185, 4/√185)
Spatial Vectors (Vectors in R^3), ijk Notation, Cross Product
1.21. Let u = 2i - 3j + 4k, v = 3i + j - 2k, w = i + 5j + 3k. Find:
(a) u + v, (b) 2u - 3v + 4w, (c) u · v and u · w, (d) ||u|| and ||v||.
Treat the coefficients of i, j, k just like the components of a vector in R^3.
(a) Add corresponding coefficients to get u + v = 5i - 2j + 2k.
(b) First perform the scalar multiplication and then the vector addition:
    2u - 3v + 4w = (4i - 6j + 8k) + (-9i - 3j + 6k) + (4i + 20j + 12k) = -i + 11j + 26k
(c) Multiply corresponding coefficients and then add:
    u · v = 6 - 3 - 8 = -5   and   u · w = 2 - 15 + 12 = -1
(d) The norm is the square root of the sum of the squares of the coefficients:
    ||u|| = √(4 + 9 + 16) = √29   and   ||v|| = √(9 + 1 + 4) = √14
1.22. Find the (parametric) equation of the line L:
(a) through the points P(1, 3, 2) and Q(2, 5, -6);
(b) containing the point P(1, -2, 4) and perpendicular to the plane H given by the equation 3x + 5y + 7z = 15.
(a) First find v = PQ→ = Q - P = [1, 2, -8] = i + 2j - 8k. Then
    L(t) = (t + 1, 2t + 3, -8t + 2) = (t + 1)i + (2t + 3)j + (-8t + 2)k
(b) Because L is perpendicular to H, the line L is in the same direction as the normal vector N = 3i + 5j + 7k to H. Thus,
    L(t) = (3t + 1, 5t - 2, 7t + 4) = (3t + 1)i + (5t - 2)j + (7t + 4)k
1.23. Let S be the surface xy^2 + 2yz = 16 in R^3.
(a) Find the normal vector N(x, y, z) to the surface S.
(b) Find the tangent plane H to S at the point P(1, 2, 3).
(a) The formula for the normal vector to a surface F(x, y, z) = 0 is
    N(x, y, z) = Fx i + Fy j + Fz k
where Fx, Fy, Fz are the partial derivatives. Using F(x, y, z) = xy^2 + 2yz - 16, we obtain
    Fx = y^2,   Fy = 2xy + 2z,   Fz = 2y
Thus, N(x, y, z) = y^2 i + (2xy + 2z)j + 2yk.
(b) The normal to the surface S at the point P is
    N(P) = N(1, 2, 3) = 4i + 10j + 4k
Hence, N = 2i + 5j + 2k is also normal to S at P. Thus an equation of H has the form 2x + 5y + 2z = c. Substitute P in this equation to obtain c = 18. Thus the tangent plane H to S at P is 2x + 5y + 2z = 18.
1.24. Evaluate the following determinants and negatives of determinants of order two:
    (a)  (i) | 3  4 |      (ii) | 2  -1 |      (iii) | 4  -5 |
             | 5  9 |           | 4   3 |            | 3  -2 |
    (b)  (i) -| 3  6 |     (ii) -| 7  -5 |     (iii) -| 4  -1 |
              | 4  2 |           | 3   2 |            | 8  -3 |
Use
    | a  b |                      | a  b |
    | c  d | = ad - bc   and   - | c  d | = bc - ad.
Thus,
(a) (i) 27 - 20 = 7, (ii) 6 + 4 = 10, (iii) -8 + 15 = 7.
(b) (i) 24 - 6 = 18, (ii) -15 - 14 = -29, (iii) -8 + 12 = 4.
1.25. Let u = 2i - 3j + 4k, v = 3i + j - 2k, w = i + 5j + 3k.
Find: (a) u × v, (b) u × w.
(a) Use the array
    [ 2  -3   4 ]
    [ 3   1  -2 ]
to get u × v = (6 - 4)i + (12 + 4)j + (2 + 9)k = 2i + 16j + 11k.
(b) Use the array
    [ 2  -3  4 ]
    [ 1   5  3 ]
to get u × w = (-9 - 20)i + (4 - 6)j + (10 + 3)k = -29i - 2j + 13k.
1.26. Find u × v, where: (a) u = (1, 2, 3), v = (4, 5, 6); (b) u = (-4, 7, 3), v = (6, -5, 2).
(a) Use the array
    [ 1  2  3 ]
    [ 4  5  6 ]
to get u × v = [12 - 15, 12 - 6, 5 - 8] = [-3, 6, -3].
(b) Use the array
    [ -4   7  3 ]
    [  6  -5  2 ]
to get u × v = [14 + 15, 18 + 8, 20 - 42] = [29, 26, -22].
1.27. Find a unit vector u orthogonal to v = [1, 3, 4] and w = [2, -6, -5].
First find v × w, which is orthogonal to v and w.
The array
    [ 1   3   4 ]
    [ 2  -6  -5 ]
gives v × w = [-15 + 24, 8 + 5, -6 - 6] = [9, 13, -12].
Normalize v × w to get u = [9/√394, 13/√394, -12/√394].
1.28. Let u = (a1, a2, a3) and v = (b1, b2, b3), so u × v = (a2b3 - a3b2, a3b1 - a1b3, a1b2 - a2b1). Prove:
(a) u × v is orthogonal to u and v [Theorem 1.5(a)].
(b) ||u × v||^2 = (u · u)(v · v) - (u · v)^2 (Lagrange's identity).
(a) We have
    u · (u × v) = a1(a2b3 - a3b2) + a2(a3b1 - a1b3) + a3(a1b2 - a2b1)
                = a1a2b3 - a1a3b2 + a2a3b1 - a1a2b3 + a1a3b2 - a2a3b1 = 0
Thus, u × v is orthogonal to u. Similarly, u × v is orthogonal to v.
(b) We have
    ||u × v||^2 = (a2b3 - a3b2)^2 + (a3b1 - a1b3)^2 + (a1b2 - a2b1)^2                                  (1)
    (u · u)(v · v) - (u · v)^2 = (a1^2 + a2^2 + a3^2)(b1^2 + b2^2 + b3^2) - (a1b1 + a2b2 + a3b3)^2     (2)
Expansion of the right-hand sides of (1) and (2) establishes the identity.
Complex Numbers, Vectors in C^n
1.29. Suppose z = 5 + 3i and w = 2 - 4i. Find: (a) z + w, (b) z - w, (c) zw.
Use the ordinary rules of algebra together with i^2 = -1 to obtain a result in the standard form a + bi.
(a) z + w = (5 + 3i) + (2 - 4i) = 7 - i
(b) z - w = (5 + 3i) - (2 - 4i) = 5 + 3i - 2 + 4i = 3 + 7i
(c) zw = (5 + 3i)(2 - 4i) = 10 - 14i - 12i^2 = 10 - 14i + 12 = 22 - 14i
1.30. Simplify: (a) (5 + 3i)(2 - 7i), (b) (4 - 3i)^2, (c) (1 + 2i)^3.
(a) (5 + 3i)(2 - 7i) = 10 + 6i - 35i - 21i^2 = 31 - 29i
(b) (4 - 3i)^2 = 16 - 24i + 9i^2 = 7 - 24i
(c) (1 + 2i)^3 = 1 + 6i + 12i^2 + 8i^3 = 1 + 6i - 12 - 8i = -11 - 2i
1.31. Simplify: (a) i^0, i^3, i^4; (b) i^5, i^6, i^7, i^8; (c) i^39, i^174, i^252, i^317.
(a) i^0 = 1, i^3 = i^2(i) = (-1)(i) = -i, i^4 = (i^2)(i^2) = (-1)(-1) = 1
(b) i^5 = (i^4)(i) = (1)(i) = i, i^6 = (i^4)(i^2) = (1)(i^2) = i^2 = -1, i^7 = i^3 = -i, i^8 = i^4 = 1
(c) Using i^4 = 1 and i^n = i^(4q+r) = (i^4)^q i^r = 1^q i^r = i^r, divide the exponent n by 4 to obtain the remainder r:
    i^39 = i^(4(9)+3) = (i^4)^9 i^3 = 1^9 i^3 = i^3 = -i,   i^174 = i^2 = -1,   i^252 = i^0 = 1,   i^317 = i^1 = i
1.32. Find the complex conjugate of each of the following:
(a) 6 + 4i, 7 - 5i, 4 + i, 3 - i; (b) 6, -3, 4i, -9i.
(a) (6 + 4i)‾ = 6 - 4i, (7 - 5i)‾ = 7 + 5i, (4 + i)‾ = 4 - i, (3 - i)‾ = 3 + i
(b) (6)‾ = 6, (-3)‾ = -3, (4i)‾ = -4i, (-9i)‾ = 9i
(Note that the conjugate of a real number is the original number, but the conjugate of a pure imaginary number is the negative of the original number.)
1.33. Find z z̄ and |z| when z = 3 + 4i.
For z = a + bi, use z z̄ = a^2 + b^2 and |z| = √(z z̄) = √(a^2 + b^2).
    z z̄ = 9 + 16 = 25,   |z| = √25 = 5
1.34. Simplify (2 - 7i)/(5 + 3i).
To simplify a fraction z/w of complex numbers, multiply both numerator and denominator by w̄, the conjugate of the denominator:
    (2 - 7i)/(5 + 3i) = ((2 - 7i)(5 - 3i))/((5 + 3i)(5 - 3i)) = (-11 - 41i)/34 = -11/34 - (41/34)i
1.35. Prove: For any complex numbers z, w ∈ C, (i) (z + w)‾ = z̄ + w̄, (ii) (zw)‾ = z̄ w̄, (iii) (z̄)‾ = z.
Suppose z = a + bi and w = c + di where a, b, c, d ∈ R.
(i) (z + w)‾ = ((a + bi) + (c + di))‾ = ((a + c) + (b + d)i)‾
             = (a + c) - (b + d)i = a + c - bi - di
             = (a - bi) + (c - di) = z̄ + w̄
(ii) (zw)‾ = ((a + bi)(c + di))‾ = ((ac - bd) + (ad + bc)i)‾
           = (ac - bd) - (ad + bc)i = (a - bi)(c - di) = z̄ w̄
(iii) (z̄)‾ = (a - bi)‾ = a - (-b)i = a + bi = z
1.36. Prove: For any complex numbers z, w ∈ C, |zw| = |z||w|.
By (ii) of Problem 1.35,
    |zw|^2 = (zw)(zw)‾ = (zw)(z̄ w̄) = (z z̄)(w w̄) = |z|^2 |w|^2
The square root of both sides gives us the desired result.
1.37. Prove: For any complex numbers z, w ∈ C, |z + w| ≤ |z| + |w|.
Suppose z = a + bi and w = c + di where a, b, c, d ∈ R. Consider the vectors u = (a, b) and v = (c, d) in R^2. Note that
    |z| = √(a^2 + b^2) = ||u||,   |w| = √(c^2 + d^2) = ||v||
and
    |z + w| = |(a + c) + (b + d)i| = √((a + c)^2 + (b + d)^2) = ||(a + c, b + d)|| = ||u + v||
By Minkowski's inequality (Problem 1.15), ||u + v|| ≤ ||u|| + ||v||, and so
    |z + w| = ||u + v|| ≤ ||u|| + ||v|| = |z| + |w|
1.38. Find the dot products u·v and v·u where: (a) u = (1 − 2i, 3 + i), v = (4 + 2i, 5 − 6i);
(b) u = (3 − 2i, 4i, 1 + 6i), v = (5 + i, 2 − 3i, 7 + 2i).
Recall that conjugates of the second vector appear in the dot product:
(z1, ..., zn)·(w1, ..., wn) = z1·conj(w1) + ... + zn·conj(wn)
(a) u·v = (1 − 2i)·conj(4 + 2i) + (3 + i)·conj(5 − 6i)
        = (1 − 2i)(4 − 2i) + (3 + i)(5 + 6i) = −10i + 9 + 23i = 9 + 13i
    v·u = (4 + 2i)·conj(1 − 2i) + (5 − 6i)·conj(3 + i)
        = (4 + 2i)(1 + 2i) + (5 − 6i)(3 − i) = 10i + 9 − 23i = 9 − 13i
(b) u·v = (3 − 2i)·conj(5 + i) + (4i)·conj(2 − 3i) + (1 + 6i)·conj(7 + 2i)
        = (3 − 2i)(5 − i) + (4i)(2 + 3i) + (1 + 6i)(7 − 2i) = 20 + 35i
    v·u = (5 + i)·conj(3 − 2i) + (2 − 3i)·conj(4i) + (7 + 2i)·conj(1 + 6i)
        = (5 + i)(3 + 2i) + (2 − 3i)(−4i) + (7 + 2i)(1 − 6i) = 20 − 35i
In both cases, v·u = conj(u·v). This holds true in general, as seen in Problem 1.40.
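Because the conjugation of the second factor is easy to forget, a small program is a useful check. The following Python sketch (an illustration using plain lists of complex numbers, no external libraries) implements the dot product exactly as defined above and reproduces part (a).

```python
# Dot product on C^n: conjugate the entries of the SECOND vector.
def cdot(u, v):
    """Return u . v = u_1*conj(v_1) + ... + u_n*conj(v_n)."""
    return sum(uk * vk.conjugate() for uk, vk in zip(u, v))

u = [1 - 2j, 3 + 1j]
v = [4 + 2j, 5 - 6j]
print(cdot(u, v))                              # (9+13j)
print(cdot(v, u))                              # (9-13j)
print(cdot(v, u) == cdot(u, v).conjugate())    # True, as proved in Problem 1.40(i)
```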
1.39. Let u = (7 − 2i, 2 + 5i) and v = (1 + i, −3 − 6i). Find:
(a) u + v, (b) 2iu, (c) (3 − i)v, (d) u·v, (e) ||u|| and ||v||.
(a) u + v = (7 − 2i + 1 + i, 2 + 5i − 3 − 6i) = (8 − i, −1 − i)
(b) 2iu = (14i − 4i^2, 4i + 10i^2) = (4 + 14i, −10 + 4i)
(c) (3 − i)v = (3 + 3i − i − i^2, −9 − 18i + 3i + 6i^2) = (4 + 2i, −15 − 15i)
(d) u·v = (7 − 2i)·conj(1 + i) + (2 + 5i)·conj(−3 − 6i)
        = (7 − 2i)(1 − i) + (2 + 5i)(−3 + 6i) = 5 − 9i − 36 − 3i = −31 − 12i
(e) ||u|| = √(7^2 + (−2)^2 + 2^2 + 5^2) = √82   and   ||v|| = √(1^2 + 1^2 + (−3)^2 + (−6)^2) = √47
1.40. Prove: For any vectors u, v ∈ C^n and any scalar z ∈ C, (i) u·v = conj(v·u), (ii) (zu)·v = z(u·v),
(iii) u·(zv) = conj(z)(u·v).
Suppose u = (z1, z2, ..., zn) and v = (w1, w2, ..., wn).
(i) Using the properties of the conjugate,
conj(v·u) = conj(w1·conj(z1) + w2·conj(z2) + ... + wn·conj(zn))
          = conj(w1·conj(z1)) + conj(w2·conj(z2)) + ... + conj(wn·conj(zn))
          = conj(w1)·z1 + conj(w2)·z2 + ... + conj(wn)·zn
          = z1·conj(w1) + z2·conj(w2) + ... + zn·conj(wn) = u·v
(ii) Because zu = (zz1, zz2, ..., zzn),
(zu)·v = zz1·conj(w1) + zz2·conj(w2) + ... + zzn·conj(wn) = z(z1·conj(w1) + z2·conj(w2) + ... + zn·conj(wn)) = z(u·v)
(Compare with Theorem 1.2 on vectors in R^n.)
(iii) Using (i) and (ii),
u·(zv) = conj((zv)·u) = conj(z(v·u)) = conj(z)·conj(v·u) = conj(z)(u·v)
SUPPLEMENTARY PROBLEMS
Vectors in Rn
1.41. Let u ¼ ð1; 2; 4Þ, v ¼ ð3; 5; 1Þ, w ¼ ð2; 1; 3Þ. Find:
(a) 3u  2v; (b) 5u þ 3v  4w; (c) u  v, u  w, v  w; (d) kuk, kvk;
(e) cos y, where y is the angle between u and v; (f ) dðu; vÞ; (g) projðu; vÞ.
1.42. Repeat Problem 1.41 for vectors u ¼
1
3
4
2
4
3
5, v ¼
2
1
5
2
4
3
5, w ¼
3
2
6
2
4
3
5.
1.43. Let u ¼ ð2; 5; 4; 6; 3Þ and v ¼ ð5; 2; 1; 7; 4Þ. Find:
(a) 4u  3v; (b) 5u þ 2v; (c) u  v; (d) kuk and kvk; (e) projðu; vÞ; ( f ) dðu; vÞ.
1.44. Normalize each vector:
(a) u ¼ ð5; 7Þ; (b) v ¼ ð1; 2; 2; 4Þ; (c) w ¼
1
2
; 
1
3
;
3
4
 
.
1.45. Let u ¼ ð1; 2; 2Þ, v ¼ ð3; 12; 4Þ, and k ¼ 3.
(a) Find kuk, kvk, ku þ vk, kkuk:
(b) Verify that kkuk ¼ jkjkuk and ku þ vk  kuk þ kvk.
1.46. Find x and y where:
(a) ðx; y þ 1Þ ¼ ðy  2; 6Þ; (b) xð2; yÞ ¼ yð1; 2Þ.
1.47. Find x; y; z where ðx; y þ 1; y þ zÞ ¼ ð2x þ y; 4; 3zÞ.
1.48. Write v ¼ ð2; 5Þ as a linear combination of u1 and u2, where:
(a) u1 ¼ ð1; 2Þ and u2 ¼ ð3; 5Þ;
(b) u1 ¼ ð3; 4Þ and u2 ¼ ð2; 3Þ.
1.49. Write v ¼
9
3
16
2
4
3
5 as a linear combination of u1 ¼
1
3
3
2
4
3
5, u2 ¼
2
5
1
2
4
3
5, u3 ¼
4
2
3
2
4
3
5.
1.50. Find k so that u and v are orthogonal, where:
(a) u ¼ ð3; k; 2Þ, v ¼ ð6; 4; 3Þ;
(b) u ¼ ð5; k; 4; 2Þ, v ¼ ð1; 3; 2; 2kÞ;
(c) u ¼ ð1; 7; k þ 2; 2Þ, v ¼ ð3; k; 3; kÞ.
Located Vectors, Hyperplanes, Lines in Rn
1.51. Find the vector v identified with the directed line segment PQ
!
for the points:
(a) Pð2; 3; 7Þ and Qð1; 6; 5Þ in R3
;
(b) Pð1; 8; 4; 6Þ and Qð3; 5; 2; 4Þ in R4
.
1.52. Find an equation of the hyperplane H in R4
that:
(a) contains Pð1; 2; 3; 2Þ and is normal to u ¼ ½2; 3; 5; 6;
(b) contains Pð3; 1; 2; 5Þ and is parallel to 2x1  3x2 þ 5x3  7x4 ¼ 4.
1.53. Find a parametric representation of the line in R4
that:
(a) passes through the points Pð1; 2; 1; 2Þ and Qð3; 5; 7; 9Þ;
(b) passes through Pð1; 1; 3; 3Þ and is perpendicular to the hyperplane 2x1 þ 4x2 þ 6x3  8x4 ¼ 5.
Spatial Vectors (Vectors in R3
), ijk Notation
1.54. Given u ¼ 3i  4j þ 2k, v ¼ 2i þ 5j  3k, w ¼ 4i þ 7j þ 2k. Find:
(a) 2u  3v; (b) 3u þ 4v  2w; (c) u  v, u  w, v  w; (d) kuk, kvk, kwk.
1.55. Find the equation of the plane H:
(a) with normal N ¼ 3i  4j þ 5k and containing the point Pð1; 2; 3Þ;
(b) parallel to 4x þ 3y  2z ¼ 11 and containing the point Qð2; 1; 3Þ.
1.56. Find the (parametric) equation of the line L:
(a) through the point Pð2; 5; 3Þ and in the direction of v ¼ 4i  5j þ 7k;
(b) perpendicular to the plane 2x  3y þ 7z ¼ 4 and containing Pð1; 5; 7Þ.
1.57. Consider the following curve C in R3
where 0  t  5:
FðtÞ ¼ t3
i  t2
j þ ð2t  3Þk
(a) Find the point P on C corresponding to t ¼ 2.
(b) Find the initial point Q and the terminal point Q 0
.
(c) Find the unit tangent vector T to the curve C when t ¼ 2.
1.58. Consider a moving body B whose position at time t is given by RðtÞ ¼ t2
i þ t3
j þ 3tk. [Then
VðtÞ ¼ dRðtÞ=dt and AðtÞ ¼ dVðtÞ=dt denote, respectively, the velocity and acceleration of B.] When
t ¼ 1, find for the body B:
(a) position; (b) velocity v; (c) speed s; (d) acceleration a.
1.59. Find a normal vector N and the tangent plane H to each surface at the given point:
(a) surface x2
y þ 3yz ¼ 20 and point Pð1; 3; 2Þ;
(b) surface x2
þ 3y2
 5z2
¼ 160 and point Pð3; 2; 1Þ:
Cross Product
1.60. Evaluate the following determinants and negative of determinants of order two:
(a)
2 5
3 6







;
3 6
1 4







;
4 2
7 3








(b) 
6 4
7 5







; 
1 3
2 4







; 
8 3
6 2








1.61. Given u ¼ 3i  4j þ 2k, v ¼ 2i þ 5j  3k, w ¼ 4i þ 7j þ 2k, find:
(a) u v, (b) u w, (c) v w.
1.62. Given u ¼ ½2; 1; 3, v ¼ ½4; 2; 2, w ¼ ½1; 1; 5, find:
(a) u v, (b) u w, (c) v w.
1.63. Find the volume V of the parallelepiped formed by the vectors u, v, w appearing in:
(a) Problem 1.61; (b) Problem 1.62.
1.64. Find a unit vector u orthogonal to:
(a) v ¼ ½1; 2; 3 and w ¼ ½1; 1; 2;
(b) v ¼ 3i  j þ 2k and w ¼ 4i  2j  k.
1.65. Prove the following properties of the cross product:
(a) u v ¼ ðv uÞ (d) u ðv þ wÞ ¼ ðu vÞ þ ðu wÞ
(b) u u ¼ 0 for any vector u (e) ðv þ wÞ u ¼ ðv uÞ þ ðw uÞ
(c) ðkuÞ v ¼ kðu vÞ ¼ u ðkvÞ ( f ) ðu vÞ w ¼ ðu  wÞv  ðv  wÞu
Complex Numbers
1.66. Simplify:
(a) ð4  7iÞð9 þ 2iÞ; (b) ð3  5iÞ
2
; (c)
1
4  7i
; (d)
9 þ 2i
3  5i
; (e) ð1  iÞ
3
.
1.67. Simplify: (a)
1
2i
; (b)
2 þ 3i
7  3i
; (c) i15
; i25
; i34
; (d)
1
3  i
 2
.
1.68. Let z = 2 − 5i and w = 7 + 3i. Find:
(a) z + w; (b) zw; (c) z/w; (d) conj(z), conj(w); (e) |z|, |w|.
1.69. Show that for complex numbers z and w:
(a) Re z ¼ 1
2 ðz þ 
zÞ, (b) Im z ¼ 1
2 ðz  
z), (c) zw ¼ 0 implies z ¼ 0 or w ¼ 0.
Vectors in Cn
1.70. Let u ¼ ð1 þ 7i; 2  6iÞ and v ¼ ð5  2i; 3  4iÞ. Find:
(a) u þ v (b) ð3 þ iÞu (c) 2iu þ ð4 þ 7iÞv (d) u  v (e) kuk and kvk.
1.71. Prove: For any vectors u; v; w in Cn
:
(a) ðu þ vÞ  w ¼ u  w þ v  w, (b) w  ðu þ vÞ ¼ w  u þ w  v.
1.72. Prove that the norm in Cn
satisfies the following laws:
½N1 For any vector u, kuk  0; and kuk ¼ 0 if and only if u ¼ 0.
½N2 For any vector u and complex number z, kzuk ¼ jzjkuk.
½N3 For any vectors u and v, ku þ vk  kuk þ kvk.
ANSWERS TO SUPPLEMENTARY PROBLEMS
1.41. (a) ð3; 16; 4Þ; (b) (6,1,35); (c) 3; 12; 8; (d)
ffiffiffiffiffi
21
p
,
ffiffiffiffiffi
35
p
,
ffiffiffiffiffi
14
p
;
(e) 3=
ffiffiffiffiffi
21
p ffiffiffiffiffi
35
p
; ( f )
ffiffiffiffiffi
62
p
; (g)  3
35 ð3; 5; 1Þ ¼ ð 9
35,  15
35,  3
35)
1.42. (Column vectors) (a) ð1; 7; 22Þ; (b) ð1; 26; 29Þ; (c) 15; 27; 34;
(d)
ffiffiffiffiffi
26
p
,
ffiffiffiffiffi
30
p
; (e) 15=ð
ffiffiffiffiffi
26
p ffiffiffiffiffi
30
p
Þ; ( f )
ffiffiffiffiffi
86
p
; (g)  15
30 v ¼ ð1;  1
2 ;  5
2Þ
1.43. (a) ð13; 14; 13; 45; 0Þ; (b) ð20; 29; 22; 16; 23Þ; (c) 6; (d)
ffiffiffiffiffi
90
p
;
ffiffiffiffiffi
95
p
;
(e)  6
95 v; ( f )
ffiffiffiffiffiffiffiffi
167
p
1.44. (a) ð5=
ffiffiffiffiffi
76
p
; 9=
ffiffiffiffiffi
76
p
Þ; (b) ð1
5 ; 2
5 ;  2
5 ; 4
5Þ; (c) ð6=
ffiffiffiffiffiffiffiffi
133
p
; 4
ffiffiffiffiffiffiffiffi
133
p
; 9
ffiffiffiffiffiffiffiffi
133
p
Þ
1.45. (a) 3; 13;
ffiffiffiffiffiffiffiffi
120
p
; 9
1.46. (a) x ¼ 3; y ¼ 5; (b) x ¼ 0; y ¼ 0, and x ¼ 1; y ¼ 2
1.47. x ¼ 3; y ¼ 3; z ¼ 3
2
1.48. (a) v ¼ 5u1  u2; (b) v ¼ 16u1  23u2
1.49. v ¼ 3u1  u2 þ 2u3
1.50. (a) 6; (b) 3; (c) 3
2
1.51. (a) v ¼ ½1; 9; 2; (b) [2; 3; 6; 10]
1.52. (a) 2x1 þ 3x2  5x3 þ 6x4 ¼ 35; (b) 2x1  3x2 þ 5x3  7x4 ¼ 16
1.53. (a) ½2t þ 1; 7t þ 2; 6t þ 1; 11t þ 2; (b) ½2t þ 1; 4t þ 1; 6t þ 3; 8t þ 3
1.54. (a) 23j þ 13k; (b) 9i  6j  10k; (c) 20; 12; 37; (d)
ffiffiffiffiffi
29
p
;
ffiffiffiffiffi
38
p
;
ffiffiffiffiffi
69
p
1.55. (a) 3x  4y þ 5z ¼ 20; (b) 4x þ 3y  2z ¼ 1
1.56. (a) ½4t þ 2; 5t þ 5; 7t  3; (b) ½2t þ 1; 3t  5; 7t þ 7
1.57. (a) P ¼ Fð2Þ ¼ 8i  4j þ k; (b) Q ¼ Fð0Þ ¼ 3k, Q0
¼ Fð5Þ ¼ 125i  25j þ 7k;
(c) T ¼ ð6i  2j þ kÞ=
ffiffiffiffiffi
41
p
1.58. (a) i þ j þ 2k; (b) 2i þ 3j þ 2k; (c)
ffiffiffiffiffi
17
p
; (d) 2i þ 6j
1.59. (a) N ¼ 6i þ 7j þ 9k, 6x þ 7y þ 9z ¼ 45; (b) N ¼ 6i  12j  10k, 3x  6y  5z ¼ 16
1.60. (a) 3; 6; 26; (b) 2; 10; 34
1.61. (a) 2i þ 13j þ 23k; (b) 22i þ 2j þ 37k; (c) 31i  16j  6k
1.62. (a) ½5; 8; 6; (b) ½2; 7; 1; (c) ½7; 18; 5
1.63. (a) 143; (b) 17
1.64. (a) ð7; 1; 3Þ=
ffiffiffiffiffi
59
p
; (b) ð5i þ 11j  2kÞ=
ffiffiffiffiffiffiffiffi
150
p
1.66. (a) 50  55i; (b) 16  30i; (c) 1
65 ð4 þ 7iÞ; (d) 1
2 ð1 þ 3iÞ; (e) 2  2i
1.67. (a)  1
2 i; (b) 1
58 ð5 þ 27iÞ; (c) 1; i; 1; (d) 1
50 ð4 þ 3iÞ
1.68. (a) 9  2i; (b) 29  29i; (c) 1
61 ð1  41iÞ; (d) 2 þ 5i, 7  3i; (e)
ffiffiffiffiffi
29
p
,
ffiffiffiffiffi
58
p
1.69. (c) Hint: If zw ¼ 0, then jzwj ¼ jzjjwj ¼ j0j ¼ 0
1.70. (a) ð6 þ 5i, 5  10iÞ; (b) ð4 þ 22i, 12  16iÞ; (c) ð8  41i, 4  33iÞ;
(d) 12 þ 2i; (e)
ffiffiffiffiffi
90
p
,
ffiffiffiffiffi
54
p
Algebra of Matrices
2.1 Introduction
This chapter investigates matrices and algebraic operations defined on them. These matrices may be
viewed as rectangular arrays of elements where each entry depends on two subscripts (as compared with
vectors, where each entry depended on only one subscript). Systems of linear equations and their
solutions (Chapter 3) may be efficiently investigated using the language of matrices. Furthermore, certain
abstract objects introduced in later chapters, such as ‘‘change of basis,’’ ‘‘linear transformations,’’ and
‘‘quadratic forms,’’ can be represented by these matrices (rectangular arrays). On the other hand, the
abstract treatment of linear algebra presented later on will give us new insight into the structure of these
matrices.
The entries in our matrices will come from some arbitrary, but fixed, field K. The elements of K are
called numbers or scalars. Nothing essential is lost if the reader assumes that K is the real field R.
2.2 Matrices
A matrix A over a field K or, simply, a matrix A (when K is implicit) is a rectangular array of scalars
usually presented in the following form:
A = [ a11  a12  ...  a1n ]
    [ a21  a22  ...  a2n ]
    [ ...  ...  ...  ... ]
    [ am1  am2  ...  amn ]
The rows of such a matrix A are the m horizontal lists of scalars:
ða11; a12; . . . ; a1nÞ; ða21; a22; . . . ; a2nÞ; . . . ; ðam1; am2; . . . ; amnÞ
and the columns of A are the n vertical lists of scalars:
[ a11 ]   [ a12 ]         [ a1n ]
[ a21 ] , [ a22 ] , ... , [ a2n ]
[ ... ]   [ ... ]         [ ... ]
[ am1 ]   [ am2 ]         [ amn ]
Note that the element aij, called the ij-entry or ij-element, appears in row i and column j. We frequently
denote such a matrix by simply writing A ¼ ½aij.
A matrix with m rows and n columns is called an m by n matrix, written m × n. The pair of numbers m
and n is called the size of the matrix. Two matrices A and B are equal, written A = B, if they have the
same size and if corresponding elements are equal. Thus, the equality of two m × n matrices is equivalent
to a system of mn equalities, one for each corresponding pair of elements.
A matrix with only one row is called a row matrix or row vector, and a matrix with only one column is
called a column matrix or column vector. A matrix whose entries are all zero is called a zero matrix and
will usually be denoted by 0.
Matrices whose entries are all real numbers are called real matrices and are said to be matrices over R.
Analogously, matrices whose entries are all complex numbers are called complex matrices and are said to
be matrices over C. This text will be mainly concerned with such real and complex matrices.
EXAMPLE 2.1
(a) The rectangular array A = [1 −4 5; 0 3 −2] is a 2 × 3 matrix. Its rows are (1, −4, 5) and (0, 3, −2),
and its columns are
[1; 0],   [−4; 3],   [5; −2]
(b) The 2 × 4 zero matrix is the matrix 0 = [0 0 0 0; 0 0 0 0].
(c) Find x, y, z, t such that
[x + y   2z + t; x − y   z − t] = [3 7; 1 5]
By definition of equality of matrices, the four corresponding entries must be equal. Thus,
x + y = 3,   x − y = 1,   2z + t = 7,   z − t = 5
Solving the above system of equations yields x = 2, y = 1, z = 4, t = −1.
2.3 Matrix Addition and Scalar Multiplication
Let A ¼ ½aij and B ¼ ½bij be two matrices with the same size, say m n matrices. The sum of A and B,
written A þ B, is the matrix obtained by adding corresponding elements from A and B. That is,
A þ B ¼
a11 þ b11 a12 þ b12 . . . a1n þ b1n
a21 þ b21 a22 þ b22 . . . a2n þ b2n
           
am1 þ bm1 am2 þ bm2 . . . amn þ bmn
2
6
6
4
3
7
7
5
The product of the matrix A by a scalar k, written k  A or simply kA, is the matrix obtained by
multiplying each element of A by k. That is,
kA ¼
ka11 ka12 . . . ka1n
ka21 ka22 . . . ka2n
           
kam1 kam2 . . . kamn
2
6
6
4
3
7
7
5
Observe that A þ B and kA are also m n matrices. We also define
A ¼ ð1ÞA and A  B ¼ A þ ðBÞ
The matrix A is called the negative of the matrix A, and the matrix A  B is called the difference of A
and B. The sum of matrices with different sizes is not defined.
EXAMPLE 2.2 Let A = [1 −2 3; 0 4 5] and B = [4 6 8; 1 −3 −7]. Then
A + B = [1 + 4   −2 + 6   3 + 8; 0 + 1   4 + (−3)   5 + (−7)] = [5 4 11; 1 1 −2]
3A = [3(1)   3(−2)   3(3); 3(0)   3(4)   3(5)] = [3 −6 9; 0 12 15]
2A − 3B = [2 −4 6; 0 8 10] + [−12 −18 −24; −3 9 21] = [−10 −22 −18; −3 17 31]
The matrix 2A − 3B is called a linear combination of A and B.
Basic properties of matrices under the operations of matrix addition and scalar multiplication follow.
THEOREM 2.1: Consider any matrices A, B, C (with the same size) and any scalars k and k′. Then
(i) (A + B) + C = A + (B + C),        (v) k(A + B) = kA + kB,
(ii) A + 0 = 0 + A = A,               (vi) (k + k′)A = kA + k′A,
(iii) A + (−A) = (−A) + A = 0,        (vii) (kk′)A = k(k′A),
(iv) A + B = B + A,                   (viii) 1·A = A.
Note first that the 0 in (ii) and (iii) refers to the zero matrix. Also, by (i) and (iv), any sum of matrices
A1 þ A2 þ    þ An
requires no parentheses, and the sum does not depend on the order of the matrices. Furthermore, using
(vi) and (viii), we also have
A þ A ¼ 2A; A þ A þ A ¼ 3A; . . .
and so on.
The proof of Theorem 2.1 reduces to showing that the ij-entries on both sides of each matrix equation
are equal. (See Problem 2.3.)
Observe the similarity between Theorem 2.1 for matrices and Theorem 1.1 for vectors. In fact, the
above operations for matrices may be viewed as generalizations of the corresponding operations for
vectors.
2.4 Summation Symbol
Before we define matrix multiplication, it will be instructive to first introduce the summation symbol S
(the Greek capital letter sigma).
Suppose f ðkÞ is an algebraic expression involving the letter k. Then the expression
Σ_{k=1}^{n} f(k)
has the following meaning. First we set k ¼ 1 in f ðkÞ, obtaining
f ð1Þ
Then we set k ¼ 2 in f ðkÞ, obtaining f ð2Þ, and add this to f ð1Þ, obtaining
f ð1Þ þ f ð2Þ
Then we set k ¼ 3 in f ðkÞ, obtaining f ð3Þ, and add this to the previous sum, obtaining
f ð1Þ þ f ð2Þ þ f ð3Þ
We continue this process until we obtain the sum
f ð1Þ þ f ð2Þ þ    þ f ðnÞ
Observe that at each step we increase the value of k by 1 until we reach n. The letter k is called the index,
and 1 and n are called, respectively, the lower and upper limits. Other letters frequently used as indices
are i and j.
We also generalize our definition by allowing the sum to range from any integer n1 to any integer n2.
That is, we define
Σ_{k=n1}^{n2} f(k) = f(n1) + f(n1 + 1) + f(n1 + 2) + ... + f(n2)
EXAMPLE 2.3
(a) Σ_{k=1}^{5} x_k = x1 + x2 + x3 + x4 + x5   and   Σ_{i=1}^{n} a_i b_i = a1b1 + a2b2 + ... + anbn
(b) Σ_{j=2}^{5} j^2 = 2^2 + 3^2 + 4^2 + 5^2 = 54   and   Σ_{i=0}^{n} a_i x^i = a0 + a1x + a2x^2 + ... + anx^n
(c) Σ_{k=1}^{p} a_ik b_kj = a_i1 b_1j + a_i2 b_2j + a_i3 b_3j + ... + a_ip b_pj
2.5 Matrix Multiplication
The product of matrices A and B, written AB, is somewhat complicated. For this reason, we first begin
with a special case.
The product AB of a row matrix A = [a_i] and a column matrix B = [b_i] with the same number of
elements is defined to be the scalar (or 1 × 1 matrix) obtained by multiplying corresponding entries and
adding; that is,
AB = [a1, a2, ..., an][b1; b2; ...; bn] = a1b1 + a2b2 + ... + anbn = Σ_{k=1}^{n} a_k b_k
We emphasize that AB is a scalar (or a 1 × 1 matrix). The product AB is not defined when A and B have
different numbers of elements.
EXAMPLE 2.4
(a) [7, −4, 5][3; 2; −1] = 7(3) + (−4)(2) + 5(−1) = 21 − 8 − 5 = 8
(b) [6, −1, 8, 3][4; −9; −2; 5] = 24 + 9 − 16 + 15 = 32
We are now ready to define matrix multiplication in general.
DEFINITION: Suppose A = [a_ik] and B = [b_kj] are matrices such that the number of columns of A is
equal to the number of rows of B; say, A is an m × p matrix and B is a p × n matrix.
Then the product AB is the m × n matrix C = [c_ij] whose ij-entry is obtained by multiplying the
ith row of A by the jth column of B. That is,
c_ij = a_i1 b_1j + a_i2 b_2j + ... + a_ip b_pj = Σ_{k=1}^{p} a_ik b_kj
The product AB is not defined if A is an m × p matrix and B is a q × n matrix, where p ≠ q.
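The ij-entry formula translates directly into code. The sketch below (an illustration using plain Python lists rather than any matrix library) computes AB from c_ij = Σ_k a_ik b_kj and reproduces part (a) of Example 2.5 below.

```python
# Matrix product from the definition: C[i][j] = sum over k of A[i][k] * B[k][j].
def matmul(A, B):
    m, p = len(A), len(A[0])
    q, n = len(B), len(B[0])
    if p != q:
        raise ValueError("columns of A must equal rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(p)) for j in range(n)]
            for i in range(m)]

A = [[1, 3],
     [2, -1]]
B = [[2, 0, -4],
     [5, -2, 6]]
print(matmul(A, B))    # [[17, -6, 14], [-1, 2, -14]]
```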
EXAMPLE 2.5
(a) Find AB where A = [1 3; 2 −1] and B = [2 0 −4; 5 −2 6].
Because A is 2 × 2 and B is 2 × 3, the product AB is defined and AB is a 2 × 3 matrix. To obtain
the first row of the product matrix AB, multiply the first row [1, 3] of A by each column of B,
[2; 5],   [0; −2],   [−4; 6]
respectively. That is,
AB = [2 + 15   0 − 6   −4 + 18] = [17 −6 14]
To obtain the second row of AB, multiply the second row [2, −1] of A by each column of B. Thus,
AB = [17   −6   14; 4 − 5   0 + 2   −8 − 6] = [17 −6 14; −1 2 −14]
(b) Suppose A = [1 2; 3 4] and B = [5 6; 0 −2]. Then
AB = [5 + 0   6 − 4; 15 + 0   18 − 8] = [5 2; 15 10]   and   BA = [5 + 18   10 + 24; 0 − 6   0 − 8] = [23 34; −6 −8]
The above example shows that matrix multiplication is not commutative—that is, in general,
AB ≠ BA. However, matrix multiplication does satisfy the following properties.
THEOREM 2.2: Let A; B; C be matrices. Then, whenever the products and sums are defined,
(i) ðABÞC ¼ AðBCÞ (associative law),
(ii) AðB þ CÞ ¼ AB þ AC (left distributive law),
(iii) ðB þ CÞA ¼ BA þ CA (right distributive law),
(iv) kðABÞ ¼ ðkAÞB ¼ AðkBÞ, where k is a scalar.
We note that 0A ¼ 0 and B0 ¼ 0, where 0 is the zero matrix.
2.6 Transpose of a Matrix
The transpose of a matrix A, written A^T, is the matrix obtained by writing the columns of A, in order, as
rows. For example,
[1 2 3; 4 5 6]^T = [1 4; 2 5; 3 6]   and   [1, −3, −5]^T = [1; −3; −5]
In other words, if A = [a_ij] is an m × n matrix, then A^T = [b_ij] is the n × m matrix where b_ij = a_ji.
Observe that the transpose of a row vector is a column vector. Similarly, the transpose of a column
vector is a row vector.
The next theorem lists basic properties of the transpose operation.
THEOREM 2.3: Let A and B be matrices and let k be a scalar. Then, whenever the sum and product are
defined,
(i) (A + B)^T = A^T + B^T,      (iii) (kA)^T = kA^T,
(ii) (A^T)^T = A,               (iv) (AB)^T = B^T A^T.
We emphasize that, by (iv), the transpose of a product is the product of the transposes, but in the
reverse order.
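Property (iv) is the one most worth a numerical spot check. The sketch below is an illustration only and assumes NumPy is installed (the text itself does not depend on any software).

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])      # a 2 x 3 matrix
B = np.array([[1, 0],
              [2, 1],
              [0, 3]])         # a 3 x 2 matrix

lhs = (A @ B).T                # transpose of the product
rhs = B.T @ A.T                # product of the transposes, in reverse order
print(np.array_equal(lhs, rhs))    # True
```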
2.7 Square Matrices
A square matrix is a matrix with the same number of rows as columns. An n n square matrix is said to
be of order n and is sometimes called an n-square matrix.
Recall that not every two matrices can be added or multiplied. However, if we only consider square
matrices of some given order n, then this inconvenience disappears. Specifically, the operations of
addition, multiplication, scalar multiplication, and transpose can be performed on any n n matrices, and
the result is again an n n matrix.
EXAMPLE 2.6 The following are square matrices of order 3:
A = [1 2 3; −4 −4 −4; 5 6 7]   and   B = [2 −5 1; 0 3 −2; 1 2 −4]
The following are also matrices of order 3:
A + B = [3 −3 4; −4 −1 −6; 6 8 3],   2A = [2 4 6; −8 −8 −8; 10 12 14],   A^T = [1 −4 5; 2 −4 6; 3 −4 7]
AB = [5 7 −15; −12 0 20; 17 7 −35],   BA = [27 30 33; −22 −24 −26; −27 −30 −33]
Diagonal and Trace
Let A ¼ ½aij be an n-square matrix. The diagonal or main diagonal of A consists of the elements with the
same subscripts—that is,
a11; a22; a33; . . . ; ann
The trace of A, written trðAÞ, is the sum of the diagonal elements. Namely,
trðAÞ ¼ a11 þ a22 þ a33 þ    þ ann
The following theorem applies.
THEOREM 2.4: Suppose A = [a_ij] and B = [b_ij] are n-square matrices and k is a scalar. Then
(i) tr(A + B) = tr(A) + tr(B),      (iii) tr(A^T) = tr(A),
(ii) tr(kA) = k tr(A),              (iv) tr(AB) = tr(BA).
EXAMPLE 2.7 Let A and B be the matrices A and B in Example 2.6. Then
diagonal of A = {1, −4, 7} and tr(A) = 1 − 4 + 7 = 4;   diagonal of B = {2, 3, −4} and tr(B) = 2 + 3 − 4 = 1
Moreover,
tr(A + B) = 3 − 1 + 3 = 5,   tr(2A) = 2 − 8 + 14 = 8,   tr(A^T) = 1 − 4 + 7 = 4
tr(AB) = 5 + 0 − 35 = −30,   tr(BA) = 27 − 24 − 33 = −30
As expected from Theorem 2.4,
tr(A + B) = tr(A) + tr(B),   tr(A^T) = tr(A),   tr(2A) = 2 tr(A)
Furthermore, although AB ≠ BA, the traces are equal.
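All four parts of Theorem 2.4 can be confirmed on the matrices of Example 2.6 with a few lines of NumPy; this sketch is an added illustration under that assumption.

```python
import numpy as np

A = np.array([[1, 2, 3], [-4, -4, -4], [5, 6, 7]])   # A of Example 2.6
B = np.array([[2, -5, 1], [0, 3, -2], [1, 2, -4]])   # B of Example 2.6

print(np.trace(A + B) == np.trace(A) + np.trace(B))  # True
print(np.trace(2 * A) == 2 * np.trace(A))            # True
print(np.trace(A.T) == np.trace(A))                  # True
print(np.trace(A @ B) == np.trace(B @ A))            # True, although AB != BA
```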
Identity Matrix, Scalar Matrices
The n-square identity or unit matrix, denoted by In, or simply I, is the n-square matrix with 1’s on the
diagonal and 0’s elsewhere. The identity matrix I is similar to the scalar 1 in that, for any n-square matrix
A,
AI ¼ IA ¼ A
More generally, if B is an m n matrix, then BIn ¼ ImB ¼ B.
For any scalar k, the matrix kI that contains k’s on the diagonal and 0’s elsewhere is called the scalar
matrix corresponding to the scalar k. Observe that
ðkIÞA ¼ kðIAÞ ¼ kA
That is, multiplying a matrix A by the scalar matrix kI is equivalent to multiplying A by the scalar k.
EXAMPLE 2.8 The following are the identity matrices of orders 3 and 4 and the corresponding scalar
matrices for k ¼ 5:
1 0 0
0 1 0
0 0 1
2
4
3
5;
1
1
1
1
2
6
6
4
3
7
7
5;
5 0 0
0 5 0
0 0 5
2
4
3
5;
5
5
5
5
2
6
6
4
3
7
7
5
Remark 1: It is common practice to omit blocks or patterns of 0’s when there is no ambiguity, as
in the above second and fourth matrices.
Remark 2: The Kronecker delta function δ_ij is defined by
δ_ij = 0 if i ≠ j   and   δ_ij = 1 if i = j
Thus, the identity matrix may be defined by I = [δ_ij].
2.8 Powers of Matrices, Polynomials in Matrices
Let A be an n-square matrix over a field K. Powers of A are defined as follows:
A^2 = AA,   A^3 = A^2 A,   ...,   A^{n+1} = A^n A,   ...,   and   A^0 = I
Polynomials in the matrix A are also defined. Specifically, for any polynomial
f(x) = a0 + a1 x + a2 x^2 + ... + an x^n
where the a_i are scalars in K, f(A) is defined to be the following matrix:
f(A) = a0 I + a1 A + a2 A^2 + ... + an A^n
[Note that f ðAÞ is obtained from f ðxÞ by substituting the matrix A for the variable x and substituting the
scalar matrix a0I for the scalar a0.] If f ðAÞ is the zero matrix, then A is called a zero or root of f ðxÞ.
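Evaluating f(A) is a simple loop over the coefficients: keep a running power of A and accumulate a_i A^i, starting from A^0 = I. The sketch below is an illustration assuming NumPy; coefficients are listed from the constant term upward, and the chosen matrix and polynomial anticipate Example 2.9 below.

```python
import numpy as np

def poly_of_matrix(coeffs, A):
    """Evaluate f(A) = a0*I + a1*A + a2*A^2 + ... for coeffs = [a0, a1, a2, ...]."""
    n = A.shape[0]
    result = np.zeros((n, n), dtype=A.dtype)
    power = np.eye(n, dtype=A.dtype)        # current power of A, starting at A^0 = I
    for a in coeffs:
        result = result + a * power
        power = power @ A                   # advance to the next power of A
    return result

A = np.array([[1, 2], [3, -4]])
print(poly_of_matrix([-10, 3, 1], A))       # g(x) = x^2 + 3x - 10 gives the zero matrix
```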
EXAMPLE 2.9 Suppose A = [1 2; 3 −4]. Then
A^2 = [1 2; 3 −4][1 2; 3 −4] = [7 −6; −9 22]   and   A^3 = A^2 A = [7 −6; −9 22][1 2; 3 −4] = [−11 38; 57 −106]
Suppose f(x) = 2x^2 − 3x + 5 and g(x) = x^2 + 3x − 10. Then
f(A) = 2[7 −6; −9 22] − 3[1 2; 3 −4] + 5[1 0; 0 1] = [16 −18; −27 61]
g(A) = [7 −6; −9 22] + 3[1 2; 3 −4] − 10[1 0; 0 1] = [0 0; 0 0]
Thus, A is a zero of the polynomial g(x).
2.9 Invertible (Nonsingular) Matrices
A square matrix A is said to be invertible or nonsingular if there exists a matrix B such that
AB ¼ BA ¼ I
where I is the identity matrix. Such a matrix B is unique. That is, if AB1 ¼ B1A ¼ I and AB2 ¼ B2A ¼ I,
then
B1 ¼ B1I ¼ B1ðAB2Þ ¼ ðB1AÞB2 ¼ IB2 ¼ B2
We call such a matrix B the inverse of A and denote it by A1
. Observe that the above relation is
symmetric; that is, if B is the inverse of A, then A is the inverse of B.
EXAMPLE 2.10 Suppose that A = [2 5; 1 3] and B = [3 −5; −1 2]. Then
AB = [6 − 5   −10 + 10; 3 − 3   −5 + 6] = [1 0; 0 1]   and   BA = [6 − 5   15 − 15; −2 + 2   −5 + 6] = [1 0; 0 1]
Thus, A and B are inverses.
It is known (Theorem 3.16) that AB ¼ I if and only if BA ¼ I. Thus, it is necessary to test only one
product to determine whether or not two given matrices are inverses. (See Problem 2.17.)
Now suppose A and B are invertible. Then AB is invertible and (AB)^{−1} = B^{−1} A^{−1}. More generally, if
A1, A2, ..., Ak are invertible, then their product is invertible and
(A1 A2 ... Ak)^{−1} = Ak^{−1} ... A2^{−1} A1^{−1}
the product of the inverses in the reverse order.
Inverse of a 2 × 2 Matrix
Let A be an arbitrary 2 × 2 matrix, say A = [a b; c d]. We want to derive a formula for A^{−1}, the inverse
of A. Specifically, we seek 2^2 = 4 scalars, say x1, y1, x2, y2, such that
[a b; c d][x1 x2; y1 y2] = [1 0; 0 1]   or   [ax1 + by1   ax2 + by2; cx1 + dy1   cx2 + dy2] = [1 0; 0 1]
Setting the four entries equal to the corresponding entries in the identity matrix yields four equations,
which can be partitioned into two 2 × 2 systems as follows:
ax1 + by1 = 1,   ax2 + by2 = 0
cx1 + dy1 = 0,   cx2 + dy2 = 1
Suppose we let |A| = ad − bc (called the determinant of A). Assuming |A| ≠ 0, we can solve uniquely for
the above unknowns x1, y1, x2, y2, obtaining
x1 = d/|A|,   y1 = −c/|A|,   x2 = −b/|A|,   y2 = a/|A|
Accordingly,
A^{−1} = [a b; c d]^{−1} = [d/|A|   −b/|A|; −c/|A|   a/|A|] = (1/|A|)[d −b; −c a]
In other words, when |A| ≠ 0, the inverse of a 2 × 2 matrix A may be obtained from A as follows:
(1) Interchange the two elements on the diagonal.
(2) Take the negatives of the other two elements.
(3) Multiply the resulting matrix by 1/|A| or, equivalently, divide each element by |A|.
In case |A| = 0, the matrix A is not invertible.
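The three-step recipe is short enough to code directly. The following sketch (plain Python, an illustration only) returns the inverse of a 2 × 2 matrix or reports that none exists; the printed result reproduces the matrix A of Example 2.11 below.

```python
# Inverse of [[a, b], [c, d]]: swap the diagonal, negate the off-diagonal entries,
# and divide every entry by the determinant ad - bc.
def inverse_2x2(a, b, c, d):
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible (determinant is 0)")
    return [[ d / det, -b / det],
            [-c / det,  a / det]]

print(inverse_2x2(2, 3, 4, 5))    # [[-2.5, 1.5], [2.0, -1.0]]
```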
EXAMPLE 2.11 Find the inverse of A = [2 3; 4 5] and B = [1 3; 2 6].
First evaluate |A| = 2(5) − 3(4) = 10 − 12 = −2. Because |A| ≠ 0, the matrix A is invertible and
A^{−1} = (1/−2)[5 −3; −4 2] = [−5/2   3/2; 2   −1]
Now evaluate |B| = 1(6) − 3(2) = 6 − 6 = 0. Because |B| = 0, the matrix B has no inverse.
Remark: The above property that a matrix is invertible if and only if A has a nonzero determinant
is true for square matrices of any order. (See Chapter 8.)
Inverse of an n n Matrix
Suppose A is an arbitrary n-square matrix. Finding its inverse A1
reduces, as above, to finding the
solution of a collection of n n systems of linear equations. The solution of such systems and an efficient
way of solving such a collection of systems is treated in Chapter 3.
2.10 Special Types of Square Matrices
This section describes a number of special kinds of square matrices.
Diagonal and Triangular Matrices
A square matrix D ¼ ½dij is diagonal if its nondiagonal entries are all zero. Such a matrix is sometimes
denoted by
D ¼ diagðd11; d22; . . . ; dnnÞ
where some or all the dii may be zero. For example,
3 0 0
0 7 0
0 0 2
2
4
3
5;
4 0
0 5
 
;
6
0
9
8
2
6
6
4
3
7
7
5
are diagonal matrices, which may be represented, respectively, by
diagð3; 7; 2Þ; diagð4; 5Þ; diagð6; 0; 9; 8Þ
(Observe that patterns of 0’s in the third matrix have been omitted.)
A square matrix A ¼ ½aij is upper triangular or simply triangular if all entries below the (main)
diagonal are equal to 0—that is, if aij ¼ 0 for i  j. Generic upper triangular matrices of orders 2, 3, 4 are
as follows:
a11 a12
0 a22
 
;
b11 b12 b13
b22 b23
b33
2
4
3
5;
c11 c12 c13 c14
c22 c23 c24
c33 c34
c44
2
6
6
4
3
7
7
5
(As with diagonal matrices, it is common practice to omit patterns of 0’s.)
The following theorem applies.
THEOREM 2.5: Suppose A ¼ ½aij and B ¼ ½bij are n n (upper) triangular matrices. Then
(i) A þ B, kA, AB are triangular with respective diagonals:
ða11 þ b11; . . . ; ann þ bnnÞ; ðka11; . . . ; kannÞ; ða11b11; . . . ; annbnnÞ
(ii) For any polynomial f ðxÞ, the matrix f ðAÞ is triangular with diagonal
ð f ða11Þ; f ða22Þ; . . . ; f ðannÞÞ
(iii) A is invertible if and only if each diagonal element aii 6¼ 0, and when A1
exists
it is also triangular.
A lower triangular matrix is a square matrix whose entries above the diagonal are all zero. We note
that Theorem 2.5 is true if we replace ‘‘triangular’’ by either ‘‘lower triangular’’ or ‘‘diagonal.’’
Remark: A nonempty collection A of matrices is called an algebra (of matrices) if A is closed
under the operations of matrix addition, scalar multiplication, and matrix multiplication. Clearly, the
square matrices with a given order form an algebra of matrices, but so do the scalar, diagonal, triangular,
and lower triangular matrices.
Special Real Square Matrices: Symmetric, Orthogonal, Normal
[Optional until Chapter 12]
Suppose now A is a square matrix with real entries—that is, a real square matrix. The relationship
between A and its transpose AT
yields important kinds of matrices.
(a) Symmetric Matrices
A matrix A is symmetric if A^T = A. Equivalently, A = [a_ij] is symmetric if symmetric elements (mirror
elements with respect to the diagonal) are equal—that is, if each a_ij = a_ji.
A matrix A is skew-symmetric if A^T = −A or, equivalently, if each a_ij = −a_ji. Clearly, the diagonal
elements of such a matrix must be zero, because a_ii = −a_ii implies a_ii = 0.
(Note that a matrix A must be square if A^T = A or A^T = −A.)
EXAMPLE 2.12 Let A = [2 −3 5; −3 6 7; 5 7 −8], B = [0 3 −4; −3 0 5; 4 −5 0], C = [1 0 0; 0 0 1].
(a) By inspection, the symmetric elements in A are equal, or A^T = A. Thus, A is symmetric.
(b) The diagonal elements of B are 0 and symmetric elements are negatives of each other, or B^T = −B.
Thus, B is skew-symmetric.
(c) Because C is not square, C is neither symmetric nor skew-symmetric.
(b) Orthogonal Matrices
A real matrix A is orthogonal if A^T = A^{−1}—that is, if AA^T = A^T A = I. Thus, A must necessarily be
square and invertible.
EXAMPLE 2.13 Let A = (1/9)[1 8 −4; 4 −4 −7; 8 1 4]. Multiplying A by A^T yields I; that is, AA^T = I. This means
A^T A = I, as well. Thus, A^T = A^{−1}; that is, A is orthogonal.
Now suppose A is a real orthogonal 3 × 3 matrix with rows
u1 = (a1, a2, a3),   u2 = (b1, b2, b3),   u3 = (c1, c2, c3)
Because A is orthogonal, we must have AA^T = I. Namely,
AA^T = [a1 a2 a3; b1 b2 b3; c1 c2 c3][a1 b1 c1; a2 b2 c2; a3 b3 c3] = [1 0 0; 0 1 0; 0 0 1] = I
Multiplying A by A^T and setting each entry equal to the corresponding entry in I yields the following nine
equations:
a1^2 + a2^2 + a3^2 = 1,   a1b1 + a2b2 + a3b3 = 0,   a1c1 + a2c2 + a3c3 = 0
b1a1 + b2a2 + b3a3 = 0,   b1^2 + b2^2 + b3^2 = 1,   b1c1 + b2c2 + b3c3 = 0
c1a1 + c2a2 + c3a3 = 0,   c1b1 + c2b2 + c3b3 = 0,   c1^2 + c2^2 + c3^2 = 1
Accordingly, u1·u1 = 1, u2·u2 = 1, u3·u3 = 1, and ui·uj = 0 for i ≠ j. Thus, the rows u1, u2, u3 are
unit vectors and are orthogonal to each other.
Generally speaking, vectors u1, u2; . . . ; um in Rn
are said to form an orthonormal set of vectors if the
vectors are unit vectors and are orthogonal to each other; that is,
ui·uj = 0 if i ≠ j   and   ui·uj = 1 if i = j
In other words, ui  uj ¼ dij where dij is the Kronecker delta function:
We have shown that the condition AAT
¼ I implies that the rows of A form an orthonormal set of
vectors. The condition AT
A ¼ I similarly implies that the columns of A also form an orthonormal set
of vectors. Furthermore, because each step is reversible, the converse is true.
The above results for 3 3 matrices are true in general. That is, the following theorem holds.
THEOREM 2.6: Let A be a real matrix. Then the following are equivalent:
(a) A is orthogonal.
(b) The rows of A form an orthonormal set.
(c) The columns of A form an orthonormal set.
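In floating-point arithmetic the practical test for (a)-(c) is that AA^T is close to the identity. The sketch below is an illustration assuming NumPy; the first matrix has the rotation form given in Theorem 2.7 just below.

```python
import numpy as np

def is_orthogonal(A, tol=1e-10):
    """A real square matrix is orthogonal when A @ A.T is (numerically) the identity."""
    n, m = A.shape
    return n == m and np.allclose(A @ A.T, np.eye(n), atol=tol)

theta = 0.7
R = np.array([[np.cos(theta),  np.sin(theta)],
              [-np.sin(theta), np.cos(theta)]])
print(is_orthogonal(R))                                   # True: rows are orthonormal
print(is_orthogonal(np.array([[1.0, 2.0], [0.0, 1.0]])))  # False
```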
For n ¼ 2, we have the following result (proved in Problem 2.28).
THEOREM 2.7: Let A be a real 2 × 2 orthogonal matrix. Then, for some real number θ,
A = [cos θ   sin θ; −sin θ   cos θ]   or   A = [cos θ   sin θ; sin θ   −cos θ]
(c) Normal Matrices
A real matrix A is normal if it commutes with its transpose A^T—that is, if AA^T = A^T A. If A is symmetric,
orthogonal, or skew-symmetric, then A is normal. There are also other normal matrices.
EXAMPLE 2.14 Let A = [6 −3; 3 6]. Then
AA^T = [6 −3; 3 6][6 3; −3 6] = [45 0; 0 45]   and   A^T A = [6 3; −3 6][6 −3; 3 6] = [45 0; 0 45]
Because AA^T = A^T A, the matrix A is normal.
2.11 Complex Matrices
Let A be a complex matrix—that is, a matrix with complex entries. Recall (Section 1.7) that if z = a + bi
is a complex number, then conj(z) = a − bi is its conjugate. The conjugate of a complex matrix A, written conj(A), is
the matrix obtained from A by taking the conjugate of each entry in A. That is, if A = [a_ij], then conj(A) = [b_ij],
where b_ij = conj(a_ij).
The two operations of transpose and conjugation commute for any complex matrix A, and the special
notation A^H is used for the conjugate transpose of A. That is,
A^H = (conj(A))^T = conj(A^T)
Note that if A is real, then A^H = A^T. [Some texts use A* instead of A^H.]
EXAMPLE 2.15 Let A = [2+8i   5−3i   4−7i; 6i   1−4i   3+2i]. Then A^H = [2−8i   −6i; 5+3i   1+4i; 4+7i   3−2i].
Special Complex Matrices: Hermitian, Unitary, Normal [Optional until Chapter 12]
Consider a complex matrix A. The relationship between A and its conjugate transpose AH
yields
important kinds of complex matrices (which are analogous to the kinds of real matrices described above).
A complex matrix A is said to be Hermitian or skew-Hermitian according as to whether
A^H = A   or   A^H = −A.
Clearly, A = [a_ij] is Hermitian if and only if symmetric elements are conjugate—that is, if each
a_ij = conj(a_ji)—in which case each diagonal element a_ii must be real. Similarly, if A is skew-Hermitian,
then each diagonal element a_ii must be pure imaginary (its real part is 0). (Note that A must be square if A^H = A or A^H = −A.)
A complex matrix A is unitary if A^H A = A A^H = I—that is, if
A^H = A^{−1}.
Thus, A must necessarily be square and invertible. We note that a complex matrix A is unitary if and only
if its rows (columns) form an orthonormal set relative to the dot product of complex vectors.
A complex matrix A is said to be normal if it commutes with A^H—that is, if
AA^H = A^H A
(Thus, A must be a square matrix.) This definition reduces to that for real matrices when A is real.
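All three conditions are phrased in terms of A^H, so a single helper suffices to test them. The sketch below is an illustration assuming NumPy, where .conj().T plays the role of A^H; the test matrix is the matrix C of Example 2.16 below.

```python
import numpy as np

def H(A):
    """Conjugate transpose A^H."""
    return A.conj().T

def is_hermitian(A): return np.allclose(H(A), A)
def is_unitary(A):   return np.allclose(A @ H(A), np.eye(A.shape[0]))
def is_normal(A):    return np.allclose(A @ H(A), H(A) @ A)

C = np.array([[2 + 3j, 1],
              [1j, 1 + 2j]])
print(is_hermitian(C), is_unitary(C), is_normal(C))   # False False True
```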
EXAMPLE 2.16 Consider the following complex matrices:
A = [3   1−2i   4+7i; 1+2i   4   −2i; 4−7i   2i   5],   B = (1/2)[1   −i   −1+i; i   1   1+i; 1+i   −1+i   0],   C = [2+3i   1; i   1+2i]
(a) By inspection, the diagonal elements of A are real, and the symmetric elements 1 − 2i and 1 + 2i are
conjugate, 4 + 7i and 4 − 7i are conjugate, and −2i and 2i are conjugate. Thus, A is Hermitian.
(b) Multiplying B by B^H yields I; that is, BB^H = I. This implies B^H B = I, as well. Thus, B^H = B^{−1},
which means B is unitary.
(c) To show C is normal, we evaluate CC^H and C^H C:
CC^H = [2+3i   1; i   1+2i][2−3i   −i; 1   1−2i] = [14   4−4i; 4+4i   6]
and similarly C^H C = [14   4−4i; 4+4i   6]. Because CC^H = C^H C, the complex matrix C is normal.
We note that when a matrix A is real, Hermitian is the same as symmetric, and unitary is the same as
orthogonal.
2.12 Block Matrices
Using a system of horizontal and vertical (dashed) lines, we can partition a matrix A into submatrices
called blocks (or cells) of A. Clearly a given matrix may be divided into blocks in different ways. For
example,
1 2 0 1 3
2 3 5 7 2
3 1 4 5 9
4 6 3 1 8
2
6
6
4
3
7
7
5;
1 2 0 1 3
2 3 5 7 2
3 1 4 5 9
4 6 3 1 8
2
6
6
4
3
7
7
5;
1 2 0 1 3
2 3 5 7 2
3 1 4 5 9
4 6 3 1 8
2
6
6
4
3
7
7
5
The convenience of the partition of matrices, say A and B, into blocks is that the result of operations on A
and B can be obtained by carrying out the computation with the blocks, just as if they were the actual
elements of the matrices. This is illustrated below, where the notation A ¼ ½Aij will be used for a block
matrix A with blocks Aij.
Suppose that A ¼ ½Aij and B ¼ ½Bij are block matrices with the same numbers of row and column
blocks, and suppose that corresponding blocks have the same size. Then adding the corresponding blocks
of A and B also adds the corresponding elements of A and B, and multiplying each block of A by a scalar
k multiplies each element of A by k. Thus,
A þ B ¼
A11 þ B11 A12 þ B12 . . . A1n þ B1n
A21 þ B21 A22 þ B22 . . . A2n þ B2n
. . . . . . . . . . . .
Am1 þ Bm1 Am2 þ Bm2 . . . Amn þ Bmn
2
6
6
6
4
3
7
7
7
5
and
kA ¼
kA11 kA12 . . . kA1n
kA21 kA22 . . . kA2n
. . . . . . . . . . . .
kAm1 kAm2 . . . kAmn
2
6
6
4
3
7
7
5
The case of matrix multiplication is less obvious, but still true. That is, suppose that U ¼ ½Uik and
V ¼ ½Vkj are block matrices such that the number of columns of each block Uik is equal to the number of
rows of each block Vkj. (Thus, each product UikVkj is defined.) Then
UV ¼
W11 W12 . . . W1n
W21 W22 . . . W2n
. . . . . . . . . . . .
Wm1 Wm2 . . . Wmn
2
6
6
4
3
7
7
5; where Wij ¼ Ui1V1j þ Ui2V2j þ    þ UipVpj
The proof of the above formula for UV is straightforward but detailed and lengthy. It is left as an exercise
(Problem 2.85).
Square Block Matrices
Let M be a block matrix. Then M is called a square block matrix if
(i) M is a square matrix.
(ii) The blocks form a square matrix.
(iii) The diagonal blocks are also square matrices.
The latter two conditions will occur if and only if there are the same number of horizontal and vertical
lines and they are placed symmetrically.
Consider the following two block matrices:
A ¼
1 2 3 4 5
1 1 1 1 1
9 8 7 6 5
4 4 4 4 4
3 5 3 5 3
2
6
6
6
6
4
3
7
7
7
7
5
and B ¼
1 2 3 4 5
1 1 1 1 1
9 8 7 6 5
4 4 4 4 4
3 5 3 5 3
2
6
6
6
6
4
3
7
7
7
7
5
The block matrix A is not a square block matrix, because the second and third diagonal blocks are not
square. On the other hand, the block matrix B is a square block matrix.
Block Diagonal Matrices
Let M ¼ ½Aij be a square block matrix such that the nondiagonal blocks are all zero matrices; that is,
Aij ¼ 0 when i 6¼ j. Then M is called a block diagonal matrix. We sometimes denote such a block
diagonal matrix by writing
M ¼ diagðA11; A22; . . . ; ArrÞ or M ¼ A11 A22    Arr
The importance of block diagonal matrices is that the algebra of the block matrix is frequently reduced to
the algebra of the individual blocks. Specifically, suppose f ðxÞ is a polynomial and M is the above block
diagonal matrix. Then f ðMÞ is a block diagonal matrix, and
f ðMÞ ¼ diagð f ðA11Þ; f ðA22Þ; . . . ; f ðArrÞÞ
Also, M is invertible if and only if each Aii is invertible, and, in such a case, M1
is a block diagonal
matrix, and
M1
¼ diagðA1
11 ; A1
22 ; . . . ; A1
rr Þ
Analogously, a square block matrix is called a block upper triangular matrix if the blocks below the
diagonal are zero matrices and a block lower triangular matrix if the blocks above the diagonal are zero
matrices.
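Both rules above — apply a polynomial block by block, invert block by block — are easy to demonstrate. The sketch below is an illustration assuming NumPy; block_diag is a small helper written here, not a library routine.

```python
import numpy as np

def block_diag(*blocks):
    """Assemble a block diagonal matrix from square blocks."""
    n = sum(b.shape[0] for b in blocks)
    M = np.zeros((n, n))
    pos = 0
    for b in blocks:
        k = b.shape[0]
        M[pos:pos + k, pos:pos + k] = b
        pos += k
    return M

A11 = np.array([[1.0, 2.0], [3.0, 4.0]])
A22 = np.array([[2.0, 0.0], [1.0, 3.0]])
M = block_diag(A11, A22)

# Squaring block-wise agrees with squaring M itself ...
print(np.allclose(M @ M, block_diag(A11 @ A11, A22 @ A22)))            # True
# ... and inverting block-wise agrees with inverting M (each block invertible).
print(np.allclose(np.linalg.inv(M), block_diag(np.linalg.inv(A11),
                                               np.linalg.inv(A22))))   # True
```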
EXAMPLE 2.17 Determine which of the following square block matrices are upper diagonal, lower
diagonal, or diagonal:
A ¼
1 2 0
3 4 5
0 0 6
2
4
3
5; B ¼
1 0 0 0
2 3 4 0
5 0 6 0
0 7 8 9
2
6
6
4
3
7
7
5; C ¼
1 0 0
0 2 3
0 4 5
2
4
3
5; D ¼
1 2 0
3 4 5
0 6 7
2
4
3
5
(a) A is upper triangular because the block below the diagonal is a zero block.
(b) B is lower triangular because all blocks above the diagonal are zero blocks.
(c) C is diagonal because the blocks above and below the diagonal are zero blocks.
(d) D is neither upper triangular nor lower triangular. Also, no other partitioning of D will make it into
either a block upper triangular matrix or a block lower triangular matrix.
SOLVED PROBLEMS
Matrix Addition and Scalar Multiplication
2.1 Given A ¼
1 2 3
4 5 6
 
and B ¼
3 0 2
7 1 8
 
, find:
(a) A þ B, (b) 2A  3B.
(a) Add the corresponding elements:
A þ B ¼
1 þ 3 2 þ 0 3 þ 2
4  7 5 þ 1 6 þ 8
 
¼
4 2 5
3 6 2
 
(b) First perform the scalar multiplication and then a matrix addition:
2A  3B ¼
2 4 6
8 10 12
 
þ
9 0 6
21 3 24
 
¼
7 4 0
29 7 36
 
(Note that we multiply B by 3 and then add, rather than multiplying B by 3 and subtracting. This usually
prevents errors.)
2.2. Find x; y; z; t where 3
x y
z t
 
¼
x 6
1 2t
 
þ
4 x þ y
z þ t 3
 
:
Write each side as a single equation:
3x 3y
3z 3t
 
¼
x þ 4 x þ y þ 6
z þ t  1 2t þ 3
 
Set corresponding entries equal to each other to obtain the following system of four equations:
3x ¼ x þ 4; 3y ¼ x þ y þ 6; 3z ¼ z þ t  1; 3t ¼ 2t þ 3
or 2x ¼ 4; 2y ¼ 6 þ x; 2z ¼ t  1; t ¼ 3
The solution is x ¼ 2, y ¼ 4, z ¼ 1, t ¼ 3.
2.3. Prove Theorem 2.1 (i) and (v): (i) ðA þ BÞ þ C ¼ A þ ðB þ CÞ, (v) kðA þ BÞ ¼ kA þ kB.
Suppose A ¼ ½aij, B ¼ ½bij, C ¼ ½cij. The proof reduces to showing that corresponding ij-entries
in each side of each matrix equation are equal. [We prove only (i) and (v), because the other parts
of Theorem 2.1 are proved similarly.]
(i) The ij-entry of A þ B is aij þ bij; hence, the ij-entry of ðA þ BÞ þ C is ðaij þ bijÞ þ cij. On the other hand,
the ij-entry of B þ C is bij þ cij; hence, the ij-entry of A þ ðB þ CÞ is aij þ ðbij þ cijÞ. However, for
scalars in K,
ðaij þ bijÞ þ cij ¼ aij þ ðbij þ cijÞ
Thus, ðA þ BÞ þ C and A þ ðB þ CÞ have identical ij-entries. Therefore, ðA þ BÞ þ C ¼ A þ ðB þ CÞ.
(v) The ij-entry of A þ B is aij þ bij; hence, kðaij þ bijÞ is the ij-entry of kðA þ BÞ. On the other hand, the ij-
entries of kA and kB are kaij and kbij, respectively. Thus, kaij þ kbij is the ij-entry of kA þ kB. However,
for scalars in K,
kðaij þ bijÞ ¼ kaij þ kbij
Thus, kðA þ BÞ and kA þ kB have identical ij-entries. Therefore, kðA þ BÞ ¼ kA þ kB.
Matrix Multiplication
2.4. Calculate: (a) ½8; 4; 5
3
2
1
2
4
3
5, (b) ½6; 1; 7; 5
4
9
3
2
2
6
6
4
3
7
7
5, (c) ½3; 8; 2; 4
5
1
6
2
4
3
5
(a) Multiply the corresponding entries and add:
½8; 4; 5
3
2
1
2
4
3
5 ¼ 8ð3Þ þ ð4Þð2Þ þ 5ð1Þ ¼ 24  8  5 ¼ 11
(b) Multiply the corresponding entries and add:
½6; 1; 7; 5
4
9
3
2
2
6
6
6
4
3
7
7
7
5
¼ 24 þ 9  21 þ 10 ¼ 22
(c) The product is not defined when the row matrix and the column matrix have different numbers of elements.
2.5. Let ðr sÞ denote an r s matrix. Find the sizes of those matrix products that are defined:
(a) ð2 3Þð3 4Þ; (c) ð1 2Þð3 1Þ; (e) ð4 4Þð3 3Þ
(b) ð4 1Þð1 2Þ, (d) ð5 2Þð2 3Þ, (f) ð2 2Þð2 4Þ
In each case, the product is defined if the inner numbers are equal, and then the product will have the size of
the outer numbers in the given order.
(a) 2 4, (c) not defined, (e) not defined
(b) 4 2, (d) 5 3, (f) 2 4
2.6. Let A ¼
1 3
2 1
 
and B ¼
2 0 4
3 2 6
 
. Find: (a) AB, (b) BA.
(a) Because A is a 2 2 matrix and B a 2 3 matrix, the product AB is defined and is a 2 3 matrix. To
obtain the entries in the first row of AB, multiply the first row ½1; 3 of A by the columns
2
3
 
;
0
2
 
;
4
6
 
of B, respectively, as follows:
AB ¼
1 3
2 1
 
2 0 4
3 2 6
 
¼
2 þ 9 0  6 4 þ 18
 
¼
11 6 14
 
To obtain the entries in the second row of AB, multiply the second row ½2; 1 of A by the columns of B:
AB ¼
1 3
2 1
 
2 0 4
3 2 6
 
¼
11 6 14
4  3 0 þ 2 8  6
 
Thus,
AB ¼
11 6 14
1 2 14
 
:
(b) The size of B is 2 3 and that of A is 2 2. The inner numbers 3 and 2 are not equal; hence, the product
BA is not defined.
2.7. Find AB, where A ¼
2 3 1
4 2 5
 
and B ¼
2 1 0 6
1 3 5 1
4 1 2 2
2
4
3
5.
Because A is a 2 3 matrix and B a 3 4 matrix, the product AB is defined and is a 2 4 matrix. Multiply
the rows of A by the columns of B to obtain
AB ¼
4 þ 3  4 2 þ 9  1 0  15 þ 2 12 þ 3  2
8  2 þ 20 4  6 þ 5 0 þ 10  10 24  2 þ 10
 
¼
3 6 13 13
26 5 0 32
 
:
2.8. Find: (a)
1 6
3 5
 
2
7
 
, (b)
2
7
 
1 6
3 5
 
, (c) ½2; 7
1 6
3 5
 
.
(a) The first factor is 2 2 and the second is 2 1, so the product is defined as a 2 1 matrix:
1 6
3 5
 
2
7
 
¼
2  42
6  35
 
¼
40
41
 
(b) The product is not defined, because the first factor is 2 1 and the second factor is 2 2.
(c) The first factor is 1 2 and the second factor is 2 2, so the product is defined as a 1 2 (row) matrix:
½2; 7
1 6
3 5
 
¼ ½2 þ 21; 12  35 ¼ ½23; 23
2.9. Clearly, 0A ¼ 0 and A0 ¼ 0, where the 0’s are zero matrices (with possibly different sizes). Find
matrices A and B with no zero entries such that AB ¼ 0.
Let A ¼
1 2
2 4
 
and B ¼
6 2
3 1
 
. Then AB ¼
0 0
0 0
 
.
2.10. Prove Theorem 2.2(i): ðABÞC ¼ AðBCÞ.
Let A ¼ ½aij, B ¼ ½bjk, C ¼ ½ckl, and let AB ¼ S ¼ ½sik, BC ¼ T ¼ ½tjl. Then
sik ¼
P
m
j¼1
aijbjk and tjl ¼
P
n
k¼1
bjkckl
Multiplying S ¼ AB by C, the il-entry of ðABÞC is
si1c1l þ si2c2l þ    þ sincnl ¼
P
n
k¼1
sikckl ¼
P
n
k¼1
P
m
j¼1
ðaijbjkÞckl
On the other hand, multiplying A by T ¼ BC, the il-entry of AðBCÞ is
ai1t1l þ ai2t2l þ    þ aintnl ¼
P
m
j¼1
aijtjl ¼
P
m
j¼1
P
n
k¼1
aijðbjkcklÞ
The above sums are equal; that is, corresponding elements in ðABÞC and AðBCÞ are equal. Thus,
ðABÞC ¼ AðBCÞ.
2.11. Prove Theorem 2.2(ii): AðB þ CÞ ¼ AB þ AC.
Let A ¼ ½aij, B ¼ ½bjk, C ¼ ½cjk, and let D ¼ B þ C ¼ ½djk, E ¼ AB ¼ ½eik, F ¼ AC ¼ ½ fik. Then
djk ¼ bjk þ cjk; eik ¼
P
m
j¼1
aijbjk; fik ¼
P
m
j¼1
aijcjk
Thus, the ik-entry of the matrix AB þ AC is
eik þ fik ¼
P
m
j¼1
aijbjk þ
P
m
j¼1
aijcjk ¼
P
m
j¼1
aijðbjk þ cjkÞ
On the other hand, the ik-entry of the matrix AD ¼ AðB þ CÞ is
ai1d1k þ ai2d2k þ    þ aimdmk ¼
P
m
j¼1
aijdjk ¼
P
m
j¼1
aijðbjk þ cjkÞ
Thus, AðB þ CÞ ¼ AB þ AC, because the corresponding elements are equal.
Transpose
2.12. Find the transpose of each matrix:
A ¼
1 2 3
7 8 9
 
; B ¼
1 2 3
2 4 5
3 5 6
2
4
3
5; C ¼ ½1; 3; 5; 7; D ¼
2
4
6
2
4
3
5
Rewrite the rows of each matrix as columns to obtain the transpose of the matrix:
AT
¼
1 7
2 8
3 9
2
4
3
5; BT
¼
1 2 3
2 4 5
3 5 6
2
4
3
5; CT
¼
1
3
5
7
2
6
6
4
3
7
7
5; DT
¼ ½2; 4; 6
(Note that BT
¼ B; such a matrix is said to be symmetric. Note also that the transpose of the row vector C is a
column vector, and the transpose of the column vector D is a row vector.)
2.13. Prove Theorem 2.3(iv): ðABÞT
¼ BT
AT
.
Let A ¼ ½aik and B ¼ ½bkj. Then the ij-entry of AB is
ai1b1j þ ai2b2j þ    þ aimbmj
This is the ji-entry (reverse order) of ðABÞT
. Now column j of B becomes row j of BT
, and row i of A becomes
column i of AT
. Thus, the ij-entry of BT
AT
is
½b1j; b2j; . . . ; bmj½ai1; ai2; . . . ; aim
T
¼ b1jai1 þ b2jai2 þ    þ bmjaim
Thus, ðABÞT
¼ BT
AT
on because the corresponding entries are equal.
Square Matrices
2.14. Find the diagonal and trace of each matrix:
(a) A ¼
1 3 6
2 5 8
4 2 9
2
4
3
5, (b) B ¼
2 4 8
3 7 9
5 0 2
2
4
3
5, (c) C ¼
1 2 3
4 5 6
 
.
(a) The diagonal of A consists of the elements from the upper left corner of A to the lower right corner of A or,
in other words, the elements a11, a22, a33. Thus, the diagonal of A consists of the numbers 1; 5, and 9. The
trace of A is the sum of the diagonal elements. Thus,
trðAÞ ¼ 1  5 þ 9 ¼ 5
(b) The diagonal of B consists of the numbers 2; 7, and 2. Hence,
trðBÞ ¼ 2  7 þ 2 ¼ 3
(c) The diagonal and trace are only defined for square matrices.
2.15. Let A ¼
1 2
4 3
 
, and let f ðxÞ ¼ 2x3
 4x þ 5 and gðxÞ ¼ x2
þ 2x þ 11. Find
(a) A2
, (b) A3
, (c) f ðAÞ, (d) gðAÞ.
(a) A2
¼ AA ¼
1 2
4 3
 
1 2
4 3
 
¼
1 þ 8 2  6
4  12 8 þ 9
 
¼
9 4
8 17
 
(b) A3
¼ AA2
¼
1 2
4 3
 
9 4
8 17
 
¼
9  16 4 þ 34
36 þ 24 16  51
 
¼
7 30
60 67
 
(c) First substitute A for x and 5I for the constant in f ðxÞ, obtaining
f ðAÞ ¼ 2A3
 4A þ 5I ¼ 2
7 30
60 67
 
 4
1 2
4 3
 
þ 5
1 0
0 1
 
Now perform the scalar multiplication and then the matrix addition:
f ðAÞ ¼
14 60
120 134
 
þ
4 8
16 12
 
þ
5 0
0 5
 
¼
13 52
104 117
 
(d) Substitute A for x and 11I for the constant in gðxÞ, and then calculate as follows:
gðAÞ ¼ A2
þ 2A  11I ¼
9 4
8 17
 
þ 2
1 2
4 3
 
 11
1 0
0 1
 
¼
9 4
8 17
 
þ
2 4
8 6
 
þ
11 0
0 11
 
¼
0 0
0 0
 
Because gðAÞ is the zero matrix, A is a root of the polynomial gðxÞ.
2.16. Let A ¼
1 3
4 3
 
. (a) Find a nonzero column vector u ¼
x
y
 
such that Au ¼ 3u.
(b) Describe all such vectors.
(a) First set up the matrix equation Au ¼ 3u, and then write each side as a single matrix (column vector) as
follows:
1 3
4 3
 
x
y
 
¼ 3
x
y
 
; and then
x þ 3y
4x  3y
 
¼
3x
3y
 
Set the corresponding elements equal to each other to obtain a system of equations:
x þ 3y ¼ 3x
4x  3y ¼ 3y
or
2x  3y ¼ 0
4x  6y ¼ 0
or 2x  3y ¼ 0
The system reduces to one nondegenerate linear equation in two unknowns, and so has an infinite number
of solutions. To obtain a nonzero solution, let, say, y ¼ 2; then x ¼ 3. Thus, u ¼ ð3; 2ÞT
is a desired
nonzero vector.
(b) To find the general solution, set y ¼ a, where a is a parameter. Substitute y ¼ a into 2x  3y ¼ 0 to obtain
x ¼ 3
2 a. Thus, u ¼ ð3
2 a; aÞT
represents all such solutions.
Invertible Matrices, Inverses
2.17. Show that A ¼
1 0 2
2 1 3
4 1 8
2
4
3
5 and B ¼
11 2 2
4 0 1
6 1 1
2
4
3
5 are inverses.
Compute the product AB, obtaining
AB ¼
11 þ 0 þ 12 2 þ 0  2 2 þ 0  2
22 þ 4 þ 18 4 þ 0  3 4  1  3
44  4 þ 48 8 þ 0  8 8 þ 1  8
2
4
3
5 ¼
1 0 0
0 1 0
0 0 1
2
4
3
5 ¼ I
Because AB ¼ I, we can conclude (Theorem 3.16) that BA ¼ I. Accordingly, A and B are inverses.
2.18. Find the inverse, if possible, of each matrix:
(a) A ¼
5 3
4 2
 
; (b) B ¼
2 3
1 3
 
; (c)
2 6
3 9
 
:
Use the formula for the inverse of a 2 2 matrix appearing in Section 2.9.
(a) First find jAj ¼ 5ð2Þ  3ð4Þ ¼ 10  12 ¼ 2. Next interchange the diagonal elements, take the negatives
of the nondiagonal elements, and multiply by 1=jAj:
A1
¼ 
1
2
2 3
4 5
 
¼
1 3
2
2  5
2
 #
(b) First find jBj ¼ 2ð3Þ  ð3Þð1Þ ¼ 6 þ 3 ¼ 9. Next interchange the diagonal elements, take the negatives
of the nondiagonal elements, and multiply by 1=jBj:
B1
¼
1
9
3 3
1 2
 
¼
1
3
1
3
 1
9
2
9
 #
(c) First find jCj ¼ 2ð9Þ  6ð3Þ ¼ 18  18 ¼ 0. Because jCj ¼ 0; C has no inverse.
2.19. Let A ¼
1 1 1
0 1 2
1 2 4
2
6
6
4
3
7
7
5. Find A1
¼
x1 x2 x3
y1 y2 y3
z1 z2 z3
2
4
3
5.
Multiplying A by A1
and setting the nine entries equal to the nine entries of the identity matrix I yields the
following three systems of three equations in three of the unknowns:
x1 þ y1 þ z1 ¼ 1 x2 þ y2 þ z2 ¼ 0 x3 þ y3 þ z3 ¼ 0
y1 þ 2z1 ¼ 0 y2 þ 2z2 ¼ 1 y3 þ 2z3 ¼ 0
x1 þ 2y1 þ 4z1 ¼ 0 x2 þ 2y2 þ 4z2 ¼ 0 x3 þ 2y3 þ 4z3 ¼ 1
[Note that A is the coefficient matrix for all three systems.]
Solving the three systems for the nine unknowns yields
x1 ¼ 0; y1 ¼ 2; z1 ¼ 1; x2 ¼ 2; y2 ¼ 3; z2 ¼ 1; x3 ¼ 1; y3 ¼ 2; z3 ¼ 1
Thus; A1
¼
0 2 1
2 3 2
1 1 1
2
6
4
3
7
5
(Remark: Chapter 3 gives an efficient way to solve the three systems.)
2.20. Let A and B be invertible matrices (with the same size). Show that AB is also invertible and
ðABÞ1
¼ B1
A1
. [Thus, by induction, ðA1A2 . . . AmÞ1
¼ A1
m . . . A1
2 A1
1 .]
Using the associativity of matrix multiplication, we get
ðABÞðB1
A1
Þ ¼ AðBB1
ÞA1
¼ AIA1
¼ AA1
¼ I
ðB1
A1
ÞðABÞ ¼ B1
ðA1
AÞB ¼ A1
IB ¼ B1
B ¼ I
Thus, ðABÞ1
¼ B1
A1
.
Diagonal and Triangular Matrices
2.21. Write out the diagonal matrices A ¼ diagð4; 3; 7Þ, B ¼ diagð2; 6Þ, C ¼ diagð3; 8; 0; 5Þ.
Put the given scalars on the diagonal and 0’s elsewhere:
A ¼
4 0 0
0 3 0
0 0 7
2
4
3
5; B ¼
2 0
0 6
 
; C ¼
3
8
0
5
2
6
6
4
3
7
7
5
2.22. Let A ¼ diagð2; 3; 5Þ and B ¼ diagð7; 0; 4Þ. Find
(a) AB, A2
, B2
; (b) f ðAÞ, where f ðxÞ ¼ x2
þ 3x  2; (c) A1
and B1
.
(a) The product matrix AB is a diagonal matrix obtained by multiplying corresponding diagonal entries; hence,
AB ¼ diagð2ð7Þ; 3ð0Þ; 5ð4ÞÞ ¼ diagð14; 0; 20Þ
Thus, the squares A2
and B2
are obtained by squaring each diagonal entry; hence,
A2
¼ diagð22
; 32
; 52
Þ ¼ diagð4; 9; 25Þ and B2
¼ diagð49; 0; 16Þ
(b) f ðAÞ is a diagonal matrix obtained by evaluating f ðxÞ at each diagonal entry. We have
f ð2Þ ¼ 4 þ 6  2 ¼ 8; f ð3Þ ¼ 9 þ 9  2 ¼ 16; f ð5Þ ¼ 25 þ 15  2 ¼ 38
Thus, f ðAÞ ¼ diagð8; 16; 38Þ.
(c) The inverse of a diagonal matrix is a diagonal matrix obtained by taking the inverse (reciprocal)
of each diagonal entry. Thus, A1
¼ diagð1
2 ; 1
3 ; 1
5Þ, but B has no inverse because there is a 0 on the
diagonal.
2.23. Find a 2 2 matrix A such that A2
is diagonal but not A.
Let A ¼
1 2
3 1
 
. Then A2
¼
7 0
0 7
 
, which is diagonal.
2.24. Find an upper triangular matrix A such that A3
¼
8 57
0 27
 
.
Set A ¼
x y
0 z
 
. Then x3
¼ 8, so x ¼ 2; and z3
¼ 27, so z ¼ 3. Next calculate A3
using x ¼ 2 and y ¼ 3:
A2
¼
2 y
0 3
 
2 y
0 3
 
¼
4 5y
0 9
 
and A3
¼
2 y
0 3
 
4 5y
0 9
 
¼
8 19y
0 27
 
Thus, 19y ¼ 57, or y ¼ 3. Accordingly, A ¼
2 3
0 3
 
.
2.25. Let A ¼ ½aij and B ¼ ½bij be upper triangular matrices. Prove that AB is upper triangular with
diagonal a11b11, a22b22; . . . ; annbnn.
Let AB ¼ ½cij. Then cij ¼
Pn
k¼1 aikbkj and cii ¼
Pn
k¼1 aikbki. Suppose i  j. Then, for any k, either i  k or
k  j, so that either aik ¼ 0 or bkj ¼ 0. Thus, cij ¼ 0, and AB is upper triangular. Suppose i ¼ j. Then, for
k  i, we have aik ¼ 0; and, for k  i, we have bki ¼ 0. Hence, cii ¼ aiibii, as claimed. [This proves one part of
Theorem 2.5(i); the statements for A þ B and kA are left as exercises.]
Special Real Matrices: Symmetric and Orthogonal
2.26. Determine whether or not each of the following matrices is symmetric—that is, AT
¼ A—or
skew-symmetric—that is, AT
¼ A:
(a) A ¼
5 7 1
7 8 2
1 2 4
2
4
3
5; (b) B ¼
0 4 3
4 0 5
3 5 0
2
4
3
5; (c) C ¼
0 0 0
0 0 0
 
(a) By inspection, the symmetric elements (mirror images in the diagonal) are 7 and 7, 1 and 1, 2 and 2.
Thus, A is symmetric, because symmetric elements are equal.
(b) By inspection, the diagonal elements are all 0, and the symmetric elements, 4 and 4, 3 and 3, and 5 and
5, are negatives of each other. Hence, B is skew-symmetric.
(c) Because C is not square, C is neither symmetric nor skew-symmetric.
2.27. Suppose B ¼
4 x þ 2
2x  3 x þ 1
 
is symmetric. Find x and B.
Set the symmetric elements x þ 2 and 2x  3 equal to each other, obtaining 2x  3 ¼ x þ 2 or x ¼ 5.
Hence, B ¼
4 7
7 6
 
.
2.28. Let A be an arbitrary 2 2 (real) orthogonal matrix.
(a) Prove: If ða; bÞ is the first row of A, then a2
þ b2
¼ 1 and
A ¼
a b
b a
 
or A ¼
a b
b a
 
:
(b) Prove Theorem 2.7: For some real number y,
A ¼
cos y sin y
 sin y cos y
 
or A ¼
cos y sin y
sin y  cos y
 
(a) Suppose ðx; yÞ is the second row of A. Because the rows of A form an orthonormal set, we get
a2
þ b2
¼ 1; x2
þ y2
¼ 1; ax þ by ¼ 0
Similarly, the columns form an orthogonal set, so
a2
þ x2
¼ 1; b2
þ y2
¼ 1; ab þ xy ¼ 0
Therefore, x2
¼ 1  a2
¼ b2
, whence x ¼ b:
Case (i): x ¼ b. Then bða þ yÞ ¼ 0, so y ¼ a.
Case (ii): x ¼ b. Then bðy  aÞ ¼ 0, so y ¼ a.
This means, as claimed,
A ¼
a b
b a
 
or A ¼
a b
b a
 
(b) Because a2
þ b2
¼ 1, we have 1  a  1. Let a ¼ cos y. Then b2
¼ 1  cos2
y, so b ¼ sin y. This proves
the theorem.
2.29. Find a 2 2 orthogonal matrix A whose first row is a (positive) multiple of ð3; 4Þ.
Normalize ð3; 4Þ to get ð3
5 ; 4
5Þ. Then, by Problem 2.28,
A ¼
3
5
4
5
 4
5
3
5
 #
or A ¼
3
5
4
5
4
5  3
5
 #
:
2.30. Find a 3 3 orthogonal matrix P whose first two rows are multiples of u1 ¼ ð1; 1; 1Þ and
u2 ¼ ð0; 1; 1Þ, respectively. (Note that, as required, u1 and u2 are orthogonal.)
First find a nonzero vector u3 orthogonal to u1 and u2; say (cross product) u3 ¼ u1 u2 ¼ ð2; 1; 1Þ. Let A be
the matrix whose rows are u1; u2; u3; and let P be the matrix obtained from A by normalizing the rows of A. Thus,
A ¼
1 1 1
0 1 1
2 1 1
2
6
4
3
7
5 and P ¼
1=
ffiffiffi
3
p
1=
ffiffiffi
3
p
1=
ffiffiffi
3
p
0 1=
ffiffiffi
2
p
1=
ffiffiffi
2
p
2=
ffiffiffi
6
p
1=
ffiffiffi
6
p
1=
ffiffiffi
6
p
2
6
6
6
4
3
7
7
7
5
Complex Matrices: Hermitian and Unitary Matrices
2.31. Find AH
where (a) A ¼
3  5i 2 þ 4i
6 þ 7i 1 þ 8i
 
, (b) A ¼
2  3i 5 þ 8i
4 3  7i
6  i 5i
2
4
3
5
Recall that A^H = (conj(A))^T, the conjugate transpose of A. Thus,
(a) AH
¼
3 þ 5i 6  7i
2  4i 1  8i
 
, (b) AH
¼
2 þ 3i 4 6 þ i
5  8i 3 þ 7i 5i
 
2.32. Show that A ¼
1
3  2
3 i 2
3 i
 2
3 i  1
3  2
3 i
 #
is unitary.
The rows of A form an orthonormal set:
1
3

2
3
i;
2
3
i
 

1
3

2
3
i;
2
3
i
 
¼
1
9
þ
4
9
 
þ
4
9
¼ 1
1
3

2
3
i;
2
3
i
 
 
2
3
i; 
1
3

2
3
i
 
¼
2
9
i þ
4
9
 
þ 
2
9
i 
4
9
 
¼ 0

2
3
i; 
1
3

2
3
i
 
 
2
3
i; 
1
3

2
3
i
 
¼
4
9
þ
1
9
þ
4
9
 
¼ 1
Thus, A is unitary.
2.33. Prove the complex analogue of Theorem 2.6: Let A be a complex matrix. Then the following are equivalent: (i) A is unitary. (ii) The rows of A form an orthonormal set. (iii) The columns of A form an orthonormal set.
(The proof is almost identical to the proof on page 37 for the case when A is a 3 × 3 real matrix.)
First recall that the vectors u1, u2, ..., un in C^n form an orthonormal set if they are unit vectors and are orthogonal to each other, where the dot product in C^n is defined by
    (a1, a2, ..., an) · (b1, b2, ..., bn) = a1 b̄1 + a2 b̄2 + ... + an b̄n
Suppose A is unitary, and R1, R2, ..., Rn are its rows. Then R̄1^T, R̄2^T, ..., R̄n^T are the columns of A^H. Let AA^H = [cij]. By matrix multiplication, cij = Ri R̄j^T = Ri · Rj. Because A is unitary, we have AA^H = I. Multiplying A by A^H and setting each entry cij equal to the corresponding entry in I yields the following n^2 equations:
    R1 · R1 = 1,  R2 · R2 = 1,  ...,  Rn · Rn = 1,    and    Ri · Rj = 0  for  i ≠ j
Thus, the rows of A are unit vectors and are orthogonal to each other; hence, they form an orthonormal set of vectors. The condition A^H A = I similarly shows that the columns of A also form an orthonormal set of vectors. Furthermore, because each step is reversible, the converse is true. This proves the theorem.
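As a quick numerical illustration of this theorem, one can verify that the matrix of Problem 2.32 satisfies AA^H = I and A^H A = I. The NumPy sketch below assumes nothing beyond the entries given above.

```python
import numpy as np

A = np.array([[1/3 - 2j/3,  2j/3],
              [-2j/3,      -1/3 - 2j/3]])

AH = A.conj().T                           # the conjugate transpose A^H
print(np.allclose(A @ AH, np.eye(2)))     # True: the rows form an orthonormal set
print(np.allclose(AH @ A, np.eye(2)))     # True: the columns form an orthonormal set
```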
Block Matrices
2.34. Consider the following block matrices (which are partitions of the same matrix):
    [1  2  0  1  3]
    [2  3  5  7  2]
    [3  1  4  5  9]
(a) partitioned by a horizontal line below the second row and vertical lines after the second and fourth columns;
(b) partitioned by horizontal lines below the first and second rows and a vertical line after the third column.
Find the size of each block matrix and also the size of each block.
(a) The block matrix has two rows of matrices and three columns of matrices; hence, its size is 2 × 3. The block sizes are 2 × 2, 2 × 2, and 2 × 1 for the first row, and 1 × 2, 1 × 2, and 1 × 1 for the second row.
(b) The size of the block matrix is 3 × 2, and the block sizes are 1 × 3 and 1 × 2 for each of the three rows.
2.35. Compute AB using block multiplication, where
    A = [1, 2, 1; 3, 4, 0; 0, 0, 2]    and    B = [1, 2, 3, 1; 4, 5, 6, 1; 0, 0, 0, 1]
Here A = [E, F; 0_{1×2}, G] and B = [R, S; 0_{1×3}, T], where E, F, G, R, S, T are the given blocks (E and R are the upper left 2 × 2 and 2 × 3 blocks, F and S the upper right columns, G = [2], T = [1]), and 0_{1×2} and 0_{1×3} are zero matrices of the indicated sizes. Hence,
    AB = [ER, ES + FT; 0_{1×3}, GT],    where    ER = [9, 12, 15; 19, 26, 33],    ES + FT = [3; 7] + [1; 0] = [4; 7],    GT = [2]
so
    AB = [9, 12, 15, 4; 19, 26, 33, 7; 0, 0, 0, 2]
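Block multiplication as in Problem 2.35 can be mirrored with NumPy slices. The sketch below recomputes AB block by block and compares the result with the ordinary product; the block boundaries are simply those of the problem.

```python
import numpy as np

A = np.array([[1, 2, 1], [3, 4, 0], [0, 0, 2]])
B = np.array([[1, 2, 3, 1], [4, 5, 6, 1], [0, 0, 0, 1]])

E, F, G = A[:2, :2], A[:2, 2:], A[2:, 2:]        # blocks of A
R, S, T = B[:2, :3], B[:2, 3:], B[2:, 3:]        # blocks of B

top = np.hstack([E @ R, E @ S + F @ T])                       # [ER  ES+FT]
bottom = np.hstack([np.zeros((1, 3), dtype=int), G @ T])      # [0   GT]
AB_blocks = np.vstack([top, bottom])

print(np.array_equal(AB_blocks, A @ B))          # True
```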
2.36. Let M = diag(A, B, C), where A = [1, 2; 3, 4], B = [5], C = [1, 3; 5, 7]. Find M^2.
Because M is block diagonal, square each block:
    A^2 = [7, 10; 15, 22],    B^2 = [25],    C^2 = [16, 24; 40, 64]
so
    M^2 = diag(A^2, B^2, C^2) = [7, 10, 0, 0, 0; 15, 22, 0, 0, 0; 0, 0, 25, 0, 0; 0, 0, 0, 16, 24; 0, 0, 0, 40, 64]
Miscellaneous Problem
2.37. Let f(x) and g(x) be polynomials and let A be a square matrix. Prove
(a) (f + g)(A) = f(A) + g(A),  (b) (f·g)(A) = f(A)g(A),  (c) f(A)g(A) = g(A)f(A).
Suppose f(x) = Σ_{i=1}^{r} ai x^i and g(x) = Σ_{j=1}^{s} bj x^j.
(a) We can assume r = s = n by adding powers of x with 0 as their coefficients. Then
    f(x) + g(x) = Σ_{i=1}^{n} (ai + bi) x^i
Hence,
    (f + g)(A) = Σ_{i=1}^{n} (ai + bi) A^i = Σ_{i=1}^{n} ai A^i + Σ_{i=1}^{n} bi A^i = f(A) + g(A)
(b) We have f(x)g(x) = Σ_{i,j} ai bj x^{i+j}. Then
    f(A)g(A) = (Σ_i ai A^i)(Σ_j bj A^j) = Σ_{i,j} ai bj A^{i+j} = (fg)(A)
(c) Using f(x)g(x) = g(x)f(x), we have
    f(A)g(A) = (fg)(A) = (gf)(A) = g(A)f(A)
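A small numerical check of Problem 2.37, namely that two polynomials evaluated at the same matrix always commute, can be written in a few lines. The sketch below is illustrative only; the particular coefficients and test matrix are made-up data, not taken from the text.

```python
import numpy as np

def poly_at(coeffs, A):
    """Evaluate sum_i coeffs[i] * A**i, where coeffs[0] is the constant term."""
    n = A.shape[0]
    result = np.zeros((n, n))
    for i, c in enumerate(coeffs):
        result += c * np.linalg.matrix_power(A, i)
    return result

A = np.array([[1.0, 2.0], [3.0, 4.0]])       # any square matrix (test data)
f = [5, 0, -2, 1]                            # f(x) = x^3 - 2x^2 + 5
g = [-1, 3]                                  # g(x) = 3x - 1

fg = poly_at(f, A) @ poly_at(g, A)
gf = poly_at(g, A) @ poly_at(f, A)
print(np.allclose(fg, gf))                   # True: f(A)g(A) = g(A)f(A)
```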
SUPPLEMENTARY PROBLEMS
Algebra of Matrices
Problems 2.38–2.41 refer to the following matrices:
    A = [1, 2; 3, -4],    B = [5, 0; -6, 7],    C = [1, -3, 4; 2, 6, -5],    D = [3, 7, -1; 4, -8, 9]
2.38. Find (a) 5A - 2B, (b) 2A + 3B, (c) 2C - 3D.
2.39. Find (a) AB and (AB)C, (b) BC and A(BC). [Note that (AB)C = A(BC).]
2.40. Find (a) A^2 and A^3, (b) AD and BD, (c) CD.
2.41. Find (a) A^T, (b) B^T, (c) (AB)^T, (d) A^T B^T. [Note that A^T B^T ≠ (AB)^T.]
Problems 2.42 and 2.43 refer to the following matrices:
    A = [1, -1, 2; 0, 3, 4],    B = [4, 0, -3; -1, -2, 3],    C = [2, -3, 0, 1; 5, -1, -4, 2; -1, 0, 0, 3],    D = [2; -1; 3]
2.42. Find (a) 3A - 4B, (b) AC, (c) BC, (d) AD, (e) BD, (f) CD.
2.43. Find (a) A^T, (b) A^T B, (c) A^T C.
2.44. Let A = [1, 2; 3, 6]. Find a 2 × 3 matrix B with distinct nonzero entries such that AB = 0.
2.45. Let e1 = [1, 0, 0], e2 = [0, 1, 0], e3 = [0, 0, 1], and A = [a1, a2, a3, a4; b1, b2, b3, b4; c1, c2, c3, c4]. Find e1A, e2A, e3A.
2.46. Let ei = [0, ..., 0, 1, 0, ..., 0], where 1 is the ith entry. Show
(a) eiA = Ai, the ith row of A.    (c) If eiA = eiB for each i, then A = B.
(b) Bej^T = B^j, the jth column of B.    (d) If Aej^T = Bej^T for each j, then A = B.
2.47. Prove Theorem 2.2(iii) and (iv): (iii) (B + C)A = BA + CA, (iv) k(AB) = (kA)B = A(kB).
2.48. Prove Theorem 2.3: (i) (A + B)^T = A^T + B^T, (ii) (A^T)^T = A, (iii) (kA)^T = kA^T.
2.49. Show (a) If A has a zero row, then AB has a zero row. (b) If B has a zero column, then AB has a
zero column.
Square Matrices, Inverses
2.50. Find the diagonal and trace of each of the following matrices:
(a) A = [2, 5, 8; 3, 6, 7; 4, 0, 1],    (b) B = [1, 3, 4; 6, 1, 7; 2, 5, 1],    (c) C = [4, 3, 6; 2, 5, 0]
Problems 2.51–2.53 refer to A = [2, -5; 3, 1], B = [4, -2; 1, -6], C = [6, -4; 3, -2].
2.51. Find (a) A^2 and A^3, (b) f(A) and g(A), where
    f(x) = x^3 - 2x^2 - 5,    g(x) = x^2 - 3x + 17
2.52. Find (a) B^2 and B^3, (b) f(B) and g(B), where
    f(x) = x^2 + 2x - 22,    g(x) = x^2 - 3x - 6
2.53. Find a nonzero column vector u such that Cu = 4u.
2.54. Find the inverse of each of the following matrices (if it exists):
    A = [7, 4; 5, 3],    B = [2, 3; 4, 5],    C = [4, 6; 2, 3],    D = [5, 2; 6, 3]
2.55. Find the inverses of A = [1, 1, 2; 1, 2, 5; 1, 3, 7] and B = [1, -1, 1; 0, 1, -1; 1, 3, -2]. [Hint: See Problem 2.19.]
2.56. Suppose A is invertible. Show that if AB = AC, then B = C. Give an example of a nonzero matrix A such that AB = AC but B ≠ C.
2.57. Find 2 × 2 invertible matrices A and B such that A + B ≠ 0 and A + B is not invertible.
2.58. Show (a) A is invertible if and only if A^T is invertible. (b) The operations of inversion and transpose commute; that is, (A^T)^{-1} = (A^{-1})^T. (c) If A has a zero row or zero column, then A is not invertible.
Diagonal and Triangular Matrices
2.59. Let A = diag(1, 2, -3) and B = diag(2, -5, 0). Find
(a) AB, A^2, B^2;  (b) f(A), where f(x) = x^2 + 4x - 3;  (c) A^{-1} and B^{-1}.
2.60. Let A = [1, 2; 0, 1] and B = [1, 1, 0; 0, 1, 1; 0, 0, 1]. (a) Find A^n. (b) Find B^n.
2.61. Find all real triangular matrices A such that A^2 = B, where (a) B = [4, 21; 0, 25], (b) B = [1, 4; 0, -9].
2.62. Let A = [5, 2; 0, k]. Find all numbers k for which A is a root of the polynomial:
(a) f(x) = x^2 - 7x + 10,  (b) g(x) = x^2 - 25,  (c) h(x) = x^2 - 4.
2.63. Let B = [1, 0; 26, 27]. Find a matrix A such that A^3 = B.
2.64. Let B = [1, 8, 5; 0, 9, 5; 0, 0, 4]. Find a triangular matrix A with positive diagonal entries such that A^2 = B.
2.65. Using only the elements 0 and 1, find the number of 3 × 3 matrices that are (a) diagonal, (b) upper triangular, (c) nonsingular and upper triangular. Generalize to n × n matrices.
2.66. Let Dk = kI, the scalar matrix belonging to the scalar k. Show
(a) DkA = kA,  (b) BDk = kB,  (c) Dk + Dk′ = D_{k+k′},  (d) DkDk′ = D_{kk′}
2.67. Suppose AB = C, where A and C are upper triangular.
(a) Find 2 × 2 nonzero matrices A, B, C, where B is not upper triangular.
(b) Suppose A is also invertible. Show that B must also be upper triangular.
Special Types of Real Matrices
2.68. Find x, y, z such that A is symmetric, where
(a) A = [2, x, 3; 4, 5, y; z, 1, 7],  (b) A = [7, 6, 2x; y, z, -2; x, -2, 5].
2.69. Suppose A is a square matrix. Show (a) A + A^T is symmetric, (b) A - A^T is skew-symmetric, (c) A = B + C, where B is symmetric and C is skew-symmetric.
2.70. Write A = [4, 5; 1, 3] as the sum of a symmetric matrix B and a skew-symmetric matrix C.
2.71. Suppose A and B are symmetric. Show that the following are also symmetric:
(a) A + B;  (b) kA, for any scalar k;  (c) A^2;  (d) A^n, for n > 0;  (e) f(A), for any polynomial f(x).
2.72. Find a 2 × 2 orthogonal matrix P whose first row is a multiple of (a) (3, -4), (b) (1, 2).
2.73. Find a 3 × 3 orthogonal matrix P whose first two rows are multiples of
(a) (1, 2, 3) and (0, 2, 3),  (b) (1, 3, 1) and (1, 0, -1).
2.74. Suppose A and B are orthogonal matrices. Show that A^T, A^{-1}, AB are also orthogonal.
2.75. Which of the following matrices are normal? A = [3, -4; 4, 3], B = [1, -2; 2, 3], C = [1, 1, 1; 0, 1, 1; 0, 0, 1].
Complex Matrices
2.76. Find real numbers x, y, z such that A is Hermitian, where A = [3, x + 2i, yi; 3 - 2i, 0, 1 + zi; yi, 1 - xi, 1].
2.77. Suppose A is a complex matrix. Show that AA^H and A^H A are Hermitian.
2.78. Let A be a square matrix. Show that (a) A + A^H is Hermitian, (b) A - A^H is skew-Hermitian, (c) A = B + C, where B is Hermitian and C is skew-Hermitian.
2.79. Determine which of the following matrices are unitary:
    A = [i/2, -√3/2; √3/2, i/2],    B = (1/2)[1 + i, 1 - i; 1 - i, 1 + i],    C = (1/2)[1, -i, -1 + i; i, 1, 1 + i; 1 + i, -1 + i, 0]
2.80. Suppose A and B are unitary. Show that A^H, A^{-1}, AB are unitary.
2.81. Determine which of the following matrices are normal: A = [3 + 4i, 1; i, 2 + 3i] and B = [1, 0; 1 - i, i].
Block Matrices
2.82. Let U = [1, 2, 0, 0, 0; 3, 4, 0, 0, 0; 0, 0, 5, 1, 2; 0, 0, 3, 4, 1] and V = [3, -2, 0, 0; 2, 4, 0, 0; 0, 0, 1, 2; 0, 0, 2, 3; 0, 0, 4, 1].
(a) Find UV using block multiplication. (b) Are U and V block diagonal matrices? (c) Is UV block diagonal?
2.83. Partition each of the following matrices so that it becomes a square block matrix with as many diagonal blocks as possible:
    A = [1, 0, 0; 0, 0, 2; 0, 0, 3],    B = [1, 2, 0, 0, 0; 3, 0, 0, 0, 0; 0, 0, 4, 0, 0; 0, 0, 5, 0, 0; 0, 0, 0, 0, 6],    C = [0, 1, 0; 0, 0, 0; 2, 0, 0]
2.84. Find M^2 and M^3 for (a) M = [2, 0, 0, 0; 0, 1, 4, 0; 0, 2, 1, 0; 0, 0, 0, 3], (b) M = [1, 1, 0, 0; 2, 3, 0, 0; 0, 0, 1, 2; 0, 0, 4, 5].
2.85. For each matrix M in Problem 2.84, find f(M), where f(x) = x^2 + 4x - 5.
2.86. Suppose U = [Uik] and V = [Vkj] are block matrices for which UV is defined and the number of columns of each block Uik is equal to the number of rows of each block Vkj. Show that UV = [Wij], where Wij = Σ_k Uik Vkj.
2.87. Suppose M and N are block diagonal matrices where corresponding blocks have the same size, say M = diag(Ai) and N = diag(Bi). Show
(i) M + N = diag(Ai + Bi),  (iii) MN = diag(Ai Bi),
(ii) kM = diag(kAi),  (iv) f(M) = diag(f(Ai)) for any polynomial f(x).
ANSWERS TO SUPPLEMENTARY PROBLEMS
Notation: A = [R1; R2; ...] denotes a matrix A with rows R1, R2, ....
2.38. (a) [-5, 10; 27, -34], (b) [17, 4; -12, 13], (c) [-7, -27, 11; -8, 36, -37]
2.39. (a) [-7, 14; 39, -28], [21, 105, -98; -17, -285, 296]; (b) [5, -15, 20; 8, 60, -59], [21, 105, -98; -17, -285, 296]
2.40. (a) [7, -6; -9, 22], [-11, 38; 57, -106]; (b) [11, -9, 17; -7, 53, -39], [15, 35, -5; 10, -98, 69]; (c) not defined
2.41. (a) [1, 3; 2, -4], (b) [5, -6; 0, 7], (c) [-7, 39; 14, -28], (d) [5, 15; 10, -40]
2.42. (a) [-13, -3, 18; 4, 17, 0], (b) [-5, -2, 4, 5; 11, -3, -12, 18], (c) [11, -12, 0, -5; -15, 5, 8, 4], (d) [9; 9], (e) [-1; 9], (f) not defined
2.43. (a) [1, 0; -1, 3; 2, 4], (b) [4, 0, -3; -7, -6, 12; 4, -8, 6], (c) not defined
2.44. [-2, -4, -6; 1, 2, 3]
2.45. [a1, a2, a3, a4], [b1, b2, b3, b4], [c1, c2, c3, c4]
2.50. (a) 2, 6, 1; tr(A) = 5, (b) 1, 1, 1; tr(B) = 1, (c) not defined
2.51. (a) [-11, -15; 9, -14], [-67, 40; -24, -59]; (b) f(A) = [-50, 70; -42, -36], g(A) = 0
2.52. (a) [14, 4; -2, 34], [60, -52; 26, -200]; (b) f(B) = 0, g(B) = [-4, 10; -5, 46]
2.53. u = [2a, a]^T
2.54. [3, -4; -5, 7], [-5/2, 3/2; 2, -1], not defined, [1, -2/3; -2, 5/3]
2.55. [1, 1, -1; 2, -5, 3; -1, 2, -1], [1, 1, 0; -1, -3, 1; -1, -4, 1]
2.56. A = [1, 2; 1, 2], B = [0, 0; 1, 1], C = [2, 2; 0, 0]
2.57. A = [1, 2; 0, 3], B = [4, 3; 3, 0]
2.58. (c) Hint: Use Problem 2.48.
2.59. (a) AB = diag(2, -10, 0), A^2 = diag(1, 4, 9), B^2 = diag(4, 25, 0); (b) f(A) = diag(2, 9, -6); (c) A^{-1} = diag(1, 1/2, -1/3), B^{-1} does not exist
2.60. (a) [1, 2n; 0, 1], (b) [1, n, n(n - 1)/2; 0, 1, n; 0, 0, 1]
2.61. (a) [2, 3; 0, 5], [-2, -3; 0, -5], [2, -7; 0, -5], [-2, 7; 0, 5]; (b) none
2.62. (a) k = 2, (b) k = -5, (c) none
2.63. [1, 0; 2, 3]
2.64. [1, 2, 1; 0, 3, 1; 0, 0, 2]
2.65. All entries below the diagonal must be 0 to be upper triangular, and all diagonal entries must be 1 to be nonsingular.
(a) 8 (2^n), (b) 2^6 (2^{n(n+1)/2}), (c) 2^3 (2^{n(n-1)/2})
2.67. (a) A = [1, 1; 0, 0], B = [1, 2; 3, 4], C = [4, 6; 0, 0]
2.68. (a) x = 4, y = 1, z = 3; (b) x = 0, y = 6, z any real number
2.69. (c) Hint: Let B = (1/2)(A + A^T) and C = (1/2)(A - A^T).
2.70. B = [4, 3; 3, 3], C = [0, 2; -2, 0]
2.72. (a) [3/5, -4/5; 4/5, 3/5], (b) [1/√5, 2/√5; 2/√5, -1/√5]
2.73. (a) [1/√14, 2/√14, 3/√14; 0, 2/√13, 3/√13; 12/√157, 3/√157, 2/√157]
(b) [1/√11, 3/√11, 1/√11; 1/√2, 0, -1/√2; 3/√22, -2/√22, 3/√22]
2.75. A; C
2.76. x = 3, y = 0, z = 3
2.78. (c) Hint: Let B = (1/2)(A + A^H) and C = (1/2)(A - A^H).
2.79. A, B, C
2.81. A
2.82. (a) UV = diag([7, 6; 17, 10], [1, 9; 7, 5]); (b) no; (c) yes
2.83. A: line between first and second rows (columns);
B: line between second and third rows (columns) and between fourth and fifth rows (columns);
C: C itself; no further partitioning of C is possible.
2.84. (a) M^2 = diag([4], [9, 8; 4, 9], [9]), M^3 = diag([8], [25, 44; 22, 25], [27])
(b) M^2 = diag([3, 4; 8, 11], [9, 12; 24, 33]), M^3 = diag([11, 15; 30, 41], [57, 78; 156, 213])
2.85. (a) diag([7], [8, 24; 12, 8], [16]), (b) diag([2, 8; 16, 18], [8, 20; 40, 48])
CHAPTER 3
Systems of Linear Equations
3.1 Introduction
Systems of linear equations play an important and motivating role in the subject of linear algebra. In fact,
many problems in linear algebra reduce to finding the solution of a system of linear equations. Thus, the
techniques introduced in this chapter will be applicable to abstract ideas introduced later. On the other
hand, some of the abstract results will give us new insights into the structure and properties of systems of
linear equations.
All our systems of linear equations involve scalars as both coefficients and constants, and such scalars
may come from any number field K. There is almost no loss in generality if the reader assumes that all
our scalars are real numbers—that is, that they come from the real field R.
3.2 Basic Definitions, Solutions
This section gives basic definitions connected with the solutions of systems of linear equations. The
actual algorithms for finding such solutions will be treated later.
Linear Equation and Solutions
A linear equation in unknowns x1, x2, ..., xn is an equation that can be put in the standard form
    a1x1 + a2x2 + ... + anxn = b        (3.1)
where a1, a2, ..., an, and b are constants. The constant ak is called the coefficient of xk, and b is called the constant term of the equation.
A solution of the linear equation (3.1) is a list of values for the unknowns or, equivalently, a vector u in K^n, say
    x1 = k1,  x2 = k2,  ...,  xn = kn    or    u = (k1, k2, ..., kn)
such that the following statement (obtained by substituting ki for xi in the equation) is true:
    a1k1 + a2k2 + ... + ankn = b
In such a case we say that u satisfies the equation.
Remark: Equation (3.1) implicitly assumes there is an ordering of the unknowns. In order to avoid
subscripts, we will usually use x; y for two unknowns; x; y; z for three unknowns; and x; y; z; t for four
unknowns; they will be ordered as shown.
EXAMPLE 3.1 Consider the following linear equation in three unknowns x, y, z:
    x + 2y - 3z = 6
We note that x = 5, y = 2, z = 1, or, equivalently, the vector u = (5, 2, 1), is a solution of the equation. That is,
    5 + 2(2) - 3(1) = 6    or    5 + 4 - 3 = 6    or    6 = 6
On the other hand, w = (1, 2, 3) is not a solution, because on substitution, we do not get a true statement:
    1 + 2(2) - 3(3) = 6    or    1 + 4 - 9 = 6    or    -4 = 6
System of Linear Equations
A system of linear equations is a list of linear equations with the same unknowns. In particular, a system of m linear equations L1, L2, ..., Lm in n unknowns x1, x2, ..., xn can be put in the standard form
    a11x1 + a12x2 + ... + a1nxn = b1
    a21x1 + a22x2 + ... + a2nxn = b2        (3.2)
    .......................................
    am1x1 + am2x2 + ... + amnxn = bm
where the aij and bi are constants. The number aij is the coefficient of the unknown xj in the equation Li, and the number bi is the constant of the equation Li.
The system (3.2) is called an m × n (read: m by n) system. It is called a square system if m = n, that is, if the number m of equations is equal to the number n of unknowns.
The system (3.2) is said to be homogeneous if all the constant terms are zero, that is, if b1 = 0, b2 = 0, ..., bm = 0. Otherwise the system is said to be nonhomogeneous.
A solution (or a particular solution) of the system (3.2) is a list of values for the unknowns or, equivalently, a vector u in K^n, which is a solution of each of the equations in the system. The set of all solutions of the system is called the solution set or the general solution of the system.
EXAMPLE 3.2 Consider the following system of linear equations:
    x1 + x2 + 4x3 + 3x4 = 5
    2x1 + 3x2 + x3 - 2x4 = 1
    x1 + 2x2 - 5x3 + 4x4 = 3
It is a 3 × 4 system because it has three equations in four unknowns. Determine whether (a) u = (-8, 6, 1, 1) and (b) v = (-10, 5, 1, 2) are solutions of the system.
(a) Substitute the values of u in each equation, obtaining
    -8 + 6 + 4(1) + 3(1) = 5    or    -8 + 6 + 4 + 3 = 5    or    5 = 5
    2(-8) + 3(6) + 1 - 2(1) = 1    or    -16 + 18 + 1 - 2 = 1    or    1 = 1
    -8 + 2(6) - 5(1) + 4(1) = 3    or    -8 + 12 - 5 + 4 = 3    or    3 = 3
Yes, u is a solution of the system because it is a solution of each equation.
(b) Substitute the values of v into each successive equation, obtaining
    -10 + 5 + 4(1) + 3(2) = 5    or    -10 + 5 + 4 + 6 = 5    or    5 = 5
    2(-10) + 3(5) + 1 - 2(2) = 1    or    -20 + 15 + 1 - 4 = 1    or    -8 = 1
No, v is not a solution of the system, because it is not a solution of the second equation. (We do not need to substitute v into the third equation.)
The system (3.2) of linear equations is said to be consistent if it has one or more solutions, and it is said to be inconsistent if it has no solution. If the field K of scalars is infinite, such as when K is the real field R or the complex field C, then we have the following important result.
THEOREM 3.1: Suppose the field K is infinite. Then any system ℒ of linear equations has (i) a unique solution, (ii) no solution, or (iii) an infinite number of solutions.
This situation is pictured in Fig. 3-1. The three cases have a geometrical description when the system ℒ consists of two equations in two unknowns (Section 3.4).
Augmented and Coefficient Matrices of a System
Consider again the general system (3.2) of m equations in n unknowns. Such a system has associated with
it the following two matrices:
    M = [a11, a12, ..., a1n, b1;  a21, a22, ..., a2n, b2;  ...;  am1, am2, ..., amn, bm]
and
    A = [a11, a12, ..., a1n;  a21, a22, ..., a2n;  ...;  am1, am2, ..., amn]
The first matrix M is called the augmented matrix of the system, and the second matrix A is called the coefficient matrix.
The coefficient matrix A is simply the matrix of coefficients, which is the augmented matrix M without the last column of constants. Some texts write M = [A, B] to emphasize the two parts of M, where B denotes the column vector of constants. The augmented matrix M and the coefficient matrix A of the system in Example 3.2 are as follows:
    M = [1, 1, 4, 3, 5;  2, 3, 1, -2, 1;  1, 2, -5, 4, 3]    and    A = [1, 1, 4, 3;  2, 3, 1, -2;  1, 2, -5, 4]
As expected, A consists of all the columns of M except the last, which is the column of constants.
Clearly, a system of linear equations is completely determined by its augmented matrix M, and vice
versa. Specifically, each row of M corresponds to an equation of the system, and each column of M
corresponds to the coefficients of an unknown, except for the last column, which corresponds to the
constants of the system.
Degenerate Linear Equations
A linear equation is said to be degenerate if all the coefficients are zero, that is, if it has the form
    0x1 + 0x2 + ... + 0xn = b        (3.3)
Figure 3-1
The solution of such an equation depends only on the value of the constant b. Specifically,
(i) If b ≠ 0, then the equation has no solution.
(ii) If b = 0, then every vector u = (k1, k2, ..., kn) in K^n is a solution.
The following theorem applies.
THEOREM 3.2: Let ℒ be a system of linear equations that contains a degenerate equation L, say with constant b.
(i) If b ≠ 0, then the system ℒ has no solution.
(ii) If b = 0, then L may be deleted from the system without changing the solution set of the system.
Part (i) comes from the fact that the degenerate equation has no solution, so the system has no solution. Part (ii) comes from the fact that every element in K^n is a solution of the degenerate equation.
Leading Unknown in a Nondegenerate Linear Equation
Now let L be a nondegenerate linear equation. This means one or more of the coefficients of L are not zero. By the leading unknown of L, we mean the first unknown in L with a nonzero coefficient. For example, x3 and y are the leading unknowns, respectively, in the equations
    0x1 + 0x2 + 5x3 + 6x4 + 0x5 + 8x6 = 7    and    0x + 2y - 4z = 5
We frequently omit terms with zero coefficients, so the above equations would be written as
    5x3 + 6x4 + 8x6 = 7    and    2y - 4z = 5
In such a case, the leading unknown appears first.
3.3 Equivalent Systems, Elementary Operations
Consider the system (3.2) of m linear equations in n unknowns. Let L be the linear equation obtained by multiplying the m equations by constants c1, c2, ..., cm, respectively, and then adding the resulting equations. Specifically, let L be the following linear equation:
    (c1 a11 + ... + cm am1)x1 + ... + (c1 a1n + ... + cm amn)xn = c1 b1 + ... + cm bm
Then L is called a linear combination of the equations in the system. One can easily show (Problem 3.43) that any solution of the system (3.2) is also a solution of the linear combination L.
EXAMPLE 3.3 Let L1, L2, L3 denote, respectively, the three equations in Example 3.2. Let L be the equation obtained by multiplying L1, L2, L3 by 3, -2, 4, respectively, and then adding. Namely,
    3L1:       3x1 + 3x2 + 12x3 + 9x4 = 15
    -2L2:     -4x1 - 6x2 - 2x3 + 4x4 = -2
    4L3:       4x1 + 8x2 - 20x3 + 16x4 = 12
    (Sum) L:   3x1 + 5x2 - 10x3 + 29x4 = 25
Then L is a linear combination of L1, L2, L3. As expected, the solution u = (-8, 6, 1, 1) of the system is also a solution of L. That is, substituting u in L, we obtain a true statement:
    3(-8) + 5(6) - 10(1) + 29(1) = 25    or    -24 + 30 - 10 + 29 = 25    or    25 = 25
The following theorem holds.
THEOREM 3.3: Two systems of linear equations have the same solutions if and only if each equation in
each system is a linear combination of the equations in the other system.
Two systems of linear equations are said to be equivalent if they have the same solutions. The next
subsection shows one way to obtain equivalent systems of linear equations.
Elementary Operations
The following operations on a system of linear equations L1, L2, ..., Lm are called elementary operations.
[E1] Interchange two of the equations. We indicate that the equations Li and Lj are interchanged by writing:
    "Interchange Li and Lj"    or    "Li ↔ Lj"
[E2] Replace an equation by a nonzero multiple of itself. We indicate that equation Li is replaced by kLi (where k ≠ 0) by writing
    "Replace Li by kLi"    or    "kLi → Li"
[E3] Replace an equation by the sum of a multiple of another equation and itself. We indicate that equation Lj is replaced by the sum of kLi and Lj by writing
    "Replace Lj by kLi + Lj"    or    "kLi + Lj → Lj"
The arrow → in [E2] and [E3] may be read as "replaces."
The main property of the above elementary operations is contained in the following theorem (proved
in Problem 3.45).
THEOREM 3.4: Suppose a system ℳ of linear equations is obtained from a system ℒ of linear equations by a finite sequence of elementary operations. Then ℳ and ℒ have the same solutions.
Remark: Sometimes (say to avoid fractions when all the given scalars are integers) we may apply [E2] and [E3] in one step; that is, we may apply the following operation:
[E] Replace equation Lj by the sum of kLi and k′Lj (where k′ ≠ 0), written
    "Replace Lj by kLi + k′Lj"    or    "kLi + k′Lj → Lj"
We emphasize that in operations [E3] and [E], only equation Lj is changed.
Gaussian elimination, our main method for finding the solution of a given system of linear
equations, consists of using the above operations to transform a given system into an equivalent
system whose solution can be easily obtained.
The details of Gaussian elimination are discussed in subsequent sections.
3.4 Small Square Systems of Linear Equations
This section considers the special case of one equation in one unknown, and two equations in two
unknowns. These simple systems are treated separately because their solution sets can be described
geometrically, and their properties motivate the general case.
Linear Equation in One Unknown
The following simple basic result is proved in Problem 3.5.
THEOREM 3.5: Consider the linear equation ax = b.
(i) If a ≠ 0, then x = b/a is the unique solution of ax = b.
(ii) If a = 0 but b ≠ 0, then ax = b has no solution.
(iii) If a = 0 and b = 0, then every scalar k is a solution of ax = b.
EXAMPLE 3.4 Solve (a) 4x - 1 = x + 6, (b) 2x - 5 - x = x + 3, (c) 4 + x - 3 = 2x + 1 - x.
(a) Rewrite the equation in standard form, obtaining 3x = 7. Then x = 7/3 is the unique solution [Theorem 3.5(i)].
(b) Rewrite the equation in standard form, obtaining 0x = 8. The equation has no solution [Theorem 3.5(ii)].
(c) Rewrite the equation in standard form, obtaining 0x = 0. Then every scalar k is a solution [Theorem 3.5(iii)].
System of Two Linear Equations in Two Unknowns (2 × 2 System)
Consider a system of two nondegenerate linear equations in two unknowns x and y, which can be put in the standard form
    A1x + B1y = C1
    A2x + B2y = C2        (3.4)
Because the equations are nondegenerate, A1 and B1 are not both zero, and A2 and B2 are not both zero.
The general solution of the system (3.4) belongs to one of three types as indicated in Fig. 3-1. If R is the field of scalars, then the graph of each equation is a line in the plane R^2, and the three types may be described geometrically as pictured in Fig. 3-2. Specifically,
(1) The system has exactly one solution.
Here the two lines intersect in one point [Fig. 3-2(a)]. This occurs when the lines have distinct slopes or, equivalently, when the coefficients of x and y are not proportional:
    A1/A2 ≠ B1/B2    or, equivalently,    A1B2 - A2B1 ≠ 0
For example, in Fig. 3-2(a), 1/3 ≠ -1/2.
Figure 3-2. (a) L1: x - y = -1 and L2: 3x + 2y = 12: the lines intersect in one point. (b) L1: x + 3y = 3 and L2: 2x + 6y = -8: the lines are parallel. (c) L1: x + 2y = 4 and L2: 2x + 4y = 8: the lines coincide.
(2) The system has no solution.
Here the two lines are parallel [Fig. 3-2(b)]. This occurs when the lines have the same slopes but different y intercepts, or when
    A1/A2 = B1/B2 ≠ C1/C2
For example, in Fig. 3-2(b), 1/2 = 3/6 ≠ 3/(-8).
(3) The system has an infinite number of solutions.
Here the two lines coincide [Fig. 3-2(c)]. This occurs when the lines have the same slopes and same y intercepts, or when the coefficients and constants are proportional,
    A1/A2 = B1/B2 = C1/C2
For example, in Fig. 3-2(c), 1/2 = 2/4 = 4/8.
Remark: The following expression and its value is called a determinant of order two:
    | A1  B1 |
    | A2  B2 |  =  A1B2 - A2B1
Determinants will be studied in Chapter 8. Thus, the system (3.4) has a unique solution if and only if the determinant of its coefficients is not zero. (We show later that this statement is true for any square system of linear equations.)
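The determinant test for a unique solution of the 2 × 2 system (3.4) is a one-line computation. The helper below is only a sketch of that test; the sample coefficients are taken from Example 3.5 further on.

```python
def has_unique_solution(a1, b1, a2, b2):
    """The 2x2 system A1x + B1y = C1, A2x + B2y = C2 has a unique
    solution exactly when the order-two determinant A1*B2 - A2*B1 is nonzero."""
    return a1 * b2 - a2 * b1 != 0

# Coefficients of Example 3.5 below: 2x - 3y = -8, 3x + 4y = 5
print(has_unique_solution(2, -3, 3, 4))    # True, since 2*4 - 3*(-3) = 17 != 0
```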
Elimination Algorithm
The solution to system (3.4) can be obtained by the process of elimination, whereby we reduce the system
to a single equation in only one unknown. Assuming the system has a unique solution, this elimination
algorithm has two parts.
ALGORITHM 3.1: The input consists of two nondegenerate linear equations L1 and L2 in two
unknowns with a unique solution.
Part A. (Forward Elimination) Multiply each equation by a constant so that the resulting coefficients of
one unknown are negatives of each other, and then add the two equations to obtain a new
equation L that has only one unknown.
Part B. (Back-Substitution) Solve for the unknown in the new equation L (which contains only one
unknown), substitute this value of the unknown into one of the original equations, and then
solve to obtain the value of the other unknown.
Part A of Algorithm 3.1 can be applied to any system even if the system does not have a unique
solution. In such a case, the new equation L will be degenerate and Part B will not apply.
EXAMPLE 3.5 (Unique Case). Solve the system
    L1: 2x - 3y = -8
    L2: 3x + 4y = 5
The unknown x is eliminated from the equations by forming the new equation L = -3L1 + 2L2. That is, we multiply L1 by -3 and L2 by 2 and add the resulting equations as follows:
    -3L1:  -6x + 9y = 24
     2L2:   6x + 8y = 10
    Addition:  17y = 34
We now solve the new equation for y, obtaining y = 2. We substitute y = 2 into one of the original equations, say L1, and solve for the other unknown x, obtaining
    2x - 3(2) = -8    or    2x - 6 = -8    or    2x = -2    or    x = -1
Thus, x = -1, y = 2, or the pair u = (-1, 2), is the unique solution of the system. The unique solution is expected, because 2/3 ≠ -3/4. [Geometrically, the lines corresponding to the equations intersect at the point (-1, 2).]
EXAMPLE 3.6 (Nonunique Cases)
(a) Solve the system
    L1:   x - 3y = 4
    L2: -2x + 6y = 5
We eliminate x from the equations by multiplying L1 by 2 and adding it to L2, that is, by forming the new equation L = 2L1 + L2. This yields the degenerate equation
    0x + 0y = 13
which has a nonzero constant b = 13. Thus, this equation and the system have no solution. This is expected, because 1/(-2) = -3/6 ≠ 4/5. (Geometrically, the lines corresponding to the equations are parallel.)
(b) Solve the system
    L1:   x - 3y = 4
    L2: -2x + 6y = -8
We eliminate x from the equations by multiplying L1 by 2 and adding it to L2, that is, by forming the new equation L = 2L1 + L2. This yields the degenerate equation
    0x + 0y = 0
where the constant term is also zero. Thus, the system has an infinite number of solutions, which correspond to the solutions of either equation. This is expected, because 1/(-2) = -3/6 = 4/(-8). (Geometrically, the lines corresponding to the equations coincide.)
To find the general solution, let y = a, and substitute into L1 to obtain
    x - 3a = 4    or    x = 3a + 4
Thus, the general solution of the system is
    x = 3a + 4,  y = a    or    u = (3a + 4, a)
where a (called a parameter) is any scalar.
3.5 Systems in Triangular and Echelon Forms
The main method for solving systems of linear equations, Gaussian elimination, is treated in Section 3.6.
Here we consider two simple types of systems of linear equations: systems in triangular form and the
more general systems in echelon form.
Triangular Form
Consider the following system of linear equations, which is in triangular form:
    2x1 - 3x2 + 5x3 - 2x4 = 9
           5x2 -  x3 + 3x4 = 1
                  7x3 -  x4 = 3
                         2x4 = 8
That is, the first unknown x1 is the leading unknown in the first equation, the second unknown x2 is the
leading unknown in the second equation, and so on. Thus, in particular, the system is square and each
leading unknown is directly to the right of the leading unknown in the preceding equation.
Such a triangular system always has a unique solution, which may be obtained by back-substitution.
That is,
(1) First solve the last equation for the last unknown to get x4 = 4.
(2) Then substitute this value x4 = 4 in the next-to-last equation, and solve for the next-to-last unknown x3 as follows:
    7x3 - 4 = 3    or    7x3 = 7    or    x3 = 1
(3) Now substitute x3 = 1 and x4 = 4 in the second equation, and solve for the second unknown x2 as follows:
    5x2 - 1 + 12 = 1    or    5x2 + 11 = 1    or    5x2 = -10    or    x2 = -2
(4) Finally, substitute x2 = -2, x3 = 1, x4 = 4 in the first equation, and solve for the first unknown x1 as follows:
    2x1 + 6 + 5 - 8 = 9    or    2x1 + 3 = 9    or    2x1 = 6    or    x1 = 3
Thus, x1 = 3, x2 = -2, x3 = 1, x4 = 4, or, equivalently, the vector u = (3, -2, 1, 4), is the unique solution of the system.
Remark: There is an alternative form for back-substitution (which will be used when solving a system using the matrix format). Namely, after first finding the value of the last unknown, we substitute this value for the last unknown in all the preceding equations before solving for the next-to-last unknown. This yields a triangular system with one less equation and one less unknown. For example, in the above triangular system, we substitute x4 = 4 in all the preceding equations to obtain the triangular system
    2x1 - 3x2 + 5x3 = 17
           5x2 -  x3 = -11
                  7x3 = 7
We then repeat the process using the new last equation. And so on.
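Back-substitution on a triangular system is a short loop. The sketch below is only an illustration; it assumes the system is stored as a square upper triangular coefficient array with nonzero diagonal entries plus a right-hand-side list, and the numbers reproduce the triangular system solved above.

```python
def back_substitute(U, b):
    """Solve Ux = b, where U is upper triangular with nonzero diagonal."""
    n = len(b)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):                       # last equation first
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / U[i][i]
    return x

# 2x1 - 3x2 + 5x3 - 2x4 = 9,  5x2 - x3 + 3x4 = 1,  7x3 - x4 = 3,  2x4 = 8
U = [[2, -3, 5, -2], [0, 5, -1, 3], [0, 0, 7, -1], [0, 0, 0, 2]]
b = [9, 1, 3, 8]
print(back_substitute(U, b))    # [3.0, -2.0, 1.0, 4.0]
```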
Echelon Form, Pivot and Free Variables
The following system of linear equations is said to be in echelon form:
    2x1 + 6x2 - x3 + 4x4 - 2x5 = 15
                 x3 + 2x4 + 2x5 = 5
                       3x4 - 9x5 = 6
That is, no equation is degenerate and the leading unknown in each equation other than the first is to the right of the leading unknown in the preceding equation. The leading unknowns in the system, x1, x3, x4, are called pivot variables, and the other unknowns, x2 and x5, are called free variables.
Generally speaking, an echelon system or a system in echelon form has the following form:
    a11x1 + a12x2 + a13x3 + a14x4 + ... + a1nxn = b1
              a2j2 xj2 + a2,j2+1 xj2+1 + ... + a2n xn = b2        (3.5)
              ..........................................
                            arjr xjr + ... + arn xn = br
where 1 < j2 < ... < jr and a11, a2j2, ..., arjr are not zero. The pivot variables are x1, xj2, ..., xjr. Note that r ≤ n.
The solution set of any echelon system is described in the following theorem (proved in Problem 3.10).
THEOREM 3.6: Consider a system of linear equations in echelon form, say with r equations in n unknowns. There are two cases:
(i) r = n. That is, there are as many equations as unknowns (triangular form). Then the system has a unique solution.
(ii) r < n. That is, there are more unknowns than equations. Then we can arbitrarily assign values to the n - r free variables and solve uniquely for the r pivot variables, obtaining a solution of the system.
Suppose an echelon system contains more unknowns than equations. Assuming the field K is infinite, the system has an infinite number of solutions, because each of the n - r free variables may be assigned any scalar.
The general solution of a system with free variables may be described in either of two equivalent ways, which we illustrate using the above echelon system, where there are r = 3 equations and n = 5 unknowns. One description is called the "Parametric Form" of the solution, and the other description is called the "Free-Variable Form."
Parametric Form
Assign arbitrary values, called parameters, to the free variables x2 and x5, say x2 = a and x5 = b, and then use back-substitution to obtain values for the pivot variables x1, x3, x4 in terms of the parameters a and b. Specifically,
(1) Substitute x5 = b in the last equation, and solve for x4:
    3x4 - 9b = 6    or    3x4 = 6 + 9b    or    x4 = 2 + 3b
(2) Substitute x4 = 2 + 3b and x5 = b into the second equation, and solve for x3:
    x3 + 2(2 + 3b) + 2b = 5    or    x3 + 4 + 8b = 5    or    x3 = 1 - 8b
(3) Substitute x2 = a, x3 = 1 - 8b, x4 = 2 + 3b, x5 = b into the first equation, and solve for x1:
    2x1 + 6a - (1 - 8b) + 4(2 + 3b) - 2b = 15    or    x1 = 4 - 3a - 9b
Accordingly, the general solution in parametric form is
    x1 = 4 - 3a - 9b,  x2 = a,  x3 = 1 - 8b,  x4 = 2 + 3b,  x5 = b
or, equivalently, v = (4 - 3a - 9b, a, 1 - 8b, 2 + 3b, b), where a and b are arbitrary numbers.
Free-Variable Form
Use back-substitution to solve for the pivot variables x1, x3, x4 directly in terms of the free variables x2 and x5. That is, the last equation gives x4 = 2 + 3x5. Substitution in the second equation yields x3 = 1 - 8x5, and then substitution in the first equation yields x1 = 4 - 3x2 - 9x5. Accordingly,
    x1 = 4 - 3x2 - 9x5,  x2 = free variable,  x3 = 1 - 8x5,  x4 = 2 + 3x5,  x5 = free variable
or, equivalently,
    v = (4 - 3x2 - 9x5, x2, 1 - 8x5, 2 + 3x5, x5)
is the free-variable form for the general solution of the system.
We emphasize that there is no difference between the above two forms of the general solution, and the use of one or the other to represent the general solution is simply a matter of taste.
Remark: A particular solution of the above system can be found by assigning any values to the free variables and then solving for the pivot variables by back-substitution. For example, setting x2 = 1 and x5 = 1, we obtain
    x4 = 2 + 3 = 5,    x3 = 1 - 8 = -7,    x1 = 4 - 3 - 9 = -8
Thus, u = (-8, 1, -7, 5, 1) is the particular solution corresponding to x2 = 1 and x5 = 1.
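For an echelon system, the particular solution belonging to any choice of the free variables can be generated by the same back-substitution idea. The sketch below is hard-wired to the echelon system above (pivot variables x1, x3, x4; free variables x2, x5) and is meant only as an illustration.

```python
def particular_solution(x2, x5):
    """Back-substitute in the echelon system above for given free-variable values."""
    x4 = (6 + 9 * x5) / 3                             # from 3x4 - 9x5 = 6
    x3 = 5 - 2 * x4 - 2 * x5                          # from x3 + 2x4 + 2x5 = 5
    x1 = (15 - 6 * x2 + x3 - 4 * x4 + 2 * x5) / 2     # from 2x1 + 6x2 - x3 + 4x4 - 2x5 = 15
    return (x1, x2, x3, x4, x5)

print(particular_solution(1, 1))    # (-8.0, 1, -7.0, 5.0, 1)
```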
3.6 Gaussian Elimination
The main method for solving the general system (3.2) of linear equations is called Gaussian elimination.
It essentially consists of two parts:
Part A. (Forward Elimination) Step-by-step reduction of the system yielding either a degenerate
equation with no solution (which indicates the system has no solution) or an equivalent simpler
system in triangular or echelon form.
Part B. (Backward Elimination) Step-by-step back-substitution to find the solution of the simpler
system.
Part B has already been investigated in Section 3.4. Accordingly, we need only give the algorithm for
Part A, which is as follows.
ALGORITHM 3.2 for (Part A): Input: The m × n system (3.2) of linear equations.
ELIMINATION STEP: Find the first unknown in the system with a nonzero coefficient (which now must be x1).
(a) Arrange so that a11 ≠ 0. That is, if necessary, interchange equations so that the first unknown x1 appears with a nonzero coefficient in the first equation.
(b) Use a11 as a pivot to eliminate x1 from all equations except the first equation. That is, for i > 1:
    (1) Set m = -ai1/a11;    (2) Replace Li by mL1 + Li
The system now has the following form:
    a11x1 + a12x2 + a13x3 + ... + a1nxn = b1
              a2j2 xj2 + ... + a2n xn = b2
              ................................
              amj2 xj2 + ... + amn xn = bm
where x1 does not appear in any equation except the first, a11 ≠ 0, and xj2 denotes the first unknown with a nonzero coefficient in any equation other than the first.
(c) Examine each new equation L.
    (1) If L has the form 0x1 + 0x2 + ... + 0xn = b with b ≠ 0, then
        STOP
        The system is inconsistent and has no solution.
    (2) If L has the form 0x1 + 0x2 + ... + 0xn = 0, or if L is a multiple of another equation, then delete L from the system.
RECURSION STEP: Repeat the Elimination Step with each new "smaller" subsystem formed by all the equations excluding the first equation.
OUTPUT: Finally, the system is reduced to triangular or echelon form, or a degenerate equation with no solution is obtained, indicating an inconsistent system.
The next remarks refer to the Elimination Step in Algorithm 3.2.
(1) The following number m in (b) is called the multiplier:
    m = -ai1/a11 = -(coefficient to be deleted)/(pivot)
(2) One could alternatively apply the following operation in (b):
    Replace Li by -ai1 L1 + a11 Li
This would avoid fractions if all the scalars were originally integers.
Gaussian Elimination Example
Here we illustrate in detail Gaussian elimination using the following system of linear equations:
    L1:   x - 3y - 2z = 6
    L2:  2x - 4y - 3z = 8
    L3: -3x + 6y + 8z = -5
Part A. We use the coefficient 1 of x in the first equation L1 as the pivot in order to eliminate x from the second equation L2 and from the third equation L3. This is accomplished as follows:
(1) Multiply L1 by the multiplier m = -2 and add it to L2; that is, "Replace L2 by -2L1 + L2."
(2) Multiply L1 by the multiplier m = 3 and add it to L3; that is, "Replace L3 by 3L1 + L3."
These steps yield
    (-2)L1: -2x + 6y + 4z = -12            3L1:   3x - 9y - 6z = 18
    L2:       2x - 4y - 3z = 8             L3:   -3x + 6y + 8z = -5
    New L2:        2y +  z = -4            New L3:     -3y + 2z = 13
Thus, the original system is replaced by the following system:
    L1:  x - 3y - 2z = 6
    L2:       2y +  z = -4
    L3:      -3y + 2z = 13
(Note that the equations L2 and L3 form a subsystem with one less equation and one less unknown than the original system.)
Next we use the coefficient 2 of y in the (new) second equation L2 as the pivot in order to eliminate y from the (new) third equation L3. This is accomplished as follows:
(3) Multiply L2 by the multiplier m = 3/2 and add it to L3; that is, "Replace L3 by (3/2)L2 + L3." (Alternately, "Replace L3 by 3L2 + 2L3," which will avoid fractions.)
This step yields
    (3/2)L2:  3y + (3/2)z = -6        or        3L2:   6y + 3z = -12
    L3:      -3y +     2z = 13                  2L3:  -6y + 4z = 26
    New L3:        (7/2)z = 7                   New L3:    7z = 14
Thus, our system is replaced by the following system:
    L1:  x - 3y - 2z = 6
    L2:       2y +  z = -4
    L3:            7z = 14    (or (7/2)z = 7)
The system is now in triangular form, so Part A is completed.
Part B. The values for the unknowns are obtained in reverse order, z, y, x, by back-substitution. Specifically,
(1) Solve for z in L3 to get z = 2.
(2) Substitute z = 2 in L2, and solve for y to get y = -3.
(3) Substitute y = -3 and z = 2 in L1, and solve for x to get x = 1.
Thus, the solution of the triangular system and hence the original system is as follows:
    x = 1,  y = -3,  z = 2    or, equivalently,    u = (1, -3, 2).
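The two parts of Gaussian elimination just carried out by hand translate directly into a short program. The sketch below is a bare-bones version: it assumes the pivots that arise are nonzero, so no interchanges are needed, exactly as in the example above, and it reproduces the solution (1, -3, 2).

```python
def gaussian_elimination(A, b):
    """Solve Ax = b for a square system, assuming nonzero pivots appear in order."""
    n = len(b)
    A = [row[:] for row in A]          # work on copies
    b = b[:]
    # Part A: forward elimination, putting zeros below each pivot
    for k in range(n - 1):
        for i in range(k + 1, n):
            m = -A[i][k] / A[k][k]     # the multiplier
            for j in range(k, n):
                A[i][j] += m * A[k][j]
            b[i] += m * b[k]
    # Part B: back-substitution on the resulting triangular system
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / A[i][i]
    return x

A = [[1, -3, -2], [2, -4, -3], [-3, 6, 8]]
b = [6, 8, -5]
print(gaussian_elimination(A, b))      # [1.0, -3.0, 2.0]
```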
Condensed Format
The Gaussian elimination algorithm involves rewriting systems of linear equations. Sometimes we can avoid excessive recopying of some of the equations by adopting a "condensed format." This format for the solution of the above system follows:
    Number    Equation                    Operation
    (1)        x - 3y - 2z = 6
    (2)       2x - 4y - 3z = 8
    (3)      -3x + 6y + 8z = -5
    (2′)           2y +  z = -4           Replace L2 by -2L1 + L2
    (3′)          -3y + 2z = 13           Replace L3 by 3L1 + L3
    (3″)                7z = 14           Replace L3 by 3L2 + 2L3
That is, first we write down the number of each of the original equations. As we apply the Gaussian elimination algorithm to the system, we only write down the new equations, and we label each new equation using the same number as the original corresponding equation, but with an added prime. (After each new equation, we will indicate, for instructional purposes, the elementary operation that yielded the new equation.)
The system in triangular form consists of equations (1), (2′), and (3″), the numbers with the largest number of primes. Applying back-substitution to these equations again yields x = 1, y = -3, z = 2.
Remark: If two equations need to be interchanged, say to obtain a nonzero coefficient as a pivot, then this is easily accomplished in the format by simply renumbering the two equations rather than changing their positions.
EXAMPLE 3.7 Solve the following system:
    x + 2y - 3z = 1
    2x + 5y - 8z = 4
    3x + 8y - 13z = 7
We solve the system by Gaussian elimination.
Part A. (Forward Elimination) We use the coefficient 1 of x in the first equation L1 as the pivot in order to eliminate x from the second equation L2 and from the third equation L3. This is accomplished as follows:
(1) Multiply L1 by the multiplier m = -2 and add it to L2; that is, "Replace L2 by -2L1 + L2."
(2) Multiply L1 by the multiplier m = -3 and add it to L3; that is, "Replace L3 by -3L1 + L3."
The two steps yield
    x + 2y - 3z = 1                x + 2y - 3z = 1
         y - 2z = 2       or            y - 2z = 2
        2y - 4z = 4
(The third equation is deleted, because it is a multiple of the second equation.) The system is now in echelon form with free variable z.
Part B. (Backward Elimination) To obtain the general solution, let the free variable z = a, and solve for x and y by back-substitution. Substitute z = a in the second equation to obtain y = 2 + 2a. Then substitute z = a and y = 2 + 2a into the first equation to obtain
    x + 2(2 + 2a) - 3a = 1    or    x + 4 + 4a - 3a = 1    or    x = -3 - a
Thus, the following is the general solution, where a is a parameter:
    x = -3 - a,  y = 2 + 2a,  z = a    or    u = (-3 - a, 2 + 2a, a)
EXAMPLE 3.8 Solve the following system:
    x1 + 3x2 - 2x3 + 5x4 = 4
    2x1 + 8x2 - x3 + 9x4 = 9
    3x1 + 5x2 - 12x3 + 17x4 = 7
We use Gaussian elimination.
Part A. (Forward Elimination) We use the coefficient 1 of x1 in the first equation L1 as the pivot in order to eliminate x1 from the second equation L2 and from the third equation L3. This is accomplished by the following operations:
(1) "Replace L2 by -2L1 + L2"    and    (2) "Replace L3 by -3L1 + L3"
These yield
    x1 + 3x2 - 2x3 + 5x4 = 4
          2x2 + 3x3 -  x4 = 1
         -4x2 - 6x3 + 2x4 = -5
We now use the coefficient 2 of x2 in the second equation L2 as the pivot and the multiplier m = 2 in order to eliminate x2 from the third equation L3. This is accomplished by the operation "Replace L3 by 2L2 + L3," which then yields the degenerate equation
    0x1 + 0x2 + 0x3 + 0x4 = -3
This equation and, hence, the original system have no solution:
    DO NOT CONTINUE
Remark 1: As in the above examples, Part A of Gaussian elimination tells us whether or not the
system has a solution—that is, whether or not the system is consistent. Accordingly, Part B need never be
applied when a system has no solution.
Remark 2: If a system of linear equations has more than four unknowns and four equations, then it
may be more convenient to use the matrix format for solving the system. This matrix format is discussed
later.
3.7 Echelon Matrices, Row Canonical Form, Row Equivalence
One way to solve a system of linear equations is by working with its augmented matrix M rather than the
system itself. This section introduces the necessary matrix concepts for such a discussion. These
concepts, such as echelon matrices and elementary row operations, are also of independent interest.
Echelon Matrices
A matrix A is called an echelon matrix, or is said to be in echelon form, if the following two conditions
hold (where a leading nonzero element of a row of A is the first nonzero element in the row):
(1) All zero rows, if any, are at the bottom of the matrix.
(2) Each leading nonzero entry in a row is to the right of the leading nonzero entry in the preceding row.
That is, A = [aij] is an echelon matrix if there exist nonzero entries
    a1j1, a2j2, ..., arjr,    where    j1 < j2 < ... < jr
with the property that
    aij = 0    for    (i) i ≤ r, j < ji    and    (ii) i > r
The entries a1j1, a2j2, ..., arjr, which are the leading nonzero elements in their respective rows, are called the pivots of the echelon matrix.
EXAMPLE 3.9 The following is an echelon matrix whose pivots have been circled:
    A = [0, 2, 3, 4, 5, 9, 0, 7; 0, 0, 0, 3, 4, 1, 2, 5; 0, 0, 0, 0, 0, 5, 7, 2; 0, 0, 0, 0, 0, 0, 8, 6; 0, 0, 0, 0, 0, 0, 0, 0]
Observe that the pivots are in columns C2, C4, C6, C7, and each is to the right of the one above. Using the above notation, the pivots are
    a1j1 = 2,  a2j2 = 3,  a3j3 = 5,  a4j4 = 8
where j1 = 2, j2 = 4, j3 = 6, j4 = 7. Here r = 4.
Row Canonical Form
A matrix A is said to be in row canonical form (or row-reduced echelon form) if it is an echelon matrix—
that is, if it satisfies the above properties (1) and (2), and if it satisfies the following additional two
properties:
(3) Each pivot (leading nonzero entry) is equal to 1.
(4) Each pivot is the only nonzero entry in its column.
The major difference between an echelon matrix and a matrix in row canonical form is that in an
echelon matrix there must be zeros below the pivots [Properties (1) and (2)], but in a matrix in row
canonical form, each pivot must also equal 1 [Property (3)] and there must also be zeros above the pivots
[Property (4)].
The zero matrix 0 of any size and the identity matrix I of any size are important special examples of
matrices in row canonical form.
EXAMPLE 3.10 The following are echelon matrices whose pivots have been circled:
    [2, 3, 2, 0, 4, 5, 6; 0, 0, 0, 1, 3, 2, 0; 0, 0, 0, 0, 0, 6, 2; 0, 0, 0, 0, 0, 0, 0],    [1, 2, 3; 0, 0, 1; 0, 0, 0],    [0, 1, 3, 0, 0, 4; 0, 0, 0, 1, 0, 3; 0, 0, 0, 0, 1, 2]
The third matrix is also an example of a matrix in row canonical form. The second matrix is not in row canonical
form, because it does not satisfy property (4); that is, there is a nonzero entry above the second pivot in the third
column. The first matrix is not in row canonical form, because it satisfies neither property (3) nor property (4); that
is, some pivots are not equal to 1 and there are nonzero entries above the pivots.
Elementary Row Operations
Suppose A is a matrix with rows R1, R2, ..., Rm. The following operations on A are called elementary row operations.
[E1] (Row Interchange): Interchange rows Ri and Rj. This may be written as
    "Interchange Ri and Rj"    or    "Ri ↔ Rj"
[E2] (Row Scaling): Replace row Ri by a nonzero multiple kRi of itself. This may be written as
    "Replace Ri by kRi (k ≠ 0)"    or    "kRi → Ri"
[E3] (Row Addition): Replace row Rj by the sum of a multiple kRi of a row Ri and itself. This may be written as
    "Replace Rj by kRi + Rj"    or    "kRi + Rj → Rj"
The arrow → in E2 and E3 may be read as "replaces."
Sometimes (say to avoid fractions when all the given scalars are integers) we may apply [E2] and [E3] in one step; that is, we may apply the following operation:
[E] Replace Rj by the sum of a multiple kRi of a row Ri and a nonzero multiple k′Rj of itself. This may be written as
    "Replace Rj by kRi + k′Rj (k′ ≠ 0)"    or    "kRi + k′Rj → Rj"
We emphasize that in operations [E3] and [E] only row Rj is changed.
Row Equivalence, Rank of a Matrix
A matrix A is said to be row equivalent to a matrix B, written
    A ∼ B
if B can be obtained from A by a sequence of elementary row operations. In the case that B is also an echelon matrix, B is called an echelon form of A.
The following are two basic results on row equivalence.
THEOREM 3.7: Suppose A = [aij] and B = [bij] are row equivalent echelon matrices with respective pivot entries
    a1j1, a2j2, ..., arjr    and    b1k1, b2k2, ..., bsks
Then A and B have the same number of nonzero rows, that is, r = s, and the pivot entries are in the same positions, that is, j1 = k1, j2 = k2, ..., jr = kr.
THEOREM 3.8: Every matrix A is row equivalent to a unique matrix in row canonical form.
The proofs of the above theorems will be postponed to Chapter 4. The unique matrix in Theorem 3.8
is called the row canonical form of A.
Using the above theorems, we can now give our first definition of the rank of a matrix.
DEFINITION: The rank of a matrix A, written rank(A), is equal to the number of pivots in an echelon form of A.
The rank is a very important property of a matrix and, depending on the context in which the
matrix is used, it will be defined in many different ways. Of course, all the definitions lead to the
same number.
The next section gives the matrix format of Gaussian elimination, which finds an echelon form of any
matrix A (and hence the rank of A), and also finds the row canonical form of A.
One can show that row equivalence is an equivalence relation. That is,
(1) A ∼ A for any matrix A.
(2) If A ∼ B, then B ∼ A.
(3) If A ∼ B and B ∼ C, then A ∼ C.
Property (2) comes from the fact that each elementary row operation has an inverse operation of the same type. Namely,
(i) "Interchange Ri and Rj" is its own inverse.
(ii) "Replace Ri by kRi" and "Replace Ri by (1/k)Ri" are inverses.
(iii) "Replace Rj by kRi + Rj" and "Replace Rj by -kRi + Rj" are inverses.
There is a similar result for operation [E] (Problem 3.73).
3.8 Gaussian Elimination, Matrix Formulation
This section gives two matrix algorithms that accomplish the following:
(1) Algorithm 3.3 transforms any matrix A into an echelon form.
(2) Algorithm 3.4 transforms the echelon matrix into its row canonical form.
These algorithms, which use the elementary row operations, are simply restatements of Gaussian
elimination as applied to matrices rather than to linear equations. (The term ‘‘row reduce’’ or simply
‘‘reduce’’ will mean to transform a matrix by the elementary row operations.)
ALGORITHM 3.3 (Forward Elimination): The input is any matrix A. (The algorithm puts 0's below each pivot, working from the "top down.") The output is an echelon form of A.
Step 1. Find the first column with a nonzero entry. Let j1 denote this column.
(a) Arrange so that a1j1 ≠ 0. That is, if necessary, interchange rows so that a nonzero entry appears in the first row in column j1.
(b) Use a1j1 as a pivot to obtain 0's below a1j1. Specifically, for i > 1:
    (1) Set m = -aij1/a1j1;    (2) Replace Ri by mR1 + Ri
    [That is, apply the operation -(aij1/a1j1)R1 + Ri → Ri.]
Step 2. Repeat Step 1 with the submatrix formed by all the rows excluding the first row. Here we let j2 denote the first column in the subsystem with a nonzero entry. Hence, at the end of Step 2, we have a2j2 ≠ 0.
Steps 3 to r. Continue the above process until a submatrix has only zero rows.
We emphasize that at the end of the algorithm, the pivots will be
    a1j1, a2j2, ..., arjr
where r denotes the number of nonzero rows in the final echelon matrix.
Remark 1: The following number m in Step 1(b) is called the multiplier:
    m = -aij1/a1j1 = -(entry to be deleted)/(pivot)
Remark 2: One could replace the operation in Step 1(b) by the following, which would avoid fractions if all the scalars were originally integers:
    Replace Ri by -aij1 R1 + a1j1 Ri.
ALGORITHM 3.4 (Backward Elimination): The input is a matrix A = [aij] in echelon form with pivot entries
    a1j1, a2j2, ..., arjr
The output is the row canonical form of A.
Step 1. (a) (Use row scaling so the last pivot equals 1.) Multiply the last nonzero row Rr by 1/arjr.
(b) (Use arjr = 1 to obtain 0's above the pivot.) For i = r - 1, r - 2, ..., 2, 1:
    (1) Set m = -aijr;    (2) Replace Ri by mRr + Ri
    (That is, apply the operations -aijr Rr + Ri → Ri.)
Steps 2 to r - 1. Repeat Step 1 for rows Rr-1, Rr-2, ..., R2.
Step r. (Use row scaling so the first pivot equals 1.) Multiply R1 by 1/a1j1.
There is an alternative form of Algorithm 3.4, which we describe here in words. The formal description of this algorithm is left to the reader as a supplementary problem.
ALTERNATIVE ALGORITHM 3.4 Puts 0's above the pivots row by row from the bottom up (rather than column by column from right to left).
The alternative algorithm, when applied to an augmented matrix M of a system of linear equations, is essentially the same as solving for the pivot unknowns one after the other from the bottom up.
Remark: We emphasize that Gaussian elimination is a two-stage process. Specifically,
Stage A (Algorithm 3.3). Puts 0's below each pivot, working from the top row R1 down.
Stage B (Algorithm 3.4). Puts 0's above each pivot, working from the bottom row Rr up.
There is another algorithm, called Gauss–Jordan, that also row reduces a matrix to its row canonical form. The difference is that Gauss–Jordan puts 0's both below and above each pivot as it works its way from the top row R1 down. Although Gauss–Jordan may be easier to state and understand, it is much less efficient than the two-stage Gaussian elimination algorithm.
EXAMPLE 3.11 Consider the matrix A = [1, 2, -3, 1, 2; 2, 4, -4, 6, 10; 3, 6, -6, 9, 13].
(a) Use Algorithm 3.3 to reduce A to an echelon form.
(b) Use Algorithm 3.4 to further reduce A to its row canonical form.
(a) First use a11 = 1 as a pivot to obtain 0's below a11; that is, apply the operations "Replace R2 by -2R1 + R2" and "Replace R3 by -3R1 + R3." Then use a23 = 2 as a pivot to obtain 0 below a23; that is, apply the operation "Replace R3 by -(3/2)R2 + R3." This yields
    A ∼ [1, 2, -3, 1, 2; 0, 0, 2, 4, 6; 0, 0, 3, 6, 7] ∼ [1, 2, -3, 1, 2; 0, 0, 2, 4, 6; 0, 0, 0, 0, -2]
The matrix is now in echelon form.
(b) Multiply R3 by -1/2 so the pivot entry a35 = 1, and then use a35 = 1 as a pivot to obtain 0's above it by the operations "Replace R2 by -6R3 + R2" and then "Replace R1 by -2R3 + R1." This yields
    A ∼ [1, 2, -3, 1, 2; 0, 0, 2, 4, 6; 0, 0, 0, 0, 1] ∼ [1, 2, -3, 1, 0; 0, 0, 2, 4, 0; 0, 0, 0, 0, 1]
Multiply R2 by 1/2 so the pivot entry a23 = 1, and then use a23 = 1 as a pivot to obtain 0's above it by the operation "Replace R1 by 3R2 + R1." This yields
    A ∼ [1, 2, -3, 1, 0; 0, 0, 1, 2, 0; 0, 0, 0, 0, 1] ∼ [1, 2, 0, 7, 0; 0, 0, 1, 2, 0; 0, 0, 0, 0, 1]
The last matrix is the row canonical form of A.
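Computer algebra systems perform exactly this two-stage reduction. As a cross-check of Example 3.11, the sketch below uses SymPy's rref(), which returns the row canonical (reduced row echelon) form together with the pivot column indices (here columns 0, 2, 4, counted from zero).

```python
from sympy import Matrix

A = Matrix([[1, 2, -3, 1, 2],
            [2, 4, -4, 6, 10],
            [3, 6, -6, 9, 13]])

R, pivot_cols = A.rref()      # row canonical form and pivot columns
print(R)                      # Matrix([[1, 2, 0, 7, 0], [0, 0, 1, 2, 0], [0, 0, 0, 0, 1]])
print(pivot_cols)             # (0, 2, 4)
```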
Application to Systems of Linear Equations
One way to solve a system of linear equations is by working with its augmented matrix M rather than the
equations themselves. Specifically, we reduce M to echelon form (which tells us whether the system has a
solution), and then further reduce M to its row canonical form (which essentially gives the solution of the
original system of linear equations). The justification for this process comes from the following facts:
(1) Any elementary row operation on the augmented matrix M of the system is equivalent to applying
the corresponding operation on the system itself.
(2) The system has a solution if and only if the echelon form of the augmented matrix M does not have a row of the form (0, 0, ..., 0, b) with b ≠ 0.
(3) In the row canonical form of the augmented matrix M (excluding zero rows), the coefficient of each
basic variable is a pivot entry equal to 1, and it is the only nonzero entry in its respective column;
hence, the free-variable form of the solution of the system of linear equations is obtained by simply
transferring the free variables to the other side.
This process is illustrated below.
EXAMPLE 3.12 Solve each of the following systems:
(a) x1 + x2 - 2x3 + 4x4 = 5;  2x1 + 2x2 - 3x3 + x4 = 3;  3x1 + 3x2 - 4x3 - 2x4 = 1
(b) x1 + x2 - 2x3 + 3x4 = 4;  2x1 + 3x2 + 3x3 - x4 = 3;  5x1 + 7x2 + 4x3 + x4 = 5
(c) x + 2y + z = 3;  2x + 5y - z = -4;  3x - 2y - z = 5
(a) Reduce its augmented matrix M to echelon form and then to row canonical form as follows:
    M = [1, 1, -2, 4, 5; 2, 2, -3, 1, 3; 3, 3, -4, -2, 1] ∼ [1, 1, -2, 4, 5; 0, 0, 1, -7, -7; 0, 0, 2, -14, -14] ∼ [1, 1, 0, -10, -9; 0, 0, 1, -7, -7; 0, 0, 0, 0, 0]
Rewrite the row canonical form in terms of a system of linear equations to obtain the free-variable form of the solution. That is,
    x1 + x2 - 10x4 = -9            x1 = -9 - x2 + 10x4
             x3 - 7x4 = -7    or         x3 = -7 + 7x4
(The zero row is omitted in the solution.) Observe that x1 and x3 are the pivot variables, and x2 and x4 are the free variables.
(b) First reduce its augmented matrix M to echelon form as follows:
M = \begin{bmatrix} 1 & 1 & -2 & 3 & 4 \\ 2 & 3 & 3 & -1 & 3 \\ 5 & 7 & 4 & 1 & 5 \end{bmatrix} ~ \begin{bmatrix} 1 & 1 & -2 & 3 & 4 \\ 0 & 1 & 7 & -7 & -5 \\ 0 & 2 & 14 & -14 & -15 \end{bmatrix} ~ \begin{bmatrix} 1 & 1 & -2 & 3 & 4 \\ 0 & 1 & 7 & -7 & -5 \\ 0 & 0 & 0 & 0 & -5 \end{bmatrix}
There is no need to continue to find the row canonical form of M, because the echelon form already tells us that
the system has no solution. Specifically, the third row of the echelon matrix corresponds to the degenerate
equation
0x1 + 0x2 + 0x3 + 0x4 = -5
which has no solution. Thus, the system has no solution.
(c) Reduce its augmented matrix M to echelon form and then to row canonical form as follows:
M = \begin{bmatrix} 1 & 2 & 1 & 3 \\ 2 & 5 & -1 & -4 \\ 3 & -2 & -1 & 5 \end{bmatrix} ~ \begin{bmatrix} 1 & 2 & 1 & 3 \\ 0 & 1 & -3 & -10 \\ 0 & -8 & -4 & -4 \end{bmatrix} ~ \begin{bmatrix} 1 & 2 & 1 & 3 \\ 0 & 1 & -3 & -10 \\ 0 & 0 & -28 & -84 \end{bmatrix}
~ \begin{bmatrix} 1 & 2 & 1 & 3 \\ 0 & 1 & -3 & -10 \\ 0 & 0 & 1 & 3 \end{bmatrix} ~ \begin{bmatrix} 1 & 2 & 0 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 3 \end{bmatrix} ~ \begin{bmatrix} 1 & 0 & 0 & 2 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 3 \end{bmatrix}
Thus, the system has the unique solution x = 2, y = -1, z = 3, or, equivalently, the vector u = (2, -1, 3). We
note that the echelon form of M already indicated that the solution was unique, because it corresponded to a
triangular system.
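As a quick cross-check (ours, not part of the text), NumPy's built-in solver reproduces the unique solution of system (c):

import numpy as np

A = np.array([[1, 2, 1], [2, 5, -1], [3, -2, -1]], float)   # coefficient matrix of (c)
B = np.array([3, -4, 5], float)                              # constants of (c)
X = np.linalg.solve(A, B)      # solves the square system AX = B
print(X)                       # [ 2. -1.  3.], i.e., x = 2, y = -1, z = 3
print(np.allclose(A @ X, B))   # True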
Application to Existence and Uniqueness Theorems
This subsection gives theoretical conditions for the existence and uniqueness of a solution of a system of
linear equations using the notion of the rank of a matrix.
THEOREM 3.9: Consider a system of linear equations in n unknowns with augmented matrix M = [A, B]. Then
(a) The system has a solution if and only if rank(A) = rank(M).
(b) The solution is unique if and only if rank(A) = rank(M) = n.
Proof of (a). The system has a solution if and only if an echelon form of M = [A, B] does not have a row of the form
(0, 0, ..., 0, b),  with b ≠ 0
If an echelon form of M does have such a row, then b is a pivot of M but not of A, and hence, rank(M) > rank(A). Otherwise, the echelon forms of A and M have the same pivots, and hence, rank(A) = rank(M). This proves (a).
Proof of (b). The system has a unique solution if and only if an echelon form has no free variable. This means there is a pivot for each unknown. Accordingly, n = rank(A) = rank(M). This proves (b).
The above proof uses the fact (Problem 3.74) that an echelon form of the augmented matrix M = [A, B] also automatically yields an echelon form of A.
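Theorem 3.9 can be checked numerically; the following sketch (our illustration, using NumPy's matrix_rank) classifies the three systems of Example 3.12 by comparing rank(A), rank(M), and the number n of unknowns.

import numpy as np

def classify(A, B):
    A = np.asarray(A, float)
    M = np.column_stack([A, B])                  # augmented matrix M = [A, B]
    rA = np.linalg.matrix_rank(A)
    rM = np.linalg.matrix_rank(M)
    n = A.shape[1]
    if rA != rM:
        return "no solution"
    return "unique solution" if rA == n else "infinitely many solutions"

print(classify([[1, 1, -2, 4], [2, 2, -3, 1], [3, 3, -4, -2]], [5, 3, 1]))  # (a): infinitely many solutions
print(classify([[1, 1, -2, 3], [2, 3, 3, -1], [5, 7, 4, 1]], [4, 3, 5]))    # (b): no solution
print(classify([[1, 2, 1], [2, 5, -1], [3, -2, -1]], [3, -4, 5]))           # (c): unique solution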
3.9 Matrix Equation of a System of Linear Equations
The general system (3.2) of m linear equations in n unknowns is equivalent to the matrix equation
\begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}   or   AX = B
where A = [a_{ij}] is the coefficient matrix, X = [x_j] is the column vector of unknowns, and B = [b_i] is the
column vector of constants. (Some texts write Ax = b rather than AX = B, in order to emphasize that x
and b are simply column vectors.)
The statement that the system of linear equations and the matrix equation are equivalent means that
any vector solution of the system is a solution of the matrix equation, and vice versa.
EXAMPLE 3.13 The following system of linear equations and matrix equation are equivalent:
x1 + 2x2 - 4x3 + 7x4 = 4,  3x1 - 5x2 + 6x3 - 8x4 = 8,  4x1 - 3x2 - 2x3 + 6x4 = 11
and
\begin{bmatrix} 1 & 2 & -4 & 7 \\ 3 & -5 & 6 & -8 \\ 4 & -3 & -2 & 6 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} 4 \\ 8 \\ 11 \end{bmatrix}
We note that x1 = 3, x2 = 1, x3 = 2, x4 = 1, or, in other words, the vector u = [3, 1, 2, 1] is a solution of
the system. Thus, the (column) vector u is also a solution of the matrix equation.
The matrix form AX ¼ B of a system of linear equations is notationally very convenient when
discussing and proving properties of systems of linear equations. This is illustrated with our first theorem
(described in Fig. 3-1), which we restate for easy reference.
THEOREM 3.1: Suppose the field K is infinite. Then the system AX = B has (a) a unique solution, (b) no solution, or (c) an infinite number of solutions.
Proof. It suffices to show that if AX = B has more than one solution, then it has infinitely many.
Suppose u and v are distinct solutions of AX = B; that is, Au = B and Av = B. Then, for any k ∈ K,
A[u + k(u - v)] = Au + k(Au - Av) = B + k(B - B) = B
Thus, for each k ∈ K, the vector u + k(u - v) is a solution of AX = B. Because all such solutions are
distinct (Problem 3.47), AX = B has an infinite number of solutions.
Observe that the above theorem is true when K is the real field R (or the complex field C). Section 3.3
shows that the theorem has a geometrical description when the system consists of two equations in two
unknowns, where each equation represents a line in R^2. The theorem also has a geometrical description
when the system consists of three nondegenerate equations in three unknowns, where the three equations
correspond to planes H1, H2, H3 in R^3. That is,
(a) Unique solution: Here the three planes intersect in exactly one point.
(b) No solution: Here the planes may intersect pairwise but with no common point of intersection, or two
of the planes may be parallel.
(c) Infinite number of solutions: Here the three planes may intersect in a line (one free variable), or they
may coincide (two free variables).
These three cases are pictured in Fig. 3-3.
Matrix Equation of a Square System of Linear Equations
A system AX ¼ B of linear equations is square if and only if the matrix A of coefficients is square. In such
a case, we have the following important result.
THEOREM 3.10: A square system AX = B of linear equations has a unique solution if and only if the matrix A is invertible. In such a case, A^{-1}B is the unique solution of the system.
We only prove here that if A is invertible, then A^{-1}B is a unique solution. If A is invertible, then
A(A^{-1}B) = (AA^{-1})B = IB = B
and hence, A^{-1}B is a solution. Now suppose v is any solution, so Av = B. Then
v = Iv = (A^{-1}A)v = A^{-1}(Av) = A^{-1}B
Thus, the solution A^{-1}B is unique.
EXAMPLE 3.14 Consider the following system of linear equations, whose coefficient matrix A and
inverse A^{-1} are also given:
x + 2y + 3z = 1,  x + 3y + 6z = 3,  2x + 6y + 13z = 5;   A = \begin{bmatrix} 1 & 2 & 3 \\ 1 & 3 & 6 \\ 2 & 6 & 13 \end{bmatrix},   A^{-1} = \begin{bmatrix} 3 & -8 & 3 \\ -1 & 7 & -3 \\ 0 & -2 & 1 \end{bmatrix}
By Theorem 3.10, the unique solution of the system is
A^{-1}B = \begin{bmatrix} 3 & -8 & 3 \\ -1 & 7 & -3 \\ 0 & -2 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix} = \begin{bmatrix} -6 \\ 5 \\ -1 \end{bmatrix}
That is, x = -6, y = 5, z = -1.
Remark: We emphasize that Theorem 3.10 does not usually help us to find the solution of a square
system. That is, finding the inverse of a coefficient matrix A is not usually any easier than solving the
system directly. Thus, unless we are given the inverse of a coefficient matrix A, as in Example 3.14,
we usually solve a square system by Gaussian elimination (or some iterative method whose discussion
lies beyond the scope of this text).
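The remark can be seen concretely in NumPy (our illustration): forming A^{-1} and multiplying gives the same answer as a direct solve, but the direct solve is what one would normally use.

import numpy as np

A = np.array([[1, 2, 3], [1, 3, 6], [2, 6, 13]], float)   # coefficient matrix of Example 3.14
B = np.array([1, 3, 5], float)
A_inv = np.linalg.inv(A)
print(A_inv)                     # [[ 3. -8.  3.], [-1.  7. -3.], [ 0. -2.  1.]]
print(A_inv @ B)                 # [-6.  5. -1.], i.e., x = -6, y = 5, z = -1
print(np.linalg.solve(A, B))     # same solution, without ever forming the inverse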
Figure 3-3  (a) Unique solution; (b) No solutions; (c) Infinite number of solutions [sketches of the three planes H1, H2, H3 in each case]
3.10 Systems of Linear Equations and Linear Combinations of Vectors
The general system (3.2) of linear equations may be rewritten as the following vector equation:
x_1 \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix} + x_2 \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix} + \cdots + x_n \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}
Recall that a vector v in K^n is said to be a linear combination of vectors u1, u2, ..., um in K^n if there exist
scalars a1, a2, ..., am in K such that
v = a1u1 + a2u2 + ... + amum
Accordingly, the general system (3.2) of linear equations and the above equivalent vector equation have a
solution if and only if the column vector of constants is a linear combination of the columns of the
coefficient matrix. We state this observation formally.
THEOREM 3.11: A system AX = B of linear equations has a solution if and only if B is a linear
combination of the columns of the coefficient matrix A.
Thus, the answer to the problem of expressing a given vector v in K^n as a linear combination of vectors
u1, u2, ..., um in K^n reduces to solving a system of linear equations.
Linear Combination Example
Suppose we want to write the vector v = (1, -2, 5) as a linear combination of the vectors
u1 = (1, 1, 1),  u2 = (1, 2, 3),  u3 = (2, -1, 1)
First we write v = xu1 + yu2 + zu3 with unknowns x, y, z, and then we find the equivalent system of linear
equations, which we solve. Specifically, we first write
\begin{bmatrix} 1 \\ -2 \\ 5 \end{bmatrix} = x \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + y \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} + z \begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix}     (*)
Then
\begin{bmatrix} 1 \\ -2 \\ 5 \end{bmatrix} = \begin{bmatrix} x \\ x \\ x \end{bmatrix} + \begin{bmatrix} y \\ 2y \\ 3y \end{bmatrix} + \begin{bmatrix} 2z \\ -z \\ z \end{bmatrix} = \begin{bmatrix} x + y + 2z \\ x + 2y - z \\ x + 3y + z \end{bmatrix}
Setting corresponding entries equal to each other yields the following equivalent system:
x + y + 2z = 1,  x + 2y - z = -2,  x + 3y + z = 5     (**)
For notational convenience, we have written the vectors in R^n as columns, because it is then easier to find
the equivalent system of linear equations. In fact, one can easily go from the vector equation (*) directly
to the system (**).
Now we solve the equivalent system of linear equations by reducing the system to echelon form. This yields
x + y + 2z = 1,  y - 3z = -3,  2y - z = 4   and then   x + y + 2z = 1,  y - 3z = -3,  5z = 10
Back-substitution yields the solution x = -6, y = 3, z = 2. Thus, v = -6u1 + 3u2 + 2u3.
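Because the u_i become the columns of the coefficient matrix, the computation above can be phrased as one call to a linear solver. A small NumPy sketch (ours):

import numpy as np

u1, u2, u3 = [1, 1, 1], [1, 2, 3], [2, -1, 1]
v = [1, -2, 5]
coeffs = np.linalg.solve(np.column_stack([u1, u2, u3]), v)   # solve for (x, y, z)
print(coeffs)    # [-6.  3.  2.], so v = -6*u1 + 3*u2 + 2*u3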
EXAMPLE 3.15
(a) Write the vector v = (4, 9, 19) as a linear combination of
u1 = (1, -2, 3),  u2 = (3, -7, 10),  u3 = (2, 1, 9).
Find the equivalent system of linear equations by writing v = xu1 + yu2 + zu3, and reduce the system to an
echelon form. We have
x + 3y + 2z = 4,  -2x - 7y + z = 9,  3x + 10y + 9z = 19
or
x + 3y + 2z = 4,  -y + 5z = 17,  y + 3z = 7
or
x + 3y + 2z = 4,  -y + 5z = 17,  8z = 24
Back-substitution yields the solution x = 4, y = -2, z = 3. Thus, v is a linear combination of u1, u2, u3.
Specifically, v = 4u1 - 2u2 + 3u3.
(b) Write the vector v = (2, 3, -5) as a linear combination of
u1 = (1, 2, -3),  u2 = (2, 3, -4),  u3 = (1, 3, -5)
Find the equivalent system of linear equations by writing v = xu1 + yu2 + zu3, and reduce the system to an
echelon form. We have
x + 2y + z = 2,  2x + 3y + 3z = 3,  -3x - 4y - 5z = -5
or
x + 2y + z = 2,  -y + z = -1,  2y - 2z = 1
or
x + 2y + z = 2,  -y + z = -1,  0 = -1
The third equation is degenerate with a nonzero constant, so the system has no solution. Thus, it is impossible to write v as a linear combination of u1, u2, u3.
Linear Combinations of Orthogonal Vectors, Fourier Coefficients
Recall first (Section 1.4) that the dot (inner) product u · v of vectors u = (a1, ..., an) and v = (b1, ..., bn)
in R^n is defined by
u · v = a1b1 + a2b2 + ... + anbn
Furthermore, vectors u and v are said to be orthogonal if their dot product u · v = 0.
Suppose that u1, u2, ..., un in R^n are n nonzero pairwise orthogonal vectors. This means
(i) ui · uj = 0 for i ≠ j   and   (ii) ui · ui ≠ 0 for each i
Then, for any vector v in R^n, there is an easy way to write v as a linear combination of u1, u2, ..., un,
which is illustrated in the next example.
EXAMPLE 3.16 Consider the following three vectors in R^3:
u1 = (1, 1, 1),  u2 = (1, -3, 2),  u3 = (5, -1, -4)
These vectors are pairwise orthogonal; that is,
u1 · u2 = 1 - 3 + 2 = 0,  u1 · u3 = 5 - 1 - 4 = 0,  u2 · u3 = 5 + 3 - 8 = 0
Suppose we want to write v = (4, 14, -9) as a linear combination of u1, u2, u3.
Method 1. Find the equivalent system of linear equations, as in Example 3.15, and then solve,
obtaining v = 3u1 - 4u2 + u3.
Method 2. (This method uses the fact that the vectors u1, u2, u3 are mutually orthogonal, and
hence, the arithmetic is much simpler.) Set v as a linear combination of u1, u2, u3 using unknown scalars
x, y, z as follows:
(4, 14, -9) = x(1, 1, 1) + y(1, -3, 2) + z(5, -1, -4)     (*)
Take the dot product of (*) with respect to u1 to get
(4, 14, -9) · (1, 1, 1) = x(1, 1, 1) · (1, 1, 1)   or   9 = 3x   or   x = 3
(The last two terms drop out, because u1 is orthogonal to u2 and to u3.) Next take the dot product of (*) with respect
to u2 to obtain
(4, 14, -9) · (1, -3, 2) = y(1, -3, 2) · (1, -3, 2)   or   -56 = 14y   or   y = -4
Finally, take the dot product of (*) with respect to u3 to get
(4, 14, -9) · (5, -1, -4) = z(5, -1, -4) · (5, -1, -4)   or   42 = 42z   or   z = 1
Thus, v = 3u1 - 4u2 + u3.
The procedure in Method 2 of Example 3.16 is valid in general. Namely,
THEOREM 3.12: Suppose u1, u2, ..., un are nonzero mutually orthogonal vectors in R^n. Then, for any vector v in R^n,
v = \frac{v \cdot u_1}{u_1 \cdot u_1} u_1 + \frac{v \cdot u_2}{u_2 \cdot u_2} u_2 + \cdots + \frac{v \cdot u_n}{u_n \cdot u_n} u_n
We emphasize that there must be n such orthogonal vectors ui in R^n for the formula to be used. Note
also that each ui · ui ≠ 0, because each ui is a nonzero vector.
Remark: The following scalar ki (appearing in Theorem 3.12) is called the Fourier coefficient of v
with respect to ui:
k_i = \frac{v \cdot u_i}{u_i \cdot u_i} = \frac{v \cdot u_i}{\|u_i\|^2}
It is analogous to a coefficient in the celebrated Fourier series of a function.
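A short NumPy sketch (ours) of Theorem 3.12, using the data of Example 3.16: each coefficient is the Fourier coefficient (v · u_i)/(u_i · u_i).

import numpy as np

us = [np.array([1, 1, 1]), np.array([1, -3, 2]), np.array([5, -1, -4])]
v = np.array([4, 14, -9])
coeffs = [np.dot(v, u) / np.dot(u, u) for u in us]     # Fourier coefficients
print(coeffs)                                          # [3.0, -4.0, 1.0]
print(sum(k * u for k, u in zip(coeffs, us)))          # [ 4. 14. -9.], which is v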
3.11 Homogeneous Systems of Linear Equations
A system of linear equations is said to be homogeneous if all the constant terms are zero. Thus, a
homogeneous system has the form AX ¼ 0. Clearly, such a system always has the zero vector
0 ¼ ð0; 0; . . . ; 0Þ as a solution, called the zero or trivial solution. Accordingly, we are usually interested
in whether or not the system has a nonzero solution.
Because a homogeneous system AX = 0 has at least the zero solution, it can always be put in an
echelon form, say
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + a_{14}x_4 + \cdots + a_{1n}x_n = 0
        a_{2 j_2}x_{j_2} + a_{2, j_2+1}x_{j_2+1} + \cdots + a_{2n}x_n = 0
        .......................................................................
                              a_{r j_r}x_{j_r} + \cdots + a_{rn}x_n = 0
Here r denotes the number of equations in echelon form and n denotes the number of unknowns. Thus,
the echelon system has n - r free variables.
The question of nonzero solutions reduces to the following two cases:
(i) r = n. The system has only the zero solution.
(ii) r < n. The system has a nonzero solution.
Accordingly, if we begin with fewer equations than unknowns, then, in echelon form, r < n, and the
system has a nonzero solution. This proves the following important result.
THEOREM 3.13: A homogeneous system AX ¼ 0 with more unknowns than equations has a nonzero
solution.
EXAMPLE 3.17 Determine whether or not each of the following homogeneous systems has a nonzero
solution:
(a) x + y - z = 0,  2x - 3y + z = 0,  x - 4y + 2z = 0
(b) x + y - z = 0,  2x + 4y - z = 0,  3x + 2y + 2z = 0
(c) x1 + 2x2 - 3x3 + 4x4 = 0,  2x1 - 3x2 + 5x3 - 7x4 = 0,  5x1 + 6x2 - 9x3 + 8x4 = 0
(a) Reduce the system to echelon form as follows:
x + y - z = 0,  -5y + 3z = 0,  -5y + 3z = 0   and then   x + y - z = 0,  -5y + 3z = 0
The system has a nonzero solution, because there are only two equations in the three unknowns in echelon form.
Here z is a free variable. Let us, say, set z = 5. Then, by back-substitution, y = 3 and x = 2. Thus, the vector
u = (2, 3, 5) is a particular nonzero solution.
(b) Reduce the system to echelon form as follows:
x + y - z = 0,  2y + z = 0,  -y + 5z = 0   and then   x + y - z = 0,  2y + z = 0,  11z = 0
In echelon form, there are three equations in three unknowns. Thus, the system has only the zero solution.
(c) The system must have a nonzero solution (Theorem 3.13), because there are four unknowns but only three
equations. (Here we do not need to reduce the system to echelon form.)
Basis for the General Solution of a Homogeneous System
Let W denote the general solution of a homogeneous system AX ¼ 0. A list of nonzero solution vectors
u1; u2; . . . ; us of the system is said to be a basis for W if each solution vector w 2 W can be expressed
uniquely as a linear combination of the vectors u1; u2; . . . ; us; that is, there exist unique scalars
a1; a2; . . . ; as such that
w ¼ a1u1 þ a2u2 þ    þ asus
The number s of such basis vectors is equal to the number of free variables. This number s is called the
dimension of W, written as dim W ¼ s. When W ¼ f0g—that is, the system has only the zero solution—
we define dim W ¼ 0.
The following theorem, proved in Chapter 5, page 171, tells us how to find such a basis.
THEOREM 3.14: Let W be the general solution of a homogeneous system AX ¼ 0, and suppose that
the echelon form of the homogeneous system has s free variables. Let u1; u2; . . . ; us
be the solutions obtained by setting one of the free variables equal to 1 (or any
nonzero constant) and the remaining free variables equal to 0. Then dim W ¼ s, and
the vectors u1; u2; . . . ; us form a basis of W.
We emphasize that the general solution W may have many bases, and that Theorem 3.14 only gives us
one such basis.
EXAMPLE 3.18 Find the dimension and a basis for the general solution W of the homogeneous system
x1 + 2x2 - 3x3 + 2x4 - 4x5 = 0
2x1 + 4x2 - 5x3 + x4 - 6x5 = 0
5x1 + 10x2 - 13x3 + 4x4 - 16x5 = 0
First reduce the system to echelon form. Apply the following operations:
"Replace L2 by -2L1 + L2" and "Replace L3 by -5L1 + L3", and then "Replace L3 by -2L2 + L3"
These operations yield
x1 + 2x2 - 3x3 + 2x4 - 4x5 = 0,  x3 - 3x4 + 2x5 = 0,  2x3 - 6x4 + 4x5 = 0   and   x1 + 2x2 - 3x3 + 2x4 - 4x5 = 0,  x3 - 3x4 + 2x5 = 0
The system in echelon form has three free variables, x2, x4, x5; hence, dim W = 3. Three solution vectors that form a
basis for W are obtained as follows:
(1) Set x2 = 1, x4 = 0, x5 = 0. Back-substitution yields the solution u1 = (-2, 1, 0, 0, 0).
(2) Set x2 = 0, x4 = 1, x5 = 0. Back-substitution yields the solution u2 = (7, 0, 3, 1, 0).
(3) Set x2 = 0, x4 = 0, x5 = 1. Back-substitution yields the solution u3 = (-2, 0, -2, 0, 1).
The vectors u1 = (-2, 1, 0, 0, 0), u2 = (7, 0, 3, 1, 0), u3 = (-2, 0, -2, 0, 1) form a basis for W.
Remark: Any solution of the system in Example 3.18 can be written in the form
au1 + bu2 + cu3 = a(-2, 1, 0, 0, 0) + b(7, 0, 3, 1, 0) + c(-2, 0, -2, 0, 1) = (-2a + 7b - 2c, a, 3b - 2c, b, c)
or
x1 = -2a + 7b - 2c,  x2 = a,  x3 = 3b - 2c,  x4 = b,  x5 = c
where a, b, c are arbitrary constants. Observe that this representation is nothing more than the parametric
form of the general solution under the choice of parameters x2 = a, x4 = b, x5 = c.
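A NumPy check (ours) that the vectors found in Example 3.18 behave as Theorem 3.14 predicts: each u_i solves AX = 0, they are independent, and their number equals the number of free variables, n - rank(A).

import numpy as np

A = np.array([[1, 2, -3, 2, -4], [2, 4, -5, 1, -6], [5, 10, -13, 4, -16]], float)
basis = np.array([[-2, 1, 0, 0, 0], [7, 0, 3, 1, 0], [-2, 0, -2, 0, 1]], float)
print(np.allclose(A @ basis.T, 0))              # True: every basis vector solves AX = 0
print(A.shape[1] - np.linalg.matrix_rank(A))    # 3 free variables, so dim W = 3
print(np.linalg.matrix_rank(basis))             # 3: the basis vectors are independent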
Nonhomogeneous and Associated Homogeneous Systems
Let AX = B be a nonhomogeneous system of linear equations. Then AX = 0 is called the associated
homogeneous system. For example,
x + 2y - 4z = 7,  3x - 5y + 6z = 8   and   x + 2y - 4z = 0,  3x - 5y + 6z = 0
show a nonhomogeneous system and its associated homogeneous system.
The relationship between the solution U of a nonhomogeneous system AX = B and the solution W of
its associated homogeneous system AX = 0 is contained in the following theorem.
THEOREM 3.15: Let v0 be a particular solution of AX = B and let W be the general solution of
AX = 0. Then the following is the general solution of AX = B:
U = v0 + W = {v0 + w : w ∈ W}
That is, U = v0 + W is obtained by adding v0 to each element in W. We note that this theorem has a
geometrical interpretation in R^3. Specifically, suppose W is a line through the origin O. Then, as pictured
in Fig. 3-4, U = v0 + W is the line parallel to W obtained by adding v0 to each element of W. Similarly,
whenever W is a plane through the origin O, then U = v0 + W is a plane parallel to W.
3.12 Elementary Matrices
Let e denote an elementary row operation and let eðAÞ denote the results of applying the operation e to a
matrix A. Now let E be the matrix obtained by applying e to the identity matrix I; that is,
E ¼ eðIÞ
Then E is called the elementary matrix corresponding to the elementary row operation e. Note that E is
always a square matrix.
EXAMPLE 3.19 Consider the following three elementary row operations:
(1) Interchange R2 and R3.   (2) Replace R2 by 6R2.   (3) Replace R3 by -4R1 + R3.
The 3 × 3 elementary matrices corresponding to the above elementary row operations are as follows:
E1 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix},   E2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 6 & 0 \\ 0 & 0 & 1 \end{bmatrix},   E3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -4 & 0 & 1 \end{bmatrix}
The following theorem, proved in Problem 3.34, holds.
THEOREM 3.16: Let e be an elementary row operation and let E be the corresponding m × m elementary matrix. Then
e(A) = EA
where A is any m × n matrix.
In other words, the result of applying an elementary row operation e to a matrix A can be obtained by
premultiplying A by the corresponding elementary matrix E.
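Theorem 3.16 is easy to see numerically; the sketch below (ours) applies the operation "Replace R3 by -4R1 + R3" of Example 3.19 both directly to a matrix A and via the corresponding elementary matrix E.

import numpy as np

A = np.arange(12, dtype=float).reshape(3, 4)   # any 3 x 4 matrix

E = np.eye(3)
E[2] += -4 * E[0]          # E = e(I) for e: "Replace R3 by -4R1 + R3"

eA = A.copy()
eA[2] += -4 * eA[0]        # apply e directly to A
print(np.allclose(eA, E @ A))   # True: e(A) = EA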
Now suppose e' is the inverse of an elementary row operation e, and let E' and E be the corresponding
matrices. We note (Problem 3.33) that E is invertible and E' is its inverse. This means, in particular, that
any product
P = E_k ... E_2 E_1
of elementary matrices is invertible.
Figure 3-4
Applications of Elementary Matrices
Using Theorem 3.16, we are able to prove (Problem 3.35) the following important properties of matrices.
THEOREM 3.17: Let A be a square matrix. Then the following are equivalent:
(a) A is invertible (nonsingular).
(b) A is row equivalent to the identity matrix I.
(c) A is a product of elementary matrices.
Recall that square matrices A and B are inverses if AB = BA = I. The next theorem (proved in
Problem 3.36) demonstrates that we need only show that one of the products is true, say AB = I, to prove
that matrices are inverses.
THEOREM 3.18: Suppose AB = I. Then BA = I, and hence, B = A^{-1}.
Row equivalence can also be defined in terms of matrix multiplication. Specifically, we will prove
(Problem 3.37) the following.
THEOREM 3.19: B is row equivalent to A if and only if there exists a nonsingular matrix P such that
B ¼ PA.
Application to Finding the Inverse of an n × n Matrix
The following algorithm finds the inverse of a matrix.
ALGORITHM 3.5: The input is a square matrix A. The output is the inverse of A or that the inverse does not exist.
Step 1. Form the n × 2n (block) matrix M = [A, I], where A is the left half of M and the identity matrix I is the right half of M.
Step 2. Row reduce M to echelon form. If the process generates a zero row in the A half of M, then STOP: A has no inverse. (Otherwise the A half is in triangular form.)
Step 3. Further row reduce M to its row canonical form
M ~ [I, B]
where the identity matrix I has replaced A in the left half of M.
Step 4. Set A^{-1} = B, the matrix that is now in the right half of M.
The justification for the above algorithm is as follows. Suppose A is invertible and, say, the sequence
of elementary row operations e1, e2, ..., eq applied to M = [A, I] reduces the left half of M, which is A, to
the identity matrix I. Let Ei be the elementary matrix corresponding to the operation ei. Then, by
applying Theorem 3.16, we get
E_q ... E_2 E_1 A = I   or   (E_q ... E_2 E_1 I) A = I,   so   A^{-1} = E_q ... E_2 E_1 I
That is, A^{-1} can be obtained by applying the elementary row operations e1, e2, ..., eq to the identity
matrix I, which appears in the right half of M. Thus, B = A^{-1}, as claimed.
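A compact Python sketch of Algorithm 3.5 (ours; for brevity it clears above and below each pivot in a single pass, in the Gauss-Jordan style, and uses a row interchange when a pivot is zero or small):

import numpy as np

def inverse_via_row_reduction(A, tol=1e-12):
    A = np.asarray(A, float)
    n = A.shape[0]
    M = np.hstack([A, np.eye(n)])                    # Step 1: M = [A, I]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(M[i, c]))
        if abs(M[p, c]) < tol:
            raise ValueError("A has no inverse")     # a zero column appears: STOP
        M[[c, p]] = M[[p, c]]
        M[c] /= M[c, c]                              # make the pivot equal to 1
        for i in range(n):
            if i != c:
                M[i] -= M[i, c] * M[c]               # clear the rest of the column
    return M[:, n:]                                  # Step 4: the right half is A^{-1}

A = np.array([[1, 0, 2], [2, -1, 3], [4, 1, 8]])
print(inverse_via_row_reduction(A))   # agrees with A^{-1} found in Example 3.20 below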
EXAMPLE 3.20
Find the inverse of the matrix A = \begin{bmatrix} 1 & 0 & 2 \\ 2 & -1 & 3 \\ 4 & 1 & 8 \end{bmatrix}.
First form the (block) matrix M = [A, I] and row reduce M to an echelon form:
M = \begin{bmatrix} 1 & 0 & 2 & 1 & 0 & 0 \\ 2 & -1 & 3 & 0 & 1 & 0 \\ 4 & 1 & 8 & 0 & 0 & 1 \end{bmatrix} ~ \begin{bmatrix} 1 & 0 & 2 & 1 & 0 & 0 \\ 0 & -1 & -1 & -2 & 1 & 0 \\ 0 & 1 & 0 & -4 & 0 & 1 \end{bmatrix} ~ \begin{bmatrix} 1 & 0 & 2 & 1 & 0 & 0 \\ 0 & -1 & -1 & -2 & 1 & 0 \\ 0 & 0 & -1 & -6 & 1 & 1 \end{bmatrix}
In echelon form, the left half of M is in triangular form; hence, A has an inverse. Next we further row reduce M to its
row canonical form:
M ~ \begin{bmatrix} 1 & 0 & 2 & 1 & 0 & 0 \\ 0 & 1 & 1 & 2 & -1 & 0 \\ 0 & 0 & 1 & 6 & -1 & -1 \end{bmatrix} ~ \begin{bmatrix} 1 & 0 & 0 & -11 & 2 & 2 \\ 0 & 1 & 0 & -4 & 0 & 1 \\ 0 & 0 & 1 & 6 & -1 & -1 \end{bmatrix}
The identity matrix is now in the left half of the final matrix; hence, the right half is A^{-1}. In other words,
A^{-1} = \begin{bmatrix} -11 & 2 & 2 \\ -4 & 0 & 1 \\ 6 & -1 & -1 \end{bmatrix}
Elementary Column Operations
Now let A be a matrix with columns C1; C2; . . . ; Cn. The following operations on A, analogous to the
elementary row operations, are called elementary column operations:
[F1] (Column Interchange): Interchange columns Ci and Cj.
[F2] (Column Scaling): Replace Ci by kCi (where k ≠ 0).
[F3] (Column Addition): Replace Cj by kCi + Cj.
We may indicate each of the column operations by writing, respectively,
(1) Ci <-> Cj,   (2) kCi -> Ci,   (3) (kCi + Cj) -> Cj
Moreover, each column operation has an inverse operation of the same type, just like the corresponding
row operation.
Now let f denote an elementary column operation, and let F be the matrix obtained by applying f to
the identity matrix I; that is,
F ¼ f ðIÞ
Then F is called the elementary matrix corresponding to the elementary column operation f . Note that F
is always a square matrix.
EXAMPLE 3.21
Consider the following elementary column operations:
(1) Interchange C1 and C3;   (2) Replace C3 by 2C3;   (3) Replace C3 by 3C2 + C3
The corresponding three 3 × 3 elementary matrices are as follows:
F1 = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix},   F2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix},   F3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix}
The following theorem is analogous to Theorem 3.16 for the elementary row operations.
THEOREM 3.20: For any matrix A, f(A) = AF.
That is, the result of applying an elementary column operation f on a matrix A can be obtained by
postmultiplying A by the corresponding elementary matrix F.
Matrix Equivalence
A matrix B is equivalent to a matrix A if B can be obtained from A by a sequence of row and column
operations. Alternatively, B is equivalent to A if there exist nonsingular matrices P and Q such that
B = PAQ. Just like row equivalence, equivalence of matrices is an equivalence relation.
The main result of this subsection (proved in Problem 3.38) is as follows.
THEOREM 3.21: Every m × n matrix A is equivalent to a unique block matrix of the form
\begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}
where I_r is the r-square identity matrix.
The following definition applies.
DEFINITION: The nonnegative integer r in Theorem 3.21 is called the rank of A, written rank(A).
Note that this definition agrees with the previous definition of the rank of a matrix.
3.13 LU DECOMPOSITION
Suppose A is a nonsingular matrix that can be brought into (upper) triangular form U using only row-
addition operations; that is, suppose A can be triangularized by the following algorithm, which we write
using computer notation.
ALGORITHM 3.6: The input is a matrix A and the output is a triangular matrix U.
Step 1. Repeat for i = 1, 2, ..., n - 1:
Step 2. Repeat for j = i + 1, i + 2, ..., n:
(a) Set m_{ji} := -a_{ji}/a_{ii}.
(b) Set R_j := m_{ji} R_i + R_j.
[End of Step 2 inner loop.]
[End of Step 1 outer loop.]
The numbers m_{ji} are called multipliers. Sometimes we keep track of these multipliers by means of the
following lower triangular matrix L:
L = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 & 0 \\ -m_{21} & 1 & 0 & \cdots & 0 & 0 \\ -m_{31} & -m_{32} & 1 & \cdots & 0 & 0 \\ \cdots & \cdots & \cdots & \cdots & \cdots & \cdots \\ -m_{n1} & -m_{n2} & -m_{n3} & \cdots & -m_{n,n-1} & 1 \end{bmatrix}
That is, L has 1's on the diagonal, 0's above the diagonal, and the negative of the multiplier m_{ji} as its
ji-entry below the diagonal.
The above matrix L and the triangular matrix U obtained in Algorithm 3.6 give us the classical LU
factorization of such a matrix A. Namely,
THEOREM 3.22: Let A be a nonsingular matrix that can be brought into triangular form U using only
row-addition operations. Then A ¼ LU, where L is the above lower triangular matrix
with 1’s on the diagonal, and U is an upper triangular matrix with no 0’s on the
diagonal.
EXAMPLE 3.22 Suppose A = \begin{bmatrix} 1 & 2 & -3 \\ -3 & -4 & 13 \\ 2 & 1 & -5 \end{bmatrix}. We note that A may be reduced to triangular form by the operations
"Replace R2 by 3R1 + R2",  "Replace R3 by -2R1 + R3",  and then  "Replace R3 by (3/2)R2 + R3"
That is,
A ~ \begin{bmatrix} 1 & 2 & -3 \\ 0 & 2 & 4 \\ 0 & -3 & 1 \end{bmatrix} ~ \begin{bmatrix} 1 & 2 & -3 \\ 0 & 2 & 4 \\ 0 & 0 & 7 \end{bmatrix}
This gives us the classical factorization A = LU, where
L = \begin{bmatrix} 1 & 0 & 0 \\ -3 & 1 & 0 \\ 2 & -3/2 & 1 \end{bmatrix}   and   U = \begin{bmatrix} 1 & 2 & -3 \\ 0 & 2 & 4 \\ 0 & 0 & 7 \end{bmatrix}
We emphasize:
(1) The entries -3, 2, -3/2 in L are the negatives of the multipliers in the above elementary row operations.
(2) U is the triangular form of A.
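Algorithm 3.6 is short enough to state directly in NumPy (a sketch, for matrices that need no row interchanges); L records the negatives of the multipliers, as in Example 3.22.

import numpy as np

def lu_no_pivot(A):
    U = np.asarray(A, float).copy()
    n = U.shape[0]
    L = np.eye(n)
    for i in range(n - 1):
        for j in range(i + 1, n):
            m = -U[j, i] / U[i, i]    # multiplier of the operation R_j := m R_i + R_j
            U[j] += m * U[i]
            L[j, i] = -m              # L stores the negative of the multiplier
    return L, U

A = np.array([[1, 2, -3], [-3, -4, 13], [2, 1, -5]])
L, U = lu_no_pivot(A)
print(L)                      # [[1, 0, 0], [-3, 1, 0], [2, -1.5, 1]]
print(U)                      # [[1, 2, -3], [0, 2, 4], [0, 0, 7]]
print(np.allclose(L @ U, A))  # True: A = LU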
Application to Systems of Linear Equations
Consider a computer algorithm M. Let CðnÞ denote the running time of the algorithm as a function of the
size n of the input data. [The function CðnÞ is sometimes called the time complexity or simply the
complexity of the algorithm M.] Frequently, CðnÞ simply counts the number of multiplications and
divisions executed by M, but does not count the number of additions and subtractions because they take
much less time to execute.
Now consider a square system of linear equations AX = B, where
A = [a_ij],   X = [x_1, ..., x_n]^T,   B = [b_1, ..., b_n]^T
and suppose A has an LU factorization. Then the system can be brought into triangular form (in order to
apply back-substitution) by applying Algorithm 3.6 to the augmented matrix M = [A, B] of the system.
The time complexity of Algorithm 3.6 and back-substitution are, respectively,
C(n) ≈ n^3/2   and   C(n) ≈ n^2/2
where n is the number of equations.
On the other hand, suppose we already have the factorization A = LU. Then, to triangularize the
system, we need only apply the row operations in the algorithm (retained by the matrix L) to the column
vector B. In this case, the time complexity is
C(n) ≈ n^2/2
Of course, to obtain the factorization A = LU requires the original algorithm, where C(n) ≈ n^3/2. Thus,
nothing may be gained by first finding the LU factorization when a single system is involved. However,
there are situations, illustrated below, where the LU factorization is useful.
Suppose, for a given matrix A, we need to solve the system
AX ¼ B
repeatedly for a sequence of different constant vectors, say B1; B2; . . . ; Bk. Also, suppose some of the Bi
depend upon the solution of the system obtained while using preceding vectors Bj. In such a case, it is
more efficient to first find the LU factorization of A, and then to use this factorization to solve the system
for each new B.
EXAMPLE 3.23 Consider the following system of linear equations:
x + 2y + z = k1,  2x + 3y + 3z = k2,  3x + 10y + 2z = k3   or   AX = B, where A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 3 & 3 \\ 3 & 10 & 2 \end{bmatrix} and B = \begin{bmatrix} k_1 \\ k_2 \\ k_3 \end{bmatrix}
Suppose we want to solve the system three times where B is equal, say, to B1, B2, B3. Furthermore, suppose
B1 = [1, 1, 1]^T, and suppose
B_{j+1} = B_j + X_j   (for j = 1, 2)
where X_j is the solution of AX = B_j. Here it is more efficient to first obtain the LU factorization of A and then use the
LU factorization to solve the system for each of the B's. (This is done in Problem 3.42.)
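The reuse idea can be sketched as follows (ours; lu_no_pivot repeats the factorization sketch given after Example 3.22): factor A once, then each new B_j costs only a forward substitution with L and a back-substitution with U.

import numpy as np

def lu_no_pivot(A):
    U = np.asarray(A, float).copy(); n = U.shape[0]; L = np.eye(n)
    for i in range(n - 1):
        for j in range(i + 1, n):
            L[j, i] = U[j, i] / U[i, i]       # negative of the multiplier
            U[j] -= L[j, i] * U[i]
    return L, U

def forward_sub(L, b):                        # solve Ly = b, L unit lower triangular
    y = np.zeros(len(b))
    for i in range(len(b)):
        y[i] = b[i] - L[i, :i] @ y[:i]
    return y

def back_sub(U, y):                           # solve Ux = y, U upper triangular
    x = np.zeros(len(y))
    for i in reversed(range(len(y))):
        x[i] = (y[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
    return x

A = np.array([[1, 2, 1], [2, 3, 3], [3, 10, 2]], float)
L, U = lu_no_pivot(A)                         # factor once
B = np.array([1, 1, 1], float)                # B_1
for j in range(3):                            # B_{j+1} = B_j + X_j, as in the example
    X = back_sub(U, forward_sub(L, B))
    print("X_%d =" % (j + 1), X)
    B = B + X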
SOLVED PROBLEMS
Linear Equations, Solutions, 2 × 2 Systems
3.1. Determine whether each of the following equations is linear:
(a) 5x + 7y - 8yz = 16,   (b) x + πy + ez = log 5,   (c) 3x + ky - 8z = 16
(a) No, because the product yz of two unknowns is of second degree.
(b) Yes, because π, e, and log 5 are constants.
(c) As it stands, there are four unknowns: x, y, z, k. Because of the term ky it is not a linear equation. However,
assuming k is a constant, the equation is linear in the unknowns x, y, z.
3.2. Determine whether the following vectors are solutions of x1 þ 2x2  4x3 þ 3x4 ¼ 15:
(a) u ¼ ð3; 2; 1; 4Þ and (b) v ¼ ð1; 2; 4; 5Þ:
(a) Substitute to obtain 3 þ 2ð2Þ  4ð1Þ þ 3ð4Þ ¼ 15, or 15 ¼ 15; yes, it is a solution.
(b) Substitute to obtain 1 þ 2ð2Þ  4ð4Þ þ 3ð5Þ ¼ 15, or 4 ¼ 15; no, it is not a solution.
3.3. Solve (a) ex ¼ p, (b) 3x  4  x ¼ 2x þ 3, (c) 7 þ 2x  4 ¼ 3x þ 3  x
(a) Because e 6¼ 0, multiply by 1=e to obtain x ¼ p=e.
(b) Rewrite in standard form, obtaining 0x ¼ 7. The equation has no solution.
(c) Rewrite in standard form, obtaining 0x ¼ 0. Every scalar k is a solution.
3.4. Prove Theorem 3.4: Consider the equation ax ¼ b.
(i) If a 6¼ 0, then x ¼ b=a is a unique solution of ax ¼ b.
(ii) If a ¼ 0 but b 6¼ 0, then ax ¼ b has no solution.
(iii) If a ¼ 0 and b ¼ 0, then every scalar k is a solution of ax ¼ b.
Suppose a 6¼ 0. Then the scalar b=a exists. Substituting b=a in ax ¼ b yields aðb=aÞ ¼ b, or b ¼ b;
hence, b=a is a solution. On the other hand, suppose x0 is a solution to ax ¼ b, so that ax0 ¼ b. Multiplying
both sides by 1=a yields x0 ¼ b=a. Hence, b=a is the unique solution of ax ¼ b. Thus, (i) is proved.
On the other hand, suppose a ¼ 0. Then, for any scalar k, we have ak ¼ 0k ¼ 0. If b 6¼ 0, then ak 6¼ b.
Accordingly, k is not a solution of ax ¼ b, and so (ii) is proved. If b ¼ 0, then ak ¼ b. That is, any scalar k is
a solution of ax ¼ b, and so (iii) is proved.
3.5. Solve each of the following systems:
(a) 2x - 5y = 11,  3x + 4y = 5   (b) 2x - 3y = 8,  -6x + 9y = 6   (c) 2x - 3y = 8,  -4x + 6y = -16
(a) Eliminate x from the equations by forming the new equation L = -3L1 + 2L2. This yields the equation
23y = -23,  and so y = -1
Substitute y = -1 in one of the original equations, say L1, to get
2x - 5(-1) = 11   or   2x + 5 = 11   or   2x = 6   or   x = 3
Thus, x = 3, y = -1 or the pair u = (3, -1) is the unique solution of the system.
(b) Eliminate x from the equations by forming the new equation L = 3L1 + L2. This yields the equation
0x + 0y = 30
This is a degenerate equation with a nonzero constant; hence, this equation and the system have no
solution. (Geometrically, the lines corresponding to the equations are parallel.)
(c) Eliminate x from the equations by forming the new equation L = 2L1 + L2. This yields the equation
0x + 0y = 0
This is a degenerate equation where the constant term is also zero. Thus, the system has an infinite
number of solutions, which correspond to the solutions of either equation. (Geometrically, the lines
corresponding to the equations coincide.)
To find the general solution, set y = a and substitute in L1 to obtain
2x - 3a = 8   or   2x = 3a + 8   or   x = (3/2)a + 4
Thus, the general solution is
x = (3/2)a + 4,  y = a   or   u = ((3/2)a + 4, a)
where a is any scalar.
3.6. Consider the system
x + ay = 4,  ax + 9y = b
(a) For which values of a does the system have a unique solution?
(b) Find those pairs of values (a, b) for which the system has more than one solution.
(a) Eliminate x from the equations by forming the new equation L = -aL1 + L2. This yields the equation
(9 - a^2)y = b - 4a     (1)
The system has a unique solution if and only if the coefficient of y in (1) is not zero; that is, if
9 - a^2 ≠ 0 or if a ≠ ±3.
(b) The system has more than one solution if both sides of (1) are zero. The left-hand side is zero when
a = ±3. When a = 3, the right-hand side is zero when b - 12 = 0 or b = 12. When a = -3, the right-
hand side is zero when b + 12 = 0 or b = -12. Thus, (3, 12) and (-3, -12) are the pairs for which the
system has more than one solution.
Systems in Triangular and Echelon Form
3.7. Determine the pivot and free variables in each of the following systems:
(a) 2x1 - 3x2 - 6x3 - 5x4 + 2x5 = 7,  x3 + 3x4 - 7x5 = 6,  x4 - 2x5 = 1
(b) 2x - 6y + 7z = 1,  4y + 3z = 8,  2z = 4
(c) x + 2y - 3z = 2,  2x + 3y + z = 4,  3x + 4y + 5z = 8
(a) In echelon form, the leading unknowns are the pivot variables, and the others are the free variables. Here
x1, x3, x4 are the pivot variables, and x2 and x5 are the free variables.
(b) The leading unknowns are x, y, z, so they are the pivot variables. There are no free variables (as in any
triangular system).
(c) The notion of pivot and free variables applies only to a system in echelon form.
3.8. Solve the triangular system in Problem 3.7(b).
Because it is a triangular system, solve by back-substitution.
(i) The last equation gives z = 2.
(ii) Substitute z = 2 in the second equation to get 4y + 6 = 8 or y = 1/2.
(iii) Substitute z = 2 and y = 1/2 in the first equation to get
2x - 6(1/2) + 7(2) = 1   or   2x + 11 = 1   or   x = -5
Thus, x = -5, y = 1/2, z = 2 or u = (-5, 1/2, 2) is the unique solution to the system.
3.9. Solve the echelon system in Problem 3.7(a).
Assign parameters to the free variables, say x2 = a and x5 = b, and solve for the pivot variables by back-
substitution.
(i) Substitute x5 = b in the last equation to get x4 - 2b = 1 or x4 = 2b + 1.
(ii) Substitute x5 = b and x4 = 2b + 1 in the second equation to get
x3 + 3(2b + 1) - 7b = 6   or   x3 - b + 3 = 6   or   x3 = b + 3
(iii) Substitute x5 = b, x4 = 2b + 1, x3 = b + 3, x2 = a in the first equation to get
2x1 - 3a - 6(b + 3) - 5(2b + 1) + 2b = 7   or   2x1 - 3a - 14b - 23 = 7   or   x1 = (3/2)a + 7b + 15
Thus,
x1 = (3/2)a + 7b + 15,  x2 = a,  x3 = b + 3,  x4 = 2b + 1,  x5 = b
or u = ((3/2)a + 7b + 15, a, b + 3, 2b + 1, b) is the parametric form of the general solution.
Alternatively, solving for the pivot variables x1, x3, x4 in terms of the free variables x2 and x5 yields the
following free-variable form of the general solution:
x1 = (3/2)x2 + 7x5 + 15,  x3 = x5 + 3,  x4 = 2x5 + 1
3.10. Prove Theorem 3.6. Consider the system (3.4) of linear equations in echelon form with r equations
and n unknowns.
(i) If r ¼ n, then the system has a unique solution.
(ii) If r < n, then we can arbitrarily assign values to the n - r free variables and solve uniquely for
the r pivot variables, obtaining a solution of the system.
(i) Suppose r ¼ n. Then we have a square system AX ¼ B where the matrix A of coefficients is (upper)
triangular with nonzero diagonal elements. Thus, A is invertible. By Theorem 3.10, the system has a
unique solution.
(ii) Assigning values to the n  r free variables yields a triangular system in the pivot variables, which, by
(i), has a unique solution.
Gaussian Elimination
3.11. Solve each of the following systems:
(a) x + 2y - 4z = 4,  2x + 5y - 9z = 10,  3x - 2y + 3z = -11
(b) x + 2y - 3z = 1,  -3x + y - 2z = 7,  5x + 3y - 4z = -2
(c) x + 2y - 3z = 1,  2x + 5y - 8z = 4,  3x + 8y - 13z = 7
Reduce each system to triangular or echelon form using Gaussian elimination:
(a) Apply "Replace L2 by -2L1 + L2" and "Replace L3 by -3L1 + L3" to eliminate x from the second and
third equations, and then apply "Replace L3 by 8L2 + L3" to eliminate y from the third equation. These
operations yield
x + 2y - 4z = 4,  y - z = 2,  -8y + 15z = -23   and then   x + 2y - 4z = 4,  y - z = 2,  7z = -7
The system is in triangular form. Solve by back-substitution to obtain the unique solution
u = (-2, 1, -1).
(b) Eliminate x from the second and third equations by the operations "Replace L2 by 3L1 + L2" and
"Replace L3 by -5L1 + L3." This gives the equivalent system
x + 2y - 3z = 1,  7y - 11z = 10,  -7y + 11z = -7
The operation "Replace L3 by L2 + L3" yields the following degenerate equation with a nonzero
constant:
0x + 0y + 0z = 3
This equation and hence the system have no solution.
(c) Eliminate x from the second and third equations by the operations "Replace L2 by -2L1 + L2" and
"Replace L3 by -3L1 + L3." This yields the new system
x + 2y - 3z = 1,  y - 2z = 2,  2y - 4z = 4   or   x + 2y - 3z = 1,  y - 2z = 2
(The third equation is deleted, because it is a multiple of the second equation.) The system is in echelon
form with pivot variables x and y and free variable z.
To find the parametric form of the general solution, set z = a and solve for x and y by back-
substitution. Substitute z = a in the second equation to get y = 2 + 2a. Then substitute z = a and
y = 2 + 2a in the first equation to get
x + 2(2 + 2a) - 3a = 1   or   x + 4 + a = 1   or   x = -3 - a
Thus, the general solution is
x = -3 - a,  y = 2 + 2a,  z = a   or   u = (-3 - a, 2 + 2a, a)
where a is a parameter.
3.12. Solve each of the following systems:
(a) x1 - 3x2 + 2x3 - x4 + 2x5 = 2,  3x1 - 9x2 + 7x3 - x4 + 3x5 = 7,  2x1 - 6x2 + 7x3 + 4x4 - 5x5 = 7
(b) x1 + 2x2 - 3x3 + 4x4 = 2,  2x1 + 5x2 - 2x3 + x4 = 1,  5x1 + 12x2 - 7x3 + 6x4 = 3
Reduce each system to echelon form using Gaussian elimination:
(a) Apply "Replace L2 by -3L1 + L2" and "Replace L3 by -2L1 + L3" to eliminate x1 from the second and
third equations. This yields
x1 - 3x2 + 2x3 - x4 + 2x5 = 2,  x3 + 2x4 - 3x5 = 1,  3x3 + 6x4 - 9x5 = 3   or   x1 - 3x2 + 2x3 - x4 + 2x5 = 2,  x3 + 2x4 - 3x5 = 1
(We delete L3, because it is a multiple of L2.) The system is in echelon form with pivot variables x1 and
x3 and free variables x2, x4, x5.
To find the parametric form of the general solution, set x2 = a, x4 = b, x5 = c, where a, b, c are
parameters. Back-substitution yields x3 = 1 - 2b + 3c and x1 = 3a + 5b - 8c. The general solution is
x1 = 3a + 5b - 8c,  x2 = a,  x3 = 1 - 2b + 3c,  x4 = b,  x5 = c
or, equivalently, u = (3a + 5b - 8c, a, 1 - 2b + 3c, b, c).
(b) Eliminate x1 from the second and third equations by the operations "Replace L2 by -2L1 + L2" and
"Replace L3 by -5L1 + L3." This yields the system
x1 + 2x2 - 3x3 + 4x4 = 2,  x2 + 4x3 - 7x4 = -3,  2x2 + 8x3 - 14x4 = -7
The operation "Replace L3 by -2L2 + L3" yields the degenerate equation 0 = -1. Thus, the system
has no solution (even though the system has more unknowns than equations).
3.13. Solve using the condensed format:
2y + 3z = 3,  x + y + z = 4,  4x + 8y - 3z = 35
The condensed format follows:
Number   Equation                 Operation
(2)      2y + 3z = 3              L1 <-> L2
(1)      x + y + z = 4            L1 <-> L2
(3)      4x + 8y - 3z = 35
(3')     4y - 7z = 19             Replace L3 by -4L1 + L3
(3'')    -13z = 13                Replace L3 by -2L2 + L3
Here (1), (2), and (3'') form a triangular system. (We emphasize that the interchange of L1 and L2 is
accomplished by simply renumbering L1 and L2 as above.)
Using back-substitution with the triangular system yields z = -1 from L3, y = 3 from L2, and x = 2
from L1. Thus, the unique solution of the system is x = 2, y = 3, z = -1 or the triple u = (2, 3, -1).
3.14. Consider the system
x + 2y + z = 3,  ay + 5z = 10,  2x + 7y + az = b
(a) Find those values of a for which the system has a unique solution.
(b) Find those pairs of values (a, b) for which the system has more than one solution.
Reduce the system to echelon form. That is, eliminate x from the third equation by the operation
"Replace L3 by -2L1 + L3" and then eliminate y from the third equation by the operation
"Replace L3 by -3L2 + aL3." This yields
x + 2y + z = 3,  ay + 5z = 10,  3y + (a - 2)z = b - 6   and then   x + 2y + z = 3,  ay + 5z = 10,  (a^2 - 2a - 15)z = ab - 6a - 30
Examine the last equation (a^2 - 2a - 15)z = ab - 6a - 30.
(a) The system has a unique solution if and only if the coefficient of z is not zero; that is, if
a^2 - 2a - 15 = (a - 5)(a + 3) ≠ 0   or   a ≠ 5 and a ≠ -3.
(b) The system has more than one solution if both sides are zero. The left-hand side is zero when a = 5 or
a = -3. When a = 5, the right-hand side is zero when 5b - 60 = 0, or b = 12. When a = -3, the right-
hand side is zero when -3b - 12 = 0, or b = -4. Thus, (5, 12) and (-3, -4) are the pairs for which the
system has more than one solution.
Echelon Matrices, Row Equivalence, Row Canonical Form
3.15. Row reduce each of the following matrices to echelon form:
(a) A = \begin{bmatrix} 1 & 2 & -3 & 0 \\ 2 & 4 & -2 & 2 \\ 3 & 6 & -4 & 3 \end{bmatrix};   (b) B = \begin{bmatrix} 4 & -1 & 6 \\ 1 & 2 & -5 \\ 6 & 3 & -4 \end{bmatrix}
(a) Use a11 = 1 as a pivot to obtain 0's below a11; that is, apply the row operations "Replace R2 by
-2R1 + R2" and "Replace R3 by -3R1 + R3." Then use a23 = 4 as a pivot to obtain a 0 below a23; that
is, apply the row operation "Replace R3 by -5R2 + 4R3." These operations yield
A ~ \begin{bmatrix} 1 & 2 & -3 & 0 \\ 0 & 0 & 4 & 2 \\ 0 & 0 & 5 & 3 \end{bmatrix} ~ \begin{bmatrix} 1 & 2 & -3 & 0 \\ 0 & 0 & 4 & 2 \\ 0 & 0 & 0 & 2 \end{bmatrix}
The matrix is now in echelon form.
(b) Hand calculations are usually simpler if the pivot element equals 1. Therefore, first interchange R1 and R2.
Next apply the operations "Replace R2 by -4R1 + R2" and "Replace R3 by -6R1 + R3"; and then apply
the operation "Replace R3 by -R2 + R3." These operations yield
B ~ \begin{bmatrix} 1 & 2 & -5 \\ 4 & -1 & 6 \\ 6 & 3 & -4 \end{bmatrix} ~ \begin{bmatrix} 1 & 2 & -5 \\ 0 & -9 & 26 \\ 0 & -9 & 26 \end{bmatrix} ~ \begin{bmatrix} 1 & 2 & -5 \\ 0 & -9 & 26 \\ 0 & 0 & 0 \end{bmatrix}
The matrix is now in echelon form.
3.16. Describe the pivoting row-reduction algorithm. Also describe the advantages, if any, of using this
pivoting algorithm.
The row-reduction algorithm becomes a pivoting algorithm if the entry in column j of greatest absolute
value is chosen as the pivot a_{1 j_1} and if one uses the row operation
(-a_{i j_1}/a_{1 j_1}) R_1 + R_i -> R_i
The main advantage of the pivoting algorithm is that the above row operation involves division by the
(current) pivot a_{1 j_1}, and, on the computer, roundoff errors may be substantially reduced when one divides by
a number as large in absolute value as possible.
3.17. Let A = \begin{bmatrix} 2 & -2 & 2 & 1 \\ -3 & 6 & 0 & -1 \\ 1 & -7 & 10 & 2 \end{bmatrix}. Reduce A to echelon form using the pivoting algorithm.
First interchange R1 and R2 so that -3 can be used as the pivot, and then apply the operations "Replace R2
by (2/3)R1 + R2" and "Replace R3 by (1/3)R1 + R3." These operations yield
A ~ \begin{bmatrix} -3 & 6 & 0 & -1 \\ 2 & -2 & 2 & 1 \\ 1 & -7 & 10 & 2 \end{bmatrix} ~ \begin{bmatrix} -3 & 6 & 0 & -1 \\ 0 & 2 & 2 & 1/3 \\ 0 & -5 & 10 & 5/3 \end{bmatrix}
Now interchange R2 and R3 so that -5 can be used as the pivot, and then apply the operation "Replace R3 by
(2/5)R2 + R3." We obtain
A ~ \begin{bmatrix} -3 & 6 & 0 & -1 \\ 0 & -5 & 10 & 5/3 \\ 0 & 2 & 2 & 1/3 \end{bmatrix} ~ \begin{bmatrix} -3 & 6 & 0 & -1 \\ 0 & -5 & 10 & 5/3 \\ 0 & 0 & 6 & 1 \end{bmatrix}
The matrix has been brought to echelon form using partial pivoting.
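The pivoting strategy of Problems 3.16 and 3.17 in a short NumPy sketch (ours): at each stage the entry of largest absolute value in the working column is chosen as the pivot before eliminating below it.

import numpy as np

def echelon_with_partial_pivoting(A, tol=1e-12):
    A = np.asarray(A, float).copy()
    m, n = A.shape
    r = 0
    for c in range(n):
        if r == m:
            break
        p = max(range(r, m), key=lambda i: abs(A[i, c]))   # largest candidate pivot
        if abs(A[p, c]) < tol:
            continue
        A[[r, p]] = A[[p, r]]                              # interchange rows
        for i in range(r + 1, m):
            A[i] -= (A[i, c] / A[r, c]) * A[r]             # divide only by the large pivot
        r += 1
    return A

A = np.array([[2, -2, 2, 1], [-3, 6, 0, -1], [1, -7, 10, 2]])
print(echelon_with_partial_pivoting(A))   # reproduces the echelon form obtained in Problem 3.17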
3.18. Reduce each of the following matrices to row canonical form:
(a) A = \begin{bmatrix} 2 & 2 & -1 & 6 & 4 \\ 4 & 4 & 1 & 10 & 13 \\ 8 & 8 & -1 & 26 & 23 \end{bmatrix};   (b) B = \begin{bmatrix} 5 & 9 & 6 \\ 0 & 2 & 3 \\ 0 & 0 & 7 \end{bmatrix}
(a) First reduce A to echelon form by applying the operations "Replace R2 by -2R1 + R2" and "Replace R3
by -4R1 + R3," and then applying the operation "Replace R3 by -R2 + R3." These operations yield
A ~ \begin{bmatrix} 2 & 2 & -1 & 6 & 4 \\ 0 & 0 & 3 & -2 & 5 \\ 0 & 0 & 3 & 2 & 7 \end{bmatrix} ~ \begin{bmatrix} 2 & 2 & -1 & 6 & 4 \\ 0 & 0 & 3 & -2 & 5 \\ 0 & 0 & 0 & 4 & 2 \end{bmatrix}
Now use back-substitution on the echelon matrix to obtain the row canonical form of A. Specifically,
first multiply R3 by 1/4 to obtain the pivot a34 = 1, and then apply the operations "Replace R2 by
2R3 + R2" and "Replace R1 by -6R3 + R1." These operations yield
A ~ \begin{bmatrix} 2 & 2 & -1 & 6 & 4 \\ 0 & 0 & 3 & -2 & 5 \\ 0 & 0 & 0 & 1 & 1/2 \end{bmatrix} ~ \begin{bmatrix} 2 & 2 & -1 & 0 & 1 \\ 0 & 0 & 3 & 0 & 6 \\ 0 & 0 & 0 & 1 & 1/2 \end{bmatrix}
Now multiply R2 by 1/3, making the pivot a23 = 1, and then apply "Replace R1 by R2 + R1," yielding
A ~ \begin{bmatrix} 2 & 2 & -1 & 0 & 1 \\ 0 & 0 & 1 & 0 & 2 \\ 0 & 0 & 0 & 1 & 1/2 \end{bmatrix} ~ \begin{bmatrix} 2 & 2 & 0 & 0 & 3 \\ 0 & 0 & 1 & 0 & 2 \\ 0 & 0 & 0 & 1 & 1/2 \end{bmatrix}
Finally, multiply R1 by 1/2, so the pivot a11 = 1. Thus, we obtain the following row canonical form of A:
A ~ \begin{bmatrix} 1 & 1 & 0 & 0 & 3/2 \\ 0 & 0 & 1 & 0 & 2 \\ 0 & 0 & 0 & 1 & 1/2 \end{bmatrix}
(b) Because B is in echelon form, use back-substitution to obtain
B ~ \begin{bmatrix} 5 & 9 & 6 \\ 0 & 2 & 3 \\ 0 & 0 & 1 \end{bmatrix} ~ \begin{bmatrix} 5 & 9 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix} ~ \begin{bmatrix} 5 & 9 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} ~ \begin{bmatrix} 5 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} ~ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
The last matrix, which is the identity matrix I, is the row canonical form of B. (This is expected, because
B is invertible, and so its row canonical form must be I.)
3.19. Describe the Gauss–Jordan elimination algorithm, which also row reduces an arbitrary matrix A to
its row canonical form.
The Gauss–Jordan algorithm is similar in some ways to the Gaussian elimination algorithm, except that
here each pivot is used to place 0’s both below and above the pivot, not just below the pivot, before working
with the next pivot. Also, one variation of the algorithm first normalizes each row—that is, obtains a unit
pivot—before it is used to produce 0’s in the other rows, rather than normalizing the rows at the end of the
algorithm.
3.20. Let A = \begin{bmatrix} 1 & -2 & 3 & 1 & 2 \\ 1 & 1 & 4 & -1 & 3 \\ 2 & 5 & 9 & -2 & 8 \end{bmatrix}. Use Gauss-Jordan to find the row canonical form of A.
Use a11 = 1 as a pivot to obtain 0's below a11 by applying the operations "Replace R2 by -R1 + R2"
and "Replace R3 by -2R1 + R3." This yields
A ~ \begin{bmatrix} 1 & -2 & 3 & 1 & 2 \\ 0 & 3 & 1 & -2 & 1 \\ 0 & 9 & 3 & -4 & 4 \end{bmatrix}
Multiply R2 by 1/3 to make the pivot a22 = 1, and then produce 0's below and above a22 by applying the
operations "Replace R3 by -9R2 + R3" and "Replace R1 by 2R2 + R1." These operations yield
A ~ \begin{bmatrix} 1 & -2 & 3 & 1 & 2 \\ 0 & 1 & 1/3 & -2/3 & 1/3 \\ 0 & 9 & 3 & -4 & 4 \end{bmatrix} ~ \begin{bmatrix} 1 & 0 & 11/3 & -1/3 & 8/3 \\ 0 & 1 & 1/3 & -2/3 & 1/3 \\ 0 & 0 & 0 & 2 & 1 \end{bmatrix}
Finally, multiply R3 by 1/2 to make the pivot a34 = 1, and then produce 0's above a34 by applying the
operations "Replace R2 by (2/3)R3 + R2" and "Replace R1 by (1/3)R3 + R1." These operations yield
A ~ \begin{bmatrix} 1 & 0 & 11/3 & -1/3 & 8/3 \\ 0 & 1 & 1/3 & -2/3 & 1/3 \\ 0 & 0 & 0 & 1 & 1/2 \end{bmatrix} ~ \begin{bmatrix} 1 & 0 & 11/3 & 0 & 17/6 \\ 0 & 1 & 1/3 & 0 & 2/3 \\ 0 & 0 & 0 & 1 & 1/2 \end{bmatrix}
which is the row canonical form of A.
Systems of Linear Equations in Matrix Form
3.21. Find the augmented matrix M and the coefficient matrix A of the following system:
x + 2y - 3z = 4,  3y - 4z + 7x = 5,  6z + 8x - 9y = 1
First align the unknowns in the system, and then use the aligned system to obtain M and A. We have
x + 2y - 3z = 4,  7x + 3y - 4z = 5,  8x - 9y + 6z = 1
then
M = \begin{bmatrix} 1 & 2 & -3 & 4 \\ 7 & 3 & -4 & 5 \\ 8 & -9 & 6 & 1 \end{bmatrix}   and   A = \begin{bmatrix} 1 & 2 & -3 \\ 7 & 3 & -4 \\ 8 & -9 & 6 \end{bmatrix}
3.22. Solve each of the following systems using its augmented matrix M:
(a) x + 2y - z = 3,  x + 3y + z = 5,  3x + 8y + 4z = 17
(b) x - 2y + 4z = 2,  2x - 3y + 5z = 3,  3x - 4y + 6z = 7
(c) x + y + 3z = 1,  2x + 3y - z = 3,  5x + 7y + z = 7
(a) Reduce the augmented matrix M to echelon form as follows:
M = \begin{bmatrix} 1 & 2 & -1 & 3 \\ 1 & 3 & 1 & 5 \\ 3 & 8 & 4 & 17 \end{bmatrix} ~ \begin{bmatrix} 1 & 2 & -1 & 3 \\ 0 & 1 & 2 & 2 \\ 0 & 2 & 7 & 8 \end{bmatrix} ~ \begin{bmatrix} 1 & 2 & -1 & 3 \\ 0 & 1 & 2 & 2 \\ 0 & 0 & 3 & 4 \end{bmatrix}
Now write down the corresponding triangular system
x + 2y - z = 3,  y + 2z = 2,  3z = 4
and solve by back-substitution to obtain the unique solution
x = 17/3,  y = -2/3,  z = 4/3   or   u = (17/3, -2/3, 4/3)
Alternately, reduce the echelon form of M to row canonical form, obtaining
M ~ \begin{bmatrix} 1 & 2 & -1 & 3 \\ 0 & 1 & 2 & 2 \\ 0 & 0 & 1 & 4/3 \end{bmatrix} ~ \begin{bmatrix} 1 & 2 & 0 & 13/3 \\ 0 & 1 & 0 & -2/3 \\ 0 & 0 & 1 & 4/3 \end{bmatrix} ~ \begin{bmatrix} 1 & 0 & 0 & 17/3 \\ 0 & 1 & 0 & -2/3 \\ 0 & 0 & 1 & 4/3 \end{bmatrix}
This also corresponds to the above solution.
(b) First reduce the augmented matrix M to echelon form as follows:
M = \begin{bmatrix} 1 & -2 & 4 & 2 \\ 2 & -3 & 5 & 3 \\ 3 & -4 & 6 & 7 \end{bmatrix} ~ \begin{bmatrix} 1 & -2 & 4 & 2 \\ 0 & 1 & -3 & -1 \\ 0 & 2 & -6 & 1 \end{bmatrix} ~ \begin{bmatrix} 1 & -2 & 4 & 2 \\ 0 & 1 & -3 & -1 \\ 0 & 0 & 0 & 3 \end{bmatrix}
The third row corresponds to the degenerate equation 0x + 0y + 0z = 3, which has no solution. Thus,
"DO NOT CONTINUE." The original system also has no solution. (Note that the echelon form
indicates whether or not the system has a solution.)
(c) Reduce the augmented matrix M to echelon form and then to row canonical form:
M = \begin{bmatrix} 1 & 1 & 3 & 1 \\ 2 & 3 & -1 & 3 \\ 5 & 7 & 1 & 7 \end{bmatrix} ~ \begin{bmatrix} 1 & 1 & 3 & 1 \\ 0 & 1 & -7 & 1 \\ 0 & 2 & -14 & 2 \end{bmatrix} ~ \begin{bmatrix} 1 & 0 & 10 & 0 \\ 0 & 1 & -7 & 1 \end{bmatrix}
(The third row of the second matrix is deleted, because it is a multiple of the second row and will result
in a zero row.) Write down the system corresponding to the row canonical form of M and then transfer
the free variables to the other side to obtain the free-variable form of the solution:
x + 10z = 0,  y - 7z = 1   and   x = -10z,  y = 1 + 7z
Here z is the only free variable. The parametric solution, using z = a, is as follows:
x = -10a,  y = 1 + 7a,  z = a   or   u = (-10a, 1 + 7a, a)
3.23. Solve the following system using its augmented matrix M:
x1 + 2x2 - 3x3 - 2x4 + 4x5 = 1,  2x1 + 5x2 - 8x3 - x4 + 6x5 = 4,  x1 + 4x2 - 7x3 + 5x4 + 2x5 = 8
Reduce the augmented matrix M to echelon form and then to row canonical form:
M = \begin{bmatrix} 1 & 2 & -3 & -2 & 4 & 1 \\ 2 & 5 & -8 & -1 & 6 & 4 \\ 1 & 4 & -7 & 5 & 2 & 8 \end{bmatrix} ~ \begin{bmatrix} 1 & 2 & -3 & -2 & 4 & 1 \\ 0 & 1 & -2 & 3 & -2 & 2 \\ 0 & 2 & -4 & 7 & -2 & 7 \end{bmatrix} ~ \begin{bmatrix} 1 & 2 & -3 & -2 & 4 & 1 \\ 0 & 1 & -2 & 3 & -2 & 2 \\ 0 & 0 & 0 & 1 & 2 & 3 \end{bmatrix}
~ \begin{bmatrix} 1 & 2 & -3 & 0 & 8 & 7 \\ 0 & 1 & -2 & 0 & -8 & -7 \\ 0 & 0 & 0 & 1 & 2 & 3 \end{bmatrix} ~ \begin{bmatrix} 1 & 0 & 1 & 0 & 24 & 21 \\ 0 & 1 & -2 & 0 & -8 & -7 \\ 0 & 0 & 0 & 1 & 2 & 3 \end{bmatrix}
Write down the system corresponding to the row canonical form of M and then transfer the free variables to
the other side to obtain the free-variable form of the solution:
x1 + x3 + 24x5 = 21,  x2 - 2x3 - 8x5 = -7,  x4 + 2x5 = 3   and   x1 = 21 - x3 - 24x5,  x2 = -7 + 2x3 + 8x5,  x4 = 3 - 2x5
Here x1, x2, x4 are the pivot variables and x3 and x5 are the free variables. Recall that the parametric form of
the solution can be obtained from the free-variable form of the solution by simply setting the free variables
equal to parameters, say x3 = a, x5 = b. This process yields
x1 = 21 - a - 24b,  x2 = -7 + 2a + 8b,  x3 = a,  x4 = 3 - 2b,  x5 = b   or   u = (21 - a - 24b, -7 + 2a + 8b, a, 3 - 2b, b)
which is another form of the solution.
Linear Combinations, Homogeneous Systems
3.24. Write v as a linear combination of u1, u2, u3, where
(a) v = (3, 10, 7) and u1 = (1, 3, -2), u2 = (1, 4, 2), u3 = (2, 8, 1);
(b) v = (2, 7, 10) and u1 = (1, 2, 3), u2 = (1, 3, 5), u3 = (1, 5, 9);
(c) v = (1, 5, 4) and u1 = (1, 3, -2), u2 = (2, 7, -1), u3 = (1, 6, 7).
Find the equivalent system of linear equations by writing v = xu1 + yu2 + zu3. Alternatively, use the
augmented matrix M of the equivalent system, where M = [u1, u2, u3, v]. (Here u1, u2, u3, v are the columns
of M.)
(a) The vector equation v = xu1 + yu2 + zu3 for the given vectors is as follows:
\begin{bmatrix} 3 \\ 10 \\ 7 \end{bmatrix} = x \begin{bmatrix} 1 \\ 3 \\ -2 \end{bmatrix} + y \begin{bmatrix} 1 \\ 4 \\ 2 \end{bmatrix} + z \begin{bmatrix} 2 \\ 8 \\ 1 \end{bmatrix} = \begin{bmatrix} x + y + 2z \\ 3x + 4y + 8z \\ -2x + 2y + z \end{bmatrix}
Form the equivalent system of linear equations by setting corresponding entries equal to each other, and
then reduce the system to echelon form:
x + y + 2z = 3,  3x + 4y + 8z = 10,  -2x + 2y + z = 7   or   x + y + 2z = 3,  y + 2z = 1,  4y + 5z = 13   or   x + y + 2z = 3,  y + 2z = 1,  -3z = 9
The system is in triangular form. Back-substitution yields the unique solution x = 2, y = 7, z = -3.
Thus, v = 2u1 + 7u2 - 3u3.
Alternatively, form the augmented matrix M = [u1, u2, u3, v] of the equivalent system, and reduce
M to echelon form:
M = \begin{bmatrix} 1 & 1 & 2 & 3 \\ 3 & 4 & 8 & 10 \\ -2 & 2 & 1 & 7 \end{bmatrix} ~ \begin{bmatrix} 1 & 1 & 2 & 3 \\ 0 & 1 & 2 & 1 \\ 0 & 4 & 5 & 13 \end{bmatrix} ~ \begin{bmatrix} 1 & 1 & 2 & 3 \\ 0 & 1 & 2 & 1 \\ 0 & 0 & -3 & 9 \end{bmatrix}
The last matrix corresponds to a triangular system that has a unique solution. Back-substitution yields
the solution x = 2, y = 7, z = -3. Thus, v = 2u1 + 7u2 - 3u3.
(b) Form the augmented matrix M = [u1, u2, u3, v] of the equivalent system, and reduce M to echelon
form:
M = \begin{bmatrix} 1 & 1 & 1 & 2 \\ 2 & 3 & 5 & 7 \\ 3 & 5 & 9 & 10 \end{bmatrix} ~ \begin{bmatrix} 1 & 1 & 1 & 2 \\ 0 & 1 & 3 & 3 \\ 0 & 2 & 6 & 4 \end{bmatrix} ~ \begin{bmatrix} 1 & 1 & 1 & 2 \\ 0 & 1 & 3 & 3 \\ 0 & 0 & 0 & -2 \end{bmatrix}
The third row corresponds to the degenerate equation 0x + 0y + 0z = -2, which has no solution. Thus,
the system also has no solution, and v cannot be written as a linear combination of u1, u2, u3.
(c) Form the augmented matrix M = [u1, u2, u3, v] of the equivalent system, and reduce M to echelon form:
M = \begin{bmatrix} 1 & 2 & 1 & 1 \\ 3 & 7 & 6 & 5 \\ -2 & -1 & 7 & 4 \end{bmatrix} ~ \begin{bmatrix} 1 & 2 & 1 & 1 \\ 0 & 1 & 3 & 2 \\ 0 & 3 & 9 & 6 \end{bmatrix} ~ \begin{bmatrix} 1 & 2 & 1 & 1 \\ 0 & 1 & 3 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix}
The last matrix corresponds to the following system with free variable z:
x + 2y + z = 1,  y + 3z = 2
Thus, v can be written as a linear combination of u1, u2, u3 in many ways. For example, let the free
variable z = 1; then, by back-substitution, we get y = -1 and x = 2. Thus, v = 2u1 - u2 + u3.
3.25. Let u1 = (1, 2, 4), u2 = (2, -3, 1), u3 = (2, 1, -1) in R^3. Show that u1, u2, u3 are orthogonal, and
write v as a linear combination of u1, u2, u3, where (a) v = (7, 16, 6), (b) v = (3, 5, 2).
Take the dot product of pairs of vectors to get
u1 · u2 = 2 - 6 + 4 = 0,  u1 · u3 = 2 + 2 - 4 = 0,  u2 · u3 = 4 - 3 - 1 = 0
Thus, the three vectors in R^3 are orthogonal, and hence Fourier coefficients can be used. That is,
v = xu1 + yu2 + zu3, where
x = (v · u1)/(u1 · u1),  y = (v · u2)/(u2 · u2),  z = (v · u3)/(u3 · u3)
(a) We have
x = (7 + 32 + 24)/(1 + 4 + 16) = 63/21 = 3,  y = (14 - 48 + 6)/(4 + 9 + 1) = -28/14 = -2,  z = (14 + 16 - 6)/(4 + 1 + 1) = 24/6 = 4
Thus, v = 3u1 - 2u2 + 4u3.
(b) We have
x = (3 + 10 + 8)/(1 + 4 + 16) = 21/21 = 1,  y = (6 - 15 + 2)/(4 + 9 + 1) = -7/14 = -1/2,  z = (6 + 5 - 2)/(4 + 1 + 1) = 9/6 = 3/2
Thus, v = u1 - (1/2)u2 + (3/2)u3.
3.26. Find the dimension and a basis for the general solution W of each of the following homogeneous
systems:
(a) 2x1 + 4x2 - 5x3 + 3x4 = 0,  3x1 + 6x2 - 7x3 + 4x4 = 0,  5x1 + 10x2 - 11x3 + 6x4 = 0
(b) x - 2y - 3z = 0,  2x + y + 3z = 0,  3x - 4y - 2z = 0
(a) Reduce the system to echelon form using the operations "Replace L2 by -3L1 + 2L2," "Replace L3 by
-5L1 + 2L3," and then "Replace L3 by -3L2 + L3." These operations yield
2x1 + 4x2 - 5x3 + 3x4 = 0,  x3 - x4 = 0,  3x3 - 3x4 = 0   and   2x1 + 4x2 - 5x3 + 3x4 = 0,  x3 - x4 = 0
The system in echelon form has two free variables, x2 and x4, so dim W = 2. A basis [u1, u2] for W may
be obtained as follows:
(1) Set x2 = 1, x4 = 0. Back-substitution yields x3 = 0, and then x1 = -2. Thus, u1 = (-2, 1, 0, 0).
(2) Set x2 = 0, x4 = 1. Back-substitution yields x3 = 1, and then x1 = 1. Thus, u2 = (1, 0, 1, 1).
(b) Reduce the system to echelon form, obtaining
x - 2y - 3z = 0,  5y + 9z = 0,  2y + 7z = 0   and   x - 2y - 3z = 0,  5y + 9z = 0,  17z = 0
There are no free variables (the system is in triangular form). Hence, dim W = 0, and W has no basis.
Specifically, W consists only of the zero solution; that is, W = {0}.
3.27. Find the dimension and a basis for the general solution W of the following homogeneous system using matrix notation:

     x1 + 2x2 +  3x3 - 2x4 +  4x5 = 0
    2x1 + 4x2 +  8x3 +  x4 +  9x5 = 0
    3x1 + 6x2 + 13x3 + 4x4 + 14x5 = 0

Show how the basis gives the parametric form of the general solution of the system.

When a system is homogeneous, we represent the system by its coefficient matrix A rather than by its augmented matrix M, because the last column of the augmented matrix M is a zero column, and it will remain a zero column during any row-reduction process.

Reduce the coefficient matrix A to echelon form, obtaining

        [ 1  2   3  -2   4 ]      [ 1  2  3  -2   4 ]
    A = [ 2  4   8   1   9 ]  ~   [ 0  0  2   5   1 ]  ~   [ 1  2  3  -2  4 ]
        [ 3  6  13   4  14 ]      [ 0  0  4  10   2 ]      [ 0  0  2   5  1 ]

(The third row of the second matrix is deleted, because it is a multiple of the second row and will result in a zero row.) We can now proceed in one of two ways.

(a) Write down the corresponding homogeneous system in echelon form:

    x1 + 2x2 + 3x3 - 2x4 + 4x5 = 0
               2x3 + 5x4 +  x5 = 0

The system in echelon form has three free variables, x2, x4, x5, so dim W = 3. A basis [u1, u2, u3] for W may be obtained as follows:
(1) Set x2 = 1, x4 = 0, x5 = 0. Back-substitution yields x3 = 0, and then x1 = -2. Thus, u1 = (-2, 1, 0, 0, 0).
(2) Set x2 = 0, x4 = 1, x5 = 0. Back-substitution yields x3 = -5/2, and then x1 = 19/2. Thus, u2 = (19/2, 0, -5/2, 1, 0).
(3) Set x2 = 0, x4 = 0, x5 = 1. Back-substitution yields x3 = -1/2, and then x1 = -5/2. Thus, u3 = (-5/2, 0, -1/2, 0, 1).
[One could avoid fractions in the basis by choosing x4 = 2 in (2) and x5 = 2 in (3), which yields multiples of u2 and u3.] The parametric form of the general solution is obtained from the following linear combination of the basis vectors using parameters a, b, c:

    au1 + bu2 + cu3 = (-2a + (19/2)b - (5/2)c,  a,  -(5/2)b - (1/2)c,  b,  c)

(b) Reduce the echelon form of A to row canonical form:

    A ~ [ 1  2  3   -2    4  ]  ~  [ 1  2  0  -19/2  5/2 ]
        [ 0  0  1  5/2  1/2 ]      [ 0  0  1    5/2  1/2 ]

Write down the corresponding free-variable solution:

    x1 = -2x2 + (19/2)x4 - (5/2)x5
    x3 = -(5/2)x4 - (1/2)x5

Using these equations for the pivot variables x1 and x3, repeat the above process to obtain a basis [u1, u2, u3] for W. That is, set x2 = 1, x4 = 0, x5 = 0 to get u1; set x2 = 0, x4 = 1, x5 = 0 to get u2; and set x2 = 0, x4 = 0, x5 = 1 to get u3.
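For readers who want to experiment, the following sketch (assuming the SymPy library; not part of the text) reproduces the basis of W found above:

    from sympy import Matrix

    A = Matrix([[1, 2, 3, -2, 4],
                [2, 4, 8,  1, 9],
                [3, 6, 13, 4, 14]])

    # nullspace() returns a basis of the solution space W of AX = 0
    for vec in A.nullspace():
        print(vec.T)

SymPy builds its basis the same way as in the text, setting one free variable equal to 1 and the rest to 0, so the printed vectors agree with u1, u2, u3 above (up to the order of the free variables).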
3.28. Prove Theorem 3.15. Let v0 be a particular solution of AX = B, and let W be the general solution of AX = 0. Then U = v0 + W = {v0 + w : w ∈ W} is the general solution of AX = B.

Let w be a solution of AX = 0. Then
    A(v0 + w) = Av0 + Aw = B + 0 = B
Thus, the sum v0 + w is a solution of AX = B. On the other hand, suppose v is also a solution of AX = B. Then
    A(v - v0) = Av - Av0 = B - B = 0
Therefore, v - v0 belongs to W. Because v = v0 + (v - v0), we find that any solution of AX = B can be obtained by adding a solution of AX = 0 to the particular solution v0. Thus, the theorem is proved.
Elementary Matrices, Applications

3.29. Let e1, e2, e3 denote, respectively, the elementary row operations
    "Interchange rows R1 and R2,"   "Replace R3 by 7R3,"   "Replace R2 by -3R1 + R2"
Find the corresponding three-square elementary matrices E1, E2, E3. Apply each operation to the 3 × 3 identity matrix I3 to obtain

         [ 0  1  0 ]         [ 1  0  0 ]         [  1  0  0 ]
    E1 = [ 1  0  0 ],   E2 = [ 0  1  0 ],   E3 = [ -3  1  0 ]
         [ 0  0  1 ]         [ 0  0  7 ]         [  0  0  1 ]
3.30. Consider the elementary row operations in Problem 3.29.
(a) Describe the inverse operations e1^-1, e2^-1, e3^-1.
(b) Find the corresponding three-square elementary matrices E1', E2', E3'.
(c) What is the relationship between the matrices E1', E2', E3' and the matrices E1, E2, E3?

(a) The inverses of e1, e2, e3 are, respectively,
    "Interchange rows R1 and R2,"   "Replace R3 by (1/7)R3,"   "Replace R2 by 3R1 + R2."
(b) Apply each inverse operation to the 3 × 3 identity matrix I3 to obtain

          [ 0  1  0 ]          [ 1  0   0  ]          [ 1  0  0 ]
    E1' = [ 1  0  0 ],   E2' = [ 0  1   0  ],   E3' = [ 3  1  0 ]
          [ 0  0  1 ]          [ 0  0  1/7 ]          [ 0  0  1 ]

(c) The matrices E1', E2', E3' are, respectively, the inverses of the matrices E1, E2, E3.
3.31. Write each of the following matrices as a product of elementary matrices:

            [ 1  3 ]           [ 1  2  3 ]           [ 1  1   2 ]
    (a) A = [ 2  4 ],  (b) B = [ 0  1  4 ],  (c) C = [ 2  3   8 ]
                               [ 0  0  1 ]           [ 3  1  -2 ]

The following three steps write a matrix M as a product of elementary matrices:
Step 1. Row reduce M to the identity matrix I, keeping track of the elementary row operations.
Step 2. Write down the inverse row operations.
Step 3. Write M as the product of the elementary matrices corresponding to the inverse operations. This gives the desired result.
If a zero row appears in Step 1, then M is not row equivalent to the identity matrix I, and M cannot be written as a product of elementary matrices.

(a) (1) We have

        A = [ 1  3 ]  ~  [ 1   3 ]  ~  [ 1  3 ]  ~  [ 1  0 ]  = I
            [ 2  4 ]     [ 0  -2 ]     [ 0  1 ]     [ 0  1 ]

    where the row operations are, respectively,
        "Replace R2 by -2R1 + R2,"   "Replace R2 by -(1/2)R2,"   "Replace R1 by -3R2 + R1"
    (2) Inverse operations:
        "Replace R2 by 2R1 + R2,"   "Replace R2 by -2R2,"   "Replace R1 by 3R2 + R1"
    (3) A = [ 1  0 ] [ 1   0 ] [ 1  3 ]
            [ 2  1 ] [ 0  -2 ] [ 0  1 ]
(b) (1) We have

        B = [ 1  2  3 ]  ~  [ 1  2  0 ]  ~  [ 1  0  0 ]  = I
            [ 0  1  4 ]     [ 0  1  0 ]     [ 0  1  0 ]
            [ 0  0  1 ]     [ 0  0  1 ]     [ 0  0  1 ]

    where the row operations are, respectively,
        "Replace R2 by -4R3 + R2,"   "Replace R1 by -3R3 + R1,"   "Replace R1 by -2R2 + R1"
    (2) Inverse operations:
        "Replace R2 by 4R3 + R2,"   "Replace R1 by 3R3 + R1,"   "Replace R1 by 2R2 + R1"
    (3) B = [ 1  0  0 ] [ 1  0  3 ] [ 1  2  0 ]
            [ 0  1  4 ] [ 0  1  0 ] [ 0  1  0 ]
            [ 0  0  1 ] [ 0  0  1 ] [ 0  0  1 ]

(c) (1) First row reduce C to echelon form. We have

        C = [ 1  1   2 ]  ~  [ 1   1   2 ]  ~  [ 1  1  2 ]
            [ 2  3   8 ]     [ 0   1   4 ]     [ 0  1  4 ]
            [ 3  1  -2 ]     [ 0  -2  -8 ]     [ 0  0  0 ]

    In echelon form, C has a zero row. "STOP." The matrix C cannot be row reduced to the identity matrix I, and C cannot be written as a product of elementary matrices. (We note, in particular, that C has no inverse.)
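As a numerical check of part (a), one can multiply the three elementary matrices and recover A. The snippet below is a sketch (assuming NumPy), not part of the text:

    import numpy as np

    E1 = np.array([[1, 0], [2,  1]])   # replace R2 by 2R1 + R2
    E2 = np.array([[1, 0], [0, -2]])   # replace R2 by -2R2
    E3 = np.array([[1, 3], [0,  1]])   # replace R1 by 3R2 + R1
    print(E1 @ E2 @ E3)                # [[1 3] [2 4]], which is A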
3.32. Find the inverse of

            [ 1  2  4 ]            [ 1   3  4 ]
    (a) A = [ 1  1  5 ],   (b) B = [ 1   5  1 ]
            [ 2  7  3 ]            [ 3  13  6 ]

(a) Form the matrix M = [A, I] and row reduce M to echelon form:

        [ 1  2  4 | 1  0  0 ]      [ 1   2   4 |  1  0  0 ]      [ 1   2   4 |  1  0  0 ]
    M = [ 1  1  5 | 0  1  0 ]  ~   [ 0  -1   1 | -1  1  0 ]  ~   [ 0  -1   1 | -1  1  0 ]
        [ 2  7  3 | 0  0  1 ]      [ 0   3  -5 | -2  0  1 ]      [ 0   0  -2 | -5  3  1 ]

In echelon form, the left half of M is in triangular form; hence, A has an inverse. Further reduce M to row canonical form:

    M ~ [ 1  2  0 |  -9    6     2  ]      [ 1  0  0 | -16   11     3  ]
        [ 0  1  0 | 7/2  -5/2  -1/2 ]  ~   [ 0  1  0 | 7/2  -5/2  -1/2 ]
        [ 0  0  1 | 5/2  -3/2  -1/2 ]      [ 0  0  1 | 5/2  -3/2  -1/2 ]

The final matrix has the form [I, A^-1]; that is, A^-1 is the right half of the last matrix. Thus,

           [ -16   11     3  ]
    A^-1 = [ 7/2  -5/2  -1/2 ]
           [ 5/2  -3/2  -1/2 ]

(b) Form the matrix M = [B, I] and row reduce M to echelon form:

        [ 1   3  4 | 1  0  0 ]      [ 1  3   4 |  1  0  0 ]      [ 1  3   4 |  1   0  0 ]
    M = [ 1   5  1 | 0  1  0 ]  ~   [ 0  2  -3 | -1  1  0 ]  ~   [ 0  2  -3 | -1   1  0 ]
        [ 3  13  6 | 0  0  1 ]      [ 0  4  -6 | -3  0  1 ]      [ 0  0   0 | -1  -2  1 ]

In echelon form, M has a zero row in its left half; that is, B is not row reducible to triangular form. Accordingly, B has no inverse.
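The inversion algorithm of this problem is easy to code. The following is a minimal Gauss-Jordan sketch (assuming NumPy; the variable names are ours, not the book's): row reduce M = [A, I]; when the left half becomes I, the right half is the inverse.

    import numpy as np

    A = np.array([[1., 2., 4.],
                  [1., 1., 5.],
                  [2., 7., 3.]])
    M = np.hstack([A, np.eye(3)])

    n = 3
    for i in range(n):
        M[i] = M[i] / M[i, i]                  # make the pivot 1 (no pivoting; fine for this A)
        for r in range(n):
            if r != i:
                M[r] = M[r] - M[r, i] * M[i]   # clear the rest of column i

    print(M[:, n:])   # [[-16, 11, 3], [3.5, -2.5, -0.5], [2.5, -1.5, -0.5]], i.e., A^-1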
3.33. Show that every elementary matrix E is invertible, and its inverse is an elementary matrix.

Let E be the elementary matrix corresponding to the elementary operation e; that is, e(I) = E. Let e' be the inverse operation of e and let E' be the corresponding elementary matrix; that is, e'(I) = E'. Then
    I = e'(e(I)) = e'(E) = E'E   and   I = e(e'(I)) = e(E') = EE'
Therefore, E' is the inverse of E.
3.34. Prove Theorem 3.16: Let e be an elementary row operation and let E be the corresponding m-square elementary matrix; that is, E = e(I). Then e(A) = EA, where A is any m × n matrix.

Let Ri be row i of A; we denote this by writing A = [R1, ..., Rm]. If B is a matrix for which AB is defined, then AB = [R1B, ..., RmB]. We also let
    ei = (0, ..., 0, 1, 0, ..., 0)
where the 1 is the ith entry. One can show (Problem 2.45) that eiA = Ri. We also note that I = [e1, e2, ..., em] is the identity matrix.

(i) Let e be the elementary row operation "Interchange rows Ri and Rj." Then, with ej in the ith position and ei in the jth position,
    E = e(I) = [e1, ..., ej, ..., ei, ..., em]   and   e(A) = [R1, ..., Rj, ..., Ri, ..., Rm]
Thus,
    EA = [e1A, ..., ejA, ..., eiA, ..., emA] = [R1, ..., Rj, ..., Ri, ..., Rm] = e(A)

(ii) Let e be the elementary row operation "Replace Ri by kRi (k ≠ 0)." Then, with kei in the ith position,
    E = e(I) = [e1, ..., kei, ..., em]   and   e(A) = [R1, ..., kRi, ..., Rm]
Thus,
    EA = [e1A, ..., keiA, ..., emA] = [R1, ..., kRi, ..., Rm] = e(A)

(iii) Let e be the elementary row operation "Replace Ri by kRj + Ri." Then, with kej + ei in the ith position,
    E = e(I) = [e1, ..., kej + ei, ..., em]   and   e(A) = [R1, ..., kRj + Ri, ..., Rm]
Using (kej + ei)A = k(ejA) + eiA = kRj + Ri, we have
    EA = [e1A, ..., (kej + ei)A, ..., emA] = [R1, ..., kRj + Ri, ..., Rm] = e(A)
3.35. Prove Theorem 3.17: Let A be a square matrix. Then the following are equivalent:
(a) A is invertible (nonsingular).
(b) A is row equivalent to the identity matrix I.
(c) A is a product of elementary matrices.

Suppose A is invertible and suppose A is row equivalent to a matrix B in row canonical form. Then there exist elementary matrices E1, E2, ..., Es such that Es···E2E1A = B. Because A is invertible and each elementary matrix is invertible, B is also invertible. But if B ≠ I, then B has a zero row; whence B is not invertible. Thus, B = I, and (a) implies (b).

If (b) holds, then there exist elementary matrices E1, E2, ..., Es such that Es···E2E1A = I. Hence, A = (Es···E2E1)^-1 = E1^-1 E2^-1 ··· Es^-1. But the Ei^-1 are also elementary matrices. Thus, (b) implies (c).

If (c) holds, then A = E1E2···Es. The Ei are invertible matrices; hence, their product A is also invertible. Thus, (c) implies (a). Accordingly, the theorem is proved.
3.36. Prove Theorem 3.18: If AB = I, then BA = I, and hence B = A^-1.

Suppose A is not invertible. Then A is not row equivalent to the identity matrix I, and so A is row equivalent to a matrix with a zero row. In other words, there exist elementary matrices E1, ..., Es such that Es···E2E1A has a zero row. Hence, Es···E2E1AB = Es···E2E1, an invertible matrix, also has a zero row. But invertible matrices cannot have zero rows. This contradiction shows that A is invertible, with inverse A^-1. Then also,
    B = IB = (A^-1 A)B = A^-1 (AB) = A^-1 I = A^-1
3.37. Prove Theorem 3.19: B is row equivalent to A (written B ~ A) if and only if there exists a nonsingular matrix P such that B = PA.

If B ~ A, then B = es(...(e2(e1(A)))...) = Es···E2E1A = PA, where P = Es···E2E1 is nonsingular. Conversely, suppose B = PA, where P is nonsingular. By Theorem 3.17, P is a product of elementary matrices, and so B can be obtained from A by a sequence of elementary row operations; that is, B ~ A. Thus, the theorem is proved.
3.38. Prove Theorem 3.21: Every m × n matrix A is equivalent to a unique block matrix of the form

    [ Ir  0 ]
    [ 0   0 ]

where Ir is the r × r identity matrix.

The proof is constructive, in the form of an algorithm.
Step 1. Row reduce A to row canonical form, with leading nonzero entries a1j1, a2j2, ..., arjr.
Step 2. Interchange C1 and Cj1, interchange C2 and Cj2, ..., and interchange Cr and Cjr. This gives a matrix in the form

    [ Ir  B ]
    [ 0   0 ]

with leading nonzero entries a11, a22, ..., arr.
Step 3. Use column operations, with the aii as pivots, to replace each entry in B with a zero; that is, for i = 1, 2, ..., r and j = r + 1, r + 2, ..., n, apply the operation -bij·Ci + Cj → Cj.
The final matrix has the desired form

    [ Ir  0 ]
    [ 0   0 ]
LU Factorization

3.39. Find the LU factorization of

            [  1  3  5 ]            [  1   4  -3 ]
    (a) A = [  2  4  7 ],   (b) B = [  2   8   1 ]
            [ -1  2  1 ]            [ -5  -9   7 ]

(a) Reduce A to triangular form by the following operations:
    "Replace R2 by -2R1 + R2,"   "Replace R3 by R1 + R3,"   and then   "Replace R3 by (5/2)R2 + R3"
These operations yield the following, where the triangular form is U:

    A ~ [ 1   3   5 ]  ~  [ 1   3    5  ]  = U     and     L = [  1    0    0 ]
        [ 0  -2  -3 ]     [ 0  -2   -3  ]                      [  2    1    0 ]
        [ 0   5   6 ]     [ 0   0  -3/2 ]                      [ -1  -5/2   1 ]

The entries 2, -1, -5/2 in L are the negatives of the multipliers -2, 1, 5/2 in the above row operations. (As a check, multiply L and U to verify A = LU.)

(b) Reduce B to triangular form by first applying the operations "Replace R2 by -2R1 + R2" and "Replace R3 by 5R1 + R3." These operations yield

    B ~ [ 1   4  -3 ]
        [ 0   0   7 ]
        [ 0  11  -8 ]

Observe that the second diagonal entry is 0. Thus, B cannot be brought into triangular form without row interchange operations. Accordingly, B is not LU-factorable. (There does exist a PLU factorization of such a matrix B, where P is a permutation matrix, but such a factorization lies beyond the scope of this text.)
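The elimination used in this problem translates directly into code. The function below is a sketch (assuming NumPy) of LU factorization without row interchanges; like the hand computation, it fails when a pivot becomes 0, as happens for the matrix B above.

    import numpy as np

    def lu_no_pivot(A):
        U = A.astype(float)
        n = U.shape[0]
        L = np.eye(n)
        for j in range(n - 1):
            if U[j, j] == 0:
                raise ValueError("zero pivot: no LU factorization without row interchanges")
            for i in range(j + 1, n):
                m = U[i, j] / U[j, j]   # multiplier (the negative of the coefficient in the row operation)
                L[i, j] = m             # L records the multipliers
                U[i] = U[i] - m * U[j]
        return L, U

    A = np.array([[1, 3, 5], [2, 4, 7], [-1, 2, 1]])
    L, U = lu_no_pivot(A)
    print(L)    # [[1, 0, 0], [2, 1, 0], [-1, -2.5, 1]]
    print(U)    # [[1, 3, 5], [0, -2, -3], [0, 0, -1.5]]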
3.40. Find the LDU factorization of the matrix A in Problem 3.39.

The A = LDU factorization refers to the situation where L is a lower triangular matrix with 1's on the diagonal (as in the LU factorization of A), D is a diagonal matrix, and U is an upper triangular matrix with 1's on the diagonal. Thus, simply factor out the diagonal entries of the matrix U in the above LU factorization of A to obtain D and the new U. That is,

        [  1    0    0 ]        [ 1   0    0  ]        [ 1  3   5  ]
    L = [  2    1    0 ],   D = [ 0  -2    0  ],   U = [ 0  1  3/2 ]
        [ -1  -5/2   1 ]        [ 0   0  -3/2 ]        [ 0  0   1  ]
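Passing from LU to LDU amounts to factoring the diagonal out of U. A short sketch (assuming NumPy; not from the text):

    import numpy as np

    U_lu  = np.array([[1., 3., 5.], [0., -2., -3.], [0., 0., -1.5]])
    D     = np.diag(np.diag(U_lu))           # the diagonal factor
    U_one = np.linalg.inv(D) @ U_lu          # unit upper triangular factor
    print(np.diag(D))                        # [ 1.  -2.  -1.5]
    print(U_one)                             # rows (1, 3, 5), (0, 1, 1.5), (0, 0, 1)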
3.41. Find the LU factorization of the matrix

        [  1    2   1 ]
    A = [  2    3   3 ]
        [ -3  -10   2 ]

Reduce A to triangular form by the following operations:
    (1) "Replace R2 by -2R1 + R2,"   (2) "Replace R3 by 3R1 + R3,"   (3) "Replace R3 by -4R2 + R3"
These operations yield the following, where the triangular form is U:

    A ~ [ 1   2  1 ]  ~  [ 1   2  1 ]  = U     and     L = [  1  0  0 ]
        [ 0  -1  1 ]     [ 0  -1  1 ]                      [  2  1  0 ]
        [ 0  -4  5 ]     [ 0   0  1 ]                      [ -3  4  1 ]

The entries 2, -3, 4 in L are the negatives of the multipliers -2, 3, -4 in the above row operations. (As a check, multiply L and U to verify A = LU.)
3.42. Let A be the matrix in Problem 3.41. Find X1, X2, X3, where Xi is the solution of AX = Bi for
(a) B1 = (1, 1, 1),  (b) B2 = B1 + X1,  (c) B3 = B2 + X2.

(a) Find L^-1 B1 by applying the row operations (1), (2), and then (3) in Problem 3.41 to B1:

    B1 = [1, 1, 1]^T  --(1) and (2)-->  [1, -1, 4]^T  --(3)-->  [1, -1, 8]^T

Solve UX = B for B = (1, -1, 8) by back-substitution to obtain X1 = (-25, 9, 8).

(b) First find B2 = B1 + X1 = (1, 1, 1) + (-25, 9, 8) = (-24, 10, 9). Then, as above,

    B2 = [-24, 10, 9]^T  --(1) and (2)-->  [-24, 58, -63]^T  --(3)-->  [-24, 58, -295]^T

Solve UX = B for B = (-24, 58, -295) by back-substitution to obtain X2 = (977, -353, -295).

(c) First find B3 = B2 + X2 = (-24, 10, 9) + (977, -353, -295) = (953, -343, -286). Then, as above,

    B3 = [953, -343, -286]^T  --(1) and (2)-->  [953, -2249, 2573]^T  --(3)-->  [953, -2249, 11569]^T

Solve UX = B for B = (953, -2249, 11569) by back-substitution to obtain X3 = (-38252, 13818, 11569).
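The point of this problem, reusing one factorization for several right-hand sides, can be reproduced with a standard triangular solver. The loop below is a sketch (assuming NumPy and SciPy are available; solve_triangular is SciPy's routine, not the book's):

    import numpy as np
    from scipy.linalg import solve_triangular

    L = np.array([[1., 0., 0.], [2., 1., 0.], [-3., 4., 1.]])
    U = np.array([[1., 2., 1.], [0., -1., 1.], [0., 0., 1.]])

    B = np.array([1., 1., 1.])
    for _ in range(3):
        Y = solve_triangular(L, B, lower=True)    # forward substitution: LY = B
        X = solve_triangular(U, Y, lower=False)   # back substitution:    UX = Y
        print(X)                                  # first pass prints [-25.  9.  8.]
        B = B + X                                 # next right-hand side, as in the problem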
Miscellaneous Problems

3.43. Let L be a linear combination of the m equations in n unknowns in the system (3.2). Say L is the equation
    (c1a11 + ··· + cmam1)x1 + ··· + (c1a1n + ··· + cmamn)xn = c1b1 + ··· + cmbm     (1)
Show that any solution of the system (3.2) is also a solution of L.

Let u = (k1, ..., kn) be a solution of (3.2). Then
    ai1k1 + ai2k2 + ··· + ainkn = bi    (i = 1, 2, ..., m)     (2)
Substituting u in the left-hand side of (1) and using (2), we get
    (c1a11 + ··· + cmam1)k1 + ··· + (c1a1n + ··· + cmamn)kn
        = c1(a11k1 + ··· + a1nkn) + ··· + cm(am1k1 + ··· + amnkn)
        = c1b1 + ··· + cmbm
This is the right-hand side of (1); hence, u is a solution of (1).
3.44. Suppose a system m of linear equations is obtained from a system l by applying an elementary
operation (page 64). Show that m and l have the same solutions.
Each equation L in m is a linear combination of equations in l. Hence, by Problem 3.43, any solution
of l will also be a solution of m. On the other hand, each elementary operation has an inverse elementary
operation, so l can be obtained from m by an elementary operation. This means that any solution of m is a
solution of l. Thus, l and m have the same solutions.
3.45. Prove Theorem 3.4: Suppose a system m of linear equations is obtained from a system l by a
sequence of elementary operations. Then m and l have the same solutions.
Each step of the sequence does not change the solution set (Problem 3.44). Thus, the original system l
and the final system m (and any system in between) have the same solutions.
3.46. A system l of linear equations is said to be consistent if no linear combination of its equations is
a degenerate equation L with a nonzero constant. Show that l is consistent if and only if l is
reducible to echelon form.
Suppose l is reducible to echelon form. Then l has a solution, which must also be a solution of every
linear combination of its equations. Thus, L, which has no solution, cannot be a linear combination of the
equations in l. Thus, l is consistent.
On the other hand, suppose l is not reducible to echelon form. Then, in the reduction process, it must
yield a degenerate equation L with a nonzero constant, which is a linear combination of the equations in l.
Therefore, l is not consistent; that is, l is inconsistent.
3.47. Suppose u and v are distinct vectors. Show that, for distinct scalars k, the vectors u + k(u - v) are distinct.

Suppose u + k1(u - v) = u + k2(u - v). We need only show that k1 = k2. We have
    k1(u - v) = k2(u - v),   and so   (k1 - k2)(u - v) = 0
Because u and v are distinct, u - v ≠ 0. Hence, k1 - k2 = 0, and so k1 = k2.
3.48. Suppose AB is defined. Prove
(a) Suppose A has a zero row. Then AB has a zero row.
(b) Suppose B has a zero column. Then AB has a zero column.

(a) Let Ri be the zero row of A, and C1, ..., Cn the columns of B. Then the ith row of AB is
    (RiC1, RiC2, ..., RiCn) = (0, 0, ..., 0)
(b) B^T has a zero row, and so B^T A^T = (AB)^T has a zero row. Hence, AB has a zero column.
SUPPLEMENTARY PROBLEMS
Linear Equations, 2 × 2 Systems

3.49. Determine whether each of the following systems is linear:
    (a) 3x - 4y + 2yz = 8,   (b) ex + 3y = π,   (c) 2x - 3y + kz = 4

3.50. Solve:  (a) πx = 2,   (b) 3x + 2 = 5x + 7 - 2x,   (c) 6x + 2 - 4x = 5 + 2x - 3

3.51. Solve each of the following systems:
    (a) 2x + 3y = 1      (b)  4x - 2y = 5      (c) 2x - 4 = 3y      (d) 2x - 4y = 10
        5x + 7y = 3          -6x + 3y = 1          5y - x = 5           3x - 6y = 15

3.52. Consider each of the following systems in unknowns x and y:
    (a) x - ay = 1       (b)  ax + 3y = 2      (c) x + ay = 3
        ax - 4y = b          12x + ay = b          2x + 5y = b
For which values of a does each system have a unique solution, and for which pairs of values (a, b) does each system have more than one solution?
General Systems of Linear Equations
3.53. Solve
(a) x þ y þ 2z ¼ 4
2x þ 3y þ 6z ¼ 10
3x þ 6y þ 10z ¼ 17
(b) x  2y þ 3z ¼ 2
2x  3y þ 8z ¼ 7
3x  4y þ 13z ¼ 8
(c) x þ 2y þ 3z ¼ 3
2x þ 3y þ 8z ¼ 4
5x þ 8y þ 19z ¼ 11
3.54. Solve
(a) x  2y ¼ 5
2x þ 3y ¼ 3
3x þ 2y ¼ 7
(b) x þ 2y  3z þ 2t ¼ 2
2x þ 5y  8z þ 6t ¼ 5
3x þ 4y  5z þ 2t ¼ 4
(c) x þ 2y þ 4z  5t ¼ 3
3x  y þ 5z þ 2t ¼ 4
5x  4y þ 6z þ 9t ¼ 2
3.55. Solve
(a) 2x  y  4z ¼ 2
4x  2y  6z ¼ 5
6x  3y  8z ¼ 8
(b) x þ 2y  z þ 3t ¼ 3
2x þ 4y þ 4z þ 3t ¼ 9
3x þ 6y  z þ 8t ¼ 10
3.56. Consider each of the following systems in unknowns x; y; z:
(a) x  2y ¼ 1
x  y þ az ¼ 2
ay þ 9z ¼ b
(b) x þ 2y þ 2z ¼ 1
x þ ay þ 3z ¼ 3
x þ 11y þ az ¼ b
(c) x þ y þ az ¼ 1
x þ ay þ z ¼ 4
ax þ y þ z ¼ b
For which values of a does the system have a unique solution, and for which pairs of values ða; bÞ does the
system have more than one solution? The value of b does not have any effect on whether the system has a
unique solution. Why?
Linear Combinations, Homogeneous Systems
3.57. Write v as a linear combination of u1; u2; u3, where
(a) v ¼ ð4; 9; 2Þ, u1 ¼ ð1; 2; 1Þ, u2 ¼ ð1; 4; 2Þ, u3 ¼ ð1; 3; 2Þ;
(b) v ¼ ð1; 3; 2Þ, u1 ¼ ð1; 2; 1Þ, u2 ¼ ð2; 6; 5Þ, u3 ¼ ð1; 7; 8Þ;
(c) v ¼ ð1; 4; 6Þ, u1 ¼ ð1; 1; 2Þ, u2 ¼ ð2; 3; 5Þ, u3 ¼ ð3; 5; 8Þ.
3.58. Let u1 = (1, 1, 2), u2 = (1, 3, -2), u3 = (4, -2, -1) in R^3. Show that u1, u2, u3 are orthogonal, and write v as a linear combination of u1, u2, u3, where (a) v = (5, -5, 9), (b) v = (1, -3, 3), (c) v = (1, 1, 1). (Hint: Use Fourier coefficients.)
3.59. Find the dimension and a basis of the general solution W of each of the following homogeneous systems:
(a) x  y þ 2z ¼ 0
2x þ y þ z ¼ 0
5x þ y þ 4z ¼ 0
(b) x þ 2y  3z ¼ 0
2x þ 5y þ 2z ¼ 0
3x  y  4z ¼ 0
(c) x þ 2y þ 3z þ t ¼ 0
2x þ 4y þ 7z þ 4t ¼ 0
3x þ 6y þ 10z þ 5t ¼ 0
3.60. Find the dimension and a basis of the general solution W of each of the following systems:
(a) x1 þ 3x2 þ 2x3  x4  x5 ¼ 0
2x1 þ 6x2 þ 5x3 þ x4  x5 ¼ 0
5x1 þ 15x2 þ 12x3 þ x4  3x5 ¼ 0
(b) 2x1  4x2 þ 3x3  x4 þ 2x5 ¼ 0
3x1  6x2 þ 5x3  2x4 þ 4x5 ¼ 0
5x1  10x2 þ 7x3  3x4 þ 18x5 ¼ 0
Echelon Matrices, Row Canonical Form
3.61. Reduce each of the following matrices to echelon form and then to row canonical form:
(a)
1 1 2
2 4 9
1 5 12
2
4
3
5; (b)
1 2 1 2 1
2 4 1 2 5
3 6 3 7 7
2
4
3
5; (c)
2 4 2 2 5 1
3 6 2 2 0 4
4 8 2 6 5 7
2
4
3
5
3.62. Reduce each of the following matrices to echelon form and then to row canonical form:
(a)
1 2 1 2 1 2
2 4 3 5 5 7
3 6 4 9 10 11
1 2 4 3 6 9
2
6
6
4
3
7
7
5; (b)
0 1 2 3
0 3 8 12
0 0 4 6
0 2 7 10
2
6
6
4
3
7
7
5; (c)
1 3 1 3
2 8 5 10
1 7 7 11
3 11 7 15
2
6
6
4
3
7
7
5
3.63. Using only 0’s and 1’s, list all possible 2 2 matrices in row canonical form.
3.64. Using only 0’s and 1’s, find the number n of possible 3 3 matrices in row canonical form.
Elementary Matrices, Applications
3.65. Let e1; e2; e3 denote, respectively, the following elementary row operations:
‘‘Interchange R2 and R3; ’’ ‘‘Replace R2 by 3R2; ’’ ‘‘Replace R1 by 2R3 þ R1’’
(a) Find the corresponding elementary matrices E1; E2; E3.
(b) Find the inverse operations e1
1 , e1
2 , e1
3 ; their corresponding elementary matrices E0
1, E0
2, E0
3; and the
relationship between them and E1; E2; E3.
(c) Describe the corresponding elementary column operations f1; f2; f3.
(d) Find elementary matrices F1; F2; F3 corresponding to f1; f2; f3, and the relationship between them and
E1; E2; E3.
3.66. Express each of the following matrices as a product of elementary matrices:
A ¼
1 2
3 4
 
; B ¼
3 6
2 4
 
; C ¼
2 6
3 7
 
; D ¼
1 2 0
0 1 3
3 8 7
2
4
3
5
3.67. Find the inverse of each of the following matrices (if it exists):
A ¼
1 2 1
2 3 1
3 4 4
2
4
3
5; B ¼
1 2 3
2 6 1
3 10 1
2
4
3
5; C ¼
1 3 2
2 8 3
1 7 1
2
4
3
5; D ¼
2 1 1
5 2 3
0 2 1
2
4
3
5
3.68. Find the inverse of each of the following n n matrices:
(a) A has 1’s on the diagonal and superdiagonal (entries directly above the diagonal) and 0’s elsewhere.
(b) B has 1’s on and above the diagonal, and 0’s below the diagonal.
LU Factorization
3.69. Find the LU factorization of each of the following matrices:
(a)
1 1 1
3 4 2
2 3 2
2
4
3
5, (b)
1 3 1
2 5 1
3 4 2
2
4
3
5, (c)
2 3 6
4 7 9
3 5 4
2
4
3
5, (d)
1 2 3
2 4 7
3 7 10
2
4
3
5
3.70. Let A be the matrix in Problem 3.69(a). Find X1; X2; X3; X4, where
(a) X1 is the solution of AX ¼ B1, where B1 ¼ ð1; 1; 1ÞT
.
(b) For k  1, Xk is the solution of AX ¼ Bk, where Bk ¼ Bk1 þ Xk1.
3.71. Let B be the matrix in Problem 3.69(b). Find the LDU factorization of B.
Miscellaneous Problems
3.72. Consider the following systems in unknowns x and y:
ðaÞ
ax þ by ¼ 1
cx þ dy ¼ 0
ðbÞ
ax þ by ¼ 0
cx þ dy ¼ 1
Suppose D ¼ ad  bc 6¼ 0. Show that each system has the unique solution:
(a) x ¼ d=D, y ¼ c=D, (b) x ¼ b=D, y ¼ a=D.
3.73. Find the inverse of the row operation ‘‘Replace Ri by kRj þ k0
Ri ðk0
6¼ 0Þ.’’
3.74. Prove that deleting the last column of an echelon form (respectively, the row canonical form) of an
augmented matrix M ¼ ½A; B yields an echelon form (respectively, the row canonical form) of A.
3.75. Let e be an elementary row operation and E its elementary matrix, and let f be the corresponding elementary
column operation and F its elementary matrix. Prove
(a) f ðAÞ ¼ ðeðAT
ÞÞT
, (b) F ¼ ET
, (c) f ðAÞ ¼ AF.
3.76. Matrix A is equivalent to matrix B, written A B, if there exist nonsingular matrices P and Q such that
B ¼ PAQ. Prove that is an equivalence relation; that is,
(a) A A, (b) If A B, then B A, (c) If A B and B C, then A C.
ANSWERS TO SUPPLEMENTARY PROBLEMS
Notation: A ¼ ½R1; R2; . . . denotes the matrix A with rows R1; R2; . . . . The elements in each row are separated
by commas (which may be omitted with single digits), the rows are separated by semicolons, and 0 denotes a zero
row. For example,
    A = [1, 2, 3, 4;  5, 6, 7, 8;  0]  =  [ 1  2  3  4 ]
                                          [ 5  6  7  8 ]
                                          [ 0  0  0  0 ]
3.49. (a) no, (b) yes, (c) linear in x; y; z, not linear in x; y; z; k
3.50. (a) x ¼ 2=p, (b) no solution, (c) every scalar k is a solution
3.51. (a) ð2; 1Þ, (b) no solution, (c) ð5; 2Þ, (d) ð5  2a; aÞ
3.52. (a) a ≠ ±2; (2, 2), (-2, -2),  (b) a ≠ ±6; (6, 4), (-6, -4),  (c) a ≠ 5/2; (5/2, 6)
3.53. (a) ð2; 1; 1
2Þ, (b) no solution, (c) u ¼ ð7a  1; 2a þ 2; aÞ.
3.54. (a) ð3; 1Þ, (b) u ¼ ða þ 2b; 1 þ 2a  2b; a; bÞ, (c) no solution
3.55. (a) u ¼ ð1
2 a þ 2; a; 1
2Þ, (b) u ¼ ð1
2 ð7  5b  4aÞ; a; 1
2 ð1 þ bÞ; bÞ
3.56. (a) a 6¼ 3; ð3; 3Þ; ð3; 3Þ, (b) a 6¼ 5 and a 6¼ 1; ð5; 7Þ; ð1; 5Þ,
(c) a 6¼ 1 and a 6¼ 2; ð2; 5Þ
3.57. (a) 2; 1; 3, (b) 6; 3; 1, (c) not possible
3.58. (a) 3, -2, 1,  (b) 2/3, -1, 1/3,  (c) 2/3, 1/7, 1/21
3.59. (a) dim W ¼ 1; u1 ¼ ð1; 1; 1Þ, (b) dim W ¼ 0, no basis,
(c) dim W ¼ 2; u1 ¼ ð2; 1; 0; 0Þ; u2 ¼ ð5; 0; 2; 1Þ
3.60. (a) dim W ¼ 3; u1 ¼ ð3; 1; 0; 0; 0Þ, u2 ¼ ð7; 0; 3; 1; 0Þ, u3 ¼ ð3; 0; 1; 0; 1Þ,
(b) dim W ¼ 2, u1 ¼ ð2; 1; 0; 0; 0Þ, u2 ¼ ð5; 0; 5; 3; 1Þ
3.61. (a) ½1; 0;  1
2 ; 0; 1; 5
2 ; 0, (b) ½1; 2; 0; 0; 2; 0; 0; 1; 0; 5; 0; 0; 0; 1; 2,
(c) ½1; 2; 0; 4; 5; 3; 0; 0; 1; 5; 15
2 ;  5
2 ; 0
3.62. (a) ½1; 2; 0; 0; 4; 2; 0; 0; 1; 0; 1; 2; 0; 0; 0; 1; 2; 1; 0,
(b) ½0; 1; 0; 0; 0; 0; 1; 0; 0; 0; 0; 1; 0, (c) ½1; 0; 0; 4; 0; 1; 0; 1; 0; 0; 1; 2; 0
3.63. 5: ½1; 0; 0; 1, ½1; 1; 0; 0, ½1; 0; 0; 0, ½0; 1; 0; 0; 0
3.64. 16
3.65. (a) ½1; 0; 0; 0; 0; 1; 0; 1; 0, ½1; 0; 0; 0; 3; 0; 0; 0; 1, ½1; 0; 2; 0; 1; 0; 0; 0; 1,
(b) R2 $ R3; 1
3 R2 ! R2; 2R3 þ R1 ! R1; each E0
i ¼ E1
i ,
(c) C2 $ C3; 3C2 ! C2; 2C3 þ C1 ! C1, (d) each Fi ¼ ET
i .
3.66. A ¼ ½1; 0; 3; 1½1; 0; 0; 2½1; 2; 0; 1, B is not invertible,
C ¼ ½1; 0;  3
2 ; 1½1; 0; 0; 2½1; 6; 0; 1½2; 0; 0; 1,
D ¼ ½100; 010; 301½100; 010; 021½100; 013; 001½120; 010; 001
3.67. A1
¼ ½8; 12; 5; 5; 7; 3; 1; 2; 1, B has no inverse,
C1
¼ ½29
2 ;  17
2 ; 7
2 ;  5
2 ; 3
2 ;  1
2 ; 3; 2; 1; D1
¼ ½8; 3; 1; 5; 2; 1; 10; 4; 1
3.68. A1
¼ ½1; 1; 1; 1; . . . ; 0; 1; 1; 1; 1; . . . ; 0; 0; 1; 1; 1; 1; 1; . . . ; . . . ; . . . ; 0; . . . 0; 1
B1
has 1’s on diagonal, 1’s on superdiagonal, and 0’s elsewhere.
3.69. (a) ½100; 310; 211½1; 1; 1; 0; 1; 1; 0; 0; 1,
(b) ½100; 210; 351½1; 3; 1; 0; 1; 3; 0; 0; 10,
(c) ½100; 210; 3
2 ; 1
2 ; 1½2; 3; 6; 0; 1; 3; 0; 0;  7
2,
(d) There is no LU decomposition.
3.70. X1 ¼ ½1; 1; 1T
; B2 ¼ ½2; 2; 0T
, X2 ¼ ½6; 4; 0T
, B3 ¼ ½8; 6; 0T
, X3 ¼ ½22; 16; 2T
,
B4 ¼ ½30; 22; 2T
, X4 ¼ ½86; 62; 6T
3.71. B ¼ ½100; 210; 351 diagð1; 1; 10Þ ½1; 3; 1; 0; 1; 3; 0; 0; 1
3.73. Replace Ri by kRj þ ð1=k0
ÞRi.
3.75. (c) f ðAÞ ¼ ðeðAT
ÞÞT
¼ ðEAT
ÞT
¼ ðAT
ÞT
ET
¼ AF
3.76. (a) A ¼ IAI: (b) If A ¼ PBQ, then B ¼ P1
AQ1
.
(c) If A ¼ PBQ and B ¼ P0
CQ0
, then A ¼ ðPP0
ÞCðQ 0
QÞ.
CHAPTER 4  Vector Spaces
4.1 Introduction
This chapter introduces the underlying structure of linear algebra, that of a finite-dimensional vector
space. The definition of a vector space V, whose elements are called vectors, involves an arbitrary field K,
whose elements are called scalars. The following notation will be used (unless otherwise stated or
implied):
V the given vector space
u; v; w vectors in V
K the given number field
a; b; c; or k scalars in K
Almost nothing essential is lost if the reader assumes that K is the real field R or the complex field C.
The reader might suspect that the real line R has ‘‘dimension’’ one, the cartesian plane R2
has
‘‘dimension’’ two, and the space R3
has ‘‘dimension’’ three. This chapter formalizes the notion of
‘‘dimension,’’ and this definition will agree with the reader’s intuition.
Throughout this text, we will use the following set notation:
    a ∈ A       Element a belongs to set A
    a, b ∈ A    Elements a and b belong to A
    ∀x ∈ A      For every x in A
    ∃x ∈ A      There exists an x in A
    A ⊆ B       A is a subset of B
    A ∩ B       Intersection of A and B
    A ∪ B       Union of A and B
    ∅           Empty set
4.2 Vector Spaces
The following defines the notion of a vector space V where K is the field of scalars.
DEFINITION: Let V be a nonempty set with two operations:
    (i) Vector Addition: This assigns to any u, v ∈ V a sum u + v in V.
    (ii) Scalar Multiplication: This assigns to any u ∈ V, k ∈ K a product ku ∈ V.
Then V is called a vector space (over the field K) if the following axioms hold for any vectors u, v, w ∈ V:
    [A1] (u + v) + w = u + (v + w)
    [A2] There is a vector in V, denoted by 0 and called the zero vector, such that, for any u ∈ V, u + 0 = 0 + u = u.
    [A3] For each u ∈ V, there is a vector in V, denoted by -u and called the negative of u, such that u + (-u) = (-u) + u = 0.
    [A4] u + v = v + u.
    [M1] k(u + v) = ku + kv, for any scalar k ∈ K.
    [M2] (a + b)u = au + bu, for any scalars a, b ∈ K.
    [M3] (ab)u = a(bu), for any scalars a, b ∈ K.
    [M4] 1u = u, for the unit scalar 1 ∈ K.
The above axioms naturally split into two sets (as indicated by the labeling of the axioms). The first
four are concerned only with the additive structure of V and can be summarized by saying V is a
commutative group under addition. This means
(a) Any sum v1 þ v2 þ    þ vm of vectors requires no parentheses and does not depend on the order of
the summands.
(b) The zero vector 0 is unique, and the negative u of a vector u is unique.
(c) (Cancellation Law) If u þ w ¼ v þ w, then u ¼ v.
Also, subtraction in V is defined by u - v = u + (-v), where -v is the unique negative of v.
On the other hand, the remaining four axioms are concerned with the ‘‘action’’ of the field K of scalars
on the vector space V. Using these additional axioms, we prove (Problem 4.2) the following simple
properties of a vector space.
THEOREM 4.1: Let V be a vector space over a field K.
    (i) For any scalar k ∈ K and 0 ∈ V, k0 = 0.
    (ii) For 0 ∈ K and any vector u ∈ V, 0u = 0.
    (iii) If ku = 0, where k ∈ K and u ∈ V, then k = 0 or u = 0.
    (iv) For any k ∈ K and any u ∈ V, (-k)u = k(-u) = -ku.
4.3 Examples of Vector Spaces
This section lists important examples of vector spaces that will be used throughout the text.
Space K^n
Let K be an arbitrary field. The notation K^n is frequently used to denote the set of all n-tuples of elements in K. Here K^n is a vector space over K using the following operations:
    (i) Vector Addition: (a1, a2, ..., an) + (b1, b2, ..., bn) = (a1 + b1, a2 + b2, ..., an + bn)
    (ii) Scalar Multiplication: k(a1, a2, ..., an) = (ka1, ka2, ..., kan)
The zero vector in K^n is the n-tuple of zeros, 0 = (0, 0, ..., 0), and the negative of a vector is defined by
    -(a1, a2, ..., an) = (-a1, -a2, ..., -an)
Observe that these are the same as the operations defined for R^n in Chapter 1. The proof that K^n is a vector space is identical to the proof of Theorem 1.1, which we now regard as stating that R^n with the operations defined there is a vector space over R.
Polynomial Space P(t)
Let P(t) denote the set of all polynomials of the form
    p(t) = a0 + a1·t + a2·t^2 + ··· + as·t^s    (s = 1, 2, ...)
where the coefficients ai belong to a field K. Then P(t) is a vector space over K using the following operations:
    (i) Vector Addition: Here p(t) + q(t) in P(t) is the usual operation of addition of polynomials.
    (ii) Scalar Multiplication: Here k·p(t) in P(t) is the usual operation of the product of a scalar k and a polynomial p(t).
The zero polynomial 0 is the zero vector in P(t).

Polynomial Space Pn(t)
Let Pn(t) denote the set of all polynomials p(t) over a field K, where the degree of p(t) is less than or equal to n; that is,
    p(t) = a0 + a1·t + a2·t^2 + ··· + as·t^s
where s ≤ n. Then Pn(t) is a vector space over K with respect to the usual operations of addition of polynomials and of multiplication of a polynomial by a constant (just like the vector space P(t) above). We include the zero polynomial 0 as an element of Pn(t), even though its degree is undefined.
Matrix Space Mm,n
The notation Mm,n, or simply M, will be used to denote the set of all m × n matrices with entries in a field K. Then Mm,n is a vector space over K with respect to the usual operations of matrix addition and scalar multiplication of matrices, as indicated by Theorem 2.1.
Function Space F(X)
Let X be a nonempty set and let K be an arbitrary field. Let F(X) denote the set of all functions of X into K. [Note that F(X) is nonempty, because X is nonempty.] Then F(X) is a vector space over K with respect to the following operations:
    (i) Vector Addition: The sum of two functions f and g in F(X) is the function f + g in F(X) defined by (f + g)(x) = f(x) + g(x) for all x ∈ X.
    (ii) Scalar Multiplication: The product of a scalar k ∈ K and a function f in F(X) is the function kf in F(X) defined by (kf)(x) = k·f(x) for all x ∈ X.
The zero vector in F(X) is the zero function 0, which maps every x ∈ X into the zero element 0 ∈ K; that is, 0(x) = 0 for all x ∈ X. Also, for any function f in F(X), the negative of f is the function -f in F(X) defined by (-f)(x) = -f(x) for all x ∈ X.
Fields and Subfields
Suppose a field E is an extension of a field K; that is, suppose E is a field that contains K as a subfield.
Then E may be viewed as a vector space over K using the following operations:
(i) Vector Addition: Here u þ v in E is the usual addition in E.
(ii) Scalar Multiplication: Here ku in E, where k 2 K and u 2 E, is the usual product of k and u as
elements of E.
That is, the eight axioms of a vector space are satisfied by E and its subfield K with respect to the above
two operations.
4.4 Linear Combinations, Spanning Sets
Let V be a vector space over a field K. A vector v in V is a linear combination of vectors u1, u2, ..., um in V if there exist scalars a1, a2, ..., am in K such that
    v = a1u1 + a2u2 + ··· + amum
Alternatively, v is a linear combination of u1, u2, ..., um if there is a solution to the vector equation
    v = x1u1 + x2u2 + ··· + xmum
where x1, x2, ..., xm are unknown scalars.
EXAMPLE 4.1 (Linear Combinations in R^n)  Suppose we want to express v = (3, 7, -4) in R^3 as a linear combination of the vectors
    u1 = (1, 2, 3),   u2 = (2, 3, 7),   u3 = (3, 5, 6)
We seek scalars x, y, z such that v = xu1 + yu2 + zu3; that is,

    [ 3 ]       [ 1 ]       [ 2 ]       [ 3 ]             x + 2y + 3z = 3
    [ 7 ]  =  x [ 2 ]  +  y [ 3 ]  +  z [ 5 ]     or     2x + 3y + 5z = 7
    [-4 ]       [ 3 ]       [ 7 ]       [ 6 ]             3x + 7y + 6z = -4

(For notational convenience, we have written the vectors in R^3 as columns, because it is then easier to find the equivalent system of linear equations.) Reducing the system to echelon form yields

    x + 2y + 3z = 3                      x + 2y + 3z = 3
       -y -  z  = 1       and then          -y -  z  = 1
        y - 3z  = -13                           -4z  = -12

Back-substitution yields the solution x = 2, y = -4, z = 3. Thus, v = 2u1 - 4u2 + 3u3.

Remark: Generally speaking, the question of expressing a given vector v in K^n as a linear combination of vectors u1, u2, ..., um in K^n is equivalent to solving a system AX = B of linear equations, where v is the column B of constants, and the u's are the columns of the coefficient matrix A. Such a system may have a unique solution (as above), many solutions, or no solution. The last case (no solution) means that v cannot be written as a linear combination of the u's.
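A numerical version of Example 4.1 (a sketch, assuming NumPy; not part of the text) puts the u's into the columns of a matrix and solves, exactly as the Remark describes:

    import numpy as np

    A = np.array([[1, 2, 3],
                  [2, 3, 5],
                  [3, 7, 6]], dtype=float)   # columns are u1, u2, u3
    v = np.array([3, 7, -4], dtype=float)
    print(np.linalg.solve(A, v))             # [ 2. -4.  3.], so v = 2u1 - 4u2 + 3u3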
EXAMPLE 4.2 (Linear Combinations in P(t))  Suppose we want to express the polynomial v = 3t^2 + 5t - 5 as a linear combination of the polynomials
    p1 = t^2 + 2t + 1,   p2 = 2t^2 + 5t + 4,   p3 = t^2 + 3t + 6
We seek scalars x, y, z such that v = xp1 + yp2 + zp3; that is,
    3t^2 + 5t - 5 = x(t^2 + 2t + 1) + y(2t^2 + 5t + 4) + z(t^2 + 3t + 6)     (*)
There are two ways to proceed from here.

(1) Expand the right-hand side of (*), obtaining
    3t^2 + 5t - 5 = xt^2 + 2xt + x + 2yt^2 + 5yt + 4y + zt^2 + 3zt + 6z
                  = (x + 2y + z)t^2 + (2x + 5y + 3z)t + (x + 4y + 6z)
Set coefficients of the same powers of t equal to each other, and reduce the system to echelon form:

     x + 2y +  z = 3          x + 2y +  z = 3           x + 2y + z = 3
    2x + 5y + 3z = 5    or        y +  z = -1    or         y + z = -1
     x + 4y + 6z = -5            2y + 5z = -8                  3z = -6

The system is in triangular form and has a solution. Back-substitution yields the solution x = 3, y = 1, z = -2. Thus,
    v = 3p1 + p2 - 2p3

(2) The equation (*) is actually an identity in the variable t; that is, the equation holds for any value of t. We can obtain three equations in the unknowns x, y, z by setting t equal to any three values. For example,
    Set t = 0  in (*) to obtain:  x + 4y + 6z = -5
    Set t = 1  in (*) to obtain:  4x + 11y + 10z = 3
    Set t = -1 in (*) to obtain:  y + 4z = -7
Reducing this system to echelon form and solving by back-substitution again yields the solution x = 3, y = 1, z = -2. Thus (again), v = 3p1 + p2 - 2p3.
Spanning Sets
Let V be a vector space over K. Vectors u1, u2, ..., um in V are said to span V or to form a spanning set of V if every v in V is a linear combination of the vectors u1, u2, ..., um; that is, if there exist scalars a1, a2, ..., am in K such that
    v = a1u1 + a2u2 + ··· + amum
The following remarks follow directly from the definition.
Remark 1: Suppose u1; u2; . . . ; um span V. Then, for any vector w, the set w; u1; u2; . . . ; um also
spans V.
Remark 2: Suppose u1; u2; . . . ; um span V and suppose uk is a linear combination of some of the
other u’s. Then the u’s without uk also span V.
Remark 3: Suppose u1; u2; . . . ; um span V and suppose one of the u’s is the zero vector. Then the
u’s without the zero vector also span V.
EXAMPLE 4.3  Consider the vector space V = R^3.
(a) We claim that the following vectors form a spanning set of R^3:
        e1 = (1, 0, 0),   e2 = (0, 1, 0),   e3 = (0, 0, 1)
    Specifically, if v = (a, b, c) is any vector in R^3, then v = ae1 + be2 + ce3.
    For example, v = (5, -6, 2) = 5e1 - 6e2 + 2e3.
(b) We claim that the following vectors also form a spanning set of R^3:
        w1 = (1, 1, 1),   w2 = (1, 1, 0),   w3 = (1, 0, 0)
    Specifically, if v = (a, b, c) is any vector in R^3, then (Problem 4.62)
        v = (a, b, c) = c·w1 + (b - c)·w2 + (a - b)·w3
    For example, v = (5, -6, 2) = 2w1 - 8w2 + 11w3.
(c) One can show (Problem 3.24) that v = (2, 7, 8) cannot be written as a linear combination of the vectors
        u1 = (1, 2, 3),   u2 = (1, 3, 5),   u3 = (1, 5, 9)
    Accordingly, u1, u2, u3 do not span R^3.
EXAMPLE 4.4  Consider the vector space V = Pn(t) consisting of all polynomials of degree ≤ n.
(a) Clearly every polynomial in Pn(t) can be expressed as a linear combination of the n + 1 polynomials
        1, t, t^2, t^3, ..., t^n
    Thus, these powers of t (where 1 = t^0) form a spanning set for Pn(t).
(b) One can also show that, for any scalar c, the following n + 1 powers of t - c,
        1, t - c, (t - c)^2, (t - c)^3, ..., (t - c)^n
    (where (t - c)^0 = 1), also form a spanning set for Pn(t).
EXAMPLE 4.5  Consider the vector space M = M2,2 consisting of all 2 × 2 matrices, and consider the following four matrices in M:

    E11 = [ 1  0 ]    E12 = [ 0  1 ]    E21 = [ 0  0 ]    E22 = [ 0  0 ]
          [ 0  0 ],         [ 0  0 ],         [ 1  0 ],         [ 0  1 ]

Then clearly any matrix A in M can be written as a linear combination of the four matrices. For example,

    A = [ 5  -6 ]  =  5E11 - 6E12 + 7E21 + 8E22
        [ 7   8 ]

Accordingly, the four matrices E11, E12, E21, E22 span M.
4.5 Subspaces
This section introduces the important notion of a subspace.
DEFINITION: Let V be a vector space over a field K and let W be a subset of V. Then W is a subspace
of V if W is itself a vector space over K with respect to the operations of vector
addition and scalar multiplication on V.
The way in which one shows that any set W is a vector space is to show that W satisfies the eight
axioms of a vector space. However, if W is a subset of a vector space V, then some of the axioms
automatically hold in W, because they already hold in V. Simple criteria for identifying subspaces follow.
THEOREM 4.2: Suppose W is a subset of a vector space V. Then W is a subspace of V if the following two conditions hold:
    (a) The zero vector 0 belongs to W.
    (b) For every u, v ∈ W, k ∈ K:  (i) The sum u + v ∈ W.  (ii) The multiple ku ∈ W.
Property (i) in (b) states that W is closed under vector addition, and property (ii) in (b) states that W is closed under scalar multiplication. Both properties may be combined into the following equivalent single statement:
    (b′) For every u, v ∈ W and a, b ∈ K, the linear combination au + bv ∈ W.
Now let V be any vector space. Then V automatically contains two subspaces: the set {0} consisting of
the zero vector alone and the whole space V itself. These are sometimes called the trivial subspaces of V.
Examples of nontrivial subspaces follow.
EXAMPLE 4.6  Consider the vector space V = R^3.
(a) Let U consist of all vectors in R^3 whose entries are equal; that is,
        U = {(a, b, c) : a = b = c}
    For example, (1, 1, 1), (-3, -3, -3), (7, 7, 7), (-2, -2, -2) are vectors in U. Geometrically, U is the line through the origin O and the point (1, 1, 1), as shown in Fig. 4-1(a). Clearly 0 = (0, 0, 0) belongs to U, because all entries in 0 are equal. Further, suppose u and v are arbitrary vectors in U, say, u = (a, a, a) and v = (b, b, b). Then, for any scalar k ∈ R, the following are also vectors in U:
        u + v = (a + b, a + b, a + b)   and   ku = (ka, ka, ka)
    Thus, U is a subspace of R^3.
(b) Let W be any plane in R^3 passing through the origin, as pictured in Fig. 4-1(b). Then 0 = (0, 0, 0) belongs to W, because we assumed W passes through the origin O. Further, suppose u and v are vectors in W. Then u and v may be viewed as arrows in the plane W emanating from the origin O, as in Fig. 4-1(b). The sum u + v and any multiple ku of u also lie in the plane W. Thus, W is a subspace of R^3.
EXAMPLE 4.7
(a) Let V = Mn,n, the vector space of n × n matrices. Let W1 be the subset of all (upper) triangular matrices and let W2 be the subset of all symmetric matrices. Then W1 is a subspace of V, because W1 contains the zero matrix 0 and W1 is closed under matrix addition and scalar multiplication; that is, the sum and scalar multiple of such triangular matrices are also triangular. Similarly, W2 is a subspace of V.
(b) Let V = P(t), the vector space of polynomials. Then the space Pn(t) of polynomials of degree at most n may be viewed as a subspace of P(t). Let Q(t) be the collection of polynomials with only even powers of t. For example, the following are polynomials in Q(t):
        p1 = 3 + 4t^2 - 5t^6   and   p2 = 6 - 7t^4 + 9t^6 + 3t^12
    (We assume that any constant k = k·t^0 is an even power of t.) Then Q(t) is a subspace of P(t).
(c) Let V be the vector space of real-valued functions. Then the collection W1 of continuous functions and the collection W2 of differentiable functions are subspaces of V.
Intersection of Subspaces
Let U and W be subspaces of a vector space V. We show that the intersection U ∩ W is also a subspace of V. Clearly, 0 ∈ U and 0 ∈ W, because U and W are subspaces; whence 0 ∈ U ∩ W. Now suppose u and v belong to the intersection U ∩ W. Then u, v ∈ U and u, v ∈ W. Further, because U and W are subspaces, for any scalars a, b ∈ K,
    au + bv ∈ U   and   au + bv ∈ W
Thus, au + bv ∈ U ∩ W. Therefore, U ∩ W is a subspace of V.
The above result generalizes as follows.

THEOREM 4.3: The intersection of any number of subspaces of a vector space V is a subspace of V.

[Figure 4-1: (a) the line U through the origin; (b) a plane W through the origin in R^3.]
Solution Space of a Homogeneous System
Consider a system AX = B of linear equations in n unknowns. Then every solution u may be viewed as a vector in K^n. Thus, the solution set of such a system is a subset of K^n. Now suppose the system is homogeneous; that is, suppose the system has the form AX = 0. Let W be its solution set. Because A0 = 0, the zero vector 0 ∈ W. Moreover, suppose u and v belong to W. Then u and v are solutions of AX = 0; in other words, Au = 0 and Av = 0. Therefore, for any scalars a and b, we have
    A(au + bv) = aAu + bAv = a0 + b0 = 0 + 0 = 0
Thus, au + bv belongs to W, because it is a solution of AX = 0. Accordingly, W is a subspace of K^n. We state the above result formally.

THEOREM 4.4: The solution set W of a homogeneous system AX = 0 in n unknowns is a subspace of K^n.

We emphasize that the solution set of a nonhomogeneous system AX = B is not a subspace of K^n. In fact, the zero vector 0 does not belong to its solution set.
4.6 Linear Spans, Row Space of a Matrix
Suppose u1, u2, ..., um are any vectors in a vector space V. Recall (Section 4.4) that any vector of the form a1u1 + a2u2 + ··· + amum, where the ai are scalars, is called a linear combination of u1, u2, ..., um. The collection of all such linear combinations, denoted by
    span(u1, u2, ..., um)   or   span(ui)
is called the linear span of u1, u2, ..., um.
Clearly the zero vector 0 belongs to span(ui), because
    0 = 0u1 + 0u2 + ··· + 0um
Furthermore, suppose v and v′ belong to span(ui), say,
    v = a1u1 + a2u2 + ··· + amum   and   v′ = b1u1 + b2u2 + ··· + bmum
Then
    v + v′ = (a1 + b1)u1 + (a2 + b2)u2 + ··· + (am + bm)um
and, for any scalar k ∈ K,
    kv = ka1u1 + ka2u2 + ··· + kamum
Thus, v + v′ and kv also belong to span(ui). Accordingly, span(ui) is a subspace of V.
More generally, for any subset S of V, span(S) consists of all linear combinations of vectors in S or, when S = ∅, span(S) = {0}. Thus, in particular, S is a spanning set (Section 4.4) of span(S).
The following theorem, which was partially proved above, holds.

THEOREM 4.5: Let S be a subset of a vector space V.
    (i) Then span(S) is a subspace of V that contains S.
    (ii) If W is a subspace of V containing S, then span(S) ⊆ W.

Condition (ii) in Theorem 4.5 may be interpreted as saying that span(S) is the "smallest" subspace of V containing S.
EXAMPLE 4.8  Consider the vector space V = R^3.
(a) Let u be any nonzero vector in R^3. Then span(u) consists of all scalar multiples of u. Geometrically, span(u) is the line through the origin O and the endpoint of u, as shown in Fig. 4-2(a).
(b) Let u and v be vectors in R^3 that are not multiples of each other. Then span(u, v) is the plane through the origin O and the endpoints of u and v, as shown in Fig. 4-2(b).
(c) Consider the vectors e1 = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1) in R^3. Recall [Example 4.3(a)] that every vector in R^3 is a linear combination of e1, e2, e3. That is, e1, e2, e3 form a spanning set of R^3. Accordingly, span(e1, e2, e3) = R^3.
Row Space of a Matrix
Let A = [aij] be an arbitrary m × n matrix over a field K. The rows of A,
    R1 = (a11, a12, ..., a1n),  R2 = (a21, a22, ..., a2n),  ...,  Rm = (am1, am2, ..., amn)
may be viewed as vectors in K^n; hence, they span a subspace of K^n called the row space of A and denoted by rowsp(A). That is,
    rowsp(A) = span(R1, R2, ..., Rm)
Analogously, the columns of A may be viewed as vectors in K^m, and they span a subspace of K^m called the column space of A, denoted by colsp(A). Observe that colsp(A) = rowsp(A^T).
Recall that matrices A and B are row equivalent, written A ~ B, if B can be obtained from A by a sequence of elementary row operations. Now suppose M is the matrix obtained by applying one of the following elementary row operations on a matrix A:
    (1) Interchange Ri and Rj,   (2) Replace Ri by kRi,   (3) Replace Rj by kRi + Rj
Then each row of M is a row of A or a linear combination of rows of A. Hence, the row space of M is contained in the row space of A. On the other hand, we can apply the inverse elementary row operation on M to obtain A; hence, the row space of A is contained in the row space of M. Accordingly, A and M have the same row space. This will be true each time we apply an elementary row operation. Thus, we have proved the following theorem.

THEOREM 4.6: Row equivalent matrices have the same row space.
We are now able to prove (Problems 4.45–4.47) basic results on row equivalence (which first
appeared as Theorems 3.7 and 3.8 in Chapter 3).
THEOREM 4.7: Suppose A = [aij] and B = [bij] are row equivalent echelon matrices with respective pivot entries
    a1j1, a2j2, ..., arjr   and   b1k1, b2k2, ..., bsks
Then A and B have the same number of nonzero rows (that is, r = s), and their pivot entries are in the same positions (that is, j1 = k1, j2 = k2, ..., jr = kr).

THEOREM 4.8: Suppose A and B are row canonical matrices. Then A and B have the same row space if and only if they have the same nonzero rows.

[Figure 4-2: (a) the line span(u); (b) the plane span(u, v).]

COROLLARY 4.9: Every matrix A is row equivalent to a unique matrix in row canonical form.

We apply the above results in the next example.
EXAMPLE 4.9  Consider the following two sets of vectors in R^4:
    u1 = (1, 2, -1, 3),   u2 = (2, 4, 1, -2),   u3 = (3, 6, 3, -7)
    w1 = (1, 2, -4, 11),  w2 = (2, 4, -5, 14)
Let U = span(ui) and W = span(wi). There are two ways to show that U = W.
(a) Show that each ui is a linear combination of w1 and w2, and show that each wi is a linear combination of u1, u2, u3. Observe that we have to show that six systems of linear equations are consistent.
(b) Form the matrix A whose rows are u1, u2, u3 and row reduce A to row canonical form, and form the matrix B whose rows are w1 and w2 and row reduce B to row canonical form:

        [ 1  2  -1   3 ]      [ 1  2  -1    3 ]      [ 1  2  0   1/3 ]
    A = [ 2  4   1  -2 ]  ~   [ 0  0   3   -8 ]  ~   [ 0  0  1  -8/3 ]
        [ 3  6   3  -7 ]      [ 0  0   6  -16 ]      [ 0  0  0    0  ]

    B = [ 1  2  -4  11 ]  ~   [ 1  2  -4  11 ]  ~    [ 1  2  0   1/3 ]
        [ 2  4  -5  14 ]      [ 0  0   3  -8 ]       [ 0  0  1  -8/3 ]

Because the nonzero rows of the matrices in row canonical form are identical, the row spaces of A and B are equal. Therefore, U = W.
Clearly, the method in (b) is more efficient than the method in (a).
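Method (b) is also easy to automate. The sketch below (assuming SymPy; the helper nz is ours, not the book's) compares the row canonical forms computed by rref():

    from sympy import Matrix

    A = Matrix([[1, 2, -1, 3], [2, 4, 1, -2], [3, 6, 3, -7]])
    B = Matrix([[1, 2, -4, 11], [2, 4, -5, 14]])

    rrefA, _ = A.rref()
    rrefB, _ = B.rref()
    nz = lambda M: [list(M.row(i)) for i in range(M.rows) if any(M.row(i))]   # nonzero rows
    print(nz(rrefA) == nz(rrefB))   # True, so span(u_i) = span(w_i)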
4.7 Linear Dependence and Independence
Let V be a vector space over a field K. The following defines the notion of linear dependence and
independence of vectors over K. (One usually suppresses mentioning K when the field is understood.)
This concept plays an essential role in the theory of linear algebra and in mathematics in general.
DEFINITION: We say that the vectors v1, v2, ..., vm in V are linearly dependent if there exist scalars a1, a2, ..., am in K, not all of them 0, such that
    a1v1 + a2v2 + ··· + amvm = 0
Otherwise, we say that the vectors are linearly independent.

The above definition may be restated as follows. Consider the vector equation
    x1v1 + x2v2 + ··· + xmvm = 0     (*)
where the x's are unknown scalars. This equation always has the zero solution x1 = 0, x2 = 0, ..., xm = 0. Suppose this is the only solution; that is, suppose we can show:
    x1v1 + x2v2 + ··· + xmvm = 0   implies   x1 = 0, x2 = 0, ..., xm = 0
Then the vectors v1, v2, ..., vm are linearly independent. On the other hand, suppose the equation (*) has a nonzero solution; then the vectors are linearly dependent.
A set S ¼ fv1; v2; . . . ; vmg of vectors in V is linearly dependent or independent according to whether
the vectors v1; v2; . . . ; vm are linearly dependent or independent.
An infinite set S of vectors is linearly dependent or independent according to whether there do or do
not exist vectors v1; v2; . . . ; vk in S that are linearly dependent.
Warning: The set S ¼ fv1; v2; . . . ; vmg above represents a list or, in other words, a finite sequence
of vectors where the vectors are ordered and repetition is permitted.
The following remarks follow directly from the above definition.

Remark 1: Suppose 0 is one of the vectors v1, v2, ..., vm, say v1 = 0. Then the vectors must be linearly dependent, because we have the following linear combination where the coefficient of v1 is not 0:
    1v1 + 0v2 + ··· + 0vm = 1·0 + 0 + ··· + 0 = 0

Remark 2: Suppose v is a nonzero vector. Then v, by itself, is linearly independent, because
    kv = 0, v ≠ 0   implies   k = 0

Remark 3: Suppose two of the vectors v1, v2, ..., vm are equal or one is a scalar multiple of the other, say v1 = kv2. Then the vectors must be linearly dependent, because we have the following linear combination where the coefficient of v1 is not 0:
    v1 - kv2 + 0v3 + ··· + 0vm = 0

Remark 4: Two vectors v1 and v2 are linearly dependent if and only if one of them is a multiple of the other.

Remark 5: If the set {v1, ..., vm} is linearly independent, then any rearrangement of the vectors {vi1, vi2, ..., vim} is also linearly independent.

Remark 6: If a set S of vectors is linearly independent, then any subset of S is linearly independent. Alternatively, if S contains a linearly dependent subset, then S is linearly dependent.
EXAMPLE 4.10
(a) Let u = (1, 1, 0), v = (1, 3, 2), w = (4, 9, 5). Then u, v, w are linearly dependent, because
        3u + 5v - 2w = 3(1, 1, 0) + 5(1, 3, 2) - 2(4, 9, 5) = (0, 0, 0) = 0
(b) We show that the vectors u = (1, 2, 3), v = (2, 5, 7), w = (1, 3, 5) are linearly independent. We form the vector equation xu + yv + zw = 0, where x, y, z are unknown scalars. This yields

        [ 1 ]       [ 2 ]       [ 1 ]     [ 0 ]            x + 2y +  z = 0           x + 2y + z = 0
      x [ 2 ]  +  y [ 5 ]  +  z [ 3 ]  =  [ 0 ]    or     2x + 5y + 3z = 0    or         y +  z = 0
        [ 3 ]       [ 7 ]       [ 5 ]     [ 0 ]            3x + 7y + 5z = 0                  2z = 0

    Back-substitution yields x = 0, y = 0, z = 0. We have shown that
        xu + yv + zw = 0   implies   x = 0, y = 0, z = 0
    Accordingly, u, v, w are linearly independent.
(c) Let V be the vector space of functions from R into R. We show that the functions f(t) = sin t, g(t) = e^t, h(t) = t^2 are linearly independent. We form the vector (function) equation xf + yg + zh = 0, where x, y, z are unknown scalars. This function equation means that, for every value of t,
        x sin t + y e^t + z t^2 = 0
    Thus, in this equation, we choose appropriate values of t to easily get x = 0, y = 0, z = 0. For example,
        (i)   Substitute t = 0    to obtain  x(0) + y(1) + z(0) = 0,            or  y = 0
        (ii)  Substitute t = π    to obtain  x(0) + 0(e^π) + z(π^2) = 0,        or  z = 0
        (iii) Substitute t = π/2  to obtain  x(1) + 0(e^(π/2)) + 0(π^2/4) = 0,  or  x = 0
    We have shown that
        xf + yg + zh = 0   implies   x = 0, y = 0, z = 0
    Accordingly, f, g, h are linearly independent.
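For vectors in K^n, linear independence can also be tested by computing a rank, a connection justified later in the chapter. A quick check of (a) and (b) above (a sketch, assuming NumPy; not part of the text):

    import numpy as np

    dep = np.array([[1, 1, 0], [1, 3, 2], [4, 9, 5]])   # the vectors of (a), as rows
    ind = np.array([[1, 2, 3], [2, 5, 7], [1, 3, 5]])   # the vectors of (b), as rows
    print(np.linalg.matrix_rank(dep))   # 2 (less than 3, so dependent)
    print(np.linalg.matrix_rank(ind))   # 3 (equal to 3, so independent)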
Linear Dependence in R3
Linear dependence in the vector space V ¼ R3
can be described geometrically as follows:
(a) Any two vectors u and v in R3
are linearly dependent if and only if they lie on the same line through
the origin O, as shown in Fig. 4-3(a).
(b) Any three vectors u, v, w in R3
are linearly dependent if and only if they lie on the same plane
through the origin O, as shown in Fig. 4-3(b).
Later, we will be able to show that any four or more vectors in R3
are automatically linearly dependent.
Linear Dependence and Linear Combinations
The notions of linear dependence and linear combinations are closely related. Specifically, for more than
one vector, we show that the vectors v1; v2; . . . ; vm are linearly dependent if and only if one of them is a
linear combination of the others.
Suppose, say, vi is a linear combination of the others,
    vi = a1v1 + ··· + a_{i-1}v_{i-1} + a_{i+1}v_{i+1} + ··· + amvm
Then by adding -vi to both sides, we obtain
    a1v1 + ··· + a_{i-1}v_{i-1} - vi + a_{i+1}v_{i+1} + ··· + amvm = 0
where the coefficient of vi is not 0. Hence, the vectors are linearly dependent. Conversely, suppose the vectors are linearly dependent, say,
    b1v1 + ··· + bjvj + ··· + bmvm = 0,   where bj ≠ 0
Then we can solve for vj, obtaining
    vj = -bj^(-1)b1v1 - ··· - bj^(-1)b_{j-1}v_{j-1} - bj^(-1)b_{j+1}v_{j+1} - ··· - bj^(-1)bmvm
and so vj is a linear combination of the other vectors.
We now state a slightly stronger statement than the one above. This result has many important consequences.

LEMMA 4.10: Suppose two or more nonzero vectors v1, v2, ..., vm are linearly dependent. Then one of the vectors is a linear combination of the preceding vectors; that is, there exists k > 1 such that
    vk = c1v1 + c2v2 + ··· + c_{k-1}v_{k-1}

[Figure 4-3: (a) dependent vectors u, v on a line through O; (b) dependent vectors u, v, w in a plane through O.]
Linear Dependence and Echelon Matrices
Consider the following echelon matrix A, whose pivots are shown in parentheses:

        [ 0  (2)  3   4   5   6   7 ]
        [ 0   0  (4)  3   2   3   4 ]
    A = [ 0   0   0   0  (7)  8   9 ]
        [ 0   0   0   0   0  (6)  7 ]
        [ 0   0   0   0   0   0   0 ]
Observe that the rows R2, R3, R4 have 0’s in the second column below the nonzero pivot in R1, and hence
any linear combination of R2, R3, R4 must have 0 as its second entry. Thus, R1 cannot be a linear
combination of the rows below it. Similarly, the rows R3 and R4 have 0’s in the third column below the
nonzero pivot in R2, and hence R2 cannot be a linear combination of the rows below it. Finally, R3 cannot
be a multiple of R4, because R4 has a 0 in the fifth column below the nonzero pivot in R3. Viewing the
nonzero rows from the bottom up, R4, R3, R2, R1, no row is a linear combination of the preceding rows.
Thus, the rows are linearly independent by Lemma 4.10.
The argument used with the above echelon matrix A can be used for the nonzero rows of any echelon
matrix. Thus, we have the following very useful result.
THEOREM 4.11: The nonzero rows of a matrix in echelon form are linearly independent.
4.8 Basis and Dimension
First we state two equivalent ways to define a basis of a vector space V. (The equivalence is proved in
Problem 4.28.)
DEFINITION A: A set S ¼ fu1; u2; . . . ; ung of vectors is a basis of V if it has the following two
properties: (1) S is linearly independent. (2) S spans V.
DEFINITION B: A set S ¼ fu1; u2; . . . ; ung of vectors is a basis of V if every v 2 V can be written
uniquely as a linear combination of the basis vectors.
The following is a fundamental result in linear algebra.
THEOREM 4.12: Let V be a vector space such that one basis has m elements and another basis has n
elements. Then m ¼ n.
A vector space V is said to be of finite dimension n or n-dimensional, written
dim V ¼ n
if V has a basis with n elements. Theorem 4.12 tells us that all bases of V have the same number of
elements, so this definition is well defined.
The vector space {0} is defined to have dimension 0.
Suppose a vector space V does not have a finite basis. Then V is said to be of infinite dimension or to
be infinite-dimensional.
The above fundamental Theorem 4.12 is a consequence of the following ‘‘replacement lemma’’
(proved in Problem 4.35).
LEMMA 4.13: Suppose fv1; v2; . . . ; vng spans V, and suppose fw1; w2; . . . ; wmg is linearly indepen-
dent. Then m  n, and V is spanned by a set of the form
fw1; w2; . . . ; wm; vi1
; vi2
; . . . ; vinm
g
Thus, in particular, n þ 1 or more vectors in V are linearly dependent.
Observe in the above lemma that we have replaced m of the vectors in the spanning set of V by the m
independent vectors and still retained a spanning set.
Examples of Bases
This subsection presents important examples of bases of some of the main vector spaces appearing in this
text.
(a) Vector space K^n: Consider the following n vectors in K^n:
e1 = (1, 0, 0, 0, ..., 0, 0), e2 = (0, 1, 0, 0, ..., 0, 0), ..., en = (0, 0, 0, 0, ..., 0, 1)
These vectors are linearly independent. (For example, they form a matrix in echelon form.)
Furthermore, any vector u = (a1, a2, ..., an) in K^n can be written as a linear combination of the
above vectors. Specifically,
u = a1e1 + a2e2 + ... + anen
Accordingly, the vectors form a basis of K^n called the usual or standard basis of K^n. Thus (as one
might expect), K^n has dimension n. In particular, any other basis of K^n has n elements.
(b) Vector space M = M_{r,s} of all r×s matrices: The following six matrices form a basis of the
vector space M_{2,3} of all 2×3 matrices over K:

[1 0 0]  [0 1 0]  [0 0 1]  [0 0 0]  [0 0 0]  [0 0 0]
[0 0 0], [0 0 0], [0 0 0], [1 0 0], [0 1 0], [0 0 1]

More generally, in the vector space M = M_{r,s} of all r×s matrices, let Eij be the matrix with ij-entry 1
and 0's elsewhere. Then all such matrices form a basis of M_{r,s} called the usual or standard basis of
M_{r,s}. Accordingly, dim M_{r,s} = rs.
(c) Vector space Pn(t) of all polynomials of degree ≤ n: The set S = {1, t, t^2, t^3, ..., t^n} of n + 1
polynomials is a basis of Pn(t). Specifically, any polynomial f(t) of degree ≤ n can be expressed as a
linear combination of these powers of t, and one can show that these polynomials are linearly
independent. Therefore, dim Pn(t) = n + 1.
(d) Vector space P(t) of all polynomials: Consider any finite set S = {f1(t), f2(t), ..., fm(t)} of
polynomials in P(t), and let m denote the largest of the degrees of the polynomials. Then any
polynomial g(t) of degree exceeding m cannot be expressed as a linear combination of the elements of
S. Thus, S cannot be a basis of P(t). This means that the dimension of P(t) is infinite. We note that the
infinite set S' = {1, t, t^2, t^3, ...}, consisting of all the powers of t, spans P(t) and is linearly
independent. Accordingly, S' is an infinite basis of P(t).
Theorems on Bases
The following three theorems (proved in Problems 4.37, 4.38, and 4.39) will be used frequently.
THEOREM 4.14: Let V be a vector space of finite dimension n. Then:
(i) Any n + 1 or more vectors in V are linearly dependent.
(ii) Any linearly independent set S = {u1, u2, ..., un} with n elements is a basis
of V.
(iii) Any spanning set T = {v1, v2, ..., vn} of V with n elements is a basis of V.
THEOREM 4.15: Suppose S spans a vector space V. Then:
(i) Any maximum number of linearly independent vectors in S form a basis of V.
(ii) Suppose one deletes from S every vector that is a linear combination of
preceding vectors in S. Then the remaining vectors form a basis of V.
THEOREM 4.16: Let V be a vector space of finite dimension and let S = {u1, u2, ..., ur} be a set of
linearly independent vectors in V. Then S is part of a basis of V; that is, S may be
extended to a basis of V.
EXAMPLE 4.11
(a) The following four vectors in R^4 form a matrix in echelon form:
(1, 1, 1, 1), (0, 1, 1, 1), (0, 0, 1, 1), (0, 0, 0, 1)
Thus, the vectors are linearly independent, and, because dim R^4 = 4, the four vectors form a basis of R^4.
(b) The following n + 1 polynomials in Pn(t) are of increasing degree:
1, t - 1, (t - 1)^2, ..., (t - 1)^n
Therefore, no polynomial is a linear combination of preceding polynomials; hence, the polynomials are linearly
independent. Furthermore, they form a basis of Pn(t), because dim Pn(t) = n + 1.
(c) Consider any four vectors in R^3, say
(257, 132, 58), (43, 0, 17), (521, 317, 94), (328, 512, 731)
By Theorem 4.14(i), the four vectors must be linearly dependent, because they come from the three-dimensional
vector space R^3.
Dimension and Subspaces
The following theorem (proved in Problem 4.40) gives the basic relationship between the dimension of a
vector space and the dimension of a subspace.
THEOREM 4.17: Let W be a subspace of an n-dimensional vector space V. Then dim W ≤ n. In
particular, if dim W = n, then W = V.
EXAMPLE 4.12 Let W be a subspace of the real space R^3. Note that dim R^3 = 3. Theorem 4.17 tells us that the
dimension of W can only be 0, 1, 2, or 3. The following cases apply:
(a) If dim W = 0, then W = {0}, a point.
(b) If dim W = 1, then W is a line through the origin 0.
(c) If dim W = 2, then W is a plane through the origin 0.
(d) If dim W = 3, then W is the entire space R^3.
4.9 Application to Matrices, Rank of a Matrix
Let A be any m×n matrix over a field K. Recall that the rows of A may be viewed as vectors in K^n and
that the row space of A, written rowsp(A), is the subspace of K^n spanned by the rows of A. The following
definition applies.
DEFINITION: The rank of a matrix A, written rank(A), is equal to the maximum number of linearly
independent rows of A or, equivalently, the dimension of the row space of A.
Recall, on the other hand, that the columns of an m×n matrix A may be viewed as vectors in K^m and
that the column space of A, written colsp(A), is the subspace of K^m spanned by the columns of A.
Although m may not be equal to n—that is, the rows and columns of A may belong to different vector
spaces—we have the following fundamental result.
THEOREM 4.18: The maximum number of linearly independent rows of any matrix A is equal to the
maximum number of linearly independent columns of A. Thus, the dimension of the
row space of A is equal to the dimension of the column space of A.
Accordingly, one could restate the above definition of the rank of A using columns instead of rows.
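The equality of row rank and column rank in Theorem 4.18 is easy to check numerically. The following is a minimal sketch, assuming numpy is available; the matrix used is a small illustrative example of my own, not one from the text.

import numpy as np

# a hypothetical 3x3 example: the first two rows are proportional
A = np.array([[1, 2, 3],
              [2, 4, 6],
              [1, 1, 1]])

col_rank = np.linalg.matrix_rank(A)     # rank computed from A
row_rank = np.linalg.matrix_rank(A.T)   # rank of the transpose, i.e., of the rows of A

print(row_rank, col_rank)   # both print 2, as Theorem 4.18 predicts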
Basis-Finding Problems
This subsection shows how an echelon form of any matrix A gives us the solution to certain problems
about A itself. Specifically, let A and B be the following matrices, where the echelon matrix B (whose
pivots are circled) is an echelon form of A:
A ¼
1 2 1 3 1 2
2 5 5 6 4 5
3 7 6 11 6 9
1 5 10 8 9 9
2 6 8 11 9 12
2
6
6
6
6
4
3
7
7
7
7
5
and B ¼

1 2 1 3 1 2
0 
1 3 1 2 1
0 0 0 
1 1 2
0 0 0 0 0 0
0 0 0 0 0 0
2
6
6
6
6
4
3
7
7
7
7
5
We solve the following four problems about the matrix A, where C1; C2; . . . ; C6 denote its columns:
(a) Find a basis of the row space of A.
(b) Find each column Ck of A that is a linear combination of preceding columns of A.
(c) Find a basis of the column space of A.
(d) Find the rank of A.
(a) We are given that A and B are row equivalent, so they have the same row space. Moreover, B is in
echelon form, so its nonzero rows are linearly independent and hence form a basis of the row space
of B. Thus, they also form a basis of the row space of A. That is,
basis of rowsp(A): (1, 2, 1, 3, 1, 2), (0, 1, 3, 1, 2, 1), (0, 0, 0, 1, 1, 2)
(b) Let Mk = [C1, C2, ..., Ck], the submatrix of A consisting of the first k columns of A. Then M_{k-1} and
Mk are, respectively, the coefficient matrix and augmented matrix of the vector equation
x1C1 + x2C2 + ... + x_{k-1}C_{k-1} = Ck
Theorem 3.9 tells us that the system has a solution, or, equivalently, Ck is a linear combination of
the preceding columns of A, if and only if rank(Mk) = rank(M_{k-1}), where rank(Mk) means the
number of pivots in an echelon form of Mk. Now the first k columns of the echelon matrix B also form
an echelon form of Mk. Accordingly,
rank(M2) = rank(M3) = 2 and rank(M4) = rank(M5) = rank(M6) = 3
Thus, C3, C5, C6 are each a linear combination of the preceding columns of A.
(c) The fact that the remaining columns C1, C2, C4 are not linear combinations of their respective
preceding columns also tells us that they are linearly independent. Thus, they form a basis of the
column space of A. That is,
basis of colsp(A): [1, 2, 3, 1, 2]^T, [2, 5, 7, 5, 6]^T, [3, 6, 11, 8, 11]^T
Observe that C1, C2, C4 may also be characterized as those columns of A that contain the pivots in
any echelon form of A.
(d) Here we see that three possible definitions of the rank of A yield the same value.
(i) There are three pivots in B, which is an echelon form of A.
(ii) The three pivots in B correspond to the nonzero rows of B, which form a basis of the row
space of A.
(iii) The three pivots in B correspond to the columns of A, which form a basis of the column space
of A.
Thus, rankðAÞ ¼ 3.
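The four answers above can all be read off a computed echelon form. Below is a short sketch, assuming sympy is available; the matrix M is a small hypothetical example of my own, not the matrix A of this subsection.

from sympy import Matrix

M = Matrix([[1, 2, 3],
            [2, 4, 7],
            [1, 2, 5]])

R, pivot_cols = M.rref()                # row canonical form and the pivot column indices
rank = len(pivot_cols)                  # (d) rank of M

row_basis = [list(R.row(i)) for i in range(rank)]          # (a) nonzero rows: basis of rowsp(M)
col_basis = [list(M.col(j)) for j in pivot_cols]           # (c) pivot columns of M: basis of colsp(M)
dependent = [j for j in range(M.cols) if j not in pivot_cols]   # (b) columns that are combinations
                                                                # of preceding columns
print(rank, pivot_cols, dependent)      # prints: 2 (0, 2) [1]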
Application to Finding a Basis for W = span(u1, u2, ..., ur)
Frequently, we are given a list S = {u1, u2, ..., ur} of vectors in K^n, and we want to find a basis for the
subspace W of K^n spanned by the given vectors—that is, a basis of
W = span(S) = span(u1, u2, ..., ur)
The following two algorithms, which are essentially described in the above subsection, find such a basis
(and hence the dimension) of W.
Algorithm 4.1 (Row space algorithm)
Step 1. Form the matrix M whose rows are the given vectors.
Step 2. Row reduce M to echelon form.
Step 3. Output the nonzero rows of the echelon matrix.
Sometimes we want to find a basis that only comes from the original given vectors. The next algorithm
accomplishes this task.
Algorithm 4.2 (Casting-out algorithm)
Step 1. Form the matrix M whose columns are the given vectors.
Step 2. Row reduce M to echelon form.
Step 3. For each column Ck in the echelon matrix without a pivot, delete (cast out) the vector uk from
the list S of given vectors.
Step 4. Output the remaining vectors in S (which correspond to columns with pivots).
We emphasize that in the first algorithm we form a matrix whose rows are the given vectors, whereas
in the second algorithm we form a matrix whose columns are the given vectors.
EXAMPLE 4.13 Let W be the subspace of R^5 spanned by the following vectors:
u1 = (1, 2, 1, 3, 2), u2 = (1, 3, 3, 5, 3), u3 = (3, 8, 7, 13, 8)
u4 = (1, 4, 6, 9, 7), u5 = (5, 13, 13, 25, 19)
Find a basis of W consisting of the original given vectors, and find dim W.
Form the matrix M whose columns are the given vectors, and reduce M to echelon form:

M = [1 1  3 1  5]      [1 1 3 1 5]
    [2 3  8 4 13]      [0 1 2 2 3]
    [1 3  7 6 13]  ~   [0 0 0 1 2]
    [3 5 13 9 25]      [0 0 0 0 0]
    [2 3  8 7 19]      [0 0 0 0 0]

The pivots in the echelon matrix appear in columns C1, C2, C4. Accordingly, we "cast out" the vectors u3 and u5
from the original five vectors. The remaining vectors u1, u2, u4, which correspond to the columns in the echelon
matrix with pivots, form a basis of W. Thus, in particular, dim W = 3.
Remark: The justification of the casting-out algorithm is essentially described above, but we repeat
it again here for emphasis. The fact that column C3 in the echelon matrix in Example 4.13 does not have a
pivot means that the vector equation
xu1 + yu2 = u3
has a solution, and hence u3 is a linear combination of u1 and u2. Similarly, the fact that C5 does not have
a pivot means that u5 is a linear combination of the preceding vectors. We have deleted each vector in the
original spanning set that is a linear combination of preceding vectors. Thus, the remaining vectors are
linearly independent and form a basis of W.
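Both algorithms are short enough to express directly in code. The following is a minimal sketch, assuming sympy; the function names row_space_basis and casting_out are my own, and the test data are the vectors of Example 4.13 above.

from sympy import Matrix

def row_space_basis(vectors):
    # Algorithm 4.1: put the vectors as rows, reduce, output the nonzero rows
    R, pivots = Matrix(vectors).rref()
    return [list(R.row(i)) for i in range(len(pivots))]

def casting_out(vectors):
    # Algorithm 4.2: put the vectors as columns, reduce, keep the original
    # vectors whose columns contain pivots
    M = Matrix(vectors).T
    _, pivots = M.rref()
    return [vectors[j] for j in pivots]

u1, u2, u3 = [1, 2, 1, 3, 2], [1, 3, 3, 5, 3], [3, 8, 7, 13, 8]
u4, u5 = [1, 4, 6, 9, 7], [5, 13, 13, 25, 19]

print(casting_out([u1, u2, u3, u4, u5]))      # keeps u1, u2, u4, as in Example 4.13
print(row_space_basis([u1, u2, u3, u4, u5]))  # a basis of W taken from the reduced rows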
Application to Homogeneous Systems of Linear Equations
Consider again a homogeneous system AX = 0 of linear equations over K with n unknowns. By
Theorem 4.4, the solution set W of such a system is a subspace of K^n, and hence W has a dimension.
The following theorem, whose proof is postponed until Chapter 5, holds.
THEOREM 4.19: The dimension of the solution space W of a homogeneous system AX = 0 is n - r,
where n is the number of unknowns and r is the rank of the coefficient matrix A.
In the case where the system AX = 0 is in echelon form, it has precisely n - r free variables, say
x_{i1}, x_{i2}, ..., x_{i(n-r)}. Let vj be the solution obtained by setting x_{ij} = 1 (or any nonzero constant) and the
remaining free variables equal to 0. We show (Problem 4.50) that the solutions v1, v2, ..., v_{n-r} are
linearly independent; hence, they form a basis of the solution space W.
We have already used the above process to find a basis of the solution space W of a homogeneous
system AX = 0 in Section 3.11. Problem 4.48 gives three other examples.
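Theorem 4.19 can be checked mechanically: a computer algebra system returns a basis of the solution space whose size equals n minus the rank. Below is a small sketch, assuming sympy; the coefficient matrix is a hypothetical example of my own.

from sympy import Matrix

A = Matrix([[1, 2, 1, 0],
            [2, 4, 0, 2]])      # 2 equations in n = 4 unknowns

basis = A.nullspace()           # a basis of the solution space W of AX = 0
r = A.rank()

print(r, len(basis))            # prints: 2 2, i.e., dim W = n - r = 4 - 2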
4.10 Sums and Direct Sums
Let U and W be subsets of a vector space V. The sum of U and W, written U + W, consists of all sums
u + w where u ∈ U and w ∈ W. That is,
U + W = {v : v = u + w, where u ∈ U and w ∈ W}
Now suppose U and W are subspaces of V. Then one can easily show (Problem 4.53) that U + W is a
subspace of V. Recall that U ∩ W is also a subspace of V. The following theorem (proved in Problem
4.58) relates the dimensions of these subspaces.
THEOREM 4.20: Suppose U and W are finite-dimensional subspaces of a vector space V. Then
U + W has finite dimension and
dim(U + W) = dim U + dim W - dim(U ∩ W)
EXAMPLE 4.14 Let V = M_{2,2}, the vector space of 2×2 matrices. Let U consist of those matrices whose second
row is zero, and let W consist of those matrices whose second column is zero. Then

U = { [a b] }    W = { [a 0] }    U + W = { [a b] }    U ∩ W = { [a 0] }
      [0 0]            [c 0]                 [c 0]                 [0 0]

That is, U + W consists of those matrices whose lower right entry is 0, and U ∩ W consists of those matrices
whose second row and second column are zero. Note that dim U = 2, dim W = 2, dim(U ∩ W) = 1. Also,
dim(U + W) = 3, which is expected from Theorem 4.20. That is,
dim(U + W) = dim U + dim W - dim(U ∩ W) = 2 + 2 - 1 = 3
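The dimensions in Example 4.14 can be verified numerically by flattening each 2×2 matrix into a coordinate vector in R^4. The sketch below assumes numpy; the basis arrays are my own flattened representations of U and W, and dim(U ∩ W) is recovered from Theorem 4.20 and compared with the value 1 seen by inspection.

import numpy as np

# flatten [a b; c d] to (a, b, c, d)
basis_U = np.array([[1, 0, 0, 0],     # [1 0; 0 0]
                    [0, 1, 0, 0]])    # [0 1; 0 0]
basis_W = np.array([[1, 0, 0, 0],     # [1 0; 0 0]
                    [0, 0, 1, 0]])    # [0 0; 1 0]

dim_U = np.linalg.matrix_rank(basis_U)                          # 2
dim_W = np.linalg.matrix_rank(basis_W)                          # 2
dim_sum = np.linalg.matrix_rank(np.vstack([basis_U, basis_W]))  # dim(U + W) = 3
dim_int = dim_U + dim_W - dim_sum                               # Theorem 4.20 gives 1

print(dim_U, dim_W, dim_sum, dim_int)   # prints: 2 2 3 1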
Direct Sums
The vector space V is said to be the direct sum of its subspaces U and W, denoted by
V = U ⊕ W
if every v ∈ V can be written in one and only one way as v = u + w where u ∈ U and w ∈ W.
The following theorem (proved in Problem 4.59) characterizes such a decomposition.
THEOREM 4.21: The vector space V is the direct sum of its subspaces U and W if and only if:
(i) V = U + W, (ii) U ∩ W = {0}.
EXAMPLE 4.15 Consider the vector space V = R^3:
(a) Let U be the xy-plane and let W be the yz-plane; that is,
U = {(a, b, 0) : a, b ∈ R} and W = {(0, b, c) : b, c ∈ R}
Then R^3 = U + W, because every vector in R^3 is the sum of a vector in U and a vector in W. However, R^3 is not
the direct sum of U and W, because such sums are not unique. For example,
(3, 5, 7) = (3, 1, 0) + (0, 4, 7) and also (3, 5, 7) = (3, -4, 0) + (0, 9, 7)
(b) Let U be the xy-plane and let W be the z-axis; that is,
U = {(a, b, 0) : a, b ∈ R} and W = {(0, 0, c) : c ∈ R}
Now any vector (a, b, c) ∈ R^3 can be written as the sum of a vector in U and a vector in W in one and only one
way:
(a, b, c) = (a, b, 0) + (0, 0, c)
Accordingly, R^3 is the direct sum of U and W; that is, R^3 = U ⊕ W.
General Direct Sums
The notion of a direct sum is extended to more than one factor in the obvious way. That is, V is the direct
sum of subspaces W1, W2, ..., Wr, written
V = W1 ⊕ W2 ⊕ ... ⊕ Wr
if every vector v ∈ V can be written in one and only one way as
v = w1 + w2 + ... + wr
where w1 ∈ W1, w2 ∈ W2, ..., wr ∈ Wr.
The following theorems hold.
THEOREM 4.22: Suppose V = W1 ⊕ W2 ⊕ ... ⊕ Wr. Also, for each k, suppose Sk is a linearly
independent subset of Wk. Then
(a) The union S = ∪k Sk is linearly independent in V.
(b) If each Sk is a basis of Wk, then ∪k Sk is a basis of V.
(c) dim V = dim W1 + dim W2 + ... + dim Wr.
THEOREM 4.23: Suppose V = W1 + W2 + ... + Wr and dim V = Σk dim Wk. Then
V = W1 ⊕ W2 ⊕ ... ⊕ Wr.
4.11 Coordinates
Let V be an n-dimensional vector space over K with basis S = {u1, u2, ..., un}. Then any vector v ∈ V
can be expressed uniquely as a linear combination of the basis vectors in S, say
v = a1u1 + a2u2 + ... + anun
These n scalars a1, a2, ..., an are called the coordinates of v relative to the basis S, and they form a vector
[a1, a2, ..., an] in K^n called the coordinate vector of v relative to S. We denote this vector by [v]_S, or
simply [v], when S is understood. Thus,
[v]_S = [a1, a2, ..., an]
For notational convenience, brackets [...], rather than parentheses (...), are used to denote the coordinate
vector.
Remark: The above n scalars a1; a2; . . . ; an also form the coordinate column vector
½a1; a2; . . . ; anT
of v relative to S. The choice of the column vector rather than the row vector to
represent v depends on the context in which it is used. The use of such column vectors will become clear
later in Chapter 6.
EXAMPLE 4.16 Consider the vector space P2(t) of polynomials of degree ≤ 2. The polynomials
p1 = t + 1, p2 = t - 1, p3 = (t - 1)^2 = t^2 - 2t + 1
form a basis S of P2(t). The coordinate vector [v] of v = 2t^2 - 5t + 9 relative to S is obtained as follows.
Set v = xp1 + yp2 + zp3 using unknown scalars x, y, z, and simplify:
2t^2 - 5t + 9 = x(t + 1) + y(t - 1) + z(t^2 - 2t + 1)
             = xt + x + yt - y + zt^2 - 2zt + z
             = zt^2 + (x + y - 2z)t + (x - y + z)
Then set the coefficients of the same powers of t equal to each other to obtain the system
z = 2, x + y - 2z = -5, x - y + z = 9
The solution of the system is x = 3, y = -4, z = 2. Thus,
v = 3p1 - 4p2 + 2p3, and hence [v] = [3, -4, 2]
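The same coordinates can be computed by writing each polynomial as its coefficient vector in the standard basis {1, t, t^2} and solving one linear system. The following is a short sketch assuming numpy; the array layout is my own choice of representation.

import numpy as np

# columns are p1 = t + 1, p2 = t - 1, p3 = t^2 - 2t + 1, written as (constant, t, t^2)
P = np.array([[1, -1,  1],
              [1,  1, -2],
              [0,  0,  1]], dtype=float)
v = np.array([9, -5, 2], dtype=float)   # v = 2t^2 - 5t + 9

x, y, z = np.linalg.solve(P, v)
print(x, y, z)                          # prints: 3.0 -4.0 2.0, so [v]_S = [3, -4, 2]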
EXAMPLE 4.17 Consider real space R^3. The following vectors form a basis S of R^3:
u1 = (1, -1, 0), u2 = (1, 1, 0), u3 = (0, 1, 1)
The coordinates of v = (5, 3, 4) relative to the basis S are obtained as follows.
Set v = xu1 + yu2 + zu3; that is, set v as a linear combination of the basis vectors using unknown scalars x, y, z.
This yields

[5]       [ 1]       [1]       [0]
[3] =  x  [-1]  +  y [1]  +  z [1]
[4]       [ 0]       [0]       [1]

The equivalent system of linear equations is as follows:
x + y = 5, -x + y + z = 3, z = 4
The solution of the system is x = 3, y = 2, z = 4. Thus,
v = 3u1 + 2u2 + 4u3, and so [v]_S = [3, 2, 4]
Remark 1: There is a geometrical interpretation of the coordinates of a vector v relative to a basis
S for the real space R^n, which we illustrate using the basis S of R^3 in Example 4.17. First consider the
space R^3 with the usual x, y, z axes. Then the basis vectors determine a new coordinate system of R^3, say
with x', y', z' axes, as shown in Fig. 4-4. That is,
(1) The x'-axis is in the direction of u1 with unit length ||u1||.
(2) The y'-axis is in the direction of u2 with unit length ||u2||.
(3) The z'-axis is in the direction of u3 with unit length ||u3||.
Then each vector v = (a, b, c) or, equivalently, the point P(a, b, c) in R^3 will have new coordinates with
respect to the new x', y', z' axes. These new coordinates are precisely [v]_S, the coordinates of v with
respect to the basis S. Thus, as shown in Example 4.17, the coordinates of the point P(5, 3, 4) with the
new axes form the vector [3, 2, 4].
Remark 2: Consider the usual basis E = {e1, e2, ..., en} of K^n defined by
e1 = (1, 0, 0, ..., 0, 0), e2 = (0, 1, 0, ..., 0, 0), ..., en = (0, 0, 0, ..., 0, 1)
Let v = (a1, a2, ..., an) be any vector in K^n. Then one can easily show that
v = a1e1 + a2e2 + ... + anen, and so [v]_E = [a1, a2, ..., an]
That is, the coordinate vector [v]_E of any vector v relative to the usual basis E of K^n is identical to the
original vector v.
Isomorphism of V and K^n
Let V be a vector space of dimension n over K, and suppose S = {u1, u2, ..., un} is a basis of V. Then
each vector v ∈ V corresponds to a unique n-tuple [v]_S in K^n. On the other hand, each n-tuple
[c1, c2, ..., cn] in K^n corresponds to a unique vector c1u1 + c2u2 + ... + cnun in V. Thus, the basis S
induces a one-to-one correspondence between V and K^n. Furthermore, suppose
v = a1u1 + a2u2 + ... + anun and w = b1u1 + b2u2 + ... + bnun
Then
v + w = (a1 + b1)u1 + (a2 + b2)u2 + ... + (an + bn)un
kv = (ka1)u1 + (ka2)u2 + ... + (kan)un
where k is a scalar. Accordingly,
[v + w]_S = [a1 + b1, ..., an + bn] = [a1, ..., an] + [b1, ..., bn] = [v]_S + [w]_S
[kv]_S = [ka1, ka2, ..., kan] = k[a1, a2, ..., an] = k[v]_S
Thus, the above one-to-one correspondence between V and K^n preserves the vector space operations of
vector addition and scalar multiplication. We then say that V and K^n are isomorphic, written
V ≅ K^n
We state this result formally.
Figure 4-4
THEOREM 4.24: Let V be an n-dimensional vector space over a field K. Then V and Kn
are
isomorphic.
The next example gives a practical application of the above result.
EXAMPLE 4.18 Suppose we want to determine whether or not the following matrices in V = M_{2,3} are linearly
dependent:

A = [1 2 3]    B = [1 3 4]    C = [ 3  8 11]
    [4 0 1],       [6 5 4],       [16 10  9]

The coordinate vectors of the matrices in the usual basis of M_{2,3} are as follows:
[A] = [1, 2, 3, 4, 0, 1], [B] = [1, 3, 4, 6, 5, 4], [C] = [3, 8, 11, 16, 10, 9]
Form the matrix M whose rows are the above coordinate vectors and reduce M to an echelon form:

M = [1 2  3  4  0 1]     [1 2 3 4  0 1]     [1 2 3 4 0 1]
    [1 3  4  6  5 4]  ~  [0 1 1 2  5 3]  ~  [0 1 1 2 5 3]
    [3 8 11 16 10 9]     [0 2 2 4 10 6]     [0 0 0 0 0 0]

Because the echelon matrix has only two nonzero rows, the coordinate vectors [A], [B], [C] span a subspace of
dimension 2 and so are linearly dependent. Accordingly, the original matrices A, B, C are linearly dependent.
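The coordinate-vector technique of Example 4.18 is also easy to carry out numerically. The sketch below assumes numpy; it simply flattens each matrix relative to the usual basis of M_{2,3} and compares the number of matrices with the rank of the stacked coordinate vectors.

import numpy as np

A = np.array([[1, 2, 3], [4, 0, 1]])
B = np.array([[1, 3, 4], [6, 5, 4]])
C = np.array([[3, 8, 11], [16, 10, 9]])

M = np.vstack([A.ravel(), B.ravel(), C.ravel()])   # rows are [A], [B], [C]
print(np.linalg.matrix_rank(M))                    # 2 < 3, so A, B, C are linearly dependent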
SOLVED PROBLEMS
Vector Spaces, Linear Combinations
4.1. Suppose u and v belong to a vector space V. Simplify each of the following expressions:
(a) E1 = 3(2u - 4v) + 5u + 7v,  (c) E3 = 2uv + 3(2u + 4v),
(b) E2 = 3u - 6(3u - 5v) + 7u,  (d) E4 = 5u - (3/v) + 5u
Multiply out and collect terms:
(a) E1 = 6u - 12v + 5u + 7v = 11u - 5v
(b) E2 = 3u - 18u + 30v + 7u = -8u + 30v
(c) E3 is not defined because the product uv of vectors is not defined.
(d) E4 is not defined because division by a vector is not defined.
4.2. Prove Theorem 4.1: Let V be a vector space over a field K.
(i) k0 ¼ 0. (ii) 0u ¼ 0. (iii) If ku ¼ 0, then k ¼ 0 or u ¼ 0. (iv) ðkÞu ¼ kðuÞ ¼ ku.
(i) By Axiom [A2] with u ¼ 0, we have 0 þ 0 ¼ 0. Hence, by Axiom [M1], we have
k0 ¼ kð0 þ 0Þ ¼ k0 þ k0
Adding k0 to both sides gives the desired result.
(ii) For scalars, 0 þ 0 ¼ 0. Hence, by Axiom [M2], we have
0u ¼ ð0 þ 0Þu ¼ 0u þ 0u
Adding 0u to both sides gives the desired result.
(iii) Suppose ku ¼ 0 and k 6¼ 0. Then there exists a scalar k1
such that k1
k ¼ 1. Thus,
u ¼ 1u ¼ ðk1
kÞu ¼ k1
ðkuÞ ¼ k1
0 ¼ 0
(iv) Using u þ ðuÞ ¼ 0 and k þ ðkÞ ¼ 0 yields
0 ¼ k0 ¼ k½u þ ðuÞ ¼ ku þ kðuÞ and 0 ¼ 0u ¼ ½k þ ðkÞu ¼ ku þ ðkÞu
Adding ku to both sides of the first equation gives ku ¼ kðuÞ; and adding ku to both sides of the
second equation gives ku ¼ ðkÞu. Thus, ðkÞu ¼ kðuÞ ¼ ku.
4.3. Show that (a) kðu  vÞ ¼ ku  kv, (b) u þ u ¼ 2u.
(a) Using the definition of subtraction, that u  v ¼ u þ ðvÞ, and Theorem 4.1(iv), that kðvÞ ¼ kv, we
have
kðu  vÞ ¼ k½u þ ðvÞ ¼ ku þ kðvÞ ¼ ku þ ðkvÞ ¼ ku  kv
(b) Using Axiom [M4] and then Axiom [M2], we have
u þ u ¼ 1u þ 1u ¼ ð1 þ 1Þu ¼ 2u
4.4. Express v = (1, -2, 5) in R^3 as a linear combination of the vectors
u1 = (1, 1, 1), u2 = (1, 2, 3), u3 = (2, -1, 1)
We seek scalars x, y, z, as yet unknown, such that v = xu1 + yu2 + zu3. Thus, we require

[ 1]      [1]      [1]      [ 2]         x +  y + 2z =  1
[-2] =  x [1] +  y [2] +  z [-1]   or    x + 2y -  z = -2
[ 5]      [1]      [3]      [ 1]         x + 3y +  z =  5

(For notational convenience, we write the vectors in R^3 as columns, because it is then easier to find the
equivalent system of linear equations.) Reducing the system to echelon form yields the triangular system
x + y + 2z = 1, y - 3z = -3, 5z = 10
The system is consistent and has a solution. Solving by back-substitution yields the solution x = -6, y = 3,
z = 2. Thus, v = -6u1 + 3u2 + 2u3.
Alternatively, write down the augmented matrix M of the equivalent system of linear equations, where
u1, u2, u3 are the first three columns of M and v is the last column, and then reduce M to echelon form:

M = [1 1  2  1]     [1 1  2  1]     [1 1  2  1]
    [1 2 -1 -2]  ~  [0 1 -3 -3]  ~  [0 1 -3 -3]
    [1 3  1  5]     [0 2 -1  4]     [0 0  5 10]

The last matrix corresponds to a triangular system, which has a solution. Solving the triangular system by
back-substitution yields the solution x = -6, y = 3, z = 2. Thus, v = -6u1 + 3u2 + 2u3.
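As a quick numerical cross-check of this solution (not part of the book's method), one can let the columns of a coefficient matrix be u1, u2, u3 and solve directly; the sketch assumes numpy.

import numpy as np

U = np.array([[1, 1,  2],
              [1, 2, -1],
              [1, 3,  1]], dtype=float)   # columns are u1, u2, u3
v = np.array([1, -2, 5], dtype=float)

x, y, z = np.linalg.solve(U, v)
print(x, y, z)    # prints: -6.0 3.0 2.0, i.e., v = -6u1 + 3u2 + 2u3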
4.5. Express v = (2, -5, 3) in R^3 as a linear combination of the vectors
u1 = (1, -3, 2), u2 = (2, -4, -1), u3 = (1, -5, 7)
We seek scalars x, y, z, as yet unknown, such that v = xu1 + yu2 + zu3. Thus, we require

[ 2]      [ 1]      [ 2]      [ 1]           x + 2y +  z =  2
[-5] =  x [-3] +  y [-4] +  z [-5]   or    -3x - 4y - 5z = -5
[ 3]      [ 2]      [-1]      [ 7]           2x -  y + 7z =  3

Reducing the system to echelon form yields the system
x + 2y + z = 2, 2y - 2z = 1, 0 = 3
The system is inconsistent and so has no solution. Thus, v cannot be written as a linear combination of
u1, u2, u3.
4.6. Express the polynomial v = t^2 + 4t - 3 in P(t) as a linear combination of the polynomials
p1 = t^2 - 2t + 5, p2 = 2t^2 - 3t, p3 = t + 3
Set v as a linear combination of p1, p2, p3 using unknowns x, y, z to obtain
t^2 + 4t - 3 = x(t^2 - 2t + 5) + y(2t^2 - 3t) + z(t + 3)   (*)
We can proceed in two ways.
Method 1. Expand the right side of (*) and express it in terms of powers of t as follows:
t^2 + 4t - 3 = xt^2 - 2xt + 5x + 2yt^2 - 3yt + zt + 3z
            = (x + 2y)t^2 + (-2x - 3y + z)t + (5x + 3z)
Set coefficients of the same powers of t equal to each other, and reduce the system to echelon form. This
yields
x + 2y = 1, -2x - 3y + z = 4, 5x + 3z = -3
or x + 2y = 1, y + z = 6, -10y + 3z = -8
or x + 2y = 1, y + z = 6, 13z = 52
The system is consistent and has a solution. Solving by back-substitution yields the solution x = -3, y = 2,
z = 4. Thus, v = -3p1 + 2p2 + 4p3.
Method 2. The equation (*) is an identity in t; that is, the equation holds for any value of t. Thus, we can
set t equal to any numbers to obtain equations in the unknowns.
(a) Set t = 0 in (*) to obtain the equation -3 = 5x + 3z.
(b) Set t = 1 in (*) to obtain the equation 2 = 4x - y + 4z.
(c) Set t = -1 in (*) to obtain the equation -6 = 8x + 5y + 2z.
Solve the system of the three equations to again obtain the solution x = -3, y = 2, z = 4. Thus,
v = -3p1 + 2p2 + 4p3.
4.7. Express M as a linear combination of the matrices A, B, C, where

M = [4 7]    and    A = [1 1]    B = [1 2]    C = [1 1]
    [7 9],              [1 1],       [3 4],       [4 5]

Set M as a linear combination of A, B, C using unknown scalars x, y, z; that is, set M = xA + yB + zC.
This yields

[4 7]     [1 1]     [1 2]     [1 1]   [x +  y +  z   x + 2y +  z]
[7 9] = x [1 1] + y [3 4] + z [4 5] = [x + 3y + 4z   x + 4y + 5z]

Form the equivalent system of equations by setting corresponding entries equal to each other:
x + y + z = 4, x + 2y + z = 7, x + 3y + 4z = 7, x + 4y + 5z = 9
Reducing the system to echelon form yields
x + y + z = 4, y = 3, 3z = -3, 4z = -4
The last equation drops out. Solving the system by back-substitution yields z = -1, y = 3, x = 2. Thus,
M = 2A + 3B - C.
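Because the four entry equations form an overdetermined but consistent system, the same coefficients can be recovered numerically by flattening the matrices and solving in the least-squares sense. This is a sketch of that idea, assuming numpy, and is not part of the book's solution.

import numpy as np

M = np.array([[4, 7], [7, 9]], dtype=float)
A = np.array([[1, 1], [1, 1]], dtype=float)
B = np.array([[1, 2], [3, 4]], dtype=float)
C = np.array([[1, 1], [4, 5]], dtype=float)

coeffs = np.column_stack([A.ravel(), B.ravel(), C.ravel()])    # 4 equations, 3 unknowns
(x, y, z), *_ = np.linalg.lstsq(coeffs, M.ravel(), rcond=None)
print(round(x), round(y), round(z))    # prints: 2 3 -1, i.e., M = 2A + 3B - C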
Subspaces
4.8. Prove Theorem 4.2: W is a subspace of V if the following two conditions hold:
(a) 0 2 W. (b) If u; v 2 W, then u þ v, ku 2 W.
By (a), W is nonempty, and, by (b), the operations of vector addition and scalar multiplication are well
defined for W. Axioms [A1], [A4], [M1], [M2], [M3], [M4] hold in W because the vectors in W belong to V.
Thus, we need only show that [A2] and [A3] also hold in W. Now [A2] holds because the zero vector in V
belongs to W by (a). Finally, if v 2 W, then ð1Þv ¼ v 2 W, and v þ ðvÞ ¼ 0. Thus [A3] holds.
4.9. Let V = R^3. Show that W is not a subspace of V, where
(a) W = {(a, b, c) : a ≥ 0}, (b) W = {(a, b, c) : a^2 + b^2 + c^2 ≤ 1}.
In each case, show that Theorem 4.2 does not hold.
(a) W consists of those vectors whose first entry is nonnegative. Thus, v = (1, 2, 3) belongs to W. Let
k = -3. Then kv = (-3, -6, -9) does not belong to W, because -3 is negative. Thus, W is not a
subspace of V.
(b) W consists of vectors whose length does not exceed 1. Hence, u = (1, 0, 0) and v = (0, 1, 0) belong to
W, but u + v = (1, 1, 0) does not belong to W, because 1^2 + 1^2 + 0^2 = 2 > 1. Thus, W is not a
subspace of V.
4.10. Let V = P(t), the vector space of real polynomials. Determine whether or not W is a subspace of
V, where
(a) W consists of all polynomials with integral coefficients.
(b) W consists of all polynomials with degree ≤ 6 and the zero polynomial.
(c) W consists of all polynomials with only even powers of t.
(a) No, because scalar multiples of polynomials in W do not always belong to W. For example,
f(t) = 3 + 6t + 7t^2 ∈ W but (1/2)f(t) = 3/2 + 3t + (7/2)t^2 ∉ W
(b and c) Yes. In each case, W contains the zero polynomial, and sums and scalar multiples of polynomials
in W belong to W.
4.11. Let V be the vector space of functions f : R → R. Show that W is a subspace of V, where
(a) W = {f(x) : f(1) = 0}, all functions whose value at 1 is 0.
(b) W = {f(x) : f(3) = f(1)}, all functions assigning the same value to 3 and 1.
(c) W = {f(x) : f(-x) = -f(x)}, all odd functions.
Let 0̂ denote the zero function, so 0̂(x) = 0 for every value of x.
(a) 0̂ ∈ W, because 0̂(1) = 0. Suppose f, g ∈ W. Then f(1) = 0 and g(1) = 0. Also, for scalars a and b, we
have
(af + bg)(1) = af(1) + bg(1) = a0 + b0 = 0
Thus, af + bg ∈ W, and hence W is a subspace.
(b) 0̂ ∈ W, because 0̂(3) = 0 = 0̂(1). Suppose f, g ∈ W. Then f(3) = f(1) and g(3) = g(1). Thus, for any
scalars a and b, we have
(af + bg)(3) = af(3) + bg(3) = af(1) + bg(1) = (af + bg)(1)
Thus, af + bg ∈ W, and hence W is a subspace.
(c) 0̂ ∈ W, because 0̂(-x) = 0 = -0 = -0̂(x). Suppose f, g ∈ W. Then f(-x) = -f(x) and g(-x) = -g(x).
Also, for scalars a and b,
(af + bg)(-x) = af(-x) + bg(-x) = -af(x) - bg(x) = -(af + bg)(x)
Thus, af + bg ∈ W, and hence W is a subspace of V.
4.12. Prove Theorem 4.3: The intersection of any number of subspaces of V is a subspace of V.
Let {Wi : i ∈ I} be a collection of subspaces of V and let W = ∩(Wi : i ∈ I). Because each Wi is a
subspace of V, we have 0 ∈ Wi, for every i ∈ I. Hence, 0 ∈ W. Suppose u, v ∈ W. Then u, v ∈ Wi, for every
i ∈ I. Because each Wi is a subspace, au + bv ∈ Wi, for every i ∈ I. Hence, au + bv ∈ W. Thus, W is a
subspace of V.
Linear Spans
4.13. Show that the vectors u1 = (1, 1, 1), u2 = (1, 2, 3), u3 = (1, 5, 8) span R^3.
We need to show that an arbitrary vector v = (a, b, c) in R^3 is a linear combination of u1, u2, u3. Set
v = xu1 + yu2 + zu3; that is, set
(a, b, c) = x(1, 1, 1) + y(1, 2, 3) + z(1, 5, 8) = (x + y + z, x + 2y + 5z, x + 3y + 8z)
Form the equivalent system and reduce it to echelon form:
x + y + z = a, x + 2y + 5z = b, x + 3y + 8z = c
or x + y + z = a, y + 4z = b - a, 2y + 7z = c - a
or x + y + z = a, y + 4z = b - a, z = -a + 2b - c
The above system is in echelon form and is consistent; in fact,
x = -a + 5b - 3c, y = 3a - 7b + 4c, z = -a + 2b - c
is a solution. Thus, u1, u2, u3 span R^3.
4.14. Find conditions on a, b, c so that v = (a, b, c) in R^3 belongs to W = span(u1, u2, u3), where
u1 = (1, 2, 0), u2 = (-1, 1, 2), u3 = (3, 0, -4)
Set v as a linear combination of u1, u2, u3 using unknowns x, y, z; that is, set v = xu1 + yu2 + zu3. This
yields
(a, b, c) = x(1, 2, 0) + y(-1, 1, 2) + z(3, 0, -4) = (x - y + 3z, 2x + y, 2y - 4z)
Form the equivalent system of linear equations and reduce it to echelon form:
x - y + 3z = a, 2x + y = b, 2y - 4z = c
or x - y + 3z = a, 3y - 6z = b - 2a, 2y - 4z = c
or x - y + 3z = a, 3y - 6z = b - 2a, 0 = 4a - 2b + 3c
The vector v = (a, b, c) belongs to W if and only if the system is consistent, and it is consistent if and only if
4a - 2b + 3c = 0. Note, in particular, that u1, u2, u3 do not span the whole space R^3.
4.15. Show that the vector space V ¼ PðtÞ of real polynomials cannot be spanned by a finite number of
polynomials.
Any finite set S of polynomials contains a polynomial of maximum degree, say m. Then the linear span
span(S) of S cannot contain a polynomial of degree greater than m. Thus, spanðSÞ 6¼ V, for any finite set S.
4.16. Prove Theorem 4.5: Let S be a subset of V. (i) Then span(S) is a subspace of V containing S.
(ii) If W is a subspace of V containing S, then span(S) ⊆ W.
(i) Suppose S is empty. By definition, span(S) = {0}. Hence span(S) = {0} is a subspace of V and
S ⊆ span(S). Suppose S is not empty and v ∈ S. Then v = 1v ∈ span(S); hence, S ⊆ span(S). Also
0 = 0v ∈ span(S). Now suppose u, w ∈ span(S), say
u = a1u1 + ... + arur = Σi aiui and w = b1w1 + ... + bsws = Σj bjwj
where ui, wj ∈ S and ai, bj ∈ K. Then
u + w = Σi aiui + Σj bjwj and ku = k(Σi aiui) = Σi kaiui
belong to span(S) because each is a linear combination of vectors in S. Thus, span(S) is a subspace of V.
(ii) Suppose u1, u2, ..., ur ∈ S. Then all the ui belong to W. Thus, all multiples a1u1, a2u2, ..., arur ∈ W,
and so the sum a1u1 + a2u2 + ... + arur ∈ W. That is, W contains all linear combinations of elements
in S, or, in other words, span(S) ⊆ W, as claimed.
Linear Dependence
4.17. Determine whether or not u and v are linearly dependent, where
(a) u ¼ ð1; 2Þ, v ¼ ð3; 5Þ, (c) u ¼ ð1; 2; 3Þ, v ¼ ð4; 5; 6Þ
(b) u ¼ ð1; 3Þ, v ¼ ð2; 6Þ, (d) u ¼ ð2; 4; 8Þ, v ¼ ð3; 6; 12Þ
Two vectors u and v are linearly dependent if and only if one is a multiple of the other.
(a) No. (b) Yes; for v ¼ 2u. (c) No. (d) Yes, for v ¼ 3
2 u.
4.18. Determine whether or not u and v are linearly dependent, where
(a) u = 2t^2 + 4t - 3, v = 4t^2 + 8t - 6, (b) u = 2t^2 - 3t + 4, v = 4t^2 - 3t + 2,

(c) u = [1 3 4]    v = [ 4 12 16]      (d) u = [1 1 1]    v = [2 2 2]
        [5 0 1],       [20  0  4],             [2 2 2],        [3 3 3]

Two vectors u and v are linearly dependent if and only if one is a multiple of the other.
(a) Yes; for v = 2u. (b) No. (c) Yes, for v = 4u. (d) No.
4.19. Determine whether or not the vectors u = (1, 1, 2), v = (2, 3, 1), w = (4, 5, 5) in R^3 are linearly
dependent.
Method 1. Set a linear combination of u, v, w equal to the zero vector using unknowns x, y, z to obtain
the equivalent homogeneous system of linear equations and then reduce the system to echelon form.
This yields

    [1]     [2]     [4]   [0]        x + 2y + 4z = 0
  x [1] + y [3] + z [5] = [0]   or   x + 3y + 5z = 0   or   x + 2y + 4z = 0
    [2]     [1]     [5]   [0]       2x +  y + 5z = 0             y +  z = 0

The echelon system has only two nonzero equations in three unknowns; hence, it has a free variable and a
nonzero solution. Thus, u, v, w are linearly dependent.
Method 2. Form the matrix A whose columns are u, v, w and reduce to echelon form:

A = [1 2 4]     [1  2  4]     [1 2 4]
    [1 3 5]  ~  [0  1  1]  ~  [0 1 1]
    [2 1 5]     [0 -3 -3]     [0 0 0]

The third column does not have a pivot; hence, the third vector w is a linear combination of the first two
vectors u and v. Thus, the vectors are linearly dependent. (Observe that the matrix A is also the coefficient
matrix in Method 1. In other words, this method is essentially the same as the first method.)
Method 3. Form the matrix B whose rows are u, v, w, and reduce to echelon form:

B = [1 1 2]     [1 1  2]     [1 1  2]
    [2 3 1]  ~  [0 1 -3]  ~  [0 1 -3]
    [4 5 5]     [0 1 -3]     [0 0  0]

Because the echelon matrix has only two nonzero rows, the three vectors are linearly dependent. (The three
given vectors span a space of dimension 2.)
4.20. Determine whether or not each of the following lists of vectors in R^3 is linearly dependent:
(a) u1 = (1, 2, 5), u2 = (1, 3, 1), u3 = (2, 5, 7), u4 = (3, 1, 4),
(b) u = (1, 2, 5), v = (2, 5, 1), w = (1, 5, 2),
(c) u = (1, 2, 3), v = (0, 0, 0), w = (1, 5, 6).
(a) Yes, because any four vectors in R^3 are linearly dependent.
(b) Use Method 2 above; that is, form the matrix A whose columns are the given vectors, and reduce the
matrix to echelon form:

A = [1 2 1]     [1  2  1]     [1 2  1]
    [2 5 5]  ~  [0  1  3]  ~  [0 1  3]
    [5 1 2]     [0 -9 -3]     [0 0 24]

Every column has a pivot entry; hence, no vector is a linear combination of the previous vectors. Thus,
the vectors are linearly independent.
(c) Because 0 = (0, 0, 0) is one of the vectors, the vectors are linearly dependent.
4.21. Show that the functions f(t) = sin t, g(t) = cos t, h(t) = t from R into R are linearly independent.
Set a linear combination of the functions equal to the zero function 0 using unknown scalars x, y, z; that
is, set xf + yg + zh = 0. Then show x = 0, y = 0, z = 0. We emphasize that xf + yg + zh = 0 means that,
for every value of t, we have xf(t) + yg(t) + zh(t) = 0.
Thus, in the equation x sin t + y cos t + zt = 0:
(i) Set t = 0 to obtain x(0) + y(1) + z(0) = 0, or y = 0.
(ii) Set t = π/2 to obtain x(1) + y(0) + zπ/2 = 0, or x + πz/2 = 0.
(iii) Set t = π to obtain x(0) + y(-1) + z(π) = 0, or -y + πz = 0.
The three equations have only the zero solution; that is, x = 0, y = 0, z = 0. Thus, f, g, h are linearly
independent.
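The argument above amounts to showing that the 3×3 matrix of sample values at t = 0, π/2, π is nonsingular: any combination that vanishes identically must in particular vanish at those three points. The following is a small numerical sketch of that check, assuming numpy; it is a supplement, not part of the book's solution.

import numpy as np

ts = np.array([0.0, np.pi / 2, np.pi])
# columns are the values of f = sin, g = cos, h = t at the three sample points
M = np.column_stack([np.sin(ts), np.cos(ts), ts])

print(np.linalg.matrix_rank(M))   # 3, so x = y = z = 0 is the only possible combination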
4.22. Suppose the vectors u, v, w are linearly independent. Show that the vectors u þ v, u  v,
u  2v þ w are also linearly independent.
Suppose xðu þ vÞ þ yðu  vÞ þ zðu  2v þ wÞ ¼ 0. Then
xu þ xv þ yu  yv þ zu  2zv þ zw ¼ 0
or
ðx þ y þ zÞu þ ðx  y  2zÞv þ zw ¼ 0
Because u, v, w are linearly independent, the coefficients in the above equation are each 0; hence,
x þ y þ z ¼ 0; x  y  2z ¼ 0; z ¼ 0
The only solution to the above homogeneous system is x ¼ 0, y ¼ 0, z ¼ 0. Thus, u þ v, u  v, u  2v þ w
are linearly independent.
4.23. Show that the vectors u ¼ ð1 þ i; 2iÞ and w ¼ ð1; 1 þ iÞ in C2
are linearly dependent over the
complex field C but linearly independent over the real field R.
Recall that two vectors are linearly dependent (over a field K) if and only if one of them is a multiple of
the other (by an element in K). Because
ð1 þ iÞw ¼ ð1 þ iÞð1; 1 þ iÞ ¼ ð1 þ i; 2iÞ ¼ u
u and w are linearly dependent over C. On the other hand, u and w are linearly independent over R, as no real
multiple of w can equal u. Specifically, when k is real, the first component of kw ¼ ðk; k þ kiÞ must be real,
and it can never equal the first component 1 þ i of u, which is complex.
Basis and Dimension
4.24. Determine whether or not each of the following form a basis of R^3:
(a) (1, 1, 1), (1, 0, 1); (c) (1, 1, 1), (1, 2, 3), (2, -1, 1);
(b) (1, 2, 3), (1, 3, 5), (1, 0, 1), (2, 3, 0); (d) (1, 1, 2), (1, 2, 5), (5, 3, 4).
(a and b) No, because a basis of R^3 must contain exactly three elements because dim R^3 = 3.
(c) The three vectors form a basis if and only if they are linearly independent. Thus, form the matrix whose
rows are the given vectors, and row reduce the matrix to echelon form:

[1  1 1]     [1  1  1]     [1 1 1]
[1  2 3]  ~  [0  1  2]  ~  [0 1 2]
[2 -1 1]     [0 -3 -1]     [0 0 5]

The echelon matrix has no zero rows; hence, the three vectors are linearly independent, and so they do
form a basis of R^3.
(d) Form the matrix whose rows are the given vectors, and row reduce the matrix to echelon form:

[1 1 2]     [1  1  2]     [1 1 2]
[1 2 5]  ~  [0  1  3]  ~  [0 1 3]
[5 3 4]     [0 -2 -6]     [0 0 0]

The echelon matrix has a zero row; hence, the three vectors are linearly dependent, and so they do not
form a basis of R^3.
4.25. Determine whether (1, 1, 1, 1), (1, 2, 3, 2), (2, 5, 6, 4), (2, 6, 8, 5) form a basis of R^4. If not, find
the dimension of the subspace they span.
Form the matrix whose rows are the given vectors, and row reduce to echelon form:

B = [1 1 1 1]     [1 1 1 1]     [1 1  1  1]     [1 1  1  1]
    [1 2 3 2]  ~  [0 1 2 1]  ~  [0 1  2  1]  ~  [0 1  2  1]
    [2 5 6 4]     [0 3 4 2]     [0 0 -2 -1]     [0 0 -2 -1]
    [2 6 8 5]     [0 4 6 3]     [0 0 -2 -1]     [0 0  0  0]

The echelon matrix has a zero row. Hence, the four vectors are linearly dependent and do not form a basis of
R^4. Because the echelon matrix has three nonzero rows, the four vectors span a subspace of dimension 3.
4.26. Extend {u1 = (1, 1, 1, 1), u2 = (2, 2, 3, 4)} to a basis of R^4.
First form the matrix with rows u1 and u2, and reduce to echelon form:

[1 1 1 1]  ~  [1 1 1 1]
[2 2 3 4]     [0 0 1 2]

Then w1 = (1, 1, 1, 1) and w2 = (0, 0, 1, 2) span the same set of vectors as spanned by u1 and u2. Let
u3 = (0, 1, 0, 0) and u4 = (0, 0, 0, 1). Then w1, u3, w2, u4 form a matrix in echelon form. Thus, they are
linearly independent, and they form a basis of R^4. Hence, u1, u2, u3, u4 also form a basis of R^4.
4.27. Consider the complex field C, which contains the real field R, which contains the rational field Q.
(Thus, C is a vector space over R, and R is a vector space over Q.)
(a) Show that f1; ig is a basis of C over R; hence, C is a vector space of dimension 2 over R.
(b) Show that R is a vector space of infinite dimension over Q.
(a) For any v 2 C, we have v ¼ a þ bi ¼ að1Þ þ bðiÞ, where a; b 2 R. Hence, f1; ig spans C over R.
Furthermore, if xð1Þ þ yðiÞ ¼ 0 or x þ yi ¼ 0, where x, y 2 R, then x ¼ 0 and y ¼ 0. Hence, f1; ig is
linearly independent over R. Thus, f1; ig is a basis for C over R.
(b) It can be shown that p is a transcendental number; that is, p is not a root of any polynomial over Q.
Thus, for any n, the n þ 1 real numbers 1; p; p2
; . . . ; pn
are linearly independent over Q. R cannot be of
dimension n over Q. Accordingly, R is of infinite dimension over Q.
4.28. Suppose S ¼ fu1; u2; . . . ; ung is a subset of V. Show that the following Definitions A and B of a
basis of V are equivalent:
(A) S is linearly independent and spans V.
(B) Every v 2 V is a unique linear combination of vectors in S.
Suppose (A) holds. Because S spans V, the vector v is a linear combination of the ui, say
u ¼ a1u1 þ a2u2 þ    þ anun and u ¼ b1u1 þ b2u2 þ    þ bnun
Subtracting, we get
0 ¼ v  v ¼ ða1  b1Þu1 þ ða2  b2Þu2 þ    þ ðan  bnÞun
But the ui are linearly independent. Hence, the coefficients in the above relation are each 0:
a1  b1 ¼ 0; a2  b2 ¼ 0; . . . ; an  bn ¼ 0
Therefore, a1 ¼ b1; a2 ¼ b2; . . . ; an ¼ bn. Hence, the representation of v as a linear combination of the ui is
unique. Thus, (A) implies (B).
Suppose (B) holds. Then S spans V. Suppose
0 ¼ c1u1 þ c2u2 þ    þ cnun
However, we do have
0 ¼ 0u1 þ 0u2 þ    þ 0un
By hypothesis, the representation of 0 as a linear combination of the ui is unique. Hence, each ci ¼ 0 and the
ui are linearly independent. Thus, (B) implies (A).
Dimension and Subspaces
4.29. Find a basis and dimension of the subspace W of R3
where
(a) W ¼ fða; b; cÞ : a þ b þ c ¼ 0g, (b) W ¼ fða; b; cÞ : ða ¼ b ¼ cÞg
(a) Note that W 6¼ R3
, because, for example, ð1; 2; 3Þ 62 W. Thus, dim W  3. Note that u1 ¼ ð1; 0; 1Þ
and u2 ¼ ð0; 1; 1Þ are two independent vectors in W. Thus, dim W ¼ 2, and so u1 and u2 form a basis
of W.
(b) The vector u ¼ ð1; 1; 1Þ 2 W. Any vector w 2 W has the form w ¼ ðk; k; kÞ. Hence, w ¼ ku. Thus, u
spans W and dim W ¼ 1.
4.30. Let W be the subspace of R^4 spanned by the vectors
u1 = (1, -2, 5, -3), u2 = (2, 3, 1, -4), u3 = (3, 8, -3, -5)
(a) Find a basis and dimension of W. (b) Extend the basis of W to a basis of R^4.
(a) Apply Algorithm 4.1, the row space algorithm. Form the matrix whose rows are the given vectors, and
reduce it to echelon form:

A = [1 -2  5 -3]     [1 -2   5 -3]     [1 -2  5 -3]
    [2  3  1 -4]  ~  [0  7  -9  2]  ~  [0  7 -9  2]
    [3  8 -3 -5]     [0 14 -18  4]     [0  0  0  0]

The nonzero rows (1, -2, 5, -3) and (0, 7, -9, 2) of the echelon matrix form a basis of the row space
of A and hence of W. Thus, in particular, dim W = 2.
(b) We seek four linearly independent vectors, which include the above two vectors. The four vectors
(1, -2, 5, -3), (0, 7, -9, 2), (0, 0, 1, 0), and (0, 0, 0, 1) are linearly independent (because they form an
echelon matrix), and so they form a basis of R^4, which is an extension of the basis of W.
4.31. Let W be the subspace of R5
spanned by u1 ¼ ð1; 2; 1; 3; 4Þ, u2 ¼ ð2; 4; 2; 6; 8Þ,
u3 ¼ ð1; 3; 2; 2; 6Þ, u4 ¼ ð1; 4; 5; 1; 8Þ, u5 ¼ ð2; 7; 3; 3; 9Þ. Find a subset of the vectors that
form a basis of W.
Here we use Algorithm 4.2, the casting-out algorithm. Form the matrix M whose columns (not rows)
are the given vectors, and reduce it to echelon form:
M ¼
1 2 1 1 2
2 4 3 4 7
1 2 2 5 3
3 6 2 1 3
4 8 6 8 9
2
6
6
6
6
4
3
7
7
7
7
5
1 2 1 1 2
0 0 1 2 3
0 0 3 6 5
0 0 1 2 3
0 0 2 4 1
2
6
6
6
6
4
3
7
7
7
7
5
1 2 1 1 2
0 0 1 2 3
0 0 0 0 4
0 0 0 0 0
0 0 0 0 0
2
6
6
6
6
4
3
7
7
7
7
5
The pivot positions are in columns C1, C3, C5. Hence, the corresponding vectors u1, u3, u5 form a basis of W,
and dim W ¼ 3.
4.32. Let V be the vector space of 2×2 matrices over K. Let W be the subspace of symmetric matrices.
Show that dim W = 3, by finding a basis of W.
Recall that a matrix A = [aij] is symmetric if A^T = A, or, equivalently, each aij = aji. Thus,

A = [a b]
    [b d]

denotes an arbitrary 2×2 symmetric matrix. Setting (i) a = 1, b = 0, d = 0; (ii) a = 0, b = 1, d = 0;
(iii) a = 0, b = 0, d = 1, we obtain the respective matrices:

E1 = [1 0]    E2 = [0 1]    E3 = [0 0]
     [0 0],        [1 0],        [0 1]

We claim that S = {E1, E2, E3} is a basis of W; that is, (a) S spans W and (b) S is linearly independent.
(a) The above matrix A = aE1 + bE2 + dE3. Thus, S spans W.
(b) Suppose xE1 + yE2 + zE3 = 0, where x, y, z are unknown scalars. That is, suppose

x [1 0] + y [0 1] + z [0 0] = [0 0]    or    [x y] = [0 0]
  [0 0]     [1 0]     [0 1]   [0 0]          [y z]   [0 0]

Setting corresponding entries equal to each other yields x = 0, y = 0, z = 0. Thus, S is linearly independent.
Therefore, S is a basis of W, as claimed.
Theorems on Linear Dependence, Basis, and Dimension
4.33. Prove Lemma 4.10: Suppose two or more nonzero vectors v1; v2; . . . ; vm are linearly dependent.
Then one of them is a linear combination of the preceding vectors.
Because the vi are linearly dependent, there exist scalars a1; . . . ; am, not all 0, such that
a1v1 þ    þ amvm ¼ 0. Let k be the largest integer such that ak 6¼ 0. Then
a1v1 þ    þ akvk þ 0vkþ1 þ    þ 0vm ¼ 0 or a1v1 þ    þ akvk ¼ 0
Suppose k ¼ 1; then a1v1 ¼ 0, a1 6¼ 0, and so v1 ¼ 0. But the vi are nonzero vectors. Hence, k  1 and
vk ¼ a1
k a1v1      a1
k ak1vk1
That is, vk is a linear combination of the preceding vectors.
4.34. Suppose S ¼ fv1; v2; . . . ; vmg spans a vector space V.
(a) If w 2 V, then fw; v1; . . . ; vmg is linearly dependent and spans V.
(b) If vi is a linear combination of v1; . . . ; vi1, then S without vi spans V.
(a) The vector w is a linear combination of the vi, because fvig spans V. Accordingly, fw; v1; . . . ; vmg is
linearly dependent. Clearly, w with the vi span V, as the vi by themselves span V; that is, fw; v1; . . . ; vmg
spans V.
(b) Suppose vi ¼ k1v1 þ    þ ki1vi1. Let u 2 V. Because fvig spans V, u is a linear combination of the
vj’s, say u ¼ a1v1 þ    þ amvm: Substituting for vi, we obtain
u ¼ a1v1 þ    þ ai1vi1 þ aiðk1v1 þ    þ ki1vi1Þ þ aiþ1viþ1 þ    þ amvm
¼ ða1 þ aik1Þv1 þ    þ ðai1 þ aiki1Þvi1 þ aiþ1viþ1 þ    þ amvm
Thus, fv1; . . . ; vi1; viþ1; . . . ; vmg spans V. In other words, we can delete vi from the spanning set and still
retain a spanning set.
4.35. Prove Lemma 4.13: Suppose fv1; v2; . . . ; vng spans V, and suppose fw1; w2; . . . ; wmg is linearly
independent. Then m  n, and V is spanned by a set of the form
fw1; w2; . . . ; wm; vi1
; vi2
; . . . ; vinm
g
Thus, any n þ 1 or more vectors in V are linearly dependent.
It suffices to prove the lemma in the case that the vi are all not 0. (Prove!) Because fvig spans V, we
have by Problem 4.34 that
fw1; v1; . . . ; vng ð1Þ
is linearly dependent and also spans V. By Lemma 4.10, one of the vectors in (1) is a linear combination of
the preceding vectors. This vector cannot be w1, so it must be one of the v’s, say vj: Thus by Problem 4.34,
we can delete vj from the spanning set (1) and obtain the spanning set
fw1; v1; . . . ; vj1; vjþ1; . . . ; vng ð2Þ
Now we repeat the argument with the vector w2. That is, because (2) spans V, the set
fw1; w2; v1; . . . ; vj1; vjþ1; . . . ; vng ð3Þ
is linearly dependent and also spans V. Again by Lemma 4.10, one of the vectors in (3) is a linear
combination of the preceding vectors. We emphasize that this vector cannot be w1 or w2, because
fw1; . . . ; wmg is independent; hence, it must be one of the v’s, say vk. Thus, by Problem 4.34, we can
delete vk from the spanning set (3) and obtain the spanning set
fw1; w2; v1; . . . ; vj1; vjþ1; . . . ; vk1; vkþ1; . . . ; vng
We repeat the argument with w3, and so forth. At each step, we are able to add one of the w’s and delete
one of the v’s in the spanning set. If m  n, then we finally obtain a spanning set of the required form:
fw1; . . . ; wm; vi1
; . . . ; vinm
g
Finally, we show that m  n is not possible. Otherwise, after n of the above steps, we obtain the
spanning set fw1; . . . ; wng. This implies that wnþ1 is a linear combination of w1; . . . ; wn, which contradicts
the hypothesis that fwig is linearly independent.
4.36. Prove Theorem 4.12: Every basis of a vector space V has the same number of elements.
Suppose fu1; u2; . . . ; ung is a basis of V, and suppose fv1; v2; . . .g is another basis of V. Because fuig
spans V, the basis fv1; v2; . . .g must contain n or less vectors, or else it is linearly dependent by
Problem 4.35—Lemma 4.13. On the other hand, if the basis fv1; v2; . . .g contains less than n elements,
then fu1; u2; . . . ; ung is linearly dependent by Problem 4.35. Thus, the basis fv1; v2; . . .g contains exactly n
vectors, and so the theorem is true.
4.37. Prove Theorem 4.14: Let V be a vector space of finite dimension n. Then
(i) Any n þ 1 or more vectors must be linearly dependent.
(ii) Any linearly independent set S ¼ fu1; u2; . . . ung with n elements is a basis of V.
(iii) Any spanning set T ¼ fv1; v2; . . . ; vng of V with n elements is a basis of V.
Suppose B ¼ fw1; w2; . . . ; wng is a basis of V.
(i) Because B spans V, any n þ 1 or more vectors are linearly dependent by Lemma 4.13.
(ii) By Lemma 4.13, elements from B can be adjoined to S to form a spanning set of V with n elements.
Because S already has n elements, S itself is a spanning set of V. Thus, S is a basis of V.
(iii) Suppose T is linearly dependent. Then some vi is a linear combination of the preceding vectors. By
Problem 4.34, V is spanned by the vectors in T without vi and there are n  1 of them. By Lemma
4.13, the independent set B cannot have more than n  1 elements. This contradicts the fact that B has
n elements. Thus, T is linearly independent, and hence T is a basis of V.
4.38. Prove Theorem 4.15: Suppose S spans a vector space V. Then
(i) Any maximum number of linearly independent vectors in S form a basis of V.
(ii) Suppose one deletes from S every vector that is a linear combination of preceding vectors in
S. Then the remaining vectors form a basis of V.
(i) Suppose fv1; . . . ; vmg is a maximum linearly independent subset of S, and suppose w 2 S. Accord-
ingly, fv1; . . . ; vm; wg is linearly dependent. No vk can be a linear combination of preceding vectors.
Hence, w is a linear combination of the vi. Thus, w 2 spanðviÞ, and hence S  spanðviÞ. This leads to
V ¼ spanðSÞ  spanðviÞ  V
Thus, fvig spans V, and, as it is linearly independent, it is a basis of V.
(ii) The remaining vectors form a maximum linearly independent subset of S; hence, by (i), it is a basis
of V.
4.39. Prove Theorem 4.16: Let V be a vector space of finite dimension and let S ¼ fu1; u2; . . . ; urg be a
set of linearly independent vectors in V. Then S is part of a basis of V; that is, S may be extended
to a basis of V.
Suppose B ¼ fw1; w2; . . . ; wng is a basis of V. Then B spans V, and hence V is spanned by
S [ B ¼ fu1; u2; . . . ; ur; w1; w2; . . . ; wng
By Theorem 4.15, we can delete from S [ B each vector that is a linear combination of preceding vectors to
obtain a basis B0
for V. Because S is linearly independent, no uk is a linear combination of preceding vectors.
Thus, B0
contains every vector in S, and S is part of the basis B0
for V.
4.40. Prove Theorem 4.17: Let W be a subspace of an n-dimensional vector space V. Then dim W  n.
In particular, if dim W ¼ n, then W ¼ V.
Because V is of dimension n, any n þ 1 or more vectors are linearly dependent. Furthermore, because a
basis of W consists of linearly independent vectors, it cannot contain more than n elements. Accordingly,
dim W  n.
In particular, if fw1; . . . ; wng is a basis of W, then, because it is an independent set with n elements, it is
also a basis of V. Thus, W ¼ V when dim W ¼ n.
Rank of a Matrix, Row and Column Spaces
4.41. Find the rank and basis of the row space of each of the following matrices:

(a) A = [1  2 0 1]      (b) B = [1 3  1 -2 -3]
        [2  6 3 3]              [1 4  3 -1 -4]
        [3 10 6 5],             [2 3 -4 -7 -3]
                                [3 8  1 -7 -8]

(a) Row reduce A to echelon form:

A  ~  [1 2 0 1]  ~  [1 2 0 1]
      [0 2 3 1]     [0 2 3 1]
      [0 4 6 2]     [0 0 0 0]

The two nonzero rows (1, 2, 0, 1) and (0, 2, 3, 1) of the echelon form of A form a basis for
rowsp(A). In particular, rank(A) = 2.
(b) Row reduce B to echelon form:

B  ~  [1  3  1 -2 -3]  ~  [1 3 1 -2 -3]
      [0  1  2  1 -1]     [0 1 2  1 -1]
      [0 -3 -6 -3  3]     [0 0 0  0  0]
      [0 -1 -2 -1  1]     [0 0 0  0  0]

The two nonzero rows (1, 3, 1, -2, -3) and (0, 1, 2, 1, -1) of the echelon form of B form a basis for
rowsp(B). In particular, rank(B) = 2.
4.42. Show that U = W, where U and W are the following subspaces of R^3:
U = span(u1, u2, u3) = span{(1, 1, 1), (2, 3, 1), (3, 1, 5)}
W = span(w1, w2, w3) = span{(1, -1, 3), (3, -2, 8), (2, 1, 3)}
Form the matrix A whose rows are the ui, and row reduce A to row canonical form:

A = [1 1 1]     [1  1  1]     [1 0  2]
    [2 3 1]  ~  [0  1 -1]  ~  [0 1 -1]
    [3 1 5]     [0 -2  2]     [0 0  0]

Next form the matrix B whose rows are the wj, and row reduce B to row canonical form:

B = [1 -1 3]     [1 -1  3]     [1 0  2]
    [3 -2 8]  ~  [0  1 -1]  ~  [0 1 -1]
    [2  1 3]     [0  3 -3]     [0 0  0]

Because A and B have the same row canonical form, the row spaces of A and B are equal, and so U = W.
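This test is easy to automate: by Theorem 4.8, two spanning sets give the same subspace exactly when their matrices have the same row canonical form. The sketch below assumes sympy and uses the vectors as written in the solution above.

from sympy import Matrix

A = Matrix([[1, 1, 1], [2, 3, 1], [3, 1, 5]])       # rows span U
B = Matrix([[1, -1, 3], [3, -2, 8], [2, 1, 3]])     # rows span W

print(A.rref()[0] == B.rref()[0])   # True, so U = W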
4.43. Let

A = [1 2 1  2  3  1]
    [2 4 3  7  7  4]
    [1 2 2  5  5  6]
    [3 6 6 15 14 15]

(a) Find rank(Mk), for k = 1, 2, ..., 6, where Mk is the submatrix of A consisting of the first k
columns C1, C2, ..., Ck of A.
(b) Which columns C_{k+1} are linear combinations of preceding columns C1, ..., Ck?
(c) Find columns of A that form a basis for the column space of A.
(d) Express column C4 as a linear combination of the columns in part (c).
(a) Row reduce A to echelon form:

A  ~  [1 2 1 2 3  1]  ~  [1 2 1 2 3 1]
      [0 0 1 3 1  2]     [0 0 1 3 1 2]
      [0 0 1 3 2  5]     [0 0 0 0 1 3]
      [0 0 3 9 5 12]     [0 0 0 0 0 0]

Observe that this simultaneously reduces all the matrices Mk to echelon form; for example, the first four
columns of the echelon form of A are an echelon form of M4. We know that rank(Mk) is equal to the
number of pivots or, equivalently, the number of nonzero rows in an echelon form of Mk. Thus,
rank(M1) = rank(M2) = 1, rank(M3) = rank(M4) = 2, rank(M5) = rank(M6) = 3
(b) The vector equation x1C1 + x2C2 + ... + xkCk = C_{k+1} yields the system with coefficient matrix Mk
and augmented matrix M_{k+1}. Thus, C_{k+1} is a linear combination of C1, ..., Ck if and only if
rank(Mk) = rank(M_{k+1}) or, equivalently, if C_{k+1} does not contain a pivot. Thus, each of C2, C4, C6
is a linear combination of preceding columns.
(c) In the echelon form of A, the pivots are in the first, third, and fifth columns. Thus, columns C1, C3, C5
of A form a basis for the column space of A. Alternatively, deleting columns C2, C4, C6 from the
spanning set of columns (they are linear combinations of other columns), we obtain, again, C1, C3, C5.
(d) The echelon matrix tells us that C4 is a linear combination of columns C1 and C3. The augmented
matrix M of the vector equation C4 = xC1 + yC3 consists of the columns C1, C3, C4 of A, which, when
reduced to echelon form, yields the matrix (omitting zero rows)

[1 1 2]    or    x + y = 2,  y = 3    or    x = -1, y = 3
[0 1 3]

Thus, C4 = -C1 + 3C3 = -C1 + 3C3 + 0C5.
4.44. Suppose u ¼ ða1; a2; . . . ; anÞ is a linear combination of the rows R1; R2; . . . ; Rm of a matrix
B ¼ ½bij, say u ¼ k1R1 þ k2R2 þ    þ kmRm: Prove that
ai ¼ k1b1i þ k2b2i þ    þ kmbmi; i ¼ 1; 2; . . . ; n
where b1i; b2i; . . . ; bmi are the entries in the ith column of B.
We are given that u ¼ k1R1 þ k2R2 þ    þ kmRm. Hence,
ða1; a2; . . . ; anÞ ¼ k1ðb11; . . . ; b1nÞ þ    þ kmðbm1; . . . ; bmnÞ
¼ ðk1b11 þ    þ kmbm1; . . . ; k1b1n þ    þ kmbmnÞ
Setting corresponding components equal to each other, we obtain the desired result.
4.45. Prove Theorem 4.7: Suppose A ¼ ½aij and B ¼ ½bij are row equivalent echelon matrices with
respective pivot entries
a1j1
; a2j2
; . . . ; arjr
and b1k1
; b2k2
; . . . ; bsks
(pictured in Fig. 4-5). Then A and B have the same number of nonzero rows—that is, r ¼ s—and
their pivot entries are in the same positions; that is, j1 ¼ k1; j2 ¼ k2; . . . ; jr ¼ kr.
Clearly A = 0 if and only if B = 0, and so we need only prove the theorem when r ≥ 1 and s ≥ 1. We
first show that j1 = k1. Suppose j1 < k1. Then the j1-th column of B is zero. Because the first row R* of A is in
the row space of B, we have R* = c1R1 + c2R2 + ... + cmRm, where the Ri are the rows of B. Because the
j1-th column of B is zero, we have

    a_{1j_1} = c1·0 + c2·0 + ... + cm·0 = 0

But this contradicts the fact that the pivot entry a_{1j_1} ≠ 0. Hence, j1 ≥ k1 and, similarly, k1 ≥ j1. Thus j1 = k1.

Now let A′ be the submatrix of A obtained by deleting the first row of A, and let B′ be the submatrix of B
obtained by deleting the first row of B. We prove that A′ and B′ have the same row space. The theorem will
then follow by induction, because A′ and B′ are also echelon matrices.

Let R = (a1, a2, ..., an) be any row of A′ and let R1, ..., Rm be the rows of B. Because R is in the row
space of B, there exist scalars d1, ..., dm such that R = d1R1 + d2R2 + ... + dmRm. Because A is in echelon
form and R is not the first row of A, the j1-th entry of R is zero: ai = 0 for i = j1 = k1. Furthermore, because B is
in echelon form, all the entries in the k1-th column of B are 0 except the first: b_{1k_1} ≠ 0, but
b_{2k_1} = 0, ..., b_{mk_1} = 0. Thus,

    0 = a_{k_1} = d1 b_{1k_1} + d2·0 + ... + dm·0 = d1 b_{1k_1}

Now b_{1k_1} ≠ 0 and so d1 = 0. Thus, R is a linear combination of R2, ..., Rm and so is in the row space of B′.
Because R was any row of A′, the row space of A′ is contained in the row space of B′. Similarly, the row
space of B′ is contained in the row space of A′. Thus, A′ and B′ have the same row space, and so the theorem
is proved.
4.46. Prove Theorem 4.8: Suppose A and B are row canonical matrices. Then A and B have the same
row space if and only if they have the same nonzero rows.
Obviously, if A and B have the same nonzero rows, then they have the same row space. Thus we only
have to prove the converse.
Suppose A and B have the same row space, and suppose R 6¼ 0 is the ith row of A. Then there exist
scalars c1; . . . ; cs such that
R ¼ c1R1 þ c2R2 þ    þ csRs ð1Þ
where the Ri are the nonzero rows of B. The theorem is proved if we show that R ¼ Ri; that is, that ci ¼ 1 but
ck ¼ 0 for k 6¼ i.
Figure 4-5 pictures the row equivalent echelon matrices A and B of Theorem 4.7: the pivot entries
a_{1j_1}, a_{2j_2}, ..., a_{rj_r} of A and b_{1k_1}, b_{2k_2}, ..., b_{sk_s} of B descend in staircase fashion,
with arbitrary entries to the right of each pivot and zeros below.

Figure 4-5
Let a_{ij_i} be the pivot entry in R—that is, the first nonzero entry of R. By (1) and Problem 4.44,

    a_{ij_i} = c1 b_{1j_i} + c2 b_{2j_i} + ... + cs b_{sj_i}          (2)

But, by Problem 4.45, b_{ij_i} is a pivot entry of B, and, as B is row reduced, it is the only nonzero entry in the
j_i-th column of B. Thus, from (2), we obtain a_{ij_i} = ci b_{ij_i}. However, a_{ij_i} = 1 and b_{ij_i} = 1, because A and B are row
reduced; hence, ci = 1.

Now suppose k ≠ i, and b_{kj_k} is the pivot entry in Rk. By (1) and Problem 4.44,

    a_{ij_k} = c1 b_{1j_k} + c2 b_{2j_k} + ... + cs b_{sj_k}          (3)

Because B is row reduced, b_{kj_k} is the only nonzero entry in the j_k-th column of B. Hence, by (3), a_{ij_k} = ck b_{kj_k}.
Furthermore, by Problem 4.45, a_{kj_k} is a pivot entry of A, and because A is row reduced, a_{ij_k} = 0. Thus,
ck b_{kj_k} = 0, and as b_{kj_k} = 1, ck = 0. Accordingly, R = Ri, and the theorem is proved.
4.47. Prove Corollary 4.9: Every matrix A is row equivalent to a unique matrix in row canonical
form.
Suppose A is row equivalent to matrices A1 and A2, where A1 and A2 are in row canonical form. Then
rowspðAÞ ¼ rowspðA1Þ and rowspðAÞ ¼ rowspðA2Þ. Hence, rowspðA1Þ ¼ rowspðA2Þ. Because A1 and A2 are
in row canonical form, A1 ¼ A2 by Theorem 4.8. Thus, the corollary is proved.
4.48. Suppose RB and AB are defined, where R is a row vector and A and B are matrices. Prove
(a) RB is a linear combination of the rows of B.
(b) The row space of AB is contained in the row space of B.
(c) The column space of AB is contained in the column space of A.
(d) If C is a column vector and AC is defined, then AC is a linear combination of the columns
of A:
(e) rankðABÞ  rankðBÞ and rankðABÞ  rankðAÞ.
(a) Suppose R = (a1, a2, ..., am) and B = [bij]. Let B1, ..., Bm denote the rows of B and B^1, ..., B^n its
columns. Then

    RB = (RB^1, RB^2, ..., RB^n)
       = (a1 b11 + a2 b21 + ... + am bm1, ..., a1 b1n + a2 b2n + ... + am bmn)
       = a1(b11, b12, ..., b1n) + a2(b21, b22, ..., b2n) + ... + am(bm1, bm2, ..., bmn)
       = a1 B1 + a2 B2 + ... + am Bm
Thus, RB is a linear combination of the rows of B, as claimed.
(b) The rows of AB are RiB, where Ri is the ith row of A. Thus, by part (a), each row of AB is in the row
space of B. Thus, rowspðABÞ  rowspðBÞ, as claimed.
(c) Using part (b), we have colsp(AB) = rowsp((AB)^T) = rowsp(B^T A^T) ⊆ rowsp(A^T) = colsp(A).
(d) Follows from ðcÞ where C replaces B:
(e) The row space of AB is contained in the row space of B; hence, rankðABÞ  rankðBÞ. Furthermore, the
column space of AB is contained in the column space of A; hence, rankðABÞ  rankðAÞ.
4.49. Let A be an n-square matrix. Show that A is invertible if and only if rankðAÞ ¼ n.
Note that the rows of the n-square identity matrix In are linearly independent, because In is in echelon
form; hence, rank(In) = n. Now if A is invertible, then A is row equivalent to In; hence, rank(A) = n. But if
A is not invertible, then A is row equivalent to a matrix with a zero row; hence, rank(A) < n. That is, A is
invertible if and only if rank(A) = n.
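Remark: This rank test is easy to carry out by machine. The following minimal Python sketch uses NumPy (the particular matrices are illustrative choices of our own, not taken from the problems) to compare the rank of a square matrix with its size.

```python
import numpy as np

# An invertible 3-square matrix chosen for illustration.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

n = A.shape[0]
print(np.linalg.matrix_rank(A) == n)   # True: rank(A) = n, so A is invertible
print(np.linalg.inv(A))                # the inverse exists precisely in this case

# A matrix with a repeated row has rank < n and hence is not invertible.
B = np.array([[1.0, 2.0, 3.0],
              [1.0, 2.0, 3.0],
              [0.0, 1.0, 1.0]])
print(np.linalg.matrix_rank(B) < B.shape[0])   # True: B is singular
```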
Applications to Linear Equations
4.50. Find the dimension and a basis of the solution space W of each homogeneous system:
(a)   x + 2y + 2z -  s + 3t = 0        (b)   x + 2y +  z - 2t = 0        (c)   x +  y + 2z = 0
      x + 2y + 3z +  s +  t = 0             2x + 4y + 4z - 3t = 0             2x + 3y + 3z = 0
     3x + 6y + 8z +  s + 5t = 0             3x + 6y + 7z - 4t = 0              x + 3y + 5z = 0
(a) Reduce the system to echelon form:

        x + 2y + 2z -  s + 3t = 0                    x + 2y + 2z - s + 3t = 0
                 z + 2s - 2t = 0         or                   z + 2s - 2t = 0
                2z + 4s - 4t = 0

The system in echelon form has two (nonzero) equations in five unknowns. Hence, the system has
5 - 2 = 3 free variables, which are y, s, t. Thus, dim W = 3. We obtain a basis for W:

(1) Set y = 1, s = 0, t = 0 to obtain the solution v1 = (-2, 1, 0, 0, 0).
(2) Set y = 0, s = 1, t = 0 to obtain the solution v2 = (5, 0, -2, 1, 0).
(3) Set y = 0, s = 0, t = 1 to obtain the solution v3 = (-7, 0, 2, 0, 1).

The set {v1, v2, v3} is a basis of the solution space W.
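For readers who want a machine check of such computations, the minimal SymPy sketch below finds a basis of the solution space of system (a), using the coefficient matrix with the minus signs restored as above. SymPy's basis vectors need not coincide with v1, v2, v3, but they span the same three-dimensional space W.

```python
from sympy import Matrix

# Coefficient matrix of system (a).
A = Matrix([[1, 2, 2, -1, 3],
            [1, 2, 3,  1, 1],
            [3, 6, 8,  1, 5]])

basis = A.nullspace()          # list of column vectors spanning the solution space W
print(len(basis))              # 3, i.e., dim W = 3
for v in basis:
    print(v.T, (A * v).T)      # each A*v is the zero vector
```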
(b) (Here we use the matrix format of our homogeneous system.) Reduce the coefficient matrix A to
echelon form:

    A = [ 1  2  1  -2 ]      [ 1  2  1  -2 ]      [ 1  2  1  -2 ]
        [ 2  4  4  -3 ]  ~   [ 0  0  2   1 ]  ~   [ 0  0  2   1 ]
        [ 3  6  7  -4 ]      [ 0  0  4   2 ]      [ 0  0  0   0 ]

This corresponds to the system

    x + 2y + z - 2t = 0
             2z + t = 0

The free variables are y and t, and dim W = 2.

(i)  Set y = 1, t = 0 to obtain the solution u1 = (-2, 1, 0, 0).
(ii) Set y = 0, t = 2 to obtain the solution u2 = (5, 0, -1, 2).

Then {u1, u2} is a basis of W.
(c) Reduce the coefficient matrix A to echelon form:
    A = [ 1  1  2 ]      [ 1  1   2 ]      [ 1  1   2 ]
        [ 2  3  3 ]  ~   [ 0  1  -1 ]  ~   [ 0  1  -1 ]
        [ 1  3  5 ]      [ 0  2   3 ]      [ 0  0   5 ]
This corresponds to a triangular system with no free variables. Thus, 0 is the only solution; that is,
W ¼ f0g. Hence, dim W ¼ 0.
4.51. Find a homogeneous system whose solution set W is spanned by
fu1; u2; u3g ¼ fð1; 2; 0; 3Þ; ð1; 1; 1; 4Þ; ð1; 0; 2; 5Þg
Let v ¼ ðx; y; z; tÞ. Then v 2 W if and only if v is a linear combination of the vectors u1, u2, u3 that span
W. Thus, form the matrix M whose first columns are u1, u2, u3 and whose last column is v, and then row
reduce M to echelon form. This yields
    M = [ 1  1  1  x ]      [ 1   1   1   x       ]      [ 1   1   1   x           ]
        [ 2  1  0  y ]  ~   [ 0  -1  -2  -2x + y  ]  ~   [ 0  -1  -2  -2x + y      ]
        [ 0  1  2  z ]      [ 0   1   2   z       ]      [ 0   0   0  -2x + y + z  ]
        [ 3  4  5  t ]      [ 0   1   2  -3x + t  ]      [ 0   0   0  -5x + y + t  ]

Then v is a linear combination of u1, u2, u3 if and only if rank(M) = rank(A), where A is the submatrix
without column v. Thus, set the last two entries in the fourth column on the right equal to zero to obtain the
required homogeneous system:

    -2x + y + z = 0
    -5x + y + t = 0
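One way to check such a system by machine is to note that its coefficient rows must be orthogonal to u1, u2, u3, so they can be read off from the null space of the matrix whose rows are the ui. This is a slightly different route from the row reduction above; the SymPy sketch below is offered only as a cross-check.

```python
from sympy import Matrix

# Rows are the spanning vectors u1, u2, u3 of W.
U = Matrix([[1, 2, 0, 3],
            [1, 1, 1, 4],
            [1, 0, 2, 5]])

# Each h in the null space of U satisfies ui . h = 0 for every i, so the
# equations h . (x, y, z, t) = 0 form a homogeneous system whose solution set is W.
for h in U.nullspace():
    print(h.T)   # coefficient rows, here matching -2x + y + z = 0 and -5x + y + t = 0
```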
4.52. Let x_{i_1}, x_{i_2}, ..., x_{i_k} be the free variables of a homogeneous system of linear equations with n
unknowns. Let vj be the solution for which x_{i_j} = 1, and all other free variables equal 0. Show that
the solutions v1, v2, ..., vk are linearly independent.

Let A be the matrix whose rows are the vi. We interchange column 1 and column i1, then column 2 and
column i2, ..., then column k and column ik, and we obtain the k × n matrix

    B = [I, C] = [ 1  0  0  ...  0  0  c_{1,k+1}  ...  c_{1n} ]
                 [ 0  1  0  ...  0  0  c_{2,k+1}  ...  c_{2n} ]
                 [ .......................................... ]
                 [ 0  0  0  ...  0  1  c_{k,k+1}  ...  c_{kn} ]
The above matrix B is in echelon form, and so its rows are independent; hence, rankðBÞ ¼ k. Because A and
B are column equivalent, they have the same rank—rankðAÞ ¼ k. But A has k rows; hence, these rows (i.e.,
the vi) are linearly independent, as claimed.
Sums, Direct Sums, Intersections
4.53. Let U and W be subspaces of a vector space V. Show that
(a) U + W is a subspace of V.
(b) U and W are contained in U þ W.
(c) U þ W is the smallest subspace containing U and W; that is, U þ W ¼ spanðU; WÞ.
(d) W þ W ¼ W.
(a) Because U and W are subspaces, 0 ∈ U and 0 ∈ W. Hence, 0 = 0 + 0 belongs to U + W. Now suppose
v, v′ ∈ U + W. Then v = u + w and v′ = u′ + w′, where u, u′ ∈ U and w, w′ ∈ W. Then

    av + bv′ = (au + bu′) + (aw + bw′) ∈ U + W
Thus, U þ W is a subspace of V.
(b) Let u 2 U. Because W is a subspace, 0 2 W. Hence, u ¼ u þ 0 belongs to U þ W. Thus, U  U þ W.
Similarly, W  U þ W.
(c) Because U þ W is a subspace of V containing U and W, it must also contain the linear span of U and
W. That is, spanðU; WÞ  U þ W.
On the other hand, if v 2 U þ W, then v ¼ u þ w ¼ 1u þ 1w, where u 2 U and w 2 W. Thus, v is
a linear combination of elements in U [ W, and so v 2 spanðU; WÞ. Hence, U þ W  spanðU; WÞ.
The two inclusion relations give the desired result.
(d) Because W is a subspace of V, we have that W is closed under vector addition; hence, W þ W  W. By
part (a), W  W þ W. Hence, W þ W ¼ W.
4.54. Consider the following subspaces of R5
:
U = span(u1, u2, u3) = span{(1, 3, -2, 2, 3), (1, 4, -3, 4, 2), (2, 3, -1, -2, 9)}
W = span(w1, w2, w3) = span{(1, 3, 0, 2, 1), (1, 5, -6, 6, 3), (2, 5, 3, 2, 1)}

Find a basis and the dimension of (a) U + W, (b) U ∩ W.
(a) U þ W is the space spanned by all six vectors. Hence, form the matrix whose rows are the given six
vectors, and then row reduce to echelon form:
    [ 1  3  -2   2  3 ]      [ 1   3  -2   2   3 ]      [ 1  3  -2  2   3 ]
    [ 1  4  -3   4  2 ]      [ 0   1  -1   2  -1 ]      [ 0  1  -1  2  -1 ]
    [ 2  3  -1  -2  9 ]  ~   [ 0  -3   3  -6   3 ]  ~   [ 0  0   1  0  -1 ]
    [ 1  3   0   2  1 ]      [ 0   0   2   0  -2 ]      [ 0  0   0  0   0 ]
    [ 1  5  -6   6  3 ]      [ 0   2  -4   4   0 ]      [ 0  0   0  0   0 ]
    [ 2  5   3   2  1 ]      [ 0  -1   7  -2  -5 ]      [ 0  0   0  0   0 ]

The following three nonzero rows of the echelon matrix form a basis of U + W:

    (1, 3, -2, 2, 3),   (0, 1, -1, 2, -1),   (0, 0, 1, 0, -1)

Thus, dim(U + W) = 3.
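The same computation can be done by machine; the SymPy sketch below stacks the six spanning vectors (with the signs as restored above) and row reduces, so the number of pivots gives dim(U + W).

```python
from sympy import Matrix

# The six spanning vectors of U and W.
M = Matrix([[1, 3, -2, 2, 3], [1, 4, -3, 4, 2], [2, 3, -1, -2, 9],
            [1, 3,  0, 2, 1], [1, 5, -6, 6, 3], [2, 5,  3, 2, 1]])

R, pivots = M.rref()
print(len(pivots))            # 3 = dim(U + W)
print(R[:len(pivots), :])     # nonzero rows of the row canonical form; they also span U + W
```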
(b) Let v = (x, y, z, s, t) denote an arbitrary element in R^5. First find, say as in Problem 4.51, homogeneous
systems whose solution sets are U and W, respectively.
Let M be the matrix whose columns are the ui and v, and reduce M to echelon form:

    M = [  1   1   2  x ]      [ 1  1   2  x           ]
        [  3   4   3  y ]      [ 0  1  -3  -3x + y     ]
        [ -2  -3  -1  z ]  ~   [ 0  0   0  -x + y + z  ]
        [  2   4  -2  s ]      [ 0  0   0  4x - 2y + s ]
        [  3   2   9  t ]      [ 0  0   0  -6x + y + t ]

Set the last three entries in the last column equal to zero to obtain the following homogeneous system whose
solution set is U:

    -x + y + z = 0,      4x - 2y + s = 0,      -6x + y + t = 0
Now let M′ be the matrix whose columns are the wi and v, and reduce M′ to echelon form:

    M′ = [ 1   1  2  x ]      [ 1  1   2  x            ]
         [ 3   5  5  y ]      [ 0  2  -1  -3x + y      ]
         [ 0  -6  3  z ]  ~   [ 0  0   0  -9x + 3y + z ]
         [ 2   6  2  s ]      [ 0  0   0  4x - 2y + s  ]
         [ 1   3  1  t ]      [ 0  0   0  2x - y + t   ]

Again set the last three entries in the last column equal to zero to obtain the following homogeneous system
whose solution set is W:

    -9x + 3y + z = 0,      4x - 2y + s = 0,      2x - y + t = 0
Combine both of the above systems to obtain a homogeneous system whose solution space is U ∩ W, and
reduce the system to echelon form, yielding

    -x +  y +  z           = 0
         2y + 4z +  s      = 0
              8z + 5s + 2t = 0
                    s - 2t = 0

There is one free variable, which is t; hence, dim(U ∩ W) = 1. Setting t = 2, we obtain the solution
u = (1, 4, -3, 4, 2), which forms our required basis of U ∩ W.
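The method just used—combining a homogeneous system for U with one for W—can also be carried out by machine. The SymPy sketch below builds each system from the orthogonal complement of the corresponding row space (the vectors are those with the signs restored above) and then intersects the two solution sets.

```python
from sympy import Matrix

U = Matrix([[1, 3, -2, 2, 3], [1, 4, -3, 4, 2], [2, 3, -1, -2, 9]])   # rows span U
W = Matrix([[1, 3,  0, 2, 1], [1, 5, -6, 6, 3], [2, 5,  3, 2, 1]])   # rows span W

# Rows orthogonal to U (resp. W) give homogeneous systems with solution sets U (resp. W).
HU = Matrix.vstack(*[h.T for h in U.nullspace()])
HW = Matrix.vstack(*[h.T for h in W.nullspace()])

inter = HU.col_join(HW).nullspace()   # solution set of the combined system = U ∩ W
print(len(inter))                     # 1 = dim(U ∩ W)
print(inter[0].T)                     # a scalar multiple of (1, 4, -3, 4, 2)
```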
4.55. Suppose U and W are distinct four-dimensional subspaces of a vector space V, where dim V = 6.
Find the possible dimensions of U ∩ W.

Because U and W are distinct, U + W properly contains U and W; consequently, dim(U + W) > 4.
But dim(U + W) cannot be greater than 6, as dim V = 6. Hence, we have two possibilities: (a)
dim(U + W) = 5 or (b) dim(U + W) = 6. By Theorem 4.20,

    dim(U ∩ W) = dim U + dim W - dim(U + W) = 8 - dim(U + W)

Thus (a) dim(U ∩ W) = 3 or (b) dim(U ∩ W) = 2.
4.56. Let U and W be the following subspaces of R^3:

    U = {(a, b, c) : a = b = c}      and      W = {(0, b, c)}

(Note that W is the yz-plane.) Show that R^3 = U ⊕ W.

First we show that U ∩ W = {0}. Suppose v = (a, b, c) ∈ U ∩ W. Then a = b = c and a = 0. Hence,
a = 0, b = 0, c = 0. Thus, v = 0 = (0, 0, 0).

Next we show that R^3 = U + W. For, if v = (a, b, c) ∈ R^3, then

    v = (a, a, a) + (0, b - a, c - a)      where (a, a, a) ∈ U and (0, b - a, c - a) ∈ W

Both conditions U ∩ W = {0} and U + W = R^3 imply that R^3 = U ⊕ W.
4.57. Suppose that U and W are subspaces of a vector space V and that S ¼ fuig spans U and S0
¼ fwjg
spans W. Show that S [ S0
spans U þ W. (Accordingly, by induction, if Si spans Wi, for
i ¼ 1; 2; . . . ; n, then S1 [ . . . [ Sn spans W1 þ    þ Wn.)
Let v 2 U þ W. Then v ¼ u þ w, where u 2 U and w 2 W. Because S spans U, u is a linear
combination of ui, and as S0
spans W, w is a linear combination of wj; say
u ¼ a1ui1
þ a2ui2
þ    þ aruir
and v ¼ b1wj1
þ b2wj2
þ    þ bswjs
where ai; bj 2 K. Then
v ¼ u þ w ¼ a1ui1
þ a2ui2
þ    þ aruir
þ b1wj1
þ b2wj2
þ    þ bswjs
Accordingly, S [ S0
¼ fui; wjg spans U þ W.
4.58. Prove Theorem 4.20: Suppose U and V are finite-dimensional subspaces of a vector space V. Then
U þ W has finite dimension and
dimðU þ WÞ ¼ dim U þ dim W  dimðU  WÞ
Observe that U  W is a subspace of both U and W. Suppose dim U ¼ m, dim W ¼ n,
dimðU  WÞ ¼ r. Suppose fv1; . . . ; vrg is a basis of U  W. By Theorem 4.16, we can extend fvig to a
basis of U and to a basis of W; say
fv1; . . . ; vr; u1; . . . ; umrg and fv1; . . . ; vr; w1; . . . ; wnrg
are bases of U and W, respectively. Let
B ¼ fv1; . . . ; vr; u1; . . . ; umr; w1; . . . ; wnrg
Note that B has exactly m þ n  r elements. Thus, the theorem is proved if we can show that B is a basis
of U þ W. Because fvi; ujg spans U and fvi; wkg spans W, the union B ¼ fvi; uj; wkg spans U þ W. Thus, it
suffices to show that B is independent.
Suppose
a1v1 þ    þ arvr þ b1u1 þ    þ bmrumr þ c1w1 þ    þ cnrwnr ¼ 0 ð1Þ
where ai, bj, ck are scalars. Let
v ¼ a1v1 þ    þ arvr þ b1u1 þ    þ bmrumr ð2Þ
By (1), we also have
v ¼ c1w1      cnrwnr ð3Þ
Because fvi; ujg  U, v 2 U by (2); and as fwkg  W, v 2 W by (3). Accordingly, v 2 U  W. Now fvig is
a basis of U  W, and so there exist scalars d1; . . . ; dr for which v ¼ d1v1 þ    þ drvr. Thus, by (3), we have
d1v1 þ    þ drvr þ c1w1 þ    þ cnrwnr ¼ 0
But fvi; wkg is a basis of W, and so is independent. Hence, the above equation forces c1 ¼ 0; . . . ; cnr ¼ 0.
Substituting this into (1), we obtain
a1v1 þ    þ arvr þ b1u1 þ    þ bmrumr ¼ 0
But fvi; ujg is a basis of U, and so is independent. Hence, the above equation forces a1 ¼
0; . . . ; ar ¼ 0; b1 ¼ 0; . . . ; bmr ¼ 0.
Because (1) implies that the ai, bj, ck are all 0, B ¼ fvi; uj; wkg is independent, and the theorem is
proved.
4.59. Prove Theorem 4.21: V ¼ U W if and only if (i) V ¼ U þ W, (ii) U  W ¼ f0g.
Suppose V ¼ U W. Then any v 2 V can be uniquely written in the form v ¼ u þ w, where u 2 U and
w 2 W. Thus, in particular, V ¼ U þ W. Now suppose v 2 U  W. Then
ð1Þ v ¼ v þ 0; where v 2 U; 0 2 W; ð2Þ v ¼ 0 þ v; where 0 2 U; v 2 W:
Thus, v ¼ 0 þ 0 ¼ 0 and U  W ¼ f0g.
On the other hand, suppose V ¼ U þ W and U  W ¼ f0g. Let v 2 V. Because V ¼ U þ W, there exist
u 2 U and w 2 W such that v ¼ u þ w. We need to show that such a sum is unique. Suppose also that
v ¼ u0
þ w0
, where u0
2 U and w0
2 W. Then
u þ w ¼ u0
þ w0
; and so u  u0
¼ w0
 w
But u  u0
2 U and w0
 w 2 W; hence, by U  W ¼ f0g,
u  u0
¼ 0; w0
 w ¼ 0; and so u ¼ u0
; w ¼ w0
Thus, such a sum for v 2 V is unique, and V ¼ U W.
4.60. Prove Theorem 4.22 (for two factors): Suppose V ¼ U W. Also, suppose S ¼ fu1; . . . ; umg and
S0
¼ fw1; . . . ; wng are linearly independent subsets of U and W, respectively. Then
(a) The union S [ S0
is linearly independent in V.
(b) If S and S0
are bases of U and W, respectively, then S [ S0
is a basis of V.
(c) dim V ¼ dim U þ dim W.
(a) Suppose a1u1 þ    þ amum þ b1w1 þ    þ bnwn ¼ 0, where ai, bj are scalars. Then
ða1u1 þ    þ amumÞ þ ðb1w1 þ    þ bnwnÞ ¼ 0 ¼ 0 þ 0
where 0; a1u1 þ    þ amum 2 U and 0; b1w1 þ    þ bnwn 2 W. Because such a sum for 0 is unique,
this leads to
a1u1 þ    þ amum ¼ 0 and b1w1 þ    þ bnwn ¼ 0
Because S1 is linearly independent, each ai ¼ 0, and because S2 is linearly independent, each bj ¼ 0.
Thus, S ¼ S1 [ S2 is linearly independent.
(b) By part (a), S ¼ S1 [ S2 is linearly independent, and, by Problem 4.55, S ¼ S1 [ S2 spans V ¼ U þ W.
Thus, S ¼ S1 [ S2 is a basis of V.
(c) This follows directly from part (b).
Coordinates
4.61. Relative to the basis S = {u1, u2} = {(1, 1), (2, 3)} of R^2, find the coordinate vector of v, where
(a) v = (4, -3), (b) v = (a, b).

In each case, set

    v = xu1 + yu2 = x(1, 1) + y(2, 3) = (x + 2y, x + 3y)

and then solve for x and y.

(a) We have (4, -3) = (x + 2y, x + 3y),   or   x + 2y = 4,  x + 3y = -3.

The solution is x = 18, y = -7. Hence, [v] = [18, -7].

(b) We have (a, b) = (x + 2y, x + 3y),   or   x + 2y = a,  x + 3y = b.

The solution is x = 3a - 2b, y = -a + b. Hence, [v] = [3a - 2b, -a + b].
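Finding coordinates relative to a basis of R^n amounts to solving a linear system whose columns are the basis vectors, so it can be checked with one call to a linear solver. A minimal NumPy sketch for part (a):

```python
import numpy as np

# Columns of P are the basis vectors u1 = (1, 1) and u2 = (2, 3).
P = np.array([[1.0, 2.0],
              [1.0, 3.0]])
v = np.array([4.0, -3.0])

coords = np.linalg.solve(P, v)   # solves x*u1 + y*u2 = v
print(coords)                    # [18. -7.], i.e., [v] = [18, -7]
```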
4.62. Find the coordinate vector of v = (a, b, c) in R^3 relative to
(a) the usual basis E = {(1, 0, 0), (0, 1, 0), (0, 0, 1)},
(b) the basis S = {u1, u2, u3} = {(1, 1, 1), (1, 1, 0), (1, 0, 0)}.

(a) Relative to the usual basis E, the coordinate vector [v]_E is the same as v. That is, [v]_E = [a, b, c].

(b) Set v as a linear combination of u1, u2, u3 using unknown scalars x, y, z. This yields

    [ a ]       [ 1 ]       [ 1 ]       [ 1 ]               x + y + z = a
    [ b ]  =  x [ 1 ]  +  y [ 1 ]  +  z [ 0 ]      or       x + y     = b
    [ c ]       [ 1 ]       [ 0 ]       [ 0 ]               x         = c

Solving the system yields x = c, y = b - c, z = a - b. Thus, [v]_S = [c, b - c, a - b].
4.63. Consider the vector space P3(t) of polynomials of degree ≤ 3.

(a) Show that S = {(t - 1)^3, (t - 1)^2, t - 1, 1} is a basis of P3(t).
(b) Find the coordinate vector [v] of v = 3t^3 + 4t^2 + 2t - 5 relative to S.
(a) The degree of ðt  1Þk
is k; writing the polynomials of S in reverse order, we see that no polynomial is
a linear combination of preceding polynomials. Thus, the polynomials are linearly independent, and,
because dim P3ðtÞ ¼ 4, they form a basis of P3ðtÞ.
(b) Set v as a linear combination of the basis vectors using unknown scalars x, y, z, s. We have

    v = 3t^3 + 4t^2 + 2t - 5 = x(t - 1)^3 + y(t - 1)^2 + z(t - 1) + s(1)
      = x(t^3 - 3t^2 + 3t - 1) + y(t^2 - 2t + 1) + z(t - 1) + s(1)
      = xt^3 - 3xt^2 + 3xt - x + yt^2 - 2yt + y + zt - z + s
      = xt^3 + (-3x + y)t^2 + (3x - 2y + z)t + (-x + y - z + s)

Then set coefficients of the same powers of t equal to each other to obtain

    x = 3,      -3x + y = 4,      3x - 2y + z = 2,      -x + y - z + s = -5

Solving the system yields x = 3, y = 13, z = 19, s = 4. Thus, [v] = [3, 13, 19, 4].
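Matching coefficients as above can be automated. The SymPy sketch below rebuilds the same linear system by comparing coefficients of powers of t and solves it; it is only a check on the hand computation.

```python
from sympy import symbols, expand, Poly, linsolve

t, x, y, z, s = symbols('t x y z s')

v = 3*t**3 + 4*t**2 + 2*t - 5
combo = x*(t - 1)**3 + y*(t - 1)**2 + z*(t - 1) + s

# Coefficients of (combo - v) in powers of t must all vanish.
eqs = Poly(expand(combo - v), t).all_coeffs()
print(linsolve(eqs, [x, y, z, s]))   # {(3, 13, 19, 4)}, i.e., [v] = [3, 13, 19, 4]
```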
4.64. Find the coordinate vector of A = [ 2   3 ]  in the real vector space M = M_{2,2} relative to
                                        [ 4  -7 ]

(a) the basis S = { [ 1  1 ]    [ 1  -1 ]    [ 1  -1 ]    [ 1  0 ] }
                  { [ 1  1 ] ,  [ 1   0 ] ,  [ 0   0 ] ,  [ 0  0 ] },

(b) the usual basis E = { [ 1  0 ]    [ 0  1 ]    [ 0  0 ]    [ 0  0 ] }
                        { [ 0  0 ] ,  [ 0  0 ] ,  [ 1  0 ] ,  [ 0  1 ] }.

(a) Set A as a linear combination of the basis vectors using unknown scalars x, y, z, t as follows:

    [ 2   3 ]      [ 1  1 ]      [ 1  -1 ]      [ 1  -1 ]      [ 1  0 ]      [ x + z + t   x - y - z ]
    [ 4  -7 ] =  x [ 1  1 ]  + y [ 1   0 ]  + z [ 0   0 ]  + t [ 0  0 ]  =   [ x + y       x         ]

Set corresponding entries equal to each other to obtain the system

    x + z + t = 2,      x - y - z = 3,      x + y = 4,      x = -7

Solving the system yields x = -7, y = 11, z = -21, t = 30. Thus, [A]_S = [-7, 11, -21, 30]. (Note that
the coordinate vector of A is a vector in R^4, because dim M = 4.)

(b) Expressing A as a linear combination of the basis matrices yields

    [ 2   3 ]      [ 1  0 ]      [ 0  1 ]      [ 0  0 ]      [ 0  0 ]      [ x  y ]
    [ 4  -7 ] =  x [ 0  0 ]  + y [ 0  0 ]  + z [ 1  0 ]  + t [ 0  1 ]  =   [ z  t ]

Thus, x = 2, y = 3, z = 4, t = -7. Hence, [A] = [2, 3, 4, -7], whose components are the elements of A
written row by row.
Remark: This result is true in general; that is, if A is any m n matrix in M ¼ Mm;n, then the
coordinates of A relative to the usual basis of M are the elements of A written row by row.
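In code this remark is simply a row-major flattening of the matrix, as the short NumPy sketch below illustrates for the matrix A of Problem 4.64.

```python
import numpy as np

A = np.array([[2, 3],
              [4, -7]])

# Relative to the usual basis of M_{2,2}, the coordinate vector is A read row by row.
print(A.flatten())        # [ 2  3  4 -7]
```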
4.65. In the space M = M_{2,3}, determine whether or not the following matrices are linearly dependent:

    A = [ 1  2  -3 ]        B = [  2  4  -7 ]        C = [ 1  2  -5 ]
        [ 4  0   5 ],           [ 10  1  13 ],           [ 8  2  11 ]

If the matrices are linearly dependent, find the dimension and a basis of the subspace W of M
spanned by the matrices.

The coordinate vectors of the above matrices relative to the usual basis of M are as follows:

    [A] = [1, 2, -3, 4, 0, 5],    [B] = [2, 4, -7, 10, 1, 13],    [C] = [1, 2, -5, 8, 2, 11]

Form the matrix M whose rows are the above coordinate vectors, and reduce M to echelon form:

    M = [ 1  2  -3   4  0   5 ]      [ 1  2  -3  4  0  5 ]
        [ 2  4  -7  10  1  13 ]  ~   [ 0  0  -1  2  1  3 ]
        [ 1  2  -5   8  2  11 ]      [ 0  0   0  0  0  0 ]

Because the echelon matrix has only two nonzero rows, the coordinate vectors [A], [B], [C] span a space of
dimension two, and so they are linearly dependent. Thus, A, B, C are linearly dependent. Furthermore,
dim W = 2, and the matrices

    w1 = [ 1  2  -3 ]        and        w2 = [ 0  0  -1 ]
         [ 4  0   5 ]                        [ 2  1   3 ]

corresponding to the nonzero rows of the echelon matrix form a basis of W.
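The same test can be run by machine: form the matrix of coordinate vectors (entries as restored above) and compute its rank. The SymPy sketch below does this; the nonzero rows of its reduced form give a basis of W, though generally a different one from w1, w2.

```python
from sympy import Matrix

# Coordinate vectors of A, B, C relative to the usual basis of M_{2,3}, row by row.
M = Matrix([[1, 2, -3,  4, 0,  5],
            [2, 4, -7, 10, 1, 13],
            [1, 2, -5,  8, 2, 11]])

R, pivots = M.rref()
print(M.rank())      # 2, so A, B, C are linearly dependent and dim W = 2
print(R[:2, :])      # nonzero rows: coordinate vectors of another basis of W
```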
Miscellaneous Problems
4.66. Consider a finite sequence of vectors S ¼ fv1; v2; . . . ; vng. Let T be the sequence of vectors
obtained from S by one of the following ‘‘elementary operations’’: (i) interchange two vectors,
(ii) multiply a vector by a nonzero scalar, (iii) add a multiple of one vector to another. Show that S
and T span the same space W. Also show that T is independent if and only if S is independent.
Observe that, for each operation, the vectors in T are linear combinations of vectors in S. On the other
hand, each operation has an inverse of the same type (Prove!); hence, the vectors in S are linear combinations
of vectors in T. Thus S and T span the same space W. Also, T is independent if and only if dim W ¼ n, and this
is true if and only if S is also independent.
4.67. Let A ¼ ½aij and B ¼ ½bij be row equivalent m n matrices over a field K, and let v1; . . . ; vn be
any vectors in a vector space V over K. Let
u1 ¼ a11v1 þ a12v2 þ    þ a1nvn
u2 ¼ a21v1 þ a22v2 þ    þ a2nvn
um ¼ am1v1 þ am2v2 þ    þ amnvn
w1 ¼ b11v1 þ b12v2 þ    þ b1nvn
w2 ¼ b21v1 þ b22v2 þ    þ b2nvn
::::::::::::::::::::::::::::::::::::::::::::::::::::: :::::::::::::::::::::::::::::::::::::::::::::::::::::::
wm ¼ bm1v1 þ bm2v2 þ    þ bmnvn
Show that fuig and fwig span the same space.
Applying an ‘‘elementary operation’’ of Problem 4.66 to fuig is equivalent to applying an elementary
row operation to the matrix A. Because A and B are row equivalent, B can be obtained from A by a sequence
of elementary row operations; hence, fwig can be obtained from fuig by the corresponding sequence of
operations. Accordingly, fuig and fwig span the same space.
4.68. Let v1; . . . ; vn belong to a vector space V over K, and let P ¼ ½aij be an n-square matrix over K. Let
w1 ¼ a11v1 þ a12v2 þ    þ a1nvn; . . . ; wn ¼ an1v1 þ an2v2 þ    þ annvn
(a) Suppose P is invertible. Show that fwig and fvig span the same space; hence, fwig is
independent if and only if fvig is independent.
(b) Suppose P is not invertible. Show that fwig is dependent.
(c) Suppose fwig is independent. Show that P is invertible.
(a) Because P is invertible, it is row equivalent to the identity matrix I. Hence, by Problem 4.67, fwig and
fvig span the same space. Thus, one is independent if and only if the other is.
(b) Because P is not invertible, it is row equivalent to a matrix with a zero row. This means that fwig spans
a space that has a spanning set of less than n elements. Thus, fwig is dependent.
(c) This is the contrapositive of the statement of (b), and so it follows from (b).
4.69. Suppose that A1; A2; . . . are linearly independent sets of vectors, and that A1  A2  . . .. Show
that the union A ¼ A1 [ A2 [ . . . is also linearly independent.
Suppose A is linearly dependent. Then there exist vectors v1; . . . ; vn 2 A and scalars a1; . . . ; an 2 K, not
all of them 0, such that
a1v1 þ a2v2 þ    þ anvn ¼ 0 ð1Þ
Because A ¼ [ Ai and the vi 2 A, there exist sets Ai1
; . . . ; Ain
such that
v1 2 Ai1
; v2 2 Ai2
; . . . ; vn 2 Ain
Let k be the maximum index of the sets Aij
: k ¼ maxði1; . . . ; inÞ. It follows then, as A1  A2  . . . ; that
each Aij
is contained in Ak. Hence, v1; v2; . . . ; vn 2 Ak, and so, by (1), Ak is linearly dependent, which
contradicts our hypothesis. Thus, A is linearly independent.
4.70. Let K be a subfield of a field L, and let L be a subfield of a field E. (Thus, K  L  E, and K is a
subfield of E.) Suppose E is of dimension n over L, and L is of dimension m over K. Show that E is
of dimension mn over K.
Suppose fv1; . . . ; vng is a basis of E over L and fa1; . . . ; amg is a basis of L over K. We claim that
faivj : i ¼ 1; . . . ; m; j ¼ 1; . . . ; ng is a basis of E over K. Note that faivjg contains mn elements.
Let w be any arbitrary element in E. Because fv1; . . . ; vng spans E over L, w is a linear combination of
the vi with coefficients in L:
w ¼ b1v1 þ b2v2 þ    þ bnvn; bi 2 L ð1Þ
Because fa1; . . . ; amg spans L over K, each bi 2 L is a linear combination of the aj with coefficients in K:
b1 ¼ k11a1 þ k12a2 þ    þ k1mam
b2 ¼ k21a1 þ k22a2 þ    þ k2mam
::::::::::::::::::::::::::::::::::::::::::::::::::
bn ¼ kn1a1 þ kn2a2 þ    þ knmam
where kij 2 K. Substituting in (1), we obtain
w ¼ ðk11a1 þ    þ k1mamÞv1 þ ðk21a1 þ    þ k2mamÞv2 þ    þ ðkn1a1 þ    þ knmamÞvn
¼ k11a1v1 þ    þ k1mamv1 þ k21a1v2 þ    þ k2mamv2 þ    þ kn1a1vn þ    þ knmamvn
= Σ_{i,j} k_{ji}(a_i v_j)
where kji 2 K. Thus, w is a linear combination of the aivj with coefficients in K; hence, faivjg spans E over
K.
The proof is complete if we show that faivjg is linearly independent over K. Suppose, for scalars
xji ∈ K, we have Σ_{i,j} x_{ji}(a_i v_j) = 0; that is,

    (x11 a1 v1 + x12 a2 v1 + ... + x1m am v1) + ... + (xn1 a1 vn + xn2 a2 vn + ... + xnm am vn) = 0
or
ðx11a1 þ x12a2 þ    þ x1mamÞv1 þ    þ ðxn1a1 þ xn2a2 þ    þ xnmamÞvn ¼ 0
Because fv1; . . . ; vng is linearly independent over L and the above coefficients of the vi belong to L, each
coefficient must be 0:
x11a1 þ x12a2 þ    þ x1mam ¼ 0; . . . ; xn1a1 þ xn2a2 þ    þ xnmam ¼ 0
But fa1; . . . ; amg is linearly independent over K; hence, because the xji 2 K,
x11 ¼ 0; x12 ¼ 0; . . . ; x1m ¼ 0; . . . ; xn1 ¼ 0; xn2 ¼ 0; . . . ; xnm ¼ 0
Accordingly, faivjg is linearly independent over K, and the theorem is proved.
SUPPLEMENTARY PROBLEMS
Vector Spaces
4.71. Suppose u and v belong to a vector space V. Simplify each of the following expressions:
(a) E1 ¼ 4ð5u  6vÞ þ 2ð3u þ vÞ, (c) E3 ¼ 6ð3u þ 2vÞ þ 5u  7v,
(b) E2 ¼ 5ð2u  3vÞ þ 4ð7v þ 8Þ, (d) E4 ¼ 3ð5u þ 2=vÞ:
4.72. Let V be the set of ordered pairs (a; b) of real numbers with addition in V and scalar multiplication on V
defined by
ða; bÞ þ ðc; dÞ ¼ ða þ c; b þ dÞ and kða; bÞ ¼ ðka; 0Þ
Show that V satisfies all the axioms of a vector space except [M4]—that is, except 1u ¼ u. Hence, [M4] is
not a consequence of the other axioms.
4.73. Show that Axiom [A4] of a vector space V (that u þ v ¼ v þ u) can be derived from the other axioms for V.
4.74. Let V be the set of ordered pairs (a; b) of real numbers. Show that V is not a vector space over R with
addition and scalar multiplication defined by
(i) ða; bÞ þ ðc; dÞ ¼ ða þ d; b þ cÞ and kða; bÞ ¼ ðka; kbÞ,
(ii) ða; bÞ þ ðc; dÞ ¼ ða þ c; b þ dÞ and kða; bÞ ¼ ða; bÞ,
(iii) ða; bÞ þ ðc; dÞ ¼ ð0; 0Þ and kða; bÞ ¼ ðka; kbÞ,
(iv) ða; bÞ þ ðc; dÞ ¼ ðac; bdÞ and kða; bÞ ¼ ðka; kbÞ.
4.75. Let V be the set of infinite sequences (a1; a2; . . .) in a field K. Show that V is a vector space over K with
addition and scalar multiplication defined by
ða1; a2; . . .Þ þ ðb1; b2; . . .Þ ¼ ða1 þ b1; a2 þ b2; . . .Þ and kða1; a2; . . .Þ ¼ ðka1; ka2; . . .Þ
4.76. Let U and W be vector spaces over a field K. Let V be the set of ordered pairs (u; w) where u 2 U and
w 2 W. Show that V is a vector space over K with addition in V and scalar multiplication on V defined by
ðu; wÞ þ ðu0
; w0
Þ ¼ ðu þ u0
; w þ w0
Þ and kðu; wÞ ¼ ðku; kwÞ
(This space V is called the external direct product of U and W.)
Subspaces
4.77. Determine whether or not W is a subspace of R3
where W consists of all vectors (a; b; c) in R3
such that
(a) a ¼ 3b, (b) a  b  c, (c) ab ¼ 0, (d) a þ b þ c ¼ 0, (e) b ¼ a2
, ( f ) a ¼ 2b ¼ 3c.
4.78. Let V be the vector space of n-square matrices over a field K. Show that W is a subspace of V if W consists
of all matrices A ¼ ½aij that are
(a) symmetric (AT
¼ A or aij ¼ aji), (b) (upper) triangular, (c) diagonal, (d) scalar.
4.79. Let AX ¼ B be a nonhomogeneous system of linear equations in n unknowns; that is, B 6¼ 0. Show that the
solution set is not a subspace of Kn
.
4.80. Suppose U and W are subspaces of V for which U [ W is a subspace. Show that U  W or W  U.
4.81. Let V be the vector space of all functions from the real field R into R. Show that W is a subspace of V
where W consists of all: (a) bounded functions, (b) even functions. [Recall that f : R ! R is bounded if
9M 2 R such that 8x 2 R, we have j f ðxÞj  M; and f ðxÞ is even if f ðxÞ ¼ f ðxÞ; 8x 2 R.]
4.82. Let V be the vector space (Problem 4.75) of infinite sequences (a1; a2; . . .) in a field K. Show that W is a
subspace of V if W consists of all sequences with (a) 0 as the first element, (b) only a finite number of
nonzero elements.
Linear Combinations, Linear Spans
4.83. Consider the vectors u ¼ ð1; 2; 3Þ and v ¼ ð2; 3; 1Þ in R3
.
(a) Write w ¼ ð1; 3; 8Þ as a linear combination of u and v.
(b) Write w ¼ ð2; 4; 5Þ as a linear combination of u and v.
(c) Find k so that w ¼ ð1; k; 4Þ is a linear combination of u and v.
(d) Find conditions on a, b, c so that w ¼ ða; b; cÞ is a linear combination of u and v.
4.84. Write the polynomial f ðtÞ ¼ at2
þ bt þ c as a linear combination of the polynomials p1 ¼ ðt  1Þ2
,
p2 ¼ t  1, p3 ¼ 1. [Thus, p1, p2, p3 span the space P2ðtÞ of polynomials of degree  2.]
4.85. Find one vector in R3
that spans the intersection of U and W where U is the xy-plane—that is,
U ¼ fða; b; 0Þg—and W is the space spanned by the vectors (1, 1, 1) and (1, 2, 3).
4.86. Prove that span(S) is the intersection of all subspaces of V containing S.
4.87. Show that spanðSÞ ¼ spanðS [ f0gÞ. That is, by joining or deleting the zero vector from a set, we do not
change the space spanned by the set.
4.88. Show that (a) If S  T, then spanðSÞ  spanðTÞ. (b) span½spanðSÞ ¼ spanðSÞ.
Linear Dependence and Linear Independence
4.89. Determine whether the following vectors in R4
are linearly dependent or independent:
(a) ð1; 2; 3; 1Þ, ð3; 7; 1; 2Þ, ð1; 3; 7; 4Þ; (b) ð1; 3; 1; 2Þ, ð2; 5; 1; 3Þ, ð1; 3; 7; 2Þ.
4.90. Determine whether the following polynomials u, v, w in PðtÞ are linearly dependent or independent:
(a) u ¼ t3
 4t2
þ 3t þ 3, v ¼ t3
þ 2t2
þ 4t  1, w ¼ 2t3
 t2
 3t þ 5;
(b) u ¼ t3
 5t2
 2t þ 3, v ¼ t3
 4t2
 3t þ 4, w ¼ 2t3
 17t2
 7t þ 9.
4.91. Show that the following functions f , g, h are linearly independent:
(a) f ðtÞ ¼ et
, gðtÞ ¼ sin t, hðtÞ ¼ t2
; (b) f ðtÞ ¼ et
, gðtÞ ¼ e2t
, hðtÞ ¼ t.
4.92. Show that u ¼ ða; bÞ and v ¼ ðc; dÞ in K2
are linearly dependent if and only if ad  bc ¼ 0.
4.93. Suppose u, v, w are linearly independent vectors. Prove that S is linearly independent where
(a) S ¼ fu þ v  2w; u  v  w; u þ wg; (b) S ¼ fu þ v  3w; u þ 3v  w; v þ wg.
4.94. Suppose fu1; . . . ; ur; w1; . . . ; wsg is a linearly independent subset of V. Show that
spanðuiÞ  spanðwjÞ ¼ f0g
4.95. Suppose v1; v2; . . . ; vn are linearly independent. Prove that S is linearly independent where
(a) S ¼ fa1v1; a2v2; . . . ; anvng and each ai 6¼ 0.
(b) S ¼ fv1; . . . ; vk1; w; vkþ1; . . . ; vng and w ¼
P
i bivi and bk 6¼ 0.
4.96. Suppose ða11; . . . ; a1nÞ; ða21; . . . ; a2nÞ; . . . ; ðam1; . . . ; amnÞ are linearly independent vectors in Kn
, and
suppose v1; v2; . . . ; vn are linearly independent vectors in a vector space V over K. Show that the following
vectors are also linearly independent:
w1 ¼ a11v1 þ    þ a1nvn; w2 ¼ a21v1 þ    þ a2nvn; . . . ; wm ¼ am1v1 þ    þ amnvn
Basis and Dimension
4.97. Find a subset of u1, u2, u3, u4 that gives a basis for W ¼ spanðuiÞ of R5
, where
(a) u1 ¼ ð1; 1; 1; 2; 3Þ, u2 ¼ ð1; 2; 1; 2; 1Þ, u3 ¼ ð3; 5; 1; 2; 5Þ, u4 ¼ ð1; 2; 1; 1; 4Þ
(b) u1 ¼ ð1; 2; 1; 3; 1Þ, u2 ¼ ð2; 4; 2; 6; 2Þ, u3 ¼ ð1; 3; 1; 2; 1Þ, u4 ¼ ð3; 7; 3; 8; 1Þ
(c) u1 ¼ ð1; 0; 1; 0; 1Þ, u2 ¼ ð1; 1; 2; 1; 0Þ, u3 ¼ ð2; 1; 3; 1; 1Þ, u4 ¼ ð1; 2; 1; 1; 1Þ
(d) u1 ¼ ð1; 0; 1; 1; 1Þ, u2 ¼ ð2; 1; 2; 0; 1Þ, u3 ¼ ð1; 1; 2; 3; 4Þ, u4 ¼ ð4; 2; 5; 4; 6Þ
4.98. Consider the subspaces U ¼ fða; b; c; dÞ : b  2c þ d ¼ 0g and W ¼ fða; b; c; dÞ : a ¼ d; b ¼ 2cg of R4
.
Find a basis and the dimension of (a) U, (b) W, (c) U  W.
4.99. Find a basis and the dimension of the solution space W of each of the following homogeneous systems:
ðaÞ x þ 2y  2z þ 2s  t ¼ 0
x þ 2y  z þ 3s  2t ¼ 0
2x þ 4y  7z þ s þ t ¼ 0
ðbÞ x þ 2y  z þ 3s  4t ¼ 0
2x þ 4y  2z  s þ 5t ¼ 0
2x þ 4y  2z þ 4s  2t ¼ 0
4.100. Find a homogeneous system whose solution space is spanned by the following sets of three vectors:
(a) ð1; 2; 0; 3; 1Þ, ð2; 3; 2; 5; 3Þ, ð1; 2; 1; 2; 2Þ;
(b) (1, 1, 2, 1, 1), (1, 2, 1, 4, 3), (3, 5, 4, 9, 7).
4.101. Determine whether each of the following is a basis of the vector space PnðtÞ:
(a) f1; 1 þ t; 1 þ t þ t2
; 1 þ t þ t2
þ t3
; . . . ; 1 þ t þ t2
þ    þ tn1
þ tn
g;
(b) f1 þ t; t þ t2
; t2
þ t3
; . . . ; tn2
þ tn1
; tn1
þ tn
g:
4.102. Find a basis and the dimension of the subspace W of PðtÞ spanned by
(a) u ¼ t3
þ 2t2
 2t þ 1, v ¼ t3
þ 3t2
 3t þ 4, w ¼ 2t3
þ t2
 7t  7,
(b) u ¼ t3
þ t2
 3t þ 2, v ¼ 2t3
þ t2
þ t  4, w ¼ 4t3
þ 3t2
 5t þ 2.
4.103. Find a basis and the dimension of the subspace W of V ¼ M2;2 spanned by
A ¼
1 5
4 2
 
; B ¼
1 1
1 5
 
; C ¼
2 4
5 7
 
; D ¼
1 7
5 1
 
Rank of a Matrix, Row and Column Spaces
4.104. Find the rank of each of the following matrices:
(a)
1 3 2 5 4
1 4 1 3 5
1 4 2 4 3
2 7 3 6 13
2
6
6
4
3
7
7
5, (b)
1 2 3 2
1 3 2 0
3 8 7 2
2 1 9 10
2
6
6
4
3
7
7
5, (c)
1 1 2
4 5 5
5 8 1
1 2 2
2
6
6
4
3
7
7
5
4.105. For k ¼ 1; 2; . . . ; 5, find the number nk of linearly independent subsets consisting of k columns for each of
the following matrices:
(a) A ¼
1 1 0 2 3
1 2 0 2 5
1 3 0 2 7
2
4
3
5, (b) B ¼
1 2 1 0 2
1 2 3 0 4
1 1 5 0 6
2
4
3
5
4.106. Let (a) A ¼
1 2 1 3 1 6
2 4 3 8 3 15
1 2 2 5 3 11
4 8 6 16 7 32
2
6
6
4
3
7
7
5, (b) B ¼
1 2 2 1 2 1
2 4 5 4 5 5
1 2 3 4 4 6
3 6 7 7 9 10
2
6
6
4
3
7
7
5
For each matrix (where C1; . . . ; C6 denote its columns):
(i) Find its row canonical form M.
(ii) Find the columns that are linear combinations of preceding columns.
(iii) Find columns (excluding C6) that form a basis for the column space.
(iv) Express C6 as a linear combination of the basis vectors obtained in (iii).
4.107. Determine which of the following matrices have the same row space:
A ¼
1 2 1
3 4 5
 
; B ¼
1 1 2
2 3 1
 
; C ¼
1 1 3
2 1 10
3 5 1
2
4
3
5
4.108. Determine which of the following subspaces of R3
are identical:
U1 ¼ span½ð1; 1; 1Þ; ð2; 3; 1Þ; ð3; 1; 5Þ; U2 ¼ span½ð1; 1; 3Þ; ð3; 2; 8Þ; ð2; 1; 3Þ
U3 ¼ span½ð1; 1; 1Þ; ð1; 1; 3Þ; ð3; 1; 7Þ
4.109. Determine which of the following subspaces of R4
are identical:
U1 ¼ span½ð1; 2; 1; 4Þ; ð2; 4; 1; 5Þ; ð3; 6; 2; 9Þ; U2 ¼ span½ð1; 2; 1; 2Þ; ð2; 4; 1; 3Þ;
U3 ¼ span½ð1; 2; 3; 10Þ; ð2; 4; 3; 11Þ
4.110. Find a basis for (i) the row space and (ii) the column space of each matrix M:
(a) M ¼
0 0 3 1 4
1 3 1 2 1
3 9 4 5 2
4 12 8 8 7
2
6
6
4
3
7
7
5, (b) M ¼
1 2 1 0 1
1 2 2 1 3
3 6 5 2 7
2 4 1 1 0
2
6
6
4
3
7
7
5.
4.111. Show that if any row is deleted from a matrix in echelon (respectively, row canonical) form, then the
resulting matrix is still in echelon (respectively, row canonical) form.
4.112. Let A and B be arbitrary m n matrices. Show that rankðA þ BÞ  rankðAÞ þ rankðBÞ.
4.113. Let r ¼ rankðA þ BÞ. Find 2 2 matrices A and B such that
(a) r  rankðAÞ, rank(B); (b) r ¼ rankðAÞ ¼ rankðBÞ; (c) r  rankðAÞ, rank(B).
Sums, Direct Sums, Intersections
4.114. Suppose U and W are two-dimensional subspaces of K3
. Show that U  W 6¼ f0g.
4.115. Suppose U and W are subspaces of V such that dim U ¼ 4, dim W ¼ 5, and dim V ¼ 7. Find the possible
dimensions of U  W.
4.116. Let U and W be subspaces of R3
for which dim U ¼ 1, dim W ¼ 2, and U 6 W. Show that R3
¼ U W.
4.117. Consider the following subspaces of R5
:
U ¼ span½ð1; 1; 1; 2; 0Þ; ð1; 2; 2; 0; 3Þ; ð1; 1; 2; 2; 1Þ
W ¼ span½ð1; 2; 3; 0; 2Þ; ð1; 1; 3; 2; 4Þ; ð1; 1; 2; 2; 5Þ
(a) Find two homogeneous systems whose solution spaces are U and W, respectively.
(b) Find a basis and the dimension of U  W.
4.118. Let U1, U2, U3 be the following subspaces of R3
:
U1 ¼ fða; b; cÞ : a ¼ cg; U2 ¼ fða; b; cÞ : a þ b þ c ¼ 0g; U3 ¼ fð0; 0; cÞg
Show that (a) R3
¼ U1 þ U2, (b) R3
¼ U2 þ U3, (c) R3
¼ U1 þ U3. When is the sum direct?
4.119. Suppose U, W1, W2 are subspaces of a vector space V. Show that
ðU  W1Þ þ ðU  W2Þ  U  ðW1 þ W2Þ
Find subspaces of R2
for which equality does not hold.
4.120. Suppose W1; W2; . . . ; Wr are subspaces of a vector space V. Show that
(a) spanðW1; W2; . . . ; WrÞ ¼ W1 þ W2 þ    þ Wr.
(b) If Si spans Wi for i ¼ 1; . . . ; r, then S1 [ S2 [    [ Sr spans W1 þ W2 þ    þ Wr.
4.121. Suppose V ¼ U W. Show that dim V ¼ dim U þ dim W.
4.122. Let S and T be arbitrary nonempty subsets (not necessarily subspaces) of a vector space V and let k be a
scalar. The sum S þ T and the scalar product kS are defined by
S þ T ¼ ðu þ v : u 2 S; v 2 Tg; kS ¼ fku : u 2 Sg
[We also write w þ S for fwg þ S.] Let
S ¼ fð1; 2Þ; ð2; 3Þg; T ¼ fð1; 4Þ; ð1; 5Þ; ð2; 5Þg; w ¼ ð1; 1Þ; k ¼ 3
Find: (a) S þ T, (b) w þ S, (c) kS, (d) kT, (e) kS þ kT, (f) kðS þ TÞ.
4.123. Show that the above operations of S þ T and kS satisfy
(a) Commutative law: S þ T ¼ T þ S.
(b) Associative law: ðS1 þ S2Þ þ S3 ¼ S1 þ ðS2 þ S3Þ.
(c) Distributive law: kðS þ TÞ ¼ kS þ kT.
(d) S þ f0g ¼ f0g þ S ¼ S and S þ V ¼ V þ S ¼ V.
4.124. Let V be the vector space of n-square matrices. Let U be the subspace of upper triangular matrices, and let
W be the subspace of lower triangular matrices. Find (a) U  W, (b) U þ W.
4.125. Let V be the external direct sum of vector spaces U and W over a field K. (See Problem 4.76.) Let
^
U ¼ fðu; 0Þ : u 2 Ug and ^
W ¼ fð0; wÞ : w 2 Wg
Show that (a) ^
U and ^
W are subspaces of V, (b) V ¼ ^
U ^
W.
4.126. Suppose V ¼ U þ W. Let ^
V be the external direct sum of U and W. Show that V is isomorphic to ^
V under
the correspondence v ¼ u þ w $ ðu; wÞ.
4.127. Use induction to prove (a) Theorem 4.22, (b) Theorem 4.23.
Coordinates
4.128. The vectors u1 ¼ ð1; 2Þ and u2 ¼ ð4; 7Þ form a basis S of R2
. Find the coordinate vector ½v of v relative
to S where (a) v ¼ ð5; 3Þ, (b) v ¼ ða; bÞ.
4.129. The vectors u1 ¼ ð1; 2; 0Þ, u2 ¼ ð1; 3; 2Þ, u3 ¼ ð0; 1; 3Þ form a basis S of R3
. Find the coordinate vector ½v
of v relative to S where (a) v ¼ ð2; 7; 4Þ, (b) v ¼ ða; b; cÞ.
4.130. S ¼ ft3
þ t2
; t2
þ t; t þ 1; 1g is a basis of P3ðtÞ. Find the coordinate vector ½v of v relative to S
where (a) v ¼ 2t3
þ t2
 4t þ 2, (b) v ¼ at3
þ bt2
þ ct þ d.
4.131. Let V ¼ M2;2. Find the coordinate vector [A] of A relative to S where
S ¼
1 1
1 1
 
;
1 1
1 0
 
;
1 1
0 0
 
;
1 0
0 0
 
and ðaÞ A ¼
3 5
6 7
 
; ðbÞ A ¼
a b
c d
 
4.132. Find the dimension and a basis of the subspace W of P3ðtÞ spanned by
u ¼ t3
þ 2t2
 3t þ 4; v ¼ 2t3
þ 5t2
 4t þ 7; w ¼ t3
þ 4t2
þ t þ 2
4.133. Find the dimension and a basis of the subspace W of M ¼ M2;3 spanned by
A ¼
1 2 1
3 1 2
 
; B ¼
2 4 3
7 5 6
 
; C ¼
1 2 3
5 7 6
 
Miscellaneous Problems
4.134. Answer true or false. If false, prove it with a counterexample.
(a) If u1, u2, u3 span V, then dim V ¼ 3.
(b) If A is a 4 8 matrix, then any six columns are linearly dependent.
(c) If u1, u2, u3 are linearly independent, then u1, u2, u3, w are linearly dependent.
(d) If u1, u2, u3, u4 are linearly independent, then dim V  4.
(e) If u1, u2, u3 span V, then w, u1, u2, u3 span V.
(f) If u1, u2, u3, u4 are linearly independent, then u1, u2, u3 are linearly independent.
4.135. Answer true or false. If false, prove it with a counterexample.
(a) If any column is deleted from a matrix in echelon form, then the resulting matrix is still in echelon
form.
(b) If any column is deleted from a matrix in row canonical form, then the resulting matrix is still in row
canonical form.
(c) If any column without a pivot is deleted from a matrix in row canonical form, then the resulting matrix
is in row canonical form.
4.136. Determine the dimension of the vector space W of the following n-square matrices:
(a) symmetric matrices, (b) antisymmetric matrices,
(c) diagonal matrices, (d) scalar matrices.
4.137. Let t1; t2; . . . ; tn be symbols, and let K be any field. Let V be the following set of expressions where ai 2 K:
a1t1 þ a2t2 þ    þ antn
Define addition in V and scalar multiplication on V by
(a1 t1 + ... + an tn) + (b1 t1 + ... + bn tn) = (a1 + b1) t1 + ... + (an + bn) tn
kða1t1 þ a2t2 þ    þ antnÞ ¼ ka1t1 þ ka2t2 þ    þ kantn
Show that V is a vector space over K with the above operations. Also, show that ft1; . . . ; tng is a basis of V,
where
tj ¼ 0t1 þ    þ 0tj1 þ 1tj þ 0tjþ1 þ    þ 0tn
ANSWERS TO SUPPLEMENTARY PROBLEMS
[Some answers, such as bases, need not be unique.]
4.71. (a) E1 ¼ 26u  22v; (b) The sum 7v þ 8 is not defined, so E2 is not defined;
(c) E3 ¼ 23u þ 5v; (d) Division by v is not defined, so E4 is not defined.
4.77. (a) Yes; (b) No; e.g., ð1; 2; 3Þ 2 W but 2ð1; 2; 3Þ 62 W;
(c) No; e.g., ð1; 0; 0Þ; ð0; 1; 0Þ 2 W, but not their sum; (d) Yes;
(e) No; e.g., ð1; 1; 1Þ 2 W, but 2ð1; 1; 1Þ 62 W; (f) Yes
4.79. The zero vector 0 is not a solution.
4.83. (a) w ¼ 3u1  u2, (b) Impossible, (c) k ¼ 11
5 , (d) 7a  5b þ c ¼ 0
4.84. Using f ¼ xp1 þ yp2 þ zp3, we get x ¼ a, y ¼ 2a þ b, z ¼ a þ b þ c
4.85. v ¼ ð2; 1; 0Þ
4.89. (a) Dependent, (b) Independent
4.90. (a) Independent, (b) Dependent
4.97. (a) u1, u2, u4; (b) u1, u2, u3; (c) u1, u2, u4; (d) u1, u2, u3
4.98. (a) dim U ¼ 3, (b) dim W ¼ 2, (c) dimðU  WÞ ¼ 1
4.99. (a) Basis: fð2; 1; 0; 0; 0Þ; ð4; 0; 1; 1; 0Þ; ð3; 0; 1; 0; 1Þg; dim W ¼ 3;
(b) Basis: fð2; 1; 0; 0; 0Þ; ð1; 0; 1; 0; 0Þg; dim W ¼ 2
4.100. (a) 5x þ y  z  s ¼ 0; x þ y  z  t ¼ 0;
(b) 3x  y  z ¼ 0; 2x  3y þ s ¼ 0; x  2y þ t ¼ 0
4.101. (a) Yes, (b) No, because dim PnðtÞ ¼ n þ 1, but the set contains only n elements.
4.102. (a) dim W ¼ 2, (b) dim W ¼ 3
4.103. dim W ¼ 2
4.104. (a) 3, (b) 2, (c) 3
4.105. (a) n1 ¼ 4; n2 ¼ 5; n3 ¼ n4 ¼ n5 ¼ 0; (b) n1 ¼ 4; n2 ¼ 6; n3 ¼ 3; n4 ¼ n5 ¼ 0
4.106. (a) (i) M ¼ ½1; 2; 0; 1; 0; 3; 0; 0; 1; 2; 0; 1; 0; 0; 0; 0; 1; 2; 0;
(ii) C2, C4, C6; (iii) C1, C3, C5; (iv) C6 ¼ 3C1 þ C3 þ 2C5.
(b) (i) M ¼ ½1; 2; 0; 0; 3; 1; 0; 0; 1; 0; 1; 1; 0; 0; 0; 1; 1; 2; 0;
(ii) C2, C5, C6; (iii) C1, C3, C4; (iv) C6 ¼ C1  C3 þ 2C4
4.107. A and C are row equivalent to
1 0 7
0 1 4
 
, but not B
4.108. U1 and U2 are row equivalent to
1 0 2
0 1 1
 
, but not U3
4.109. U1 and U3 are row equivalent to
1 2 0 1
0 0 1 3
 
; but not U2
4.110. (a) (i) ð1; 3; 1; 2; 1Þ, ð0; 0; 1; 1; 1Þ, ð0; 0; 0; 4; 7Þ; (ii) C1, C3, C4;
(b) (i) ð1; 2; 1; 0; 1Þ, ð0; 0; 1; 1; 2Þ; (ii) C1, C3
4.113. (a) A ¼
1 1
0 0
 
; B ¼
1 1
0 0
 
; (b) A ¼
1 0
0 0
 
; B ¼
0 2
0 0
 
;
(c) A ¼
1 0
0 0
 
; B ¼
0 0
0 1
 
4.115. dimðU  WÞ ¼ 2, 3, or 4
4.117. (a) (i)
3x þ 4y  z  t ¼ 0
4x þ 2y þ s ¼ 0
(ii)
4x þ 2y  s ¼ 0
9x þ 2y þ z þ t ¼ 0
;
(b) Basis: fð1; 2; 5; 0; 0Þ; ð0; 0; 1; 0; 1Þg; dimðU  WÞ ¼ 2
4.118. The sum is direct in (b) and (c).
4.119. In R2
, let U, V, W be, respectively, the line y ¼ x, the x-axis, the y-axis.
4.122. (a) fð2; 6Þ; ð2; 7Þ; ð3; 7Þ; ð3; 8Þ; ð4; 8Þg; (b) fð2; 3Þ; ð3; 4Þg;
(c) fð3; 6Þ; ð6; 9Þg; (d) fð3; 12Þ; ð3; 15Þ; ð6; 15Þg;
(e and f) fð6; 18Þ; ð6; 21Þ; ð9; 21Þ; ð9; 24Þ; ð12; 24Þg
4.124. (a) Diagonal matrices, (b) V
4.128. (a) [41; 11], (b) [7a  4b; 2a þ b]
4.129. (a) [11; 13; 10], (b) [c  3b þ 7a; c þ 3b  6a; c  2b þ 4a]
4.130. (a) [2; 1; 2; 2], (b) [a; b  c; c  b þ a; d  c þ b  a]
4.131. (a) [7; 1; 13; 10], (b) [d; c  d; b þ c  2d; a  b  2c þ 2d]
4.132. dim W ¼ 2; basis: ft3
þ 2t2
 3t þ 4; t2
þ 2t  1g
4.133. dim W ¼ 2; basis: f½1; 2; 1; 3; 1; 2; ½0; 0; 1; 1; 3; 2g
4.134. (a) False; (1, 1), (1, 2), (2, 1) span R2
; (b) True;
(c) False; (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), w ¼ ð0; 0; 0; 1Þ;
(d) True; (e) True; (f) True
4.135. (a) True; (b) False—e.g., delete C2 from the row canonical matrix with rows (1, 0, 3) and (0, 1, 2); (c) True
4.136. (a) n(n + 1)/2, (b) n(n - 1)/2, (c) n, (d) 1
CHAPTER 5  Linear Mappings
5.1 Introduction
The main subject matter of linear algebra is the study of linear mappings and their representation by
means of matrices. This chapter introduces us to these linear maps and Chapter 6 shows how they can be
represented by matrices. First, however, we begin with a study of mappings in general.
5.2 Mappings, Functions
Let A and B be arbitrary nonempty sets. Suppose to each element in a 2 A there is assigned a unique
element of B; called the image of a. The collection f of such assignments is called a mapping (or map)
from A into B, and it is denoted by
f : A ! B
The set A is called the domain of the mapping, and B is called the target set. We write f ðaÞ, read ‘‘f of a;’’
for the unique element of B that f assigns to a 2 A.
One may also view a mapping f : A ! B as a computer that, for each input value a 2 A, produces a
unique output f ðaÞ 2 B.
Remark: The term function is used synonymously with the word mapping, although some texts
reserve the word ‘‘function’’ for a real-valued or complex-valued mapping.
Consider a mapping f : A ! B. If A0
is any subset of A, then f ðA0
Þ denotes the set of images of
elements of A0
; and if B0
is any subset of B, then f 1
ðB0
Þ denotes the set of elements of A; each of whose
image lies in B. That is,
f ðA0
Þ ¼ f f ðaÞ : a 2 A0
g and f 1
ðB0
Þ ¼ fa 2 A : f ðaÞ 2 B0
g
We call f ðA0
) the image of A0
and f 1
ðB0
Þ the inverse image or preimage of B0
. In particular, the set of all
images (i.e., f ðAÞ) is called the image or range of f.
To each mapping f : A ! B there corresponds the subset of A B given by fða; f ðaÞÞ : a 2 Ag. We
call this set the graph of f . Two mappings f : A ! B and g : A ! B are defined to be equal, written
f ¼ g, if f ðaÞ ¼ gðaÞ for every a 2 A—that is, if they have the same graph. Thus, we do not distinguish
between a function and its graph. The negation of f ¼ g is written f 6¼ g and is the statement:
There exists an a 2 A for which f ðaÞ 6¼ gðaÞ:
Sometimes the ‘‘barred’’ arrow 7! is used to denote the image of an arbitrary element x 2 A under a
mapping f : A ! B by writing
x 7! f ðxÞ
This is illustrated in the following example.
EXAMPLE 5.1
(a) Let f : R ! R be the function that assigns to each real number x its square x2
. We can denote this function by
writing
f ðxÞ ¼ x2
or x 7! x2
Here the image of 3 is 9, so we may write f(3) = 9. However, f^{-1}(9) = {3, -3}. Also,
f(R) = [0, ∞) = {x : x ≥ 0} is the image of f.
(b) Let A ¼ fa; b; c; dg and B ¼ fx; y; z; tg. Then the following defines a mapping f : A ! B:
f ðaÞ ¼ y; f ðbÞ ¼ x; f ðcÞ ¼ z; f ðdÞ ¼ y or f ¼ fða; yÞ; ðb; xÞ; ðc; zÞ; ðd; yÞg
The first defines the mapping explicitly, and the second defines the mapping by its graph. Here,
f ðfa; b; dgÞ ¼ f f ðaÞ; f ðbÞ; f ðdÞg ¼ fy; x; yg ¼ fx; yg
Furthermore, f ðAÞ ¼ fx; y; zg is the image of f.
EXAMPLE 5.2 Let V be the vector space of polynomials over R, and let pðtÞ ¼ 3t2
 5t þ 2.
(a) The derivative defines a mapping D : V ! V where, for any polynomials f ðtÞ, we have Dð f Þ ¼ df =dt. Thus,
DðpÞ ¼ Dð3t2
 5t þ 2Þ ¼ 6t  5
(b) The integral, say from 0 to 1, defines a mapping J : V ! R. That is, for any polynomial f ðtÞ,
J(f) = ∫_0^1 f(t) dt,   and so   J(p) = ∫_0^1 (3t^2 - 5t + 2) dt = 1/2
Observe that the mapping in (b) is from the vector space V into the scalar field R, whereas the mapping in (a) is from
the vector space V into itself.
Matrix Mappings
Let A be any m × n matrix over K. Then A determines a mapping F_A : K^n → K^m by

    F_A(u) = Au

where the vectors in K^n and K^m are written as columns. For example, suppose

    A = [ 1  -4   5 ]        and        u = [  1 ]
        [ 2   3  -6 ]                       [  3 ]
                                            [ -5 ]
then

    F_A(u) = Au = [ 1  -4   5 ] [  1 ]      [ -36 ]
                  [ 2   3  -6 ] [  3 ]  =   [  41 ]
                                [ -5 ]
Remark: For notational convenience, we will frequently denote the mapping FA by the letter A, the
same symbol as used for the matrix.
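In code, applying the matrix mapping F_A is just a matrix–vector product; a minimal NumPy sketch for the example above (with the signs as restored there):

```python
import numpy as np

A = np.array([[1, -4,  5],
              [2,  3, -6]])
u = np.array([1, 3, -5])

print(A @ u)    # [-36  41], i.e., F_A(u) = Au
```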
Composition of Mappings
Consider two mappings f : A ! B and g : B ! C, illustrated below:
    A --f--> B --g--> C

The composition of f and g, denoted by g ∘ f, is the mapping g ∘ f : A → C defined by

    (g ∘ f)(a) ≡ g(f(a))
That is, first we apply f to a 2 A, and then we apply g to f ðaÞ 2 B to get gð f ðaÞÞ 2 C. Viewing f and g
as ‘‘computers,’’ the composition means we first input a 2 A to get the output f ðaÞ 2 B using f , and then
we input f ðaÞ to get the output gð f ðaÞÞ 2 C using g.
Our first theorem tells us that the composition of mappings satisfies the associative law.
THEOREM 5.1: Let f : A ! B, g : B ! C, h : C ! D. Then
h  ðg  f Þ ¼ ðh  gÞ  f
We prove this theorem here. Let a 2 A. Then
ðh  ðg  f ÞÞðaÞ ¼ hððg  f ÞðaÞÞ ¼ hðgð f ðaÞÞÞ
ððh  gÞ  f ÞðaÞ ¼ ðh  gÞð f ðaÞÞ ¼ hðgð f ðaÞÞÞ
Thus, ðh  ðg  f ÞÞðaÞ ¼ ððh  gÞ  f ÞðaÞ for every a 2 A, and so h  ðg  f Þ ¼ ðh  gÞ  f.
One-to-One and Onto Mappings
We formally introduce some special types of mappings.
DEFINITION: A mapping f : A ! B is said to be one-to-one (or 1-1 or injective) if different elements
of A have distinct images; that is,
If f ðaÞ ¼ f ða0
Þ; then a ¼ a0
:
DEFINITION: A mapping f : A ! B is said to be onto (or f maps A onto B or surjective) if every b 2 B
is the image of at least one a 2 A.
DEFINITION: A mapping f : A ! B is said to be a one-to-one correspondence between A and B (or
bijective) if f is both one-to-one and onto.
EXAMPLE 5.3 Let f : R → R, g : R → R, h : R → R be defined by

    f(x) = 2^x,      g(x) = x^3 - x,      h(x) = x^2

The graphs of these functions are shown in Fig. 5-1. The function f is one-to-one. Geometrically, this means
that each horizontal line does not contain more than one point of f. The function g is onto. Geometrically,
this means that each horizontal line contains at least one point of g. The function h is neither one-to-one nor
onto. For example, both 2 and -2 have the same image 4, and -16 has no preimage.
Identity and Inverse Mappings
Let A be any nonempty set. The mapping f : A ! A defined by f ðaÞ ¼ a—that is, the function that
assigns to each element in A itself—is called identity mapping. It is usually denoted by 1A or 1 or I. Thus,
for any a 2 A, we have 1AðaÞ ¼ a.
Figure 5-1
Now let f : A ! B. We call g : B ! A the inverse of f, written f 1
, if
f  g ¼ 1B and g  f ¼ 1A
We emphasize that f has an inverse if and only if f is a one-to-one correspondence between A and B; that
is, f is one-to-one and onto (Problem 5.7). Also, if b 2 B, then f 1
ðbÞ ¼ a, where a is the unique element
of A for which f ðaÞ ¼ b
5.3 Linear Mappings (Linear Transformations)
We begin with a definition.
DEFINITION: Let V and U be vector spaces over the same field K. A mapping F : V ! U is called a
linear mapping or linear transformation if it satisfies the following two conditions:
(1) For any vectors v; w 2 V, Fðv þ wÞ ¼ FðvÞ þ FðwÞ.
(2) For any scalar k and vector v 2 V, FðkvÞ ¼ kFðvÞ.
Namely, F : V ! U is linear if it ‘‘preserves’’ the two basic operations of a vector space, that of
vector addition and that of scalar multiplication.
Substituting k ¼ 0 into condition (2), we obtain Fð0Þ ¼ 0. Thus, every linear mapping takes the zero
vector into the zero vector.
Now for any scalars a; b 2 K and any vector v; w 2 V, we obtain
Fðav þ bwÞ ¼ FðavÞ þ FðbwÞ ¼ aFðvÞ þ bFðwÞ
More generally, for any scalars ai 2 K and any vectors vi 2 V, we obtain the following basic property of
linear mappings:
Fða1v1 þ a2v2 þ    þ amvmÞ ¼ a1Fðv1Þ þ a2Fðv2Þ þ    þ amFðvmÞ
Remark 1: A linear mapping F : V ! U is completely characterized by the condition
Fðav þ bwÞ ¼ aFðvÞ þ bFðwÞ ð*Þ
and so this condition is sometimes used as its defintion.
Remark 2: The term linear transformation rather than linear mapping is frequently used for linear
mappings of the form F : Rn
! Rm
.
EXAMPLE 5.4
(a) Let F : R3
! R3
be the ‘‘projection’’ mapping into the xy-plane; that is, F is the mapping defined by
Fðx; y; zÞ ¼ ðx; y; 0Þ. We show that F is linear. Let v ¼ ða; b; cÞ and w ¼ ða0
; b0
; c0
Þ. Then
Fðv þ wÞ ¼ Fða þ a0
; b þ b0
; c þ c0
Þ ¼ ða þ a0
; b þ b0
; 0Þ
¼ ða; b; 0Þ þ ða0
; b0
; 0Þ ¼ FðvÞ þ FðwÞ
and, for any scalar k,
FðkvÞ ¼ Fðka; kb; kcÞ ¼ ðka; kb; 0Þ ¼ kða; b; 0Þ ¼ kFðvÞ
Thus, F is linear.
(b) Let G : R2
! R2
be the ‘‘translation’’ mapping defined by Gðx; yÞ ¼ ðx þ 1; y þ 2Þ. [That is, G adds the vector
(1, 2) to any vector v ¼ ðx; yÞ in R2
.] Note that
Gð0Þ ¼ Gð0; 0Þ ¼ ð1; 2Þ 6¼ 0
Thus, the zero vector is not mapped into the zero vector. Hence, G is not linear.
EXAMPLE 5.5 (Derivative and Integral Mappings) Consider the vector space V ¼ PðtÞ of polynomials over the
real field R. Let uðtÞ and vðtÞ be any polynomials in V and let k be any scalar.
(a) Let D : V ! V be the derivative mapping. One proves in calculus that
    d(u + v)/dt = du/dt + dv/dt      and      d(ku)/dt = k (du/dt)
That is, Dðu þ vÞ ¼ DðuÞ þ DðvÞ and DðkuÞ ¼ kDðuÞ. Thus, the derivative mapping is linear.
(b) Let J : V ! R be an integral mapping, say
    J(f(t)) = ∫_0^1 f(t) dt

One also proves in calculus that

    ∫_0^1 [u(t) + v(t)] dt = ∫_0^1 u(t) dt + ∫_0^1 v(t) dt      and      ∫_0^1 k u(t) dt = k ∫_0^1 u(t) dt
That is, Jðu þ vÞ ¼ JðuÞ þ JðvÞ and JðkuÞ ¼ kJðuÞ. Thus, the integral mapping is linear.
EXAMPLE 5.6 (Zero and Identity Mappings)
(a) Let F : V ! U be the mapping that assigns the zero vector 0 2 U to every vector v 2 V. Then, for any vectors
v; w 2 V and any scalar k 2 K, we have
Fðv þ wÞ ¼ 0 ¼ 0 þ 0 ¼ FðvÞ þ FðwÞ and FðkvÞ ¼ 0 ¼ k0 ¼ kFðvÞ
Thus, F is linear. We call F the zero mapping, and we usually denote it by 0.
(b) Consider the identity mapping I : V ! V, which maps each v 2 V into itself. Then, for any vectors v; w 2 V
and any scalars a; b 2 K, we have
Iðav þ bwÞ ¼ av þ bw ¼ aIðvÞ þ bIðwÞ
Thus, I is linear.
Our next theorem (proved in Problem 5.13) gives us an abundance of examples of linear mappings. In
particular, it tells us that a linear mapping is completely determined by its values on the elements of a basis.
THEOREM 5.2: Let V and U be vector spaces over a field K. Let fv1; v2; . . . ; vng be a basis of V and
let u1; u2; . . . ; un be any vectors in U. Then there exists a unique linear mapping
F : V ! U such that Fðv1Þ ¼ u1; Fðv2Þ ¼ u2; . . . ; FðvnÞ ¼ un.
We emphasize that the vectors u1; u2; . . . ; un in Theorem 5.2 are completely arbitrary; they may be
linearly dependent or they may even be equal to each other.
Matrices as Linear Mappings
Let A be any real m n matrix. Recall that A determines a mapping FA : Kn
! Km
by FAðuÞ ¼ Au
(where the vectors in Kn
and Km
are written as columns). We show FA is linear. By matrix multiplication,
FAðv þ wÞ ¼ Aðv þ wÞ ¼ Av þ Aw ¼ FAðvÞ þ FAðwÞ
FAðkvÞ ¼ AðkvÞ ¼ kðAvÞ ¼ kFAðvÞ
In other words, using A to represent the mapping, we have
Aðv þ wÞ ¼ Av þ Aw and AðkvÞ ¼ kðAvÞ
Thus, the matrix mapping A is linear.
Vector Space Isomorphism
The notion of two vector spaces being isomorphic was defined in Chapter 4 when we investigated the
coordinates of a vector relative to a basis. We now redefine this concept.
DEFINITION: Two vector spaces V and U over K are isomorphic, written V ffi U, if there exists a
bijective (one-to-one and onto) linear mapping F : V ! U. The mapping F is then
called an isomorphism between V and U.
Consider any vector space V of dimension n and let S be any basis of V. Then the mapping
v 7! ½vS
which maps each vector v 2 V into its coordinate vector ½vS, is an isomorphism between V and Kn
.
5.4 Kernel and Image of a Linear Mapping
We begin by defining two concepts.
DEFINITION: Let F : V ! U be a linear mapping. The kernel of F, written Ker F, is the set of
elements in V that map into the zero vector 0 in U; that is,
Ker F ¼ fv 2 V : FðvÞ ¼ 0g
The image (or range) of F, written Im F, is the set of image points in U; that is,
Im F ¼ fu 2 U : there exists v 2 V for which FðvÞ ¼ ug
The following theorem is easily proved (Problem 5.22).
THEOREM 5.3: Let F : V ! U be a linear mapping. Then the kernel of F is a subspace of V and the
image of F is a subspace of U.
Now suppose that v1; v2; . . . ; vm span a vector space V and that F : V ! U is linear. We show that
Fðv1Þ; Fðv2Þ; . . . ; FðvmÞ span Im F. Let u 2 Im F. Then there exists v 2 V such that FðvÞ ¼ u. Because
the vi’s span V and v 2 V, there exist scalars a1; a2; . . . ; am for which
v ¼ a1v1 þ a2v2 þ    þ amvm
Therefore,
u ¼ FðvÞ ¼ Fða1v1 þ a2v2 þ    þ amvmÞ ¼ a1Fðv1Þ þ a2Fðv2Þ þ    þ amFðvmÞ
Thus, the vectors Fðv1Þ; Fðv2Þ; . . . ; FðvmÞ span Im F.
We formally state the above result.
PROPOSITION 5.4: Suppose v1; v2; . . . ; vm span a vector space V, and suppose F : V ! U is linear.
Then Fðv1Þ; Fðv2Þ; . . . ; FðvmÞ span Im F.
EXAMPLE 5.7
(a) Let F : R3
! R3
be the projection of a vector v into the xy-plane [as pictured in Fig. 5-2(a)]; that is,
Fðx; y; zÞ ¼ ðx; y; 0Þ
Clearly the image of F is the entire xy-plane—that is, points of the form (x; y; 0). Moreover, the kernel of F is
the z-axis—that is, points of the form (0; 0; c). That is,
Im F ¼ fða; b; cÞ : c ¼ 0g ¼ xy-plane and Ker F ¼ fða; b; cÞ : a ¼ 0; b ¼ 0g ¼ z-axis
(b) Let G : R^3 → R^3 be the linear mapping that rotates a vector v about the z-axis through an angle θ [as pictured in
Fig. 5-2(b)]; that is,

    G(x, y, z) = (x cos θ - y sin θ, x sin θ + y cos θ, z)

Observe that the distance of a vector v from the origin O does not change under the rotation, and so only the zero
vector 0 is mapped into the zero vector 0. Thus, Ker G = {0}. On the other hand, every vector u in R^3 is the image
of a vector v in R^3 that can be obtained by rotating u back by an angle of θ. Thus, Im G = R^3, the entire space.
EXAMPLE 5.8 Consider the vector space V ¼ PðtÞ of polynomials over the real field R, and let H : V ! V be the
third-derivative operator; that is, H½ f ðtÞ ¼ d3
f =dt3
. [Sometimes the notation D3
is used for H, where D is the
derivative operator.] We claim that
Ker H ¼ fpolynomials of degree  2g ¼ P2ðtÞ and Im H ¼ V
The first comes from the fact that Hðat2
þ bt þ cÞ ¼ 0 but Hðtn
Þ 6¼ 0 for n  3. The second comes from that fact
that every polynomial gðtÞ in V is the third derivative of some polynomial f ðtÞ (which can be obtained by taking the
antiderivative of gðtÞ three times).
Kernel and Image of Matrix Mappings
Consider, say, a 3 × 4 matrix A and the usual basis {e1, e2, e3, e4} of K^4 (written as columns):

    A = [ a1 a2 a3 a4 ]
        [ b1 b2 b3 b4 ]
        [ c1 c2 c3 c4 ]

    e1 = [1, 0, 0, 0]^T,  e2 = [0, 1, 0, 0]^T,  e3 = [0, 0, 1, 0]^T,  e4 = [0, 0, 0, 1]^T

Recall that A may be viewed as a linear mapping A: K^4 → K^3, where the vectors in K^4 and K^3 are
viewed as column vectors. Now the usual basis vectors span K^4, so their images Ae1, Ae2, Ae3, Ae4 span
the image of A. But the vectors Ae1, Ae2, Ae3, Ae4 are precisely the columns of A:
    Ae1 = [a1, b1, c1]^T,  Ae2 = [a2, b2, c2]^T,  Ae3 = [a3, b3, c3]^T,  Ae4 = [a4, b4, c4]^T
Thus, the image of A is precisely the column space of A.
On the other hand, the kernel of A consists of all vectors v for which Av = 0. This means that the
kernel of A is the solution space of the homogeneous system AX = 0, called the null space of A.
We state the above results formally.
PROPOSITION 5.5: Let A be any m × n matrix over a field K viewed as a linear map A: K^n → K^m. Then
    Ker A = nullsp(A)   and   Im A = colsp(A)
Here colsp(A) denotes the column space of A, and nullsp(A) denotes the null space of A.
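As an aside (not from the text), Proposition 5.5 is easy to check with a computer algebra system. The following Python/SymPy sketch uses an arbitrary 3 × 3 matrix; the library calls columnspace(), nullspace(), and rank() are standard SymPy methods.

    from sympy import Matrix

    A = Matrix([[1, 2, 1],
                [2, 4, 3],
                [3, 6, 4]])          # viewed as a linear map A : K^3 -> K^3

    print(A.columnspace())           # a basis of Im A = colsp(A), as column vectors
    print(A.nullspace())             # a basis of Ker A = nullsp(A)
    print(A.rank(), len(A.nullspace()))   # rank + nullity = 3, the number of columns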
Figure 5-2
Rank and Nullity of a Linear Mapping
Let F: V → U be a linear mapping. The rank of F is defined to be the dimension of its image, and the
nullity of F is defined to be the dimension of its kernel; namely,
    rank(F) = dim(Im F)   and   nullity(F) = dim(Ker F)
The following important theorem (proved in Problem 5.23) holds.
THEOREM 5.6: Let V be of finite dimension, and let F: V → U be linear. Then
    dim V = dim(Ker F) + dim(Im F) = nullity(F) + rank(F)
Recall that the rank of a matrix A was also defined to be the dimension of its column space and row
space. If we now view A as a linear mapping, then both definitions correspond, because the image of A is
precisely its column space.
EXAMPLE 5.9 Let F: R^4 → R^3 be the linear mapping defined by
    F(x, y, z, t) = (x - y + z + t, 2x - 2y + 3z + 4t, 3x - 3y + 4z + 5t)
(a) Find a basis and the dimension of the image of F.
    First find the image of the usual basis vectors of R^4:
        F(1, 0, 0, 0) = (1, 2, 3),      F(0, 0, 1, 0) = (1, 3, 4)
        F(0, 1, 0, 0) = (-1, -2, -3),   F(0, 0, 0, 1) = (1, 4, 5)
    By Proposition 5.4, the image vectors span Im F. Hence, form the matrix M whose rows are these image vectors
    and row reduce to echelon form:

        M = [ 1  2  3 ]     [ 1  2  3 ]     [ 1  2  3 ]
            [-1 -2 -3 ]  ~  [ 0  0  0 ]  ~  [ 0  1  1 ]
            [ 1  3  4 ]     [ 0  1  1 ]     [ 0  0  0 ]
            [ 1  4  5 ]     [ 0  2  2 ]     [ 0  0  0 ]

    Thus, (1, 2, 3) and (0, 1, 1) form a basis of Im F. Hence, dim(Im F) = 2 and rank(F) = 2.
(b) Find a basis and the dimension of the kernel of the map F.
    Set F(v) = 0, where v = (x, y, z, t),
        F(x, y, z, t) = (x - y + z + t, 2x - 2y + 3z + 4t, 3x - 3y + 4z + 5t) = (0, 0, 0)
    Set corresponding components equal to each other to form the following homogeneous system whose solution
    space is Ker F:
        x - y + z + t = 0               x - y + z + t = 0            x - y + z + t = 0
        2x - 2y + 3z + 4t = 0    or         z + 2t = 0        or         z + 2t = 0
        3x - 3y + 4z + 5t = 0               z + 2t = 0
    The free variables are y and t. Hence, dim(Ker F) = 2 or nullity(F) = 2.
    (i)  Set y = 1, t = 0 to obtain the solution (1, 1, 0, 0),
    (ii) Set y = 0, t = 1 to obtain the solution (1, 0, -2, 1).
    Thus, (1, 1, 0, 0) and (1, 0, -2, 1) form a basis for Ker F.
As expected from Theorem 5.6, dim(Im F) + dim(Ker F) = 4 = dim R^4.
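As a supplementary check (not part of the original example), the computations above can be verified symbolically. The sketch below, assuming Python with SymPy, uses the matrix of F in the usual bases.

    from sympy import Matrix

    # Matrix of F(x, y, z, t) = (x - y + z + t, 2x - 2y + 3z + 4t, 3x - 3y + 4z + 5t)
    A = Matrix([[1, -1, 1, 1],
                [2, -2, 3, 4],
                [3, -3, 4, 5]])

    print(A.rank())                        # 2 = dim(Im F) = rank(F)
    print(len(A.nullspace()))              # 2 = dim(Ker F) = nullity(F)
    print(A.rank() + len(A.nullspace()))   # 4 = dim R^4, as Theorem 5.6 predicts
    print(A * Matrix([1, 1, 0, 0]),        # both products are zero, confirming that the
          A * Matrix([1, 0, -2, 1]))       # kernel basis found above is correct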
Application to Systems of Linear Equations
Let AX = B denote the matrix form of a system of m linear equations in n unknowns. Now the matrix A
may be viewed as a linear mapping
    A: K^n → K^m
Thus, the solution of the equation AX = B may be viewed as the preimage of the vector B ∈ K^m under the
linear mapping A. Furthermore, the solution of the associated homogeneous system
    AX = 0
may be viewed as the kernel of the linear mapping A. Applying Theorem 5.6 to this homogeneous system
yields
    dim(Ker A) = dim K^n - dim(Im A) = n - rank A
But n is exactly the number of unknowns in the homogeneous system AX = 0. Thus, we have proved the
following theorem of Chapter 4.
THEOREM 4.19: The dimension of the solution space W of a homogeneous system AX = 0 of linear
equations is s = n - r, where n is the number of unknowns and r is the rank of the
coefficient matrix A.
Observe that r is also the number of pivot variables in an echelon form of AX = 0, so s = n - r is also
the number of free variables. Furthermore, the s solution vectors of AX = 0 described in Theorem 3.14
are linearly independent (Problem 4.52). Accordingly, because dim W = s, they form a basis for the
solution space W. Thus, we have also proved Theorem 3.14.
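To make the "preimage" point of view concrete, here is a small illustrative Python/NumPy sketch (the system and the kernel vector are arbitrary choices, not taken from the text): the full solution set of AX = B is one particular solution shifted by anything in Ker A.

    import numpy as np

    A = np.array([[1., 1., 1.],
                  [2., 3., 5.]])      # 2 equations, 3 unknowns, rank 2
    B = np.array([3., 10.])

    # One particular solution (the system is consistent, so the least-squares
    # solution solves it exactly)
    x_p, *_ = np.linalg.lstsq(A, B, rcond=None)

    x_h = np.array([2., -3., 1.])     # chosen so that A @ x_h = 0, i.e. x_h is in Ker A

    print(np.allclose(A @ x_p, B))              # True
    print(np.allclose(A @ x_h, 0))              # True
    print(np.allclose(A @ (x_p + 5 * x_h), B))  # True: the preimage of B is x_p + Ker A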
5.5 Singular and Nonsingular Linear Mappings, Isomorphisms
Let F: V → U be a linear mapping. Recall that F(0) = 0. F is said to be singular if the image of some
nonzero vector v is 0—that is, if there exists v ≠ 0 such that F(v) = 0. Thus, F: V → U is nonsingular if
the zero vector 0 is the only vector whose image under F is 0 or, in other words, if Ker F = {0}.

EXAMPLE 5.10 Consider the projection map F: R^3 → R^3 and the rotation map G: R^3 → R^3 appearing in
Fig. 5-2. (See Example 5.7.) Because the kernel of F is the z-axis, F is singular. On the other hand, the kernel of G
consists only of the zero vector 0. Thus, G is nonsingular.

Nonsingular linear mappings may also be characterized as those mappings that carry independent sets
into independent sets. Specifically, we prove (Problem 5.28) the following theorem.
THEOREM 5.7: Let F: V → U be a nonsingular linear mapping. Then the image of any linearly
independent set is linearly independent.

Isomorphisms
Suppose a linear mapping F: V → U is one-to-one. Then only 0 ∈ V can map into 0 ∈ U, and so F is
nonsingular. The converse is also true. For suppose F is nonsingular and F(v) = F(w); then
F(v - w) = F(v) - F(w) = 0, and hence, v - w = 0 or v = w. Thus, F(v) = F(w) implies v = w—
that is, F is one-to-one. We have proved the following proposition.
PROPOSITION 5.8: A linear mapping F: V → U is one-to-one if and only if F is nonsingular.
Recall that a mapping F: V → U is called an isomorphism if F is linear and if F is bijective (i.e., if F
is one-to-one and onto). Also, recall that a vector space V is said to be isomorphic to a vector space U,
written V ≅ U, if there is an isomorphism F: V → U.
The following theorem (proved in Problem 5.29) applies.
THEOREM 5.9: Suppose V has finite dimension and dim V = dim U. Suppose F: V → U is linear.
Then F is an isomorphism if and only if F is nonsingular.
5.6 Operations with Linear Mappings
We are able to combine linear mappings in various ways to obtain new linear mappings. These operations
are very important and will be used throughout the text.
Let F: V → U and G: V → U be linear mappings over a field K. The sum F + G and the scalar
product kF, where k ∈ K, are defined to be the following mappings from V into U:
    (F + G)(v) ≡ F(v) + G(v)   and   (kF)(v) ≡ kF(v)
We now show that if F and G are linear, then F + G and kF are also linear. Specifically, for any vectors
v, w ∈ V and any scalars a, b ∈ K,
    (F + G)(av + bw) = F(av + bw) + G(av + bw)
                     = aF(v) + bF(w) + aG(v) + bG(w)
                     = a[F(v) + G(v)] + b[F(w) + G(w)]
                     = a(F + G)(v) + b(F + G)(w)
and
    (kF)(av + bw) = kF(av + bw) = k[aF(v) + bF(w)]
                  = akF(v) + bkF(w) = a(kF)(v) + b(kF)(w)
Thus, F + G and kF are linear.
The following theorem holds.
THEOREM 5.10: Let V and U be vector spaces over a field K. Then the collection of all linear
mappings from V into U with the above operations of addition and scalar multi-
plication forms a vector space over K.
The vector space of linear mappings in Theorem 5.10 is usually denoted by
    Hom(V, U)
Here Hom comes from the word ''homomorphism.'' We emphasize that the proof of Theorem 5.10
reduces to showing that Hom(V, U) does satisfy the eight axioms of a vector space. The zero element of
Hom(V, U) is the zero mapping from V into U, denoted by 0 and defined by
    0(v) = 0
for every vector v ∈ V.
Suppose V and U are of finite dimension. Then we have the following theorem.
THEOREM 5.11: Suppose dim V = m and dim U = n. Then dim[Hom(V, U)] = mn.
Composition of Linear Mappings
Now suppose V, U, and W are vector spaces over the same field K, and suppose F: V → U and
G: U → W are linear mappings. We picture these mappings as follows:
    V --F--> U --G--> W
Recall that the composition function G ∘ F is the mapping from V into W defined by
(G ∘ F)(v) = G(F(v)). We show that G ∘ F is linear whenever F and G are linear. Specifically, for
any vectors v, w ∈ V and any scalars a, b ∈ K, we have
    (G ∘ F)(av + bw) = G(F(av + bw)) = G(aF(v) + bF(w))
                     = aG(F(v)) + bG(F(w)) = a(G ∘ F)(v) + b(G ∘ F)(w)
Thus, G ∘ F is linear.
The composition of linear mappings and the operations of addition and scalar multiplication are
related as follows.
THEOREM 5.12: Let V, U, W be vector spaces over K. Suppose the following mappings are linear:
    F: V → U,  F′: V → U   and   G: U → W,  G′: U → W
Then, for any scalar k ∈ K:
(i) G ∘ (F + F′) = G ∘ F + G ∘ F′.
(ii) (G + G′) ∘ F = G ∘ F + G′ ∘ F.
(iii) k(G ∘ F) = (kG) ∘ F = G ∘ (kF).
5.7 Algebra A(V) of Linear Operators
Let V be a vector space over a field K. This section considers the special case of linear mappings from the
vector space V into itself—that is, linear mappings of the form F: V → V. They are also called linear
operators or linear transformations on V. We will write A(V), instead of Hom(V, V), for the space of all
such mappings.
Now A(V) is a vector space over K (Theorem 5.10), and, if dim V = n, then dim A(V) = n^2. Moreover,
for any mappings F, G ∈ A(V), the composition G ∘ F exists and also belongs to A(V). Thus, we have a
''multiplication'' defined in A(V). [We sometimes write FG instead of G ∘ F in the space A(V).]

Remark: An algebra A over a field K is a vector space over K in which an operation of
multiplication is defined satisfying, for every F, G, H ∈ A and every k ∈ K:
(i) F(G + H) = FG + FH,
(ii) (G + H)F = GF + HF,
(iii) k(GF) = (kG)F = G(kF).
The algebra is said to be associative if, in addition, (FG)H = F(GH).
The above definition of an algebra and previous theorems give us the following result.
THEOREM 5.13: Let V be a vector space over K. Then A(V) is an associative algebra over K with
respect to composition of mappings. If dim V = n, then dim A(V) = n^2.
This is why AðVÞ is called the algebra of linear operators on V.
Polynomials and Linear Operators
Observe that the identity mapping I: V → V belongs to A(V). Also, for any linear operator F in A(V),
we have FI = IF = F. We can also form ''powers'' of F. Namely, we define
    F^0 = I,  F^2 = F ∘ F,  F^3 = F^2 ∘ F = F ∘ F ∘ F,  F^4 = F^3 ∘ F,  ...
Furthermore, for any polynomial p(t) over K, say,
    p(t) = a0 + a1 t + a2 t^2 + ... + as t^s
we can form the linear operator p(F) defined by
    p(F) = a0 I + a1 F + a2 F^2 + ... + as F^s
(For any scalar k, the operator kI is sometimes denoted simply by k.) In particular, we say F is a zero of
the polynomial p(t) if p(F) = 0.

EXAMPLE 5.11 Let F: K^3 → K^3 be defined by F(x, y, z) = (0, x, y). For any (a, b, c) ∈ K^3,
    (F + I)(a, b, c) = (0, a, b) + (a, b, c) = (a, a + b, b + c)
    F^3(a, b, c) = F^2(0, a, b) = F(0, 0, a) = (0, 0, 0)
Thus, F^3 = 0, the zero mapping in A(V). This means F is a zero of the polynomial p(t) = t^3.
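As a side illustration (not from the text), operator polynomials can be evaluated on the matrix of F. The Python/NumPy sketch below uses the matrix of the operator in Example 5.11 with respect to the usual basis; the polynomial q(t) is an arbitrary example.

    import numpy as np

    # Matrix of F(x, y, z) = (0, x, y) in the usual basis (columns are F(e1), F(e2), F(e3))
    F = np.array([[0, 0, 0],
                  [1, 0, 0],
                  [0, 1, 0]])

    print(np.linalg.matrix_power(F, 3))    # the zero matrix, so F is a zero of p(t) = t^3

    # The operator q(F) for, say, q(t) = 2 + 3t + t^2
    I = np.eye(3, dtype=int)
    qF = 2 * I + 3 * F + np.linalg.matrix_power(F, 2)
    print(qF)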
Square Matrices as Linear Operators
Let M = M_{n,n} be the vector space of all square n × n matrices over K. Then any matrix A in M defines a
linear mapping F_A: K^n → K^n by F_A(u) = Au (where the vectors in K^n are written as columns). Because the
mapping is from K^n into itself, the square matrix A is a linear operator, not simply a linear mapping.
Suppose A and B are matrices in M. Then the matrix product AB is defined. Furthermore, for any
(column) vector u in K^n,
    F_{AB}(u) = (AB)u = A(Bu) = A(F_B(u)) = F_A(F_B(u)) = (F_A ∘ F_B)(u)
In other words, the matrix product AB corresponds to the composition of A and B as linear mappings.
Similarly, the matrix sum A + B corresponds to the sum of A and B as linear mappings, and the scalar
product kA corresponds to the scalar product of A as a linear mapping.
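The correspondence between matrix operations and operator operations is easy to check numerically. A minimal Python/NumPy sketch follows; the matrices A, B and the vector u are arbitrary choices made for the illustration.

    import numpy as np

    A = np.array([[1, 2], [3, 4]])
    B = np.array([[0, 1], [5, -2]])
    u = np.array([7, -3])

    F_A = lambda x: A @ x
    F_B = lambda x: B @ x

    print(np.array_equal(F_A(F_B(u)), (A @ B) @ u))     # True: composition <-> product AB
    print(np.array_equal(F_A(u) + F_B(u), (A + B) @ u)) # True: sum of operators <-> A + B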
Invertible Operators in A(V)
Let F: V → V be a linear operator. F is said to be invertible if it has an inverse—that is, if there exists
F^-1 in A(V) such that FF^-1 = F^-1F = I. On the other hand, F is invertible as a mapping if F is both
one-to-one and onto. In such a case, F^-1 is also linear and F^-1 is the inverse of F as a linear operator
(proved in Problem 5.15).
Suppose F is invertible. Then only 0 ∈ V can map into 0, and so F is nonsingular. The converse is
not true, as seen by the following example.
EXAMPLE 5.12 Let V = P(t), the vector space of polynomials over K. Let F be the mapping on V that increases
by 1 the exponent of t in each term of a polynomial; that is,
    F(a0 + a1 t + a2 t^2 + ... + as t^s) = a0 t + a1 t^2 + a2 t^3 + ... + as t^(s+1)
Then F is a linear mapping and F is nonsingular. However, F is not onto, and so F is not invertible.

The vector space V = P(t) in the above example has infinite dimension. The situation changes
significantly when V has finite dimension. Namely, the following theorem applies.
THEOREM 5.14: Let F be a linear operator on a finite-dimensional vector space V. Then the following
four conditions are equivalent.
    (i)  F is nonsingular: Ker F = {0}.   (iii) F is an onto mapping.
    (ii) F is one-to-one.                 (iv)  F is invertible.
The proof of the above theorem mainly follows from Theorem 5.6, which tells us that
    dim V = dim(Ker F) + dim(Im F)
By Proposition 5.8, (i) and (ii) are equivalent. Note that (iv) is equivalent to (ii) and (iii). Thus, to prove
the theorem, we need only show that (i) and (iii) are equivalent. This we do below.
(a) Suppose (i) holds. Then dim(Ker F) = 0, and so the above equation tells us that dim V = dim(Im F).
This means V = Im F or, in other words, F is an onto mapping. Thus, (i) implies (iii).
(b) Suppose (iii) holds. Then V = Im F, and so dim V = dim(Im F). Therefore, the above equation
tells us that dim(Ker F) = 0, and so F is nonsingular. Therefore, (iii) implies (i).
Accordingly, all four conditions are equivalent.

Remark: Suppose A is a square n × n matrix over K. Then A may be viewed as a linear operator on
K^n. Because K^n has finite dimension, Theorem 5.14 holds for the square matrix A. This is why the terms
''nonsingular'' and ''invertible'' are used interchangeably when applied to square matrices.
EXAMPLE 5.13 Let F be the linear operator on R^2 defined by F(x, y) = (2x + y, 3x + 2y).
(a) To show that F is invertible, we need only show that F is nonsingular. Set F(x, y) = (0, 0) to obtain the
    homogeneous system
        2x + y = 0   and   3x + 2y = 0
    Solve for x and y to get x = 0, y = 0. Hence, F is nonsingular and so invertible.
(b) To find a formula for F^-1, we set F(x, y) = (s, t), and so F^-1(s, t) = (x, y). We have
        (2x + y, 3x + 2y) = (s, t)   or   2x + y = s,  3x + 2y = t
    Solve for x and y in terms of s and t to obtain x = 2s - t, y = -3s + 2t. Thus,
        F^-1(s, t) = (2s - t, -3s + 2t)   or   F^-1(x, y) = (2x - y, -3x + 2y)
    where we rewrite the formula for F^-1 using x and y instead of s and t.
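For a finite-dimensional operator such as this one, the formula for F^-1 can also be read off the inverse of its matrix. A short Python/NumPy check (added here for illustration, not part of the text):

    import numpy as np

    A = np.array([[2., 1.],
                  [3., 2.]])       # matrix of F(x, y) = (2x + y, 3x + 2y)

    A_inv = np.linalg.inv(A)
    print(A_inv)                   # [[ 2. -1.] [-3.  2.]], i.e. F^-1(x, y) = (2x - y, -3x + 2y)
    print(np.allclose(A @ A_inv, np.eye(2)))   # True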
SOLVED PROBLEMS
Mappings
5.1. State whether each diagram in Fig. 5-3 defines a mapping from A ¼ fa; b; cg into B ¼ fx; y; zg.
(a) No. There is nothing assigned to the element b 2 A.
(b) No. Two elements, x and z, are assigned to c 2 A.
(c) Yes.
5.2. Let f : A ! B and g : B ! C be defined by Fig. 5-4.
(a) Find the composition mapping ðg  f Þ : A ! C.
(b) Find the images of the mappings f , g, g  f .
(a) Use the definition of the composition mapping to compute
ðg  f Þ ðaÞ ¼ gð f ðaÞÞ ¼ gðyÞ ¼ t; ðg  f Þ ðbÞ ¼ gð f ðbÞÞ ¼ gðxÞ ¼ s
ðg  f Þ ðcÞ ¼ gð f ðcÞÞ ¼ gðyÞ ¼ t
Observe that we arrive at the same answer if we ‘‘follow the arrows’’ in Fig. 5-4:
a ! y ! t; b ! x ! s; c ! y ! t
(b) By Fig. 5-4, the image values under the mapping f are x and y, and the image values under g are r, s, t.
Figure 5-3
Figure 5-4
Hence,
Im f ¼ fx; yg and Im g ¼ fr; s; tg
Also, by part (a), the image values under the composition mapping g  f are t and s; accordingly,
Im g  f ¼ fs; tg. Note that the images of g and g  f are different.
5.3. Consider the mapping F: R^3 → R^2 defined by F(x, y, z) = (yz, x^2). Find
(a) F(2, 3, 4); (b) F(5, -2, 7); (c) F^-1(0, 0), that is, all v ∈ R^3 such that F(v) = 0.
(a) Substitute in the formula for F to get F(2, 3, 4) = (3 · 4, 2^2) = (12, 4).
(b) F(5, -2, 7) = (-2 · 7, 5^2) = (-14, 25).
(c) Set F(v) = 0, where v = (x, y, z), and then solve for x, y, z:
        F(x, y, z) = (yz, x^2) = (0, 0)   or   yz = 0,  x^2 = 0
    Thus, x = 0 and either y = 0 or z = 0. In other words, x = 0, y = 0 or x = 0, z = 0—that is, the z-axis
    and the y-axis.
5.4. Consider the mapping F: R^2 → R^2 defined by F(x, y) = (3y, 2x). Let S be the unit circle in R^2,
that is, the solution set of x^2 + y^2 = 1. (a) Describe F(S). (b) Find F^-1(S).
(a) Let (a, b) be an element of F(S). Then there exists (x, y) ∈ S such that F(x, y) = (a, b). Hence,
        (3y, 2x) = (a, b)   or   3y = a,  2x = b   or   y = a/3,  x = b/2
    Because (x, y) ∈ S—that is, x^2 + y^2 = 1—we have
        (b/2)^2 + (a/3)^2 = 1   or   a^2/9 + b^2/4 = 1
    Thus, F(S) is an ellipse.
(b) Let F(x, y) = (a, b), where (a, b) ∈ S. Then (3y, 2x) = (a, b) or 3y = a, 2x = b. Because (a, b) ∈ S, we
    have a^2 + b^2 = 1. Thus, (3y)^2 + (2x)^2 = 1. Accordingly, F^-1(S) is the ellipse 4x^2 + 9y^2 = 1.
5.5. Let the mappings f : A ! B, g : B ! C, h : C ! D be defined by Fig. 5-5. Determine whether or
not each function is (a) one-to-one; (b) onto; (c) invertible (i.e., has an inverse).
(a) The mapping f : A ! B is one-to-one, as each element of A has a different image. The mapping
g : B ! C is not one-to one, because x and z both have the same image 4. The mapping h : C ! D is
one-to-one.
(b) The mapping f : A ! B is not onto, because z 2 B is not the image of any element of A. The mapping
g : B ! C is onto, as each element of C is the image of some element of B. The mapping h : C ! D is
also onto.
(c) A mapping has an inverse if and only if it is one-to-one and onto. Hence, only h has an inverse.
Figure 5-5
5.6. Suppose f : A ! B and g : B ! C. Hence, ðg  f Þ : A ! C exists. Prove
(a) If f and g are one-to-one, then g  f is one-to-one.
(b) If f and g are onto mappings, then g  f is an onto mapping.
(c) If g  f is one-to-one, then f is one-to-one.
(d) If g  f is an onto mapping, then g is an onto mapping.
(a) Suppose ðg  f ÞðxÞ ¼ ðg  f ÞðyÞ. Then gð f ðxÞÞ ¼ gð f ðyÞÞ. Because g is one-to-one, f ðxÞ ¼ f ðyÞ.
Because f is one-to-one, x ¼ y. We have proven that ðg  f ÞðxÞ ¼ ðg  f ÞðyÞ implies x ¼ y; hence g  f
is one-to-one.
(b) Suppose c 2 C. Because g is onto, there exists b 2 B for which gðbÞ ¼ c. Because f is onto, there exists
a 2 A for which f ðaÞ ¼ b. Thus, ðg  f ÞðaÞ ¼ gð f ðaÞÞ ¼ gðbÞ ¼ c. Hence, g  f is onto.
(c) Suppose f is not one-to-one. Then there exist distinct elements x; y 2 A for which f ðxÞ ¼ f ðyÞ. Thus,
ðg  f ÞðxÞ ¼ gð f ðxÞÞ ¼ gð f ðyÞÞ ¼ ðg  f ÞðyÞ. Hence, g  f is not one-to-one. Therefore, if g  f is one-to-
one, then f must be one-to-one.
(d) If a 2 A, then ðg  f ÞðaÞ ¼ gð f ðaÞÞ 2 gðBÞ. Hence, ðg  f ÞðAÞ  gðBÞ. Suppose g is not onto. Then gðBÞ
is properly contained in C and so ðg  f ÞðAÞ is properly contained in C; thus, g  f is not onto.
Accordingly, if g  f is onto, then g must be onto.
5.7. Prove that f: A → B has an inverse if and only if f is one-to-one and onto.
Suppose f has an inverse—that is, there exists a function f^-1: B → A for which f^-1 ∘ f = 1_A and
f ∘ f^-1 = 1_B. Because 1_A is one-to-one, f is one-to-one by Problem 5.6(c), and because 1_B is onto, f is onto
by Problem 5.6(d); that is, f is both one-to-one and onto.
Now suppose f is both one-to-one and onto. Then each b ∈ B is the image of a unique element in A, say
b*. Thus, if f(a) = b, then a = b*; hence, f(b*) = b. Now let g denote the mapping from B to A defined by
b ↦ b*. We have
(i) (g ∘ f)(a) = g(f(a)) = g(b) = b* = a for every a ∈ A; hence, g ∘ f = 1_A.
(ii) (f ∘ g)(b) = f(g(b)) = f(b*) = b for every b ∈ B; hence, f ∘ g = 1_B.
Accordingly, f has an inverse. Its inverse is the mapping g.
5.8. Let f: R → R be defined by f(x) = 2x - 3. Now f is one-to-one and onto; hence, f has an inverse
mapping f^-1. Find a formula for f^-1.
Let y be the image of x under the mapping f; that is, y = f(x) = 2x - 3. Hence, x will be the image of y
under the inverse mapping f^-1. Thus, solve for x in terms of y in the above equation to obtain x = (y + 3)/2.
Then the formula defining the inverse function is f^-1(y) = (y + 3)/2, or, using x instead of y, f^-1(x) = (x + 3)/2.
Linear Mappings
5.9. Suppose the mapping F: R^2 → R^2 is defined by F(x, y) = (x + y, x). Show that F is linear.
We need to show that F(v + w) = F(v) + F(w) and F(kv) = kF(v), where v and w are any elements of
R^2 and k is any scalar. Let v = (a, b) and w = (a′, b′). Then
    v + w = (a + a′, b + b′)   and   kv = (ka, kb)
We have F(v) = (a + b, a) and F(w) = (a′ + b′, a′). Thus,
    F(v + w) = F(a + a′, b + b′) = (a + a′ + b + b′, a + a′)
             = (a + b, a) + (a′ + b′, a′) = F(v) + F(w)
and
    F(kv) = F(ka, kb) = (ka + kb, ka) = k(a + b, a) = kF(v)
Because v, w, k were arbitrary, F is linear.
5.10. Suppose F: R^3 → R^2 is defined by F(x, y, z) = (x + y + z, 2x - 3y + 4z). Show that F is linear.
We argue via matrices. Writing vectors as columns, the mapping F may be written in the form
F(v) = Av, where v = [x, y, z]^T and
    A = [ 1  1  1 ]
        [ 2 -3  4 ]
Then, using properties of matrices, we have
    F(v + w) = A(v + w) = Av + Aw = F(v) + F(w)
and
    F(kv) = A(kv) = k(Av) = kF(v)
Thus, F is linear.
5.11. Show that the following mappings are not linear:
(a) F: R^2 → R^2 defined by F(x, y) = (xy, x)
(b) F: R^2 → R^3 defined by F(x, y) = (x + 3, 2y, x + y)
(c) F: R^3 → R^2 defined by F(x, y, z) = (|x|, y + z)
(a) Let v = (1, 2) and w = (3, 4); then v + w = (4, 6). Also,
        F(v) = (1(2), 1) = (2, 1)   and   F(w) = (3(4), 3) = (12, 3)
    Hence,
        F(v + w) = (4(6), 4) = (24, 6) ≠ F(v) + F(w)
(b) Because F(0, 0) = (3, 0, 0) ≠ (0, 0, 0), F cannot be linear.
(c) Let v = (1, 2, 3) and k = -3. Then kv = (-3, -6, -9). We have
        F(v) = (1, 5)   and   kF(v) = -3(1, 5) = (-3, -15).
    Thus,
        F(kv) = F(-3, -6, -9) = (3, -15) ≠ kF(v)
    Accordingly, F is not linear.
5.12. Let V be the vector space of n-square real matrices. Let M be an arbitrary but fixed matrix in V.
Let F: V → V be defined by F(A) = AM + MA, where A is any matrix in V. Show that F is
linear.
For any matrices A and B in V and any scalar k, we have
    F(A + B) = (A + B)M + M(A + B) = AM + BM + MA + MB
             = (AM + MA) + (BM + MB) = F(A) + F(B)
and
    F(kA) = (kA)M + M(kA) = k(AM) + k(MA) = k(AM + MA) = kF(A)
Thus, F is linear.
5.13. Prove Theorem 5.2: Let V and U be vector spaces over a field K. Let {v1, v2, ..., vn} be a basis of
V and let u1, u2, ..., un be any vectors in U. Then there exists a unique linear mapping F: V → U
such that F(v1) = u1, F(v2) = u2, ..., F(vn) = un.
There are three steps to the proof of the theorem: (1) Define the mapping F: V → U such that
F(vi) = ui, i = 1, ..., n. (2) Show that F is linear. (3) Show that F is unique.
Step 1. Let v ∈ V. Because {v1, ..., vn} is a basis of V, there exist unique scalars a1, ..., an ∈ K for
which v = a1v1 + a2v2 + ... + anvn. We define F: V → U by
    F(v) = a1u1 + a2u2 + ... + anun
(Because the ai are unique, the mapping F is well defined.) Now, for i = 1, ..., n,
    vi = 0v1 + ... + 1vi + ... + 0vn
Hence,
    F(vi) = 0u1 + ... + 1ui + ... + 0un = ui
Thus, the first step of the proof is complete.
Step 2. Suppose v = a1v1 + a2v2 + ... + anvn and w = b1v1 + b2v2 + ... + bnvn. Then
    v + w = (a1 + b1)v1 + (a2 + b2)v2 + ... + (an + bn)vn
and, for any k ∈ K, kv = ka1v1 + ka2v2 + ... + kanvn. By definition of the mapping F,
    F(v) = a1u1 + a2u2 + ... + anun   and   F(w) = b1u1 + b2u2 + ... + bnun
Hence,
    F(v + w) = (a1 + b1)u1 + (a2 + b2)u2 + ... + (an + bn)un
             = (a1u1 + a2u2 + ... + anun) + (b1u1 + b2u2 + ... + bnun)
             = F(v) + F(w)
and
    F(kv) = k(a1u1 + a2u2 + ... + anun) = kF(v)
Thus, F is linear.
Step 3. Suppose G: V → U is linear and G(vi) = ui, i = 1, ..., n. Let
    v = a1v1 + a2v2 + ... + anvn
Then
    G(v) = G(a1v1 + a2v2 + ... + anvn) = a1G(v1) + a2G(v2) + ... + anG(vn)
         = a1u1 + a2u2 + ... + anun = F(v)
Because G(v) = F(v) for every v ∈ V, G = F. Thus, F is unique and the theorem is proved.
5.14. Let F: R^2 → R^2 be the linear mapping for which F(1, 2) = (2, 3) and F(0, 1) = (1, 4). [Note that
{(1, 2), (0, 1)} is a basis of R^2, so such a linear map F exists and is unique by Theorem 5.2.] Find
a formula for F; that is, find F(a, b).
Write (a, b) as a linear combination of (1, 2) and (0, 1) using unknowns x and y,
    (a, b) = x(1, 2) + y(0, 1) = (x, 2x + y),   so   a = x,  b = 2x + y
Solve for x and y in terms of a and b to get x = a, y = -2a + b. Then
    F(a, b) = xF(1, 2) + yF(0, 1) = a(2, 3) + (-2a + b)(1, 4) = (b, -5a + 4b)
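In matrix terms, the same formula can be found by solving for the matrix M of F in the standard basis from the two prescribed images, since MP = Q where the columns of P are the basis vectors and the columns of Q are their images. A small Python/NumPy sketch (added for illustration):

    import numpy as np

    P = np.array([[1., 0.],      # columns: the basis vectors (1, 2) and (0, 1)
                  [2., 1.]])
    Q = np.array([[2., 1.],      # columns: their prescribed images (2, 3) and (1, 4)
                  [3., 4.]])

    M = Q @ np.linalg.inv(P)     # matrix of F in the standard basis
    print(M)                     # [[ 0.  1.] [-5.  4.]]  ->  F(a, b) = (b, -5a + 4b)
    print(M @ np.array([1., 2.]), M @ np.array([0., 1.]))   # recovers (2, 3) and (1, 4)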
5.15. Suppose a linear mapping F: V → U is one-to-one and onto. Show that the inverse mapping
F^-1: U → V is also linear.
Suppose u, u′ ∈ U. Because F is one-to-one and onto, there exist unique vectors v, v′ ∈ V for which
F(v) = u and F(v′) = u′. Because F is linear, we also have
    F(v + v′) = F(v) + F(v′) = u + u′   and   F(kv) = kF(v) = ku
By definition of the inverse mapping,
    F^-1(u) = v,  F^-1(u′) = v′,  F^-1(u + u′) = v + v′,  F^-1(ku) = kv
Then
    F^-1(u + u′) = v + v′ = F^-1(u) + F^-1(u′)   and   F^-1(ku) = kv = kF^-1(u)
Thus, F^-1 is linear.
Kernel and Image of Linear Mappings
5.16. Let F: R^4 → R^3 be the linear mapping defined by
    F(x, y, z, t) = (x - y + z + t, x + 2z - t, x + y + 3z - 3t)
Find a basis and the dimension of (a) the image of F; (b) the kernel of F.
(a) Find the images of the usual basis of R^4:
        F(1, 0, 0, 0) = (1, 1, 1),    F(0, 0, 1, 0) = (1, 2, 3)
        F(0, 1, 0, 0) = (-1, 0, 1),   F(0, 0, 0, 1) = (1, -1, -3)
    By Proposition 5.4, the image vectors span Im F. Hence, form the matrix whose rows are these image
    vectors, and row reduce to echelon form:

        [ 1  1  1 ]     [ 1  1  1 ]     [ 1  1  1 ]
        [-1  0  1 ]  ~  [ 0  1  2 ]  ~  [ 0  1  2 ]
        [ 1  2  3 ]     [ 0  1  2 ]     [ 0  0  0 ]
        [ 1 -1 -3 ]     [ 0 -2 -4 ]     [ 0  0  0 ]

    Thus, (1, 1, 1) and (0, 1, 2) form a basis for Im F; hence, dim(Im F) = 2.
(b) Set F(v) = 0, where v = (x, y, z, t); that is, set
        F(x, y, z, t) = (x - y + z + t, x + 2z - t, x + y + 3z - 3t) = (0, 0, 0)
    Set corresponding entries equal to each other to form the following homogeneous system whose solution
    space is Ker F:
        x - y + z + t = 0            x - y + z + t = 0           x - y + z + t = 0
        x + 2z - t = 0        or         y + z - 2t = 0    or        y + z - 2t = 0
        x + y + 3z - 3t = 0             2y + 2z - 4t = 0
    The free variables are z and t. Hence, dim(Ker F) = 2.
    (i)  Set z = 1, t = 0 to obtain the solution (-2, -1, 1, 0).
    (ii) Set z = 0, t = 1 to obtain the solution (1, 2, 0, 1).
    Thus, (-2, -1, 1, 0) and (1, 2, 0, 1) form a basis of Ker F.
    [As expected, dim(Im F) + dim(Ker F) = 2 + 2 = 4 = dim R^4, the domain of F.]
5.17. Let G: R^3 → R^3 be the linear mapping defined by
    G(x, y, z) = (x + 2y - z, y + z, x + y - 2z)
Find a basis and the dimension of (a) the image of G, (b) the kernel of G.
(a) Find the images of the usual basis of R^3:
        G(1, 0, 0) = (1, 0, 1),  G(0, 1, 0) = (2, 1, 1),  G(0, 0, 1) = (-1, 1, -2)
    By Proposition 5.4, the image vectors span Im G. Hence, form the matrix M whose rows are these image
    vectors, and row reduce to echelon form:

        M = [ 1  0  1 ]     [ 1  0  1 ]     [ 1  0  1 ]
            [ 2  1  1 ]  ~  [ 0  1 -1 ]  ~  [ 0  1 -1 ]
            [-1  1 -2 ]     [ 0  1 -1 ]     [ 0  0  0 ]

    Thus, (1, 0, 1) and (0, 1, -1) form a basis for Im G; hence, dim(Im G) = 2.
(b) Set G(v) = 0, where v = (x, y, z); that is,
        G(x, y, z) = (x + 2y - z, y + z, x + y - 2z) = (0, 0, 0)
    Set corresponding entries equal to each other to form the following homogeneous system whose solution
    space is Ker G:
        x + 2y - z = 0            x + 2y - z = 0           x + 2y - z = 0
        y + z = 0          or         y + z = 0     or         y + z = 0
        x + y - 2z = 0               -y - z = 0
    The only free variable is z; hence, dim(Ker G) = 1. Set z = 1; then y = -1 and x = 3. Thus, (3, -1, 1)
    forms a basis of Ker G. [As expected, dim(Im G) + dim(Ker G) = 2 + 1 = 3 = dim R^3, the domain
    of G.]
5.18. Consider the matrix mapping A: R^4 → R^3, where

        A = [ 1  2  3  1 ]
            [ 1  3  5 -2 ]
            [ 3  8 13 -3 ]

Find a basis and the dimension of (a) the image of A, (b) the kernel of A.
(a) The column space of A is equal to Im A. Now reduce A^T to echelon form:

        A^T = [ 1  1  3 ]     [ 1  1  3 ]     [ 1  1  3 ]
              [ 2  3  8 ]  ~  [ 0  1  2 ]  ~  [ 0  1  2 ]
              [ 3  5 13 ]     [ 0  2  4 ]     [ 0  0  0 ]
              [ 1 -2 -3 ]     [ 0 -3 -6 ]     [ 0  0  0 ]

    Thus, {(1, 1, 3), (0, 1, 2)} is a basis of Im A, and dim(Im A) = 2.
(b) Here Ker A is the solution space of the homogeneous system AX = 0, where X = (x, y, z, t)^T. Thus,
    reduce the matrix A of coefficients to echelon form:

        [ 1  2  3  1 ]     [ 1  2  3  1 ]           x + 2y + 3z + t = 0
        [ 0  1  2 -3 ]  ~  [ 0  1  2 -3 ]    or
        [ 0  2  4 -6 ]     [ 0  0  0  0 ]               y + 2z - 3t = 0

    The free variables are z and t. Thus, dim(Ker A) = 2.
    (i)  Set z = 1, t = 0 to get the solution (1, -2, 1, 0).
    (ii) Set z = 0, t = 1 to get the solution (-7, 3, 0, 1).
    Thus, (1, -2, 1, 0) and (-7, 3, 0, 1) form a basis for Ker A.
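These results are easy to confirm with a computer algebra system. A brief Python/SymPy check (an illustrative addition, not part of the solution):

    from sympy import Matrix

    A = Matrix([[1, 2, 3, 1],
                [1, 3, 5, -2],
                [3, 8, 13, -3]])

    print(A.columnspace())   # two columns spanning Im A, so dim(Im A) = 2
    print(A.nullspace())     # two vectors spanning Ker A, so dim(Ker A) = 2

    # The basis vectors found above indeed lie in the kernel:
    print(A * Matrix([1, -2, 1, 0]), A * Matrix([-7, 3, 0, 1]))   # both zero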
5.19. Find a linear map F: R^3 → R^4 whose image is spanned by (1, 2, 0, -4) and (2, 0, -1, -3).
Form a 4 × 3 matrix whose columns consist only of the given vectors, say

        A = [ 1  2  2 ]
            [ 2  0  0 ]
            [ 0 -1 -1 ]
            [-4 -3 -3 ]

Recall that A determines a linear map A: R^3 → R^4 whose image is spanned by the columns of A. Thus, A
satisfies the required condition.
5.20. Suppose f: V → U is linear with kernel W, and that f(v) = u. Show that the ''coset''
v + W = {v + w : w ∈ W} is the preimage of u; that is, f^-1(u) = v + W.
We must prove that (i) f^-1(u) ⊆ v + W and (ii) v + W ⊆ f^-1(u).
We first prove (i). Suppose v′ ∈ f^-1(u). Then f(v′) = u, and so
    f(v′ - v) = f(v′) - f(v) = u - u = 0
that is, v′ - v ∈ W. Thus, v′ = v + (v′ - v) ∈ v + W, and hence f^-1(u) ⊆ v + W.
Now we prove (ii). Suppose v′ ∈ v + W. Then v′ = v + w, where w ∈ W. Because W is the kernel of f,
we have f(w) = 0. Accordingly,
    f(v′) = f(v + w) = f(v) + f(w) = f(v) + 0 = f(v) = u
Thus, v′ ∈ f^-1(u), and so v + W ⊆ f^-1(u).
Both inclusions imply f^-1(u) = v + W.
5.21. Suppose F: V → U and G: U → W are linear. Prove
(a) rank(G ∘ F) ≤ rank(G), (b) rank(G ∘ F) ≤ rank(F).
(a) Because F(V) ⊆ U, we also have G(F(V)) ⊆ G(U), and so dim[G(F(V))] ≤ dim[G(U)]. Then
    rank(G ∘ F) = dim[(G ∘ F)(V)] = dim[G(F(V))] ≤ dim[G(U)] = rank(G).
(b) We have dim[G(F(V))] ≤ dim[F(V)]. Hence,
    rank(G ∘ F) = dim[(G ∘ F)(V)] = dim[G(F(V))] ≤ dim[F(V)] = rank(F)
5.22. Prove Theorem 5.3: Let F: V → U be linear. Then,
(a) Im F is a subspace of U, (b) Ker F is a subspace of V.
(a) Because F(0) = 0, we have 0 ∈ Im F. Now suppose u, u′ ∈ Im F and a, b ∈ K. Because u and u′
belong to the image of F, there exist vectors v, v′ ∈ V such that F(v) = u and F(v′) = u′. Then
    F(av + bv′) = aF(v) + bF(v′) = au + bu′ ∈ Im F
Thus, the image of F is a subspace of U.
(b) Because F(0) = 0, we have 0 ∈ Ker F. Now suppose v, w ∈ Ker F and a, b ∈ K. Because v and w
belong to the kernel of F, F(v) = 0 and F(w) = 0. Thus,
    F(av + bw) = aF(v) + bF(w) = a0 + b0 = 0 + 0 = 0,   and so   av + bw ∈ Ker F
Thus, the kernel of F is a subspace of V.
5.23. Prove Theorem 5.6: Suppose V has finite dimension and F: V → U is linear. Then
    dim V = dim(Ker F) + dim(Im F) = nullity(F) + rank(F)
Suppose dim(Ker F) = r and {w1, ..., wr} is a basis of Ker F, and suppose dim(Im F) = s and
{u1, ..., us} is a basis of Im F. (By Proposition 5.4, Im F has finite dimension.) Because every
uj ∈ Im F, there exist vectors v1, ..., vs in V such that F(v1) = u1, ..., F(vs) = us. We claim that the set
    B = {w1, ..., wr, v1, ..., vs}
is a basis of V; that is, (i) B spans V, and (ii) B is linearly independent. Once we prove (i) and (ii), then
dim V = r + s = dim(Ker F) + dim(Im F).
(i) B spans V. Let v ∈ V. Then F(v) ∈ Im F. Because the uj span Im F, there exist scalars a1, ..., as such
that F(v) = a1u1 + ... + asus. Set v̂ = a1v1 + ... + asvs - v. Then
    F(v̂) = F(a1v1 + ... + asvs - v) = a1F(v1) + ... + asF(vs) - F(v)
         = a1u1 + ... + asus - F(v) = 0
Thus, v̂ ∈ Ker F. Because the wi span Ker F, there exist scalars b1, ..., br such that
    v̂ = b1w1 + ... + brwr = a1v1 + ... + asvs - v
Accordingly,
    v = a1v1 + ... + asvs - b1w1 - ... - brwr
Thus, B spans V.
(ii) B is linearly independent. Suppose
    x1w1 + ... + xrwr + y1v1 + ... + ysvs = 0     (1)
where xi, yj ∈ K. Then
    0 = F(0) = F(x1w1 + ... + xrwr + y1v1 + ... + ysvs)
             = x1F(w1) + ... + xrF(wr) + y1F(v1) + ... + ysF(vs)     (2)
But F(wi) = 0, since wi ∈ Ker F, and F(vj) = uj. Substituting into (2), we obtain
y1u1 + ... + ysus = 0. Since the uj are linearly independent, each yj = 0. Substitution into (1) gives
x1w1 + ... + xrwr = 0. Since the wi are linearly independent, each xi = 0. Thus B is linearly
independent.
Singular and Nonsingular Linear Maps, Isomorphisms
5.24. Determine whether or not each of the following linear maps is nonsingular. If not, find a nonzero
vector v whose image is 0.
(a) F: R^2 → R^2 defined by F(x, y) = (x - y, x - 2y).
(b) G: R^2 → R^2 defined by G(x, y) = (2x - 4y, 3x - 6y).
(a) Find Ker F by setting F(v) = 0, where v = (x, y),
        (x - y, x - 2y) = (0, 0)   or   x - y = 0, x - 2y = 0   or   x - y = 0, y = 0
    The only solution is x = 0, y = 0. Hence, F is nonsingular.
(b) Set G(x, y) = (0, 0) to find Ker G:
        (2x - 4y, 3x - 6y) = (0, 0)   or   2x - 4y = 0, 3x - 6y = 0   or   x - 2y = 0
    The system has nonzero solutions, because y is a free variable. Hence, G is singular. Let y = 1 to obtain
    the solution v = (2, 1), which is a nonzero vector, such that G(v) = 0.
5.25. The linear map F: R^2 → R^2 defined by F(x, y) = (x - y, x - 2y) is nonsingular by the previous
Problem 5.24. Find a formula for F^-1.
Set F(x, y) = (a, b), so that F^-1(a, b) = (x, y). We have
        (x - y, x - 2y) = (a, b)   or   x - y = a, x - 2y = b   or   x - y = a, y = a - b
Solve for x and y in terms of a and b to get x = 2a - b, y = a - b. Thus,
        F^-1(a, b) = (2a - b, a - b)   or   F^-1(x, y) = (2x - y, x - y)
(The second equation is obtained by replacing a and b by x and y, respectively.)
5.26. Let G: R^2 → R^3 be defined by G(x, y) = (x + y, x - 2y, 3x + y).
(a) Show that G is nonsingular. (b) Find a formula for G^-1.
(a) Set G(x, y) = (0, 0, 0) to find Ker G. We have
        (x + y, x - 2y, 3x + y) = (0, 0, 0)   or   x + y = 0, x - 2y = 0, 3x + y = 0
    The only solution is x = 0, y = 0; hence, G is nonsingular.
(b) Although G is nonsingular, it is not invertible, because R^2 and R^3 have different dimensions. (Thus,
    Theorem 5.9 does not apply.) Accordingly, G^-1 does not exist.
5.27. Suppose that F: V → U is linear and that V is of finite dimension. Show that V and the image of
F have the same dimension if and only if F is nonsingular. Determine all nonsingular linear
mappings T: R^4 → R^3.
By Theorem 5.6, dim V = dim(Im F) + dim(Ker F). Hence, V and Im F have the same dimension if
and only if dim(Ker F) = 0 or Ker F = {0} (i.e., if and only if F is nonsingular).
Because dim R^3 is less than dim R^4, we have that dim(Im T) is less than the dimension of the domain
R^4 of T. Accordingly, no linear mapping T: R^4 → R^3 can be nonsingular.
5.28. Prove Theorem 5.7: Let F : V ! U be a nonsingular linear mapping. Then the image of any
linearly independent set is linearly independent.
Suppose v1; v2; . . . ; vn are linearly independent vectors in V. We claim that Fðv1Þ; Fðv2Þ; . . . ; FðvnÞ are
also linearly independent. Suppose a1Fðv1Þ þ a2Fðv2Þ þ    þ anFðvnÞ ¼ 0, where ai 2 K. Because F is
linear, Fða1v1 þ a2v2 þ    þ anvnÞ ¼ 0. Hence,
a1v1 þ a2v2 þ    þ anvn 2 Ker F
But F is nonsingular—that is, Ker F ¼ f0g. Hence, a1v1 þ a2v2 þ    þ anvn ¼ 0. Because the vi are
linearly independent, all the ai are 0. Accordingly, the FðviÞ are linearly independent. Thus, the theorem is
proved.
5.29. Prove Theorem 5.9: Suppose V has finite dimension and dim V ¼ dim U. Suppose F : V ! U is
linear. Then F is an isomorphism if and only if F is nonsingular.
If F is an isomorphism, then only 0 maps to 0; hence, F is nonsingular. Conversely, suppose F is
nonsingular. Then dimðKer FÞ ¼ 0. By Theorem 5.6, dim V ¼ dimðKer FÞ þ dimðIm FÞ. Thus,
dim U ¼ dim V ¼ dimðIm FÞ
Because U has finite dimension, Im F ¼ U. This means F maps V onto U. Thus, F is one-to-one and onto;
that is, F is an isomorphism.
Operations with Linear Maps
5.30. Define F: R^3 → R^2 and G: R^3 → R^2 by F(x, y, z) = (2x, y + z) and G(x, y, z) = (x - z, y).
Find formulas defining the maps: (a) F + G, (b) 3F, (c) 2F - 5G.
(a) (F + G)(x, y, z) = F(x, y, z) + G(x, y, z) = (2x, y + z) + (x - z, y) = (3x - z, 2y + z)
(b) (3F)(x, y, z) = 3F(x, y, z) = 3(2x, y + z) = (6x, 3y + 3z)
(c) (2F - 5G)(x, y, z) = 2F(x, y, z) - 5G(x, y, z) = 2(2x, y + z) - 5(x - z, y)
                       = (4x, 2y + 2z) + (-5x + 5z, -5y) = (-x + 5z, -3y + 2z)
5.31. Let F: R^3 → R^2 and G: R^2 → R^2 be defined by F(x, y, z) = (2x, y + z) and G(x, y) = (y, x).
Derive formulas defining the mappings: (a) G ∘ F, (b) F ∘ G.
(a) (G ∘ F)(x, y, z) = G(F(x, y, z)) = G(2x, y + z) = (y + z, 2x)
(b) The mapping F ∘ G is not defined, because the image of G is not contained in the domain of F.
5.32. Prove: (a) The zero mapping 0, defined by 0(v) = 0 ∈ U for every v ∈ V, is the zero element of
Hom(V, U). (b) The negative of F ∈ Hom(V, U) is the mapping (-1)F, that is, -F = (-1)F.
Let F ∈ Hom(V, U). Then, for every v ∈ V:
(a)     (F + 0)(v) = F(v) + 0(v) = F(v) + 0 = F(v)
Because (F + 0)(v) = F(v) for every v ∈ V, we have F + 0 = F. Similarly, 0 + F = F.
(b)     (F + (-1)F)(v) = F(v) + (-1)F(v) = F(v) - F(v) = 0 = 0(v)
Thus, F + (-1)F = 0. Similarly (-1)F + F = 0. Hence, -F = (-1)F.
5.33. Suppose F1, F2, ..., Fn are linear maps from V into U. Show that, for any scalars a1, a2, ..., an,
and for any v ∈ V,
    (a1F1 + a2F2 + ... + anFn)(v) = a1F1(v) + a2F2(v) + ... + anFn(v)
The mapping a1F1 is defined by (a1F1)(v) = a1F1(v). Hence, the theorem holds for n = 1. Accordingly,
by induction,
    (a1F1 + a2F2 + ... + anFn)(v) = (a1F1)(v) + (a2F2 + ... + anFn)(v)
                                  = a1F1(v) + a2F2(v) + ... + anFn(v)
5.34. Consider linear mappings F: R^3 → R^2, G: R^3 → R^2, H: R^3 → R^2 defined by
    F(x, y, z) = (x + y + z, x + y),  G(x, y, z) = (2x + z, x + y),  H(x, y, z) = (2y, x)
Show that F, G, H are linearly independent [as elements of Hom(R^3, R^2)].
Suppose, for scalars a, b, c ∈ K,
    aF + bG + cH = 0     (1)
(Here 0 is the zero mapping.) For e1 = (1, 0, 0) ∈ R^3, we have 0(e1) = (0, 0) and
    (aF + bG + cH)(e1) = aF(1, 0, 0) + bG(1, 0, 0) + cH(1, 0, 0)
                       = a(1, 1) + b(2, 1) + c(0, 1) = (a + 2b, a + b + c)
Thus by (1), (a + 2b, a + b + c) = (0, 0) and so
    a + 2b = 0   and   a + b + c = 0     (2)
Similarly for e2 = (0, 1, 0) ∈ R^3, we have 0(e2) = (0, 0) and
    (aF + bG + cH)(e2) = aF(0, 1, 0) + bG(0, 1, 0) + cH(0, 1, 0)
                       = a(1, 1) + b(0, 1) + c(2, 0) = (a + 2c, a + b)
Thus,
    a + 2c = 0   and   a + b = 0     (3)
Using (2) and (3), we obtain
    a = 0,  b = 0,  c = 0     (4)
Because (1) implies (4), the mappings F, G, H are linearly independent.
5.35. Let k be a nonzero scalar. Show that a linear map T is singular if and only if kT is singular. Hence,
T is singular if and only if -T is singular.
Suppose T is singular. Then T(v) = 0 for some vector v ≠ 0. Hence,
    (kT)(v) = kT(v) = k0 = 0
and so kT is singular.
Now suppose kT is singular. Then (kT)(w) = 0 for some vector w ≠ 0. Hence,
    T(kw) = kT(w) = (kT)(w) = 0
But k ≠ 0 and w ≠ 0 implies kw ≠ 0. Thus, T is also singular.
5.36. Find the dimension d of:
(a) Hom(R^3, R^4), (b) Hom(R^5, R^3), (c) Hom(P3(t), R^2), (d) Hom(M_{2,3}, R^4).
Use dim[Hom(V, U)] = mn, where dim V = m and dim U = n.
(a) d = 3(4) = 12.   (c) Because dim P3(t) = 4, d = 4(2) = 8.
(b) d = 5(3) = 15.   (d) Because dim M_{2,3} = 6, d = 6(4) = 24.
5.37. Prove Theorem 5.11: Suppose dim V = m and dim U = n. Then dim[Hom(V, U)] = mn.
Suppose {v1, ..., vm} is a basis of V and {u1, ..., un} is a basis of U. By Theorem 5.2, a linear mapping
in Hom(V, U) is uniquely determined by arbitrarily assigning elements of U to the basis elements vi of V. We
define
    Fij ∈ Hom(V, U),   i = 1, ..., m,  j = 1, ..., n
to be the linear mapping for which Fij(vi) = uj, and Fij(vk) = 0 for k ≠ i. That is, Fij maps vi into uj and the
other v's into 0. Observe that {Fij} contains exactly mn elements; hence, the theorem is proved if we show
that it is a basis of Hom(V, U).
Proof that {Fij} generates Hom(V, U). Consider an arbitrary function F ∈ Hom(V, U). Suppose
F(v1) = w1, F(v2) = w2, ..., F(vm) = wm. Because wk ∈ U, it is a linear combination of the u's; say,
    wk = ak1 u1 + ak2 u2 + ... + akn un,   k = 1, ..., m,  aij ∈ K     (1)
Consider the linear mapping G = Σ_{i=1}^{m} Σ_{j=1}^{n} aij Fij. Because G is a linear combination of the Fij, the proof
that {Fij} generates Hom(V, U) is complete if we show that F = G.
We now compute G(vk), k = 1, ..., m. Because Fij(vk) = 0 for k ≠ i and Fkj(vk) = uj,
    G(vk) = Σ_{i=1}^{m} Σ_{j=1}^{n} aij Fij(vk) = Σ_{j=1}^{n} akj Fkj(vk) = Σ_{j=1}^{n} akj uj
          = ak1 u1 + ak2 u2 + ... + akn un
Thus, by (1), G(vk) = wk for each k. But F(vk) = wk for each k. Accordingly, by Theorem 5.2, F = G;
hence, {Fij} generates Hom(V, U).
Proof that {Fij} is linearly independent. Suppose, for scalars cij ∈ K,
    Σ_{i=1}^{m} Σ_{j=1}^{n} cij Fij = 0
For vk, k = 1, ..., m,
    0 = 0(vk) = Σ_{i=1}^{m} Σ_{j=1}^{n} cij Fij(vk) = Σ_{j=1}^{n} ckj Fkj(vk) = Σ_{j=1}^{n} ckj uj
              = ck1 u1 + ck2 u2 + ... + ckn un
But the uj are linearly independent; hence, for k = 1, ..., m, we have ck1 = 0, ck2 = 0, ..., ckn = 0. In other
words, all the cij = 0, and so {Fij} is linearly independent.
5.38. Prove Theorem 5.12: (i) G ∘ (F + F′) = G ∘ F + G ∘ F′. (ii) (G + G′) ∘ F = G ∘ F + G′ ∘ F.
(iii) k(G ∘ F) = (kG) ∘ F = G ∘ (kF).
(i) For every v ∈ V,
    (G ∘ (F + F′))(v) = G((F + F′)(v)) = G(F(v) + F′(v))
                      = G(F(v)) + G(F′(v)) = (G ∘ F)(v) + (G ∘ F′)(v) = (G ∘ F + G ∘ F′)(v)
Thus, G ∘ (F + F′) = G ∘ F + G ∘ F′.
(ii) For every v ∈ V,
    ((G + G′) ∘ F)(v) = (G + G′)(F(v)) = G(F(v)) + G′(F(v))
                      = (G ∘ F)(v) + (G′ ∘ F)(v) = (G ∘ F + G′ ∘ F)(v)
Thus, (G + G′) ∘ F = G ∘ F + G′ ∘ F.
(iii) For every v ∈ V,
    (k(G ∘ F))(v) = k(G ∘ F)(v) = k(G(F(v))) = (kG)(F(v)) = ((kG) ∘ F)(v)
and
    (k(G ∘ F))(v) = k(G ∘ F)(v) = k(G(F(v))) = G(kF(v)) = G((kF)(v)) = (G ∘ (kF))(v)
Accordingly, k(G ∘ F) = (kG) ∘ F = G ∘ (kF). (We emphasize that two mappings are shown to be equal
by showing that each of them assigns the same image to each point in the domain.)
Algebra of Linear Maps
5.39. Let F and G be the linear operators on R^2 defined by F(x, y) = (y, x) and G(x, y) = (0, x). Find
formulas defining the following operators:
(a) F + G, (b) 2F - 3G, (c) FG, (d) GF, (e) F^2, (f) G^2.
(a) (F + G)(x, y) = F(x, y) + G(x, y) = (y, x) + (0, x) = (y, 2x).
(b) (2F - 3G)(x, y) = 2F(x, y) - 3G(x, y) = 2(y, x) - 3(0, x) = (2y, -x).
(c) (FG)(x, y) = F(G(x, y)) = F(0, x) = (x, 0).
(d) (GF)(x, y) = G(F(x, y)) = G(y, x) = (0, y).
(e) F^2(x, y) = F(F(x, y)) = F(y, x) = (x, y). (Note that F^2 = I, the identity mapping.)
(f) G^2(x, y) = G(G(x, y)) = G(0, x) = (0, 0). (Note that G^2 = 0, the zero mapping.)
5.40. Consider the linear operator T on R^3 defined by T(x, y, z) = (2x, 4x - y, 2x + 3y - z).
(a) Show that T is invertible. Find formulas for (b) T^-1, (c) T^2, (d) T^-2.
(a) Let W = Ker T. We need only show that T is nonsingular (i.e., that W = {0}). Set T(x, y, z) = (0, 0, 0),
    which yields
        T(x, y, z) = (2x, 4x - y, 2x + 3y - z) = (0, 0, 0)
    Thus, W is the solution space of the homogeneous system
        2x = 0,  4x - y = 0,  2x + 3y - z = 0
    which has only the trivial solution (0, 0, 0). Thus, W = {0}. Hence, T is nonsingular, and so T is
    invertible.
(b) Set T(x, y, z) = (r, s, t) [and so T^-1(r, s, t) = (x, y, z)]. We have
        (2x, 4x - y, 2x + 3y - z) = (r, s, t)   or   2x = r,  4x - y = s,  2x + 3y - z = t
    Solve for x, y, z in terms of r, s, t to get x = r/2, y = 2r - s, z = 7r - 3s - t. Thus,
        T^-1(r, s, t) = (r/2, 2r - s, 7r - 3s - t)   or   T^-1(x, y, z) = (x/2, 2x - y, 7x - 3y - z)
(c) Apply T twice to get
        T^2(x, y, z) = T(2x, 4x - y, 2x + 3y - z)
                     = (4x, 4(2x) - (4x - y), 2(2x) + 3(4x - y) - (2x + 3y - z))
                     = (4x, 4x + y, 14x - 6y + z)
(d) Apply T^-1 twice to get
        T^-2(x, y, z) = T^-1(x/2, 2x - y, 7x - 3y - z)
                      = (x/4, 2(x/2) - (2x - y), 7(x/2) - 3(2x - y) - (7x - 3y - z))
                      = (x/4, -x + y, -19x/2 + 6y + z)
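The formulas in Problem 5.40 can also be obtained from the matrix of T, since T^-1 and T^2 correspond to the inverse and the square of that matrix. A short Python/NumPy check, added for illustration:

    import numpy as np

    # Matrix of T(x, y, z) = (2x, 4x - y, 2x + 3y - z) in the usual basis
    T = np.array([[2.,  0.,  0.],
                  [4., -1.,  0.],
                  [2.,  3., -1.]])

    print(np.linalg.inv(T))    # rows give T^-1(x, y, z) = (x/2, 2x - y, 7x - 3y - z)
    print(T @ T)               # rows give T^2(x, y, z)  = (4x, 4x + y, 14x - 6y + z)
    print(np.allclose(np.linalg.inv(T @ T),
                      np.linalg.inv(T) @ np.linalg.inv(T)))   # True: T^-2 = (T^-1)^2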
5.41. Let V be of finite dimension and let T be a linear operator on V for which TR = I, for some
operator R on V. (We call R a right inverse of T.)
(a) Show that T is invertible. (b) Show that R = T^-1.
(c) Give an example showing that the above need not hold if V is of infinite dimension.
(a) Let dim V = n. By Theorem 5.14, T is invertible if and only if T is onto; hence, T is invertible if and
    only if rank(T) = n. We have n = rank(I) = rank(TR) ≤ rank(T) ≤ n. Hence, rank(T) = n and T is
    invertible.
(b) TT^-1 = T^-1T = I. Then R = IR = (T^-1T)R = T^-1(TR) = T^-1 I = T^-1.
(c) Let V be the space of polynomials in t over K; say, p(t) = a0 + a1 t + a2 t^2 + ... + as t^s. Let T and R be
    the operators on V defined by
        T(p(t)) = 0 + a1 + a2 t + ... + as t^(s-1)   and   R(p(t)) = a0 t + a1 t^2 + ... + as t^(s+1)
    We have
        (TR)(p(t)) = T(R(p(t))) = T(a0 t + a1 t^2 + ... + as t^(s+1)) = a0 + a1 t + ... + as t^s = p(t)
    and so TR = I, the identity mapping. On the other hand, if k ∈ K and k ≠ 0, then
        (RT)(k) = R(T(k)) = R(0) = 0 ≠ k
    Accordingly, RT ≠ I.
5.42. Let F and G be linear operators on R2
defined by Fðx; yÞ ¼ ð0; xÞ and Gðx; yÞ ¼ ðx; 0Þ. Show that
(a) GF ¼ 0, the zero mapping, but FG 6¼ 0. (b) G2
¼ G.
(a) ðGFÞðx; yÞ ¼ GðFðx; yÞÞ ¼ Gð0; xÞ ¼ ð0; 0Þ. Because GF assigns 0 ¼ ð0; 0Þ to every vector (x; y) in R2
,
it is the zero mapping; that is, GF ¼ 0.
On the other hand, ðFGÞðx; yÞ ¼ FðGðx; yÞÞ ¼ Fðx; 0Þ ¼ ð0; xÞ. For example, ðFGÞð2; 3Þ ¼ ð0; 2Þ.
Thus, FG 6¼ 0, as it does not assign 0 ¼ ð0; 0Þ to every vector in R2
.
(b) For any vector (x; y) in R2
, we have G2
ðx; yÞ ¼ GðGðx; yÞÞ ¼ Gðx; 0Þ ¼ ðx; 0Þ ¼ Gðx; yÞ. Hence, G2
¼ G.
5.43. Find the dimension of (a) AðR4
Þ, (b) AðP2ðtÞÞ, (c) AðM2;3).
Use dim½AðVÞ ¼ n2
where dim V ¼ n. Hence, (a) dim½AðR4
Þ ¼ 42
¼ 16, (b) dim½AðP2ðtÞÞ ¼ 32
¼ 9,
(c) dim½AðM2;3Þ ¼ 62
¼ 36.
5.44. Let E be a linear operator on V for which E^2 = E. (Such an operator is called a projection.) Let U
be the image of E, and let W be the kernel. Prove
(a) If u ∈ U, then E(u) = u (i.e., E is the identity mapping on U).
(b) If E ≠ I, then E is singular—that is, E(v) = 0 for some v ≠ 0.
(c) V = U ⊕ W.
(a) If u ∈ U, the image of E, then E(v) = u for some v ∈ V. Hence, using E^2 = E, we have
        u = E(v) = E^2(v) = E(E(v)) = E(u)
(b) If E ≠ I, then for some v ∈ V, E(v) = u, where v ≠ u. By (a), E(u) = u. Thus,
        E(v - u) = E(v) - E(u) = u - u = 0,   where v - u ≠ 0
(c) We first show that V = U + W. Let v ∈ V. Set u = E(v) and w = v - E(v). Then
        v = E(v) + v - E(v) = u + w
    By definition, u = E(v) ∈ U, the image of E. We now show that w ∈ W, the kernel of E,
        E(w) = E(v - E(v)) = E(v) - E^2(v) = E(v) - E(v) = 0
    and thus w ∈ W. Hence, V = U + W.
    We next show that U ∩ W = {0}. Let v ∈ U ∩ W. Because v ∈ U, E(v) = v by part (a). Because
    v ∈ W, E(v) = 0. Thus, v = E(v) = 0 and so U ∩ W = {0}.
    The above two properties imply that V = U ⊕ W.
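A concrete projection makes the decomposition visible. The following Python/NumPy sketch (an illustrative addition) uses the projection of R^3 onto the xy-plane, the map F of Example 5.7(a):

    import numpy as np

    # E maps (x, y, z) to (x, y, 0), so E^2 = E
    E = np.diag([1., 1., 0.])

    print(np.array_equal(E @ E, E))        # True: E is a projection
    U_dim = np.linalg.matrix_rank(E)       # dim(Im E)  = 2  (the xy-plane)
    W_dim = 3 - U_dim                      # dim(Ker E) = 1  (the z-axis), by rank-nullity
    print(U_dim + W_dim == 3)              # True, consistent with V = U (+) W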
SUPPLEMENTARY PROBLEMS
Mappings
5.45. Determine the number of different mappings from ðaÞ f1; 2g into f1; 2; 3g; ðbÞ f1; 2; ... ; rg into f1; 2; ... ; sg:
5.46. Let f : R ! R and g : R ! R be defined by f ðxÞ ¼ x2
þ 3x þ 1 and gðxÞ ¼ 2x  3. Find formulas defining
the composition mappings: (a) f  g; (b) g  f ; (c) g  g; (d) f  f.
5.47. For each mappings f : R ! R find a formula for its inverse: (a) f ðxÞ ¼ 3x  7, (b) f ðxÞ ¼ x3
þ 2.
5.48. For any mapping f : A ! B, show that 1B  f ¼ f ¼ f  1A.
Linear Mappings
5.49. Show that the following mappings are linear:
(a) F : R3
! R2
defined by Fðx; y; zÞ ¼ ðx þ 2y  3z; 4x  5y þ 6zÞ.
(b) F : R2
! R2
defined by Fðx; yÞ ¼ ðax þ by; cx þ dyÞ, where a, b, c, d belong to R.
5.50. Show that the following mappings are not linear:
(a) F : R2
! R2
defined by Fðx; yÞ ¼ ðx2
; y2
Þ.
(b) F : R3
! R2
defined by Fðx; y; zÞ ¼ ðx þ 1; y þ zÞ.
(c) F : R2
! R2
defined by Fðx; yÞ ¼ ðxy; yÞ.
(d) F : R3
! R2
defined by Fðx; y; zÞ ¼ ðjxj; y þ zÞ.
5.51. Find F(a, b), where the linear map F: R^2 → R^2 is defined by F(1, 2) = (3, -1) and F(0, 1) = (2, 1).
5.52. Find a 2 2 matrix A that maps
(a) ð1; 3ÞT
and ð1; 4ÞT
into ð2; 5ÞT
and ð3; 1ÞT
, respectively.
(b) ð2; 4ÞT
and ð1; 2ÞT
into ð1; 1ÞT
and ð1; 3ÞT
, respectively.
5.53. Find a 2 2 singular matrix B that maps ð1; 1ÞT
into ð1; 3ÞT
.
5.54. Let V be the vector space of real n-square matrices, and let M be a fixed nonzero matrix in V. Show that the
first two of the following mappings T : V ! V are linear, but the third is not:
(a) TðAÞ ¼ MA, (b) TðAÞ ¼ AM þ MA, (c) TðAÞ ¼ M þ A.
5.55. Give an example of a nonlinear map F : R2
! R2
such that F1
ð0Þ ¼ f0g but F is not one-to-one.
5.56. Let F : R2
! R2
be defined by Fðx; yÞ ¼ ð3x þ 5y; 2x þ 3yÞ, and let S be the unit circle in R2
. (S consists
of all points satisfying x2
þ y2
¼ 1.) Find (a) the image FðSÞ, (b) the preimage F1
ðSÞ.
5.57. Consider the linear map G : R3
! R3
defined by Gðx; y; zÞ ¼ ðx þ y þ z; y  2z; y  3zÞ and the unit
sphere S2 in R3
, which consists of the points satisfying x2
þ y2
þ z2
¼ 1. Find (a) GðS2Þ, (b) G1
ðS2Þ.
5.58. Let H be the plane x þ 2y  3z ¼ 4 in R3
and let G be the linear map in Problem 5.57. Find
(a) GðHÞ, (b) G1
ðHÞ.
5.59. Let W be a subspace of V. The inclusion map, denoted by i : W ,! V, is defined by iðwÞ ¼ w for every
w 2 W. Show that the inclusion map is linear.
5.60. Suppose F : V ! U is linear. Show that FðvÞ ¼ FðvÞ.
Kernel and Image of Linear Mappings
5.61. For each linear map F find a basis and the dimension of the kernel and the image of F:
(a) F : R3
! R3
defined by Fðx; y; zÞ ¼ ðx þ 2y  3z; 2x þ 5y  4z; x þ 4y þ zÞ,
(b) F : R4
! R3
defined by Fðx; y; z; tÞ ¼ ðx þ 2y þ 3z þ 2t; 2x þ 4y þ 7z þ 5t; x þ 2y þ 6z þ 5tÞ.
5.62. For each linear map G, find a basis and the dimension of the kernel and the image of G:
(a) G : R3
! R2
defined by Gðx; y; zÞ ¼ ðx þ y þ z; 2x þ 2y þ 2zÞ,
(b) G : R3
! R2
defined by Gðx; y; zÞ ¼ ðx þ y; y þ zÞ,
(c) G : R5
! R3
defined by
Gðx; y; z; s; tÞ ¼ ðx þ 2y þ 2z þ s þ t; x þ 2y þ 3z þ 2s  t; 3x þ 6y þ 8z þ 5s  tÞ:
5.63. Each of the following matrices determines a linear map from R4
into R3
:
(a) A ¼
1 2 0 1
2 1 2 1
1 3 2 2
2
4
3
5, (b) B ¼
1 0 2 1
2 3 1 1
2 0 5 3
2
4
3
5.
Find a basis as well as the dimension of the kernel and the image of each linear map.
5.64. Find a linear mapping F : R3
! R3
whose image is spanned by (1, 2, 3) and (4, 5, 6).
5.65. Find a linear mapping G : R4
! R3
whose kernel is spanned by (1, 2, 3, 4) and (0, 1, 1, 1).
5.66. Let V ¼ P10ðtÞ, the vector space of polynomials of degree 10. Consider the linear map D4
: V ! V, where
D4
denotes the fourth derivative d4
ð f Þ=dt4
. Find a basis and the dimension of
(a) the image of D4
; (b) the kernel of D4
.
5.67. Suppose F : V ! U is linear. Show that (a) the image of any subspace of V is a subspace of U;
(b) the preimage of any subspace of U is a subspace of V.
5.68. Show that if F : V ! U is onto, then dim U  dim V. Determine all linear maps F : R3
! R4
that are onto.
5.69. Consider the zero mapping 0 : V ! U defined by 0ðvÞ ¼ 0; 8 v 2 V. Find the kernel and the image of 0.
Operations with linear Mappings
5.70. Let F : R3
! R2
and G : R3
! R2
be defined by Fðx; y; zÞ ¼ ðy; x þ zÞ and Gðx; y; zÞ ¼ ð2z; x  yÞ. Find
formulas defining the mappings F þ G and 3F  2G.
5.71. Let H : R2
! R2
be defined by Hðx; yÞ ¼ ðy; 2xÞ. Using the maps F and G in Problem 5.70, find formulas
defining the mappings: (a) H  F and H  G, (b) F  H and G  H, (c) H  ðF þ GÞ and H  F þ H  G.
5.72. Show that the following mappings F, G, H are linearly independent:
(a) F; G; H 2 HomðR2
; R2
Þ defined by Fðx; yÞ ¼ ðx; 2yÞ, Gðx; yÞ ¼ ðy; x þ yÞ, Hðx; yÞ ¼ ð0; xÞ,
(b) F; G; H 2 HomðR3
; RÞ defined by Fðx; y; zÞ ¼ x þ y þ z, Gðx; y; zÞ ¼ y þ z, Hðx; y; zÞ ¼ x  z.
5.73. For F; G 2 HomðV; UÞ, show that rankðF þ GÞ  rankðFÞ þ rankðGÞ. (Here V has finite dimension.)
5.74. Let F: V → U and G: U → V be linear. Show that if F and G are nonsingular, then G ∘ F is nonsingular.
Give an example where G ∘ F is nonsingular but G is not. [Hint: Let dim V < dim U.]
5.75. Find the dimension d of (a) HomðR2
; R8
Þ, (b) HomðP4ðtÞ; R3
Þ, (c) HomðM2;4; P2ðtÞÞ.
5.76. Determine whether or not each of the following linear maps is nonsingular. If not, find a nonzero vector v
whose image is 0; otherwise find a formula for the inverse map:
(a) F : R3
! R3
defined by Fðx; y; zÞ ¼ ðx þ y þ z; 2x þ 3y þ 5z; x þ 3y þ 7zÞ,
(b) G : R3
! P2ðtÞ defined by Gðx; y; zÞ ¼ ðx þ yÞt2
þ ðx þ 2y þ 2zÞt þ y þ z,
(c) H : R2
! P2ðtÞ defined by Hðx; yÞ ¼ ðx þ 2yÞt2
þ ðx  yÞt þ x þ y.
5.77. When can dim ½HomðV; UÞ ¼ dim V?
Algebra of Linear Operators
5.78. Let F and G be the linear operators on R^2 defined by F(x, y) = (x + y, 0) and G(x, y) = (-y, x). Find
formulas defining the linear operators: (a) F + G, (b) 5F - 3G, (c) FG, (d) GF, (e) F^2, (f) G^2.
5.79. Show that each linear operator T on R2
is nonsingular and find a formula for T1
, where
(a) Tðx; yÞ ¼ ðx þ 2y; 2x þ 3yÞ, (b) Tðx; yÞ ¼ ð2x  3y; 3x  4yÞ.
5.80. Show that each of the following linear operators T on R3
is nonsingular and find a formula for T1
, where
(a) Tðx; y; zÞ ¼ ðx  3y  2z; y  4z; zÞ; (b) Tðx; y; zÞ ¼ ðx þ z; x  y; yÞ.
5.81. Find the dimension of AðVÞ, where (a) V ¼ R7
, (b) V ¼ P5ðtÞ, (c) V ¼ M3;4.
5.82. Which of the following integers can be the dimension of an algebra AðVÞ of linear maps:
5, 9, 12, 25, 28, 36, 45, 64, 88, 100?
5.83. Let T be the linear operator on R2
defined by Tðx; yÞ ¼ ðx þ 2y; 3x þ 4yÞ. Find a formula for f ðTÞ, where
(a) f ðtÞ ¼ t2
þ 2t  3, (b) f ðtÞ ¼ t2
 5t  2.
Miscellaneous Problems
5.84. Suppose F : V ! U is linear and k is a nonzero scalar. Prove that the maps F and kF have the same kernel
and the same image.
5.85. Suppose F and G are linear operators on V and that F is nonsingular. Assume that V has finite dimension.
Show that rankðFGÞ ¼ rankðGFÞ ¼ rankðGÞ.
5.86. Suppose V has finite dimension. Suppose T is a linear operator on V such that rankðT2
Þ ¼ rankðTÞ. Show
that Ker T  Im T ¼ f0g.
5.87. Suppose V = U ⊕ W. Let E1 and E2 be the linear operators on V defined by E1(v) = u, E2(v) = w, where
v = u + w, u ∈ U, w ∈ W. Show that (a) E1^2 = E1 and E2^2 = E2 (i.e., that E1 and E2 are projections);
(b) E1 + E2 = I, the identity mapping; (c) E1E2 = 0 and E2E1 = 0.
5.88. Let E1 and E2 be linear operators on V satisfying parts (a), (b), (c) of Problem 5.87. Prove
    V = Im E1 ⊕ Im E2
5.89. Let v and w be elements of a real vector space V. The line segment L from v to v þ w is defined to be the set
of vectors v þ tw for 0  t  1. (See Fig. 5.6.)
(a) Show that the line segment L between vectors v and u consists of the points:
(i) ð1  tÞv þ tu for 0  t  1, (ii) t1v þ t2u for t1 þ t2 ¼ 1, t1  0, t2  0.
(b) Let F : V ! U be linear. Show that the image FðLÞ of a line segment L in V is a line segment in U.
Figure 5-6
5.90. Let F : V ! U be linear and let W be a subspace of V. The restriction of F to W is the map FjW : W ! U
defined by FjWðvÞ ¼ FðvÞ for every v in W. Prove the following:
(a) FjW is linear; (b) KerðFjWÞ ¼ ðKer FÞ  W; (c) ImðFjWÞ ¼ FðWÞ.
5.91. A subset X of a vector space V is said to be convex if the line segment L between any two points (vectors)
P; Q 2 X is contained in X. (a) Show that the intersection of convex sets is convex; (b) suppose F : V ! U
is linear and X is convex. Show that FðXÞ is convex.
ANSWERS TO SUPPLEMENTARY PROBLEMS
5.45. ðaÞ 32
¼ 9; ðbÞ sr
5.46. (a) ð f  gÞðxÞ ¼ 4x2
þ 1, (b) ðg  f ÞðxÞ ¼ 2x2
þ 6x  1, (c) ðg  gÞðxÞ ¼ 4x  9,
(d) ð f  f ÞðxÞ ¼ x4
þ 6x3
þ 14x2
þ 15x þ 5
5.47. (a) f^-1(x) = (x + 7)/3,  (b) f^-1(x) = (x - 2)^(1/3)
5.49. F(x, y, z) = A(x, y, z)^T, where (a) A = [ 1 2 -3; 4 -5 6 ],  (b) A = [ a b; c d ]
5.50. (a) u = (2, 2), k = 3; then F(ku) = (36, 36) but kF(u) = (12, 12); (b) F(0) ≠ 0;
(c) u = (1, 2), v = (3, 4); then F(u + v) = (24, 6) but F(u) + F(v) = (14, 6);
(d) u = (1, 2, 3), k = -2; then F(ku) = (2, -10) but kF(u) = (-2, -10).
5.51. F(a, b) = (-a + 2b, -3a + b)
5.52. (a) A ¼
17 5
23 6
 
; (b) None. (2; 4) and (1; 2) are linearly dependent but not (1, 1) and (1, 3).
5.53. B = [ 1 0; 3 0 ]   [Hint: Send (0, 1)^T into (0, 0)^T.]
5.55. Fðx; yÞ ¼ ðx2
; y2
Þ
5.56. (a) 13x^2 - 42xy + 34y^2 = 1,  (b) 13x^2 + 42xy + 34y^2 = 1
5.57. (a) x^2 - 8xy + 26y^2 + 6xz - 38yz + 14z^2 = 1,  (b) x^2 + 2xy + 3y^2 + 2xz - 8yz + 14z^2 = 1
5.58. (a) x - y + 2z = 4,  (b) x + 6z = 4
5.61. (a) dim(Ker F) = 1, {(7, -2, 1)}; dim(Im F) = 2, {(1, 2, 1), (0, 1, 2)};
(b) dim(Ker F) = 2, {(-2, 1, 0, 0), (1, 0, -1, 1)}; dim(Im F) = 2, {(1, 2, 1), (0, 1, 3)}
5.62. (a) dim(Ker G) = 2, {(1, 0, -1), (1, -1, 0)}; dim(Im G) = 1, {(1, 2)};
(b) dim(Ker G) = 1, {(1, -1, 1)}; Im G = R^2, {(1, 0), (0, 1)};
(c) dim(Ker G) = 3, {(-2, 1, 0, 0, 0), (1, 0, -1, 1, 0), (-5, 0, 2, 0, 1)}; dim(Im G) = 2,
{(1, 1, 3), (0, 1, 2)}
5.63. (a) dimðKer AÞ ¼ 2, fð4; 2; 5; 0Þ; ð1; 3; 0; 5Þg; dimðIm AÞ ¼ 2, fð1; 2; 1Þ; ð0; 1; 1Þg;
(b) dimðKer BÞ ¼ 1, fð1; 2
3 ; 1; 1Þg; Im B ¼ R3
5.64. Fðx; y; zÞ ¼ ðx þ 4y; 2x þ 5y; 3x þ 6yÞ
5.65. F(x, y, z, t) = (x + y - z, 2x + y - t, 0)

5.66. (a) {1, t, t^2, ..., t^6}, (b) {1, t, t^2, t^3}

5.68. None, because dim R^4 > dim R^3.

5.69. Ker 0 = V, Im 0 = {0}

5.70. (F + G)(x, y, z) = (y + 2z, 2x - y + z), (3F - 2G)(x, y, z) = (3y - 4z, x + 2y + 3z)

5.71. (a) (H ∘ F)(x, y, z) = (x + z, 2y), (H ∘ G)(x, y, z) = (x - y, 4z); (b) not defined;
(c) (H ∘ (F + G))(x, y, z) = (H ∘ F + H ∘ G)(x, y, z) = (2x - y + z, 2y + 4z)

5.74. F(x, y) = (x, y, y); G(x, y, z) = (x, y)

5.75. (a) 16, (b) 15, (c) 24

5.76. (a) v = (2, 3, 1); (b) G^{-1}(at^2 + bt + c) = (b - 2c, a - b + 2c, a + b - c);
(c) H is nonsingular, but not invertible, because dim P2(t) > dim R^2.

5.77. dim U = 1; that is, U = K.

5.78. (a) (F + G)(x, y) = (x, x); (b) (5F - 3G)(x, y) = (5x + 8y, -3x); (c) (FG)(x, y) = (x - y, 0);
(d) (GF)(x, y) = (0, x + y); (e) F^2(x, y) = (x + y, 0) (note that F^2 = F); (f) G^2(x, y) = (-x, -y).
[Note that G^2 + I = 0; hence, G is a zero of f(t) = t^2 + 1.]

5.79. (a) T^{-1}(x, y) = (-3x + 2y, 2x - y), (b) T^{-1}(x, y) = (-4x + 3y, -3x + 2y)

5.80. (a) T^{-1}(x, y, z) = (x + 3y + 14z, y + 4z, z), (b) T^{-1}(x, y, z) = (y + z, z, x - y - z)

5.81. (a) 49, (b) 36, (c) 144

5.82. Squares: 9, 25, 36, 64, 100

5.83. (a) T(x, y) = (6x + 14y, 21x + 27y); (b) T(x, y) = (0, 0), that is, f(T) = 0
CHAPTER 6

Linear Mappings and Matrices

6.1 Introduction

Consider a basis S = {u1, u2, ..., un} of a vector space V over a field K. For any vector v ∈ V, suppose

    v = a1 u1 + a2 u2 + ... + an un

Then the coordinate vector of v relative to the basis S, which we assume to be a column vector (unless otherwise stated or implied), is denoted and defined by

    [v]_S = [a1, a2, ..., an]^T

Recall (Section 4.11) that the mapping v ↦ [v]_S, determined by the basis S, is an isomorphism between V and K^n.

This chapter shows that there is also an isomorphism, determined by the basis S, between the algebra A(V) of linear operators on V and the algebra M of n-square matrices over K. Thus, every linear mapping F: V → V will correspond to an n-square matrix [F]_S determined by the basis S. We will also show how our matrix representation changes when we choose another basis.
6.2 Matrix Representation of a Linear Operator

Let T be a linear operator (transformation) from a vector space V into itself, and suppose S = {u1, u2, ..., un} is a basis of V. Now T(u1), T(u2), ..., T(un) are vectors in V, and so each is a linear combination of the vectors in the basis S; say,

    T(u1) = a11 u1 + a12 u2 + ... + a1n un
    T(u2) = a21 u1 + a22 u2 + ... + a2n un
    .................................................
    T(un) = an1 u1 + an2 u2 + ... + ann un

The following definition applies.

DEFINITION: The transpose of the above matrix of coefficients, denoted by m_S(T) or [T]_S, is called the matrix representation of T relative to the basis S, or simply the matrix of T in the basis S. (The subscript S may be omitted if the basis S is understood.)

Using the coordinate (column) vector notation, the matrix representation of T may be written in the form

    m_S(T) = [T]_S = [ [T(u1)]_S, [T(u2)]_S, ..., [T(un)]_S ]

That is, the columns of m(T) are the coordinate vectors of T(u1), T(u2), ..., T(un), respectively.
EXAMPLE 6.1  Let F: R^2 → R^2 be the linear operator defined by F(x, y) = (2x + 3y, 4x - 5y).

(a) Find the matrix representation of F relative to the basis S = {u1, u2} = {(1, 2), (2, 5)}.

(1) First find F(u1), and then write it as a linear combination of the basis vectors u1 and u2. (For notational convenience, we use column vectors.) We have

    F(u1) = F(1, 2)^T = (8, -6)^T = x(1, 2)^T + y(2, 5)^T,    and so    x + 2y = 8, 2x + 5y = -6

Solve the system to obtain x = 52, y = -22. Hence, F(u1) = 52u1 - 22u2.

(2) Next find F(u2), and then write it as a linear combination of u1 and u2:

    F(u2) = F(2, 5)^T = (19, -17)^T = x(1, 2)^T + y(2, 5)^T,    and so    x + 2y = 19, 2x + 5y = -17

Solve the system to get x = 129, y = -55. Thus, F(u2) = 129u1 - 55u2.

Now write the coordinates of F(u1) and F(u2) as columns to obtain the matrix

    [F]_S = [52, 129; -22, -55]

(b) Find the matrix representation of F relative to the (usual) basis E = {e1, e2} = {(1, 0), (0, 1)}.

Find F(e1) and write it as a linear combination of the usual basis vectors e1 and e2, and then do the same for F(e2). We have

    F(e1) = F(1, 0) = (2, 4) = 2e1 + 4e2
    F(e2) = F(0, 1) = (3, -5) = 3e1 - 5e2
    and so    [F]_E = [2, 3; 4, -5]

Note that the coordinates of F(e1) and F(e2) form the columns, not the rows, of [F]_E. Also, note that the arithmetic is much simpler using the usual basis of R^2.
EXAMPLE 6.2  Let V be the vector space of functions with basis S = {sin t, cos t, e^{3t}}, and let D: V → V be the differential operator defined by D(f(t)) = d(f(t))/dt. We compute the matrix representing D in the basis S:

    D(sin t) = cos t   =  0(sin t) + 1(cos t) + 0(e^{3t})
    D(cos t) = -sin t  = -1(sin t) + 0(cos t) + 0(e^{3t})
    D(e^{3t}) = 3e^{3t} =  0(sin t) + 0(cos t) + 3(e^{3t})
    and so    [D] = [0, -1, 0; 1, 0, 0; 0, 0, 3]

Note that the coordinates of D(sin t), D(cos t), D(e^{3t}) form the columns, not the rows, of [D].
Matrix Mappings and Their Matrix Representation

Consider the following matrix A, which may be viewed as a linear operator on R^2, and basis S of R^2:

    A = [3, -2; 4, -5]    and    S = {u1, u2} = {(1, 2)^T, (2, 5)^T}

(We write vectors as columns, because our map is a matrix.) We find the matrix representation of A relative to the basis S.

(1) First we write A(u1) as a linear combination of u1 and u2. We have

    A(u1) = [3, -2; 4, -5](1, 2)^T = (-1, -6)^T = x(1, 2)^T + y(2, 5)^T,    and so    x + 2y = -1, 2x + 5y = -6

Solving the system yields x = 7, y = -4. Thus, A(u1) = 7u1 - 4u2.

(2) Next we write A(u2) as a linear combination of u1 and u2. We have

    A(u2) = [3, -2; 4, -5](2, 5)^T = (-4, -17)^T = x(1, 2)^T + y(2, 5)^T,    and so    x + 2y = -4, 2x + 5y = -17

Solving the system yields x = 14, y = -9. Thus, A(u2) = 14u1 - 9u2. Writing the coordinates of A(u1) and A(u2) as columns gives us the following matrix representation of A:

    [A]_S = [7, 14; -4, -9]
Remark: Suppose we want to find the matrix representation of A relative to the usual basis E = {e1, e2} = {[1, 0]^T, [0, 1]^T} of R^2. We have

    A(e1) = [3, -2; 4, -5](1, 0)^T = (3, 4)^T = 3e1 + 4e2
    A(e2) = [3, -2; 4, -5](0, 1)^T = (-2, -5)^T = -2e1 - 5e2
    and so    [A]_E = [3, -2; 4, -5]

Note that [A]_E is the original matrix A. This result is true in general:

    The matrix representation of any n × n square matrix A over a field K relative to the usual basis E of K^n is the matrix A itself; that is,

    [A]_E = A
Algorithm for Finding Matrix Representations

Next follows an algorithm for finding matrix representations. The first step, Step 0, is optional. It may be useful in Step 1(b), which is repeated for each basis vector.

ALGORITHM 6.1: The input is a linear operator T on a vector space V and a basis S = {u1, u2, ..., un} of V. The output is the matrix representation [T]_S.

Step 0. Find a formula for the coordinates of an arbitrary vector v relative to the basis S.
Step 1. Repeat for each basis vector uk in S:
    (a) Find T(uk).
    (b) Write T(uk) as a linear combination of the basis vectors u1, u2, ..., un.
Step 2. Form the matrix [T]_S whose columns are the coordinate vectors found in Step 1(b).
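For operators on R^n, Algorithm 6.1 can be carried out numerically. The following is a minimal sketch, assuming NumPy; the helper name matrix_rep and the way the operator is passed in are illustrative, not part of the text.

```python
# A minimal sketch of Algorithm 6.1 for operators on R^n, assuming NumPy.
import numpy as np

def matrix_rep(T, basis):
    """Return [T]_S for a linear operator T: R^n -> R^n and a basis S of R^n.

    T     : a callable taking and returning length-n NumPy arrays
    basis : a list of n linearly independent length-n arrays (the u_k)
    """
    U = np.column_stack(basis)                 # columns are u_1, ..., u_n
    cols = []
    for u_k in basis:                          # Step 1(a): compute T(u_k)
        coords = np.linalg.solve(U, T(u_k))    # Step 1(b): write T(u_k) in the basis S
        cols.append(coords)
    return np.column_stack(cols)               # Step 2: coordinate vectors become columns

# Data of Example 6.1: F(x, y) = (2x + 3y, 4x - 5y), S = {(1, 2), (2, 5)}
F = lambda v: np.array([2*v[0] + 3*v[1], 4*v[0] - 5*v[1]])
S = [np.array([1.0, 2.0]), np.array([2.0, 5.0])]
print(matrix_rep(F, S))                        # expected [[52, 129], [-22, -55]]
```

Solving U x = T(u_k) is exactly Step 1(b): it expresses T(u_k) in the basis whose vectors are the columns of U.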
EXAMPLE 6.3  Let F: R^2 → R^2 be defined by F(x, y) = (2x + 3y, 4x - 5y). Find the matrix representation [F]_S of F relative to the basis S = {u1, u2} = {(1, -2), (2, -5)}.

(Step 0) First find the coordinates of (a, b) ∈ R^2 relative to the basis S. We have

    (a, b)^T = x(1, -2)^T + y(2, -5)^T,    or    x + 2y = a, -2x - 5y = b,    or    x + 2y = a, -y = 2a + b

Solving for x and y in terms of a and b yields x = 5a + 2b, y = -2a - b. Thus,

    (a, b) = (5a + 2b)u1 + (-2a - b)u2

(Step 1) Now we find F(u1) and write it as a linear combination of u1 and u2 using the above formula for (a, b), and then we repeat the process for F(u2). We have

    F(u1) = F(1, -2) = (-4, 14) = 8u1 - 6u2
    F(u2) = F(2, -5) = (-11, 33) = 11u1 - 11u2

(Step 2) Finally, we write the coordinates of F(u1) and F(u2) as columns to obtain the required matrix:

    [F]_S = [8, 11; -6, -11]
Properties of Matrix Representations

This subsection gives the main properties of the matrix representations of linear operators T on a vector space V. We emphasize that we are always given a particular basis S of V.

Our first theorem, proved in Problem 6.9, tells us that the "action" of a linear operator T on a vector v is preserved by its matrix representation.

THEOREM 6.1: Let T: V → V be a linear operator, and let S be a (finite) basis of V. Then, for any vector v in V, [T]_S[v]_S = [T(v)]_S.

EXAMPLE 6.4  Consider the linear operator F on R^2 and the basis S of Example 6.3; that is,

    F(x, y) = (2x + 3y, 4x - 5y)    and    S = {u1, u2} = {(1, -2), (2, -5)}

Let

    v = (5, -7),    and so    F(v) = (-11, 55)

Using the formula from Example 6.3, we get

    [v] = [11, -3]^T    and    [F(v)] = [55, -33]^T

We verify Theorem 6.1 for this vector v (where [F] is obtained from Example 6.3):

    [F][v] = [8, 11; -6, -11][11; -3] = [55; -33] = [F(v)]
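The same check can be done in two lines of code. This is a sketch assuming NumPy; the variable names are illustrative.

```python
# A quick check of Theorem 6.1 with the data of Example 6.4.
import numpy as np

F_S = np.array([[8., 11.], [-6., -11.]])   # [F]_S from Example 6.3
v_S = np.array([11., -3.])                 # [v]_S for v = (5, -7)
print(F_S @ v_S)                           # expected [55, -33], i.e., [F(v)]_S
```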
Given a basis S of a vector space V, we have associated a matrix [T] to each linear operator T in the algebra A(V) of linear operators on V. Theorem 6.1 tells us that the "action" of an individual linear operator T is preserved by this representation. The next two theorems (proved in Problems 6.10 and 6.11) tell us that the three basic operations in A(V) with these operators, namely (i) addition, (ii) scalar multiplication, and (iii) composition, are also preserved.

THEOREM 6.2: Let V be an n-dimensional vector space over K, let S be a basis of V, and let M be the algebra of n × n matrices over K. Then the mapping

    m: A(V) → M    defined by    m(T) = [T]_S

is a vector space isomorphism. That is, for any F, G ∈ A(V) and any k ∈ K,
(i) m(F + G) = m(F) + m(G), or [F + G] = [F] + [G]
(ii) m(kF) = km(F), or [kF] = k[F]
(iii) m is bijective (one-to-one and onto).
THEOREM 6.3: For any linear operators F, G ∈ A(V),

    m(G ∘ F) = m(G)m(F),    or    [G ∘ F] = [G][F]

(Here G ∘ F denotes the composition of the maps G and F.)
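Theorems 6.2 and 6.3 can be confirmed numerically for operators on R^2 given by matrices. The sketch below assumes NumPy; the basis and the random operators are illustrative.

```python
# A numerical check that representation respects sums, scalar multiples, and composition.
import numpy as np

rng = np.random.default_rng(0)
U = np.array([[1., 2.], [2., 5.]])              # basis vectors of S as columns (invertible)
A = rng.integers(-3, 4, (2, 2)).astype(float)   # operator F(v) = Av in the usual basis
B = rng.integers(-3, 4, (2, 2)).astype(float)   # operator G(v) = Bv in the usual basis

rep = lambda M: np.linalg.solve(U, M @ U)       # [T]_S = U^{-1} M U for T(v) = Mv

print(np.allclose(rep(A + B), rep(A) + rep(B)))   # True: [F + G] = [F] + [G]
print(np.allclose(rep(3 * A), 3 * rep(A)))        # True: [kF] = k[F]
print(np.allclose(rep(A @ B), rep(A) @ rep(B)))   # True: [G o F] = [G][F]
```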
6.3 Change of Basis

Let V be an n-dimensional vector space over a field K. We have shown that once we have selected a basis S of V, every vector v ∈ V can be represented by means of an n-tuple [v]_S in K^n, and every linear operator T in A(V) can be represented by an n × n matrix over K. We ask the following natural question:

    How do our representations change if we select another basis?

In order to answer this question, we first need a definition.

DEFINITION: Let S = {u1, u2, ..., un} be a basis of a vector space V, and let S' = {v1, v2, ..., vn} be another basis. (For reference, we will call S the "old" basis and S' the "new" basis.) Because S is a basis, each vector in the "new" basis S' can be written uniquely as a linear combination of the vectors in S; say,

    v1 = a11 u1 + a12 u2 + ... + a1n un
    v2 = a21 u1 + a22 u2 + ... + a2n un
    .............................................
    vn = an1 u1 + an2 u2 + ... + ann un

Let P be the transpose of the above matrix of coefficients; that is, let P = [p_ij], where p_ij = a_ji. Then P is called the change-of-basis matrix (or transition matrix) from the "old" basis S to the "new" basis S'.

The following remarks are in order.

Remark 1: The above change-of-basis matrix P may also be viewed as the matrix whose columns are, respectively, the coordinate column vectors of the "new" basis vectors vi relative to the "old" basis S; namely,

    P = [ [v1]_S, [v2]_S, ..., [vn]_S ]

Remark 2: Analogously, there is a change-of-basis matrix Q from the "new" basis S' to the "old" basis S. Similarly, Q may be viewed as the matrix whose columns are, respectively, the coordinate column vectors of the "old" basis vectors ui relative to the "new" basis S'; namely,

    Q = [ [u1]_{S'}, [u2]_{S'}, ..., [un]_{S'} ]

Remark 3: Because the vectors v1, v2, ..., vn in the new basis S' are linearly independent, the matrix P is invertible (Problem 6.18). Similarly, Q is invertible. In fact, we have the following proposition (proved in Problem 6.18).

PROPOSITION 6.4: Let P and Q be the above change-of-basis matrices. Then Q = P^{-1}.

Now suppose S = {u1, u2, ..., un} is a basis of a vector space V, and suppose P = [p_ij] is any nonsingular matrix. Then the n vectors

    vi = p_{1i} u1 + p_{2i} u2 + ... + p_{ni} un,    i = 1, 2, ..., n

corresponding to the columns of P, are linearly independent [Problem 6.21(a)]. Thus, they form another basis S' of V. Moreover, P will be the change-of-basis matrix from S to the new basis S'.
EXAMPLE 6.5  Consider the following two bases of R^2:

    S = {u1, u2} = {(1, 2), (3, 5)}    and    S' = {v1, v2} = {(1, -1), (1, -2)}

(a) Find the change-of-basis matrix P from S to the "new" basis S'.

Write each of the new basis vectors of S' as a linear combination of the original basis vectors u1 and u2 of S. We have

    (1, -1)^T = x(1, 2)^T + y(3, 5)^T,    or    x + 3y = 1, 2x + 5y = -1,    yielding x = -8, y = 3
    (1, -2)^T = x(1, 2)^T + y(3, 5)^T,    or    x + 3y = 1, 2x + 5y = -2,    yielding x = -11, y = 4

Thus,

    v1 = -8u1 + 3u2,  v2 = -11u1 + 4u2,    and hence    P = [-8, -11; 3, 4]

Note that the coordinates of v1 and v2 are the columns, not rows, of the change-of-basis matrix P.

(b) Find the change-of-basis matrix Q from the "new" basis S' back to the "old" basis S.

Here we write each of the "old" basis vectors u1 and u2 of S as a linear combination of the "new" basis vectors v1 and v2 of S'. This yields

    u1 = 4v1 - 3v2,  u2 = 11v1 - 8v2,    and hence    Q = [4, 11; -3, -8]

As expected from Proposition 6.4, Q = P^{-1}. (In fact, we could have obtained Q by simply finding P^{-1}.)
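Both change-of-basis matrices can be produced mechanically: each column of P solves a small linear system. The sketch below assumes NumPy and uses the data of Example 6.5; the variable names are illustrative.

```python
# Change-of-basis matrices for Example 6.5: columns of P are the S-coordinates of S'.
import numpy as np

S  = np.column_stack([(1., 2.), (3., 5.)])     # old basis vectors as columns
S2 = np.column_stack([(1., -1.), (1., -2.)])   # new basis vectors as columns

P = np.linalg.solve(S, S2)                     # column i solves S x = v_i, i.e., [v_i]_S
Q = np.linalg.inv(P)
print(P)                                       # expected [[-8, -11], [3, 4]]
print(np.allclose(Q, np.linalg.solve(S2, S)))  # Q is the change of basis back to S: True
```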
EXAMPLE 6.6  Consider the following two bases of R^3:

    E = {e1, e2, e3} = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}
    and    S = {u1, u2, u3} = {(1, 0, 1), (2, 1, 2), (1, 2, 2)}

(a) Find the change-of-basis matrix P from the basis E to the basis S.

Because E is the usual basis, we can immediately write each basis element of S as a linear combination of the basis elements of E. Specifically,

    u1 = (1, 0, 1) = e1 + e3
    u2 = (2, 1, 2) = 2e1 + e2 + 2e3
    u3 = (1, 2, 2) = e1 + 2e2 + 2e3
    and hence    P = [1, 2, 1; 0, 1, 2; 1, 2, 2]

Again, the coordinates of u1, u2, u3 appear as the columns in P. Observe that P is simply the matrix whose columns are the basis vectors of S. This is true only because the original basis was the usual basis E.

(b) Find the change-of-basis matrix Q from the basis S to the basis E.

The definition of the change-of-basis matrix Q tells us to write each of the (usual) basis vectors in E as a linear combination of the basis elements of S. This yields

    e1 = (1, 0, 0) = -2u1 + 2u2 - u3
    e2 = (0, 1, 0) = -2u1 + u2
    e3 = (0, 0, 1) = 3u1 - 2u2 + u3
    and hence    Q = [-2, -2, 3; 2, 1, -2; -1, 0, 1]

We emphasize that to find Q, we need to solve three 3 × 3 systems of linear equations, one 3 × 3 system for each of e1, e2, e3.
Alternatively, we can find Q = P^{-1} by forming the matrix M = [P, I] and row reducing M to row canonical form:

    M = [1, 2, 1 | 1, 0, 0; 0, 1, 2 | 0, 1, 0; 1, 2, 2 | 0, 0, 1]
      ~ [1, 0, 0 | -2, -2, 3; 0, 1, 0 | 2, 1, -2; 0, 0, 1 | -1, 0, 1] = [I, P^{-1}]

    thus,    Q = P^{-1} = [-2, -2, 3; 2, 1, -2; -1, 0, 1]

(Here we have used the fact that Q is the inverse of P.)
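The row reduction of [P, I] can be coded directly. Below is a bare-bones Gauss-Jordan sketch, assuming NumPy; partial pivoting is added for numerical robustness, and the function name is illustrative.

```python
# Row reduce M = [P, I] to [I, P^{-1}], as in the alternative method above.
import numpy as np

def inverse_by_row_reduction(P):
    n = P.shape[0]
    M = np.hstack([P.astype(float), np.eye(n)])   # the augmented matrix [P, I]
    for j in range(n):
        p = j + np.argmax(np.abs(M[j:, j]))       # pivot row for column j
        M[[j, p]] = M[[p, j]]                     # swap it into place
        M[j] /= M[j, j]                           # scale the pivot row
        for i in range(n):                        # clear the rest of column j
            if i != j:
                M[i] -= M[i, j] * M[j]
    return M[:, n:]                               # the right half is now P^{-1}

P = np.array([[1., 2., 1.], [0., 1., 2.], [1., 2., 2.]])
print(inverse_by_row_reduction(P))   # expected [[-2, -2, 3], [2, 1, -2], [-1, 0, 1]]
```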
The result in Example 6.6(a) is true in general. We state this result formally, because it occurs often.

PROPOSITION 6.5: The change-of-basis matrix from the usual basis E of K^n to any basis S of K^n is the matrix P whose columns are, respectively, the basis vectors of S.

Applications of Change-of-Basis Matrix

First we show how a change of basis affects the coordinates of a vector in a vector space V. The following theorem is proved in Problem 6.22.

THEOREM 6.6: Let P be the change-of-basis matrix from a basis S to a basis S' in a vector space V. Then, for any vector v ∈ V, we have

    P[v]_{S'} = [v]_S,    and hence,    P^{-1}[v]_S = [v]_{S'}

Namely, if we multiply the coordinates of v in the original basis S by P^{-1}, we get the coordinates of v in the new basis S'.

Remark 1: Although P is called the change-of-basis matrix from the old basis S to the new basis S', we emphasize that it is P^{-1} that transforms the coordinates of v in the original basis S into the coordinates of v in the new basis S'.

Remark 2: Because of the above theorem, many texts call Q = P^{-1}, not P, the transition matrix from the old basis S to the new basis S'. Some texts also refer to Q as the change-of-coordinates matrix.
We now give the proof of the above theorem for the special case that dim V = 3. Suppose P is the change-of-basis matrix from the basis S = {u1, u2, u3} to the basis S' = {v1, v2, v3}; say,

    v1 = a1 u1 + a2 u2 + a3 u3
    v2 = b1 u1 + b2 u2 + b3 u3
    v3 = c1 u1 + c2 u2 + c3 u3
    and hence    P = [a1, b1, c1; a2, b2, c2; a3, b3, c3]

Now suppose v ∈ V and, say, v = k1 v1 + k2 v2 + k3 v3. Then, substituting for v1, v2, v3 from above, we obtain

    v = k1(a1 u1 + a2 u2 + a3 u3) + k2(b1 u1 + b2 u2 + b3 u3) + k3(c1 u1 + c2 u2 + c3 u3)
      = (a1 k1 + b1 k2 + c1 k3)u1 + (a2 k1 + b2 k2 + c2 k3)u2 + (a3 k1 + b3 k2 + c3 k3)u3

Thus,

    [v]_{S'} = [k1, k2, k3]^T    and    [v]_S = [a1 k1 + b1 k2 + c1 k3, a2 k1 + b2 k2 + c2 k3, a3 k1 + b3 k2 + c3 k3]^T

Accordingly,

    P[v]_{S'} = [a1, b1, c1; a2, b2, c2; a3, b3, c3][k1; k2; k3]
              = [a1 k1 + b1 k2 + c1 k3; a2 k1 + b2 k2 + c2 k3; a3 k1 + b3 k2 + c3 k3] = [v]_S

Finally, multiplying the equation [v]_S = P[v]_{S'} by P^{-1}, we get

    P^{-1}[v]_S = P^{-1}P[v]_{S'} = I[v]_{S'} = [v]_{S'}
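The two identities of Theorem 6.6 are easy to confirm numerically. The sketch below assumes NumPy and reuses the bases of Example 6.6; the names are illustrative.

```python
# P carries S'-coordinates to S-coordinates; here S is the usual basis E of R^3.
import numpy as np

P = np.array([[1., 2., 1.], [0., 1., 2.], [1., 2., 2.]])  # change of basis from E to S
v_E = np.array([1., 3., 5.])                              # v = (1, 3, 5) in the usual basis
v_S = np.linalg.solve(P, v_E)                             # P^{-1}[v]_E = [v]_S
print(v_S)                                                # expected [7, -5, 4]
print(P @ v_S)                                            # P[v]_S = [v]_E = [1, 3, 5]
```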
The next theorem (proved in Problem 6.26) shows how a change of basis affects the matrix representation of a linear operator.

THEOREM 6.7: Let P be the change-of-basis matrix from a basis S to a basis S' in a vector space V. Then, for any linear operator T on V,

    [T]_{S'} = P^{-1}[T]_S P

That is, if A and B are the matrix representations of T relative, respectively, to S and S', then

    B = P^{-1}AP
EXAMPLE 6.7  Consider the following two bases of R^3:

    E = {e1, e2, e3} = {(1, 0, 0), (0, 1, 0), (0, 0, 1)}
    and    S = {u1, u2, u3} = {(1, 0, 1), (2, 1, 2), (1, 2, 2)}

The change-of-basis matrix P from E to S and its inverse P^{-1} were obtained in Example 6.6.

(a) Write v = (1, 3, 5) as a linear combination of u1, u2, u3, or, equivalently, find [v]_S.

One way to do this is to directly solve the vector equation v = xu1 + yu2 + zu3; that is,

    (1, 3, 5)^T = x(1, 0, 1)^T + y(2, 1, 2)^T + z(1, 2, 2)^T,    or    x + 2y + z = 1, y + 2z = 3, x + 2y + 2z = 5

The solution is x = 7, y = -5, z = 4, so v = 7u1 - 5u2 + 4u3.

On the other hand, we know that [v]_E = [1, 3, 5]^T, because E is the usual basis, and we already know P^{-1}. Therefore, by Theorem 6.6,

    [v]_S = P^{-1}[v]_E = [-2, -2, 3; 2, 1, -2; -1, 0, 1][1; 3; 5] = [7; -5; 4]

Thus, again, v = 7u1 - 5u2 + 4u3.

(b) Let A = [1, 3, -2; 2, -4, 1; 3, -1, 2], which may be viewed as a linear operator on R^3. Find the matrix B that represents A relative to the basis S.

The definition of the matrix representation of A relative to the basis S tells us to write each of A(u1), A(u2), A(u3) as a linear combination of the basis vectors u1, u2, u3 of S. This yields

    A(u1) = (-1, 3, 5) = 11u1 - 9u2 + 6u3
    A(u2) = (1, 2, 9) = 21u1 - 14u2 + 8u3
    A(u3) = (3, -4, 5) = 17u1 - 8u2 + 2u3
    and hence    B = [11, 21, 17; -9, -14, -8; 6, 8, 2]

We emphasize that to find B, we need to solve three 3 × 3 systems of linear equations, one 3 × 3 system for each of A(u1), A(u2), A(u3).

On the other hand, because we know P and P^{-1}, we can use Theorem 6.7. That is,

    B = P^{-1}AP = [-2, -2, 3; 2, 1, -2; -1, 0, 1][1, 3, -2; 2, -4, 1; 3, -1, 2][1, 2, 1; 0, 1, 2; 1, 2, 2]
      = [11, 21, 17; -9, -14, -8; 6, 8, 2]

This, as expected, gives the same result.
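The computation B = P^{-1}AP in part (b) takes one line with a numerical library. This is a sketch assuming NumPy, with the matrices as reconstructed above; the names are illustrative.

```python
# Verifying Example 6.7(b): B = P^{-1} A P.
import numpy as np

A = np.array([[1., 3., -2.], [2., -4., 1.], [3., -1., 2.]])
P = np.array([[1., 2., 1.], [0., 1., 2.], [1., 2., 2.]])
B = np.linalg.inv(P) @ A @ P
print(np.round(B))   # expected [[11, 21, 17], [-9, -14, -8], [6, 8, 2]]
```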
6.4 Similarity

Suppose A and B are square matrices for which there exists an invertible matrix P such that B = P^{-1}AP; then B is said to be similar to A, or B is said to be obtained from A by a similarity transformation. We show (Problem 6.29) that similarity of matrices is an equivalence relation.

By Theorem 6.7 and the above remark, we have the following basic result.

THEOREM 6.8: Two matrices represent the same linear operator if and only if the matrices are similar.

That is, all the matrix representations of a linear operator T form an equivalence class of similar matrices.

A linear operator T is said to be diagonalizable if there exists a basis S of V such that T is represented by a diagonal matrix; the basis S is then said to diagonalize T. The preceding theorem gives us the following result.

THEOREM 6.9: Let A be the matrix representation of a linear operator T. Then T is diagonalizable if and only if there exists an invertible matrix P such that P^{-1}AP is a diagonal matrix.

That is, T is diagonalizable if and only if its matrix representation can be diagonalized by a similarity transformation.

We emphasize that not every operator is diagonalizable. However, we will show (Chapter 10) that every linear operator can be represented by certain "standard" matrices called its normal or canonical forms. Such a discussion will require some theory of fields, polynomials, and determinants.

Functions and Similar Matrices

Suppose f is a function on square matrices that assigns the same value to similar matrices; that is, f(A) = f(B) whenever A is similar to B. Then f induces a function, also denoted by f, on linear operators T in the following natural way. We define

    f(T) = f([T]_S)

where S is any basis. By Theorem 6.8, the function is well defined.

The determinant (Chapter 8) is perhaps the most important example of such a function. The trace (Section 2.7) is another important example of such a function.
EXAMPLE 6.8  Consider the following linear operator F and bases E and S of R^2:

    F(x, y) = (2x + 3y, 4x - 5y),    E = {(1, 0), (0, 1)},    S = {(1, 2), (2, 5)}

By Example 6.1, the matrix representations of F relative to the bases E and S are, respectively,

    A = [2, 3; 4, -5]    and    B = [52, 129; -22, -55]

Using matrix A, we have
(i) Determinant of F = det(A) = -10 - 12 = -22;  (ii) Trace of F = tr(A) = 2 - 5 = -3.

On the other hand, using matrix B, we have
(i) Determinant of F = det(B) = -2860 + 2838 = -22;  (ii) Trace of F = tr(B) = 52 - 55 = -3.

As expected, both matrices yield the same result.
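The invariance can be checked directly on the two representations. A short sketch assuming NumPy, with the data of Example 6.8:

```python
# Trace and determinant agree for any two similar representations of F.
import numpy as np

A = np.array([[2., 3.], [4., -5.]])          # [F]_E
B = np.array([[52., 129.], [-22., -55.]])    # [F]_S
print(np.trace(A), np.trace(B))              # both -3
print(np.linalg.det(A), np.linalg.det(B))    # both -22, up to round-off
```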
6.5 Matrices and General Linear Mappings

Last, we consider the general case of linear mappings from one vector space into another. Suppose V and U are vector spaces over the same field K and, say, dim V = m and dim U = n. Furthermore, suppose

    S = {v1, v2, ..., vm}    and    S' = {u1, u2, ..., un}

are arbitrary but fixed bases, respectively, of V and U.

Suppose F: V → U is a linear mapping. Then the vectors F(v1), F(v2), ..., F(vm) belong to U, and so each is a linear combination of the basis vectors in S'; say,

    F(v1) = a11 u1 + a12 u2 + ... + a1n un
    F(v2) = a21 u1 + a22 u2 + ... + a2n un
    ..................................................
    F(vm) = am1 u1 + am2 u2 + ... + amn un

DEFINITION: The transpose of the above matrix of coefficients, denoted by m_{S,S'}(F) or [F]_{S,S'}, is called the matrix representation of F relative to the bases S and S'. [We will use the simple notation m(F) and [F] when the bases are understood.]

The following theorem is analogous to Theorem 6.1 for linear operators (Problem 6.67).

THEOREM 6.10: For any vector v ∈ V, [F]_{S,S'}[v]_S = [F(v)]_{S'}.

That is, multiplying the coordinates of v in the basis S of V by [F], we obtain the coordinates of F(v) in the basis S' of U.

Recall that for any vector spaces V and U, the collection of all linear mappings from V into U is a vector space and is denoted by Hom(V, U). The following theorem is analogous to Theorem 6.2 for linear operators, where now we let M = M_{m,n} denote the vector space of all m × n matrices (Problem 6.67).

THEOREM 6.11: The mapping m: Hom(V, U) → M defined by m(F) = [F] is a vector space isomorphism. That is, for any F, G ∈ Hom(V, U) and any scalar k,
(i) m(F + G) = m(F) + m(G), or [F + G] = [F] + [G]
(ii) m(kF) = km(F), or [kF] = k[F]
(iii) m is bijective (one-to-one and onto).
Our next theorem is analogous to Theorem 6.3 for linear operators (Problem 6.67).

THEOREM 6.12: Let S, S', S'' be bases of vector spaces V, U, W, respectively. Let F: V → U and G: U → W be linear mappings. Then

    [G ∘ F]_{S,S''} = [G]_{S',S''}[F]_{S,S'}

That is, relative to the appropriate bases, the matrix representation of the composition of two mappings is the matrix product of the matrix representations of the individual mappings.

Next we show how the matrix representation of a linear mapping F: V → U is affected when new bases are selected (Problem 6.67).

THEOREM 6.13: Let P be the change-of-basis matrix from a basis e to a basis e' in V, and let Q be the change-of-basis matrix from a basis f to a basis f' in U. Then, for any linear map F: V → U,

    [F]_{e',f'} = Q^{-1}[F]_{e,f} P

In other words, if A is the matrix representation of a linear mapping F relative to the bases e and f, and B is the matrix representation of F relative to the bases e' and f', then

    B = Q^{-1}AP

Our last theorem, proved in Problem 6.36, shows that any linear mapping from one vector space V into another vector space U can be represented by a very simple matrix. We note that this theorem is analogous to Theorem 3.18 for m × n matrices.

THEOREM 6.14: Let F: V → U be linear and, say, rank(F) = r. Then there exist bases of V and of U such that the matrix representation of F has the form

    A = [I_r, 0; 0, 0]

where I_r is the r-square identity matrix.

The above matrix A is called the normal or canonical form of the linear map F.
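Bases that put a map into this block form can also be produced numerically. The following is a minimal sketch, assuming NumPy, which uses the singular value decomposition only as a convenient way to pick a kernel basis and its complement; the function name and tolerance are illustrative, not part of the text.

```python
# Given the matrix A of F: R^m -> R^n in any bases, build bases in which F is [[I_r, 0], [0, 0]].
import numpy as np

def normal_form_bases(A, tol=1e-10):
    """Return (P, Q, B) with B = Q^{-1} A P equal to the block form, up to round-off."""
    n, m = A.shape
    U, s, Vt = np.linalg.svd(A)
    r = int(np.sum(s > tol))             # rank(F)
    P = Vt.T.copy()                      # domain basis: last m - r columns span Ker F
    P[:, :r] = P[:, :r] / s[:r]          # scale so that A v_i is a unit vector u_i
    Q = U                                # codomain basis: u_i = A v_i, then an extension
    B = np.linalg.inv(Q) @ A @ P
    return P, Q, np.round(B, 10)

A = np.array([[1., 2., 3.], [2., 4., 6.]])   # a rank-1 map from R^3 to R^2
P, Q, B = normal_form_bases(A)
print(B)                                      # expected [[1, 0, 0], [0, 0, 0]]
```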
SOLVED PROBLEMS
Matrix Representation of Linear Operators
6.1. Consider the linear mapping F: R^2 → R^2 defined by F(x, y) = (3x + 4y, 2x - 5y) and the following bases of R^2:

    E = {e1, e2} = {(1, 0), (0, 1)}    and    S = {u1, u2} = {(1, 2), (2, 3)}

(a) Find the matrix A representing F relative to the basis E.
(b) Find the matrix B representing F relative to the basis S.

(a) Because E is the usual basis, the rows of A are simply the coefficients in the components of F(x, y); that is, using (a, b) = ae1 + be2, we have

    F(e1) = F(1, 0) = (3, 2) = 3e1 + 2e2
    F(e2) = F(0, 1) = (4, -5) = 4e1 - 5e2
    and so    A = [3, 4; 2, -5]

Note that the coefficients of the basis vectors are written as columns in the matrix representation.
(b) First find F(u1) and write it as a linear combination of the basis vectors u1 and u2. We have

    F(u1) = F(1, 2) = (11, -8) = x(1, 2) + y(2, 3),    and so    x + 2y = 11, 2x + 3y = -8

Solve the system to obtain x = -49, y = 30. Therefore, F(u1) = -49u1 + 30u2.

Next find F(u2) and write it as a linear combination of the basis vectors u1 and u2. We have

    F(u2) = F(2, 3) = (18, -11) = x(1, 2) + y(2, 3),    and so    x + 2y = 18, 2x + 3y = -11

Solve for x and y to obtain x = -76, y = 47. Hence, F(u2) = -76u1 + 47u2.

Write the coefficients of u1 and u2 as columns to obtain B = [-49, -76; 30, 47].

(b') Alternatively, one can first find the coordinates of an arbitrary vector (a, b) in R^2 relative to the basis S. We have

    (a, b) = x(1, 2) + y(2, 3) = (x + 2y, 2x + 3y),    and so    x + 2y = a, 2x + 3y = b

Solve for x and y in terms of a and b to get x = -3a + 2b, y = 2a - b. Thus,

    (a, b) = (-3a + 2b)u1 + (2a - b)u2

Then use the formula for (a, b) to find the coordinates of F(u1) and F(u2) relative to S:

    F(u1) = F(1, 2) = (11, -8) = -49u1 + 30u2
    F(u2) = F(2, 3) = (18, -11) = -76u1 + 47u2
    and so    B = [-49, -76; 30, 47]
6.2. Consider the following linear operator G on R^2 and basis S:

    G(x, y) = (2x - 7y, 4x + 3y)    and    S = {u1, u2} = {(1, 3), (2, 5)}

(a) Find the matrix representation [G]_S of G relative to S.
(b) Verify [G]_S[v]_S = [G(v)]_S for the vector v = (4, -3) in R^2.

First find the coordinates of an arbitrary vector v = (a, b) in R^2 relative to the basis S. We have

    (a, b)^T = x(1, 3)^T + y(2, 5)^T,    and so    x + 2y = a, 3x + 5y = b

Solve for x and y in terms of a and b to get x = -5a + 2b, y = 3a - b. Thus,

    (a, b) = (-5a + 2b)u1 + (3a - b)u2,    and so    [v] = [-5a + 2b, 3a - b]^T

(a) Using the formula for (a, b) and G(x, y) = (2x - 7y, 4x + 3y), we have

    G(u1) = G(1, 3) = (-19, 13) = 121u1 - 70u2
    G(u2) = G(2, 5) = (-31, 23) = 201u1 - 116u2
    and so    [G]_S = [121, 201; -70, -116]

(We emphasize that the coefficients of u1 and u2 are written as columns, not rows, in the matrix representation.)

(b) Use the formula (a, b) = (-5a + 2b)u1 + (3a - b)u2 to get

    v = (4, -3) = -26u1 + 15u2
    G(v) = G(4, -3) = (29, 7) = -131u1 + 80u2

Then [v]_S = [-26, 15]^T and [G(v)]_S = [-131, 80]^T.
Accordingly,

    [G]_S[v]_S = [121, 201; -70, -116][-26; 15] = [-131; 80] = [G(v)]_S

(This is expected from Theorem 6.1.)
6.3. Consider the following 2 × 2 matrix A and basis S of R^2:

    A = [2, 4; 5, 6]    and    S = {u1, u2} = {(1, -2)^T, (3, -7)^T}

The matrix A defines a linear operator on R^2. Find the matrix B that represents the mapping A relative to the basis S.

First find the coordinates of an arbitrary vector (a, b)^T with respect to the basis S. We have

    (a, b)^T = x(1, -2)^T + y(3, -7)^T,    or    x + 3y = a, -2x - 7y = b

Solve for x and y in terms of a and b to obtain x = 7a + 3b, y = -2a - b. Thus,

    (a, b)^T = (7a + 3b)u1 + (-2a - b)u2

Then use the formula for (a, b)^T to find the coordinates of Au1 and Au2 relative to the basis S:

    Au1 = [2, 4; 5, 6](1, -2)^T = (-6, -7)^T = -63u1 + 19u2
    Au2 = [2, 4; 5, 6](3, -7)^T = (-22, -27)^T = -235u1 + 71u2

Writing the coordinates as columns yields

    B = [-63, -235; 19, 71]
6.4. Find the matrix representation of each of the following linear operators F on R^3 relative to the usual basis E = {e1, e2, e3} of R^3; that is, find [F] = [F]_E:

(a) F defined by F(x, y, z) = (x + 2y - 3z, 4x - 5y - 6z, 7x + 8y + 9z).
(b) F defined by the 3 × 3 matrix A = [1, 1, 1; 2, 3, 4; 5, 5, 5].
(c) F defined by F(e1) = (1, 3, 5), F(e2) = (2, 4, 6), F(e3) = (7, 7, 7). (Theorem 5.2 states that a linear map is completely defined by its action on the vectors in a basis.)

(a) Because E is the usual basis, simply write the coefficients of the components of F(x, y, z) as rows:

    [F] = [1, 2, -3; 4, -5, -6; 7, 8, 9]

(b) Because E is the usual basis, [F] = A, the matrix A itself.

(c) Here

    F(e1) = (1, 3, 5) = e1 + 3e2 + 5e3
    F(e2) = (2, 4, 6) = 2e1 + 4e2 + 6e3
    F(e3) = (7, 7, 7) = 7e1 + 7e2 + 7e3
    and so    [F] = [1, 2, 7; 3, 4, 7; 5, 6, 7]

That is, the columns of [F] are the images of the usual basis vectors.
6.5. Let G be the linear operator on R^3 defined by G(x, y, z) = (2y + z, x - 4y, 3x).

(a) Find the matrix representation of G relative to the basis

    S = {w1, w2, w3} = {(1, 1, 1), (1, 1, 0), (1, 0, 0)}

(b) Verify that [G][v] = [G(v)] for any vector v in R^3.

First find the coordinates of an arbitrary vector (a, b, c) ∈ R^3 with respect to the basis S. Write (a, b, c) as a linear combination of w1, w2, w3 using unknown scalars x, y, and z:

    (a, b, c) = x(1, 1, 1) + y(1, 1, 0) + z(1, 0, 0) = (x + y + z, x + y, x)

Set corresponding components equal to each other to obtain the system of equations

    x + y + z = a,  x + y = b,  x = c

Solve the system for x, y, z in terms of a, b, c to find x = c, y = b - c, z = a - b. Thus,

    (a, b, c) = cw1 + (b - c)w2 + (a - b)w3,    or equivalently,    [(a, b, c)] = [c, b - c, a - b]^T

(a) Because G(x, y, z) = (2y + z, x - 4y, 3x),

    G(w1) = G(1, 1, 1) = (3, -3, 3) = 3w1 - 6w2 + 6w3
    G(w2) = G(1, 1, 0) = (2, -3, 3) = 3w1 - 6w2 + 5w3
    G(w3) = G(1, 0, 0) = (0, 1, 3) = 3w1 - 2w2 - w3

Write the coordinates of G(w1), G(w2), G(w3) as columns to get

    [G] = [3, 3, 3; -6, -6, -2; 6, 5, -1]

(b) Write G(v) as a linear combination of w1, w2, w3, where v = (a, b, c) is an arbitrary vector in R^3:

    G(v) = G(a, b, c) = (2b + c, a - 4b, 3a) = 3aw1 + (-2a - 4b)w2 + (-a + 6b + c)w3

or equivalently,

    [G(v)] = [3a, -2a - 4b, -a + 6b + c]^T

Accordingly,

    [G][v] = [3, 3, 3; -6, -6, -2; 6, 5, -1][c; b - c; a - b] = [3a; -2a - 4b; -a + 6b + c] = [G(v)]
6.6. Consider the following 3 × 3 matrix A and basis S of R^3:

    A = [1, -2, 1; 3, -1, 0; 1, 4, -2]    and    S = {u1, u2, u3} = {(1, 1, 1)^T, (0, 1, 1)^T, (1, 2, 3)^T}

The matrix A defines a linear operator on R^3. Find the matrix B that represents the mapping A relative to the basis S. (Recall that A represents itself relative to the usual basis of R^3.)

First find the coordinates of an arbitrary vector (a, b, c) in R^3 with respect to the basis S. We have

    (a, b, c)^T = x(1, 1, 1)^T + y(0, 1, 1)^T + z(1, 2, 3)^T,    or    x + z = a, x + y + 2z = b, x + y + 3z = c

Solve for x, y, z in terms of a, b, c to get

    x = a + b - c,  y = -a + 2b - c,  z = c - b
    thus,    (a, b, c)^T = (a + b - c)u1 + (-a + 2b - c)u2 + (c - b)u3

Then use the formula for (a, b, c)^T to find the coordinates of Au1, Au2, Au3 relative to the basis S:

    A(u1) = A(1, 1, 1)^T = (0, 2, 3)^T = -u1 + u2 + u3
    A(u2) = A(0, 1, 1)^T = (-1, -1, 2)^T = -4u1 - 3u2 + 3u3
    A(u3) = A(1, 2, 3)^T = (0, 1, 3)^T = -2u1 - u2 + 2u3
    so    B = [-1, -4, -2; 1, -3, -1; 1, 3, 2]
6.7. For each of the following linear transformations (operators) L on R^2, find the matrix A that represents L (relative to the usual basis of R^2):

(a) L is defined by L(1, 0) = (2, 4) and L(0, 1) = (5, 8).
(b) L is the rotation in R^2 counterclockwise by 90°.
(c) L is the reflection in R^2 about the line y = x.

(a) Because {(1, 0), (0, 1)} is the usual basis of R^2, write their images under L as columns to get

    A = [2, 5; 4, 8]

(b) Under the rotation L, we have L(1, 0) = (0, 1) and L(0, 1) = (-1, 0). Thus,

    A = [0, -1; 1, 0]

(c) Under the reflection L, we have L(1, 0) = (0, 1) and L(0, 1) = (1, 0). Thus,

    A = [0, 1; 1, 0]
6.8. The set S = {e^{3t}, te^{3t}, t^2 e^{3t}} is a basis of a vector space V of functions f: R → R. Let D be the differential operator on V; that is, D(f) = df/dt. Find the matrix representation of D relative to the basis S.

Find the image of each basis function:

    D(e^{3t})     = 3e^{3t}                 = 3(e^{3t}) + 0(te^{3t}) + 0(t^2 e^{3t})
    D(te^{3t})    = e^{3t} + 3te^{3t}       = 1(e^{3t}) + 3(te^{3t}) + 0(t^2 e^{3t})
    D(t^2 e^{3t}) = 2te^{3t} + 3t^2 e^{3t}  = 0(e^{3t}) + 2(te^{3t}) + 3(t^2 e^{3t})

    and thus,    [D] = [3, 1, 0; 0, 3, 2; 0, 0, 3]
6.9. Prove Theorem 6.1: Let T: V → V be a linear operator, and let S be a (finite) basis of V. Then, for any vector v in V, [T]_S[v]_S = [T(v)]_S.

Suppose S = {u1, u2, ..., un}, and suppose, for i = 1, ..., n,

    T(ui) = ai1 u1 + ai2 u2 + ... + ain un = Σ_j aij uj    (sum over j = 1, ..., n)

Then [T]_S is the n-square matrix whose jth row is

    (a1j, a2j, ..., anj)    (1)

Now suppose

    v = k1 u1 + k2 u2 + ... + kn un = Σ_i ki ui

Writing a column vector as the transpose of a row vector, we have

    [v]_S = [k1, k2, ..., kn]^T    (2)

Furthermore, using the linearity of T,

    T(v) = T(Σ_i ki ui) = Σ_i ki T(ui) = Σ_i ki (Σ_j aij uj)
         = Σ_j (Σ_i aij ki) uj = Σ_j (a1j k1 + a2j k2 + ... + anj kn) uj

Thus, [T(v)]_S is the column vector whose jth entry is

    a1j k1 + a2j k2 + ... + anj kn    (3)

On the other hand, the jth entry of [T]_S[v]_S is obtained by multiplying the jth row of [T]_S by [v]_S, that is, (1) by (2). But the product of (1) and (2) is (3). Hence, [T]_S[v]_S and [T(v)]_S have the same entries. Thus, [T]_S[v]_S = [T(v)]_S.
6.10. Prove Theorem 6.2: Let S = {u1, u2, ..., un} be a basis for V over K, and let M be the algebra of n-square matrices over K. Then the mapping m: A(V) → M defined by m(T) = [T]_S is a vector space isomorphism. That is, for any F, G ∈ A(V) and any k ∈ K, we have
(i) [F + G] = [F] + [G], (ii) [kF] = k[F], (iii) m is one-to-one and onto.

(i) Suppose, for i = 1, ..., n,

    F(ui) = Σ_j aij uj    and    G(ui) = Σ_j bij uj

Consider the matrices A = [aij] and B = [bij]. Then [F] = A^T and [G] = B^T. We have, for i = 1, ..., n,

    (F + G)(ui) = F(ui) + G(ui) = Σ_j (aij + bij)uj

Because A + B is the matrix (aij + bij), we have

    [F + G] = (A + B)^T = A^T + B^T = [F] + [G]

(ii) Also, for i = 1, ..., n,

    (kF)(ui) = kF(ui) = k Σ_j aij uj = Σ_j (k aij)uj

Because kA is the matrix (k aij), we have

    [kF] = (kA)^T = kA^T = k[F]

(iii) Finally, m is one-to-one, because a linear mapping is completely determined by its values on a basis. Also, m is onto, because each matrix A = [aij] in M is the image of the linear operator

    F(ui) = Σ_j aij uj,    i = 1, ..., n

Thus, the theorem is proved.
6.11. Prove Theorem 6.3: For any linear operators G, F ∈ A(V), [G ∘ F] = [G][F].

Using the notation in Problem 6.10, we have

    (G ∘ F)(ui) = G(F(ui)) = G(Σ_j aij uj) = Σ_j aij G(uj)
                = Σ_j aij (Σ_k bjk uk) = Σ_k (Σ_j aij bjk) uk

Recall that AB is the matrix AB = [cik], where cik = Σ_j aij bjk. Accordingly,

    [G ∘ F] = (AB)^T = B^T A^T = [G][F]

The theorem is proved.
6.12. Let A be the matrix representation of a linear operator T. Prove that, for any polynomial f(t), we have that f(A) is the matrix representation of f(T). [Thus, f(T) = 0 if and only if f(A) = 0.]

Let m be the mapping that sends an operator T into its matrix representation A. We need to prove that m(f(T)) = f(A). Suppose f(t) = a_n t^n + ... + a_1 t + a_0. The proof is by induction on n, the degree of f(t).

Suppose n = 0. Recall that m(I') = I, where I' is the identity mapping and I is the identity matrix. Thus,

    m(f(T)) = m(a_0 I') = a_0 m(I') = a_0 I = f(A)

and so the theorem holds for n = 0.

Now assume the theorem holds for polynomials of degree less than n. Then, because m is an algebra isomorphism,

    m(f(T)) = m(a_n T^n + a_{n-1}T^{n-1} + ... + a_1 T + a_0 I')
            = a_n m(T)m(T^{n-1}) + m(a_{n-1}T^{n-1} + ... + a_1 T + a_0 I')
            = a_n A A^{n-1} + (a_{n-1}A^{n-1} + ... + a_1 A + a_0 I) = f(A)

and the theorem is proved.
Change of Basis

The coordinate vector [v]_S in this section will always denote a column vector; that is,

    [v]_S = [a1, a2, ..., an]^T
6.13. Consider the following bases of R^2:

    E = {e1, e2} = {(1, 0), (0, 1)}    and    S = {u1, u2} = {(1, 3), (1, 4)}

(a) Find the change-of-basis matrix P from the usual basis E to S.
(b) Find the change-of-basis matrix Q from S back to E.
(c) Find the coordinate vector [v] of v = (5, -3) relative to S.

(a) Because E is the usual basis, simply write the basis vectors in S as columns: P = [1, 1; 3, 4].

(b) Method 1. Use the definition of the change-of-basis matrix. That is, express each vector in E as a linear combination of the vectors in S. We do this by first finding the coordinates of an arbitrary vector v = (a, b) relative to S. We have

    (a, b) = x(1, 3) + y(1, 4) = (x + y, 3x + 4y),    or    x + y = a, 3x + 4y = b

Solve for x and y to obtain x = 4a - b, y = -3a + b. Thus,

    v = (4a - b)u1 + (-3a + b)u2    and    [v]_S = [(a, b)]_S = [4a - b, -3a + b]^T

Using the above formula for [v]_S and writing the coordinates of the ei as columns yields

    e1 = (1, 0) = 4u1 - 3u2,  e2 = (0, 1) = -u1 + u2,    and    Q = [4, -1; -3, 1]

Method 2. Because Q = P^{-1}, find P^{-1}, say by using the formula for the inverse of a 2 × 2 matrix. Thus,

    P^{-1} = [4, -1; -3, 1]

(c) Method 1. Write v as a linear combination of the vectors in S, say by using the above formula for v = (a, b). We have v = (5, -3) = 23u1 - 18u2, and so [v]_S = [23, -18]^T.

Method 2. Use, from Theorem 6.6, the fact that [v]_S = P^{-1}[v]_E and the fact that [v]_E = [5, -3]^T:

    [v]_S = P^{-1}[v]_E = [4, -1; -3, 1][5; -3] = [23; -18]
6.14. The vectors u1 = (1, 2, 0), u2 = (1, 3, 2), u3 = (0, 1, 3) form a basis S of R^3. Find
(a) The change-of-basis matrix P from the usual basis E = {e1, e2, e3} to S.
(b) The change-of-basis matrix Q from S back to E.

(a) Because E is the usual basis, simply write the basis vectors of S as columns: P = [1, 1, 0; 2, 3, 1; 0, 2, 3].

(b) Method 1. Express each basis vector of E as a linear combination of the basis vectors of S by first finding the coordinates of an arbitrary vector v = (a, b, c) relative to the basis S. We have

    (a, b, c)^T = x(1, 2, 0)^T + y(1, 3, 2)^T + z(0, 1, 3)^T,    or    x + y = a, 2x + 3y + z = b, 2y + 3z = c

Solve for x, y, z to get x = 7a - 3b + c, y = -6a + 3b - c, z = 4a - 2b + c. Thus,

    v = (a, b, c) = (7a - 3b + c)u1 + (-6a + 3b - c)u2 + (4a - 2b + c)u3
    or    [v]_S = [(a, b, c)]_S = [7a - 3b + c, -6a + 3b - c, 4a - 2b + c]^T

Using the above formula for [v]_S and then writing the coordinates of the ei as columns yields

    e1 = (1, 0, 0) = 7u1 - 6u2 + 4u3
    e2 = (0, 1, 0) = -3u1 + 3u2 - 2u3
    e3 = (0, 0, 1) = u1 - u2 + u3
    and    Q = [7, -3, 1; -6, 3, -1; 4, -2, 1]

Method 2. Find P^{-1} by row reducing M = [P, I] to the form [I, P^{-1}]:

    M = [1, 1, 0 | 1, 0, 0; 2, 3, 1 | 0, 1, 0; 0, 2, 3 | 0, 0, 1]
      ~ [1, 1, 0 | 1, 0, 0; 0, 1, 1 | -2, 1, 0; 0, 2, 3 | 0, 0, 1]
      ~ [1, 1, 0 | 1, 0, 0; 0, 1, 1 | -2, 1, 0; 0, 0, 1 | 4, -2, 1]
      ~ [1, 0, 0 | 7, -3, 1; 0, 1, 0 | -6, 3, -1; 0, 0, 1 | 4, -2, 1] = [I, P^{-1}]

Thus, Q = P^{-1} = [7, -3, 1; -6, 3, -1; 4, -2, 1].
6.15. Suppose the x-axis and y-axis in the plane R^2 are rotated counterclockwise 45° so that the new x'-axis and y'-axis are along the line y = x and the line y = -x, respectively.

(a) Find the change-of-basis matrix P.
(b) Find the coordinates of the point A(5, 6) under the given rotation.

(a) The unit vectors in the direction of the new x'- and y'-axes are

    u1 = (√2/2, √2/2)    and    u2 = (-√2/2, √2/2)

(The unit vectors in the direction of the original x- and y-axes are the usual basis of R^2.) Thus, write the coordinates of u1 and u2 as columns to obtain

    P = [√2/2, -√2/2; √2/2, √2/2]

(b) Multiply the coordinates of the point by P^{-1}:

    [√2/2, √2/2; -√2/2, √2/2][5; 6] = [11√2/2; √2/2]

(Because P is orthogonal, P^{-1} is simply the transpose of P.)
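The computation in (b) is a one-liner once P is stored. A short sketch assuming NumPy; the names are illustrative.

```python
# New coordinates of the point A(5, 6) after a 45-degree rotation of axes.
import numpy as np

c = np.sqrt(2) / 2
P = np.array([[c, -c],
              [c,  c]])        # columns are the unit vectors along the new axes
old = np.array([5.0, 6.0])     # coordinates in the original axes
new = P.T @ old                # P^{-1} = P^T because P is orthogonal
print(new)                     # expected [11*sqrt(2)/2, sqrt(2)/2], about [7.778, 0.707]
```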
6.16. The vectors u1 = (1, 1, 0), u2 = (0, 1, 1), u3 = (1, 2, 2) form a basis S of R^3. Find the coordinates of an arbitrary vector v = (a, b, c) relative to the basis S.

Method 1. Express v as a linear combination of u1, u2, u3 using unknowns x, y, z. We have

    (a, b, c) = x(1, 1, 0) + y(0, 1, 1) + z(1, 2, 2) = (x + z, x + y + 2z, y + 2z)

This yields the system

    x + z = a, x + y + 2z = b, y + 2z = c,    or    x + z = a, y + z = -a + b, z = a - b + c

Solving by back-substitution yields x = b - c, y = -2a + 2b - c, z = a - b + c. Thus,

    [v]_S = [b - c, -2a + 2b - c, a - b + c]^T

Method 2. Find P^{-1} by row reducing M = [P, I] to the form [I, P^{-1}], where P is the change-of-basis matrix from the usual basis E to S or, in other words, the matrix whose columns are the basis vectors of S. We have

    M = [1, 0, 1 | 1, 0, 0; 1, 1, 2 | 0, 1, 0; 0, 1, 2 | 0, 0, 1]
      ~ [1, 0, 1 | 1, 0, 0; 0, 1, 1 | -1, 1, 0; 0, 1, 2 | 0, 0, 1]
      ~ [1, 0, 1 | 1, 0, 0; 0, 1, 1 | -1, 1, 0; 0, 0, 1 | 1, -1, 1]
      ~ [1, 0, 0 | 0, 1, -1; 0, 1, 0 | -2, 2, -1; 0, 0, 1 | 1, -1, 1] = [I, P^{-1}]

Thus,    P^{-1} = [0, 1, -1; -2, 2, -1; 1, -1, 1]    and

    [v]_S = P^{-1}[v]_E = [0, 1, -1; -2, 2, -1; 1, -1, 1][a; b; c] = [b - c; -2a + 2b - c; a - b + c]
6.17. Consider the following bases of R^2:

    S = {u1, u2} = {(1, -2), (3, -4)}    and    S' = {v1, v2} = {(1, 3), (3, 8)}

(a) Find the coordinates of v = (a, b) relative to the basis S.
(b) Find the change-of-basis matrix P from S to S'.
(c) Find the coordinates of v = (a, b) relative to the basis S'.
(d) Find the change-of-basis matrix Q from S' back to S.
(e) Verify Q = P^{-1}.
(f) Show that, for any vector v = (a, b) in R^2, P^{-1}[v]_S = [v]_{S'}. (See Theorem 6.6.)

(a) Let v = xu1 + yu2 for unknowns x and y; that is,

    (a, b)^T = x(1, -2)^T + y(3, -4)^T,    or    x + 3y = a, -2x - 4y = b,    or    x + 3y = a, 2y = 2a + b

Solve for x and y in terms of a and b to get x = -2a - (3/2)b and y = a + (1/2)b. Thus,

    (a, b) = (-2a - (3/2)b)u1 + (a + (1/2)b)u2,    or    [(a, b)]_S = [-2a - (3/2)b, a + (1/2)b]^T

(b) Use part (a) to write each of the basis vectors v1 and v2 of S' as a linear combination of the basis vectors u1 and u2 of S; that is,

    v1 = (1, 3) = (-2 - 9/2)u1 + (1 + 3/2)u2 = -(13/2)u1 + (5/2)u2
    v2 = (3, 8) = (-6 - 12)u1 + (3 + 4)u2 = -18u1 + 7u2

Then P is the matrix whose columns are the coordinates of v1 and v2 relative to the basis S; that is,

    P = [-13/2, -18; 5/2, 7]

(c) Let v = xv1 + yv2 for unknown scalars x and y:

    (a, b)^T = x(1, 3)^T + y(3, 8)^T,    or    x + 3y = a, 3x + 8y = b,    or    x + 3y = a, -y = -3a + b

Solve for x and y to get x = -8a + 3b and y = 3a - b. Thus,

    (a, b) = (-8a + 3b)v1 + (3a - b)v2,    or    [(a, b)]_{S'} = [-8a + 3b, 3a - b]^T

(d) Use part (c) to express each of the basis vectors u1 and u2 of S as a linear combination of the basis vectors v1 and v2 of S':

    u1 = (1, -2) = (-8 - 6)v1 + (3 + 2)v2 = -14v1 + 5v2
    u2 = (3, -4) = (-24 - 12)v1 + (9 + 4)v2 = -36v1 + 13v2

Write the coordinates of u1 and u2 relative to S' as columns to obtain Q = [-14, -36; 5, 13].

(e) QP = [-14, -36; 5, 13][-13/2, -18; 5/2, 7] = [1, 0; 0, 1] = I

(f) Use parts (a), (c), and (d) to obtain

    P^{-1}[v]_S = Q[v]_S = [-14, -36; 5, 13][-2a - (3/2)b; a + (1/2)b] = [-8a + 3b; 3a - b] = [v]_{S'}
6.18. Suppose P is the change-of-basis matrix from a basis {ui} to a basis {wi}, and suppose Q is the change-of-basis matrix from the basis {wi} back to {ui}. Prove that P is invertible and that Q = P^{-1}.

Suppose, for i = 1, 2, ..., n, that

    wi = ai1 u1 + ai2 u2 + ... + ain un = Σ_j aij uj    (1)

and, for j = 1, 2, ..., n,

    uj = bj1 w1 + bj2 w2 + ... + bjn wn = Σ_k bjk wk    (2)

Let A = [aij] and B = [bjk]. Then P = A^T and Q = B^T. Substituting (2) into (1) yields

    wi = Σ_j aij (Σ_k bjk wk) = Σ_k (Σ_j aij bjk) wk

Because {wi} is a basis, Σ_j aij bjk = d_ik, where d_ik is the Kronecker delta; that is, d_ik = 1 if i = k but d_ik = 0 if i ≠ k. Suppose AB = [cik]. Then cik = d_ik. Accordingly, AB = I, and so

    QP = B^T A^T = (AB)^T = I^T = I

Thus, Q = P^{-1}.
6.19. Consider a finite sequence of vectors S = {u1, u2, ..., un}. Let S' be the sequence of vectors obtained from S by one of the following "elementary operations":
(1) Interchange two vectors.
(2) Multiply a vector by a nonzero scalar.
(3) Add a multiple of one vector to another vector.
Show that S and S' span the same subspace W. Also, show that S' is linearly independent if and only if S is linearly independent.

Observe that, for each operation, the vectors in S' are linear combinations of vectors in S. Also, because each operation has an inverse of the same type, each vector in S is a linear combination of vectors in S'. Thus, S and S' span the same subspace W. Moreover, S' is linearly independent if and only if dim W = n, and this is true if and only if S is linearly independent.
6.20. Let A = [aij] and B = [bij] be row equivalent m × n matrices over a field K, and let v1, v2, ..., vn be any vectors in a vector space V over K. For i = 1, 2, ..., m, let ui and wi be defined by

    ui = ai1 v1 + ai2 v2 + ... + ain vn    and    wi = bi1 v1 + bi2 v2 + ... + bin vn

Show that {ui} and {wi} span the same subspace of V.

Applying an "elementary operation" of Problem 6.19 to {ui} is equivalent to applying an elementary row operation to the matrix A. Because A and B are row equivalent, B can be obtained from A by a sequence of elementary row operations. Hence, {wi} can be obtained from {ui} by the corresponding sequence of operations. Accordingly, {ui} and {wi} span the same space.
6.21. Suppose u1, u2, ..., un belong to a vector space V over a field K, and suppose P = [aij] is an n-square matrix over K. For i = 1, 2, ..., n, let vi = ai1 u1 + ai2 u2 + ... + ain un.

(a) Suppose P is invertible. Show that {ui} and {vi} span the same subspace of V. Hence, {ui} is linearly independent if and only if {vi} is linearly independent.
(b) Suppose P is singular (not invertible). Show that {vi} is linearly dependent.
(c) Suppose {vi} is linearly independent. Show that P is invertible.

(a) Because P is invertible, it is row equivalent to the identity matrix I. Hence, by Problem 6.19, {vi} and {ui} span the same subspace of V. Thus, one is linearly independent if and only if the other is linearly independent.
(b) Because P is not invertible, it is row equivalent to a matrix with a zero row. This means {vi} spans a subspace that has a spanning set with fewer than n elements. Thus, {vi} is linearly dependent.
(c) This is the contrapositive of the statement of part (b), and so it follows from part (b).
6.22. Prove Theorem 6.6: Let P be the change-of-basis matrix from a basis S to a basis S' in a vector space V. Then, for any vector v ∈ V, we have P[v]_{S'} = [v]_S, and hence, P^{-1}[v]_S = [v]_{S'}.

Suppose S = {u1, ..., un} and S' = {w1, ..., wn}, and suppose, for i = 1, ..., n,

    wi = ai1 u1 + ai2 u2 + ... + ain un = Σ_j aij uj

Then P is the n-square matrix whose jth row is

    (a1j, a2j, ..., anj)    (1)

Also suppose v = k1 w1 + k2 w2 + ... + kn wn = Σ_i ki wi. Then

    [v]_{S'} = [k1, k2, ..., kn]^T    (2)

Substituting for wi in the equation for v, we obtain

    v = Σ_i ki wi = Σ_i ki (Σ_j aij uj) = Σ_j (Σ_i aij ki) uj = Σ_j (a1j k1 + a2j k2 + ... + anj kn) uj

Accordingly, [v]_S is the column vector whose jth entry is

    a1j k1 + a2j k2 + ... + anj kn    (3)

On the other hand, the jth entry of P[v]_{S'} is obtained by multiplying the jth row of P by [v]_{S'}, that is, (1) by (2). However, the product of (1) and (2) is (3). Hence, P[v]_{S'} and [v]_S have the same entries. Thus, P[v]_{S'} = [v]_S, as claimed.

Furthermore, multiplying the above by P^{-1} gives P^{-1}[v]_S = P^{-1}P[v]_{S'} = [v]_{S'}.
Linear Operators and Change of Basis
6.23. Consider the linear transformation F on R^2 defined by F(x, y) = (5x - y, 2x + y) and the following bases of R^2:

    E = {e1, e2} = {(1, 0), (0, 1)}    and    S = {u1, u2} = {(1, 4), (2, 7)}

(a) Find the change-of-basis matrix P from E to S and the change-of-basis matrix Q from S back to E.
(b) Find the matrix A that represents F in the basis E.
(c) Find the matrix B that represents F in the basis S.

(a) Because E is the usual basis, simply write the vectors in S as columns to obtain the change-of-basis matrix P. Recall, also, that Q = P^{-1}. Thus,

    P = [1, 2; 4, 7]    and    Q = P^{-1} = [-7, 2; 4, -1]

(b) Write the coefficients of x and y in F(x, y) = (5x - y, 2x + y) as rows to get

    A = [5, -1; 2, 1]

(c) Method 1. Find the coordinates of F(u1) and F(u2) relative to the basis S. This may be done by first finding the coordinates of an arbitrary vector (a, b) in R^2 relative to the basis S. We have

    (a, b) = x(1, 4) + y(2, 7) = (x + 2y, 4x + 7y),    and so    x + 2y = a, 4x + 7y = b

Solve for x and y in terms of a and b to get x = -7a + 2b, y = 4a - b. Then

    (a, b) = (-7a + 2b)u1 + (4a - b)u2

Now use the formula for (a, b) to obtain

    F(u1) = F(1, 4) = (1, 6) = 5u1 - 2u2
    F(u2) = F(2, 7) = (3, 11) = u1 + u2
    and so    B = [5, 1; -2, 1]

Method 2. By Theorem 6.7, B = P^{-1}AP. Thus,

    B = P^{-1}AP = [-7, 2; 4, -1][5, -1; 2, 1][1, 2; 4, 7] = [5, 1; -2, 1]
6.24. Let A = [2, 3; 4, -1]. Find the matrix B that represents the linear operator A relative to the basis S = {u1, u2} = {[1, 3]^T, [2, 5]^T}. [Recall that A defines a linear operator A: R^2 → R^2 relative to the usual basis E of R^2.]

Method 1. Find the coordinates of A(u1) and A(u2) relative to the basis S by first finding the coordinates of an arbitrary vector [a, b]^T in R^2 relative to the basis S. By Problem 6.2,

    [a, b]^T = (-5a + 2b)u1 + (3a - b)u2

Using the formula for [a, b]^T, we obtain

    A(u1) = [2, 3; 4, -1][1; 3] = [11; 1] = -53u1 + 32u2
    and    A(u2) = [2, 3; 4, -1][2; 5] = [19; 3] = -89u1 + 54u2
    Thus,    B = [-53, -89; 32, 54]

Method 2. Use B = P^{-1}AP, where P is the change-of-basis matrix from the usual basis E to S. Thus, simply write the vectors in S (as columns) to obtain the change-of-basis matrix P and then use the formula for P^{-1}. This gives

    P = [1, 2; 3, 5]    and    P^{-1} = [-5, 2; 3, -1]

    Then    B = P^{-1}AP = [-5, 2; 3, -1][2, 3; 4, -1][1, 2; 3, 5] = [-53, -89; 32, 54]
6.25. Let A = [1, 3, 1; 2, 5, -4; 1, -2, 2]. Find the matrix B that represents the linear operator A relative to the basis

    S = {u1, u2, u3} = {[1, 1, 0]^T, [0, 1, 1]^T, [1, 2, 2]^T}

[Recall that A defines a linear operator A: R^3 → R^3 relative to the usual basis E of R^3.]

Method 1. Find the coordinates of A(u1), A(u2), A(u3) relative to the basis S by first finding the coordinates of an arbitrary vector v = (a, b, c) in R^3 relative to the basis S. By Problem 6.16,

    [v]_S corresponds to    v = (b - c)u1 + (-2a + 2b - c)u2 + (a - b + c)u3

Using this formula for [a, b, c]^T, we obtain

    A(u1) = [4, 7, -1]^T = 8u1 + 7u2 - 4u3
    A(u2) = [4, 1, 0]^T = u1 - 6u2 + 3u3
    A(u3) = [9, 4, 1]^T = 3u1 - 11u2 + 6u3

Writing the coefficients of u1, u2, u3 as columns yields

    B = [8, 1, 3; 7, -6, -11; -4, 3, 6]

Method 2. Use B = P^{-1}AP, where P is the change-of-basis matrix from the usual basis E to S. The matrix P (whose columns are simply the vectors in S) and P^{-1} appear in Problem 6.16. Thus,

    B = P^{-1}AP = [0, 1, -1; -2, 2, -1; 1, -1, 1][1, 3, 1; 2, 5, -4; 1, -2, 2][1, 0, 1; 1, 1, 2; 0, 1, 2]
      = [8, 1, 3; 7, -6, -11; -4, 3, 6]
6.26. Prove Theorem 6.7: Let P be the change-of-basis matrix from a basis S to a basis S' in a vector space V. Then, for any linear operator T on V, [T]_{S'} = P^{-1}[T]_S P.

Let v be a vector in V. Then, by Theorem 6.6, P[v]_{S'} = [v]_S. Therefore,

    P^{-1}[T]_S P[v]_{S'} = P^{-1}[T]_S[v]_S = P^{-1}[T(v)]_S = [T(v)]_{S'}

But [T]_{S'}[v]_{S'} = [T(v)]_{S'}. Hence,

    P^{-1}[T]_S P[v]_{S'} = [T]_{S'}[v]_{S'}

Because the mapping v ↦ [v]_{S'} is onto K^n, we have P^{-1}[T]_S PX = [T]_{S'}X for every X ∈ K^n. Thus, P^{-1}[T]_S P = [T]_{S'}, as claimed.
Similarity of Matrices
6.27. Let A = [4, -2; 3, 6] and P = [1, 2; 3, 4].

(a) Find B = P^{-1}AP.  (b) Verify tr(B) = tr(A).  (c) Verify det(B) = det(A).

(a) First find P^{-1} using the formula for the inverse of a 2 × 2 matrix. We have

    P^{-1} = [-2, 1; 3/2, -1/2]

Then

    B = P^{-1}AP = [-2, 1; 3/2, -1/2][4, -2; 3, 6][1, 2; 3, 4] = [25, 30; -27/2, -15]

(b) tr(A) = 4 + 6 = 10 and tr(B) = 25 - 15 = 10. Hence, tr(B) = tr(A).
(c) det(A) = 24 + 6 = 30 and det(B) = -375 + 405 = 30. Hence, det(B) = det(A).
6.28. Find the trace of each of the linear transformations F on R^3 in Problem 6.4.

Find the trace (sum of the diagonal elements) of any matrix representation of F, such as the matrix representation [F] = [F]_E of F relative to the usual basis E given in Problem 6.4.

(a) tr(F) = tr([F]) = 1 - 5 + 9 = 5.
(b) tr(F) = tr([F]) = 1 + 3 + 5 = 9.
(c) tr(F) = tr([F]) = 1 + 4 + 7 = 12.
6.29. Write A ~ B if A is similar to B, that is, if there exists an invertible matrix P such that A = P^{-1}BP. Prove that ~ is an equivalence relation (on square matrices); that is,

(a) A ~ A, for every A.  (b) If A ~ B, then B ~ A.  (c) If A ~ B and B ~ C, then A ~ C.

(a) The identity matrix I is invertible, and I^{-1} = I. Because A = I^{-1}AI, we have A ~ A.
(b) Because A ~ B, there exists an invertible matrix P such that A = P^{-1}BP. Hence, B = PAP^{-1} = (P^{-1})^{-1}A(P^{-1}), and P^{-1} is also invertible. Thus, B ~ A.
(c) Because A ~ B, there exists an invertible matrix P such that A = P^{-1}BP, and as B ~ C, there exists an invertible matrix Q such that B = Q^{-1}CQ. Thus,

    A = P^{-1}BP = P^{-1}(Q^{-1}CQ)P = (P^{-1}Q^{-1})C(QP) = (QP)^{-1}C(QP)

and QP is also invertible. Thus, A ~ C.
6.30. Suppose B is similar to A, say B = P^{-1}AP. Prove

(a) B^n = P^{-1}A^n P, and so B^n is similar to A^n.
(b) f(B) = P^{-1}f(A)P, for any polynomial f(x), and so f(B) is similar to f(A).
(c) B is a root of a polynomial g(x) if and only if A is a root of g(x).

(a) The proof is by induction on n. The result holds for n = 1 by hypothesis. Suppose n > 1 and the result holds for n - 1. Then

    B^n = BB^{n-1} = (P^{-1}AP)(P^{-1}A^{n-1}P) = P^{-1}A^n P

(b) Suppose f(x) = a_n x^n + ... + a_1 x + a_0. Using the left and right distributive laws and part (a), we have

    P^{-1}f(A)P = P^{-1}(a_n A^n + ... + a_1 A + a_0 I)P
                = P^{-1}(a_n A^n)P + ... + P^{-1}(a_1 A)P + P^{-1}(a_0 I)P
                = a_n(P^{-1}A^n P) + ... + a_1(P^{-1}AP) + a_0(P^{-1}IP)
                = a_n B^n + ... + a_1 B + a_0 I = f(B)

(c) By part (b), g(B) = 0 if and only if P^{-1}g(A)P = 0 if and only if g(A) = P0P^{-1} = 0.
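Part (b) is easy to confirm numerically for a specific polynomial. A short sketch assuming NumPy; the matrices and the polynomial are illustrative.

```python
# If B = P^{-1} A P, then f(B) = P^{-1} f(A) P, so f(B) and f(A) are similar.
import numpy as np

A = np.array([[4., -2.], [3., 6.]])
P = np.array([[1., 2.], [3., 4.]])
Pinv = np.linalg.inv(P)
B = Pinv @ A @ P

def f(M):                        # f(x) = x^2 - 3x + 2, applied to a square matrix
    return M @ M - 3 * M + 2 * np.eye(M.shape[0])

print(np.allclose(f(B), Pinv @ f(A) @ P))   # expected True
```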
Matrix Representations of General Linear Mappings
6.31. Let F: R^3 → R^2 be the linear map defined by F(x, y, z) = (3x + 2y - 4z, x - 5y + 3z).

(a) Find the matrix of F in the following bases of R^3 and R^2:

    S = {w1, w2, w3} = {(1, 1, 1), (1, 1, 0), (1, 0, 0)}    and    S' = {u1, u2} = {(1, 3), (2, 5)}

(b) Verify Theorem 6.10: The action of F is preserved by its matrix representation; that is, for any v in R^3, we have [F]_{S,S'}[v]_S = [F(v)]_{S'}.

(a) From Problem 6.2, (a, b) = (-5a + 2b)u1 + (3a - b)u2. Thus,

    F(w1) = F(1, 1, 1) = (1, -1) = -7u1 + 4u2
    F(w2) = F(1, 1, 0) = (5, -4) = -33u1 + 19u2
    F(w3) = F(1, 0, 0) = (3, 1) = -13u1 + 8u2

Write the coordinates of F(w1), F(w2), F(w3) as columns to get

    [F]_{S,S'} = [-7, -33, -13; 4, 19, 8]

(b) If v = (x, y, z), then, by Problem 6.5, v = zw1 + (y - z)w2 + (x - y)w3. Also,

    F(v) = (3x + 2y - 4z, x - 5y + 3z) = (-13x - 20y + 26z)u1 + (8x + 11y - 15z)u2

    Hence,    [v]_S = (z, y - z, x - y)^T    and    [F(v)]_{S'} = [-13x - 20y + 26z; 8x + 11y - 15z]

Thus,

    [F]_{S,S'}[v]_S = [-7, -33, -13; 4, 19, 8][z; y - z; x - y] = [-13x - 20y + 26z; 8x + 11y - 15z] = [F(v)]_{S'}
6.32. Let F: R^n → R^m be the linear mapping defined as follows:

    F(x1, x2, ..., xn) = (a11 x1 + ... + a1n xn, a21 x1 + ... + a2n xn, ..., am1 x1 + ... + amn xn)

(a) Show that the rows of the matrix [F] representing F relative to the usual bases of R^n and R^m are the coefficients of the xi in the components of F(x1, ..., xn).
(b) Find the matrix representation of each of the following linear mappings relative to the usual bases:

(i) F: R^2 → R^3 defined by F(x, y) = (3x - y, 2x + 4y, 5x - 6y).
(ii) F: R^4 → R^2 defined by F(x, y, s, t) = (3x - 4y + 2s - 5t, 5x + 7y - s - 2t).
(iii) F: R^3 → R^4 defined by F(x, y, z) = (2x + 3y - 8z, x + y + z, 4x - 5z, 6y).

(a) We have

    F(1, 0, ..., 0) = (a11, a21, ..., am1)
    F(0, 1, ..., 0) = (a12, a22, ..., am2)
    ..........................................
    F(0, 0, ..., 1) = (a1n, a2n, ..., amn)

    and thus,    [F] = [a11, a12, ..., a1n; a21, a22, ..., a2n; ...; am1, am2, ..., amn]

(b) By part (a), we need only look at the coefficients of the unknowns x, y, ... in F(x, y, ...). Thus,

    (i) [F] = [3, -1; 2, 4; 5, -6];    (ii) [F] = [3, -4, 2, -5; 5, 7, -1, -2];    (iii) [F] = [2, 3, -8; 1, 1, 1; 4, 0, -5; 0, 6, 0]
6.33. Let A = [2, 5, -3; 1, -4, 7]. Recall that A determines a mapping F: R^3 → R^2 defined by F(v) = Av, where vectors are written as columns. Find the matrix [F] that represents the mapping relative to the following bases of R^3 and R^2:
(a) The usual bases of R^3 and of R^2.
(b) S = {w1, w2, w3} = {(1, 1, 1), (1, 1, 0), (1, 0, 0)} and S' = {u1, u2} = {(1, 3), (2, 5)}.

(a) Relative to the usual bases, [F] is the matrix A.
(b) From Problem 6.2, (a, b) = (-5a + 2b)u1 + (3a - b)u2. Thus,

F(w1) = [2, 5, -3; 1, -4, 7][1, 1, 1]^T = [4, 4]^T = -12u1 + 8u2
F(w2) = [2, 5, -3; 1, -4, 7][1, 1, 0]^T = [7, -3]^T = -41u1 + 24u2
F(w3) = [2, 5, -3; 1, -4, 7][1, 0, 0]^T = [2, 1]^T = -8u1 + 5u2

Writing the coefficients of F(w1), F(w2), F(w3) as columns yields [F] = [-12, -41, -8; 8, 24, 5].
6.34. Consider the linear transformation T on R^2 defined by T(x, y) = (2x - 3y, x + 4y) and the following bases of R^2:

E = {e1, e2} = {(1, 0), (0, 1)} and S = {u1, u2} = {(1, 3), (2, 5)}

(a) Find the matrix A representing T relative to the bases E and S.
(b) Find the matrix B representing T relative to the bases S and E.
(We can view T as a linear mapping from one space into another, each having its own basis.)

(a) From Problem 6.2, (a, b) = (-5a + 2b)u1 + (3a - b)u2. Hence,

T(e1) = T(1, 0) = (2, 1) = -8u1 + 5u2
T(e2) = T(0, 1) = (-3, 4) = 23u1 - 13u2      and so      A = [-8, 23; 5, -13]

(b) We have

T(u1) = T(1, 3) = (-7, 13) = -7e1 + 13e2
T(u2) = T(2, 5) = (-11, 22) = -11e1 + 22e2      and so      B = [-7, -11; 13, 22]

6.35. How are the matrices A and B in Problem 6.34 related?

By Theorem 6.12, the matrices A and B are equivalent to each other; that is, there exist nonsingular matrices P and Q such that B = Q^{-1}AP, where P is the change-of-basis matrix from S to E, and Q is the change-of-basis matrix from E to S. Thus,

P = [1, 2; 3, 5],   Q = [-5, 2; 3, -1],   Q^{-1} = [1, 2; 3, 5]

and

Q^{-1}AP = [1, 2; 3, 5][-8, 23; 5, -13][1, 2; 3, 5] = [-7, -11; 13, 22] = B
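The relation B = Q^{-1}AP can be checked directly in code. In the sketch below (an illustration only) the matrices are those of Problems 6.34 and 6.35.

    import numpy as np

    A = np.array([[-8.0, 23.0], [5.0, -13.0]])    # [T]_{E,S} from Problem 6.34(a)
    B = np.array([[-7.0, -11.0], [13.0, 22.0]])   # [T]_{S,E} from Problem 6.34(b)
    P = np.array([[1.0, 2.0], [3.0, 5.0]])        # change of basis from S to E (columns u1, u2)
    Q = np.linalg.inv(P)                          # change of basis from E to S

    print(np.allclose(np.linalg.inv(Q) @ A @ P, B))   # True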
6.36. Prove Theorem 6.14: Let F: V → U be linear and, say, rank(F) = r. Then there exist bases of V and of U such that the matrix representation of F has the following form, where I_r is the r-square identity matrix:

A = [I_r, 0; 0, 0]

Suppose dim V = m and dim U = n. Let W be the kernel of F and U' the image of F. We are given that rank(F) = r. Hence, the dimension of the kernel of F is m - r. Let {w1, ..., w_{m-r}} be a basis of the kernel of F and extend this to a basis of V:

{v1, ..., vr, w1, ..., w_{m-r}}

Set u1 = F(v1), u2 = F(v2), ..., ur = F(vr). Then {u1, ..., ur} is a basis of U', the image of F. Extend this to a basis of U, say {u1, ..., ur, u_{r+1}, ..., un}. Observe that

F(v1) = u1 = 1u1 + 0u2 + ... + 0ur + 0u_{r+1} + ... + 0un
F(v2) = u2 = 0u1 + 1u2 + ... + 0ur + 0u_{r+1} + ... + 0un
.................................................................
F(vr) = ur = 0u1 + 0u2 + ... + 1ur + 0u_{r+1} + ... + 0un
F(w1) = 0 = 0u1 + 0u2 + ... + 0ur + 0u_{r+1} + ... + 0un
.................................................................
F(w_{m-r}) = 0 = 0u1 + 0u2 + ... + 0ur + 0u_{r+1} + ... + 0un

Thus, the matrix of F in the above bases has the required form.
SUPPLEMENTARY PROBLEMS

Matrices and Linear Operators

6.37. Let F: R^2 → R^2 be defined by F(x, y) = (4x + 5y, 2x - y).
(a) Find the matrix A representing F in the usual basis E.
(b) Find the matrix B representing F in the basis S = {u1, u2} = {(1, 4), (2, 9)}.
(c) Find P such that B = P^{-1}AP.
(d) For v = (a, b), find [v]_S and [F(v)]_S. Verify that [F]_S[v]_S = [F(v)]_S.

6.38. Let A: R^2 → R^2 be defined by the matrix A = [5, -1; 2, 4].
(a) Find the matrix B representing A relative to the basis S = {u1, u2} = {(1, 3), (2, 8)}. (Recall that A represents the mapping A relative to the usual basis E.)
(b) For v = (a, b), find [v]_S and [A(v)]_S.

6.39. For each linear transformation L on R^2, find the matrix A representing L (relative to the usual basis of R^2):
(a) L is the rotation in R^2 counterclockwise by 45°.
(b) L is the reflection in R^2 about the line y = x.
(c) L is defined by L(1, 0) = (3, 5) and L(0, 1) = (7, -2).
(d) L is defined by L(1, 1) = (3, 7) and L(1, 2) = (5, -4).

6.40. Find the matrix representing each linear transformation T on R^3 relative to the usual basis of R^3:
(a) T(x, y, z) = (x, y, 0). (b) T(x, y, z) = (z, y + z, x + y + z).
(c) T(x, y, z) = (2x - 7y - 4z, 3x + y + 4z, 6x - 8y + z).

6.41. Repeat Problem 6.40 using the basis S = {u1, u2, u3} = {(1, 1, 0), (1, 2, 3), (1, 3, 5)}.

6.42. Let L be the linear transformation on R^3 defined by

L(1, 0, 0) = (1, 1, 1), L(0, 1, 0) = (1, 3, 5), L(0, 0, 1) = (2, 2, 2)

(a) Find the matrix A representing L relative to the usual basis of R^3.
(b) Find the matrix B representing L relative to the basis S in Problem 6.41.

6.43. Let D denote the differential operator; that is, D(f(t)) = df/dt. Each of the following sets is a basis of a vector space V of functions. Find the matrix representing D in each basis:
(a) {e^t, e^{2t}, te^{2t}}. (b) {1, t, sin 3t, cos 3t}. (c) {e^{5t}, te^{5t}, t^2 e^{5t}}.

6.44. Let D denote the differential operator on the vector space V of functions with basis S = {sin θ, cos θ}.
(a) Find the matrix A = [D]_S. (b) Use A to show that D is a zero of f(t) = t^2 + 1.

6.45. Let V be the vector space of 2×2 matrices. Consider the following matrix M and usual basis E of V:

M = [a, b; c, d] and E = {[1, 0; 0, 0], [0, 1; 0, 0], [0, 0; 1, 0], [0, 0; 0, 1]}

Find the matrix representing each of the following linear operators T on V relative to E:
(a) T(A) = MA. (b) T(A) = AM. (c) T(A) = MA - AM.

6.46. Let 1_V and 0_V denote the identity and zero operators, respectively, on a vector space V. Show that, for any basis S of V, (a) [1_V]_S = I, the identity matrix. (b) [0_V]_S = 0, the zero matrix.
Change of Basis

6.47. Find the change-of-basis matrix P from the usual basis E of R^2 to a basis S, the change-of-basis matrix Q from S back to E, and the coordinates of v = (a, b) relative to S, for the following bases S:
(a) S = {(1, 2), (3, 5)}. (c) S = {(2, 5), (3, 7)}.
(b) S = {(1, -3), (3, -8)}. (d) S = {(2, 3), (4, 5)}.

6.48. Consider the bases S = {(1, 2), (2, 3)} and S' = {(1, 3), (1, 4)} of R^2. Find the change-of-basis matrix:
(a) P from S to S'. (b) Q from S' back to S.

6.49. Suppose that the x-axis and y-axis in the plane R^2 are rotated counterclockwise 30° to yield a new x'-axis and y'-axis for the plane. Find
(a) The unit vectors in the direction of the new x'-axis and y'-axis.
(b) The change-of-basis matrix P for the new coordinate system.
(c) The new coordinates of the points A(1, 3), B(2, -5), C(a, b).

6.50. Find the change-of-basis matrix P from the usual basis E of R^3 to a basis S, the change-of-basis matrix Q from S back to E, and the coordinates of v = (a, b, c) relative to S, where S consists of the vectors:
(a) u1 = (1, 1, 0), u2 = (0, 1, 2), u3 = (0, 1, 1).
(b) u1 = (1, 0, 1), u2 = (1, 1, 2), u3 = (1, 2, 4).
(c) u1 = (1, 2, 1), u2 = (1, 3, 4), u3 = (2, 5, 6).

6.51. Suppose S1, S2, S3 are bases of V. Let P and Q be the change-of-basis matrices, respectively, from S1 to S2 and from S2 to S3. Prove that PQ is the change-of-basis matrix from S1 to S3.

Linear Operators and Change of Basis

6.52. Consider the linear operator F on R^2 defined by F(x, y) = (5x + y, 3x - 2y) and the following bases of R^2:

S = {(1, 2), (2, 3)} and S' = {(1, 3), (1, 4)}

(a) Find the matrix A representing F relative to the basis S.
(b) Find the matrix B representing F relative to the basis S'.
(c) Find the change-of-basis matrix P from S to S'.
(d) How are A and B related?

6.53. Let A: R^2 → R^2 be defined by the matrix A = [1, -1; 3, 2]. Find the matrix B that represents the linear operator A relative to each of the following bases: (a) S = {(1, 3)^T, (2, 5)^T}. (b) S = {(1, 3)^T, (2, 4)^T}.

6.54. Let F: R^2 → R^2 be defined by F(x, y) = (x - 3y, 2x - 4y). Find the matrix A that represents F relative to each of the following bases: (a) S = {(2, 5), (3, 7)}. (b) S = {(2, 3), (4, 5)}.

6.55. Let A: R^3 → R^3 be defined by the matrix A = [1, 3, 1; 2, 7, 4; 1, 4, 3]. Find the matrix B that represents the linear operator A relative to the basis S = {(1, 1, 1)^T, (0, 1, 1)^T, (1, 2, 3)^T}.
Similarity of Matrices

6.56. Let A = [1, 1; 2, -3] and P = [1, 2; 3, 5].
(a) Find B = P^{-1}AP. (b) Verify that tr(B) = tr(A). (c) Verify that det(B) = det(A).

6.57. Find the trace and determinant of each of the following linear maps on R^2:
(a) F(x, y) = (2x - 3y, 5x + 4y). (b) G(x, y) = (ax + by, cx + dy).

6.58. Find the trace and determinant of each of the following linear maps on R^3:
(a) F(x, y, z) = (x + 3y, 3x - 2z, x - 4y - 3z).
(b) G(x, y, z) = (y + 3z, 2x - 4z, 5x + 7y).

6.59. Suppose S = {u1, u2} is a basis of V, and T: V → V is defined by T(u1) = 3u1 - 2u2 and T(u2) = u1 + 4u2. Suppose S' = {w1, w2} is a basis of V for which w1 = u1 + u2 and w2 = 2u1 + 3u2.
(a) Find the matrices A and B representing T relative to the bases S and S', respectively.
(b) Find the matrix P such that B = P^{-1}AP.

6.60. Let A be a 2×2 matrix such that A is similar only to itself. Show that A is a scalar matrix, that is, that A = [a, 0; 0, a].

6.61. Show that all matrices similar to an invertible matrix are invertible. More generally, show that similar matrices have the same rank.

Matrix Representation of General Linear Mappings

6.62. Find the matrix representation of each of the following linear maps relative to the usual basis for R^n:
(a) F: R^3 → R^2 defined by F(x, y, z) = (2x - 4y + 9z, 5x + 3y - 2z).
(b) F: R^2 → R^4 defined by F(x, y) = (3x + 4y, 5x - 2y, x + 7y, 4x).
(c) F: R^4 → R defined by F(x1, x2, x3, x4) = 2x1 + x2 - 7x3 - x4.

6.63. Let G: R^3 → R^2 be defined by G(x, y, z) = (2x + 3y - z, 4x - y + 2z).
(a) Find the matrix A representing G relative to the bases

S = {(1, 1, 0), (1, 2, 3), (1, 3, 5)} and S' = {(1, 2), (2, 3)}

(b) For any v = (a, b, c) in R^3, find [v]_S and [G(v)]_{S'}. (c) Verify that A[v]_S = [G(v)]_{S'}.

6.64. Let H: R^2 → R^2 be defined by H(x, y) = (2x + 7y, x - 3y) and consider the following bases of R^2:

S = {(1, 1), (1, 2)} and S' = {(1, 4), (1, 5)}

(a) Find the matrix A representing H relative to the bases S and S'.
(b) Find the matrix B representing H relative to the bases S' and S.

6.65. Let F: R^3 → R^2 be defined by F(x, y, z) = (2x + y - z, 3x - 2y + 4z).
(a) Find the matrix A representing F relative to the bases

S = {(1, 1, 1), (1, 1, 0), (1, 0, 0)} and S' = {(1, 3), (1, 4)}

(b) Verify that, for any v = (a, b, c) in R^3, A[v]_S = [F(v)]_{S'}.

6.66. Let S and S' be bases of V, and let 1_V be the identity mapping on V. Show that the matrix A representing 1_V relative to the bases S and S' is the inverse of the change-of-basis matrix P from S to S'; that is, A = P^{-1}.

6.67. Prove (a) Theorem 6.10, (b) Theorem 6.11, (c) Theorem 6.12, (d) Theorem 6.13. [Hint: See the proofs of the analogous Theorems 6.1 (Problem 6.9), 6.2 (Problem 6.10), 6.3 (Problem 6.11), and 6.7 (Problem 6.26).]

Miscellaneous Problems

6.68. Suppose F: V → V is linear. A subspace W of V is said to be invariant under F if F(W) ⊆ W. Suppose W is invariant under F and dim W = r. Show that F has a block triangular matrix representation M = [A, B; 0, C], where A is an r×r submatrix.

6.69. Suppose V = U + W, and suppose U and W are each invariant under a linear operator F: V → V. Also, suppose dim U = r and dim W = s. Show that F has a block diagonal matrix representation M = [A, 0; 0, B], where A and B are r×r and s×s submatrices.

6.70. Two linear operators F and G on V are said to be similar if there exists an invertible linear operator T on V such that G = T^{-1} ∘ F ∘ T. Prove
(a) F and G are similar if and only if, for any basis S of V, [F]_S and [G]_S are similar matrices.
(b) If F is diagonalizable (that is, F can be represented by a diagonal matrix), then any operator G similar to F is also diagonalizable.
ANSWERS TO SUPPLEMENTARY PROBLEMS

Notation: M = [R1; R2; ...] represents a matrix M with rows R1, R2, ....

6.37. (a) A = [4, 5; 2, -1]; (b) B = [220, 487; -98, -217]; (c) P = [1, 2; 4, 9];
(d) [v]_S = [9a - 2b, -4a + b]^T and [F(v)]_S = [32a + 47b, -14a - 21b]^T

6.38. (a) B = [-6, -28; 4, 15];
(b) [v]_S = [4a - b, -(3/2)a + (1/2)b]^T and [A(v)]_S = [18a - 8b, (1/2)(-13a + 7b)]^T

6.39. (a) [√2/2, -√2/2; √2/2, √2/2]; (b) [0, 1; 1, 0]; (c) [3, 7; 5, -2]; (d) [1, 2; 18, -11]

6.40. (a) [1, 0, 0; 0, 1, 0; 0, 0, 0]; (b) [0, 0, 1; 0, 1, 1; 1, 1, 1]; (c) [2, -7, -4; 3, 1, 4; 6, -8, 1]

6.41. (a) [1, 3, 5; 0, -5, -10; 0, 3, 6]; (b) [0, 1, 2; -1, 2, 3; 1, 0, 0];
(c) [15, 65, 104; -49, -219, -351; 29, 130, 208]

6.42. (a) [1, 1, 2; 1, 3, 2; 1, 5, 2]; (b) [0, 0, 0; 2, 14, 22; 0, -5, -8]

6.43. (a) [1, 0, 0; 0, 2, 1; 0, 0, 2]; (b) [0, 1, 0, 0; 0, 0, 0, 0; 0, 0, 0, -3; 0, 0, 3, 0];
(c) [5, 1, 0; 0, 5, 2; 0, 0, 5]

6.44. (a) A = [0, -1; 1, 0]; (b) A^2 + I = 0

6.45. (a) [a, 0, b, 0; 0, a, 0, b; c, 0, d, 0; 0, c, 0, d];
(b) [a, c, 0, 0; b, d, 0, 0; 0, 0, a, c; 0, 0, b, d];
(c) [0, -c, b, 0; -b, a - d, 0, b; c, 0, d - a, -c; 0, c, -b, 0]

6.47. (a) [1, 3; 2, 5], [-5, 3; 2, -1], [v] = [-5a + 3b, 2a - b]^T;
(b) [1, 3; -3, -8], [-8, -3; 3, 1], [v] = [-8a - 3b, 3a + b]^T;
(c) [2, 3; 5, 7], [-7, 3; 5, -2], [v] = [-7a + 3b, 5a - 2b]^T;
(d) [2, 4; 3, 5], [-5/2, 2; 3/2, -1], [v] = [-(5/2)a + 2b, (3/2)a - b]^T

6.48. (a) P = [3, 5; -1, -2]; (b) Q = [2, 5; -1, -3]

6.49. Here K = √3:
(a) (1/2)(K, 1), (1/2)(-1, K);
(b) P = (1/2)[K, -1; 1, K];
(c) (1/2)[K + 3, 3K - 1]^T, (1/2)[2K - 5, -5K - 2]^T, (1/2)[aK + b, bK - a]^T

6.50. P is the matrix whose columns are u1, u2, u3; Q = P^{-1}; [v] = Q[a, b, c]^T:
(a) Q = [1, 0, 0; 1, -1, 1; -2, 2, -1], [v] = [a, a - b + c, -2a + 2b - c]^T;
(b) Q = [0, -2, 1; 2, 3, -2; -1, -1, 1], [v] = [-2b + c, 2a + 3b - 2c, -a - b + c]^T;
(c) Q = [-2, 2, -1; -7, 4, -1; 5, -3, 1], [v] = [-2a + 2b - c, -7a + 4b - c, 5a - 3b + c]^T

6.52. (a) [-23, -39; 15, 26]; (b) [35, 41; -27, -32]; (c) [3, 5; -1, -2]; (d) B = P^{-1}AP

6.53. (a) [28, 47; -15, -25]; (b) [13, 18; -15/2, -10]

6.54. (a) [43, 60; -33, -46]; (b) (1/2)[3, 7; -5, -9]

6.55. [10, 8, 20; 13, 11, 28; -5, -4, -10]

6.56. (a) [-34, -57; 19, 32]; (b) tr(B) = tr(A) = -2; (c) det(B) = det(A) = -5

6.57. (a) tr(F) = 6, det(F) = 23; (b) tr(G) = a + d, det(G) = ad - bc

6.58. (a) tr(F) = -2, det(F) = 13; (b) tr(G) = 0, det(G) = 22

6.59. (a) A = [3, 1; -2, 4], B = [8, 11; -2, -1]; (b) P = [1, 2; 1, 3]

6.62. (a) [2, -4, 9; 5, 3, -2]; (b) [3, 4; 5, -2; 1, 7; 4, 0]; (c) [2, 1, -7, -1]

6.63. (a) [-9, 1, 4; 7, 2, 1]; (b) [v]_S = [-a + 2b - c, 5a - 5b + 2c, -3a + 3b - c]^T, and
[G(v)]_{S'} = [2a - 11b + 7c, 7b - 4c]^T

6.64. (a) A = [47, 85; -38, -69]; (b) B = [71, 88; -41, -51]

6.65. A = [3, 11, 5; -1, -8, -3]
CHAPTER 7

Inner Product Spaces, Orthogonality

7.1 Introduction

The definition of a vector space V involves an arbitrary field K. Here we first restrict K to be the real field R, in which case V is called a real vector space; in the last sections of this chapter, we extend our results to the case where K is the complex field C, in which case V is called a complex vector space. Also, we adopt the previous notation that

u, v, w are vectors in V
a, b, c, k are scalars in K

Furthermore, the vector spaces V in this chapter have finite dimension unless otherwise stated or implied.

Recall that the concepts of "length" and "orthogonality" did not appear in the investigation of arbitrary vector spaces V (although they did appear in Section 1.4 on the spaces R^n and C^n). Here we place an additional structure on a vector space V to obtain an inner product space, and in this context these concepts are defined.

7.2 Inner Product Spaces

We begin with a definition.

DEFINITION: Let V be a real vector space. Suppose to each pair of vectors u, v ∈ V there is assigned a real number, denoted by ⟨u, v⟩. This function is called a (real) inner product on V if it satisfies the following axioms:

[I1] (Linear Property): ⟨au1 + bu2, v⟩ = a⟨u1, v⟩ + b⟨u2, v⟩.
[I2] (Symmetric Property): ⟨u, v⟩ = ⟨v, u⟩.
[I3] (Positive Definite Property): ⟨u, u⟩ ≥ 0; and ⟨u, u⟩ = 0 if and only if u = 0.

The vector space V with an inner product is called a (real) inner product space.

Axiom [I1] states that an inner product function is linear in the first position. Using [I1] and the symmetry axiom [I2], we obtain

⟨u, cv1 + dv2⟩ = ⟨cv1 + dv2, u⟩ = c⟨v1, u⟩ + d⟨v2, u⟩ = c⟨u, v1⟩ + d⟨u, v2⟩

That is, the inner product function is also linear in its second position. Combining these two properties and using induction yields the following general formula:

⟨Σ_i a_i u_i, Σ_j b_j v_j⟩ = Σ_i Σ_j a_i b_j ⟨u_i, v_j⟩

That is, an inner product of linear combinations of vectors is equal to a linear combination of the inner products of the vectors.
EXAMPLE 7.1 Let V be a real inner product space. Then, by linearity,

⟨3u1 - 4u2, 2v1 - 5v2 + 6v3⟩ = 6⟨u1, v1⟩ - 15⟨u1, v2⟩ + 18⟨u1, v3⟩ - 8⟨u2, v1⟩ + 20⟨u2, v2⟩ - 24⟨u2, v3⟩

⟨2u - 5v, 4u + 6v⟩ = 8⟨u, u⟩ + 12⟨u, v⟩ - 20⟨v, u⟩ - 30⟨v, v⟩ = 8⟨u, u⟩ - 8⟨u, v⟩ - 30⟨v, v⟩

Observe that in the last equation we have used the symmetry property that ⟨u, v⟩ = ⟨v, u⟩.

Remark: Axiom [I1] by itself implies ⟨0, 0⟩ = ⟨0v, 0⟩ = 0⟨v, 0⟩ = 0. Thus, [I1], [I2], [I3] are equivalent to [I1], [I2], and the following axiom:

[I3'] If u ≠ 0, then ⟨u, u⟩ is positive.

That is, a function satisfying [I1], [I2], [I3'] is an inner product.

Norm of a Vector

By the third axiom [I3] of an inner product, ⟨u, u⟩ is nonnegative for any vector u. Thus, its positive square root exists. We use the notation

||u|| = √⟨u, u⟩

This nonnegative number is called the norm or length of u. The relation ||u||^2 = ⟨u, u⟩ will be used frequently.

Remark: If ||u|| = 1 or, equivalently, if ⟨u, u⟩ = 1, then u is called a unit vector and it is said to be normalized. Every nonzero vector v in V can be multiplied by the reciprocal of its length to obtain the unit vector

v̂ = (1/||v||)v

which is a positive multiple of v. This process is called normalizing v.
7.3 Examples of Inner Product Spaces

This section lists the main examples of inner product spaces used in this text.

Euclidean n-Space R^n

Consider the vector space R^n. The dot product or scalar product in R^n is defined by

u · v = a1b1 + a2b2 + ... + anbn

where u = (a_i) and v = (b_i). This function defines an inner product on R^n. The norm ||u|| of the vector u = (a_i) in this space is as follows:

||u|| = √(u · u) = √(a1^2 + a2^2 + ... + an^2)

On the other hand, by the Pythagorean theorem, the distance from the origin O in R^3 to a point P(a, b, c) is given by √(a^2 + b^2 + c^2). This is precisely the same as the above-defined norm of the vector v = (a, b, c) in R^3. Because the Pythagorean theorem is a consequence of the axioms of Euclidean geometry, the vector space R^n with the above inner product and norm is called Euclidean n-space. Although there are many ways to define an inner product on R^n, we shall assume this inner product unless otherwise stated or implied. It is called the usual (or standard) inner product on R^n.

Remark: Frequently the vectors in R^n will be represented by column vectors, that is, by n×1 column matrices. In such a case, the formula

⟨u, v⟩ = u^T v

defines the usual inner product on R^n.
EXAMPLE 7.2 Let u = (1, 3, -4, 2), v = (4, -2, 2, 1), w = (5, -1, -2, 6) in R^4.

(a) Show ⟨3u - 2v, w⟩ = 3⟨u, w⟩ - 2⟨v, w⟩:
By definition,

⟨u, w⟩ = 5 - 3 + 8 + 12 = 22 and ⟨v, w⟩ = 20 + 2 - 4 + 6 = 24

Note that 3u - 2v = (-5, 13, -16, 4). Thus,

⟨3u - 2v, w⟩ = -25 - 13 + 32 + 24 = 18

As expected, 3⟨u, w⟩ - 2⟨v, w⟩ = 3(22) - 2(24) = 18 = ⟨3u - 2v, w⟩.

(b) Normalize u and v:
By definition,

||u|| = √(1 + 9 + 16 + 4) = √30 and ||v|| = √(16 + 4 + 4 + 1) = 5

We normalize u and v to obtain the following unit vectors in the directions of u and v, respectively:

û = (1/||u||)u = (1/√30, 3/√30, -4/√30, 2/√30) and v̂ = (1/||v||)v = (4/5, -2/5, 2/5, 1/5)
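The arithmetic of Example 7.2 is easy to reproduce with NumPy; the following short sketch is only an illustration of the definitions above.

    import numpy as np

    u = np.array([1.0, 3.0, -4.0, 2.0])
    v = np.array([4.0, -2.0, 2.0, 1.0])
    w = np.array([5.0, -1.0, -2.0, 6.0])

    print(np.dot(3*u - 2*v, w), 3*np.dot(u, w) - 2*np.dot(v, w))   # 18.0 18.0
    print(np.dot(u, u), np.linalg.norm(v))                         # 30.0 5.0
    u_hat = u / np.linalg.norm(u)          # unit vector in the direction of u
    print(np.isclose(np.linalg.norm(u_hat), 1.0))                  # True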
Function Space C[a, b] and Polynomial Space P(t)

The notation C[a, b] is used to denote the vector space of all continuous functions on the closed interval [a, b], that is, where a ≤ t ≤ b. The following defines an inner product on C[a, b], where f(t) and g(t) are functions in C[a, b]:

⟨f, g⟩ = ∫_a^b f(t)g(t) dt

It is called the usual inner product on C[a, b].
The vector space P(t) of all polynomials is a subspace of C[a, b] for any interval [a, b], and hence, the above is also an inner product on P(t).

EXAMPLE 7.3 Consider f(t) = 3t - 5 and g(t) = t^2 in the polynomial space P(t) with inner product ⟨f, g⟩ = ∫_0^1 f(t)g(t) dt.

(a) Find ⟨f, g⟩.
We have f(t)g(t) = 3t^3 - 5t^2. Hence,

⟨f, g⟩ = ∫_0^1 (3t^3 - 5t^2) dt = [(3/4)t^4 - (5/3)t^3]_0^1 = 3/4 - 5/3 = -11/12

(b) Find ||f|| and ||g||.
We have [f(t)]^2 = f(t)f(t) = 9t^2 - 30t + 25 and [g(t)]^2 = t^4. Then

||f||^2 = ⟨f, f⟩ = ∫_0^1 (9t^2 - 30t + 25) dt = [3t^3 - 15t^2 + 25t]_0^1 = 13
||g||^2 = ⟨g, g⟩ = ∫_0^1 t^4 dt = [(1/5)t^5]_0^1 = 1/5

Therefore, ||f|| = √13 and ||g|| = √(1/5) = (1/5)√5.
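For a quick numerical check of integrals like these, one can work with NumPy's Polynomial class; the sketch below (an illustration, not part of the problem sets) recomputes ⟨f, g⟩, ||f||, and ||g|| for Example 7.3.

    import numpy as np
    from numpy.polynomial import Polynomial as Poly

    f = Poly([-5, 3])      # f(t) = 3t - 5  (coefficients in increasing degree)
    g = Poly([0, 0, 1])    # g(t) = t^2

    def inner(p, q, a=0.0, b=1.0):
        """<p, q> = integral from a to b of p(t) q(t) dt."""
        antideriv = (p * q).integ()
        return antideriv(b) - antideriv(a)

    print(inner(f, g))                 # -0.9166...  = -11/12
    print(np.sqrt(inner(f, f)))        #  3.6055...  = sqrt(13)
    print(np.sqrt(inner(g, g)))        #  0.4472...  = (1/5)sqrt(5)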
Matrix Space M = M_{m,n}

Let M = M_{m,n}, the vector space of all real m×n matrices. An inner product is defined on M by

⟨A, B⟩ = tr(B^T A)

where, as usual, tr( ) is the trace, the sum of the diagonal elements. If A = [a_ij] and B = [b_ij], then

⟨A, B⟩ = tr(B^T A) = Σ_{i=1}^m Σ_{j=1}^n a_ij b_ij and ||A||^2 = ⟨A, A⟩ = Σ_{i=1}^m Σ_{j=1}^n a_ij^2

That is, ⟨A, B⟩ is the sum of the products of the corresponding entries in A and B and, in particular, ⟨A, A⟩ is the sum of the squares of the entries of A.
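The entrywise description of ⟨A, B⟩ is easy to confirm numerically. The sketch below is illustrative only; the two sample matrices are arbitrary choices.

    import numpy as np

    A = np.array([[9.0, 8.0, 7.0], [6.0, 5.0, 4.0]])
    B = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

    print(np.trace(B.T @ A))     # 119.0, i.e. tr(B^T A)
    print(np.sum(A * B))         # 119.0, the sum of products of corresponding entries
    print(np.sum(A * A))         # 271.0  = ||A||^2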
Hilbert Space

Let V be the vector space of all infinite sequences of real numbers (a1, a2, a3, ...) satisfying

Σ_{i=1}^∞ a_i^2 = a1^2 + a2^2 + ... < ∞

that is, the sum converges. Addition and scalar multiplication are defined in V componentwise; that is, if

u = (a1, a2, ...) and v = (b1, b2, ...)
then u + v = (a1 + b1, a2 + b2, ...) and ku = (ka1, ka2, ...)

An inner product is defined in V by

⟨u, v⟩ = a1b1 + a2b2 + ...

The above sum converges absolutely for any pair of points in V. Hence, the inner product is well defined. This inner product space is called l2-space or Hilbert space.
7.4 Cauchy–Schwarz Inequality, Applications

The following formula (proved in Problem 7.8) is called the Cauchy–Schwarz inequality or Schwarz inequality. It is used in many branches of mathematics.

THEOREM 7.1: (Cauchy–Schwarz) For any vectors u and v in an inner product space V,

⟨u, v⟩^2 ≤ ⟨u, u⟩⟨v, v⟩ or |⟨u, v⟩| ≤ ||u|| ||v||

Next we examine this inequality in specific cases.

EXAMPLE 7.4
(a) Consider any real numbers a1, ..., an, b1, ..., bn. Then, by the Cauchy–Schwarz inequality,

(a1b1 + a2b2 + ... + anbn)^2 ≤ (a1^2 + ... + an^2)(b1^2 + ... + bn^2)

That is, (u · v)^2 ≤ ||u||^2 ||v||^2, where u = (a_i) and v = (b_i).

(b) Let f and g be continuous functions on the unit interval [0, 1]. Then, by the Cauchy–Schwarz inequality,

( ∫_0^1 f(t)g(t) dt )^2 ≤ ( ∫_0^1 f^2(t) dt )( ∫_0^1 g^2(t) dt )

That is, (⟨f, g⟩)^2 ≤ ||f||^2 ||g||^2. Here V is the inner product space C[0, 1].
The next theorem (proved in Problem 7.9) gives the basic properties of a norm. The proof of the third property requires the Cauchy–Schwarz inequality.

THEOREM 7.2: Let V be an inner product space. Then the norm in V satisfies the following properties:
[N1] ||v|| ≥ 0; and ||v|| = 0 if and only if v = 0.
[N2] ||kv|| = |k| ||v||.
[N3] ||u + v|| ≤ ||u|| + ||v||.

The property [N3] is called the triangle inequality, because if we view u + v as the side of the triangle formed with sides u and v (as shown in Fig. 7-1), then [N3] states that the length of one side of a triangle cannot be greater than the sum of the lengths of the other two sides.

Angle Between Vectors

For any nonzero vectors u and v in an inner product space V, the angle between u and v is defined to be the angle θ such that 0 ≤ θ ≤ π and

cos θ = ⟨u, v⟩ / (||u|| ||v||)

By the Cauchy–Schwarz inequality, -1 ≤ cos θ ≤ 1, and so the angle exists and is unique.

EXAMPLE 7.5
(a) Consider vectors u = (2, 3, 5) and v = (1, -4, 3) in R^3. Then

⟨u, v⟩ = 2 - 12 + 15 = 5, ||u|| = √(4 + 9 + 25) = √38, ||v|| = √(1 + 16 + 9) = √26

Then the angle θ between u and v is given by

cos θ = 5 / (√38 √26)

Note that θ is an acute angle, because cos θ is positive.

(b) Let f(t) = 3t - 5 and g(t) = t^2 in the polynomial space P(t) with inner product ⟨f, g⟩ = ∫_0^1 f(t)g(t) dt. By Example 7.3,

⟨f, g⟩ = -11/12, ||f|| = √13, ||g|| = (1/5)√5

Then the "angle" θ between f and g is given by

cos θ = (-11/12) / (√13 · (1/5)√5) = -55 / (12 √13 √5)

Note that θ is an obtuse angle, because cos θ is negative.
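As a small illustration, the sketch below computes cos θ and θ for the vectors of Example 7.5(a); np.arccos returns the unique angle in [0, π].

    import numpy as np

    u = np.array([2.0, 3.0, 5.0])
    v = np.array([1.0, -4.0, 3.0])

    cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    theta = np.arccos(cos_theta)        # angle in radians, 0 <= theta <= pi
    print(cos_theta)                    # 0.1591...  = 5 / (sqrt(38) sqrt(26))
    print(np.degrees(theta))            # about 80.8 degrees, an acute angle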
7.5 Orthogonality

Let V be an inner product space. The vectors u, v ∈ V are said to be orthogonal and u is said to be orthogonal to v if

⟨u, v⟩ = 0

The relation is clearly symmetric; if u is orthogonal to v, then ⟨v, u⟩ = 0, and so v is orthogonal to u. We note that 0 ∈ V is orthogonal to every v ∈ V, because

⟨0, v⟩ = ⟨0v, v⟩ = 0⟨v, v⟩ = 0

Conversely, if u is orthogonal to every v ∈ V, then ⟨u, u⟩ = 0 and hence u = 0 by [I3]. Observe that u and v are orthogonal if and only if cos θ = 0, where θ is the angle between u and v. Also, this is true if and only if u and v are "perpendicular", that is, θ = π/2 (or θ = 90°).

EXAMPLE 7.6
(a) Consider the vectors u = (1, 1, 1), v = (1, 2, -3), w = (1, -4, 3) in R^3. Then

⟨u, v⟩ = 1 + 2 - 3 = 0, ⟨u, w⟩ = 1 - 4 + 3 = 0, ⟨v, w⟩ = 1 - 8 - 9 = -16

Thus, u is orthogonal to v and w, but v and w are not orthogonal.

(b) Consider the functions sin t and cos t in the vector space C[-π, π] of continuous functions on the closed interval [-π, π]. Then

⟨sin t, cos t⟩ = ∫_{-π}^{π} sin t cos t dt = (1/2) sin^2 t |_{-π}^{π} = 0 - 0 = 0

Thus, sin t and cos t are orthogonal functions in the vector space C[-π, π].

Remark: A vector w = (x1, x2, ..., xn) is orthogonal to u = (a1, a2, ..., an) in R^n if

⟨u, w⟩ = a1x1 + a2x2 + ... + anxn = 0

That is, w is orthogonal to u if w satisfies a homogeneous equation whose coefficients are the elements of u.

EXAMPLE 7.7 Find a nonzero vector w that is orthogonal to u1 = (1, 2, 1) and u2 = (2, 5, 4) in R^3.

Let w = (x, y, z). Then we want ⟨u1, w⟩ = 0 and ⟨u2, w⟩ = 0. This yields the homogeneous system

x + 2y + z = 0, 2x + 5y + 4z = 0    or    x + 2y + z = 0, y + 2z = 0

Here z is the only free variable in the echelon system. Set z = 1 to obtain y = -2 and x = 3. Thus, w = (3, -2, 1) is a desired nonzero vector orthogonal to u1 and u2.
Any multiple of w will also be orthogonal to u1 and u2. Normalizing w, we obtain the following unit vector orthogonal to u1 and u2:

ŵ = w/||w|| = (3/√14, -2/√14, 1/√14)
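For vectors in R^3 specifically, the cross product gives such an orthogonal vector directly; the sketch below is an aside (a different route than the elimination used in the text) and reproduces w = (3, -2, 1) of Example 7.7.

    import numpy as np

    u1 = np.array([1.0, 2.0, 1.0])
    u2 = np.array([2.0, 5.0, 4.0])

    w = np.cross(u1, u2)
    print(w)                               # [ 3. -2.  1.]
    print(np.dot(w, u1), np.dot(w, u2))    # 0.0 0.0
    w_hat = w / np.linalg.norm(w)          # unit vector, here (3, -2, 1)/sqrt(14)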
Orthogonal Complements

Let S be a subset of an inner product space V. The orthogonal complement of S, denoted by S^⊥ (read "S perp"), consists of those vectors in V that are orthogonal to every vector u ∈ S; that is,

S^⊥ = {v ∈ V : ⟨v, u⟩ = 0 for every u ∈ S}

In particular, for a given vector u in V, we have

u^⊥ = {v ∈ V : ⟨v, u⟩ = 0}

that is, u^⊥ consists of all vectors in V that are orthogonal to the given vector u.

We show that S^⊥ is a subspace of V. Clearly 0 ∈ S^⊥, because 0 is orthogonal to every vector in V. Now suppose v, w ∈ S^⊥. Then, for any scalars a and b and any vector u ∈ S, we have

⟨av + bw, u⟩ = a⟨v, u⟩ + b⟨w, u⟩ = a·0 + b·0 = 0

Thus, av + bw ∈ S^⊥, and therefore S^⊥ is a subspace of V.
We state this result formally.

PROPOSITION 7.3: Let S be a subset of a vector space V. Then S^⊥ is a subspace of V.

Remark 1: Suppose u is a nonzero vector in R^3. Then there is a geometrical description of u^⊥. Specifically, u^⊥ is the plane in R^3 through the origin O and perpendicular to the vector u. This is shown in Fig. 7-2.

Remark 2: Let W be the solution space of an m×n homogeneous system AX = 0, where A = [a_ij] and X = [x_i]. Recall that W may be viewed as the kernel of the linear mapping A: R^n → R^m. Now we can give another interpretation of W using the notion of orthogonality. Specifically, each solution vector w = (x1, x2, ..., xn) is orthogonal to each row of A; hence, W is the orthogonal complement of the row space of A.

EXAMPLE 7.8 Find a basis for the subspace u^⊥ of R^3, where u = (1, 3, -4).

Note that u^⊥ consists of all vectors w = (x, y, z) such that ⟨u, w⟩ = 0, or x + 3y - 4z = 0. The free variables are y and z.
(1) Set y = 1, z = 0 to obtain the solution w1 = (-3, 1, 0).
(2) Set y = 0, z = 1 to obtain the solution w2 = (4, 0, 1).
The vectors w1 and w2 form a basis for the solution space of the equation, and hence a basis for u^⊥.

Suppose W is a subspace of V. Then both W and W^⊥ are subspaces of V. The next theorem, whose proof (Problem 7.28) requires results of later sections, is a basic result in linear algebra.

THEOREM 7.4: Let W be a subspace of V. Then V is the direct sum of W and W^⊥; that is, V = W ⊕ W^⊥.
7.6 Orthogonal Sets and Bases

Consider a set S = {u1, u2, ..., ur} of nonzero vectors in an inner product space V. S is called orthogonal if each pair of vectors in S are orthogonal, and S is called orthonormal if S is orthogonal and each vector in S has unit length. That is,

(i) Orthogonal: ⟨u_i, u_j⟩ = 0 for i ≠ j
(ii) Orthonormal: ⟨u_i, u_j⟩ = 0 for i ≠ j, and ⟨u_i, u_j⟩ = 1 for i = j

Normalizing an orthogonal set S refers to the process of multiplying each vector in S by the reciprocal of its length in order to transform S into an orthonormal set of vectors.
The following theorems apply.

THEOREM 7.5: Suppose S is an orthogonal set of nonzero vectors. Then S is linearly independent.

THEOREM 7.6: (Pythagoras) Suppose {u1, u2, ..., ur} is an orthogonal set of vectors. Then

||u1 + u2 + ... + ur||^2 = ||u1||^2 + ||u2||^2 + ... + ||ur||^2

These theorems are proved in Problems 7.15 and 7.16, respectively. Here we prove the Pythagorean theorem in the special and familiar case for two vectors. Specifically, suppose ⟨u, v⟩ = 0. Then

||u + v||^2 = ⟨u + v, u + v⟩ = ⟨u, u⟩ + 2⟨u, v⟩ + ⟨v, v⟩ = ⟨u, u⟩ + ⟨v, v⟩ = ||u||^2 + ||v||^2

which gives our result.

EXAMPLE 7.9
(a) Let E = {e1, e2, e3} = {(1, 0, 0), (0, 1, 0), (0, 0, 1)} be the usual basis of Euclidean space R^3. It is clear that

⟨e1, e2⟩ = ⟨e1, e3⟩ = ⟨e2, e3⟩ = 0 and ⟨e1, e1⟩ = ⟨e2, e2⟩ = ⟨e3, e3⟩ = 1

Namely, E is an orthonormal basis of R^3. More generally, the usual basis of R^n is orthonormal for every n.

(b) Let V = C[-π, π] be the vector space of continuous functions on the interval -π ≤ t ≤ π with inner product defined by ⟨f, g⟩ = ∫_{-π}^{π} f(t)g(t) dt. Then the following is a classical example of an orthogonal set in V:

{1, cos t, cos 2t, cos 3t, ..., sin t, sin 2t, sin 3t, ...}

This orthogonal set plays a fundamental role in the theory of Fourier series.

Orthogonal Basis and Linear Combinations, Fourier Coefficients

Let S consist of the following three vectors in R^3:

u1 = (1, 2, 1), u2 = (2, 1, -4), u3 = (3, -2, 1)

The reader can verify that the vectors are orthogonal; hence, they are linearly independent. Thus, S is an orthogonal basis of R^3.
Suppose we want to write v = (7, 1, 9) as a linear combination of u1, u2, u3. First we set v as a linear combination of u1, u2, u3 using unknowns x1, x2, x3 as follows:

v = x1u1 + x2u2 + x3u3 or (7, 1, 9) = x1(1, 2, 1) + x2(2, 1, -4) + x3(3, -2, 1)    (*)

We can proceed in two ways.

METHOD 1: Expand (*) (as in Chapter 3) to obtain the system

x1 + 2x2 + 3x3 = 7, 2x1 + x2 - 2x3 = 1, x1 - 4x2 + x3 = 9

Solve the system by Gaussian elimination to obtain x1 = 3, x2 = -1, x3 = 2. Thus, v = 3u1 - u2 + 2u3.
METHOD 2: (This method uses the fact that the basis vectors are orthogonal, and the arithmetic is much simpler.) If we take the inner product of each side of (*) with respect to u_i, we get

⟨v, u_i⟩ = ⟨x1u1 + x2u2 + x3u3, u_i⟩ or ⟨v, u_i⟩ = x_i⟨u_i, u_i⟩ or x_i = ⟨v, u_i⟩/⟨u_i, u_i⟩

Here two terms drop out, because u1, u2, u3 are orthogonal. Accordingly,

x1 = ⟨v, u1⟩/⟨u1, u1⟩ = (7 + 2 + 9)/(1 + 4 + 1) = 18/6 = 3
x2 = ⟨v, u2⟩/⟨u2, u2⟩ = (14 + 1 - 36)/(4 + 1 + 16) = -21/21 = -1
x3 = ⟨v, u3⟩/⟨u3, u3⟩ = (21 - 2 + 9)/(9 + 4 + 1) = 28/14 = 2

Thus, again, we get v = 3u1 - u2 + 2u3.
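Method 2 translates directly into code: each coefficient is the Fourier coefficient ⟨v, u_i⟩/⟨u_i, u_i⟩. The sketch below simply reruns the computation above.

    import numpy as np

    u = [np.array([1.0, 2.0, 1.0]),
         np.array([2.0, 1.0, -4.0]),
         np.array([3.0, -2.0, 1.0])]       # orthogonal basis of R^3
    v = np.array([7.0, 1.0, 9.0])

    coeffs = [np.dot(v, ui) / np.dot(ui, ui) for ui in u]
    print(coeffs)                                                   # [3.0, -1.0, 2.0]
    print(np.allclose(sum(c*ui for c, ui in zip(coeffs, u)), v))    # True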
The procedure in Method 2 is true in general. Namely, we have the following theorem (proved in Problem 7.17).

THEOREM 7.7: Let {u1, u2, ..., un} be an orthogonal basis of V. Then, for any v ∈ V,

v = (⟨v, u1⟩/⟨u1, u1⟩)u1 + (⟨v, u2⟩/⟨u2, u2⟩)u2 + ... + (⟨v, un⟩/⟨un, un⟩)un

Remark: The scalar k_i ≡ ⟨v, u_i⟩/⟨u_i, u_i⟩ is called the Fourier coefficient of v with respect to u_i, because it is analogous to a coefficient in the Fourier series of a function. This scalar also has a geometric interpretation, which is discussed below.

Projections

Let V be an inner product space. Suppose w is a given nonzero vector in V, and suppose v is another vector. We seek the "projection of v along w," which, as indicated in Fig. 7-3(a), will be the multiple cw of w such that v' = v - cw is orthogonal to w. This means

⟨v - cw, w⟩ = 0 or ⟨v, w⟩ - c⟨w, w⟩ = 0 or c = ⟨v, w⟩/⟨w, w⟩

Accordingly, the projection of v along w is denoted and defined by

proj(v, w) = cw = (⟨v, w⟩/⟨w, w⟩)w

Such a scalar c is unique, and it is called the Fourier coefficient of v with respect to w or the component of v along w.
The above notion is generalized as follows (see Problem 7.25).
THEOREM 7.8: Suppose w1, w2, ..., wr form an orthogonal set of nonzero vectors in V. Let v be any vector in V. Define

v' = v - (c1w1 + c2w2 + ... + crwr)

where

c1 = ⟨v, w1⟩/⟨w1, w1⟩, c2 = ⟨v, w2⟩/⟨w2, w2⟩, ..., cr = ⟨v, wr⟩/⟨wr, wr⟩

Then v' is orthogonal to w1, w2, ..., wr.

Note that each c_i in the above theorem is the component (Fourier coefficient) of v along the given w_i.

Remark: The notion of the projection of a vector v ∈ V along a subspace W of V is defined as follows. By Theorem 7.4, V = W ⊕ W^⊥. Hence, v may be expressed uniquely in the form

v = w + w', where w ∈ W and w' ∈ W^⊥

We define w to be the projection of v along W, and denote it by proj(v, W), as pictured in Fig. 7-3(b). In particular, if W = span(w1, w2, ..., wr), where the w_i form an orthogonal set, then

proj(v, W) = c1w1 + c2w2 + ... + crwr

Here c_i is the component of v along w_i, as above.
7.7 Gram–Schmidt Orthogonalization Process

Suppose {v1, v2, ..., vn} is a basis of an inner product space V. One can use this basis to construct an orthogonal basis {w1, w2, ..., wn} of V as follows. Set

w1 = v1
w2 = v2 - (⟨v2, w1⟩/⟨w1, w1⟩)w1
w3 = v3 - (⟨v3, w1⟩/⟨w1, w1⟩)w1 - (⟨v3, w2⟩/⟨w2, w2⟩)w2
.................................................................
wn = vn - (⟨vn, w1⟩/⟨w1, w1⟩)w1 - (⟨vn, w2⟩/⟨w2, w2⟩)w2 - ... - (⟨vn, w_{n-1}⟩/⟨w_{n-1}, w_{n-1}⟩)w_{n-1}

In other words, for k = 2, 3, ..., n, we define

w_k = v_k - c_{k1}w1 - c_{k2}w2 - ... - c_{k,k-1}w_{k-1}

where c_{ki} = ⟨v_k, w_i⟩/⟨w_i, w_i⟩ is the component of v_k along w_i. By Theorem 7.8, each w_k is orthogonal to the preceding w's. Thus, w1, w2, ..., wn form an orthogonal basis for V as claimed. Normalizing each w_i will then yield an orthonormal basis for V.

The above construction is known as the Gram–Schmidt orthogonalization process. The following remarks are in order.

Remark 1: Each vector w_k is a linear combination of v_k and the preceding w's. Hence, one can easily show, by induction, that each w_k is a linear combination of v1, v2, ..., vn.

Remark 2: Because taking multiples of vectors does not affect orthogonality, it may be simpler in hand calculations to clear fractions in any new w_k, by multiplying w_k by an appropriate scalar, before obtaining the next w_{k+1}.
Remark 3: Suppose u1, u2, ..., ur are linearly independent, and so they form a basis for U = span(u_i). Applying the Gram–Schmidt orthogonalization process to the u's yields an orthogonal basis for U.

The following theorems (proved in Problems 7.26 and 7.27) use the above algorithm and remarks.

THEOREM 7.9: Let {v1, v2, ..., vn} be any basis of an inner product space V. Then there exists an orthonormal basis {u1, u2, ..., un} of V such that the change-of-basis matrix from {v_i} to {u_i} is triangular; that is, for k = 1, ..., n,

u_k = a_{k1}v1 + a_{k2}v2 + ... + a_{kk}v_k

THEOREM 7.10: Suppose S = {w1, w2, ..., wr} is an orthogonal basis for a subspace W of a vector space V. Then one may extend S to an orthogonal basis for V; that is, one may find vectors w_{r+1}, ..., wn such that {w1, w2, ..., wn} is an orthogonal basis for V.

EXAMPLE 7.10 Apply the Gram–Schmidt orthogonalization process to find an orthogonal basis and then an orthonormal basis for the subspace U of R^4 spanned by

v1 = (1, 1, 1, 1), v2 = (1, 2, 4, 5), v3 = (1, -3, -4, -2)

(1) First set w1 = v1 = (1, 1, 1, 1).
(2) Compute

v2 - (⟨v2, w1⟩/⟨w1, w1⟩)w1 = v2 - (12/4)w1 = (-2, -1, 1, 2)

Set w2 = (-2, -1, 1, 2).
(3) Compute

v3 - (⟨v3, w1⟩/⟨w1, w1⟩)w1 - (⟨v3, w2⟩/⟨w2, w2⟩)w2 = v3 - (-8/4)w1 - (-7/10)w2 = (8/5, -17/10, -13/10, 7/5)

Clear fractions to obtain w3 = (16, -17, -13, 14).

Thus, w1, w2, w3 form an orthogonal basis for U. Normalize these vectors to obtain an orthonormal basis {u1, u2, u3} of U. We have ||w1||^2 = 4, ||w2||^2 = 10, ||w3||^2 = 910, so

u1 = (1/2)(1, 1, 1, 1), u2 = (1/√10)(-2, -1, 1, 2), u3 = (1/√910)(16, -17, -13, 14)
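The process lends itself to a short program. The following sketch implements the classical (unnormalized) Gram–Schmidt recursion of this section and applies it to the vectors of Example 7.10; the fraction-clearing step of Remark 2 is omitted, so the third output is the scalar multiple (8/5, -17/10, -13/10, 7/5) of w3.

    import numpy as np

    def gram_schmidt(vectors):
        """Return an orthogonal list w1, ..., wn spanning the same space."""
        ws = []
        for v in vectors:
            w = v.astype(float)
            for prev in ws:
                w = w - (np.dot(v, prev) / np.dot(prev, prev)) * prev
            ws.append(w)
        return ws

    V = [np.array([1, 1, 1, 1]), np.array([1, 2, 4, 5]), np.array([1, -3, -4, -2])]
    for w in gram_schmidt(V):
        print(w)
    # [1. 1. 1. 1.]
    # [-2. -1.  1.  2.]
    # [ 1.6 -1.7 -1.3  1.4]   (a multiple of (16, -17, -13, 14))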
EXAMPLE 7.11 Let V be the vector space of polynomials f(t) with inner product ⟨f, g⟩ = ∫_{-1}^{1} f(t)g(t) dt. Apply the Gram–Schmidt orthogonalization process to {1, t, t^2, t^3} to find an orthogonal basis {f0, f1, f2, f3} with integer coefficients for P3(t).

Here we use the fact that, for r + s = n,

⟨t^r, t^s⟩ = ∫_{-1}^{1} t^n dt = [t^{n+1}/(n + 1)]_{-1}^{1} = 2/(n + 1) when n is even, and 0 when n is odd

(1) First set f0 = 1.
(2) Compute t - (⟨t, 1⟩/⟨1, 1⟩)(1) = t - 0 = t. Set f1 = t.
(3) Compute

t^2 - (⟨t^2, 1⟩/⟨1, 1⟩)(1) - (⟨t^2, t⟩/⟨t, t⟩)(t) = t^2 - ((2/3)/2)(1) + 0(t) = t^2 - 1/3

Multiply by 3 to obtain f2 = 3t^2 - 1.
(4) Compute

t^3 - (⟨t^3, 1⟩/⟨1, 1⟩)(1) - (⟨t^3, t⟩/⟨t, t⟩)(t) - (⟨t^3, 3t^2 - 1⟩/⟨3t^2 - 1, 3t^2 - 1⟩)(3t^2 - 1)
= t^3 - 0(1) - ((2/5)/(2/3))(t) - 0(3t^2 - 1) = t^3 - (3/5)t

Multiply by 5 to obtain f3 = 5t^3 - 3t.

Thus, {1, t, 3t^2 - 1, 5t^3 - 3t} is the required orthogonal basis.

Remark: Normalizing the polynomials in Example 7.11 so that p(1) = 1 yields the polynomials

1, t, (1/2)(3t^2 - 1), (1/2)(5t^3 - 3t)

These are the first four Legendre polynomials, which appear in the study of differential equations.
7.8 Orthogonal and Positive Definite Matrices

This section discusses two types of matrices that are closely related to real inner product spaces V. Here vectors in R^n will be represented by column vectors. Thus, ⟨u, v⟩ = u^T v denotes the inner product in Euclidean space R^n.

Orthogonal Matrices

A real matrix P is orthogonal if P is nonsingular and P^{-1} = P^T, or, in other words, if PP^T = P^T P = I. First we recall (Theorem 2.6) an important characterization of such matrices.

THEOREM 7.11: Let P be a real matrix. Then the following are equivalent: (a) P is orthogonal; (b) the rows of P form an orthonormal set; (c) the columns of P form an orthonormal set.

(This theorem is true only using the usual inner product on R^n. It is not true if R^n is given any other inner product.)

EXAMPLE 7.12
(a) Let P = [1/√3, 1/√3, -1/√3; 0, 1/√2, 1/√2; 2/√6, -1/√6, 1/√6]. The rows of P are orthogonal to each other and are unit vectors. Thus P is an orthogonal matrix.

(b) Let P be a 2×2 orthogonal matrix. Then, for some real number θ, we have

P = [cos θ, sin θ; -sin θ, cos θ] or P = [cos θ, sin θ; sin θ, -cos θ]
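As a numerical illustration of the definition, the sketch below verifies PP^T = I and P^{-1} = P^T for the first (rotation) matrix of Example 7.12(b); the value θ = π/6 is an arbitrary choice.

    import numpy as np

    theta = np.pi / 6
    P = np.array([[np.cos(theta), np.sin(theta)],
                  [-np.sin(theta), np.cos(theta)]])

    print(np.allclose(P @ P.T, np.eye(2)))           # True
    print(np.allclose(np.linalg.inv(P), P.T))        # True: P^{-1} = P^T
    print(np.isclose(abs(np.linalg.det(P)), 1.0))    # True: an orthogonal matrix has |det| = 1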
The following two theorems (proved in Problems 7.37 and 7.38) show important relationships between orthogonal matrices and orthonormal bases of a real inner product space V.

THEOREM 7.12: Suppose E = {e_i} and E' = {e'_i} are orthonormal bases of V. Let P be the change-of-basis matrix from the basis E to the basis E'. Then P is orthogonal.

THEOREM 7.13: Let {e1, ..., en} be an orthonormal basis of an inner product space V. Let P = [a_ij] be an orthogonal matrix. Then the following n vectors form an orthonormal basis for V:

e'_i = a_{1i}e1 + a_{2i}e2 + ... + a_{ni}en,   i = 1, 2, ..., n

Positive Definite Matrices

Let A be a real symmetric matrix; that is, A^T = A. Then A is said to be positive definite if, for every nonzero vector u in R^n,

⟨u, Au⟩ = u^T Au > 0

Algorithms to decide whether or not a matrix A is positive definite will be given in Chapter 12. However, for 2×2 matrices, we have simple criteria that we state formally in the following theorem (proved in Problem 7.43).

THEOREM 7.14: A 2×2 real symmetric matrix A = [a, b; c, d] = [a, b; b, d] is positive definite if and only if the diagonal entries a and d are positive and the determinant |A| = ad - bc = ad - b^2 is positive.

EXAMPLE 7.13 Consider the following symmetric matrices:

A = [1, 3; 3, 4], B = [1, -2; -2, -3], C = [1, -2; -2, 5]

A is not positive definite, because |A| = 4 - 9 = -5 is negative. B is not positive definite, because the diagonal entry -3 is negative. However, C is positive definite, because the diagonal entries 1 and 5 are positive, and the determinant |C| = 5 - 4 = 1 is also positive.
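Theorem 7.14 is easy to implement. The sketch below tests the three matrices of Example 7.13 both with the 2×2 criterion and with eigenvalues (a symmetric matrix is positive definite exactly when all its eigenvalues are positive); the two tests agree.

    import numpy as np

    def pos_def_2x2(M):
        a, b, d = M[0, 0], M[0, 1], M[1, 1]
        return a > 0 and d > 0 and a * d - b * b > 0

    for M in (np.array([[1.0, 3.0], [3.0, 4.0]]),
              np.array([[1.0, -2.0], [-2.0, -3.0]]),
              np.array([[1.0, -2.0], [-2.0, 5.0]])):
        eigs = np.linalg.eigvalsh(M)              # real eigenvalues, since M is symmetric
        print(pos_def_2x2(M), np.all(eigs > 0))   # False False / False False / True True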
The following theorem (proved in Problem 7.44) holds.

THEOREM 7.15: Let A be a real positive definite matrix. Then the function ⟨u, v⟩ = u^T Av is an inner product on R^n.

Matrix Representation of an Inner Product (Optional)

Theorem 7.15 says that every positive definite matrix A determines an inner product on R^n. This subsection may be viewed as giving the converse of this result.
Let V be a real inner product space with basis S = {u1, u2, ..., un}. The matrix

A = [a_ij], where a_ij = ⟨u_i, u_j⟩

is called the matrix representation of the inner product on V relative to the basis S.
Observe that A is symmetric, because the inner product is symmetric; that is, ⟨u_i, u_j⟩ = ⟨u_j, u_i⟩. Also, A depends on both the inner product on V and the basis S for V. Moreover, if S is an orthogonal basis, then A is diagonal, and if S is an orthonormal basis, then A is the identity matrix.

EXAMPLE 7.14 The vectors u1 = (1, 1, 0), u2 = (1, 2, 3), u3 = (1, 3, 5) form a basis S for Euclidean space R^3. Find the matrix A that represents the inner product in R^3 relative to this basis S.

First compute each ⟨u_i, u_j⟩ to obtain

⟨u1, u1⟩ = 1 + 1 + 0 = 2,  ⟨u1, u2⟩ = 1 + 2 + 0 = 3,  ⟨u1, u3⟩ = 1 + 3 + 0 = 4
⟨u2, u2⟩ = 1 + 4 + 9 = 14, ⟨u2, u3⟩ = 1 + 6 + 15 = 22, ⟨u3, u3⟩ = 1 + 9 + 25 = 35

Then A = [2, 3, 4; 3, 14, 22; 4, 22, 35]. As expected, A is symmetric.
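Because a_ij = ⟨u_i, u_j⟩, the matrix A of Example 7.14 is the Gram matrix S S^T, where the rows of S are the basis vectors. The sketch below also spot-checks the coordinate formula ⟨u, v⟩ = [u]^T A[v] (stated as Theorem 7.16 below) for a pair of arbitrarily chosen coordinate vectors.

    import numpy as np

    S = np.array([[1.0, 1.0, 0.0],
                  [1.0, 2.0, 3.0],
                  [1.0, 3.0, 5.0]])    # rows are u1, u2, u3

    A = S @ S.T                         # A[i, j] = <u_i, u_j>
    print(A)                            # [[ 2  3  4] [ 3 14 22] [ 4 22 35]]

    cu, cv = np.array([1.0, -1.0, 2.0]), np.array([0.0, 1.0, 1.0])   # sample coordinates
    u_vec, v_vec = S.T @ cu, S.T @ cv   # the corresponding vectors of R^3
    print(np.isclose(cu @ A @ cv, np.dot(u_vec, v_vec)))             # True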
The following theorems (proved in Problems 7.45 and 7.46, respectively) hold.

THEOREM 7.16: Let A be the matrix representation of an inner product relative to a basis S for V. Then, for any vectors u, v ∈ V, we have

⟨u, v⟩ = [u]^T A[v]

where [u] and [v] denote the (column) coordinate vectors relative to the basis S.
7.9 Complex Inner Product Spaces

This section considers vector spaces over the complex field C. First we recall some properties of the complex numbers (Section 1.7), especially the relations between a complex number z = a + bi, where a, b ∈ R, and its complex conjugate z̄ = a - bi:

z z̄ = a^2 + b^2,  |z| = √(a^2 + b^2),  the conjugate of z1 + z2 is z̄1 + z̄2,  the conjugate of z1z2 is z̄1 z̄2,  and the conjugate of z̄ is z

Also, z is real if and only if z̄ = z.
The following definition applies.

DEFINITION: Let V be a vector space over C. Suppose to each pair of vectors u, v ∈ V there is assigned a complex number, denoted by ⟨u, v⟩. This function is called a (complex) inner product on V if it satisfies the following axioms:

[I1*] (Linear Property) ⟨au1 + bu2, v⟩ = a⟨u1, v⟩ + b⟨u2, v⟩
[I2*] (Conjugate Symmetric Property) ⟨u, v⟩ = conj(⟨v, u⟩)
[I3*] (Positive Definite Property) ⟨u, u⟩ ≥ 0; and ⟨u, u⟩ = 0 if and only if u = 0.

The vector space V over C with an inner product is called a (complex) inner product space.

Observe that a complex inner product differs from the real case only in the second axiom [I2*].
Axiom [I1*] (Linear Property) is equivalent to the two conditions:

(a) ⟨u1 + u2, v⟩ = ⟨u1, v⟩ + ⟨u2, v⟩,  (b) ⟨ku, v⟩ = k⟨u, v⟩

On the other hand, applying [I1*] and [I2*], we obtain

⟨u, kv⟩ = conj(⟨kv, u⟩) = conj(k⟨v, u⟩) = k̄ conj(⟨v, u⟩) = k̄⟨u, v⟩

That is, we must take the conjugate of a complex number when it is taken out of the second position of a complex inner product. In fact (Problem 7.47), the inner product is conjugate linear in the second position; that is,

⟨u, av1 + bv2⟩ = ā⟨u, v1⟩ + b̄⟨u, v2⟩

Combining linear in the first position and conjugate linear in the second position, we obtain, by induction,

⟨Σ_i a_i u_i, Σ_j b_j v_j⟩ = Σ_{i,j} a_i b̄_j ⟨u_i, v_j⟩

The following remarks are in order.

Remark 1: Axiom [I1*] by itself implies that ⟨0, 0⟩ = ⟨0v, 0⟩ = 0⟨v, 0⟩ = 0. Accordingly, [I1*], [I2*], and [I3*] are equivalent to [I1*], [I2*], and the following axiom:

[I3*'] If u ≠ 0, then ⟨u, u⟩ > 0.

That is, a function satisfying [I1*], [I2*], and [I3*'] is a (complex) inner product on V.

Remark 2: By [I2*], ⟨u, u⟩ = conj(⟨u, u⟩). Thus, ⟨u, u⟩ must be real. By [I3*], ⟨u, u⟩ must be nonnegative, and hence, its positive real square root exists. As with real inner product spaces, we define ||u|| = √⟨u, u⟩ to be the norm or length of u.

Remark 3: In addition to the norm, we define the notions of orthogonality, orthogonal complement, and orthogonal and orthonormal sets as before. In fact, the definitions of distance and Fourier coefficient and projections are the same as in the real case.

EXAMPLE 7.15 (Complex Euclidean Space C^n). Let V = C^n, and let u = (z_i) and v = (w_i) be vectors in C^n. Then

⟨u, v⟩ = Σ_k z_k w̄_k = z1w̄1 + z2w̄2 + ... + znw̄n

is an inner product on V, called the usual or standard inner product on C^n. V with this inner product is called Complex Euclidean Space. We assume this inner product on C^n unless otherwise stated or implied. Assuming u and v are column vectors, the above inner product may be defined by

⟨u, v⟩ = u^T v̄

where, as with matrices, v̄ means the conjugate of each element of v. If u and v are real, we have w̄_i = w_i. In this case, the inner product reduces to the analogous one on R^n.
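A note of caution when computing: the convention here conjugates the second vector. The sketch below evaluates ⟨u, v⟩ = Σ z_k w̄_k explicitly for two arbitrary vectors in C^2 and checks conjugate symmetry and the fact that ⟨u, u⟩ is real and nonnegative.

    import numpy as np

    u = np.array([1 + 2j, 3 - 1j])
    v = np.array([2 - 1j, 1 + 1j])

    ip = np.sum(u * np.conj(v))                 # <u, v> with the second vector conjugated
    print(ip)                                   # (2+1j)
    print(np.conj(np.sum(v * np.conj(u))))      # (2+1j): equals <u, v>, by conjugate symmetry
    print(np.sum(u * np.conj(u)).real)          # 15.0:  <u, u> = ||u||^2 is real and >= 0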
EXAMPLE 7.16
(a) Let V be the vector space of complex continuous functions on the (real) interval a ≤ t ≤ b. Then the following is the usual inner product on V:

⟨f, g⟩ = ∫_a^b f(t) conj(g(t)) dt

(b) Let U be the vector space of m×n matrices over C. Suppose A = (z_ij) and B = (w_ij) are elements of U. Then the following is the usual inner product on U:

⟨A, B⟩ = tr(B^H A) = Σ_{i=1}^m Σ_{j=1}^n w̄_ij z_ij

As usual, B^H = conj(B)^T; that is, B^H is the conjugate transpose of B.

The following is a list of theorems for complex inner product spaces that are analogous to those for the real case. Here a Hermitian matrix A (i.e., one where A^H = conj(A)^T = A) plays the same role that a symmetric matrix A (i.e., one where A^T = A) plays in the real case. (Theorem 7.18 is proved in Problem 7.50.)

THEOREM 7.18: (Cauchy–Schwarz) Let V be a complex inner product space. Then

|⟨u, v⟩| ≤ ||u|| ||v||

THEOREM 7.19: Let W be a subspace of a complex inner product space V. Then V = W ⊕ W^⊥.

THEOREM 7.20: Suppose {u1, u2, ..., un} is a basis for a complex inner product space V. Then, for any v ∈ V,

v = (⟨v, u1⟩/⟨u1, u1⟩)u1 + (⟨v, u2⟩/⟨u2, u2⟩)u2 + ... + (⟨v, un⟩/⟨un, un⟩)un

THEOREM 7.21: Suppose {u1, u2, ..., un} is a basis for a complex inner product space V. Let A = [a_ij] be the complex matrix defined by a_ij = ⟨u_i, u_j⟩. Then, for any u, v ∈ V,

⟨u, v⟩ = [u]^T A conj([v])

where [u] and [v] are the coordinate column vectors in the given basis {u_i}. (Remark: This matrix A is said to represent the inner product on V.)

THEOREM 7.22: Let A be a Hermitian matrix (i.e., A^H = conj(A)^T = A) such that X^T A X̄ is real and positive for every nonzero vector X ∈ C^n. Then ⟨u, v⟩ = u^T A v̄ is an inner product on C^n.

THEOREM 7.23: Let A be the matrix that represents an inner product on V. Then A is Hermitian, and X^T A X̄ is real and positive for any nonzero vector in C^n.
7.10 Normed Vector Spaces (Optional)

We begin with a definition.

DEFINITION: Let V be a real or complex vector space. Suppose to each v ∈ V there is assigned a real number, denoted by ||v||. This function || · || is called a norm on V if it satisfies the following axioms:

[N1] ||v|| ≥ 0; and ||v|| = 0 if and only if v = 0.
[N2] ||kv|| = |k| ||v||.
[N3] ||u + v|| ≤ ||u|| + ||v||.

A vector space V with a norm is called a normed vector space.

Suppose V is a normed vector space. The distance between two vectors u and v in V is denoted and defined by

d(u, v) = ||u - v||

The following theorem (proved in Problem 7.56) is the main reason why d(u, v) is called the distance between u and v.

THEOREM 7.24: Let V be a normed vector space. Then the function d(u, v) = ||u - v|| satisfies the following three axioms of a metric space:

[M1] d(u, v) ≥ 0; and d(u, v) = 0 if and only if u = v.
[M2] d(u, v) = d(v, u).
[M3] d(u, v) ≤ d(u, w) + d(w, v).

Normed Vector Spaces and Inner Product Spaces

Suppose V is an inner product space. Recall that the norm of a vector v in V is defined by

||v|| = √⟨v, v⟩

One can prove (Theorem 7.2) that this norm satisfies [N1], [N2], and [N3]. Thus, every inner product space V is a normed vector space. On the other hand, there may be norms on a vector space V that do not come from an inner product on V, as shown below.

Norms on R^n and C^n

The following define three important norms on R^n and C^n:

||(a1, ..., an)||_∞ = max(|a_i|)
||(a1, ..., an)||_1 = |a1| + |a2| + ... + |an|
||(a1, ..., an)||_2 = √(|a1|^2 + |a2|^2 + ... + |an|^2)

(Note that subscripts are used to distinguish between the three norms.) The norms || · ||_∞, || · ||_1, and || · ||_2 are called the infinity-norm, one-norm, and two-norm, respectively. Observe that || · ||_2 is the norm on R^n (respectively, C^n) induced by the usual inner product on R^n (respectively, C^n). We will let d_∞, d_1, d_2 denote the corresponding distance functions.

EXAMPLE 7.17 Consider vectors u = (1, -5, 3) and v = (4, 2, -3) in R^3.

(a) The infinity norm chooses the maximum of the absolute values of the components. Hence,

||u||_∞ = 5 and ||v||_∞ = 4

(b) The one-norm adds the absolute values of the components. Thus,

||u||_1 = 1 + 5 + 3 = 9 and ||v||_1 = 4 + 2 + 3 = 9

(c) The two-norm is equal to the square root of the sum of the squares of the components (i.e., the norm induced by the usual inner product on R^3). Thus,

||u||_2 = √(1 + 25 + 9) = √35 and ||v||_2 = √(16 + 4 + 9) = √29

(d) Because u - v = (1 - 4, -5 - 2, 3 + 3) = (-3, -7, 6), we have

d_∞(u, v) = 7, d_1(u, v) = 3 + 7 + 6 = 16, d_2(u, v) = √(9 + 49 + 36) = √94
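These norms are available directly through numpy.linalg.norm; the sketch below recomputes the values of Example 7.17 as an illustration.

    import numpy as np

    u = np.array([1.0, -5.0, 3.0])
    v = np.array([4.0, 2.0, -3.0])

    for ord_ in (np.inf, 1, 2):
        print(ord_, np.linalg.norm(u, ord_), np.linalg.norm(u - v, ord_))
    # inf  5.0        7.0
    # 1    9.0       16.0
    # 2    5.916...   9.695...   (sqrt(35) and sqrt(94))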
EXAMPLE 7.18 Consider the Cartesian plane R^2 shown in Fig. 7-4.

(a) Let D1 be the set of points u = (x, y) in R^2 such that ||u||_2 = 1. Then D1 consists of the points (x, y) such that ||u||_2^2 = x^2 + y^2 = 1. Thus, D1 is the unit circle, as shown in Fig. 7-4.

(b) Let D2 be the set of points u = (x, y) in R^2 such that ||u||_1 = 1. Then D2 consists of the points (x, y) such that ||u||_1 = |x| + |y| = 1. Thus, D2 is the diamond inside the unit circle, as shown in Fig. 7-4.

(c) Let D3 be the set of points u = (x, y) in R^2 such that ||u||_∞ = 1. Then D3 consists of the points (x, y) such that ||u||_∞ = max(|x|, |y|) = 1. Thus, D3 is the square circumscribing the unit circle, as shown in Fig. 7-4.

Norms on C[a, b]

Consider the vector space V = C[a, b] of real continuous functions on the interval a ≤ t ≤ b. Recall that the following defines an inner product on V:

⟨f, g⟩ = ∫_a^b f(t)g(t) dt

Accordingly, the above inner product defines the following norm on V = C[a, b] (which is analogous to the || · ||_2 norm on R^n):

||f||_2 = √( ∫_a^b [f(t)]^2 dt )

The following define the other norms on V = C[a, b]:

||f||_1 = ∫_a^b |f(t)| dt and ||f||_∞ = max(|f(t)|)

There are geometrical descriptions of these two norms and their corresponding distance functions, which are described below.
The first norm is pictured in Fig. 7-5. Here

||f||_1 = area between the function |f| and the t-axis
d_1(f, g) = area between the functions f and g

This norm is analogous to the norm || · ||_1 on R^n.
The second norm is pictured in Fig. 7-6. Here

||f||_∞ = maximum distance between f and the t-axis
d_∞(f, g) = maximum distance between f and g

This norm is analogous to the norm || · ||_∞ on R^n.
SOLVED PROBLEMS

Inner Products

7.1. Expand: (a) ⟨5u1 + 8u2, 6v1 - 7v2⟩, (b) ⟨3u + 5v, 4u - 6v⟩, (c) ||2u - 3v||^2

Use linearity in both positions and, when possible, symmetry, ⟨u, v⟩ = ⟨v, u⟩.

(a) Take the inner product of each term on the left with each term on the right:

⟨5u1 + 8u2, 6v1 - 7v2⟩ = ⟨5u1, 6v1⟩ + ⟨5u1, -7v2⟩ + ⟨8u2, 6v1⟩ + ⟨8u2, -7v2⟩
= 30⟨u1, v1⟩ - 35⟨u1, v2⟩ + 48⟨u2, v1⟩ - 56⟨u2, v2⟩

[Remark: Observe the similarity between the above expansion and the expansion (5a + 8b)(6c - 7d) in ordinary algebra.]

(b) ⟨3u + 5v, 4u - 6v⟩ = 12⟨u, u⟩ - 18⟨u, v⟩ + 20⟨v, u⟩ - 30⟨v, v⟩ = 12⟨u, u⟩ + 2⟨u, v⟩ - 30⟨v, v⟩

(c) ||2u - 3v||^2 = ⟨2u - 3v, 2u - 3v⟩ = 4⟨u, u⟩ - 6⟨u, v⟩ - 6⟨v, u⟩ + 9⟨v, v⟩ = 4||u||^2 - 12⟨u, v⟩ + 9||v||^2

7.2. Consider vectors u = (1, 2, 4), v = (2, -3, 5), w = (4, 2, -3) in R^3. Find
(a) u · v, (b) u · w, (c) v · w, (d) (u + v) · w, (e) ||u||, (f) ||v||.

(a) Multiply corresponding components and add to get u · v = 2 - 6 + 20 = 16.
(b) u · w = 4 + 4 - 12 = -4.
(c) v · w = 8 - 6 - 15 = -13.
(d) First find u + v = (3, -1, 9). Then (u + v) · w = 12 - 2 - 27 = -17. Alternatively, using [I1], (u + v) · w = u · w + v · w = -4 - 13 = -17.
(e) First find ||u||^2 by squaring the components of u and adding:

||u||^2 = 1^2 + 2^2 + 4^2 = 1 + 4 + 16 = 21, and so ||u|| = √21

(f) ||v||^2 = 4 + 9 + 25 = 38, and so ||v|| = √38.

7.3. Verify that the following defines an inner product in R^2:

⟨u, v⟩ = x1y1 - x1y2 - x2y1 + 3x2y2, where u = (x1, x2), v = (y1, y2)

We argue via matrices. We can write ⟨u, v⟩ in matrix notation as follows:

⟨u, v⟩ = u^T Av = [x1, x2][1, -1; -1, 3][y1, y2]^T

Because A is real and symmetric, we need only show that A is positive definite. The diagonal elements 1 and 3 are positive, and the determinant |A| = 3 - 1 = 2 is positive. Thus, by Theorem 7.14, A is positive definite. Accordingly, by Theorem 7.15, ⟨u, v⟩ is an inner product.

7.4. Consider the vectors u = (1, 5) and v = (3, 4) in R^2. Find
(a) ⟨u, v⟩ with respect to the usual inner product in R^2.
(b) ⟨u, v⟩ with respect to the inner product in R^2 in Problem 7.3.
(c) ||v|| using the usual inner product in R^2.
(d) ||v|| using the inner product in R^2 in Problem 7.3.

(a) ⟨u, v⟩ = 3 + 20 = 23.
(b) ⟨u, v⟩ = 1·3 - 1·4 - 5·3 + 3·5·4 = 3 - 4 - 15 + 60 = 44.
(c) ||v||^2 = ⟨v, v⟩ = ⟨(3, 4), (3, 4)⟩ = 9 + 16 = 25; hence, ||v|| = 5.
(d) ||v||^2 = ⟨v, v⟩ = ⟨(3, 4), (3, 4)⟩ = 9 - 12 - 12 + 48 = 33; hence, ||v|| = √33.

7.5. Consider the following polynomials in P(t) with the inner product ⟨f, g⟩ = ∫_0^1 f(t)g(t) dt:

f(t) = t + 2, g(t) = 3t - 2, h(t) = t^2 - 2t - 3

(a) Find ⟨f, g⟩ and ⟨f, h⟩.
(b) Find ||f|| and ||g||.
(c) Normalize f and g.
(a) Integrate as follows:

⟨f, g⟩ = ∫_0^1 (t + 2)(3t - 2) dt = ∫_0^1 (3t^2 + 4t - 4) dt = [t^3 + 2t^2 - 4t]_0^1 = -1
⟨f, h⟩ = ∫_0^1 (t + 2)(t^2 - 2t - 3) dt = [t^4/4 - 7t^2/2 - 6t]_0^1 = -37/4

(b) ⟨f, f⟩ = ∫_0^1 (t + 2)(t + 2) dt = 19/3; hence, ||f|| = √(19/3) = (1/3)√57
⟨g, g⟩ = ∫_0^1 (3t - 2)(3t - 2) dt = 1; hence, ||g|| = √1 = 1

(c) Because ||f|| = (1/3)√57 and g is already a unit vector, we have

f̂ = (1/||f||)f = (3/√57)(t + 2) and ĝ = g = 3t - 2

7.6. Find cos θ where θ is the angle between:
(a) u = (1, 3, -5, 4) and v = (2, -3, 4, 1) in R^4,
(b) A = [9, 8, 7; 6, 5, 4] and B = [1, 2, 3; 4, 5, 6], where ⟨A, B⟩ = tr(B^T A).

Use cos θ = ⟨u, v⟩ / (||u|| ||v||)

(a) Compute:

⟨u, v⟩ = 2 - 9 - 20 + 4 = -23, ||u||^2 = 1 + 9 + 25 + 16 = 51, ||v||^2 = 4 + 9 + 16 + 1 = 30

Thus, cos θ = -23/(√51 √30) = -23/(3√170)

(b) Use ⟨A, B⟩ = tr(B^T A) = Σ_{i=1}^m Σ_{j=1}^n a_ij b_ij, the sum of the products of corresponding entries:

⟨A, B⟩ = 9 + 16 + 21 + 24 + 25 + 24 = 119

Use ||A||^2 = ⟨A, A⟩ = Σ_{i=1}^m Σ_{j=1}^n a_ij^2, the sum of the squares of all the elements of A:

||A||^2 = ⟨A, A⟩ = 9^2 + 8^2 + 7^2 + 6^2 + 5^2 + 4^2 = 271, and so ||A|| = √271
||B||^2 = ⟨B, B⟩ = 1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2 = 91, and so ||B|| = √91

Thus, cos θ = 119/(√271 √91)
7.7. Verify each of the following:
(a) Parallelogram Law (Fig. 7-7): ‖u + v‖² + ‖u − v‖² = 2‖u‖² + 2‖v‖².
(b) Polar form for ⟨u, v⟩ (which shows the inner product can be obtained from the norm function):
⟨u, v⟩ = ¼(‖u + v‖² − ‖u − v‖²).
Expand as follows to obtain
‖u + v‖² = ⟨u + v, u + v⟩ = ‖u‖² + 2⟨u, v⟩ + ‖v‖²   (1)
‖u − v‖² = ⟨u − v, u − v⟩ = ‖u‖² − 2⟨u, v⟩ + ‖v‖²   (2)
Add (1) and (2) to get the Parallelogram Law (a). Subtract (2) from (1) to obtain
‖u + v‖² − ‖u − v‖² = 4⟨u, v⟩
Divide by 4 to obtain the (real) polar form (b).
7.8. Prove Theorem 7.1 (Cauchy–Schwarz): For u and v in a real inner product space V,
⟨u, v⟩² ≤ ⟨u, u⟩⟨v, v⟩  or  |⟨u, v⟩| ≤ ‖u‖‖v‖.
For any real number t,
⟨tu + v, tu + v⟩ = t²⟨u, u⟩ + 2t⟨u, v⟩ + ⟨v, v⟩ = t²‖u‖² + 2t⟨u, v⟩ + ‖v‖²
Let a = ‖u‖², b = 2⟨u, v⟩, c = ‖v‖². Because ‖tu + v‖² ≥ 0, we have
at² + bt + c ≥ 0
for every value of t. This means that the quadratic polynomial cannot have two distinct real roots, which implies that b² − 4ac ≤ 0 or b² ≤ 4ac. Thus,
4⟨u, v⟩² ≤ 4‖u‖²‖v‖²
Dividing by 4 gives our result.
7.9. Prove Theorem 7.2: The norm in an inner product space V satisfies:
(a) [N₁] ‖v‖ ≥ 0; and ‖v‖ = 0 if and only if v = 0.
(b) [N₂] ‖kv‖ = |k|‖v‖.
(c) [N₃] ‖u + v‖ ≤ ‖u‖ + ‖v‖.
(a) If v ≠ 0, then ⟨v, v⟩ > 0, and hence, ‖v‖ = √⟨v, v⟩ > 0. If v = 0, then ⟨0, 0⟩ = 0. Consequently,
‖0‖ = √0 = 0. Thus, [N₁] is true.
(b) We have ‖kv‖² = ⟨kv, kv⟩ = k²⟨v, v⟩ = k²‖v‖². Taking the square root of both sides gives [N₂].
(c) Using the Cauchy–Schwarz inequality, we obtain
‖u + v‖² = ⟨u + v, u + v⟩ = ⟨u, u⟩ + ⟨u, v⟩ + ⟨v, u⟩ + ⟨v, v⟩ ≤ ‖u‖² + 2‖u‖‖v‖ + ‖v‖² = (‖u‖ + ‖v‖)²
Taking the square root of both sides yields [N₃].
Orthogonality, Orthogonal Complements, Orthogonal Sets
7.10. Find k so that u = (1, 2, k, 3) and v = (3, k, 7, −5) in R⁴ are orthogonal.
First find
⟨u, v⟩ = (1, 2, k, 3)·(3, k, 7, −5) = 3 + 2k + 7k − 15 = 9k − 12
Then set ⟨u, v⟩ = 9k − 12 = 0 to obtain k = 4/3.
7.11. Let W be the subspace of R⁵ spanned by u = (1, 2, 3, −1, 2) and v = (2, 4, 7, 2, −1). Find a basis of the orthogonal complement W⊥ of W.
We seek all vectors w = (x, y, z, s, t) such that
⟨w, u⟩ = x + 2y + 3z − s + 2t = 0
⟨w, v⟩ = 2x + 4y + 7z + 2s − t = 0
Eliminating x from the second equation, we find the equivalent system
x + 2y + 3z − s + 2t = 0
z + 4s − 5t = 0
Figure 7-7
The free variables are y, s, and t. Therefore,
(1) Set y = −1, s = 0, t = 0 to obtain the solution w₁ = (2, −1, 0, 0, 0).
(2) Set y = 0, s = 1, t = 0 to find the solution w₂ = (13, 0, −4, 1, 0).
(3) Set y = 0, s = 0, t = 1 to obtain the solution w₃ = (−17, 0, 5, 0, 1).
The set {w₁, w₂, w₃} is a basis of W⊥.
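A basis of W⊥ is a basis of the null space of the matrix whose rows span W. A short SymPy sketch (not from the text; the free-variable choices differ, so the basis it prints is a different but equally valid one):

```python
from sympy import Matrix

A = Matrix([[1, 2, 3, -1, 2],
            [2, 4, 7, 2, -1]])    # rows are u and v from Problem 7.11
for vec in A.nullspace():          # basis of the solution space of A x = 0
    print(vec.T)
```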
7.12. Let w = (1, 2, 3, 1) be a vector in R⁴. Find an orthogonal basis for w⊥.
Find a nonzero solution of x + 2y + 3z + t = 0, say v₁ = (0, 0, 1, −3). Now find a nonzero solution of the system
x + 2y + 3z + t = 0, z − 3t = 0
say v₂ = (0, −5, 3, 1). Last, find a nonzero solution of the system
x + 2y + 3z + t = 0, −5y + 3z + t = 0, z − 3t = 0
say v₃ = (−14, 2, 3, 1). Thus, v₁, v₂, v₃ form an orthogonal basis for w⊥.
7.13. Let S consist of the following vectors in R⁴:
u₁ = (1, 1, 0, −1), u₂ = (1, 2, 1, 3), u₃ = (1, 1, −9, 2), u₄ = (16, −13, 1, 3)
(a) Show that S is orthogonal and a basis of R⁴.
(b) Find the coordinates of an arbitrary vector v = (a, b, c, d) in R⁴ relative to the basis S.
(a) Compute
u₁·u₂ = 1 + 2 + 0 − 3 = 0,  u₁·u₃ = 1 + 1 + 0 − 2 = 0,  u₁·u₄ = 16 − 13 + 0 − 3 = 0
u₂·u₃ = 1 + 2 − 9 + 6 = 0,  u₂·u₄ = 16 − 26 + 1 + 9 = 0,  u₃·u₄ = 16 − 13 − 9 + 6 = 0
Thus, S is orthogonal, and S is linearly independent. Accordingly, S is a basis for R⁴ because any four linearly independent vectors form a basis of R⁴.
(b) Because S is orthogonal, we need only find the Fourier coefficients of v with respect to the basis vectors, as in Theorem 7.7. Thus,
k₁ = ⟨v, u₁⟩/⟨u₁, u₁⟩ = (a + b − d)/3,
k₂ = ⟨v, u₂⟩/⟨u₂, u₂⟩ = (a + 2b + c + 3d)/15,
k₃ = ⟨v, u₃⟩/⟨u₃, u₃⟩ = (a + b − 9c + 2d)/87,
k₄ = ⟨v, u₄⟩/⟨u₄, u₄⟩ = (16a − 13b + c + 3d)/435
are the coordinates of v with respect to the basis S.
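A sketch (assumed example vector, not from the text) of computing these Fourier coefficients numerically and checking that they reconstruct v:

```python
import numpy as np

U = np.array([[1, 1, 0, -1],
              [1, 2, 1, 3],
              [1, 1, -9, 2],
              [16, -13, 1, 3]], dtype=float)   # rows are u1, ..., u4
v = np.array([1.0, 2.0, 3.0, 4.0])             # a sample (a, b, c, d)

k = np.array([np.dot(v, u) / np.dot(u, u) for u in U])   # Fourier coefficients
print(k)
print(np.allclose(k @ U, v))    # True: k1*u1 + k2*u2 + k3*u3 + k4*u4 == v
```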
7.14. Suppose S, S₁, S₂ are subsets of V. Prove the following:
(a) S ⊆ S⊥⊥.
(b) If S₁ ⊆ S₂, then S₂⊥ ⊆ S₁⊥.
(c) S⊥ = span(S)⊥.
(a) Let w ∈ S. Then ⟨w, v⟩ = 0 for every v ∈ S⊥; hence, w ∈ S⊥⊥. Accordingly, S ⊆ S⊥⊥.
(b) Let w ∈ S₂⊥. Then ⟨w, v⟩ = 0 for every v ∈ S₂. Because S₁ ⊆ S₂, ⟨w, v⟩ = 0 for every v ∈ S₁. Thus, w ∈ S₁⊥, and hence, S₂⊥ ⊆ S₁⊥.
(c) Because S ⊆ span(S), part (b) gives us span(S)⊥ ⊆ S⊥. Suppose u ∈ S⊥ and v ∈ span(S). Then there exist w₁, w₂, ..., wₖ in S such that v = a₁w₁ + a₂w₂ + ⋯ + aₖwₖ. Then, using u ∈ S⊥, we have
⟨u, v⟩ = ⟨u, a₁w₁ + a₂w₂ + ⋯ + aₖwₖ⟩ = a₁⟨u, w₁⟩ + a₂⟨u, w₂⟩ + ⋯ + aₖ⟨u, wₖ⟩ = a₁(0) + a₂(0) + ⋯ + aₖ(0) = 0
Thus, u ∈ span(S)⊥. Accordingly, S⊥ ⊆ span(S)⊥. Both inclusions give S⊥ = span(S)⊥.
7.15. Prove Theorem 7.5: Suppose S is an orthogonal set of nonzero vectors. Then S is linearly independent.
Suppose S = {u₁, u₂, ..., uᵣ} and suppose
a₁u₁ + a₂u₂ + ⋯ + aᵣuᵣ = 0   (1)
Taking the inner product of (1) with u₁, we get
0 = ⟨0, u₁⟩ = ⟨a₁u₁ + a₂u₂ + ⋯ + aᵣuᵣ, u₁⟩ = a₁⟨u₁, u₁⟩ + a₂⟨u₂, u₁⟩ + ⋯ + aᵣ⟨uᵣ, u₁⟩
= a₁⟨u₁, u₁⟩ + a₂·0 + ⋯ + aᵣ·0 = a₁⟨u₁, u₁⟩
Because u₁ ≠ 0, we have ⟨u₁, u₁⟩ ≠ 0. Thus, a₁ = 0. Similarly, for i = 2, ..., r, taking the inner product of (1) with uᵢ,
0 = ⟨0, uᵢ⟩ = ⟨a₁u₁ + ⋯ + aᵣuᵣ, uᵢ⟩ = a₁⟨u₁, uᵢ⟩ + ⋯ + aᵢ⟨uᵢ, uᵢ⟩ + ⋯ + aᵣ⟨uᵣ, uᵢ⟩ = aᵢ⟨uᵢ, uᵢ⟩
But ⟨uᵢ, uᵢ⟩ ≠ 0, and hence, every aᵢ = 0. Thus, S is linearly independent.
7.16. Prove Theorem 7.6 (Pythagoras): Suppose {u₁, u₂, ..., uᵣ} is an orthogonal set of vectors. Then
‖u₁ + u₂ + ⋯ + uᵣ‖² = ‖u₁‖² + ‖u₂‖² + ⋯ + ‖uᵣ‖²
Expanding the inner product, we have
‖u₁ + u₂ + ⋯ + uᵣ‖² = ⟨u₁ + u₂ + ⋯ + uᵣ, u₁ + u₂ + ⋯ + uᵣ⟩ = ⟨u₁, u₁⟩ + ⟨u₂, u₂⟩ + ⋯ + ⟨uᵣ, uᵣ⟩ + Σ_{i≠j} ⟨uᵢ, uⱼ⟩
The theorem follows from the fact that ⟨uᵢ, uᵢ⟩ = ‖uᵢ‖² and ⟨uᵢ, uⱼ⟩ = 0 for i ≠ j.
7.17. Prove Theorem 7.7: Let {u₁, u₂, ..., uₙ} be an orthogonal basis of V. Then, for any v ∈ V,
v = (⟨v, u₁⟩/⟨u₁, u₁⟩)u₁ + (⟨v, u₂⟩/⟨u₂, u₂⟩)u₂ + ⋯ + (⟨v, uₙ⟩/⟨uₙ, uₙ⟩)uₙ
Suppose v = k₁u₁ + k₂u₂ + ⋯ + kₙuₙ. Taking the inner product of both sides with u₁ yields
⟨v, u₁⟩ = ⟨k₁u₁ + k₂u₂ + ⋯ + kₙuₙ, u₁⟩ = k₁⟨u₁, u₁⟩ + k₂⟨u₂, u₁⟩ + ⋯ + kₙ⟨uₙ, u₁⟩
= k₁⟨u₁, u₁⟩ + k₂·0 + ⋯ + kₙ·0 = k₁⟨u₁, u₁⟩
Thus, k₁ = ⟨v, u₁⟩/⟨u₁, u₁⟩. Similarly, for i = 2, ..., n,
⟨v, uᵢ⟩ = ⟨k₁u₁ + k₂u₂ + ⋯ + kₙuₙ, uᵢ⟩ = k₁⟨u₁, uᵢ⟩ + k₂⟨u₂, uᵢ⟩ + ⋯ + kₙ⟨uₙ, uᵢ⟩
= k₁·0 + ⋯ + kᵢ⟨uᵢ, uᵢ⟩ + ⋯ + kₙ·0 = kᵢ⟨uᵢ, uᵢ⟩
Thus, kᵢ = ⟨v, uᵢ⟩/⟨uᵢ, uᵢ⟩. Substituting for kᵢ in the equation v = k₁u₁ + ⋯ + kₙuₙ, we obtain the desired result.
7.18. Suppose E = {e₁, e₂, ..., eₙ} is an orthonormal basis of V. Prove:
(a) For any u ∈ V, we have u = ⟨u, e₁⟩e₁ + ⟨u, e₂⟩e₂ + ⋯ + ⟨u, eₙ⟩eₙ.
(b) ⟨a₁e₁ + ⋯ + aₙeₙ, b₁e₁ + ⋯ + bₙeₙ⟩ = a₁b₁ + a₂b₂ + ⋯ + aₙbₙ.
(c) For any u, v ∈ V, we have ⟨u, v⟩ = ⟨u, e₁⟩⟨v, e₁⟩ + ⋯ + ⟨u, eₙ⟩⟨v, eₙ⟩.
(a) Suppose u = k₁e₁ + k₂e₂ + ⋯ + kₙeₙ. Taking the inner product of u with e₁,
⟨u, e₁⟩ = ⟨k₁e₁ + k₂e₂ + ⋯ + kₙeₙ, e₁⟩ = k₁⟨e₁, e₁⟩ + k₂⟨e₂, e₁⟩ + ⋯ + kₙ⟨eₙ, e₁⟩ = k₁(1) + k₂(0) + ⋯ + kₙ(0) = k₁
Similarly, for i = 2, ..., n,
⟨u, eᵢ⟩ = ⟨k₁e₁ + ⋯ + kᵢeᵢ + ⋯ + kₙeₙ, eᵢ⟩ = k₁⟨e₁, eᵢ⟩ + ⋯ + kᵢ⟨eᵢ, eᵢ⟩ + ⋯ + kₙ⟨eₙ, eᵢ⟩ = k₁(0) + ⋯ + kᵢ(1) + ⋯ + kₙ(0) = kᵢ
Substituting ⟨u, eᵢ⟩ for kᵢ in the equation u = k₁e₁ + ⋯ + kₙeₙ, we obtain the desired result.
(b) We have
⟨Σᵢ aᵢeᵢ, Σⱼ bⱼeⱼ⟩ = Σ_{i,j} aᵢbⱼ⟨eᵢ, eⱼ⟩ = Σᵢ aᵢbᵢ⟨eᵢ, eᵢ⟩ + Σ_{i≠j} aᵢbⱼ⟨eᵢ, eⱼ⟩
But ⟨eᵢ, eⱼ⟩ = 0 for i ≠ j, and ⟨eᵢ, eⱼ⟩ = 1 for i = j. Hence, as required,
⟨Σᵢ aᵢeᵢ, Σⱼ bⱼeⱼ⟩ = Σᵢ aᵢbᵢ = a₁b₁ + a₂b₂ + ⋯ + aₙbₙ
(c) By part (a), we have
u = ⟨u, e₁⟩e₁ + ⋯ + ⟨u, eₙ⟩eₙ  and  v = ⟨v, e₁⟩e₁ + ⋯ + ⟨v, eₙ⟩eₙ
Thus, by part (b),
⟨u, v⟩ = ⟨u, e₁⟩⟨v, e₁⟩ + ⟨u, e₂⟩⟨v, e₂⟩ + ⋯ + ⟨u, eₙ⟩⟨v, eₙ⟩
Projections, Gram–Schmidt Algorithm, Applications
7.19. Suppose w ≠ 0. Let v be any vector in V. Show that
c = ⟨v, w⟩/⟨w, w⟩ = ⟨v, w⟩/‖w‖²
is the unique scalar such that v′ = v − cw is orthogonal to w.
In order for v′ to be orthogonal to w we must have
⟨v − cw, w⟩ = 0  or  ⟨v, w⟩ − c⟨w, w⟩ = 0  or  ⟨v, w⟩ = c⟨w, w⟩
Thus, c = ⟨v, w⟩/⟨w, w⟩. Conversely, suppose c = ⟨v, w⟩/⟨w, w⟩. Then
⟨v − cw, w⟩ = ⟨v, w⟩ − c⟨w, w⟩ = ⟨v, w⟩ − (⟨v, w⟩/⟨w, w⟩)⟨w, w⟩ = 0
7.20. Find the Fourier coefficient c and the projection of v = (1, −2, 3, −4) along w = (1, 2, 1, 2) in R⁴.
Compute ⟨v, w⟩ = 1 − 4 + 3 − 8 = −8 and ‖w‖² = 1 + 4 + 1 + 4 = 10. Then
c = −8/10 = −4/5  and  proj(v, w) = cw = (−4/5, −8/5, −4/5, −8/5)
7.21. Consider the subspace U of R⁴ spanned by the vectors
v₁ = (1, 1, 1, 1), v₂ = (1, 1, 2, 4), v₃ = (1, 2, −4, −3)
Find (a) an orthogonal basis of U; (b) an orthonormal basis of U.
(a) Use the Gram–Schmidt algorithm. Begin by setting w₁ = v₁ = (1, 1, 1, 1). Next find
v₂ − (⟨v₂, w₁⟩/⟨w₁, w₁⟩)w₁ = (1, 1, 2, 4) − (8/4)(1, 1, 1, 1) = (−1, −1, 0, 2)
Set w₂ = (−1, −1, 0, 2). Then find
v₃ − (⟨v₃, w₁⟩/⟨w₁, w₁⟩)w₁ − (⟨v₃, w₂⟩/⟨w₂, w₂⟩)w₂ = (1, 2, −4, −3) − (−4/4)(1, 1, 1, 1) − (−9/6)(−1, −1, 0, 2) = (1/2, 3/2, −3, 1)
Clear fractions to obtain w₃ = (1, 3, −6, 2). Then w₁, w₂, w₃ form an orthogonal basis of U.
(b) Normalize the orthogonal basis consisting of w₁, w₂, w₃. Because ‖w₁‖² = 4, ‖w₂‖² = 6, and ‖w₃‖² = 50, the following vectors form an orthonormal basis of U:
u₁ = ½(1, 1, 1, 1), u₂ = (1/√6)(−1, −1, 0, 2), u₃ = (1/(5√2))(1, 3, −6, 2)
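A plain Gram–Schmidt routine, applied to the vectors of Problem 7.21 (a sketch, not the text's own code; the output vectors are scalar multiples of the w's found above):

```python
import numpy as np

def gram_schmidt(vectors):
    """Return an orthogonal basis spanning the same subspace."""
    basis = []
    for v in vectors:
        w = v.astype(float).copy()
        for b in basis:
            w -= (np.dot(v, b) / np.dot(b, b)) * b   # subtract the projection of v on b
        basis.append(w)
    return basis

vs = [np.array([1, 1, 1, 1]),
      np.array([1, 1, 2, 4]),
      np.array([1, 2, -4, -3])]
for w in gram_schmidt(vs):
    print(w)    # (1,1,1,1), (-1,-1,0,2), (0.5, 1.5, -3, 1)
```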
7.22. Consider the vector space P(t) with inner product ⟨f, g⟩ = ∫₀¹ f(t)g(t) dt. Apply the Gram–Schmidt algorithm to the set {1, t, t²} to obtain an orthogonal set {f₀, f₁, f₂} with integer coefficients.
First set f₀ = 1. Then find
t − (⟨t, 1⟩/⟨1, 1⟩)·1 = t − (1/2)·1 = t − 1/2
Clear fractions to obtain f₁ = 2t − 1. Then find
t² − (⟨t², 1⟩/⟨1, 1⟩)(1) − (⟨t², 2t − 1⟩/⟨2t − 1, 2t − 1⟩)(2t − 1) = t² − (1/3)(1) − ((1/6)/(1/3))(2t − 1) = t² − t + 1/6
Clear fractions to obtain f₂ = 6t² − 6t + 1. Thus, {1, 2t − 1, 6t² − 6t + 1} is the required orthogonal set.
7.23. Suppose v = (1, 3, 5, 7). Find the projection of v onto W or, in other words, find w ∈ W that minimizes ‖v − w‖, where W is the subspace of R⁴ spanned by
(a) u₁ = (1, 1, 1, 1) and u₂ = (1, −3, 4, −2),
(b) v₁ = (1, 1, 1, 1) and v₂ = (1, 2, 3, 2).
(a) Because u₁ and u₂ are orthogonal, we need only compute the Fourier coefficients:
c₁ = ⟨v, u₁⟩/⟨u₁, u₁⟩ = (1 + 3 + 5 + 7)/(1 + 1 + 1 + 1) = 16/4 = 4
c₂ = ⟨v, u₂⟩/⟨u₂, u₂⟩ = (1 − 9 + 20 − 14)/(1 + 9 + 16 + 4) = −2/30 = −1/15
Then w = proj(v, W) = c₁u₁ + c₂u₂ = 4(1, 1, 1, 1) − (1/15)(1, −3, 4, −2) = (59/15, 63/15, 56/15, 62/15).
(b) Because v₁ and v₂ are not orthogonal, first apply the Gram–Schmidt algorithm to find an orthogonal basis for W. Set w₁ = v₁ = (1, 1, 1, 1). Then find
v₂ − (⟨v₂, w₁⟩/⟨w₁, w₁⟩)w₁ = (1, 2, 3, 2) − (8/4)(1, 1, 1, 1) = (−1, 0, 1, 0)
Set w₂ = (−1, 0, 1, 0). Now compute
c₁ = ⟨v, w₁⟩/⟨w₁, w₁⟩ = (1 + 3 + 5 + 7)/(1 + 1 + 1 + 1) = 16/4 = 4
c₂ = ⟨v, w₂⟩/⟨w₂, w₂⟩ = (−1 + 0 + 5 + 0)/(1 + 0 + 1 + 0) = 4/2 = 2
Then w = proj(v, W) = c₁w₁ + c₂w₂ = 4(1, 1, 1, 1) + 2(−1, 0, 1, 0) = (2, 4, 6, 4).
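The projection in (b) can be cross-checked via least squares (the normal equations AᵀAc = Aᵀv), with the spanning vectors as columns of A. A sketch, not part of the text:

```python
import numpy as np

v = np.array([1.0, 3.0, 5.0, 7.0])
A = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 2.0]])       # columns are v1 and v2 of part (b)

c, *_ = np.linalg.lstsq(A, v, rcond=None)   # least-squares coefficients
print(A @ c)                                # projection of v onto W: [2. 4. 6. 4.]
```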
7.24. Suppose w₁ and w₂ are nonzero orthogonal vectors. Let v be any vector in V. Find c₁ and c₂ so that v′ is orthogonal to w₁ and w₂, where v′ = v − c₁w₁ − c₂w₂.
If v′ is orthogonal to w₁, then
0 = ⟨v − c₁w₁ − c₂w₂, w₁⟩ = ⟨v, w₁⟩ − c₁⟨w₁, w₁⟩ − c₂⟨w₂, w₁⟩ = ⟨v, w₁⟩ − c₁⟨w₁, w₁⟩ − c₂·0 = ⟨v, w₁⟩ − c₁⟨w₁, w₁⟩
Thus, c₁ = ⟨v, w₁⟩/⟨w₁, w₁⟩. (That is, c₁ is the component of v along w₁.) Similarly, if v′ is orthogonal to w₂, then
0 = ⟨v − c₁w₁ − c₂w₂, w₂⟩ = ⟨v, w₂⟩ − c₂⟨w₂, w₂⟩
Thus, c₂ = ⟨v, w₂⟩/⟨w₂, w₂⟩. (That is, c₂ is the component of v along w₂.)
7.25. Prove Theorem 7.8: Suppose w₁, w₂, ..., wᵣ form an orthogonal set of nonzero vectors in V. Let v ∈ V. Define
v′ = v − (c₁w₁ + c₂w₂ + ⋯ + cᵣwᵣ), where cᵢ = ⟨v, wᵢ⟩/⟨wᵢ, wᵢ⟩
Then v′ is orthogonal to w₁, w₂, ..., wᵣ.
For i = 1, 2, ..., r, and using ⟨wᵢ, wⱼ⟩ = 0 for i ≠ j, we have
⟨v − c₁w₁ − c₂w₂ − ⋯ − cᵣwᵣ, wᵢ⟩ = ⟨v, wᵢ⟩ − c₁⟨w₁, wᵢ⟩ − ⋯ − cᵢ⟨wᵢ, wᵢ⟩ − ⋯ − cᵣ⟨wᵣ, wᵢ⟩
= ⟨v, wᵢ⟩ − c₁·0 − ⋯ − cᵢ⟨wᵢ, wᵢ⟩ − ⋯ − cᵣ·0 = ⟨v, wᵢ⟩ − cᵢ⟨wᵢ, wᵢ⟩ = ⟨v, wᵢ⟩ − (⟨v, wᵢ⟩/⟨wᵢ, wᵢ⟩)⟨wᵢ, wᵢ⟩ = 0
The theorem is proved.
7.26. Prove Theorem 7.9: Let {v₁, v₂, ..., vₙ} be any basis of an inner product space V. Then there exists an orthonormal basis {u₁, u₂, ..., uₙ} of V such that the change-of-basis matrix from {vᵢ} to {uᵢ} is triangular; that is, for k = 1, 2, ..., n,
uₖ = aₖ₁v₁ + aₖ₂v₂ + ⋯ + aₖₖvₖ
The proof uses the Gram–Schmidt algorithm and Remarks 1 and 3 of Section 7.7. That is, apply the algorithm to {vᵢ} to obtain an orthogonal basis {w₁, ..., wₙ}, and then normalize {wᵢ} to obtain an orthonormal basis {uᵢ} of V. The specific algorithm guarantees that each wₖ is a linear combination of v₁, ..., vₖ, and hence, each uₖ is a linear combination of v₁, ..., vₖ.
7.27. Prove Theorem 7.10: Suppose S = {w₁, w₂, ..., wᵣ} is an orthogonal basis for a subspace W of V. Then one may extend S to an orthogonal basis for V; that is, one may find vectors w_{r+1}, ..., wₙ such that {w₁, w₂, ..., wₙ} is an orthogonal basis for V.
Extend S to a basis S′ = {w₁, ..., wᵣ, v_{r+1}, ..., vₙ} for V. Applying the Gram–Schmidt algorithm to S′, we first obtain w₁, w₂, ..., wᵣ because S is orthogonal, and then we obtain vectors w_{r+1}, ..., wₙ, where {w₁, w₂, ..., wₙ} is an orthogonal basis for V. Thus, the theorem is proved.
7.28. Prove Theorem 7.4: Let W be a subspace of V. Then V = W ⊕ W⊥.
By Theorem 7.9, there exists an orthogonal basis {u₁, ..., uᵣ} of W, and by Theorem 7.10 we can extend it to an orthogonal basis {u₁, u₂, ..., uₙ} of V. Hence, u_{r+1}, ..., uₙ ∈ W⊥. If v ∈ V, then
v = a₁u₁ + ⋯ + aₙuₙ, where a₁u₁ + ⋯ + aᵣuᵣ ∈ W and a_{r+1}u_{r+1} + ⋯ + aₙuₙ ∈ W⊥
Accordingly, V = W + W⊥.
On the other hand, if w ∈ W ∩ W⊥, then ⟨w, w⟩ = 0. This yields w = 0. Hence, W ∩ W⊥ = {0}.
The two conditions V = W + W⊥ and W ∩ W⊥ = {0} give the desired result V = W ⊕ W⊥.
Remark: Note that we have proved the theorem for the case that V has finite dimension. We remark that the theorem also holds for spaces of arbitrary dimension.
7.29. Suppose W is a subspace of a finite-dimensional space V. Prove that W = W⊥⊥.
By Theorem 7.4, V = W ⊕ W⊥, and also V = W⊥ ⊕ W⊥⊥. Hence,
dim W = dim V − dim W⊥  and  dim W⊥⊥ = dim V − dim W⊥
This yields dim W = dim W⊥⊥. But W ⊆ W⊥⊥ (see Problem 7.14). Hence, W = W⊥⊥, as required.
7.30. Prove the following: Suppose w₁, w₂, ..., wᵣ form an orthogonal set of nonzero vectors in V. Let v be any vector in V and let cᵢ be the component of v along wᵢ. Then, for any scalars a₁, ..., aᵣ, we have
‖v − Σₖ₌₁ʳ cₖwₖ‖ ≤ ‖v − Σₖ₌₁ʳ aₖwₖ‖
That is, Σ cᵢwᵢ is the closest approximation to v as a linear combination of w₁, ..., wᵣ.
By Theorem 7.8, v − Σ cₖwₖ is orthogonal to every wᵢ and hence orthogonal to any linear combination of w₁, w₂, ..., wᵣ. Therefore, using the Pythagorean theorem and summing from k = 1 to r,
‖v − Σ aₖwₖ‖² = ‖v − Σ cₖwₖ + Σ (cₖ − aₖ)wₖ‖² = ‖v − Σ cₖwₖ‖² + ‖Σ (cₖ − aₖ)wₖ‖² ≥ ‖v − Σ cₖwₖ‖²
The square root of both sides gives our theorem.
7.31. Suppose {e₁, e₂, ..., eᵣ} is an orthonormal set of vectors in V. Let v be any vector in V and let cᵢ be the Fourier coefficient of v with respect to eᵢ. Prove Bessel's inequality:
Σₖ₌₁ʳ cₖ² ≤ ‖v‖²
Note that cᵢ = ⟨v, eᵢ⟩, because ‖eᵢ‖ = 1. Then, using ⟨eᵢ, eⱼ⟩ = 0 for i ≠ j and summing from k = 1 to r, we get
0 ≤ ⟨v − Σ cₖeₖ, v − Σ cₖeₖ⟩ = ⟨v, v⟩ − 2⟨v, Σ cₖeₖ⟩ + Σ cₖ² = ⟨v, v⟩ − Σ 2cₖ⟨v, eₖ⟩ + Σ cₖ² = ⟨v, v⟩ − Σ 2cₖ² + Σ cₖ² = ⟨v, v⟩ − Σ cₖ²
This gives us our inequality.
Orthogonal Matrices
7.32. Find an orthogonal matrix P whose first row is u₁ = (1/3, 2/3, 2/3).
First find a nonzero vector w₂ = (x, y, z) that is orthogonal to u₁—that is, for which
0 = ⟨u₁, w₂⟩ = x/3 + 2y/3 + 2z/3 = 0  or  x + 2y + 2z = 0
One such solution is w₂ = (0, 1, −1). Normalize w₂ to obtain the second row of P:
u₂ = (0, 1/√2, −1/√2)
Next find a nonzero vector w₃ = (x, y, z) that is orthogonal to both u₁ and u₂—that is, for which
0 = ⟨u₁, w₃⟩ = x/3 + 2y/3 + 2z/3 = 0  or  x + 2y + 2z = 0
0 = ⟨u₂, w₃⟩ = y/√2 − z/√2 = 0  or  y − z = 0
Set z = 1 and find the solution w₃ = (−4, 1, 1). Normalize w₃ and obtain the third row of P; that is,
u₃ = (−4/√18, 1/√18, 1/√18)
Thus,
P = [1/3, 2/3, 2/3; 0, 1/√2, −1/√2; −4/(3√2), 1/(3√2), 1/(3√2)]
We emphasize that the above matrix P is not unique.
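A quick numerical check (a sketch, not from the text) that the rows and columns of this P are orthonormal:

```python
import numpy as np

r2, r18 = np.sqrt(2), np.sqrt(18)
P = np.array([[1/3,     2/3,    2/3],
              [0,       1/r2,  -1/r2],
              [-4/r18,  1/r18,  1/r18]])
print(np.allclose(P @ P.T, np.eye(3)))   # True: rows are orthonormal
print(np.allclose(P.T @ P, np.eye(3)))   # True: columns are orthonormal
```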
7.33. Let A = [1, 1, −1; 1, 3, 4; 7, −5, 2]. Determine whether or not: (a) the rows of A are orthogonal;
(b) A is an orthogonal matrix; (c) the columns of A are orthogonal.
(a) Yes, because (1, 1, −1)·(1, 3, 4) = 1 + 3 − 4 = 0, (1, 1, −1)·(7, −5, 2) = 7 − 5 − 2 = 0, and (1, 3, 4)·(7, −5, 2) = 7 − 15 + 8 = 0.
(b) No, because the rows of A are not unit vectors; for example, ‖(1, 1, −1)‖² = 1 + 1 + 1 = 3.
(c) No; for example, (1, 1, 7)·(1, 3, −5) = 1 + 3 − 35 = −31 ≠ 0.
7.34. Let B be the matrix obtained by normalizing each row of A in Problem 7.33.
(a) Find B. (b) Is B an orthogonal matrix? (c) Are the columns of B orthogonal?
(a) We have
‖(1, 1, −1)‖² = 1 + 1 + 1 = 3, ‖(1, 3, 4)‖² = 1 + 9 + 16 = 26, ‖(7, −5, 2)‖² = 49 + 25 + 4 = 78
Thus,
B = [1/√3, 1/√3, −1/√3; 1/√26, 3/√26, 4/√26; 7/√78, −5/√78, 2/√78]
(b) Yes, because the rows of B are still orthogonal and are now unit vectors.
(c) Yes, because the rows of B form an orthonormal set of vectors. Then, by Theorem 7.11, the columns of
B must automatically form an orthonormal set.
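A sketch (not from the text) that normalizes the rows of A and confirms both conclusions numerically:

```python
import numpy as np

A = np.array([[1.0, 1.0, -1.0],
              [1.0, 3.0, 4.0],
              [7.0, -5.0, 2.0]])
B = A / np.linalg.norm(A, axis=1, keepdims=True)   # divide each row by its length
print(np.allclose(B @ B.T, np.eye(3)))   # True: rows orthonormal, so B is orthogonal
print(np.allclose(B.T @ B, np.eye(3)))   # True: columns orthonormal (Theorem 7.11)
```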
7.35. Prove each of the following:
(a) P is orthogonal if and only if Pᵀ is orthogonal.
(b) If P is orthogonal, then P⁻¹ is orthogonal.
(c) If P and Q are orthogonal, then PQ is orthogonal.
(a) We have (Pᵀ)ᵀ = P. Thus, P is orthogonal if and only if PPᵀ = I if and only if (Pᵀ)ᵀPᵀ = I if and only if Pᵀ is orthogonal.
(b) We have Pᵀ = P⁻¹, because P is orthogonal. Thus, by part (a), P⁻¹ is orthogonal.
(c) We have Pᵀ = P⁻¹ and Qᵀ = Q⁻¹. Thus, (PQ)(PQ)ᵀ = PQQᵀPᵀ = PQQ⁻¹P⁻¹ = I. Therefore, (PQ)ᵀ = (PQ)⁻¹, and so PQ is orthogonal.
7.36. Suppose P is an orthogonal matrix. Show that:
(a) ⟨Pu, Pv⟩ = ⟨u, v⟩ for any u, v ∈ V; (b) ‖Pu‖ = ‖u‖ for every u ∈ V.
Use PᵀP = I and ⟨u, v⟩ = uᵀv.
(a) ⟨Pu, Pv⟩ = (Pu)ᵀ(Pv) = uᵀPᵀPv = uᵀv = ⟨u, v⟩.
(b) We have
‖Pu‖² = ⟨Pu, Pu⟩ = uᵀPᵀPu = uᵀu = ⟨u, u⟩ = ‖u‖²
Taking the square root of both sides gives our result.
7.37. Prove Theorem 7.12: Suppose E = {eᵢ} and E′ = {eᵢ′} are orthonormal bases of V. Let P be the change-of-basis matrix from E to E′. Then P is orthogonal.
Suppose
eᵢ′ = bᵢ₁e₁ + bᵢ₂e₂ + ⋯ + bᵢₙeₙ, i = 1, ..., n   (1)
Using Problem 7.18(b) and the fact that E′ is orthonormal, we get
δᵢⱼ = ⟨eᵢ′, eⱼ′⟩ = bᵢ₁bⱼ₁ + bᵢ₂bⱼ₂ + ⋯ + bᵢₙbⱼₙ   (2)
Let B = [bᵢⱼ] be the matrix of the coefficients in (1). (Then P = Bᵀ.) Suppose BBᵀ = [cᵢⱼ]. Then
cᵢⱼ = bᵢ₁bⱼ₁ + bᵢ₂bⱼ₂ + ⋯ + bᵢₙbⱼₙ   (3)
By (2) and (3), we have cᵢⱼ = δᵢⱼ. Thus, BBᵀ = I. Accordingly, B is orthogonal, and hence, P = Bᵀ is orthogonal.
7.38. Prove Theorem 7.13: Let {e₁, ..., eₙ} be an orthonormal basis of an inner product space V. Let P = [aᵢⱼ] be an orthogonal matrix. Then the following n vectors form an orthonormal basis for V:
eᵢ′ = a₁ᵢe₁ + a₂ᵢe₂ + ⋯ + aₙᵢeₙ, i = 1, 2, ..., n
Because {eᵢ} is orthonormal, we get, by Problem 7.18(b),
⟨eᵢ′, eⱼ′⟩ = a₁ᵢa₁ⱼ + a₂ᵢa₂ⱼ + ⋯ + aₙᵢaₙⱼ = ⟨Cᵢ, Cⱼ⟩
where Cᵢ denotes the ith column of the orthogonal matrix P = [aᵢⱼ]. Because P is orthogonal, its columns form an orthonormal set. This implies ⟨eᵢ′, eⱼ′⟩ = ⟨Cᵢ, Cⱼ⟩ = δᵢⱼ. Thus, {eᵢ′} is an orthonormal basis.
Inner Products And Positive Definite Matrices
7.39. Which of the following symmetric matrices are positive definite?
(a) A = [3, 4; 4, 5], (b) B = [8, 3; 3, 2], (c) C = [2, 1; 1, −3], (d) D = [3, 5; 5, 9]
Use Theorem 7.14: a 2 × 2 real symmetric matrix is positive definite if and only if its diagonal entries are positive and its determinant is positive.
(a) No, because |A| = 15 − 16 = −1 is negative.
(b) Yes.
(c) No, because the diagonal entry −3 is negative.
(d) Yes.
7.40. Find the values of k that make each of the following matrices positive definite:
(a) A = [2, −4; −4, k], (b) B = [4, k; k, 9], (c) C = [k, 5; 5, −2]
(a) First, k must be positive. Also, |A| = 2k − 16 must be positive; that is, 2k − 16 > 0. Hence, k > 8.
(b) We need |B| = 36 − k² positive; that is, 36 − k² > 0. Hence, k² < 36 or −6 < k < 6.
(c) C can never be positive definite, because C has a negative diagonal entry −2.
7.41. Find the matrix A that represents the usual inner product on R² relative to each of the following bases of R²: (a) {v₁ = (1, 3), v₂ = (2, 5)}; (b) {w₁ = (1, 2), w₂ = (4, −2)}.
(a) Compute ⟨v₁, v₁⟩ = 1 + 9 = 10, ⟨v₁, v₂⟩ = 2 + 15 = 17, ⟨v₂, v₂⟩ = 4 + 25 = 29. Thus, A = [10, 17; 17, 29].
(b) Compute ⟨w₁, w₁⟩ = 1 + 4 = 5, ⟨w₁, w₂⟩ = 4 − 4 = 0, ⟨w₂, w₂⟩ = 16 + 4 = 20. Thus, A = [5, 0; 0, 20].
(Because the basis vectors are orthogonal, the matrix A is diagonal.)
7.42. Consider the vector space P₂(t) with inner product ⟨f, g⟩ = ∫₋₁¹ f(t)g(t) dt.
(a) Find ⟨f, g⟩, where f(t) = t + 2 and g(t) = t² − 3t + 4.
(b) Find the matrix A of the inner product with respect to the basis {1, t, t²} of V.
(c) Verify Theorem 7.16 by showing that ⟨f, g⟩ = [f]ᵀA[g] with respect to the basis {1, t, t²}.
(a) ⟨f, g⟩ = ∫₋₁¹ (t + 2)(t² − 3t + 4) dt = ∫₋₁¹ (t³ − t² − 2t + 8) dt = [t⁴/4 − t³/3 − t² + 8t]₋₁¹ = 46/3
(b) Here we use the fact that if r + s = n,
⟨tʳ, tˢ⟩ = ∫₋₁¹ tⁿ dt = [tⁿ⁺¹/(n + 1)]₋₁¹ = 2/(n + 1) if n is even, 0 if n is odd.
Then ⟨1, 1⟩ = 2, ⟨1, t⟩ = 0, ⟨1, t²⟩ = 2/3, ⟨t, t⟩ = 2/3, ⟨t, t²⟩ = 0, ⟨t², t²⟩ = 2/5. Thus,
A = [2, 0, 2/3; 0, 2/3, 0; 2/3, 0, 2/5]
(c) We have [f]ᵀ = (2, 1, 0) and [g]ᵀ = (4, −3, 1) relative to the given basis. Then
[f]ᵀA[g] = (2, 1, 0) [2, 0, 2/3; 0, 2/3, 0; 2/3, 0, 2/5] (4, −3, 1)ᵀ = (4, 2/3, 4/3)(4, −3, 1)ᵀ = 46/3 = ⟨f, g⟩
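A SymPy sketch (not from the text) that builds this Gram matrix and repeats the check in (c):

```python
from sympy import symbols, integrate, Matrix

t = symbols('t')
ip = lambda p, q: integrate(p * q, (t, -1, 1))   # <f, g> on [-1, 1]

basis = [1, t, t**2]
A = Matrix(3, 3, lambda i, j: ip(basis[i], basis[j]))
print(A)                                  # Matrix([[2, 0, 2/3], [0, 2/3, 0], [2/3, 0, 2/5]])

f_coords = Matrix([2, 1, 0])              # f(t) = t + 2 in the basis {1, t, t^2}
g_coords = Matrix([4, -3, 1])             # g(t) = t^2 - 3t + 4
print((f_coords.T * A * g_coords)[0])     # 46/3
```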
7.43. Prove Theorem 7.14: A = [a, b; b, d] is positive definite if and only if a and d are positive and |A| = ad − b² is positive.
Let u = [x, y]ᵀ. Then
f(u) = uᵀAu = [x, y] [a, b; b, d] [x; y] = ax² + 2bxy + dy²
Suppose f(u) > 0 for every u ≠ 0. Then f(1, 0) = a > 0 and f(0, 1) = d > 0. Also, we have
f(b, −a) = a(ad − b²) > 0. Because a > 0, we get ad − b² > 0.
Conversely, suppose a > 0, d > 0, ad − b² > 0. Completing the square gives us
f(u) = a(x² + (2b/a)xy + (b²/a²)y²) + dy² − (b²/a)y² = a(x + by/a)² + ((ad − b²)/a)y²
Accordingly, f(u) > 0 for every u ≠ 0.
7.44. Prove Theorem 7.15: Let A be a real positive definite matrix. Then the function ⟨u, v⟩ = uᵀAv is an inner product on Rⁿ.
For any vectors u₁, u₂, and v,
⟨u₁ + u₂, v⟩ = (u₁ + u₂)ᵀAv = (u₁ᵀ + u₂ᵀ)Av = u₁ᵀAv + u₂ᵀAv = ⟨u₁, v⟩ + ⟨u₂, v⟩
and, for any scalar k and vectors u, v,
⟨ku, v⟩ = (ku)ᵀAv = kuᵀAv = k⟨u, v⟩
Thus [I₁] is satisfied.
Because uᵀAv is a scalar, (uᵀAv)ᵀ = uᵀAv. Also, Aᵀ = A because A is symmetric. Therefore,
⟨u, v⟩ = uᵀAv = (uᵀAv)ᵀ = vᵀAᵀ(uᵀ)ᵀ = vᵀAu = ⟨v, u⟩
Thus, [I₂] is satisfied.
Last, because A is positive definite, XᵀAX > 0 for any nonzero X ∈ Rⁿ. Thus, for any nonzero vector v, ⟨v, v⟩ = vᵀAv > 0. Also, ⟨0, 0⟩ = 0ᵀA0 = 0. Thus, [I₃] is satisfied. Accordingly, the function ⟨u, v⟩ = uᵀAv is an inner product.
7.45. Prove Theorem 7.16: Let A be the matrix representation of an inner product relative to a basis S of V. Then, for any vectors u, v ∈ V, we have
⟨u, v⟩ = [u]ᵀA[v]
Suppose S = {w₁, w₂, ..., wₙ} and A = [kᵢⱼ]. Hence, kᵢⱼ = ⟨wᵢ, wⱼ⟩. Suppose
u = a₁w₁ + a₂w₂ + ⋯ + aₙwₙ  and  v = b₁w₁ + b₂w₂ + ⋯ + bₙwₙ
Then
⟨u, v⟩ = Σᵢ Σⱼ aᵢbⱼ⟨wᵢ, wⱼ⟩   (1)
On the other hand,
[u]ᵀA[v] = (a₁, a₂, ..., aₙ) [k₁₁, k₁₂, ..., k₁ₙ; k₂₁, k₂₂, ..., k₂ₙ; ...; kₙ₁, kₙ₂, ..., kₙₙ] (b₁, b₂, ..., bₙ)ᵀ
= (Σᵢ aᵢkᵢ₁, Σᵢ aᵢkᵢ₂, ..., Σᵢ aᵢkᵢₙ)(b₁, b₂, ..., bₙ)ᵀ = Σⱼ Σᵢ aᵢbⱼkᵢⱼ   (2)
Equations (1) and (2) give us our result.
7.46. Prove Theorem 7.17: Let A be the matrix representation of any inner product on V. Then A is a positive definite matrix.
Because ⟨wᵢ, wⱼ⟩ = ⟨wⱼ, wᵢ⟩ for any basis vectors wᵢ and wⱼ, the matrix A is symmetric. Let X be any nonzero vector in Rⁿ. Then [u] = X for some nonzero vector u ∈ V. Theorem 7.16 tells us that
XᵀAX = [u]ᵀA[u] = ⟨u, u⟩ > 0. Thus, A is positive definite.
Complex Inner Product Spaces
7.47. Let V be a complex inner product space. Verify the relation
⟨u, av₁ + bv₂⟩ = ā⟨u, v₁⟩ + b̄⟨u, v₂⟩
Using [I₂*], [I₁*], and then [I₂*] again, we find
⟨u, av₁ + bv₂⟩ = the conjugate of ⟨av₁ + bv₂, u⟩ = the conjugate of (a⟨v₁, u⟩ + b⟨v₂, u⟩) = ā⟨u, v₁⟩ + b̄⟨u, v₂⟩
7.48. Suppose ⟨u, v⟩ = 3 + 2i in a complex inner product space V. Find:
(a) ⟨(2 − 4i)u, v⟩, (b) ⟨u, (4 + 3i)v⟩, (c) ⟨(3 − 6i)u, (5 − 2i)v⟩.
(a) ⟨(2 − 4i)u, v⟩ = (2 − 4i)⟨u, v⟩ = (2 − 4i)(3 + 2i) = 14 − 8i
(b) ⟨u, (4 + 3i)v⟩ = (4 − 3i)⟨u, v⟩ = (4 − 3i)(3 + 2i) = 18 − i  (the scalar in the second position is conjugated)
(c) ⟨(3 − 6i)u, (5 − 2i)v⟩ = (3 − 6i)(5 + 2i)⟨u, v⟩ = (3 − 6i)(5 + 2i)(3 + 2i) = 129 − 18i
7.49. Find the Fourier coefficient (component) c and the projection cw of v = (3 + 4i, 2 − 3i) along w = (5 + i, 2i) in C².
Recall that c = ⟨v, w⟩/⟨w, w⟩. Compute (conjugating the components of w):
⟨v, w⟩ = (3 + 4i)(5 − i) + (2 − 3i)(−2i) = 19 + 17i − 6 − 4i = 13 + 13i
⟨w, w⟩ = 25 + 1 + 4 = 30
Thus, c = (13 + 13i)/30 = 13/30 + (13/30)i. Accordingly, proj(v, w) = cw = (26/15 + (39/15)i, −13/15 + (13/15)i).
7.50. Prove Theorem 7.18 (Cauchy–Schwarz): Let V be a complex inner product space. Then |⟨u, v⟩| ≤ ‖u‖‖v‖.
If v = 0, the inequality reduces to 0 ≤ 0 and hence is valid. Now suppose v ≠ 0. Using z̄z = |z|² (for any complex number z) and ⟨v, u⟩ = the conjugate of ⟨u, v⟩, we expand ‖u − ⟨u, v⟩tv‖² ≥ 0, where t is any real value:
0 ≤ ‖u − ⟨u, v⟩tv‖² = ⟨u − ⟨u, v⟩tv, u − ⟨u, v⟩tv⟩
= ⟨u, u⟩ − t⟨v, u⟩⟨u, v⟩ − t⟨u, v⟩⟨v, u⟩ + t²⟨u, v⟩⟨v, u⟩⟨v, v⟩
= ‖u‖² − 2t|⟨u, v⟩|² + |⟨u, v⟩|²t²‖v‖²
Set t = 1/‖v‖² to find 0 ≤ ‖u‖² − |⟨u, v⟩|²/‖v‖², from which |⟨u, v⟩|² ≤ ‖u‖²‖v‖². Taking the square root of both sides, we obtain the required inequality.
7.51. Find an orthogonal basis for u⊥ in C³, where u = (1, i, 1 + i).
Here u⊥ consists of all vectors w = (x, y, z) such that
⟨w, u⟩ = x − iy + (1 − i)z = 0
Find one solution, say w₁ = (0, 1 − i, i). Then find a solution of the system
x − iy + (1 − i)z = 0, (1 + i)y − iz = 0
Here z is a free variable. Set z = 1 to obtain y = i/(1 + i) = (1 + i)/2 and x = (3i − 3)/2. Multiplying by 2 yields the solution w₂ = (3i − 3, 1 + i, 2). The vectors w₁ and w₂ form an orthogonal basis for u⊥.
7.52. Find an orthonormal basis of the subspace W of C³ spanned by
v₁ = (1, i, 0) and v₂ = (1, 2, 1 − i).
Apply the Gram–Schmidt algorithm. Set w₁ = v₁ = (1, i, 0). Compute
v₂ − (⟨v₂, w₁⟩/⟨w₁, w₁⟩)w₁ = (1, 2, 1 − i) − ((1 − 2i)/2)(1, i, 0) = (1/2 + i, 1 − i/2, 1 − i)
Multiply by 2 to clear fractions, obtaining w₂ = (1 + 2i, 2 − i, 2 − 2i). Next find ‖w₁‖ = √2 and then
‖w₂‖ = √18. Normalizing {w₁, w₂}, we obtain the following orthonormal basis of W:
u₁ = (1/√2, i/√2, 0), u₂ = ((1 + 2i)/√18, (2 − i)/√18, (2 − 2i)/√18)
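A sketch (not from the text) of the same step with the Hermitian inner product ⟨u, v⟩ = Σ uᵢ·conj(vᵢ), using NumPy:

```python
import numpy as np

def hip(u, v):
    return np.vdot(v, u)      # np.vdot conjugates its first argument, giving sum(u * conj(v))

v1 = np.array([1, 1j, 0], dtype=complex)
v2 = np.array([1, 2, 1 - 1j], dtype=complex)

w1 = v1
w2 = v2 - (hip(v2, w1) / hip(w1, w1)) * w1
print(w2)                                           # [0.5+1.j, 1.-0.5j, 1.-1.j]  (half of (1+2i, 2-i, 2-2i))
u1, u2 = w1 / np.linalg.norm(w1), w2 / np.linalg.norm(w2)
print(abs(hip(u1, u2)) < 1e-12, np.isclose(np.linalg.norm(u2), 1))   # True True
```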
 
7.53. Find the matrix P that represents the usual inner product on C³ relative to the basis {1, i, 1 − i}.
Compute the following six inner products:
⟨1, 1⟩ = 1,  ⟨1, i⟩ = ī = −i,  ⟨1, 1 − i⟩ = conj(1 − i) = 1 + i
⟨i, i⟩ = i·ī = 1,  ⟨i, 1 − i⟩ = i(1 + i) = −1 + i,  ⟨1 − i, 1 − i⟩ = 2
Then, using ⟨v, u⟩ = the conjugate of ⟨u, v⟩, we obtain
P = [1, −i, 1 + i; i, 1, −1 + i; 1 − i, −1 − i, 2]
(As expected, P is Hermitian; that is, Pᴴ = P.)
Normed Vector Spaces
7.54. Consider vectors u = (1, 3, −6, 4) and v = (3, −5, 1, −2) in R⁴. Find
(a) ‖u‖∞ and ‖v‖∞, (b) ‖u‖₁ and ‖v‖₁, (c) ‖u‖₂ and ‖v‖₂, (d) d∞(u, v), d₁(u, v), d₂(u, v).
(a) The infinity norm chooses the maximum of the absolute values of the components. Hence,
‖u‖∞ = 6 and ‖v‖∞ = 5
(b) The one-norm adds the absolute values of the components. Thus,
‖u‖₁ = 1 + 3 + 6 + 4 = 14 and ‖v‖₁ = 3 + 5 + 1 + 2 = 11
(c) The two-norm is equal to the square root of the sum of the squares of the components (i.e., the norm induced by the usual inner product on R⁴). Thus,
‖u‖₂ = √(1 + 9 + 36 + 16) = √62 and ‖v‖₂ = √(9 + 25 + 1 + 4) = √39
(d) First find u − v = (−2, 8, −7, 6). Then
d∞(u, v) = ‖u − v‖∞ = 8
d₁(u, v) = ‖u − v‖₁ = 2 + 8 + 7 + 6 = 23
d₂(u, v) = ‖u − v‖₂ = √(4 + 64 + 49 + 36) = √153
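The same norms via numpy.linalg.norm (a sketch, not from the text):

```python
import numpy as np

u = np.array([1, 3, -6, 4])
v = np.array([3, -5, 1, -2])
for x in (u, v, u - v):
    print(np.linalg.norm(x, np.inf), np.linalg.norm(x, 1), np.linalg.norm(x, 2))
# u: 6, 14, sqrt(62);  v: 5, 11, sqrt(39);  u - v: 8, 23, sqrt(153)
```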
7.55. Consider the function f(t) = t² − 4t in C[0, 3].
(a) Find ‖f‖∞, (b) Plot f(t) in the plane R², (c) Find ‖f‖₁, (d) Find ‖f‖₂.
(a) We seek ‖f‖∞ = max(|f(t)|). Because f(t) is differentiable on [0, 3], |f(t)| has a maximum at a critical point of f(t) (i.e., where the derivative f′(t) = 0), or at an endpoint of [0, 3]. Because f′(t) = 2t − 4, we set 2t − 4 = 0 and obtain t = 2 as a critical point. Compute
f(2) = 4 − 8 = −4, f(0) = 0 − 0 = 0, f(3) = 9 − 12 = −3
Thus, ‖f‖∞ = |f(2)| = |−4| = 4.
(b) Compute f(t) for various values of t in [0, 3], for example:
t: 0, 1, 2, 3; f(t): 0, −3, −4, −3
Plot the points in R² and then draw a continuous curve through the points, as shown in Fig. 7-8.
(c) We seek ‖f‖₁ = ∫₀³ |f(t)| dt. As indicated in Fig. 7-8, f(t) is negative in [0, 3]; hence,
|f(t)| = −(t² − 4t) = 4t − t². Thus,
‖f‖₁ = ∫₀³ (4t − t²) dt = [2t² − t³/3]₀³ = 18 − 9 = 9
(d) ‖f‖₂² = ∫₀³ f(t)² dt = ∫₀³ (t⁴ − 8t³ + 16t²) dt = [t⁵/5 − 2t⁴ + 16t³/3]₀³ = 153/5.
Thus, ‖f‖₂ = √(153/5).
7.56. Prove Theorem 7.24: Let V be a normed vector space. Then the function d(u, v) = ‖u − v‖ satisfies the following three axioms of a metric space:
[M₁] d(u, v) ≥ 0; and d(u, v) = 0 iff u = v.
[M₂] d(u, v) = d(v, u).
[M₃] d(u, v) ≤ d(u, w) + d(w, v).
If u ≠ v, then u − v ≠ 0, and hence, d(u, v) = ‖u − v‖ > 0. Also, d(u, u) = ‖u − u‖ = ‖0‖ = 0. Thus, [M₁] is satisfied. We also have
d(u, v) = ‖u − v‖ = ‖−1(v − u)‖ = |−1|‖v − u‖ = ‖v − u‖ = d(v, u)
and
d(u, v) = ‖u − v‖ = ‖(u − w) + (w − v)‖ ≤ ‖u − w‖ + ‖w − v‖ = d(u, w) + d(w, v)
Thus, [M₂] and [M₃] are satisfied.
SUPPLEMENTARY PROBLEMS
Inner Products
7.57. Verify that the following is an inner product on R², where u = (x₁, x₂) and v = (y₁, y₂):
f(u, v) = x₁y₁ − 2x₁y₂ − 2x₂y₁ + 5x₂y₂
7.58. Find the values of k so that the following is an inner product on R², where u = (x₁, x₂) and v = (y₁, y₂):
f(u, v) = x₁y₁ − 3x₁y₂ − 3x₂y₁ + kx₂y₂
Figure 7-8
7.59. Consider the vectors u = (1, −3) and v = (2, 5) in R². Find
(a) ⟨u, v⟩ with respect to the usual inner product in R².
(b) ⟨u, v⟩ with respect to the inner product in R² in Problem 7.57.
(c) ‖v‖ using the usual inner product in R².
(d) ‖v‖ using the inner product in R² in Problem 7.57.
7.60. Show that each of the following is not an inner product on R³, where u = (x₁, x₂, x₃) and v = (y₁, y₂, y₃):
(a) ⟨u, v⟩ = x₁y₁ + x₂y₂; (b) ⟨u, v⟩ = x₁y₂x₃ + y₁x₂y₃.
7.61. Let V be the vector space of m × n matrices over R. Show that ⟨A, B⟩ = tr(BᵀA) defines an inner product in V.
7.62. Suppose |⟨u, v⟩| = ‖u‖‖v‖. (That is, the Cauchy–Schwarz inequality reduces to an equality.) Show that u and v are linearly dependent.
7.63. Suppose f(u, v) and g(u, v) are inner products on a vector space V over R. Prove:
(a) The sum f + g is an inner product on V, where (f + g)(u, v) = f(u, v) + g(u, v).
(b) The scalar product kf, for k > 0, is an inner product on V, where (kf)(u, v) = kf(u, v).
Orthogonality, Orthogonal Complements, Orthogonal Sets
7.64. Let V be the vector space of polynomials over R of degree ≤ 2 with inner product defined by ⟨f, g⟩ = ∫₀¹ f(t)g(t) dt. Find a basis of the subspace W orthogonal to h(t) = 2t + 1.
7.65. Find a basis of the subspace W of R⁴ orthogonal to u₁ = (1, 2, 3, 4) and u₂ = (3, 5, 7, 8).
7.66. Find a basis for the subspace W of R⁵ orthogonal to the vectors u₁ = (1, 1, 3, 4, 1) and u₂ = (1, 2, 1, 2, 1).
7.67. Let w = (1, −2, −1, 3) be a vector in R⁴. Find
(a) an orthogonal basis for w⊥; (b) an orthonormal basis for w⊥.
7.68. Let W be the subspace of R⁴ orthogonal to u₁ = (1, 1, 2, 2) and u₂ = (0, 1, 2, −1). Find
(a) an orthogonal basis for W; (b) an orthonormal basis for W. (Compare with Problem 7.65.)
7.69. Let S consist of the following vectors in R⁴:
u₁ = (1, 1, 1, 1), u₂ = (1, 1, −1, −1), u₃ = (1, −1, 1, −1), u₄ = (1, −1, −1, 1)
(a) Show that S is orthogonal and a basis of R⁴.
(b) Write v = (1, 3, −5, 6) as a linear combination of u₁, u₂, u₃, u₄.
(c) Find the coordinates of an arbitrary vector v = (a, b, c, d) in R⁴ relative to the basis S.
(d) Normalize S to obtain an orthonormal basis of R⁴.
7.70. Let M = M₂,₂ with inner product ⟨A, B⟩ = tr(BᵀA). Show that the following is an orthonormal basis for M:
[1, 0; 0, 0], [0, 1; 0, 0], [0, 0; 1, 0], [0, 0; 0, 1]
7.71. Let M = M₂,₂ with inner product ⟨A, B⟩ = tr(BᵀA). Find an orthogonal basis for the orthogonal complement of (a) diagonal matrices, (b) symmetric matrices.
7.72. Suppose {u₁, u₂, ..., uᵣ} is an orthogonal set of vectors. Show that {k₁u₁, k₂u₂, ..., kᵣuᵣ} is an orthogonal set for any scalars k₁, k₂, ..., kᵣ.
7.73. Let U and W be subspaces of a finite-dimensional inner product space V. Show that
(a) (U + W)⊥ = U⊥ ∩ W⊥; (b) (U ∩ W)⊥ = U⊥ + W⊥.
Projections, Gram–Schmidt Algorithm, Applications
7.74. Find the Fourier coefficient c and projection cw of v along w, where:
(a) v = (2, 3, −5) and w = (1, −5, 2) in R³;
(b) v = (1, 3, 1, 2) and w = (1, −2, 7, 4) in R⁴;
(c) v = t² and w = t + 3 in P(t), with inner product ⟨f, g⟩ = ∫₀¹ f(t)g(t) dt;
(d) v = [1, 2; 3, 4] and w = [1, 1; 5, 5] in M = M₂,₂, with inner product ⟨A, B⟩ = tr(BᵀA).
7.75. Let U be the subspace of R⁴ spanned by
v₁ = (1, 1, 1, 1), v₂ = (1, −1, 2, 2), v₃ = (1, 2, −3, −4)
(a) Apply the Gram–Schmidt algorithm to find an orthogonal and an orthonormal basis for U.
(b) Find the projection of v = (1, 2, −3, 4) onto U.
7.76. Suppose v = (1, 2, 3, 4, 6). Find the projection of v onto W, or, in other words, find w ∈ W that minimizes ‖v − w‖, where W is the subspace of R⁵ spanned by
(a) u₁ = (1, 2, 1, 2, 1) and u₂ = (1, −1, 2, −1, 1), (b) v₁ = (1, 2, 1, 2, 1) and v₂ = (1, 0, 1, 5, −1).
7.77. Consider the subspace W = P₂(t) of P(t) with inner product ⟨f, g⟩ = ∫₀¹ f(t)g(t) dt. Find the projection of f(t) = t³ onto W. (Hint: Use the orthogonal polynomials 1, 2t − 1, 6t² − 6t + 1 obtained in Problem 7.22.)
7.78. Consider P(t) with inner product ⟨f, g⟩ = ∫₋₁¹ f(t)g(t) dt and the subspace W = P₃(t):
(a) Find an orthogonal basis for W by applying the Gram–Schmidt algorithm to {1, t, t², t³}.
(b) Find the projection of f(t) = t⁵ onto W.
Orthogonal Matrices
7.79. Find the number of, and exhibit all, 2 × 2 orthogonal matrices of the form [1/3, x; y, z].
7.80. Find a 3 × 3 orthogonal matrix P whose first two rows are multiples of u = (1, 1, 1) and v = (1, 2, 3), respectively.
7.81. Find a symmetric orthogonal matrix P whose first row is (1/3, 2/3, 2/3). (Compare with Problem 7.32.)
7.82. Real matrices A and B are said to be orthogonally equivalent if there exists an orthogonal matrix P such that B = PᵀAP. Show that this relation is an equivalence relation.
Positive Definite Matrices and Inner Products
7.83. Find the matrix A that represents the usual inner product on R² relative to each of the following bases:
(a) {v₁ = (1, 4), v₂ = (2, −3)}, (b) {w₁ = (1, −3), w₂ = (6, 2)}.
7.84. Consider the following inner product on R²:
f(u, v) = x₁y₁ − 2x₁y₂ − 2x₂y₁ + 5x₂y₂, where u = (x₁, x₂), v = (y₁, y₂)
Find the matrix B that represents this inner product on R² relative to each basis in Problem 7.83.
7.85. Find the matrix C that represents the usual inner product on R³ relative to the basis S of R³ consisting of the vectors u₁ = (1, 1, 1), u₂ = (1, 2, 1), u₃ = (1, −1, 3).
7.86. Let V = P₂(t) with inner product ⟨f, g⟩ = ∫₀¹ f(t)g(t) dt.
(a) Find ⟨f, g⟩, where f(t) = t + 2 and g(t) = t² − 3t + 4.
(b) Find the matrix A of the inner product with respect to the basis {1, t, t²} of V.
(c) Verify Theorem 7.16 that ⟨f, g⟩ = [f]ᵀA[g] with respect to the basis {1, t, t²}.
7.87. Determine which of the following matrices are positive definite:
(a) [1, 3; 3, 5], (b) [3, 4; 4, 7], (c) [4, 2; 2, 1], (d) [6, 7; 7, 9].
7.88. Suppose A and B are positive definite matrices. Show that: (a) A + B is positive definite and (b) kA is positive definite for k > 0.
7.89. Suppose B is a real nonsingular matrix. Show that: (a) BᵀB is symmetric and (b) BᵀB is positive definite.
Complex Inner Product Spaces
7.90. Verify that
⟨a₁u₁ + a₂u₂, b₁v₁ + b₂v₂⟩ = a₁b̄₁⟨u₁, v₁⟩ + a₁b̄₂⟨u₁, v₂⟩ + a₂b̄₁⟨u₂, v₁⟩ + a₂b̄₂⟨u₂, v₂⟩
More generally, prove that ⟨Σᵢ aᵢuᵢ, Σⱼ bⱼvⱼ⟩ = Σᵢ,ⱼ aᵢb̄ⱼ⟨uᵢ, vⱼ⟩.
7.91. Consider u = (1 + i, 3, 4 − i) and v = (3 − 4i, 1 + i, 2i) in C³. Find
(a) ⟨u, v⟩, (b) ⟨v, u⟩, (c) ‖u‖, (d) ‖v‖, (e) d(u, v).
7.92. Find the Fourier coefficient c and the projection cw of
(a) u = (3 + i, 5 − 2i) along w = (5 + i, 1 + i) in C²,
(b) u = (1 − i, 3i, 1 + i) along w = (1, 2 − i, 3 + 2i) in C³.
7.93. Let u = (z₁, z₂) and v = (w₁, w₂) belong to C². Verify that the following is an inner product on C²:
f(u, v) = z₁w̄₁ + (1 + i)z₁w̄₂ + (1 − i)z₂w̄₁ + 3z₂w̄₂
7.94. Find an orthogonal basis and an orthonormal basis for the subspace W of C³ spanned by u₁ = (1, i, 1) and u₂ = (1 + i, 0, 2).
7.95. Let u = (z₁, z₂) and v = (w₁, w₂) belong to C². For what values of a, b, c, d ∈ C is the following an inner product on C²?
f(u, v) = az₁w̄₁ + bz₁w̄₂ + cz₂w̄₁ + dz₂w̄₂
7.96. Prove the following form for an inner product in a complex space V:
⟨u, v⟩ = ¼‖u + v‖² − ¼‖u − v‖² + (i/4)‖u + iv‖² − (i/4)‖u − iv‖²
[Compare with Problem 7.7(b).]
7.97. Let V be a real inner product space. Show that
(i) ‖u‖ = ‖v‖ if and only if ⟨u + v, u − v⟩ = 0;
(ii) ‖u + v‖² = ‖u‖² + ‖v‖² if and only if ⟨u, v⟩ = 0.
Show by counterexamples that the above statements are not true for, say, C².
7.98. Find the matrix P that represents the usual inner product on C³ relative to the basis {1, 1 + i, 1 − 2i}.
7.99. A complex matrix A is unitary if it is invertible and A⁻¹ = Aᴴ. Alternatively, A is unitary if its rows (columns) form an orthonormal set of vectors (relative to the usual inner product of Cⁿ). Find a unitary matrix whose first row is: (a) a multiple of (1, 1 − i); (b) a multiple of (1/2, (1/2)i, 1/2 − (1/2)i).
Normed Vector Spaces
7.100. Consider vectors u = (1, −3, 4, 1, −2) and v = (3, 1, −2, −3, 1) in R⁵. Find
(a) ‖u‖∞ and ‖v‖∞, (b) ‖u‖₁ and ‖v‖₁, (c) ‖u‖₂ and ‖v‖₂, (d) d∞(u, v), d₁(u, v), d₂(u, v).
7.101. Repeat Problem 7.100 for u = (1 + i, 2 − 4i) and v = (1 − i, 2 + 3i) in C².
7.102. Consider the functions f(t) = 5t − t² and g(t) = 3t − t² in C[0, 4]. Find
(a) d∞(f, g), (b) d₁(f, g), (c) d₂(f, g).
7.103. Prove (a) ‖·‖₁ is a norm on Rⁿ. (b) ‖·‖∞ is a norm on Rⁿ.
7.104. Prove (a) ‖·‖₁ is a norm on C[a, b]. (b) ‖·‖∞ is a norm on C[a, b].
ANSWERS TO SUPPLEMENTARY PROBLEMS
Notation: M = [R₁; R₂; ...] denotes a matrix M with rows R₁, R₂, .... Also, a basis need not be unique.
7.58. k > 9
7.59. (a) −13, (b) −71, (c) √29, (d) √89
7.60. Let u = (0, 0, 1); then ⟨u, u⟩ = 0 in both cases
7.64. {7t² − 5t, 12t² − 5}
7.65. {(1, −2, 1, 0), (4, −4, 0, 1)}
7.66. (−1, 0, 0, 0, 1), (−6, 2, 0, 1, 0), (−5, 2, 1, 0, 0)
7.67. (a) u₁ = (0, 0, 3, 1), u₂ = (0, 5, −1, 3), u₃ = (14, 2, 1, −3); (b) u₁/√10, u₂/√35, u₃/√210
7.68. (a) (0, 2, −1, 0), (−15, 1, 2, 5); (b) (0, 2, −1, 0)/√5, (−15, 1, 2, 5)/√255
7.69. (b) v = ¼(5u₁ + 3u₂ − 13u₃ + 9u₄),
(c) [v] = ¼[a + b + c + d, a + b − c − d, a − b + c − d, a − b − c + d]
7.71. (a) [0, 1; 0, 0], [0, 0; 1, 0]; (b) [0, 1; −1, 0]
7.74. (a) c = −23/30, (b) c = 1/7, (c) c = 15/148, (d) c = 19/26
7.75. (a) w₁ = (1, 1, 1, 1), w₂ = (0, −2, 1, 1), w₃ = (12, −4, −1, −7); (b) proj(v, U) = (1/5)(−1, 12, 3, 6)
7.76. (a) proj(v, W) = (1/8)(23, 25, 30, 25, 23); (b) First find an orthogonal basis for W, say w₁ = (1, 2, 1, 2, 1) and w₂ = (0, −2, 0, 3, −2). Then proj(v, W) = (1/17)(34, 76, 34, 56, 42)
7.77. proj(f, W) = (3/2)t² − (3/5)t + 1/20
7.78. (a) {1, t, 3t² − 1, 5t³ − 3t}; (b) proj(f, W) = (10/9)t³ − (5/21)t
7.79. Four: [a, b; b, −a], [a, b; −b, a], [a, −b; −b, −a], [a, −b; b, a], where a = 1/3 and b = (1/3)√8
7.80. P = [1/a, 1/a, 1/a; 1/b, 2/b, 3/b; 5/c, 2/c, 3/c], where a = √3, b = √14, c = √38
7.81. (1/3)[1, 2, 2; 2, −2, 1; 2, 1, −2]
7.83. (a) [17, −10; −10, 13], (b) [10, 0; 0, 40]
7.84. (a) [65, −68; −68, 73], (b) [58, 8; 8, 8]
7.85. [3, 4, 3; 4, 6, 2; 3, 2, 11]
7.86. (a) 83/12, (b) [1, a, b; a, b, c; b, c, d], where a = 1/2, b = 1/3, c = 1/4, d = 1/5
7.87. (a) No, (b) Yes, (c) No, (d) Yes
7.91. (a) −4i, (b) 4i, (c) √28, (d) √31, (e) √59
7.92. (a) c = (1/28)(19 − 5i), (b) c = (1/19)(3 + 6i)
7.94. {v₁ = (1, i, 1)/√3, v₂ = (2i, 1 − 3i, 3 − i)/√24}
7.95. a and d real and positive, c = b̄, and ad − bc positive.
7.97. u = (1, 2), v = (i, 2i)
7.98. P = [1, 1 − i, 1 + 2i; 1 + i, 2, −1 + 3i; 1 − 2i, −1 − 3i, 5]
7.99. (a) (1/√3)[1, 1 − i; 1 + i, −1], (b) [a, ai, a − ai; bi, b, 0; a, ai, −a + ai], where a = 1/2 and b = 1/√2
7.100. (a) 4 and 3, (b) 11 and 10, (c) √31 and √24, (d) 6, 19, 9
7.101. (a) √20 and √13, (b) √2 + √20 and √2 + √13, (c) √22 and √15, (d) 7, 9, √53
7.102. (a) 8, (b) 16, (c) 16/√3
Determinants
8.1 Introduction
Each n-square matrix A = [aᵢⱼ] is assigned a special scalar called the determinant of A, denoted by det(A) or |A| or
| a₁₁ a₁₂ ... a₁ₙ |
| a₂₁ a₂₂ ... a₂ₙ |
| ............... |
| aₙ₁ aₙ₂ ... aₙₙ |
We emphasize that an n × n array of scalars enclosed by straight lines, called a determinant of order n, is not a matrix but denotes the determinant of the enclosed array of scalars (i.e., the enclosed matrix).
The determinant function was first discovered during the investigation of systems of linear equations.
We shall see that the determinant is an indispensable tool in investigating and obtaining properties of
square matrices.
The definition of the determinant and most of its properties also apply in the case where the entries of a
matrix come from a commutative ring.
We begin with a special case of determinants of orders 1, 2, and 3. Then we define a determinant of
arbitrary order. This general definition is preceded by a discussion of permutations, which is necessary for
our general definition of the determinant.
8.2 Determinants of Orders 1 and 2
Determinants of orders 1 and 2 are defined as follows:
|a₁₁| = a₁₁  and  |a₁₁, a₁₂; a₂₁, a₂₂| = a₁₁a₂₂ − a₁₂a₂₁
Thus, the determinant of a 1 × 1 matrix A = [a₁₁] is the scalar a₁₁; that is, det(A) = |a₁₁| = a₁₁. The determinant of order two may easily be remembered by using a diagram in which a plus-labeled arrow runs along the diagonal a₁₁, a₂₂ and a minus-labeled arrow runs along a₁₂, a₂₁.
That is, the determinant is equal to the product of the elements along the plus-labeled arrow minus the product of the elements along the minus-labeled arrow. (There is an analogous diagram for determinants of order 3, but not for higher-order determinants.)
EXAMPLE 8.1
(a) Because the determinant of order 1 is the scalar itself, we have:
det(27) = 27, det(−7) = −7, det(t − 3) = t − 3
(b) |5, 3; 4, 6| = 5(6) − 3(4) = 30 − 12 = 18;  |3, −2; 5, 7| = 21 + 10 = 31
Application to Linear Equations
Consider two linear equations in two unknowns, say
a₁x + b₁y = c₁
a₂x + b₂y = c₂
Let D = a₁b₂ − a₂b₁, the determinant of the matrix of coefficients. Then the system has a unique solution if and only if D ≠ 0. In such a case, the unique solution may be expressed completely in terms of determinants as follows:
x = Nₓ/D = (b₂c₁ − b₁c₂)/(a₁b₂ − a₂b₁) = |c₁, b₁; c₂, b₂| / |a₁, b₁; a₂, b₂|
y = Nᵧ/D = (a₁c₂ − a₂c₁)/(a₁b₂ − a₂b₁) = |a₁, c₁; a₂, c₂| / |a₁, b₁; a₂, b₂|
Here D appears in the denominator of both quotients. The numerators Nₓ and Nᵧ of the quotients for x and y, respectively, can be obtained by substituting the column of constant terms in place of the column of coefficients of the given unknown in the matrix of coefficients. On the other hand, if D = 0, then the system may have no solution or more than one solution.
EXAMPLE 8.2  Solve by determinants the system
4x − 3y = 15
2x + 5y = 1
First find the determinant D of the matrix of coefficients:
D = |4, −3; 2, 5| = 4(5) − (−3)(2) = 20 + 6 = 26
Because D ≠ 0, the system has a unique solution. To obtain the numerators Nₓ and Nᵧ, simply replace, in the matrix of coefficients, the coefficients of x and y, respectively, by the constant terms, and then take their determinants:
Nₓ = |15, −3; 1, 5| = 75 + 3 = 78,  Nᵧ = |4, 15; 2, 1| = 4 − 30 = −26
Then the unique solution of the system is
x = Nₓ/D = 78/26 = 3,  y = Nᵧ/D = −26/26 = −1
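The same 2 × 2 Cramer's-rule computation, done numerically (a sketch, not from the text):

```python
import numpy as np

A = np.array([[4.0, -3.0],
              [2.0,  5.0]])
b = np.array([15.0, 1.0])

D = np.linalg.det(A)
Ax, Ay = A.copy(), A.copy()
Ax[:, 0] = b                     # replace the x-column by the constants
Ay[:, 1] = b                     # replace the y-column by the constants
x, y = np.linalg.det(Ax) / D, np.linalg.det(Ay) / D
print(x, y)                      # 3.0, -1.0 (up to floating-point error)
```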
8.3 Determinants of Order 3
Consider an arbitrary 3 × 3 matrix A = [aᵢⱼ]. The determinant of A is defined as follows:
det(A) = |a₁₁, a₁₂, a₁₃; a₂₁, a₂₂, a₂₃; a₃₁, a₃₂, a₃₃|
= a₁₁a₂₂a₃₃ + a₁₂a₂₃a₃₁ + a₁₃a₂₁a₃₂ − a₁₃a₂₂a₃₁ − a₁₂a₂₁a₃₃ − a₁₁a₂₃a₃₂
Observe that there are six products, each product consisting of three elements of the original matrix. Three of the products are plus-labeled (keep their sign) and three of the products are minus-labeled (change their sign).
The diagrams in Fig. 8-1 may help us to remember the above six products in det(A). That is, the determinant is equal to the sum of the products of the elements along the three plus-labeled arrows in Fig. 8-1 plus the sum of the negatives of the products of the elements along the three minus-labeled arrows. We emphasize that there are no such diagrammatic devices with which to remember determinants of higher order.
EXAMPLE 8.3  Let A = [2, 1, 1; 0, 5, −2; 1, −3, 4] and B = [3, 2, 1; −4, 5, −1; 2, −3, 4]. Find det(A) and det(B).
Use the diagrams in Fig. 8-1:
det(A) = 2(5)(4) + 1(−2)(1) + 1(−3)(0) − 1(5)(1) − (−3)(−2)(2) − 4(1)(0) = 40 − 2 + 0 − 5 − 12 − 0 = 21
det(B) = 60 − 4 + 12 − 10 − 9 + 32 = 81
Alternative Form for a Determinant of Order 3
The determinant of the 3 × 3 matrix A = [aᵢⱼ] may be rewritten as follows:
det(A) = a₁₁(a₂₂a₃₃ − a₂₃a₃₂) − a₁₂(a₂₁a₃₃ − a₂₃a₃₁) + a₁₃(a₂₁a₃₂ − a₂₂a₃₁)
= a₁₁ |a₂₂, a₂₃; a₃₂, a₃₃| − a₁₂ |a₂₁, a₂₃; a₃₁, a₃₃| + a₁₃ |a₂₁, a₂₂; a₃₁, a₃₂|
which is a linear combination of three determinants of order 2 whose coefficients (with alternating signs) form the first row of the given matrix. This linear combination may be indicated by writing each coefficient a₁₁, a₁₂, a₁₃ (with alternating signs) times the 3 × 3 determinant of A with the row and column containing that coefficient struck out.
Note that each 2 × 2 matrix can be obtained by deleting, in the original matrix, the row and column containing its coefficient.
EXAMPLE 8.4
|1, 2, 3; 4, −2, 3; 0, 5, −1| = 1 |−2, 3; 5, −1| − 2 |4, 3; 0, −1| + 3 |4, −2; 0, 5|
= 1(2 − 15) − 2(−4 + 0) + 3(20 + 0) = −13 + 8 + 60 = 55
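The expansion along the first row extends to any order, which suggests a simple recursive routine. A sketch (pure Python, not from the text):

```python
def det(M):
    """Determinant by cofactor expansion along the first row."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j+1:] for row in M[1:]]   # delete row 0 and column j
        total += (-1) ** j * M[0][j] * det(minor)
    return total

print(det([[1, 2, 3], [4, -2, 3], [0, 5, -1]]))   # 55, as in Example 8.4
```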
Figure 8-1
8.4 Permutations
A permutation σ of the set {1, 2, ..., n} is a one-to-one mapping of the set onto itself or, equivalently, a rearrangement of the numbers 1, 2, ..., n. Such a permutation σ is denoted by
σ = (1, 2, ..., n; j₁, j₂, ..., jₙ)  or  σ = j₁j₂⋯jₙ, where jᵢ = σ(i)
The set of all such permutations is denoted by Sₙ, and the number of such permutations is n!. If σ ∈ Sₙ, then the inverse mapping σ⁻¹ ∈ Sₙ; and if σ, τ ∈ Sₙ, then the composition mapping σ ∘ τ ∈ Sₙ. Also, the identity mapping ε = σ ∘ σ⁻¹ ∈ Sₙ. (In fact, ε = 123⋯n.)
EXAMPLE 8.5
(a) There are 2! = 2·1 = 2 permutations in S₂; they are 12 and 21.
(b) There are 3! = 3·2·1 = 6 permutations in S₃; they are 123, 132, 213, 231, 312, 321.
Sign (Parity) of a Permutation
Consider an arbitrary permutation σ in Sₙ, say σ = j₁j₂⋯jₙ. We say σ is an even or odd permutation according to whether there is an even or odd number of inversions in σ. By an inversion in σ we mean a pair of integers (i, k) such that i > k, but i precedes k in σ. We then define the sign or parity of σ, written sgn σ, by
sgn σ = 1 if σ is even, −1 if σ is odd
EXAMPLE 8.6
(a) Find the sign of σ = 35142 in S₅.
For each element k, we count the number of elements i such that i > k and i precedes k in σ. There are:
2 numbers (3 and 5) greater than and preceding 1;
3 numbers (3, 5, and 4) greater than and preceding 2;
1 number (5) greater than and preceding 4.
(There are no numbers greater than and preceding either 3 or 5.) Because there are, in all, six inversions, σ is even and sgn σ = 1.
(b) The identity permutation ε = 123⋯n is even because there are no inversions in ε.
(c) In S₂, the permutation 12 is even and 21 is odd. In S₃, the permutations 123, 231, 312 are even and the permutations 132, 213, 321 are odd.
(d) Let τ be the permutation that interchanges two numbers i and j and leaves the other numbers fixed. That is,
τ(i) = j, τ(j) = i, τ(k) = k, where k ≠ i, j
We call τ a transposition. If i < j, then there are 2(j − i) − 1 inversions in τ, and hence, the transposition τ is odd.
Remark: One can show that, for any n, half of the permutations in Sn are even and half of them are
odd. For example, 3 of the 6 permutations in S3 are even, and 3 are odd.
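A sketch (not from the text) of computing the sign of a permutation by counting inversions, as in Example 8.6:

```python
def sgn(perm):
    """Sign of a permutation given as a list, e.g. [3, 5, 1, 4, 2] for 35142."""
    inversions = sum(1 for i in range(len(perm))
                       for j in range(i + 1, len(perm))
                       if perm[i] > perm[j])
    return 1 if inversions % 2 == 0 else -1

print(sgn([3, 5, 1, 4, 2]))    # +1: six inversions, so the permutation is even
print(sgn([2, 1]))             # -1: a transposition is odd
```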
8.5 Determinants of Arbitrary Order
Let A = [aᵢⱼ] be a square matrix of order n over a field K.
Consider a product of n elements of A such that one and only one element comes from each row and one and only one element comes from each column. Such a product can be written in the form
a₁ⱼ₁ a₂ⱼ₂ ⋯ aₙⱼₙ
that is, where the factors come from successive rows, and so the first subscripts are in the natural order 1, 2, ..., n. Now because the factors come from different columns, the sequence of second subscripts forms a permutation σ = j₁j₂⋯jₙ in Sₙ. Conversely, each permutation in Sₙ determines a product of the above form. Thus, the matrix A contains n! such products.
DEFINITION: The determinant of A = [aᵢⱼ], denoted by det(A) or |A|, is the sum of all the above n! products, where each such product is multiplied by sgn σ. That is,
|A| = Σ_σ (sgn σ) a₁ⱼ₁ a₂ⱼ₂ ⋯ aₙⱼₙ  or  |A| = Σ_{σ∈Sₙ} (sgn σ) a₁σ(1) a₂σ(2) ⋯ aₙσ(n)
The determinant of the n-square matrix A is said to be of order n.
The next example shows that the above definition agrees with the previous definition of determinants of orders 1, 2, and 3.
EXAMPLE 8.7
(a) Let A = [a₁₁] be a 1 × 1 matrix. Because S₁ has only one permutation, which is even, det(A) = a₁₁, the number itself.
(b) Let A = [aᵢⱼ] be a 2 × 2 matrix. In S₂, the permutation 12 is even and the permutation 21 is odd. Hence,
det(A) = |a₁₁, a₁₂; a₂₁, a₂₂| = a₁₁a₂₂ − a₁₂a₂₁
(c) Let A = [aᵢⱼ] be a 3 × 3 matrix. In S₃, the permutations 123, 231, 312 are even, and the permutations 321, 213, 132 are odd. Hence,
det(A) = |a₁₁, a₁₂, a₁₃; a₂₁, a₂₂, a₂₃; a₃₁, a₃₂, a₃₃|
= a₁₁a₂₂a₃₃ + a₁₂a₂₃a₃₁ + a₁₃a₂₁a₃₂ − a₁₃a₂₂a₃₁ − a₁₂a₂₁a₃₃ − a₁₁a₂₃a₃₂
Remark: As n increases, the number of terms in the determinant becomes astronomical.
Accordingly, we use indirect methods to evaluate determinants rather than the definition of the
determinant. In fact, we prove a number of properties about determinants that will permit us to shorten
the computation considerably. In particular, we show that a determinant of order n is equal to a linear
combination of determinants of order n  1, as in the case n ¼ 3 above.
8.6 Properties of Determinants
We now list basic properties of the determinant.
THEOREM 8.1: The determinant of a matrix A and its transpose Aᵀ are equal; that is, |A| = |Aᵀ|.
By this theorem (proved in Problem 8.22), any theorem about the determinant of a matrix A that
concerns the rows of A will have an analogous theorem concerning the columns of A.
The next theorem (proved in Problem 8.24) gives certain cases for which the determinant can be
obtained immediately.
THEOREM 8.2: Let A be a square matrix.
(i) If A has a row (column) of zeros, then |A| = 0.
(ii) If A has two identical rows (columns), then |A| = 0.
(iii) If A is triangular (i.e., A has zeros above or below the diagonal), then |A| = product of diagonal elements. Thus, in particular, |I| = 1, where I is the identity matrix.
The next theorem (proved in Problems 8.23 and 8.25) shows how the determinant of a matrix is
affected by the elementary row and column operations.
THEOREM 8.3: Suppose B is obtained from A by an elementary row (column) operation.
(i) If two rows (columns) of A were interchanged, then |B| = −|A|.
(ii) If a row (column) of A were multiplied by a scalar k, then |B| = k|A|.
(iii) If a multiple of a row (column) of A were added to another row (column) of A, then |B| = |A|.
Major Properties of Determinants
We now state two of the most important and useful theorems on determinants.
THEOREM 8.4: The determinant of a product of two matrices A and B is the product of their
determinants; that is,
det(AB) = det(A) det(B)
The above theorem says that the determinant is a multiplicative function.
THEOREM 8.5: Let A be a square matrix. Then the following are equivalent:
(i) A is invertible; that is, A has an inverse A⁻¹.
(ii) AX = 0 has only the zero solution.
(iii) The determinant of A is not zero; that is, det(A) ≠ 0.
Remark: Depending on the author and the text, a nonsingular matrix A is defined to be an invertible matrix A, or a matrix A for which |A| ≠ 0, or a matrix A for which AX = 0 has only the zero solution. The above theorem shows that all such definitions are equivalent.
We will prove Theorems 8.4 and 8.5 (in Problems 8.29 and 8.28, respectively) using the theory of
elementary matrices and the following lemma (proved in Problem 8.26), which is a special case of
Theorem 8.4.
LEMMA 8.6: Let E be an elementary matrix. Then, for any matrix A, |EA| = |E||A|.
Recall that matrices A and B are similar if there exists a nonsingular matrix P such that B = P⁻¹AP.
Using the multiplicative property of the determinant (Theorem 8.4), one can easily prove (Problem 8.31)
the following theorem.
THEOREM 8.7: Suppose A and B are similar matrices. Then |A| = |B|.
8.7 Minors and Cofactors
Consider an n-square matrix A = [aᵢⱼ]. Let Mᵢⱼ denote the (n − 1)-square submatrix of A obtained by deleting its ith row and jth column. The determinant |Mᵢⱼ| is called the minor of the element aᵢⱼ of A, and we define the cofactor of aᵢⱼ, denoted by Aᵢⱼ, to be the "signed" minor:
Aᵢⱼ = (−1)^(i+j) |Mᵢⱼ|
Note that the "signs" (−1)^(i+j) accompanying the minors form a chessboard pattern with +'s on the main diagonal:
+ − + − ...
− + − + ...
+ − + − ...
.............
We emphasize that Mᵢⱼ denotes a matrix, whereas Aᵢⱼ denotes a scalar.
Remark: The sign (−1)^(i+j) of the cofactor Aᵢⱼ is frequently obtained using the checkerboard pattern. Specifically, beginning with + and alternating signs, +, −, +, −, ..., count from the main diagonal to the appropriate square.
EXAMPLE 8.8  Let A = [1, 2, 3; 4, 5, 6; 7, 8, 9]. Find the following minors and cofactors: (a) |M₂₃| and A₂₃, (b) |M₃₁| and A₃₁.
(a) |M₂₃| = |1, 2; 7, 8| = 8 − 14 = −6, and so A₂₃ = (−1)^(2+3) |M₂₃| = −(−6) = 6
(b) |M₃₁| = |2, 3; 5, 6| = 12 − 15 = −3, and so A₃₁ = (−1)^(3+1) |M₃₁| = +(−3) = −3
Laplace Expansion
The following theorem (proved in Problem 8.32) holds.
THEOREM 8.8: (Laplace) The determinant of the square matrix A = [aij] is equal to the sum of the products obtained by multiplying the elements of any row (column) by their respective cofactors:

    |A| = ai1 Ai1 + ai2 Ai2 + ... + ain Ain = Σ (j = 1 to n) aij Aij

    |A| = a1j A1j + a2j A2j + ... + anj Anj = Σ (i = 1 to n) aij Aij
The above formulas for jAj are called the Laplace expansions of the determinant of A by the ith row
and the jth column. Together with the elementary row (column) operations, they offer a method of
simplifying the computation of jAj, as described below.
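For concreteness, the Laplace expansion translates directly into a recursive routine. The sketch below (an illustration added here, not from the text; the function name det_laplace is made up) expands along the first row; it is exponential in n and is meant only to mirror the formula, not to replace the elimination method of Section 8.8.

```python
def det_laplace(A):
    """Determinant by cofactor expansion along the first row (illustrative only)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j+1:] for row in A[1:]]   # delete row 1 and column j+1
        cofactor = (-1) ** j * det_laplace(minor)        # sign (-1)^(1 + (j+1))
        total += A[0][j] * cofactor
    return total

print(det_laplace([[1, 2, 3], [4, 5, 6], [7, 8, 9]]))      # 0
print(det_laplace([[2, 3, -4], [0, -4, 2], [1, -1, 5]]))   # -46 (the matrix of Example 8.10)
```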
8.8 Evaluation of Determinants
The following algorithm reduces the evaluation of a determinant of order n to the evaluation of a determinant of order n - 1.
ALGORITHM 8.1: (Reduction of the order of a determinant) The input is a nonzero n-square matrix A = [aij] with n > 1.
Step 1. Choose an element aij = 1 or, if there is none, an element aij ≠ 0.
Step 2. Using aij as a pivot, apply elementary row (column) operations to put 0’s in all the other
positions in the column (row) containing aij.
Step 3. Expand the determinant by the column (row) containing aij.
The following remarks are in order.
Remark 1: Algorithm 8.1 is usually used for determinants of order 4 or more. With determinants
of order less than 4, one uses the specific formulas for the determinant.
Remark 2: Gaussian elimination or, equivalently, repeated use of Algorithm 8.1 together with row
interchanges can be used to transform a matrix A into an upper triangular matrix whose determinant is the
product of its diagonal entries. However, one must keep track of the number of row interchanges, because
each row interchange changes the sign of the determinant.
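The procedure described in Remark 2 can be sketched in a few lines of Python (added here for illustration; the helper name and the pivoting strategy are choices of this sketch, not prescriptions of the text):

```python
def det_by_elimination(A, eps=1e-12):
    """Determinant by Gaussian elimination: triangularize, track row swaps (Remark 2)."""
    M = [row[:] for row in A]                  # work on a copy
    n, sign = len(M), 1
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))   # partial pivoting
        if abs(M[p][k]) < eps:
            return 0.0
        if p != k:
            M[k], M[p] = M[p], M[k]
            sign = -sign                        # Theorem 8.3(i): each swap flips the sign
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for c in range(k, n):
                M[r][c] -= f * M[k][c]          # Theorem 8.3(iii): determinant unchanged
    for k in range(n):
        sign *= M[k][k]
    return sign

print(det_by_elimination([[5, 4, 2, 1], [2, 3, 1, -2],
                          [-5, -7, -3, 9], [1, -2, -1, 4]]))   # 38.0 (Example 8.9, up to rounding)
```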
EXAMPLE 8.9 Use Algorithm 8.1 to find the determinant of

    A = [  5   4   2   1 ]
        [  2   3   1  -2 ]
        [ -5  -7  -3   9 ]
        [  1  -2  -1   4 ]

Use a23 = 1 as a pivot to put 0's in the other positions of the third column; that is, apply the row operations "Replace R1 by -2R2 + R1," "Replace R3 by 3R2 + R3," and "Replace R4 by R2 + R4." By Theorem 8.3(iii), the value of the determinant does not change under these operations. Thus,

    |A| = |  5   4   2   1 |   =   |  1  -2   0   5 |
          |  2   3   1  -2 |       |  2   3   1  -2 |
          | -5  -7  -3   9 |       |  1   2   0   3 |
          |  1  -2  -1   4 |       |  3   1   0   2 |

Now expand by the third column. Specifically, neglect all terms that contain 0 and use the fact that the sign of the minor M23 is (-1)^(2+3) = -1. Thus,

    |A| = - |  1  -2   5 |  =  -(4 - 18 + 5 - 30 - 3 + 4)  =  -(-38)  =  38
            |  1   2   3 |
            |  3   1   2 |
8.9 Classical Adjoint
Let A = [aij] be an n×n matrix over a field K and let Aij denote the cofactor of aij. The classical adjoint of A, denoted by adj A, is the transpose of the matrix of cofactors of A. Namely,

    adj A = [Aij]^T
We say ‘‘classical adjoint’’ instead of simply ‘‘adjoint’’ because the term ‘‘adjoint’’ is currently used for
an entirely different concept.
EXAMPLE 8.10 Let A = [ 2  3 -4 ]
                     [ 0 -4  2 ]
                     [ 1 -1  5 ]
The cofactors of the nine elements of A follow:

    A11 = +| -4  2 | = -18,    A12 = -| 0  2 | =  2,    A13 = +| 0 -4 | = 4
           | -1  5 |                  | 1  5 |                 | 1 -1 |

    A21 = -|  3 -4 | = -11,    A22 = +| 2 -4 | = 14,    A23 = -| 2  3 | = 5
           | -1  5 |                  | 1  5 |                 | 1 -1 |

    A31 = +|  3 -4 | = -10,    A32 = -| 2 -4 | = -4,    A33 = +| 2  3 | = -8
           | -4  2 |                  | 0  2 |                 | 0 -4 |

The transpose of the above matrix of cofactors yields the classical adjoint of A; that is,

    adj A = [ -18  -11  -10 ]
            [   2   14   -4 ]
            [   4    5   -8 ]
The following theorem (proved in Problem 8.34) holds.
THEOREM 8.9: Let A be any square matrix. Then

    A(adj A) = (adj A)A = |A| I

where I is the identity matrix. Thus, if |A| ≠ 0,

    A^-1 = (1/|A|)(adj A)
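The theorem is easy to check numerically. The sketch below (illustrative, with a made-up helper name; only standard NumPy calls are used) builds adj A from the cofactor definition and verifies both identities on the matrix of Example 8.10:

```python
import numpy as np

def classical_adjoint(A):
    """adj A = transpose of the matrix of cofactors Aij = (-1)^(i+j) |Mij|."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    cof = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            cof[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return cof.T

A = np.array([[2, 3, -4], [0, -4, 2], [1, -1, 5]])
adjA = classical_adjoint(A)
d = np.linalg.det(A)                                    # about -46
print(np.round(adjA))                                   # [[-18 -11 -10] [2 14 -4] [4 5 -8]]
print(np.allclose(A @ adjA, d * np.eye(3)))             # True: A (adj A) = |A| I
print(np.allclose(np.linalg.inv(A), adjA / d))          # True: A^-1 = (1/|A|) adj A
```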
EXAMPLE 8.11 Let A be the matrix in Example 8.10. We have

    det(A) = -40 + 6 + 0 - 16 + 4 + 0 = -46

Thus, A does have an inverse, and, by Theorem 8.9,

    A^-1 = (1/|A|)(adj A) = -(1/46) [ -18  -11  -10 ]  =  [  9/23   11/46   5/23 ]
                                    [   2   14   -4 ]     [ -1/23   -7/23   2/23 ]
                                    [   4    5   -8 ]     [ -2/23   -5/46   4/23 ]
8.10 Applications to Linear Equations, Cramer’s Rule
Consider a system AX ¼ B of n linear equations in n unknowns. Here A ¼ ½aij is the (square) matrix of
coefficients and B ¼ ½bi is the column vector of constants. Let Ai be the matrix obtained from A by
replacing the ith column of A by the column vector B. Furthermore, let
    D = det(A),   N1 = det(A1),   N2 = det(A2),   ...,   Nn = det(An)
The fundamental relationship between determinants and the solution of the system AX ¼ B follows.
THEOREM 8.10: The (square) system AX = B has a unique solution if and only if D ≠ 0. In this case, the unique solution is given by

    x1 = N1/D,   x2 = N2/D,   ...,   xn = Nn/D
The above theorem (proved in Problem 8.35) is known as Cramer's rule for solving systems of linear
equations. We emphasize that the theorem only refers to a system with the same number of equations as
unknowns, and that it only gives the solution when D 6¼ 0. In fact, if D ¼ 0, the theorem does not tell us
whether or not the system has a solution. However, in the case of a homogeneous system, we have the
following useful result (to be proved in Problem 8.54).
THEOREM 8.11: A square homogeneous system AX ¼ 0 has a nonzero solution if and only if
D ¼ jAj ¼ 0.
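Cramer's rule is also straightforward to code. The following sketch (added for illustration; the function name is made up, and for large systems elimination is far cheaper) forms each Ni by replacing the ith column of A with B:

```python
import numpy as np

def cramer(A, b):
    """Solve the square system Ax = b by Cramer's rule (Theorem 8.10)."""
    A, b = np.asarray(A, dtype=float), np.asarray(b, dtype=float)
    D = np.linalg.det(A)
    if np.isclose(D, 0.0):
        raise ValueError("D = |A| = 0: Cramer's rule does not apply")
    x = np.empty(len(b))
    for i in range(len(b)):
        Ai = A.copy()
        Ai[:, i] = b                       # replace the ith column of A by the constants
        x[i] = np.linalg.det(Ai) / D
    return x

# The system of Example 8.12 below: x + y + z = 5, x - 2y - 3z = -1, 2x + y - z = 3
print(cramer([[1, 1, 1], [1, -2, -3], [2, 1, -1]], [5, -1, 3]))   # [ 4. -2.  3.]
```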
EXAMPLE 8.12 Solve the following system using determinants:

     x +  y +  z =  5
     x - 2y - 3z = -1
    2x +  y -  z =  3

First compute the determinant D of the matrix of coefficients:

    D = | 1  1  1 |
        | 1 -2 -3 |  =  2 - 6 + 1 + 4 + 3 + 1  =  5
        | 2  1 -1 |

Because D ≠ 0, the system has a unique solution. To compute Nx, Ny, Nz, we replace, respectively, the coefficients of x, y, z in the matrix of coefficients by the constant terms. This yields

    Nx = |  5  1  1 |          Ny = | 1  5  1 |           Nz = | 1  1  5 |
         | -1 -2 -3 | = 20,         | 1 -1 -3 | = -10,         | 1 -2 -1 | = 15
         |  3  1 -1 |               | 2  3 -1 |                | 2  1  3 |

Thus, the unique solution of the system is x = Nx/D = 4, y = Ny/D = -2, z = Nz/D = 3; that is, the vector u = (4, -2, 3).
8.11 Submatrices, Minors, Principal Minors
Let A = [aij] be a square matrix of order n. Consider any r rows and r columns of A. That is, consider any set I = (i1, i2, ..., ir) of r row indices and any set J = (j1, j2, ..., jr) of r column indices. Then I and J define an r×r submatrix of A, denoted by A(I; J), obtained by deleting the rows and columns of A whose subscripts do not belong to I or J, respectively. That is,

    A(I; J) = [ast : s ∈ I, t ∈ J]

The determinant |A(I; J)| is called a minor of A of order r and

    (-1)^(i1+i2+...+ir+j1+j2+...+jr) |A(I; J)|

is the corresponding signed minor. (Note that a minor of order n - 1 is a minor in the sense of Section 8.7, and the corresponding signed minor is a cofactor.) Furthermore, if I' and J' denote, respectively, the remaining row and column indices, then

    |A(I'; J')|

denotes the complementary minor, and its sign (Problem 8.73) is the same sign as the minor.
EXAMPLE 8.13 Let A = [aij] be a 5-square matrix, and let I = {1, 2, 4} and J = {2, 3, 5}. Then I' = {3, 5} and J' = {1, 4}, and the corresponding minor |M| and complementary minor |M'| are as follows:

    |M| = |A(I; J)| = | a12 a13 a15 |        and        |M'| = |A(I'; J')| = | a31 a34 |
                      | a22 a23 a25 |                                        | a51 a54 |
                      | a42 a43 a45 |

Because 1 + 2 + 4 + 2 + 3 + 5 = 17 is odd, -|M| is the signed minor, and -|M'| is the signed complementary minor.
Principal Minors
A minor is principal if the row and column indices are the same, or equivalently, if the diagonal elements
of the minor come from the diagonal of the matrix. We note that the sign of a principal minor is always
þ1, because the sum of the row and identical column subscripts must always be even.
EXAMPLE 8.14 Let A = [ 1  2 -1 ]
                     [ 3  5  4 ]
                     [ 3  1 -2 ]
Find the sums C1, C2, and C3 of the principal minors of A of orders 1, 2, and 3, respectively.

(a) There are three principal minors of order 1. These are

    |1| = 1,   |5| = 5,   |-2| = -2,   and so   C1 = 1 + 5 - 2 = 4

Note that C1 is simply the trace of A. Namely, C1 = tr(A).

(b) There are three ways to choose two of the three diagonal elements, and each choice gives a minor of order 2. These are

    | 1  2 | = -1,    | 1 -1 | = 1,    | 5  4 | = -14
    | 3  5 |          | 3 -2 |         | 1 -2 |

(Note that these minors of order 2 are the cofactors A33, A22, and A11 of A, respectively.) Thus,

    C2 = -1 + 1 - 14 = -14

(c) There is only one way to choose three of the three diagonal elements. Thus, the only minor of order 3 is the determinant of A itself. Thus,

    C3 = |A| = -10 + 24 - 3 + 15 - 4 + 12 = 34
8.12 Block Matrices and Determinants
The following theorem (proved in Problem 8.36) is the main result of this section.
THEOREM 8.12: Suppose M is an upper (lower) triangular block matrix with the diagonal blocks
A1; A2; . . . ; An. Then
    det(M) = det(A1) det(A2) ... det(An)
EXAMPLE 8.15 Find |M| where M = [  2  3  4  7  8 ]
                                [ -1  5  3  2  1 ]
                                [  0  0  2  1  5 ]
                                [  0  0  3 -1  4 ]
                                [  0  0  5  2  6 ]

Note that M is an upper triangular block matrix. Evaluate the determinant of each diagonal block:

    |  2  3 | = 10 + 3 = 13,        | 2  1  5 |
    | -1  5 |                       | 3 -1  4 |  =  -12 + 20 + 30 + 25 - 16 - 18  =  29
                                    | 5  2  6 |

Then |M| = 13(29) = 377.
Remark: Suppose M = [ A B ], where A, B, C, D are square matrices. Then it is not generally
                    [ C D ]
true that |M| = |A||D| - |B||C|. (See Problem 8.68.)
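As a numerical check of Theorem 8.12 (an aside added here; nothing in it is prescribed by the text), one can split the matrix of Example 8.15 into its diagonal blocks and compare determinants:

```python
import numpy as np

M = np.array([[ 2, 3, 4, 7, 8],
              [-1, 5, 3, 2, 1],
              [ 0, 0, 2, 1, 5],
              [ 0, 0, 3,-1, 4],
              [ 0, 0, 5, 2, 6]], dtype=float)
A1, A2 = M[:2, :2], M[2:, 2:]                               # the two diagonal blocks
print(round(np.linalg.det(A1)), round(np.linalg.det(A2)))   # 13 29
print(round(np.linalg.det(M)))                              # 377 = 13 * 29
```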
8.13 Determinants and Volume
Determinants are related to the notions of area and volume as follows. Let u1, u2, ..., un be vectors in R^n. Let S be the (solid) parallelepiped determined by the vectors; that is,

    S = {a1 u1 + a2 u2 + ... + an un : 0 ≤ ai ≤ 1 for i = 1, ..., n}

(When n = 2, S is a parallelogram.) Let V(S) denote the volume of S (or area of S when n = 2). Then

    V(S) = absolute value of det(A)

where A is the matrix with rows u1, u2, ..., un. In general, V(S) = 0 if and only if the vectors u1, ..., un do not form a coordinate system for R^n (i.e., if and only if the vectors are linearly dependent).
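In NumPy this amounts to one determinant call; the snippet below (illustrative only) computes the volume for the vectors of Example 8.16:

```python
import numpy as np

u1, u2, u3 = [1, 1, 0], [1, 1, 1], [0, 2, 3]      # the vectors of Example 8.16
A = np.array([u1, u2, u3], dtype=float)           # rows of A are the vectors
print(abs(np.linalg.det(A)))                      # 2.0 = V(S)
```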
EXAMPLE 8.16 Let u1 = (1, 1, 0), u2 = (1, 1, 1), u3 = (0, 2, 3). Find the volume V(S) of the parallelepiped S in R^3 (Fig. 8-2) determined by the three vectors.

Evaluate the determinant of the matrix whose rows are u1, u2, u3:

    | 1  1  0 |
    | 1  1  1 |  =  3 + 0 + 0 - 0 - 2 - 3  =  -2
    | 0  2  3 |

Hence, V(S) = |-2| = 2.
8.14 Determinant of a Linear Operator
Let F be a linear operator on a vector space V of finite dimension. Let A be the matrix representation of F relative to some basis S of V. Then we define the determinant of F, written det(F), by

    det(F) = |A|

If B were another matrix representation of F relative to another basis S' of V, then A and B are similar matrices (Theorem 6.7) and |B| = |A| (Theorem 8.7). In other words, the above definition of det(F) is independent of the particular basis S of V. (We say that the definition is well defined.)
The next theorem (to be proved in Problem 8.62) follows from analogous theorems on matrices.
THEOREM 8.13: Let F and G be linear operators on a vector space V. Then
(i) det(F∘G) = det(F) det(G).
(ii) F is invertible if and only if det(F) ≠ 0.
EXAMPLE 8.17 Let F be the following linear operator on R^3 and let A be the matrix that represents F relative to the usual basis of R^3:

    F(x, y, z) = (2x - 4y + z, x - 2y + 3z, 5x + y - z)      and      A = [ 2 -4  1 ]
                                                                          [ 1 -2  3 ]
                                                                          [ 5  1 -1 ]
Then
    det(F) = |A| = 4 - 60 + 1 + 10 - 6 - 4 = -55
[Figure 8-2: the parallelepiped in R^3 determined by u1, u2, u3 (Example 8.16).]
8.15 Multilinearity and Determinants
Let V be a vector space over a field K. Let 𝒜 = V^n; that is, 𝒜 consists of all the n-tuples

    A = (A1, A2, ..., An)

where the Ai are vectors in V. The following definitions apply.

DEFINITION: A function D: 𝒜 → K is said to be multilinear if it is linear in each component:
(i) If Ai = B + C, then
    D(A) = D(..., B + C, ...) = D(..., B, ...) + D(..., C, ...)
(ii) If Ai = kB, where k ∈ K, then
    D(A) = D(..., kB, ...) = k D(..., B, ...)
We also say n-linear for multilinear if there are n components.

DEFINITION: A function D: 𝒜 → K is said to be alternating if D(A) = 0 whenever A has two identical elements:
    D(A1, A2, ..., An) = 0   whenever Ai = Aj, i ≠ j

Now let M denote the set of all n-square matrices A over a field K. We may view A as an n-tuple consisting of its row vectors A1, A2, ..., An; that is, we may view A in the form A = (A1, A2, ..., An).

The following theorem (proved in Problem 8.37) characterizes the determinant function.

THEOREM 8.14: There exists a unique function D: M → K such that
(i) D is multilinear, (ii) D is alternating, (iii) D(I) = 1.
This function D is the determinant function; that is, D(A) = |A| for any matrix A ∈ M.
SOLVED PROBLEMS
Computation of Determinants
8.1. Evaluate the determinant of each of the following matrices:

    (a) A = [ 6  5 ],   (b) B = [ 2 -3 ],   (c) C = [  4 -5 ],   (d) D = [ t-5   6  ]
            [ 2  3 ]            [ 4  7 ]            [ -1 -2 ]            [  3   t+2 ]

Use the formula | a  b | = ad - bc:
                | c  d |

(a) |A| = 6(3) - 5(2) = 18 - 10 = 8
(b) |B| = 14 + 12 = 26
(c) |C| = -8 - 5 = -13
(d) |D| = (t - 5)(t + 2) - 18 = t^2 - 3t - 10 - 18 = t^2 - 3t - 28
8.2. Evaluate the determinant of each of the following matrices:

    (a) A = [ 2  3  4 ],   (b) B = [ 1 -2  3 ],   (c) C = [ 1  3 -5 ]
            [ 5  4  3 ]            [ 2  4 -1 ]            [ 3 -1  2 ]
            [ 1  2  1 ]            [ 1  5 -2 ]            [ 1 -2  1 ]

Use the diagram in Fig. 8-1 to obtain the six products:

(a) |A| = 2(4)(1) + 3(3)(1) + 4(2)(5) - 1(4)(4) - 2(3)(2) - 1(3)(5) = 8 + 9 + 40 - 16 - 12 - 15 = 14
(b) |B| = -8 + 2 + 30 - 12 + 5 - 8 = 9
(c) |C| = -1 + 6 + 30 - 5 + 4 - 9 = 25
8.3. Compute the determinant of each of the following matrices:

    (a) A = [ 2  3  4 ],   (b) B = [ 4  6  8  9 ],   (c) C = [  1/2    1  -1/3 ]
            [ 5  6  7 ]            [ 0  2  7  3 ]            [ -3/4   1/2   1  ]
            [ 8  9  1 ]            [ 0  0  5  6 ]            [   1     4    1  ]
                                   [ 0  0  0  3 ]

(a) One can simplify the entries by first subtracting twice the first row from the second row—that is, by applying the row operation "Replace R2 by -2R1 + R2." Then

    |A| = | 2  3  4 |   =   | 2  3  4 |   =   0 - 24 + 36 - 0 + 18 - 3  =  27
          | 5  6  7 |       | 1  0 -1 |
          | 8  9  1 |       | 8  9  1 |

(b) B is triangular, so |B| = product of the diagonal entries = 120.

(c) The arithmetic is simpler if fractions are first eliminated. Hence, multiply the first row R1 by 6 and the second row R2 by 4, which multiplies the determinant by 24. Then

    24|C| = |  3  6 -2 |  =  6 + 24 + 24 + 4 - 48 + 18  =  28,    so    |C| = 28/24 = 7/6
            | -3  2  4 |
            |  1  4  1 |
8.4. Compute the determinant of each of the following matrices:

    (a) A = [  2  5 -3 -2 ]
            [ -2 -3  2 -5 ]
            [  1  3 -2  2 ]
            [ -1 -6  4  3 ]
    (b) B = [ 6  2  1  0  5 ]
            [ 2  1  1  2  1 ]
            [ 1  1  2  2  3 ]
            [ 3  0  2  3  1 ]
            [ 1  1  3  4  2 ]
(a) Use a31 = 1 as a pivot to put 0's in the first column, by applying the row operations "Replace R1 by -2R3 + R1," "Replace R2 by 2R3 + R2," and "Replace R4 by R3 + R4." Then

    |A| = |  2  5 -3 -2 |   =   | 0 -1  1 -6 |   =   | -1  1 -6 |   =   10 + 3 - 36 + 36 - 2 - 15  =  -4
          | -2 -3  2 -5 |       | 0  3 -2 -1 |       |  3 -2 -1 |
          |  1  3 -2  2 |       | 1  3 -2  2 |       | -3  2  5 |
          | -1 -6  4  3 |       | 0 -3  2  5 |
(b) First reduce |B| to a determinant of order 4, and then to a determinant of order 3, for which we can use Fig. 8-1. First use b22 = 1 as a pivot to put 0's in the second column, by applying the row operations "Replace R1 by -2R2 + R1," "Replace R3 by R2 + R3," and "Replace R5 by R2 + R5." Then

    |B| = | 2 0 1 4 3 |   =   | 2 1 4 3 |   =   | 1 1 4 5 |   =   | 1 4 5 |   =   21 + 20 + 50 + 15 + 10 - 140  =  34
          | 2 1 1 2 1 |       | 1 1 0 2 |       | 0 1 0 0 |       | 5 3 5 |
          | 1 0 1 0 2 |       | 3 2 3 1 |       | 5 2 3 5 |       | 1 2 7 |
          | 3 0 2 3 1 |       | 1 2 2 3 |       | 1 2 2 7 |
          | 1 0 2 2 3 |
Cofactors, Classical Adjoints, Minors, Principal Minors
8.5. Let A = [ 2  1 -3  4 ]
             [ 5 -4  7 -2 ]
             [ 4  0  6 -3 ]
             [ 3 -2  5  2 ]

(a) Find A23, the cofactor (signed minor) of 7 in A.
(b) Find the minor and the signed minor of the submatrix M = A(2, 4; 2, 3).
(c) Find the principal minor determined by the first and third diagonal entries—that is, by M = A(1, 3; 1, 3).

(a) Take the determinant of the submatrix of A obtained by deleting row 2 and column 3 (those which contain the 7), and multiply the determinant by (-1)^(2+3):

    A23 = - | 2  1  4 |  =  -(-61)  =  61
            | 4  0 -3 |
            | 3 -2  2 |

The exponent 2 + 3 comes from the subscripts of A23—that is, from the fact that 7 appears in row 2 and column 3.

(b) The row subscripts are 2 and 4 and the column subscripts are 2 and 3. Hence, the minor is the determinant

    |M| = | a22 a23 | = | -4  7 | = -20 + 14 = -6
          | a42 a43 |   | -2  5 |

and the signed minor is (-1)^(2+4+2+3) |M| = -|M| = -(-6) = 6.

(c) The principal minor is the determinant

    |M| = | a11 a13 | = | 2 -3 | = 12 + 12 = 24
          | a31 a33 |   | 4  6 |

Note that now the diagonal entries of the submatrix are diagonal entries of the original matrix. Also, the sign of the principal minor is positive.
8.6. Let B = [ 1 1 1 ]
             [ 2 3 4 ]
             [ 5 8 9 ]
Find: (a) |B|, (b) adj B, (c) B^-1 using adj B.

(a) |B| = 27 + 20 + 16 - 15 - 32 - 18 = -2

(b) Take the transpose of the matrix of cofactors. The matrix of cofactors of B is

    [ -5  2  1 ]                                    [ -5 -1  1 ]
    [ -1  4 -3 ],    so    adj B = (cofactors)^T =  [  2  4 -2 ]
    [  1 -2  1 ]                                    [  1 -3  1 ]

(c) Because |B| ≠ 0,

    B^-1 = (1/|B|)(adj B) = -(1/2) [ -5 -1  1 ]  =  [  5/2   1/2  -1/2 ]
                                   [  2  4 -2 ]     [  -1    -2     1  ]
                                   [  1 -3  1 ]     [ -1/2   3/2  -1/2 ]
8.7. Let A = [ 1 2 3 ]
             [ 4 5 6 ]
             [ 0 7 8 ]
and let Sk denote the sum of its principal minors of order k. Find Sk for (a) k = 1, (b) k = 2, (c) k = 3.

(a) The principal minors of order 1 are the diagonal elements. Thus, S1 is the trace of A; that is,

    S1 = tr(A) = 1 + 5 + 8 = 14

(b) The principal minors of order 2 are the cofactors of the diagonal elements. Thus,

    S2 = A11 + A22 + A33 = | 5 6 | + | 1 3 | + | 1 2 | = -2 + 8 - 3 = 3
                           | 7 8 |   | 0 8 |   | 4 5 |

(c) There is only one principal minor of order 3, the determinant of A. Then

    S3 = |A| = 40 + 0 + 84 - 0 - 42 - 64 = 18
8.8. Let A = [  1  3  0 -1 ]
             [ -4  2  5  1 ]
             [  1  0  3 -2 ]
             [  3 -2  1  4 ]
Find the number Nk and the sum Sk of principal minors of order: (a) k = 1, (b) k = 2, (c) k = 3, (d) k = 4.

Each (nonempty) subset of the diagonal (or, equivalently, each nonempty subset of {1, 2, 3, 4}) determines a principal minor of A, and

    Nk = C(n, k) = n! / [k!(n - k)!]

of them are of order k. Thus,

    N1 = C(4, 1) = 4,   N2 = C(4, 2) = 6,   N3 = C(4, 3) = 4,   N4 = C(4, 4) = 1

(a) S1 = |1| + |2| + |3| + |4| = 1 + 2 + 3 + 4 = 10

(b) S2 = |  1  3 | + | 1  0 | + | 1 -1 | + | 2  5 | + |  2  1 | + | 3 -2 |
         | -4  2 |   | 1  3 |   | 3  4 |   | 0  3 |   | -2  4 |   | 1  4 |
       = 14 + 3 + 7 + 6 + 10 + 14 = 54

(c) S3 = |  1  3  0 |   |  1  3 -1 |   | 1  0 -1 |   |  2  5  1 |
         | -4  2  5 | + | -4  2  1 | + | 1  3 -2 | + |  0  3 -2 |  =  57 + 65 + 22 + 54 = 198
         |  1  0  3 |   |  3 -2  4 |   | 3  1  4 |   | -2  1  4 |

(d) S4 = det(A) = 378
Determinants and Systems of Linear Equations
8.9. Use determinants to solve the system:

    3y + 2x = z + 1,    3x + 2z = 8 - 5y,    3z - 1 = x - 2y

First arrange the equations in standard form, then compute the determinant D of the matrix of coefficients:

    2x + 3y -  z =  1
    3x + 5y + 2z =  8        and        D = | 2  3 -1 |  =  -30 + 6 + 6 + 5 + 8 + 27  =  22
     x - 2y - 3z = -1                       | 3  5  2 |
                                            | 1 -2 -3 |

Because D ≠ 0, the system has a unique solution. To compute Nx, Ny, Nz, we replace, respectively, the coefficients of x, y, z in the matrix of coefficients by the constant terms. Then

    Nx = |  1  3 -1 |          Ny = | 2  1 -1 |           Nz = | 2  3  1 |
         |  8  5  2 | = 66,         | 3  8  2 | = -22,         | 3  5  8 | = 44
         | -1 -2 -3 |               | 1 -1 -3 |                | 1 -2 -1 |

Thus,

    x = Nx/D = 66/22 = 3,    y = Ny/D = -22/22 = -1,    z = Nz/D = 44/22 = 2
8.10. Consider the system

    kx + y + z = 1,    x + ky + z = 1,    x + y + kz = 1

Use determinants to find those values of k for which the system has (a) a unique solution, (b) more than one solution, (c) no solution.

(a) The system has a unique solution when D ≠ 0, where D is the determinant of the matrix of coefficients. Compute

    D = | k  1  1 |  =  k^3 + 1 + 1 - k - k - k  =  k^3 - 3k + 2  =  (k - 1)^2 (k + 2)
        | 1  k  1 |
        | 1  1  k |

Thus, the system has a unique solution when (k - 1)^2 (k + 2) ≠ 0, that is, when k ≠ 1 and k ≠ -2.

(b and c) Gaussian elimination shows that the system has more than one solution when k = 1, and the system has no solution when k = -2.
Miscellaneous Problems
8.11. Find the volume V(S) of the parallelepiped S in R^3 determined by the vectors:
(a) u1 = (1, 1, 1), u2 = (1, 3, -4), u3 = (1, 2, -5);
(b) u1 = (1, 2, 4), u2 = (2, 1, -3), u3 = (5, 7, 9).

V(S) is the absolute value of the determinant of the matrix M whose rows are the given vectors. Thus,

(a) |M| = | 1  1  1 |  =  -15 - 4 + 2 - 3 + 8 + 5  =  -7.    Hence, V(S) = |-7| = 7.
          | 1  3 -4 |
          | 1  2 -5 |

(b) |M| = | 1  2  4 |  =  9 - 30 + 56 - 20 + 21 - 36  =  0.    Thus, V(S) = 0; in other words, u1, u2, u3 lie in a plane and are linearly dependent.
          | 2  1 -3 |
          | 5  7  9 |
8.12. Find det(M) where M = [ 3  4  0  0  0 ]
                            [ 2  5  0  0  0 ]
                            [ 0  9  2  0  0 ]
                            [ 0  5  0  6  7 ]
                            [ 0  0  4  3  4 ]

M is a (lower) triangular block matrix; hence, evaluate the determinant of each diagonal block:

    | 3  4 | = 15 - 8 = 7,      |2| = 2,      | 6  7 | = 24 - 21 = 3
    | 2  5 |                                  | 3  4 |

Thus, |M| = 7(2)(3) = 42.
8.13. Find the determinant of F: R^3 → R^3 defined by

    F(x, y, z) = (x + 3y - 4z, 2y + 7z, x + 5y - 3z)

The determinant of a linear operator F is equal to the determinant of any matrix that represents F. Thus first find the matrix A representing F in the usual basis (whose rows, respectively, consist of the coefficients of x, y, z). Then

    A = [ 1  3 -4 ]
        [ 0  2  7 ]        and so        det(F) = |A| = -6 + 21 + 0 + 8 - 35 - 0 = -12
        [ 1  5 -3 ]
8.14. Write out g = g(x1, x2, x3, x4) explicitly, where

    g(x1, x2, ..., xn) = Π_{i<j} (xi - xj)

The symbol Π is used for a product of terms in the same way that the symbol Σ is used for a sum of terms. That is, Π_{i<j} (xi - xj) means the product of all terms (xi - xj) for which i < j. Hence,

    g = g(x1, ..., x4) = (x1 - x2)(x1 - x3)(x1 - x4)(x2 - x3)(x2 - x4)(x3 - x4)
8.15. Let D be a 2-linear, alternating function. Show that D(A, B) = -D(B, A).

Because D is alternating, D(A, A) = 0 and D(B, B) = 0. Hence,

    D(A + B, A + B) = D(A, A) + D(A, B) + D(B, A) + D(B, B) = D(A, B) + D(B, A)

However, D(A + B, A + B) = 0. Hence, D(A, B) = -D(B, A), as required.
Permutations
8.16. Determine the parity (sign) of the permutation s = 364152.

Count the number of inversions. That is, for each element k, count the number of elements i in s such that i > k and i precedes k in s. Namely,

    k = 1:  3 numbers (3, 6, 4)
    k = 2:  4 numbers (3, 6, 4, 5)
    k = 3:  0 numbers
    k = 4:  1 number (6)
    k = 5:  1 number (6)
    k = 6:  0 numbers

Because 3 + 4 + 0 + 1 + 1 + 0 = 9 is odd, s is an odd permutation, and sgn s = -1.
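The inversion count used above is easy to automate; the helper below is an illustrative sketch (its name is made up), using one-line notation for the permutation:

```python
def sgn(perm):
    """Sign of a permutation in one-line notation, by counting inversions (Problem 8.16)."""
    inversions = sum(1 for i in range(len(perm))
                       for j in range(i + 1, len(perm))
                       if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

print(sgn((3, 6, 4, 1, 5, 2)))   # -1: nine inversions, so an odd permutation
```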
8.17. Let s = 24513 and t = 41352 be permutations in S5. Find (a) t∘s, (b) s^-1.

Recall that s = 24513 and t = 41352 are short ways of writing

    s = ( 1 2 3 4 5 )      or      s(1) = 2, s(2) = 4, s(3) = 5, s(4) = 1, s(5) = 3
        ( 2 4 5 1 3 )

    t = ( 1 2 3 4 5 )      or      t(1) = 4, t(2) = 1, t(3) = 3, t(4) = 5, t(5) = 2
        ( 4 1 3 5 2 )

(a) The effects of s and then t on 1, 2, 3, 4, 5 are as follows:

    1 → 2 → 1,    2 → 4 → 5,    3 → 5 → 2,    4 → 1 → 4,    5 → 3 → 3

[That is, for example, (t∘s)(1) = t(s(1)) = t(2) = 1.] Thus, t∘s = 15243.

(b) By definition, s^-1(j) = k if and only if s(k) = j. Hence,

    s^-1 = ( 2 4 5 1 3 )  =  ( 1 2 3 4 5 )      or      s^-1 = 41523
           ( 1 2 3 4 5 )     ( 4 1 5 2 3 )
8.18. Let s = j1 j2 ... jn be any permutation in Sn. Show that, for each inversion (i, k) where i > k but i precedes k in s, there is a pair (i*, k*) such that

    i* < k*   and   s(i*) > s(k*)     (1)

and vice versa. Thus, s is even or odd according to whether there is an even or an odd number of pairs satisfying (1).

Choose i* and k* so that s(i*) = i and s(k*) = k. Then i > k if and only if s(i*) > s(k*), and i precedes k in s if and only if i* < k*.
8.19. Consider the polynomials g = g(x1, ..., xn) and s(g), defined by

    g = g(x1, ..., xn) = Π_{i<j} (xi - xj)    and    s(g) = Π_{i<j} (xs(i) - xs(j))

(See Problem 8.14.) Show that s(g) = g when s is an even permutation, and s(g) = -g when s is an odd permutation. That is, s(g) = (sgn s) g.

Because s is one-to-one and onto,

    s(g) = Π_{i<j} (xs(i) - xs(j)) = Π_{i<j or i>j} (xi - xj)

Thus, s(g) or -s(g) equals g according to whether there is an even or an odd number of terms of the form xi - xj, where i > j. Note that for each pair (i, j) for which i < j and s(i) > s(j), there is a term (xs(i) - xs(j)) in s(g) for which s(i) > s(j). Because s is even if and only if there is an even number of pairs satisfying (1), we have s(g) = g if and only if s is even. Hence, s(g) = -g if and only if s is odd.
8.20. Let s, t ∈ Sn. Show that sgn(t∘s) = (sgn t)(sgn s). Thus, the product of two even or two odd permutations is even, and the product of an odd and an even permutation is odd.

Using Problem 8.19, we have

    sgn(t∘s) g = (t∘s)(g) = t(s(g)) = t((sgn s) g) = (sgn t)(sgn s) g

Accordingly, sgn(t∘s) = (sgn t)(sgn s).
8.21. Consider the permutation s = j1 j2 ... jn. Show that sgn s^-1 = sgn s and, for scalars aij, show that

    aj1,1 aj2,2 ... ajn,n = a1k1 a2k2 ... ankn

where s^-1 = k1 k2 ... kn.

We have s^-1 ∘ s = e, the identity permutation. Because e is even, s^-1 and s are both even or both odd. Hence sgn s^-1 = sgn s.

Because s = j1 j2 ... jn is a permutation, aj1,1 aj2,2 ... ajn,n = a1k1 a2k2 ... ankn. Then k1, k2, ..., kn have the property that

    s(k1) = 1,   s(k2) = 2,   ...,   s(kn) = n

Let t = k1 k2 ... kn. Then, for i = 1, ..., n,

    (s∘t)(i) = s(t(i)) = s(ki) = i

Thus, s∘t = e, the identity permutation. Hence, t = s^-1.
Proofs of Theorems
8.22. Prove Theorem 8.1: |A^T| = |A|.

If A = [aij], then A^T = [bij], with bij = aji. Hence,

    |A^T| = Σ_{s∈Sn} (sgn s) b1s(1) b2s(2) ... bns(n) = Σ_{s∈Sn} (sgn s) as(1),1 as(2),2 ... as(n),n

Let t = s^-1. By Problem 8.21, sgn t = sgn s, and as(1),1 as(2),2 ... as(n),n = a1t(1) a2t(2) ... ant(n). Hence,

    |A^T| = Σ_{s∈Sn} (sgn t) a1t(1) a2t(2) ... ant(n)

However, as s runs through all the elements of Sn, t = s^-1 also runs through all the elements of Sn. Thus, |A^T| = |A|.
8.23. Prove Theorem 8.3(i): If two rows (columns) of A are interchanged, then |B| = -|A|.

We prove the theorem for the case that two columns are interchanged. Let t be the transposition that interchanges the two numbers corresponding to the two columns of A that are interchanged. If A = [aij] and B = [bij], then bij = ai,t(j). Hence, for any permutation s,

    b1s(1) b2s(2) ... bns(n) = a1,(t∘s)(1) a2,(t∘s)(2) ... an,(t∘s)(n)

Thus,

    |B| = Σ_{s∈Sn} (sgn s) b1s(1) b2s(2) ... bns(n) = Σ_{s∈Sn} (sgn s) a1,(t∘s)(1) a2,(t∘s)(2) ... an,(t∘s)(n)

Because the transposition t is an odd permutation, sgn(t∘s) = (sgn t)(sgn s) = -sgn s. Accordingly, sgn s = -sgn(t∘s), and so

    |B| = - Σ_{s∈Sn} [sgn(t∘s)] a1,(t∘s)(1) a2,(t∘s)(2) ... an,(t∘s)(n)

But as s runs through all the elements of Sn, t∘s also runs through all the elements of Sn. Hence, |B| = -|A|.
8.24. Prove Theorem 8.2.
(i) If A has a row (column) of zeros, then |A| = 0.
(ii) If A has two identical rows (columns), then |A| = 0.
(iii) If A is triangular, then |A| = product of diagonal elements. Thus, |I| = 1.

(i) Each term in |A| contains a factor from every row, and so from the row of zeros. Thus, each term of |A| is zero, and so |A| = 0.

(ii) Suppose 1 + 1 ≠ 0 in K. If we interchange the two identical rows of A, we still obtain the matrix A. Hence, by Problem 8.23, |A| = -|A|, and so |A| = 0.
Now suppose 1 + 1 = 0 in K. Then sgn s = 1 for every s ∈ Sn. Because A has two identical rows, we can arrange the terms of |A| into pairs of equal terms. Because 1 + 1 = 0, each pair sums to 0, and so the determinant of A is zero.

(iii) Suppose A = [aij] is lower triangular; that is, the entries above the diagonal are all zero: aij = 0 whenever i < j. Consider a term t of the determinant of A:

    t = (sgn s) a1i1 a2i2 ... anin,    where s = i1 i2 ... in

Suppose i1 ≠ 1. Then 1 < i1 and so a1i1 = 0; hence, t = 0. That is, each term for which i1 ≠ 1 is zero.
Now suppose i1 = 1 but i2 ≠ 2. Then 2 < i2, and so a2i2 = 0; hence, t = 0. Thus, each term for which i1 ≠ 1 or i2 ≠ 2 is zero.
Similarly, we obtain that each term for which i1 ≠ 1 or i2 ≠ 2 or ... or in ≠ n is zero. Accordingly, |A| = a11 a22 ... ann = product of diagonal elements.
8.25. Prove Theorem 8.3: B is obtained from A by an elementary operation.
(i) If two rows (columns) of A were interchanged, then |B| = -|A|.
(ii) If a row (column) of A were multiplied by a scalar k, then |B| = k|A|.
(iii) If a multiple of a row (column) of A were added to another row (column) of A, then |B| = |A|.

(i) This result was proved in Problem 8.23.

(ii) If the jth row of A is multiplied by k, then every term in |A| is multiplied by k, and so |B| = k|A|. That is,

    |B| = Σ_s (sgn s) a1i1 a2i2 ... (k ajij) ... anin = k Σ_s (sgn s) a1i1 a2i2 ... anin = k|A|

(iii) Suppose c times the kth row is added to the jth row of A. Using the symbol ^ to mark the jth position in a determinant term, we have

    |B| = Σ_s (sgn s) a1i1 a2i2 ... (c akik + ajij)^ ... anin
        = c Σ_s (sgn s) a1i1 a2i2 ... akik^ ... anin + Σ_s (sgn s) a1i1 a2i2 ... ajij^ ... anin

The first sum is c times the determinant of a matrix whose kth and jth rows are identical. Accordingly, by Theorem 8.2(ii), that sum is zero. The second sum is the determinant of A. Thus, |B| = c·0 + |A| = |A|.
8.26. Prove Lemma 8.6: Let E be an elementary matrix. Then |EA| = |E||A|.

Consider the elementary row operations: (i) multiply a row by a constant k ≠ 0, (ii) interchange two rows, (iii) add a multiple of one row to another.
Let E1, E2, E3 be the corresponding elementary matrices. That is, E1, E2, E3 are obtained by applying the above operations to the identity matrix I. By Problem 8.25,

    |E1| = k|I| = k,    |E2| = -|I| = -1,    |E3| = |I| = 1

Recall (Theorem 3.11) that EiA is identical to the matrix obtained by applying the corresponding operation to A. Thus, by Theorem 8.3, we obtain the following, which proves our lemma:

    |E1 A| = k|A| = |E1||A|,    |E2 A| = -|A| = |E2||A|,    |E3 A| = |A| = 1·|A| = |E3||A|
8.27. Suppose B is row equivalent to a square matrix A. Prove that jBj ¼ 0 if and only if jAj ¼ 0.
By Theorem 8.3, the effect of an elementary row operation is to change the sign of the determinant or to
multiply the determinant by a nonzero scalar. Hence, jBj ¼ 0 if and only if jAj ¼ 0.
8.28. Prove Theorem 8.5: Let A be an n-square matrix. Then the following are equivalent:
(i) A is invertible, (ii) AX = 0 has only the zero solution, (iii) det(A) ≠ 0.

The proof is by the Gaussian algorithm. If A is invertible, it is row equivalent to I. But |I| ≠ 0. Hence, by Problem 8.27, |A| ≠ 0. If A is not invertible, it is row equivalent to a matrix with a zero row. Hence, det(A) = 0. Thus, (i) and (iii) are equivalent.

If AX = 0 has only the solution X = 0, then A is row equivalent to I and A is invertible. Conversely, if A is invertible with inverse A^-1, then

    X = IX = (A^-1 A)X = A^-1 (AX) = A^-1 0 = 0

is the only solution of AX = 0. Thus, (i) and (ii) are equivalent.
8.29. Prove Theorem 8.4: |AB| = |A||B|.

If A is singular, then AB is also singular, and so |AB| = 0 = |A||B|. On the other hand, if A is nonsingular, then A = En ... E2 E1, a product of elementary matrices. Then Lemma 8.6 and induction yield

    |AB| = |En ... E2 E1 B| = |En| ... |E2||E1||B| = |A||B|
8.30. Suppose P is invertible. Prove that |P^-1| = |P|^-1.

We have P^-1 P = I. Hence, 1 = |I| = |P^-1 P| = |P^-1||P|, and so |P^-1| = |P|^-1.
8.31. Prove Theorem 8.7: Suppose A and B are similar matrices. Then |A| = |B|.

Because A and B are similar, there exists an invertible matrix P such that B = P^-1 AP. Therefore, using Problem 8.30, we get

    |B| = |P^-1 AP| = |P^-1||A||P| = |A||P^-1||P| = |A|

We remark that although the matrices P^-1 and A may not commute, their determinants |P^-1| and |A| do commute, because they are scalars in the field K.
8.32. Prove Theorem 8.8 (Laplace): Let A = [aij], and let Aij denote the cofactor of aij. Then, for any i or j,

    |A| = ai1 Ai1 + ... + ain Ain    and    |A| = a1j A1j + ... + anj Anj

Because |A| = |A^T|, we need only prove one of the expansions, say, the first one in terms of rows of A. Each term in |A| contains one and only one entry of the ith row (ai1, ai2, ..., ain) of A. Hence, we can write |A| in the form

    |A| = ai1 A*i1 + ai2 A*i2 + ... + ain A*in

(Note that A*ij is a sum of terms involving no entry of the ith row of A.) Thus, the theorem is proved if we can show that

    A*ij = Aij = (-1)^(i+j) |Mij|

where Mij is the matrix obtained by deleting the row and column containing the entry aij. (Historically, the expression A*ij was defined as the cofactor of aij, and so the theorem reduces to showing that the two definitions of the cofactor are equivalent.)

First we consider the case that i = n, j = n. Then the sum of terms in |A| containing ann is

    ann A*nn = ann Σ_s (sgn s) a1s(1) a2s(2) ... a(n-1),s(n-1)

where we sum over all permutations s ∈ Sn for which s(n) = n. However, this is equivalent (prove!) to summing over all permutations of {1, ..., n-1}. Thus, A*nn = |Mnn| = (-1)^(n+n) |Mnn|.

Now we consider any i and j. We interchange the ith row with each succeeding row until it is last, and we interchange the jth column with each succeeding column until it is last. Note that the determinant |Mij| is not affected, because the relative positions of the other rows and columns are not affected by these interchanges. However, the "sign" of |A| and of A*ij is changed n - i and then n - j times. Accordingly,

    A*ij = (-1)^(n-i+n-j) |Mij| = (-1)^(i+j) |Mij|
8.33. Let A = [aij] and let B be the matrix obtained from A by replacing the ith row of A by the row vector (bi1, ..., bin). Show that

    |B| = bi1 Ai1 + bi2 Ai2 + ... + bin Ain

Furthermore, show that, for j ≠ i,

    aj1 Ai1 + aj2 Ai2 + ... + ajn Ain = 0    and    a1j A1i + a2j A2i + ... + anj Ani = 0

Let B = [bij]. By Theorem 8.8,

    |B| = bi1 Bi1 + bi2 Bi2 + ... + bin Bin

Because Bij does not depend on the ith row of B, we get Bij = Aij for j = 1, ..., n. Hence,

    |B| = bi1 Ai1 + bi2 Ai2 + ... + bin Ain

Now let A' be obtained from A by replacing the ith row of A by the jth row of A. Because A' has two identical rows, |A'| = 0. Thus, by the above result,

    |A'| = aj1 Ai1 + aj2 Ai2 + ... + ajn Ain = 0

Using |A^T| = |A|, we also obtain that a1j A1i + a2j A2i + ... + anj Ani = 0.
8.34. Prove Theorem 8.9: A(adj A) = (adj A)A = |A| I.

Let A = [aij] and let A(adj A) = [bij]. The ith row of A is

    (ai1, ai2, ..., ain)     (1)

Because adj A is the transpose of the matrix of cofactors, the jth column of adj A is the transpose of the cofactors of the jth row of A:

    (Aj1, Aj2, ..., Ajn)^T     (2)

Now bij, the ij entry in A(adj A), is obtained by multiplying expressions (1) and (2):

    bij = ai1 Aj1 + ai2 Aj2 + ... + ain Ajn

By Theorem 8.8 and Problem 8.33,

    bij = |A| if i = j,    and    bij = 0 if i ≠ j

Accordingly, A(adj A) is the diagonal matrix with each diagonal element |A|. In other words, A(adj A) = |A| I. Similarly, (adj A)A = |A| I.
8.35. Prove Theorem 8.10 (Cramer's rule): The (square) system AX = B has a unique solution if and only if D ≠ 0. In this case, xi = Ni/D for each i.

By previous results, AX = B has a unique solution if and only if A is invertible, and A is invertible if and only if D = |A| ≠ 0.

Now suppose D ≠ 0. By Theorem 8.9, A^-1 = (1/D)(adj A). Multiplying AX = B by A^-1, we obtain

    X = A^-1 AX = (1/D)(adj A)B     (1)

Note that the ith row of (1/D)(adj A) is (1/D)(A1i, A2i, ..., Ani). If B = (b1, b2, ..., bn)^T, then, by (1),

    xi = (1/D)(b1 A1i + b2 A2i + ... + bn Ani)

However, as in Problem 8.33, b1 A1i + b2 A2i + ... + bn Ani = Ni, the determinant of the matrix obtained by replacing the ith column of A by the column vector B. Thus, xi = (1/D) Ni, as required.
8.36. Prove Theorem 8.12: Suppose M is an upper (lower) triangular block matrix with diagonal blocks A1, A2, ..., An. Then

    det(M) = det(A1) det(A2) ... det(An)

We need only prove the theorem for n = 2—that is, when M is a square matrix of the form M = [A C; 0 B]. The proof of the general theorem follows easily by induction.

Suppose A = [aij] is r-square, B = [bij] is s-square, and M = [mij] is n-square, where n = r + s. By definition,

    det(M) = Σ_{s∈Sn} (sgn s) m1s(1) m2s(2) ... mns(n)

If i > r and j ≤ r, then mij = 0. Thus, we need only consider those permutations s such that

    s{r+1, r+2, ..., r+s} = {r+1, r+2, ..., r+s}    and    s{1, 2, ..., r} = {1, 2, ..., r}

Let s1(k) = s(k) for k ≤ r, and let s2(k) = s(r + k) - r for k ≤ s. Then

    (sgn s) m1s(1) m2s(2) ... mns(n) = (sgn s1) a1s1(1) a2s1(2) ... ar,s1(r) (sgn s2) b1s2(1) b2s2(2) ... bs,s2(s)

which implies det(M) = det(A) det(B).
8.37. Prove Theorem 8.14: There exists a unique function D: M → K such that
(i) D is multilinear, (ii) D is alternating, (iii) D(I) = 1.
This function D is the determinant function; that is, D(A) = |A|.

Let D be the determinant function, D(A) = |A|. We must show that D satisfies (i), (ii), and (iii), and that D is the only function satisfying (i), (ii), and (iii).

By Theorem 8.2, D satisfies (ii) and (iii). Hence, we show that it is multilinear. Suppose the ith row of A = [aij] has the form (bi1 + ci1, bi2 + ci2, ..., bin + cin). Then

    D(A) = D(A1, ..., Bi + Ci, ..., An)
         = Σ_{s∈Sn} (sgn s) a1s(1) ... a(i-1),s(i-1) (bis(i) + cis(i)) ... ans(n)
         = Σ_{s∈Sn} (sgn s) a1s(1) ... bis(i) ... ans(n) + Σ_{s∈Sn} (sgn s) a1s(1) ... cis(i) ... ans(n)
         = D(A1, ..., Bi, ..., An) + D(A1, ..., Ci, ..., An)

Also, by Theorem 8.3(ii),

    D(A1, ..., kAi, ..., An) = k D(A1, ..., Ai, ..., An)

Thus, D is multilinear—D satisfies (i).

We next must prove the uniqueness of D. Suppose D satisfies (i), (ii), and (iii). If {e1, ..., en} is the usual basis of K^n, then, by (iii), D(e1, e2, ..., en) = D(I) = 1. Using (ii), we also have that

    D(ei1, ei2, ..., ein) = sgn s,    where s = i1 i2 ... in     (1)

Now suppose A = [aij]. Observe that the kth row Ak of A is

    Ak = (ak1, ak2, ..., akn) = ak1 e1 + ak2 e2 + ... + akn en

Thus,

    D(A) = D(a11 e1 + ... + a1n en, a21 e1 + ... + a2n en, ..., an1 e1 + ... + ann en)

Using the multilinearity of D, we can write D(A) as a sum of terms of the form

    D(A) = Σ D(a1i1 ei1, a2i2 ei2, ..., anin ein) = Σ (a1i1 a2i2 ... anin) D(ei1, ei2, ..., ein)     (2)

where the sum is over all sequences i1 i2 ... in, where ik ∈ {1, ..., n}. If two of the indices are equal, say ij = ik but j ≠ k, then, by (ii),

    D(ei1, ei2, ..., ein) = 0

Accordingly, the sum in (2) need only be summed over all permutations s = i1 i2 ... in. Using (1), we finally have that

    D(A) = Σ_s (a1i1 a2i2 ... anin) D(ei1, ei2, ..., ein) = Σ_s (sgn s) a1i1 a2i2 ... anin,    where s = i1 i2 ... in

Hence, D is the determinant function, and so the theorem is proved.
SUPPLEMENTARY PROBLEMS
Computation of Determinants
8.38. Evaluate:
(a)
2 6
4 1







, (b)
5 1
3 2







, (c)
2 8
5 3







, (d)
4 9
1 3







, (e)
a þ b a
b a þ b








8.39. Find all t such that   (a) | t-4   3  | = 0,    (b) | t-1   4  | = 0
                                 |  2   t-9 |              |  3   t-2 |
8.40. Compute the determinant of each of the following matrices:
(a)
2 1 1
0 5 2
1 3 4
2
4
3
5, (b)
3 2 4
2 5 1
0 6 1
2
4
3
5, (c)
2 1 4
6 3 2
4 1 2
2
4
3
5, (d)
7 6 5
1 2 1
3 2 1
2
4
3
5
8.41. Find the determinant of each of the following matrices:
(a)
1 2 2 3
1 0 2 0
3 1 1 2
4 3 0 2
2
6
6
4
3
7
7
5, (b)
2 1 3 2
3 0 1 2
1 1 4 3
2 2 1 1
2
6
6
4
3
7
7
5
8.42. Evaluate:
(a)
2 1 3 4
2 1 2 1
3 3 5 4
5 2 1 4
















, (b)
2 1 4 3
1 1 0 2
3 2 3 1
1 2 2 3
















, (c)
1 2 3 1
1 1 2 0
2 0 4 5
1 4 4 6
















8.43. Evaluate each of the following determinants:
(a)
1 2 1 3 1
2 1 1 2 3
3 1 0 2 1
5 1 2 3 4
2 3 1 1 2




















, (b)
1 3 5 7 9
2 4 2 4 2
0 0 1 2 3
0 0 5 6 2
0 0 2 3 1




















, (c)
1 2 3 4 5
5 4 3 2 1
0 0 6 5 1
0 0 0 7 4
0 0 0 2 3




















Cofactors, Classical Adjoints, Inverses
8.44. Find detðAÞ, adj A, and A1
, where
(a) A ¼
1 1 0
1 1 1
0 2 1
2
4
3
5, (b) A ¼
1 2 2
3 1 0
1 1 1
2
4
3
5
8.45. Find the classical adjoint of each matrix in Problem 8.41.
8.46. Let A = [ a b ]. (a) Find adj A. (b) Show that adj(adj A) = A. (c) When does A = adj A?
              [ c d ]
8.47. Show that if A is diagonal (triangular) then adj A is diagonal (triangular).
8.48. Suppose A = [aij] is triangular. Show that
(a) A is invertible if and only if each diagonal element aii ≠ 0.
(b) The diagonal elements of A^-1 (if it exists) are aii^-1, the reciprocals of the diagonal elements of A.
Minors, Principal Minors
8.49. Let A ¼
1 2 3 2
1 0 2 3
3 1 2 5
4 3 0 1
2
6
6
4
3
7
7
5 and B ¼
1 3 1 5
2 3 1 4
0 5 2 1
3 0 5 2
2
6
6
4
3
7
7
5. Find the minor and the signed minor
corresponding to the following submatrices:
(a) Að1; 4; 3; 4Þ, (b) Bð1; 4; 3; 4Þ, (c) Að2; 3; 2; 4Þ, (d) Bð2; 3; 2; 4Þ.
8.50. For k ¼ 1; 2; 3, find the sum Sk of all principal minors of order k for
(a) A ¼
1 3 2
2 4 3
5 2 1
2
4
3
5, (b) B ¼
1 5 4
2 6 1
3 2 0
2
4
3
5, (c) C ¼
1 4 3
2 1 5
4 7 11
2
4
3
5
8.51. For k ¼ 1; 2; 3; 4, find the sum Sk of all principal minors of order k for
(a) A ¼
1 2 3 1
1 2 0 5
0 1 2 2
4 0 1 3
2
6
6
4
3
7
7
5, (b) B ¼
1 2 1 2
0 1 2 3
1 3 0 4
2 7 4 5
2
6
6
4
3
7
7
5
Determinants and Linear Equations
8.52. Solve the following systems by determinants:
(a)
3x þ 5y ¼ 8
4x  2y ¼ 1
, (b)
2x  3y ¼ 1
4x þ 7y ¼ 1
, (c)
ax  2by ¼ c
3ax  5by ¼ 2c
ðab 6¼ 0Þ
8.53. Solve the following systems by determinants:
(a)
2x  5y þ 2z ¼ 2
x þ 2y  4z ¼ 5
3x  4y  6z ¼ 1
8

:
, (b)
2z þ 3 ¼ y þ 3x
x  3z ¼ 2y þ 1
3y þ z ¼ 2  2x
8

:
8.54. Prove Theorem 8.11: The system AX ¼ 0 has a nonzero solution if and only if D ¼ jAj ¼ 0.
Permutations
8.55. Find the parity of the permutations s ¼ 32154, t ¼ 13524, p ¼ 42531 in S5.
8.56. For the permutations in Problem 8.55, find
(a) t  s, (b) p  s, (c) s1
, (d) t1
.
8.57. Let t ∈ Sn. Show that t∘s runs through Sn as s runs through Sn; that is, Sn = {t∘s : s ∈ Sn}.
8.58. Let s 2 Sn have the property that sðnÞ ¼ n. Let s* 2 Sn1 be defined by s*ðxÞ ¼ sðxÞ.
(a) Show that sgn s* ¼ sgn s,
(b) Show that as s runs through Sn, where sðnÞ ¼ n, s* runs through Sn1; that is,
Sn1 ¼ fs* : s 2 Sn; sðnÞ ¼ ng:
8.59. Consider a permutation s = j1 j2 ... jn. Let {ei} be the usual basis of K^n, and let A be the matrix whose ith row is e_ji [i.e., A = (e_j1, e_j2, ..., e_jn)]. Show that |A| = sgn s.
Determinant of Linear Operators
8.60. Find the determinant of each of the following linear transformations:
(a) T: R^2 → R^2 defined by T(x, y) = (2x - 9y, 3x - 5y),
(b) T: R^3 → R^3 defined by T(x, y, z) = (3x - 2z, 5y + 7z, x + y + z),
(c) T: R^3 → R^2 defined by T(x, y, z) = (2x + 7y - 4z, 4x - 6y + 2z).
8.61. Let D: V → V be the differential operator; that is, D(f(t)) = df/dt. Find det(D) if V is the vector space of functions with the following bases: (a) {1, t, ..., t^5}, (b) {e^t, e^(2t), e^(3t)}, (c) {sin t, cos t}.
8.62. Prove Theorem 8.13: Let F and G be linear operators on a vector space V. Then
(i) det(F∘G) = det(F) det(G), (ii) F is invertible if and only if det(F) ≠ 0.
8.63. Prove (a) det(1_V) = 1, where 1_V is the identity operator, (b) det(T^-1) = det(T)^-1 when T is invertible.
Miscellaneous Problems
8.64. Find the volume VðSÞ of the parallelopiped S in R3
determined by the following vectors:
(a) u1 ¼ ð1; 2; 3Þ, u2 ¼ ð3; 4; 1Þ, u3 ¼ ð2; 1; 5Þ,
(b) u1 ¼ ð1; 1; 3Þ, u2 ¼ ð1; 2; 4Þ, u3 ¼ ð4; 1; 5Þ.
8.65. Find the volume VðSÞ of the parallelepiped S in R4
determined by the following vectors:
u1 ¼ ð1; 2; 5; 1Þ; u2 ¼ ð2; 1; 2; 1Þ; u3 ¼ ð3; 0; 1  2Þ; u4 ¼ ð1; 1; 4; 1Þ
8.66. Let V be the space of 2 2 matrices M ¼
a b
c d
 
over R. Determine whether D:V ! R is 2-linear (with
respect to the rows), where
ðaÞ DðMÞ ¼ a þ d;
ðbÞ DðMÞ ¼ ad;
ðcÞ DðMÞ ¼ ac  bd;
ðdÞ DðMÞ ¼ ab  cd;
ðeÞ DðMÞ ¼ 0
ðfÞ DðMÞ ¼ 1
8.67. Let A be an n-square matrix. Prove |kA| = k^n |A|.
8.68. Let A, B, C, D be commuting n-square matrices. Consider the 2n-square block matrix M = [A B; C D]. Prove that |M| = |A||D| - |B||C|. Show that the result may not be true if the matrices do not commute.
8.69. Suppose A is orthogonal; that is, A^T A = I. Show that det(A) = ±1.
8.70. Let V be the space of m-square matrices viewed as m-tuples of row vectors. Suppose D: V → K is m-linear and alternating. Show that
(a) D(..., A, ..., B, ...) = -D(..., B, ..., A, ...); the sign changes when two rows are interchanged.
(b) If A1, A2, ..., Am are linearly dependent, then D(A1, A2, ..., Am) = 0.
8.71. Let V be the space of m-square matrices (as above), and suppose D: V → K. Show that the following weaker statement is equivalent to D being alternating:

    D(A1, A2, ..., An) = 0 whenever Ai = A(i+1) for some i

Let V be the space of n-square matrices over K. Suppose B ∈ V is invertible and so det(B) ≠ 0. Define D: V → K by D(A) = det(AB)/det(B), where A ∈ V. Hence,

    D(A1, A2, ..., An) = det(A1 B, A2 B, ..., An B)/det(B)

where Ai is the ith row of A, and so Ai B is the ith row of AB. Show that D is multilinear and alternating, and that D(I) = 1. (This method is used by some texts to prove that |AB| = |A||B|.)
8.72. Show that g = g(x1, ..., xn) = (-1)^n V_{n-1}(x), where g = g(xi) is the difference product of Problem 8.19, x = xn, and V_{n-1} is the Vandermonde determinant defined by

    V_{n-1}(x) = |    1         1      ...      1          1      |
                 |    x1        x2     ...   x_{n-1}       x      |
                 |   x1^2      x2^2    ...  x_{n-1}^2     x^2     |
                 |  ............................................  |
                 | x1^(n-1)  x2^(n-1)  ... x_{n-1}^(n-1)  x^(n-1) |
8.73. Let A be any matrix. Show that the signs of a minor A(I; J) and its complementary minor A(I'; J') are equal.
8.74. Let A be an n-square matrix. The determinantal rank of A is the order of the largest square submatrix of A
(obtained by deleting rows and columns of A) whose determinant is not zero. Show that the determinantal
rank of A is equal to its rank—the maximum number of linearly independent rows (or columns).
ANSWERS TO SUPPLEMENTARY PROBLEMS
Notation: M = [R1; R2; ...] denotes a matrix with rows R1, R2, ....
8.38. (a) 22, (b) 13, (c) 46, (d) 21, (e) a^2 + ab + b^2
8.39. (a) 3, 10;  (b) 5, -2
8.40. (a) 21, (b) 11, (c) 100, (d) 0
8.41. (a) 131, (b) 55
8.42. (a) 33, (b) 0, (c) 45
8.43. (a) 32, (b) 14, (c) 468
8.44. (a) |A| = -2, adj A = [-1, -1, 1;  -1, 1, -1;  2, -2, 0];
      (b) |A| = -1, adj A = [1, 0, -2;  -3, -1, 6;  2, 1, -5]. Also, A^-1 = (adj A)/|A|
8.45. (a) ½16; 29; 26; 2; 30; 38; 16; 29; 8; 51; 13; 1; 13; 1; 28; 18,
(b) ½21; 14; 17; 19; 44; 11; 33; 11; 29; 1; 13; 21; 17; 7; 19; 18
8.46. (a) adj A ¼ ½d; b; c; a, (c) A ¼ kI
8.49. (a) 3; 3, (b) 23; 23, (c) 3; 3, (d) 17; 17
8.50. (a) 2; 17; 73, (b) 7; 10; 105, (c) 13; 54; 0
8.51. (a) 6; 13; 62; 219; (b) 7; 37; 30; 20
8.52. (a) x = 21/26, y = 29/26;   (b) x = -5/13, y = 1/13;   (c) x = -c/a, y = -c/b
8.53. (a) x ¼ 5; y ¼ 2; z ¼ 1, (b) Because D ¼ 0, the system cannot be solved by determinants.
8.55. sgn s = 1, sgn t = -1, sgn p = -1
8.56. (a) t∘s = 53142, (b) p∘s = 52413, (c) s^-1 = 32154, (d) t^-1 = 14253
8.60. (a) detðTÞ ¼ 17, (b) detðTÞ ¼ 4, (c) not defined
8.61. (a) 0, (b) 6, (c) 1
8.64. (a) 18, (b) 0
8.65. 17
8.66. (a) no, (b) yes, (c) yes, (d) no, (e) yes, (f ) no
CHAPTER 9
Diagonalization:
Eigenvalues and Eigenvectors
9.1 Introduction
The ideas in this chapter can be discussed from two points of view.
Matrix Point of View
Suppose an n-square matrix A is given. The matrix A is said to be diagonalizable if there exists a
nonsingular matrix P such that
    B = P^-1 AP
is diagonal. This chapter discusses the diagonalization of a matrix A. In particular, an algorithm is given
to find the matrix P when it exists.
Linear Operator Point of View
Suppose a linear operator T: V ! V is given. The linear operator T is said to be diagonalizable if there
exists a basis S of V such that the matrix representation of T relative to the basis S is a diagonal matrix D.
This chapter discusses conditions under which the linear operator T is diagonalizable.
Equivalence of the Two Points of View
The above two concepts are essentially the same. Specifically, a square matrix A may be viewed as a
linear operator F defined by
FðXÞ ¼ AX
where X is a column vector, and B = P^-1 AP represents F relative to a new coordinate system (basis)
S whose elements are the columns of P. On the other hand, any linear operator T can be represented by a
matrix A relative to one basis and, when a second basis is chosen, T is represented by the matrix
    B = P^-1 AP
where P is the change-of-basis matrix.
Most theorems will be stated in two ways: one in terms of matrices A and again in terms of linear
mappings T.
Role of Underlying Field K
The underlying number field K did not play any special role in our previous discussions on vector spaces
and linear mappings. However, the diagonalization of a matrix A or a linear operator T will depend on the
roots of a polynomial D(t) over K, and these roots do depend on K. For example, suppose D(t) = t^2 + 1. Then D(t) has no roots if K = R, the real field; but D(t) has roots ±i if K = C, the complex field.
Furthermore, finding the roots of a polynomial with degree greater than two is a subject unto itself
(frequently discussed in numerical analysis courses). Accordingly, our examples will usually lead to
those polynomials DðtÞ whose roots can be easily determined.
9.2 Polynomials of Matrices
Consider a polynomial f(t) = an t^n + ... + a1 t + a0 over a field K. Recall (Section 2.8) that if A is any square matrix, then we define

    f(A) = an A^n + ... + a1 A + a0 I
where I is the identity matrix. In particular, we say that A is a root of f ðtÞ if f ðAÞ ¼ 0, the zero matrix.
EXAMPLE 9.1 Let A = [ 1 2 ]. Then A^2 = [  7 10 ]. Let
                    [ 3 4 ]             [ 15 22 ]

    f(t) = 2t^2 - 3t + 5    and    g(t) = t^2 - 5t - 2

Then

    f(A) = 2A^2 - 3A + 5I = [ 14 20 ] + [ -3  -6 ] + [ 5 0 ] = [ 16 14 ]
                            [ 30 44 ]   [ -9 -12 ]   [ 0 5 ]   [ 21 37 ]
and
    g(A) = A^2 - 5A - 2I = [  7 10 ] + [  -5 -10 ] + [ -2  0 ] = [ 0 0 ]
                           [ 15 22 ]   [ -15 -20 ]   [  0 -2 ]   [ 0 0 ]

Thus, A is a zero of g(t).
The following theorem (proved in Problem 9.7) applies.
THEOREM 9.1: Let f and g be polynomials. For any square matrix A and scalar k,
    (i) (f + g)(A) = f(A) + g(A)        (iii) (kf)(A) = k f(A)
    (ii) (fg)(A) = f(A) g(A)            (iv) f(A) g(A) = g(A) f(A)
Observe that (iv) tells us that any two polynomials in A commute.
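Evaluating a polynomial at a matrix is a small exercise in itself; the sketch below (illustrative, with a made-up helper name) uses Horner's rule and reproduces f(A) and g(A) of Example 9.1:

```python
import numpy as np

def poly_of_matrix(coeffs, A):
    """Evaluate f(A) = an A^n + ... + a1 A + a0 I; coeffs = [an, ..., a1, a0]."""
    A = np.asarray(A, dtype=float)
    result = np.zeros_like(A)
    for c in coeffs:                         # Horner's rule
        result = result @ A + c * np.eye(len(A))
    return result

A = [[1, 2], [3, 4]]
print(poly_of_matrix([2, -3, 5], A))     # f(t) = 2t^2 - 3t + 5 -> [[16. 14.] [21. 37.]]
print(poly_of_matrix([1, -5, -2], A))    # g(t) = t^2 - 5t - 2  -> the zero matrix
```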
Matrices and Linear Operators
Now suppose that T: V → V is a linear operator on a vector space V. Powers of T are defined by the composition operation:

    T^2 = T∘T,   T^3 = T^2∘T,   ...

Also, for any polynomial f(t) = an t^n + ... + a1 t + a0, we define f(T) in the same way as we did for matrices:

    f(T) = an T^n + ... + a1 T + a0 I

where I is now the identity mapping. We also say that T is a zero or root of f(t) if f(T) = 0, the zero mapping. We note that the relations in Theorem 9.1 hold for linear operators as they do for matrices.
Remark: Suppose A is a matrix representation of a linear operator T. Then f ðAÞ is the matrix
representation of f ðTÞ, and, in particular, f ðTÞ ¼ 0 if and only if f ðAÞ ¼ 0.
9.3 Characteristic Polynomial, Cayley–Hamilton Theorem
Let A ¼ ½aij be an n-square matrix. The matrix M ¼ A  tIn, where In is the n-square identity matrix and
t is an indeterminate, may be obtained by subtracting t down the diagonal of A. The negative of M is the
matrix tIn - A, and its determinant

    D(t) = det(tIn - A) = (-1)^n det(A - tIn)
which is a polynomial in t of degree n and is called the characteristic polynomial of A.
We state an important theorem in linear algebra (proved in Problem 9.8).
THEOREM 9.2: (Cayley–Hamilton) Every matrix A is a root of its characteristic polynomial.
Remark: Suppose A ¼ ½aij is a triangular matrix. Then tI  A is a triangular matrix with diagonal
entries t - aii; hence,

    D(t) = det(tI - A) = (t - a11)(t - a22) ... (t - ann)
Observe that the roots of DðtÞ are the diagonal elements of A.
EXAMPLE 9.2 Let A = [ 1 3 ]. Its characteristic polynomial is
                    [ 4 5 ]

    D(t) = |tI - A| = | t-1   -3  |  =  (t - 1)(t - 5) - 12  =  t^2 - 6t - 7
                      |  -4   t-5 |

As expected from the Cayley–Hamilton theorem, A is a root of D(t); that is,

    D(A) = A^2 - 6A - 7I = [ 13 18 ] + [  -6 -18 ] + [ -7  0 ] = [ 0 0 ]
                           [ 24 37 ]   [ -24 -30 ]   [  0 -7 ]   [ 0 0 ]
Now suppose A and B are similar matrices, say B = P^-1 AP, where P is invertible. We show that A and B have the same characteristic polynomial. Using tI = P^-1 (tI) P, we have

    DB(t) = det(tI - B) = det(tI - P^-1 AP) = det(P^-1 (tI) P - P^-1 AP)
          = det[P^-1 (tI - A) P] = det(P^-1) det(tI - A) det(P)

Using the fact that determinants are scalars and commute and that det(P^-1) det(P) = 1, we finally obtain

    DB(t) = det(tI - A) = DA(t)
Thus, we have proved the following theorem.
THEOREM 9.3: Similar matrices have the same characteristic polynomial.
Characteristic Polynomials of Degrees 2 and 3
There are simple formulas for the characteristic polynomials of matrices of orders 2 and 3.
(a) Suppose A = [ a11 a12 ]. Then
                [ a21 a22 ]

    D(t) = t^2 - (a11 + a22) t + det(A) = t^2 - tr(A) t + det(A)

Here tr(A) denotes the trace of A—that is, the sum of the diagonal elements of A.

(b) Suppose A = [ a11 a12 a13 ]. Then
                [ a21 a22 a23 ]
                [ a31 a32 a33 ]

    D(t) = t^3 - tr(A) t^2 + (A11 + A22 + A33) t - det(A)

(Here A11, A22, A33 denote, respectively, the cofactors of a11, a22, a33.)
EXAMPLE 9.3 Find the characteristic polynomial of each of the following matrices:

    (a) A = [ 5  3 ],    (b) B = [ 7 -1 ],    (c) C = [ 5 -2 ]
            [ 2 10 ]             [ 6  2 ]             [ 4 -4 ]

(a) We have tr(A) = 5 + 10 = 15 and |A| = 50 - 6 = 44; hence, D(t) = t^2 - 15t + 44.
(b) We have tr(B) = 7 + 2 = 9 and |B| = 14 + 6 = 20; hence, D(t) = t^2 - 9t + 20.
(c) We have tr(C) = 5 - 4 = 1 and |C| = -20 + 8 = -12; hence, D(t) = t^2 - t - 12.
EXAMPLE 9.4 Find the characteristic polynomial of A = [ 1 1 2 ]
                                                      [ 0 3 2 ]
                                                      [ 1 3 9 ]

We have tr(A) = 1 + 3 + 9 = 13. The cofactors of the diagonal elements are as follows:

    A11 = | 3 2 | = 21,    A22 = | 1 2 | = 7,    A33 = | 1 1 | = 3
          | 3 9 |                | 1 9 |               | 0 3 |

Thus, A11 + A22 + A33 = 31. Also, |A| = 27 + 2 + 0 - 6 - 6 - 0 = 17. Accordingly,

    D(t) = t^3 - 13t^2 + 31t - 17
Remark: The coefficients of the characteristic polynomial DðtÞ of the 3-square matrix A are, with
alternating signs, as follows:
S1 ¼ trðAÞ; S2 ¼ A11 þ A22 þ A33; S3 ¼ detðAÞ
We note that each Sk is the sum of all principal minors of A of order k.
The next theorem, whose proof lies beyond the scope of this text, tells us that this result is true in
general.
THEOREM 9.4: Let A be an n-square matrix. Then its characteristic polynomial is

    D(t) = t^n - S1 t^(n-1) + S2 t^(n-2) - ... + (-1)^n Sn

where Sk is the sum of the principal minors of order k.
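For the matrix of Example 9.4 this can be checked with NumPy (an illustrative aside; np.poly returns the coefficients of det(tI - A), highest power first):

```python
import numpy as np

A = np.array([[1, 1, 2], [0, 3, 2], [1, 3, 9]], dtype=float)
print(np.round(np.poly(A)))          # [  1. -13.  31. -17.]  i.e.  t^3 - 13t^2 + 31t - 17

# The same coefficients via Theorem 9.4: S1, S2, S3 are sums of principal minors.
S1 = np.trace(A)
S2 = sum(np.linalg.det(A[np.ix_(idx, idx)]) for idx in [(0, 1), (0, 2), (1, 2)])
S3 = np.linalg.det(A)
print(S1, round(S2), round(S3))      # 13.0 31 17
```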
Characteristic Polynomial of a Linear Operator
Now suppose T: V ! V is a linear operator on a vector space V of finite dimension. We define the
characteristic polynomial DðtÞ of T to be the characteristic polynomial of any matrix representation of T.
Recall that if A and B are matrix representations of T, then B = P^-1 AP, where P is a change-of-basis
matrix. Thus, A and B are similar, and by Theorem 9.3, A and B have the same characteristic polynomial.
Accordingly, the characteristic polynomial of T is independent of the particular basis in which the matrix
representation of T is computed.
Because f ðTÞ ¼ 0 if and only if f ðAÞ ¼ 0, where f ðtÞ is any polynomial and A is any matrix
representation of T, we have the following analogous theorem for linear operators.
THEOREM 9.2′: (Cayley–Hamilton) A linear operator T is a zero of its characteristic polynomial.
9.4 Diagonalization, Eigenvalues and Eigenvectors
Let A be any n-square matrix. Then A can be represented by (or is similar to) a diagonal matrix D = diag(k1, k2, ..., kn) if and only if there exists a basis S consisting of (column) vectors u1, u2, ..., un such that

    A u1 = k1 u1,   A u2 = k2 u2,   ...,   A un = kn un

In such a case, A is said to be diagonalizable. Furthermore, D = P^-1 AP, where P is the nonsingular matrix whose columns are, respectively, the basis vectors u1, u2, ..., un.
The above observation leads us to the following definition.
DEFINITION: Let A be any square matrix. A scalar λ is called an eigenvalue of A if there exists a nonzero (column) vector v such that

    Av = λv

Any vector satisfying this relation is called an eigenvector of A belonging to the eigenvalue λ.

We note that each scalar multiple kv of an eigenvector v belonging to λ is also such an eigenvector, because

    A(kv) = k(Av) = k(λv) = λ(kv)

The set E_λ of all such eigenvectors is a subspace of V (Problem 9.19), called the eigenspace of λ. (If dim E_λ = 1, then E_λ is called an eigenline and λ is called a scaling factor.)

The terms characteristic value and characteristic vector (or proper value and proper vector) are sometimes used instead of eigenvalue and eigenvector.
The above observation and definitions give us the following theorem.
THEOREM 9.5: An n-square matrix A is similar to a diagonal matrix D if and only if A has n linearly
independent eigenvectors. In this case, the diagonal elements of D are the corresponding
eigenvalues and D = P^-1 AP, where P is the matrix whose columns are the eigenvectors.
Suppose a matrix A can be diagonalized as above, say P^-1 AP = D, where D is diagonal. Then A has the extremely useful diagonal factorization:

    A = P D P^-1

Using this factorization, the algebra of A reduces to the algebra of the diagonal matrix D, which can be easily calculated. Specifically, suppose D = diag(k1, k2, ..., kn). Then

    A^m = (P D P^-1)^m = P D^m P^-1 = P diag(k1^m, ..., kn^m) P^-1

More generally, for any polynomial f(t),

    f(A) = f(P D P^-1) = P f(D) P^-1 = P diag(f(k1), f(k2), ..., f(kn)) P^-1
Furthermore, if the diagonal entries of D are nonnegative, let

    B = P diag(√k1, √k2, ..., √kn) P^-1

Then B is a nonnegative square root of A; that is, B^2 = A and the eigenvalues of B are nonnegative.
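The factorization is exactly what numerical libraries exploit; the sketch below (illustrative only, anticipating Example 9.5) recovers A^4 and a square root of A = [[3, 1], [2, 2]] from its eigenvalues and eigenvectors:

```python
import numpy as np

A = np.array([[3.0, 1.0], [2.0, 2.0]])
eigvals, P = np.linalg.eig(A)                 # columns of P are eigenvectors; eigenvalues 1 and 4
Pinv = np.linalg.inv(P)

print(np.round(P @ np.diag(eigvals**4) @ Pinv))   # A^4 = [[171. 85.] [170. 86.]]
B = P @ np.diag(np.sqrt(eigvals)) @ Pinv          # a square root of A
print(np.allclose(B @ B, A))                      # True
```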
EXAMPLE 9.5 Let A = [ 3 1 ] and let v1 = [  1 ] and v2 = [ 1 ]. Then
                    [ 2 2 ]              [ -2 ]          [ 1 ]

    A v1 = [ 3 1 ] [  1 ] = [  1 ] = v1        and        A v2 = [ 3 1 ] [ 1 ] = [ 4 ] = 4 v2
           [ 2 2 ] [ -2 ]   [ -2 ]                               [ 2 2 ] [ 1 ]   [ 4 ]

Thus, v1 and v2 are eigenvectors of A belonging, respectively, to the eigenvalues λ1 = 1 and λ2 = 4. Observe that v1 and v2 are linearly independent and hence form a basis of R^2. Accordingly, A is diagonalizable. Furthermore, let P be the matrix whose columns are the eigenvectors v1 and v2. That is, let

    P = [  1  1 ],    and so    P^-1 = [ 1/3  -1/3 ]
        [ -2  1 ]                      [ 2/3   1/3 ]

Then A is similar to the diagonal matrix

    D = P^-1 A P = [ 1/3  -1/3 ] [ 3 1 ] [  1  1 ] = [ 1 0 ]
                   [ 2/3   1/3 ] [ 2 2 ] [ -2  1 ]   [ 0 4 ]

As expected, the diagonal elements 1 and 4 in D are the eigenvalues corresponding, respectively, to the eigenvectors v1 and v2, which are the columns of P. In particular, A has the factorization

    A = P D P^-1 = [  1  1 ] [ 1 0 ] [ 1/3  -1/3 ]
                   [ -2  1 ] [ 0 4 ] [ 2/3   1/3 ]

Accordingly,

    A^4 = [  1  1 ] [ 1   0  ] [ 1/3  -1/3 ] = [ 171  85 ]
          [ -2  1 ] [ 0  256 ] [ 2/3   1/3 ]   [ 170  86 ]

Moreover, suppose f(t) = t^3 - 5t^2 + 3t + 6; hence, f(1) = 5 and f(4) = 2. Then

    f(A) = P f(D) P^-1 = [  1  1 ] [ 5 0 ] [ 1/3  -1/3 ] = [  3 -1 ]
                         [ -2  1 ] [ 0 2 ] [ 2/3   1/3 ]   [ -2  4 ]

Last, we obtain a "positive square root" of A. Specifically, using √1 = 1 and √4 = 2, we obtain the matrix

    B = P √D P^-1 = [  1  1 ] [ 1 0 ] [ 1/3  -1/3 ] = [ 5/3  1/3 ]
                    [ -2  1 ] [ 0 2 ] [ 2/3   1/3 ]   [ 2/3  4/3 ]

where B^2 = A and where B has positive eigenvalues 1 and 2.
Remark: Throughout this chapter, we use the following fact:
If P ¼
a b
c d
 
; then P1
¼
d=jPj b=jPj
c=jPj a=jPj
 
:
That is, P1
is obtained by interchanging the diagonal elements a and d of P, taking the negatives of the
nondiagonal elements b and c, and dividing each element by the determinant jPj.
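As a small editorial sketch (not part of the text), the 2×2 inverse rule in the Remark can be coded directly; the helper name inv2x2 is our own.

    import numpy as np

    def inv2x2(P):
        """Inverse of a 2x2 matrix via the adjugate rule quoted in the Remark."""
        a, b = P[0]
        c, d = P[1]
        det = a*d - b*c
        if det == 0:
            raise ValueError("P is singular")
        return np.array([[ d, -b],
                         [-c,  a]]) / det

    P = np.array([[ 1.0, 1.0],
                  [-2.0, 1.0]])        # the matrix P of Example 9.5
    print(inv2x2(P))                    # [[ 1/3, -1/3], [ 2/3, 1/3]]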
Properties of Eigenvalues and Eigenvectors
Example 9.5 indicates the advantages of a diagonal representation (factorization) of a square matrix. In
the following theorem (proved in Problem 9.20), we list properties that help us to find such a
representation.
THEOREM 9.6: Let A be a square matrix. Then the following are equivalent.
(i) A scalar λ is an eigenvalue of A.
(ii) The matrix M = A - λI is singular.
(iii) The scalar λ is a root of the characteristic polynomial Δ(t) of A.

The eigenspace E_λ of an eigenvalue λ is the solution space of the homogeneous system MX = 0, where M = A - λI; that is, M is obtained by subtracting λ down the diagonal of A.
Some matrices have no eigenvalues and hence no eigenvectors. However, using Theorem 9.6 and the Fundamental Theorem of Algebra (every polynomial over the complex field C has a root), we obtain the following result.

THEOREM 9.7: Let A be a square matrix over the complex field C. Then A has at least one eigenvalue.

The following theorems will be used subsequently. (The theorem equivalent to Theorem 9.8 for linear operators is proved in Problem 9.21, and Theorem 9.9 is proved in Problem 9.22.)

THEOREM 9.8: Suppose v_1, v_2, ..., v_n are nonzero eigenvectors of a matrix A belonging to distinct eigenvalues λ_1, λ_2, ..., λ_n. Then v_1, v_2, ..., v_n are linearly independent.

THEOREM 9.9: Suppose the characteristic polynomial Δ(t) of an n-square matrix A is a product of n distinct factors, say Δ(t) = (t - a_1)(t - a_2) ··· (t - a_n). Then A is similar to the diagonal matrix D = diag(a_1, a_2, ..., a_n).

If λ is an eigenvalue of a matrix A, then the algebraic multiplicity of λ is defined to be the multiplicity of λ as a root of the characteristic polynomial of A, and the geometric multiplicity of λ is defined to be the dimension of its eigenspace, dim E_λ. The following theorem (whose equivalent for linear operators is proved in Problem 9.23) holds.

THEOREM 9.10: The geometric multiplicity of an eigenvalue λ of a matrix A does not exceed its algebraic multiplicity.
Diagonalization of Linear Operators

Consider a linear operator T: V → V. Then T is said to be diagonalizable if it can be represented by a diagonal matrix D. Thus, T is diagonalizable if and only if there exists a basis S = {u_1, u_2, ..., u_n} of V for which

    T(u_1) = k_1 u_1,   T(u_2) = k_2 u_2,   ...,   T(u_n) = k_n u_n

In such a case, T is represented by the diagonal matrix
    D = diag(k_1, k_2, ..., k_n)
relative to the basis S.
The above observation leads us to the following definitions and theorems, which are analogous to the definitions and theorems for matrices discussed above.

DEFINITION: Let T be a linear operator. A scalar λ is called an eigenvalue of T if there exists a nonzero vector v such that T(v) = λv. Every vector satisfying this relation is called an eigenvector of T belonging to the eigenvalue λ.

The set E_λ of all eigenvectors belonging to an eigenvalue λ is a subspace of V, called the eigenspace of λ. (Alternatively, λ is an eigenvalue of T if λI - T is singular, and, in this case, E_λ is the kernel of λI - T.) The algebraic and geometric multiplicities of an eigenvalue λ of a linear operator T are defined in the same way as those of an eigenvalue of a matrix A.
The following theorems apply to a linear operator T on a vector space V of finite dimension.

THEOREM 9.5′: T can be represented by a diagonal matrix D if and only if there exists a basis S of V consisting of eigenvectors of T. In this case, the diagonal elements of D are the corresponding eigenvalues.

THEOREM 9.6′: Let T be a linear operator. Then the following are equivalent:
(i) A scalar λ is an eigenvalue of T.
(ii) The linear operator λI - T is singular.
(iii) The scalar λ is a root of the characteristic polynomial Δ(t) of T.

THEOREM 9.7′: Suppose V is a complex vector space. Then T has at least one eigenvalue.

THEOREM 9.8′: Suppose v_1, v_2, ..., v_n are nonzero eigenvectors of a linear operator T belonging to distinct eigenvalues λ_1, λ_2, ..., λ_n. Then v_1, v_2, ..., v_n are linearly independent.

THEOREM 9.9′: Suppose the characteristic polynomial Δ(t) of T is a product of n distinct factors, say Δ(t) = (t - a_1)(t - a_2) ··· (t - a_n). Then T can be represented by the diagonal matrix D = diag(a_1, a_2, ..., a_n).

THEOREM 9.10′: The geometric multiplicity of an eigenvalue λ of T does not exceed its algebraic multiplicity.

Remark: The following theorem reduces the investigation of the diagonalization of a linear operator T to the diagonalization of a matrix A.

THEOREM 9.11: Suppose A is a matrix representation of T. Then T is diagonalizable if and only if A is diagonalizable.
9.5 Computing Eigenvalues and Eigenvectors, Diagonalizing Matrices

This section gives an algorithm for computing eigenvalues and eigenvectors for a given square matrix A and for determining whether or not a nonsingular matrix P exists such that P^{-1}AP is diagonal.

ALGORITHM 9.1: (Diagonalization Algorithm) The input is an n-square matrix A.

Step 1. Find the characteristic polynomial Δ(t) of A.
Step 2. Find the roots of Δ(t) to obtain the eigenvalues of A.
Step 3. Repeat (a) and (b) for each eigenvalue λ of A:
    (a) Form the matrix M = A - λI by subtracting λ down the diagonal of A.
    (b) Find a basis for the solution space of the homogeneous system MX = 0. (These basis vectors are linearly independent eigenvectors of A belonging to λ.)
Step 4. Consider the collection S = {v_1, v_2, ..., v_m} of all eigenvectors obtained in Step 3:
    (a) If m ≠ n, then A is not diagonalizable.
    (b) If m = n, then A is diagonalizable. Specifically, let P be the matrix whose columns are the eigenvectors v_1, v_2, ..., v_n. Then
        D = P^{-1}AP = diag(λ_1, λ_2, ..., λ_n)
    where λ_i is the eigenvalue corresponding to the eigenvector v_i.
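For readers who want to experiment numerically, the following Python/NumPy sketch mirrors the steps of Algorithm 9.1. The helper diagonalize, its tolerance, and its crude grouping of repeated roots are our own illustrative assumptions, not part of the outline.

    import numpy as np

    def diagonalize(A, tol=1e-8):
        """Rough numerical sketch of Algorithm 9.1: return (P, D) with D = P^{-1} A P
        diagonal, or None if too few eigenvectors are found."""
        n = A.shape[0]
        eigenvalues = np.roots(np.poly(A))            # Steps 1-2: roots of the characteristic polynomial
        columns = []
        for lam in np.unique(np.round(eigenvalues, 6)):   # crude grouping of repeated roots
            M = A - lam * np.eye(n)                       # Step 3(a)
            _, s, Vt = np.linalg.svd(M)
            columns.extend(Vt[s <= tol].conj())           # Step 3(b): null-space basis of M
        if len(columns) != n:
            return None                                   # Step 4(a): not diagonalizable
        P = np.column_stack(columns)
        D = np.linalg.inv(P) @ A @ P                      # Step 4(b)
        return P, D

    A = np.array([[4.0, 2.0], [3.0, -1.0]])               # matrix of Example 9.6
    P, D = diagonalize(A)
    print(np.round(D, 6))                                 # diag(5, -2), up to ordering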
EXAMPLE 9.6  The diagonalization algorithm is applied to A = [4 2; 3 -1].

(1) The characteristic polynomial Δ(t) of A is computed. We have
    tr(A) = 4 - 1 = 3,   |A| = -4 - 6 = -10;
hence,
    Δ(t) = t^2 - 3t - 10 = (t - 5)(t + 2)

(2) Set Δ(t) = (t - 5)(t + 2) = 0. The roots λ_1 = 5 and λ_2 = -2 are the eigenvalues of A.

(3) (i) We find an eigenvector v_1 of A belonging to the eigenvalue λ_1 = 5. Subtract λ_1 = 5 down the diagonal of A to obtain the matrix M = [-1 2; 3 -6]. The eigenvectors belonging to λ_1 = 5 form the solution of the homogeneous system MX = 0; that is,
    [-1 2; 3 -6][x; y] = [0; 0]   or   -x + 2y = 0, 3x - 6y = 0   or   x - 2y = 0
The system has only one free variable. Thus, a nonzero solution, for example, v_1 = (2, 1), is an eigenvector that spans the eigenspace of λ_1 = 5.

(ii) We find an eigenvector v_2 of A belonging to the eigenvalue λ_2 = -2. Subtract -2 (or add 2) down the diagonal of A to obtain the matrix
    M = [6 2; 3 1]   and the homogeneous system   6x + 2y = 0, 3x + y = 0   or   3x + y = 0
The system has only one independent solution. Thus, a nonzero solution, say v_2 = (-1, 3), is an eigenvector that spans the eigenspace of λ_2 = -2.

(4) Let P be the matrix whose columns are the eigenvectors v_1 and v_2. Then
    P = [2 -1; 1 3],   and so   P^{-1} = [3/7 1/7; -1/7 2/7]
Accordingly, D = P^{-1}AP is the diagonal matrix whose diagonal entries are the corresponding eigenvalues; that is,
    D = P^{-1}AP = [3/7 1/7; -1/7 2/7] [4 2; 3 -1] [2 -1; 1 3] = [5 0; 0 -2]
EXAMPLE 9.7  Consider the matrix B = [5 -1; 1 3]. We have
    tr(B) = 5 + 3 = 8,   |B| = 15 + 1 = 16;   so   Δ(t) = t^2 - 8t + 16 = (t - 4)^2
Accordingly, λ = 4 is the only eigenvalue of B.
Subtract λ = 4 down the diagonal of B to obtain the matrix
    M = [1 -1; 1 -1]   and the homogeneous system   x - y = 0, x - y = 0   or   x - y = 0
The system has only one independent solution; for example, x = 1, y = 1. Thus, v = (1, 1) and its multiples are the only eigenvectors of B. Accordingly, B is not diagonalizable, because there does not exist a basis consisting of eigenvectors of B.
EXAMPLE 9.8  Consider the matrix A = [3 -5; 2 -3]. Here tr(A) = 3 - 3 = 0 and |A| = -9 + 10 = 1. Thus, Δ(t) = t^2 + 1 is the characteristic polynomial of A. We consider two cases:

(a) A is a matrix over the real field R. Then Δ(t) has no (real) roots. Thus, A has no eigenvalues and no eigenvectors, and so A is not diagonalizable.

(b) A is a matrix over the complex field C. Then Δ(t) = (t - i)(t + i) has two roots, i and -i. Thus, A has two distinct eigenvalues i and -i, and hence A has two independent eigenvectors. Accordingly, there exists a nonsingular matrix P over the complex field C for which
    P^{-1}AP = [i 0; 0 -i]
Therefore, A is diagonalizable (over C).
9.6 Diagonalizing Real Symmetric Matrices and Quadratic Forms

There are many real matrices A that are not diagonalizable. In fact, some real matrices may not have any (real) eigenvalues. However, if A is a real symmetric matrix, then these problems do not exist. Namely, we have the following theorems.

THEOREM 9.12: Let A be a real symmetric matrix. Then each root λ of its characteristic polynomial is real.

THEOREM 9.13: Let A be a real symmetric matrix. Suppose u and v are eigenvectors of A belonging to distinct eigenvalues λ_1 and λ_2. Then u and v are orthogonal; that is, ⟨u, v⟩ = 0.

The above two theorems give us the following fundamental result.

THEOREM 9.14: Let A be a real symmetric matrix. Then there exists an orthogonal matrix P such that D = P^{-1}AP is diagonal.

The orthogonal matrix P is obtained by normalizing a basis of orthogonal eigenvectors of A, as illustrated below. In such a case, we say that A is ''orthogonally diagonalizable.''
EXAMPLE 9.9  Let A = [2 -2; -2 5], a real symmetric matrix. Find an orthogonal matrix P such that P^{-1}AP is diagonal.
First we find the characteristic polynomial Δ(t) of A. We have
    tr(A) = 2 + 5 = 7,   |A| = 10 - 4 = 6;   so   Δ(t) = t^2 - 7t + 6 = (t - 6)(t - 1)
Accordingly, λ_1 = 6 and λ_2 = 1 are the eigenvalues of A.

(a) Subtracting λ_1 = 6 down the diagonal of A yields the matrix
    M = [-4 -2; -2 -1]   and the homogeneous system   -4x - 2y = 0, -2x - y = 0   or   2x + y = 0
A nonzero solution is u_1 = (1, -2).

(b) Subtracting λ_2 = 1 down the diagonal of A yields the matrix
    M = [1 -2; -2 4]   and the homogeneous system   x - 2y = 0
(The second equation drops out, because it is a multiple of the first equation.) A nonzero solution is u_2 = (2, 1).

As expected from Theorem 9.13, u_1 and u_2 are orthogonal. Normalizing u_1 and u_2 yields the orthonormal vectors
    û_1 = (1/√5, -2/√5)   and   û_2 = (2/√5, 1/√5)
Finally, let P be the matrix whose columns are û_1 and û_2, respectively. Then
    P = [1/√5 2/√5; -2/√5 1/√5]   and   P^{-1}AP = [6 0; 0 1]
As expected, the diagonal entries of P^{-1}AP are the eigenvalues corresponding to the columns of P.
The procedure in Example 9.9 is formalized in the following algorithm, which finds an orthogonal matrix P such that P^{-1}AP is diagonal.

ALGORITHM 9.2: (Orthogonal Diagonalization Algorithm) The input is a real symmetric matrix A.

Step 1. Find the characteristic polynomial Δ(t) of A.
Step 2. Find the eigenvalues of A, which are the roots of Δ(t).
Step 3. For each eigenvalue λ of A in Step 2, find an orthogonal basis of its eigenspace.
Step 4. Normalize all eigenvectors in Step 3, which then form an orthonormal basis of R^n.
Step 5. Let P be the matrix whose columns are the normalized eigenvectors in Step 4.
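As an editorial aside, NumPy's eigh routine carries out exactly this kind of orthogonal diagonalization for a real symmetric matrix; the short check below (ours, using the matrix of Example 9.9) is only a numerical illustration.

    import numpy as np

    A = np.array([[ 2.0, -2.0],
                  [-2.0,  5.0]])            # the matrix of Example 9.9

    eigenvalues, P = np.linalg.eigh(A)       # columns of P are orthonormal eigenvectors
    D = np.diag(eigenvalues)

    assert np.allclose(P.T @ P, np.eye(2))   # P is orthogonal, so P^{-1} = P^T
    assert np.allclose(P.T @ A @ P, D)       # P^T A P = diag(1, 6)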
Application to Quadratic Forms

Let q be a real polynomial in variables x_1, x_2, ..., x_n such that every term in q has degree two; that is,
    q(x_1, x_2, ..., x_n) = Σ_i c_i x_i^2 + Σ_{i<j} d_ij x_i x_j,   where c_i, d_ij ∈ R
Then q is called a quadratic form. If there are no cross-product terms x_i x_j (i.e., all d_ij = 0), then q is said to be diagonal.
The above quadratic form q determines a real symmetric matrix A = [a_ij], where a_ii = c_i and a_ij = a_ji = (1/2) d_ij. Namely, q can be written in the matrix form
    q(X) = X^T A X
where X = [x_1, x_2, ..., x_n]^T is the column vector of the variables. Furthermore, suppose X = PY is a linear substitution of the variables. Then substitution in the quadratic form yields
    q(Y) = (PY)^T A (PY) = Y^T (P^T A P) Y
Thus, P^T A P is the matrix representation of q in the new variables.
We seek an orthogonal matrix P such that the orthogonal substitution X = PY yields a diagonal quadratic form, for which P^T A P is diagonal. Because P is orthogonal, P^T = P^{-1}, and hence P^T A P = P^{-1}AP. The above theory yields such an orthogonal matrix P.
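A brief numerical sketch (an editorial addition) of the substitution X = PY, using the quadratic form of Example 9.10 below. Note that np.linalg.eigh returns the eigenvalues in ascending order (1, 6), whereas the text lists them as 6, 1.

    import numpy as np

    # q(x, y) = 2x^2 - 4xy + 5y^2 written as X^T A X with a symmetric A.
    A = np.array([[ 2.0, -2.0],
                  [-2.0,  5.0]])          # a_12 = a_21 = (1/2)(-4) = -2

    def q(X):
        return X @ A @ X                   # X^T A X for a 1-D vector X

    eigenvalues, P = np.linalg.eigh(A)     # orthogonal P with P^T A P = diag(1, 6)

    Y = np.array([1.0, 2.0])               # arbitrary test point (s, t)
    X = P @ Y                              # orthogonal substitution X = P Y
    assert np.isclose(q(X), eigenvalues @ Y**2)   # q becomes 1*s^2 + 6*t^2 in the new coordinates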
EXAMPLE 9.10  Consider the quadratic form
    q(x, y) = 2x^2 - 4xy + 5y^2 = X^T A X,   where   A = [2 -2; -2 5]   and   X = [x; y]
By Example 9.9,
    P^{-1}AP = [6 0; 0 1] = P^T A P,   where   P = [1/√5 2/√5; -2/√5 1/√5]
Let Y = [s, t]^T. Then the matrix P corresponds to the following linear orthogonal substitution X = PY of the variables x and y in terms of the variables s and t:
    x = (1/√5)s + (2/√5)t,   y = -(2/√5)s + (1/√5)t
This substitution in q(x, y) yields the diagonal quadratic form q(s, t) = 6s^2 + t^2.
9.7 Minimal Polynomial

Let A be any square matrix. Let J(A) denote the collection of all polynomials f(t) for which A is a root, that is, for which f(A) = 0. The set J(A) is not empty, because the Cayley–Hamilton Theorem 9.2 tells us that the characteristic polynomial Δ_A(t) of A belongs to J(A). Let m(t) denote the monic polynomial of lowest degree in J(A). (Such a polynomial m(t) exists and is unique.) We call m(t) the minimal polynomial of the matrix A.

Remark: A polynomial f(t) ≠ 0 is monic if its leading coefficient equals one.

The following theorem (proved in Problem 9.33) holds.

THEOREM 9.15: The minimal polynomial m(t) of a matrix (linear operator) A divides every polynomial that has A as a zero. In particular, m(t) divides the characteristic polynomial Δ(t) of A.

There is an even stronger relationship between m(t) and Δ(t).

THEOREM 9.16: The characteristic polynomial Δ(t) and the minimal polynomial m(t) of a matrix A have the same irreducible factors.

This theorem (proved in Problem 9.35) does not say that m(t) = Δ(t), only that any irreducible factor of one must divide the other. In particular, because a linear factor is irreducible, m(t) and Δ(t) have the same linear factors. Hence, they have the same roots. Thus, we have the following theorem.

THEOREM 9.17: A scalar λ is an eigenvalue of the matrix A if and only if λ is a root of the minimal polynomial of A.
EXAMPLE 9.11  Find the minimal polynomial m(t) of A = [2 2 -5; 3 7 -15; 1 2 -4].

First find the characteristic polynomial Δ(t) of A. We have
    tr(A) = 5,   A_11 + A_22 + A_33 = 2 - 3 + 8 = 7,   and   |A| = 3
Hence,
    Δ(t) = t^3 - 5t^2 + 7t - 3 = (t - 1)^2 (t - 3)
The minimal polynomial m(t) must divide Δ(t). Also, each irreducible factor of Δ(t) (i.e., t - 1 and t - 3) must also be a factor of m(t). Thus, m(t) is exactly one of the following:
    f(t) = (t - 3)(t - 1)   or   g(t) = (t - 3)(t - 1)^2
We know, by the Cayley–Hamilton theorem, that g(A) = Δ(A) = 0. Hence, we need only test f(t). We have
    f(A) = (A - I)(A - 3I) = [1 2 -5; 3 6 -15; 1 2 -5] [-1 2 -5; 3 4 -15; 1 2 -7] = [0 0 0; 0 0 0; 0 0 0]
Thus, f(t) = m(t) = (t - 1)(t - 3) = t^2 - 4t + 3 is the minimal polynomial of A.
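The factor-testing idea of Example 9.11 can be automated; the helper below and its interface are illustrative assumptions of ours, not part of the text. It tries the candidate products of factors (each appearing at least once) and keeps the one of smallest total degree that annihilates A.

    import numpy as np
    from itertools import product

    def minimal_polynomial_exponents(A, roots, max_mult):
        """Among candidates prod (A - r I)^e, return the exponent tuple of smallest
        total degree that gives the zero matrix. `roots` are the distinct eigenvalues,
        `max_mult` their algebraic multiplicities."""
        n = A.shape[0]
        I = np.eye(n)
        best = None
        for exps in product(*[range(1, m + 1) for m in max_mult]):
            M = I.copy()
            for r, e in zip(roots, exps):
                M = M @ np.linalg.matrix_power(A - r * I, e)
            if np.allclose(M, 0) and (best is None or sum(exps) < sum(best)):
                best = exps
        return best

    A = np.array([[2.0, 2.0,  -5.0],
                  [3.0, 7.0, -15.0],
                  [1.0, 2.0,  -4.0]])          # matrix of Example 9.11
    print(minimal_polynomial_exponents(A, roots=[1.0, 3.0], max_mult=[2, 1]))
    # (1, 1): m(t) = (t - 1)(t - 3)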
EXAMPLE 9.12

(a) Consider the following two r-square matrices, where a ≠ 0:

    J(λ, r) = [λ 1 0 ... 0 0; 0 λ 1 ... 0 0; ...; 0 0 0 ... λ 1; 0 0 0 ... 0 λ]
    and
    A = [λ a 0 ... 0 0; 0 λ a ... 0 0; ...; 0 0 0 ... λ a; 0 0 0 ... 0 λ]

The first matrix, called a Jordan block, has λ's on the diagonal, 1's on the superdiagonal (consisting of the entries above the diagonal entries), and 0's elsewhere. The second matrix A has λ's on the diagonal, a's on the superdiagonal, and 0's elsewhere. [Thus, A is a generalization of J(λ, r).] One can show that
    f(t) = (t - λ)^r
is both the characteristic and minimal polynomial of both J(λ, r) and A.

(b) Consider an arbitrary monic polynomial
    f(t) = t^n + a_{n-1} t^{n-1} + ··· + a_1 t + a_0
Let C(f) be the n-square matrix with 1's on the subdiagonal (consisting of the entries below the diagonal entries), the negatives of the coefficients in the last column, and 0's elsewhere, as follows:

    C(f) = [0 0 ... 0 -a_0; 1 0 ... 0 -a_1; 0 1 ... 0 -a_2; ...; 0 0 ... 1 -a_{n-1}]

Then C(f) is called the companion matrix of the polynomial f(t). Moreover, the minimal polynomial m(t) and the characteristic polynomial Δ(t) of the companion matrix C(f) are both equal to the original polynomial f(t).
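A short constructive sketch (ours) of the companion matrix, checked against Problem 9.32 below; the helper name companion and its argument convention are our own.

    import numpy as np

    def companion(coeffs):
        """Companion matrix C(f) of the monic polynomial
        f(t) = t^n + coeffs[n-1] t^(n-1) + ... + coeffs[1] t + coeffs[0],
        where coeffs lists a_0, ..., a_{n-1}."""
        n = len(coeffs)
        C = np.zeros((n, n))
        C[1:, :-1] = np.eye(n - 1)        # 1's on the subdiagonal
        C[:, -1] = -np.asarray(coeffs)    # negatives of the coefficients in the last column
        return C

    # f(t) = t^3 - 8t^2 + 5t + 7, as in Problem 9.32.
    C = companion([7.0, 5.0, -8.0])
    print(np.poly(C))                      # [1, -8, 5, 7]: the characteristic polynomial equals f(t)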
Minimal Polynomial of a Linear Operator

The minimal polynomial m(t) of a linear operator T is defined to be the monic polynomial of lowest degree for which T is a root. However, for any polynomial f(t), we have
    f(T) = 0   if and only if   f(A) = 0
where A is any matrix representation of T. Accordingly, T and A have the same minimal polynomial. Thus, the above theorems on the minimal polynomial of a matrix also hold for the minimal polynomial of a linear operator. That is, we have the following theorems.

THEOREM 9.15′: The minimal polynomial m(t) of a linear operator T divides every polynomial that has T as a root. In particular, m(t) divides the characteristic polynomial Δ(t) of T.

THEOREM 9.16′: The characteristic and minimal polynomials of a linear operator T have the same irreducible factors.

THEOREM 9.17′: A scalar λ is an eigenvalue of a linear operator T if and only if λ is a root of the minimal polynomial m(t) of T.
9.8 Characteristic and Minimal Polynomials of Block Matrices

This section discusses the relationship of the characteristic polynomial and the minimal polynomial to certain (square) block matrices.

Characteristic Polynomial and Block Triangular Matrices

Suppose M is a block triangular matrix, say M = [A_1 B; 0 A_2], where A_1 and A_2 are square matrices. Then tI - M is also a block triangular matrix, with diagonal blocks tI - A_1 and tI - A_2. Thus,
    |tI - M| = det [tI - A_1  -B; 0  tI - A_2] = |tI - A_1| |tI - A_2|
That is, the characteristic polynomial of M is the product of the characteristic polynomials of the diagonal blocks A_1 and A_2.
By induction, we obtain the following useful result.

THEOREM 9.18: Suppose M is a block triangular matrix with diagonal blocks A_1, A_2, ..., A_r. Then the characteristic polynomial of M is the product of the characteristic polynomials of the diagonal blocks A_i; that is,
    Δ_M(t) = Δ_{A_1}(t) Δ_{A_2}(t) ··· Δ_{A_r}(t)
EXAMPLE 9.13  Consider the matrix M = [9 -1 5 7; 8 3 2 4; 0 0 3 6; 0 0 -1 8].

Then M is a block triangular matrix with diagonal blocks A = [9 -1; 8 3] and B = [3 6; -1 8]. Here
    tr(A) = 9 + 3 = 12,   det(A) = 27 + 8 = 35,   and so   Δ_A(t) = t^2 - 12t + 35 = (t - 5)(t - 7)
    tr(B) = 3 + 8 = 11,   det(B) = 24 + 6 = 30,   and so   Δ_B(t) = t^2 - 11t + 30 = (t - 5)(t - 6)
Accordingly, the characteristic polynomial of M is the product
    Δ_M(t) = Δ_A(t) Δ_B(t) = (t - 5)^2 (t - 6)(t - 7)
Minimal Polynomial and Block Diagonal Matrices
The following theorem (proved in Problem 9.36) holds.
THEOREM 9.19: Suppose M is a block diagonal matrix with diagonal blocks A_1, A_2, ..., A_r. Then the minimal polynomial of M is equal to the least common multiple (LCM) of the minimal polynomials of the diagonal blocks A_i.
Remark: We emphasize that this theorem applies to block diagonal matrices, whereas the
analogous Theorem 9.18 on characteristic polynomials applies to block triangular matrices.
EXAMPLE 9.14  Find the characteristic polynomial Δ(t) and the minimal polynomial m(t) of the block diagonal matrix:

    M = [2 5 0 0 0; 0 2 0 0 0; 0 0 4 2 0; 0 0 3 5 0; 0 0 0 0 7] = diag(A_1, A_2, A_3),
    where   A_1 = [2 5; 0 2],   A_2 = [4 2; 3 5],   A_3 = [7]

Then Δ(t) is the product of the characteristic polynomials Δ_1(t), Δ_2(t), Δ_3(t) of A_1, A_2, A_3, respectively. One can show that
    Δ_1(t) = (t - 2)^2,   Δ_2(t) = (t - 2)(t - 7),   Δ_3(t) = t - 7
Thus, Δ(t) = (t - 2)^3 (t - 7)^2. [As expected, deg Δ(t) = 5.]
The minimal polynomials m_1(t), m_2(t), m_3(t) of the diagonal blocks A_1, A_2, A_3, respectively, are equal to the characteristic polynomials; that is,
    m_1(t) = (t - 2)^2,   m_2(t) = (t - 2)(t - 7),   m_3(t) = t - 7
But m(t) is equal to the least common multiple of m_1(t), m_2(t), m_3(t). Thus, m(t) = (t - 2)^2 (t - 7).
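A quick numerical check of this example (an editorial addition): the LCM candidate annihilates M, while the smaller product of distinct factors does not.

    import numpy as np

    M = np.array([[2, 5, 0, 0, 0],
                  [0, 2, 0, 0, 0],
                  [0, 0, 4, 2, 0],
                  [0, 0, 3, 5, 0],
                  [0, 0, 0, 0, 7]], dtype=float)
    I = np.eye(5)

    m_M = np.linalg.matrix_power(M - 2*I, 2) @ (M - 7*I)   # m(t) = (t - 2)^2 (t - 7)
    f_M = (M - 2*I) @ (M - 7*I)                            # candidate (t - 2)(t - 7)

    print(np.allclose(m_M, 0))   # True
    print(np.allclose(f_M, 0))   # False: (t - 2)(t - 7) is too small to be the minimal polynomial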
SOLVED PROBLEMS
Polynomials of Matrices, Characteristic Polynomials
9.1. Let A = [1 -2; 4 5]. Find f(A), where
    (a) f(t) = t^2 - 3t + 7,   (b) f(t) = t^2 - 6t + 13

First find A^2 = [1 -2; 4 5][1 -2; 4 5] = [-7 -12; 24 17]. Then

(a) f(A) = A^2 - 3A + 7I = [-7 -12; 24 17] + [-3 6; -12 -15] + [7 0; 0 7] = [-3 -6; 12 9]

(b) f(A) = A^2 - 6A + 13I = [-7 -12; 24 17] + [-6 12; -24 -30] + [13 0; 0 13] = [0 0; 0 0]

[Thus, A is a root of f(t).]
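Evaluating a polynomial at a matrix, as in Problem 9.1, is easy to do by Horner's rule; the helper below is an illustrative sketch of ours, not part of the text.

    import numpy as np

    def poly_at_matrix(coeffs, A):
        """Evaluate f(A) by Horner's rule; coeffs are [a_n, ..., a_1, a_0], highest degree first."""
        n = A.shape[0]
        result = np.zeros((n, n))
        for c in coeffs:
            result = result @ A + c * np.eye(n)
        return result

    A = np.array([[1.0, -2.0], [4.0, 5.0]])
    print(poly_at_matrix([1, -3, 7], A))    # f(t) = t^2 - 3t + 7  -> [[-3, -6], [12, 9]]
    print(poly_at_matrix([1, -6, 13], A))   # f(t) = t^2 - 6t + 13 -> zero matrix: A is a root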
9.2. Find the characteristic polynomial Δ(t) of each of the following matrices:
    (a) A = [2 5; 4 1],   (b) B = [7 -3; 5 -2],   (c) C = [3 -2; 9 -3]

Use the formula Δ(t) = t^2 - tr(M) t + |M| for a 2×2 matrix M:
(a) tr(A) = 2 + 1 = 3, |A| = 2 - 20 = -18, so Δ(t) = t^2 - 3t - 18
(b) tr(B) = 7 - 2 = 5, |B| = -14 + 15 = 1, so Δ(t) = t^2 - 5t + 1
(c) tr(C) = 3 - 3 = 0, |C| = -9 + 18 = 9, so Δ(t) = t^2 + 9
9.3. Find the characteristic polynomial Δ(t) of each of the following matrices:
    (a) A = [1 2 3; 3 0 4; 6 4 5],   (b) B = [1 6 -2; -3 2 0; 0 3 -4]

Use the formula Δ(t) = t^3 - tr(A) t^2 + (A_11 + A_22 + A_33) t - |A|, where A_ii is the cofactor of a_ii in the 3×3 matrix A = [a_ij].

(a) tr(A) = 1 + 0 + 5 = 6,
    A_11 = det[0 4; 4 5] = -16,   A_22 = det[1 3; 6 5] = -13,   A_33 = det[1 2; 3 0] = -6
    A_11 + A_22 + A_33 = -35, and |A| = 48 + 36 - 16 - 30 = 38
    Thus, Δ(t) = t^3 - 6t^2 - 35t - 38

(b) tr(B) = 1 + 2 - 4 = -1,
    B_11 = det[2 0; 3 -4] = -8,   B_22 = det[1 -2; 0 -4] = -4,   B_33 = det[1 6; -3 2] = 20
    B_11 + B_22 + B_33 = 8, and |B| = -8 + 18 - 72 = -62
    Thus, Δ(t) = t^3 + t^2 + 8t + 62
9.4. Find the characteristic polynomial Δ(t) of each of the following matrices:
    (a) A = [2 5 1 1; 1 4 2 2; 0 0 6 -5; 0 0 2 3],   (b) B = [1 1 2 2; 0 3 3 4; 0 0 5 5; 0 0 0 6]

(a) A is block triangular with diagonal blocks
    A_1 = [2 5; 1 4]   and   A_2 = [6 -5; 2 3]
Thus, Δ(t) = Δ_{A_1}(t) Δ_{A_2}(t) = (t^2 - 6t + 3)(t^2 - 9t + 28)

(b) Because B is triangular, Δ(t) = (t - 1)(t - 3)(t - 5)(t - 6).
9.5. Find the characteristic polynomial Δ(t) of each of the following linear operators:
(a) F: R^2 → R^2 defined by F(x, y) = (3x + 5y, 2x - 7y).
(b) D: V → V defined by D(f) = df/dt, where V is the space of functions with basis S = {sin t, cos t}.

The characteristic polynomial Δ(t) of a linear operator is equal to the characteristic polynomial of any matrix A that represents the linear operator.

(a) Find the matrix A that represents F relative to the usual basis of R^2. We have
    A = [3 5; 2 -7],   so   Δ(t) = t^2 - tr(A) t + |A| = t^2 + 4t - 31

(b) Find the matrix A representing the differential operator D relative to the basis S. We have
    D(sin t) = cos t = 0(sin t) + 1(cos t)
    D(cos t) = -sin t = -1(sin t) + 0(cos t),   and so   A = [0 -1; 1 0]
Therefore, Δ(t) = t^2 - tr(A) t + |A| = t^2 + 1.
9.6. Show that a matrix A and its transpose A^T have the same characteristic polynomial.

By the transpose operation, (tI - A)^T = tI^T - A^T = tI - A^T. Because a matrix and its transpose have the same determinant,
    Δ_A(t) = |tI - A| = |(tI - A)^T| = |tI - A^T| = Δ_{A^T}(t)
9.7. Prove Theorem 9.1: Let f and g be polynomials. For any square matrix A and scalar k,
    (i) (f + g)(A) = f(A) + g(A),   (iii) (kf)(A) = k f(A),
    (ii) (fg)(A) = f(A) g(A),       (iv) f(A) g(A) = g(A) f(A).

Suppose f = a_n t^n + ··· + a_1 t + a_0 and g = b_m t^m + ··· + b_1 t + b_0. Then, by definition,
    f(A) = a_n A^n + ··· + a_1 A + a_0 I   and   g(A) = b_m A^m + ··· + b_1 A + b_0 I

(i) Suppose m ≤ n and let b_i = 0 if i > m. Then
    f + g = (a_n + b_n) t^n + ··· + (a_1 + b_1) t + (a_0 + b_0)
Hence,
    (f + g)(A) = (a_n + b_n) A^n + ··· + (a_1 + b_1) A + (a_0 + b_0) I
               = a_n A^n + b_n A^n + ··· + a_1 A + b_1 A + a_0 I + b_0 I = f(A) + g(A)

(ii) By definition, fg = c_{n+m} t^{n+m} + ··· + c_1 t + c_0 = Σ_{k=0}^{n+m} c_k t^k, where
    c_k = a_0 b_k + a_1 b_{k-1} + ··· + a_k b_0 = Σ_{i=0}^{k} a_i b_{k-i}
Hence, (fg)(A) = Σ_{k=0}^{n+m} c_k A^k and
    f(A) g(A) = (Σ_{i=0}^{n} a_i A^i)(Σ_{j=0}^{m} b_j A^j) = Σ_{i=0}^{n} Σ_{j=0}^{m} a_i b_j A^{i+j} = Σ_{k=0}^{n+m} c_k A^k = (fg)(A)

(iii) By definition, kf = k a_n t^n + ··· + k a_1 t + k a_0, and so
    (kf)(A) = k a_n A^n + ··· + k a_1 A + k a_0 I = k(a_n A^n + ··· + a_1 A + a_0 I) = k f(A)

(iv) By (ii), g(A) f(A) = (gf)(A) = (fg)(A) = f(A) g(A).
9.8. Prove the Cayley–Hamilton Theorem 9.2: Every matrix A is a root of its characteristic polynomial Δ(t).

Let A be an arbitrary n-square matrix and let Δ(t) be its characteristic polynomial, say,
    Δ(t) = |tI - A| = t^n + a_{n-1} t^{n-1} + ··· + a_1 t + a_0
Now let B(t) denote the classical adjoint of the matrix tI - A. The elements of B(t) are cofactors of the matrix tI - A and hence are polynomials in t of degree not exceeding n - 1. Thus,
    B(t) = B_{n-1} t^{n-1} + ··· + B_1 t + B_0
where the B_i are n-square matrices over K that are independent of t. By the fundamental property of the classical adjoint (Theorem 8.9), (tI - A)B(t) = |tI - A| I, or
    (tI - A)(B_{n-1} t^{n-1} + ··· + B_1 t + B_0) = (t^n + a_{n-1} t^{n-1} + ··· + a_1 t + a_0) I
Removing the parentheses and equating corresponding powers of t yields
    B_{n-1} = I,   B_{n-2} - A B_{n-1} = a_{n-1} I,   ...,   B_0 - A B_1 = a_1 I,   -A B_0 = a_0 I
Multiplying the above equations by A^n, A^{n-1}, ..., A, I, respectively, yields
    A^n B_{n-1} = A^n,   A^{n-1} B_{n-2} - A^n B_{n-1} = a_{n-1} A^{n-1},   ...,   A B_0 - A^2 B_1 = a_1 A,   -A B_0 = a_0 I
Adding the above matrix equations yields 0 on the left-hand side and Δ(A) on the right-hand side; that is,
    0 = A^n + a_{n-1} A^{n-1} + ··· + a_1 A + a_0 I
Therefore, Δ(A) = 0, which is the Cayley–Hamilton theorem.
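As a numerical sanity check of the theorem (an editorial addition, not part of the proof), one can substitute A into its own characteristic polynomial:

    import numpy as np

    A = np.array([[1.0, -2.0],
                  [4.0,  5.0]])                     # matrix of Problem 9.1

    coeffs = np.poly(A)                              # characteristic polynomial, highest power first
    n = A.shape[0]
    Delta_A = sum(c * np.linalg.matrix_power(A, n - k) for k, c in enumerate(coeffs))
    print(np.allclose(Delta_A, np.zeros((n, n))))    # True: Delta(A) = 0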
Eigenvalues and Eigenvectors of 2×2 Matrices

9.9. Let A = [3 -4; 2 -6].
(a) Find all eigenvalues and corresponding eigenvectors.
(b) Find matrices P and D such that P is nonsingular and D = P^{-1}AP is diagonal.

(a) First find the characteristic polynomial Δ(t) of A:
    Δ(t) = t^2 - tr(A) t + |A| = t^2 + 3t - 10 = (t - 2)(t + 5)
The roots λ = 2 and λ = -5 of Δ(t) are the eigenvalues of A. We find corresponding eigenvectors.
(i) Subtract λ = 2 down the diagonal of A to obtain the matrix M = A - 2I, where the corresponding homogeneous system MX = 0 yields the eigenvectors corresponding to λ = 2. We have
    M = [1 -4; 2 -8],   corresponding to   x - 4y = 0, 2x - 8y = 0   or   x - 4y = 0
The system has only one free variable, and v_1 = (4, 1) is a nonzero solution. Thus, v_1 = (4, 1) is an eigenvector belonging to (and spanning the eigenspace of) λ = 2.
(ii) Subtract λ = -5 (or, equivalently, add 5) down the diagonal of A to obtain
    M = [8 -4; 2 -1],   corresponding to   8x - 4y = 0, 2x - y = 0   or   2x - y = 0
The system has only one free variable, and v_2 = (1, 2) is a nonzero solution. Thus, v_2 = (1, 2) is an eigenvector belonging to λ = -5.

(b) Let P be the matrix whose columns are v_1 and v_2. Then
    P = [4 1; 1 2]   and   D = P^{-1}AP = [2 0; 0 -5]
Note that D is the diagonal matrix whose diagonal entries are the eigenvalues of A corresponding to the eigenvectors appearing in P.

Remark: Here P is the change-of-basis matrix from the usual basis of R^2 to the basis S = {v_1, v_2}, and D is the matrix that represents (the matrix function) A relative to the new basis S.
9.10. Let A = [2 2; 1 3].
(a) Find all eigenvalues and corresponding eigenvectors.
(b) Find a nonsingular matrix P such that D = P^{-1}AP is diagonal, and P^{-1}.
(c) Find A^6 and f(A), where f(t) = t^4 - 3t^3 - 6t^2 + 7t + 3.
(d) Find a ''real cube root'' of A; that is, a matrix B such that B^3 = A and B has real eigenvalues.

(a) First find the characteristic polynomial Δ(t) of A:
    Δ(t) = t^2 - tr(A) t + |A| = t^2 - 5t + 4 = (t - 1)(t - 4)
The roots λ = 1 and λ = 4 of Δ(t) are the eigenvalues of A. We find corresponding eigenvectors.
(i) Subtract λ = 1 down the diagonal of A to obtain the matrix M = A - λI, where the corresponding homogeneous system MX = 0 yields the eigenvectors belonging to λ = 1. We have
    M = [1 2; 1 2],   corresponding to   x + 2y = 0, x + 2y = 0   or   x + 2y = 0
The system has only one independent solution; for example, x = 2, y = -1. Thus, v_1 = (2, -1) is an eigenvector belonging to (and spanning the eigenspace of) λ = 1.
(ii) Subtract λ = 4 down the diagonal of A to obtain
    M = [-2 2; 1 -1],   corresponding to   -2x + 2y = 0, x - y = 0   or   x - y = 0
The system has only one independent solution; for example, x = 1, y = 1. Thus, v_2 = (1, 1) is an eigenvector belonging to λ = 4.

(b) Let P be the matrix whose columns are v_1 and v_2. Then
    P = [2 1; -1 1]   and   D = P^{-1}AP = [1 0; 0 4],   where   P^{-1} = [1/3 -1/3; 1/3 2/3]

(c) Using the diagonal factorization A = PDP^{-1}, and 1^6 = 1 and 4^6 = 4096, we get
    A^6 = P D^6 P^{-1} = [2 1; -1 1] [1 0; 0 4096] [1/3 -1/3; 1/3 2/3] = [1366 2730; 1365 2731]
Also, f(1) = 2 and f(4) = -1. Hence,
    f(A) = P f(D) P^{-1} = [2 1; -1 1] [2 0; 0 -1] [1/3 -1/3; 1/3 2/3] = [1 -2; -1 0]

(d) Here ∛D = [1 0; 0 ∛4] is the real cube root of D. Hence the real cube root of A is
    B = P ∛D P^{-1} = [2 1; -1 1] [1 0; 0 ∛4] [1/3 -1/3; 1/3 2/3] = (1/3)[2 + ∛4   -2 + 2∛4;  -1 + ∛4   1 + 2∛4]
9.11. Each of the following real matrices defines a linear transformation on R^2:
    (a) A = [5 6; 3 -2],   (b) B = [1 -1; 2 -1],   (c) C = [5 -1; 1 3]
Find, for each matrix, all eigenvalues and a maximum set S of linearly independent eigenvectors. Which of these linear operators are diagonalizable; that is, which can be represented by a diagonal matrix?

(a) First find Δ(t) = t^2 - 3t - 28 = (t - 7)(t + 4). The roots λ = 7 and λ = -4 are the eigenvalues of A. We find corresponding eigenvectors.
(i) Subtract λ = 7 down the diagonal of A to obtain
    M = [-2 6; 3 -9],   corresponding to   -2x + 6y = 0, 3x - 9y = 0   or   x - 3y = 0
Here v_1 = (3, 1) is a nonzero solution.
(ii) Subtract λ = -4 (or add 4) down the diagonal of A to obtain
    M = [9 6; 3 2],   corresponding to   9x + 6y = 0, 3x + 2y = 0   or   3x + 2y = 0
Here v_2 = (2, -3) is a nonzero solution.
Then S = {v_1, v_2} = {(3, 1), (2, -3)} is a maximal set of linearly independent eigenvectors. Because S is a basis of R^2, A is diagonalizable. Using the basis S, A is represented by the diagonal matrix D = diag(7, -4).

(b) First find the characteristic polynomial Δ(t) = t^2 + 1. There are no real roots. Thus B, a real matrix representing a linear transformation on R^2, has no eigenvalues and no eigenvectors. Hence, in particular, B is not diagonalizable.

(c) First find Δ(t) = t^2 - 8t + 16 = (t - 4)^2. Thus, λ = 4 is the only eigenvalue of C. Subtract λ = 4 down the diagonal of C to obtain
    M = [1 -1; 1 -1],   corresponding to   x - y = 0
The homogeneous system has only one independent solution; for example, x = 1, y = 1. Thus, v = (1, 1) is an eigenvector of C. Furthermore, as there are no other eigenvalues, the singleton set S = {v} = {(1, 1)} is a maximal set of linearly independent eigenvectors of C. Furthermore, because S is not a basis of R^2, C is not diagonalizable.
9.12. Suppose the matrix B in Problem 9.11 represents a linear operator on complex space C^2. Show that, in this case, B is diagonalizable by finding a basis S of C^2 consisting of eigenvectors of B.

The characteristic polynomial of B is still Δ(t) = t^2 + 1. As a polynomial over C, Δ(t) does factor; specifically, Δ(t) = (t - i)(t + i). Thus, λ = i and λ = -i are the eigenvalues of B.

(i) Subtract λ = i down the diagonal of B to obtain the homogeneous system
    (1 - i)x - y = 0,   2x - (1 + i)y = 0   or   (1 - i)x - y = 0
The system has only one independent solution; for example, x = 1, y = 1 - i. Thus, v_1 = (1, 1 - i) is an eigenvector that spans the eigenspace of λ = i.

(ii) Subtract λ = -i (or add i) down the diagonal of B to obtain the homogeneous system
    (1 + i)x - y = 0,   2x - (1 - i)y = 0   or   (1 + i)x - y = 0
The system has only one independent solution; for example, x = 1, y = 1 + i. Thus, v_2 = (1, 1 + i) is an eigenvector that spans the eigenspace of λ = -i.

As a complex matrix, B is diagonalizable. Specifically, S = {v_1, v_2} = {(1, 1 - i), (1, 1 + i)} is a basis of C^2 consisting of eigenvectors of B. Using this basis S, B is represented by the diagonal matrix D = diag(i, -i).
9.13. Let L be the linear transformation on R^2 that reflects each point P across the line y = kx, where k > 0. (See Fig. 9-1.)
(a) Show that v_1 = (1, k) and v_2 = (k, -1) are eigenvectors of L.
(b) Show that L is diagonalizable, and find a diagonal representation D.

(a) The vector v_1 = (1, k) lies on the line y = kx, and hence is left fixed by L; that is, L(v_1) = v_1. Thus, v_1 is an eigenvector of L belonging to the eigenvalue λ_1 = 1.
The vector v_2 = (k, -1) is perpendicular to the line y = kx, and hence L reflects v_2 into its negative; that is, L(v_2) = -v_2. Thus, v_2 is an eigenvector of L belonging to the eigenvalue λ_2 = -1.

[Figure 9-1: the line y = kx, a point P, and its reflection L(P).]

(b) Here S = {v_1, v_2} is a basis of R^2 consisting of eigenvectors of L. Thus, L is diagonalizable, with the diagonal representation D = [1 0; 0 -1] (relative to the basis S).
Eigenvalues and Eigenvectors

9.14. Let A = [4 1 -1; 2 5 -2; 1 1 2].
(a) Find all eigenvalues of A.
(b) Find a maximum set S of linearly independent eigenvectors of A.
(c) Is A diagonalizable? If yes, find P such that D = P^{-1}AP is diagonal.

(a) First find the characteristic polynomial Δ(t) of A. We have
    tr(A) = 4 + 5 + 2 = 11   and   |A| = 40 - 2 - 2 + 5 + 8 - 4 = 45
Also, find each cofactor A_ii of a_ii in A:
    A_11 = det[5 -2; 1 2] = 12,   A_22 = det[4 -1; 1 2] = 9,   A_33 = det[4 1; 2 5] = 18
Hence, Δ(t) = t^3 - tr(A) t^2 + (A_11 + A_22 + A_33) t - |A| = t^3 - 11t^2 + 39t - 45.
Assuming Δ(t) has a rational root, it must be among ±1, ±3, ±5, ±9, ±15, ±45. Testing, by synthetic division, we get

    3 |  1  -11  +39  -45
      |       3  -24  +45
      |  1   -8  +15    0

Thus, t = 3 is a root of Δ(t). Also, t - 3 is a factor and t^2 - 8t + 15 is a factor. Hence,
    Δ(t) = (t - 3)(t^2 - 8t + 15) = (t - 3)(t - 5)(t - 3) = (t - 3)^2 (t - 5)
Accordingly, λ = 3 and λ = 5 are eigenvalues of A.

(b) Find linearly independent eigenvectors for each eigenvalue of A.
(i) Subtract λ = 3 down the diagonal of A to obtain the matrix
    M = [1 1 -1; 2 2 -2; 1 1 -1],   corresponding to   x + y - z = 0
Here u = (1, -1, 0) and v = (1, 0, 1) are linearly independent solutions.
(ii) Subtract λ = 5 down the diagonal of A to obtain the matrix
    M = [-1 1 -1; 2 0 -2; 1 1 -3],   corresponding to   -x + y - z = 0, 2x - 2z = 0, x + y - 3z = 0   or   x - z = 0, y - 2z = 0
Only z is a free variable. Here w = (1, 2, 1) is a solution.
Thus, S = {u, v, w} = {(1, -1, 0), (1, 0, 1), (1, 2, 1)} is a maximal set of linearly independent eigenvectors of A.

Remark: The vectors u and v were chosen so that they were independent solutions of the system x + y - z = 0. On the other hand, w is automatically independent of u and v because w belongs to a different eigenvalue of A. Thus, the three vectors are linearly independent.

(c) A is diagonalizable, because it has three linearly independent eigenvectors. Let P be the matrix with columns u, v, w. Then
    P = [1 1 1; -1 0 2; 0 1 1]   and   D = P^{-1}AP = diag(3, 3, 5)
9.15. Repeat Problem 9.14 for the matrix B = [3 -1 1; 7 -5 1; 6 -6 2].

(a) First find the characteristic polynomial Δ(t) of B. We have
    tr(B) = 0,   |B| = -16,   B_11 = -4,   B_22 = 0,   B_33 = -8,   so   Σ_i B_ii = -12
Therefore, Δ(t) = t^3 - 12t + 16 = (t - 2)^2 (t + 4). Thus, λ_1 = 2 and λ_2 = -4 are the eigenvalues of B.

(b) Find a basis for the eigenspace of each eigenvalue of B.
(i) Subtract λ_1 = 2 down the diagonal of B to obtain
    M = [1 -1 1; 7 -7 1; 6 -6 0],   corresponding to   x - y + z = 0, 7x - 7y + z = 0, 6x - 6y = 0   or   x - y + z = 0, z = 0
The system has only one independent solution; for example, x = 1, y = 1, z = 0. Thus, u = (1, 1, 0) forms a basis for the eigenspace of λ_1 = 2.
(ii) Subtract λ_2 = -4 (or add 4) down the diagonal of B to obtain
    M = [7 -1 1; 7 -1 1; 6 -6 6],   corresponding to   7x - y + z = 0, 7x - y + z = 0, 6x - 6y + 6z = 0   or   x - y + z = 0, 6y - 6z = 0
The system has only one independent solution; for example, x = 0, y = 1, z = 1. Thus, v = (0, 1, 1) forms a basis for the eigenspace of λ_2 = -4.
Thus S = {u, v} is a maximal set of linearly independent eigenvectors of B.

(c) Because B has at most two linearly independent eigenvectors, B is not similar to a diagonal matrix; that is, B is not diagonalizable.
9.16. Find the algebraic and geometric multiplicities of the eigenvalue λ_1 = 2 of the matrix B in Problem 9.15.

The algebraic multiplicity of λ_1 = 2 is 2, because t - 2 appears with exponent 2 in Δ(t). However, the geometric multiplicity of λ_1 = 2 is 1, because dim E_{λ_1} = 1 (where E_{λ_1} is the eigenspace of λ_1).
9.17. Let T: R^3 → R^3 be defined by T(x, y, z) = (2x + y - 2z, 2x + 3y - 4z, x + y - z). Find all eigenvalues of T, and find a basis of each eigenspace. Is T diagonalizable? If so, find the basis S of R^3 that diagonalizes T, and find its diagonal representation D.

First find the matrix A that represents T relative to the usual basis of R^3 by writing down the coefficients of x, y, z as rows, and then find the characteristic polynomial of A (and T). We have
    A = [T] = [2 1 -2; 2 3 -4; 1 1 -1]   and   tr(A) = 4,  |A| = 2,  A_11 = 1,  A_22 = 0,  A_33 = 4,  Σ_i A_ii = 5
Therefore, Δ(t) = t^3 - 4t^2 + 5t - 2 = (t - 1)^2 (t - 2), and so λ = 1 and λ = 2 are the eigenvalues of A (and T). We next find linearly independent eigenvectors for each eigenvalue of A.

(i) Subtract λ = 1 down the diagonal of A to obtain the matrix
    M = [1 1 -2; 2 2 -4; 1 1 -2],   corresponding to   x + y - 2z = 0
Here y and z are free variables, and so there are two linearly independent eigenvectors belonging to λ = 1. For example, u = (1, -1, 0) and v = (2, 0, 1) are two such eigenvectors.

(ii) Subtract λ = 2 down the diagonal of A to obtain
    M = [0 1 -2; 2 1 -4; 1 1 -3],   corresponding to   y - 2z = 0, 2x + y - 4z = 0, x + y - 3z = 0   or   x + y - 3z = 0, y - 2z = 0
Only z is a free variable. Here w = (1, 2, 1) is a solution.

Thus, T is diagonalizable, because it has three independent eigenvectors. Specifically, choosing
    S = {u, v, w} = {(1, -1, 0), (2, 0, 1), (1, 2, 1)}
as a basis, T is represented by the diagonal matrix D = diag(1, 1, 2).
9.18. Prove the following for a linear operator (matrix) T:
(a) The scalar 0 is an eigenvalue of T if and only if T is singular.
(b) If λ is an eigenvalue of T, where T is invertible, then λ^{-1} is an eigenvalue of T^{-1}.

(a) We have that 0 is an eigenvalue of T if and only if there is a vector v ≠ 0 such that T(v) = 0v; that is, if and only if T is singular.

(b) Because T is invertible, it is nonsingular; hence, by (a), λ ≠ 0. By definition of an eigenvalue, there exists v ≠ 0 such that T(v) = λv. Applying T^{-1} to both sides, we obtain
    v = T^{-1}(λv) = λ T^{-1}(v),   and so   T^{-1}(v) = λ^{-1} v
Therefore, λ^{-1} is an eigenvalue of T^{-1}.
9.19. Let λ be an eigenvalue of a linear operator T: V → V, and let E_λ consist of all the eigenvectors belonging to λ (called the eigenspace of λ). Prove that E_λ is a subspace of V. That is, prove
(a) If u ∈ E_λ, then ku ∈ E_λ for any scalar k.   (b) If u, v ∈ E_λ, then u + v ∈ E_λ.

(a) Because u ∈ E_λ, we have T(u) = λu. Then T(ku) = kT(u) = k(λu) = λ(ku), and so ku ∈ E_λ.
(We view the zero vector 0 ∈ V as an ''eigenvector'' of λ in order for E_λ to be a subspace of V.)

(b) As u, v ∈ E_λ, we have T(u) = λu and T(v) = λv. Then
    T(u + v) = T(u) + T(v) = λu + λv = λ(u + v),   and so   u + v ∈ E_λ
9.20. Prove Theorem 9.6: The following are equivalent: (i) The scalar λ is an eigenvalue of A. (ii) The matrix λI - A is singular. (iii) The scalar λ is a root of the characteristic polynomial Δ(t) of A.

The scalar λ is an eigenvalue of A if and only if there exists a nonzero vector v such that
    Av = λv   or   (λI)v - Av = 0   or   (λI - A)v = 0
or λI - A is singular. In such a case, λ is a root of Δ(t) = |tI - A|. Also, v is in the eigenspace E_λ of λ if and only if the above relations hold. Hence, v is a solution of (λI - A)X = 0.
9.21. Prove Theorem 9.8′: Suppose v_1, v_2, ..., v_n are nonzero eigenvectors of T belonging to distinct eigenvalues λ_1, λ_2, ..., λ_n. Then v_1, v_2, ..., v_n are linearly independent.

Suppose the theorem is not true. Let v_1, v_2, ..., v_s be a minimal set of vectors for which the theorem is not true. We have s > 1, because v_1 ≠ 0. Also, by the minimality condition, v_2, ..., v_s are linearly independent. Thus, v_1 is a linear combination of v_2, ..., v_s, say,
    v_1 = a_2 v_2 + a_3 v_3 + ··· + a_s v_s     (1)
(where some a_k ≠ 0). Applying T to (1) and using the linearity of T yields
    T(v_1) = T(a_2 v_2 + a_3 v_3 + ··· + a_s v_s) = a_2 T(v_2) + a_3 T(v_3) + ··· + a_s T(v_s)     (2)
Because v_j is an eigenvector of T belonging to λ_j, we have T(v_j) = λ_j v_j. Substituting in (2) yields
    λ_1 v_1 = a_2 λ_2 v_2 + a_3 λ_3 v_3 + ··· + a_s λ_s v_s     (3)
Multiplying (1) by λ_1 yields
    λ_1 v_1 = a_2 λ_1 v_2 + a_3 λ_1 v_3 + ··· + a_s λ_1 v_s     (4)
Setting the right-hand sides of (3) and (4) equal to each other, or subtracting (3) from (4), yields
    a_2(λ_1 - λ_2)v_2 + a_3(λ_1 - λ_3)v_3 + ··· + a_s(λ_1 - λ_s)v_s = 0     (5)
Because v_2, v_3, ..., v_s are linearly independent, the coefficients in (5) must all be zero. That is,
    a_2(λ_1 - λ_2) = 0,   a_3(λ_1 - λ_3) = 0,   ...,   a_s(λ_1 - λ_s) = 0
However, the λ_i are distinct. Hence λ_1 - λ_j ≠ 0 for j > 1. Hence, a_2 = 0, a_3 = 0, ..., a_s = 0. This contradicts the fact that some a_k ≠ 0. The theorem is proved.
9.22. Prove Theorem 9.9: Suppose Δ(t) = (t - a_1)(t - a_2) ··· (t - a_n) is the characteristic polynomial of an n-square matrix A, and suppose the n roots a_i are distinct. Then A is similar to the diagonal matrix D = diag(a_1, a_2, ..., a_n).

Let v_1, v_2, ..., v_n be (nonzero) eigenvectors corresponding to the eigenvalues a_i. Then the n eigenvectors v_i are linearly independent (Theorem 9.8), and hence form a basis of K^n. Accordingly, A is diagonalizable (i.e., A is similar to a diagonal matrix D), and the diagonal elements of D are the eigenvalues a_i.
9.23. Prove Theorem 9.10′: The geometric multiplicity of an eigenvalue λ of T does not exceed its algebraic multiplicity.

Suppose the geometric multiplicity of λ is r. Then its eigenspace E_λ contains r linearly independent eigenvectors v_1, ..., v_r. Extend the set {v_i} to a basis of V, say, {v_1, ..., v_r, w_1, ..., w_s}. We have
    T(v_1) = λv_1,  T(v_2) = λv_2,  ...,  T(v_r) = λv_r,
    T(w_1) = a_11 v_1 + ··· + a_1r v_r + b_11 w_1 + ··· + b_1s w_s
    T(w_2) = a_21 v_1 + ··· + a_2r v_r + b_21 w_1 + ··· + b_2s w_s
    ................................................................
    T(w_s) = a_s1 v_1 + ··· + a_sr v_r + b_s1 w_1 + ··· + b_ss w_s
Then M = [λI_r  A; 0  B] is the matrix of T in the above basis, where A = [a_ij]^T and B = [b_ij]^T.
Because M is block triangular, the characteristic polynomial (t - λ)^r of the block λI_r must divide the characteristic polynomial of M and hence of T. Thus, the algebraic multiplicity of λ for T is at least r, as required.
Diagonalizing Real Symmetric Matrices and Quadratic Forms

9.24. Let A = [7 3; 3 -1]. Find an orthogonal matrix P such that D = P^{-1}AP is diagonal.

First find the characteristic polynomial Δ(t) of A. We have
    Δ(t) = t^2 - tr(A) t + |A| = t^2 - 6t - 16 = (t - 8)(t + 2)
Thus, the eigenvalues of A are λ = 8 and λ = -2. We next find corresponding eigenvectors.
Subtract λ = 8 down the diagonal of A to obtain the matrix
    M = [-1 3; 3 -9],   corresponding to   -x + 3y = 0, 3x - 9y = 0   or   x - 3y = 0
A nonzero solution is u_1 = (3, 1).
Subtract λ = -2 (or add 2) down the diagonal of A to obtain the matrix
    M = [9 3; 3 1],   corresponding to   9x + 3y = 0, 3x + y = 0   or   3x + y = 0
A nonzero solution is u_2 = (1, -3).
As expected, because A is symmetric, the eigenvectors u_1 and u_2 are orthogonal. Normalize u_1 and u_2 to obtain, respectively, the unit vectors
    û_1 = (3/√10, 1/√10)   and   û_2 = (1/√10, -3/√10)
Finally, let P be the matrix whose columns are the unit vectors û_1 and û_2, respectively. Then
    P = [3/√10 1/√10; 1/√10 -3/√10]   and   D = P^{-1}AP = [8 0; 0 -2]
As expected, the diagonal entries in D are the eigenvalues of A.
9.25. Let B = [11 -8 4; -8 -1 -2; 4 -2 -4].
(a) Find all eigenvalues of B.
(b) Find a maximum set S of nonzero orthogonal eigenvectors of B.
(c) Find an orthogonal matrix P such that D = P^{-1}BP is diagonal.

(a) First find the characteristic polynomial of B. We have
    tr(B) = 6,   |B| = 400,   B_11 = 0,   B_22 = -60,   B_33 = -75,   so   Σ_i B_ii = -135
Hence, Δ(t) = t^3 - 6t^2 - 135t - 400. If Δ(t) has an integer root, it must divide 400. Testing t = -5, by synthetic division, yields

    -5 |  1   -6  -135  -400
       |      -5   +55  +400
       |  1  -11   -80     0

Thus, t + 5 is a factor of Δ(t), and t^2 - 11t - 80 is a factor. Thus,
    Δ(t) = (t + 5)(t^2 - 11t - 80) = (t + 5)^2 (t - 16)
The eigenvalues of B are λ = -5 (multiplicity 2) and λ = 16 (multiplicity 1).

(b) Find an orthogonal basis for each eigenspace. Subtract λ = -5 (or add 5) down the diagonal of B to obtain the homogeneous system
    16x - 8y + 4z = 0,   -8x + 4y - 2z = 0,   4x - 2y + z = 0
That is, 4x - 2y + z = 0. The system has two independent solutions. One solution is v_1 = (0, 1, 2). We seek a second solution v_2 = (a, b, c), which is orthogonal to v_1; that is, such that
    4a - 2b + c = 0,   and also   b + 2c = 0
One such solution is v_2 = (5, 8, -4).
Subtract λ = 16 down the diagonal of B to obtain the homogeneous system
    -5x - 8y + 4z = 0,   -8x - 17y - 2z = 0,   4x - 2y - 20z = 0
This system yields a nonzero solution v_3 = (4, -2, 1). (As expected from Theorem 9.13, the eigenvector v_3 is orthogonal to v_1 and v_2.)
Then v_1, v_2, v_3 form a maximal set of nonzero orthogonal eigenvectors of B.

(c) Normalize v_1, v_2, v_3 to obtain the orthonormal basis:
    v̂_1 = v_1/√5,   v̂_2 = v_2/√105,   v̂_3 = v_3/√21
Then P is the matrix whose columns are v̂_1, v̂_2, v̂_3. Thus,
    P = [0 5/√105 4/√21; 1/√5 8/√105 -2/√21; 2/√5 -4/√105 1/√21]   and   D = P^{-1}BP = diag(-5, -5, 16)
9.26. Let q(x, y) = x^2 + 6xy - 7y^2. Find an orthogonal substitution that diagonalizes q.

Find the symmetric matrix A that represents q and its characteristic polynomial Δ(t). We have
    A = [1 3; 3 -7]   and   Δ(t) = t^2 + 6t - 16 = (t - 2)(t + 8)
The eigenvalues of A are λ = 2 and λ = -8. Thus, using s and t as new variables, a diagonal form of q is
    q(s, t) = 2s^2 - 8t^2
The corresponding orthogonal substitution is obtained by finding an orthogonal set of eigenvectors of A.

(i) Subtract λ = 2 down the diagonal of A to obtain the matrix
    M = [-1 3; 3 -9],   corresponding to   -x + 3y = 0, 3x - 9y = 0   or   -x + 3y = 0
A nonzero solution is u_1 = (3, 1).

(ii) Subtract λ = -8 (or add 8) down the diagonal of A to obtain the matrix
    M = [9 3; 3 1],   corresponding to   9x + 3y = 0, 3x + y = 0   or   3x + y = 0
A nonzero solution is u_2 = (-1, 3).

As expected, because A is symmetric, the eigenvectors u_1 and u_2 are orthogonal.
Now normalize u_1 and u_2 to obtain, respectively, the unit vectors
    û_1 = (3/√10, 1/√10)   and   û_2 = (-1/√10, 3/√10)
Finally, let P be the matrix whose columns are the unit vectors û_1 and û_2, respectively, and then [x, y]^T = P[s, t]^T is the required orthogonal change of coordinates. That is,
    P = [3/√10 -1/√10; 1/√10 3/√10]   and   x = (3s - t)/√10,   y = (s + 3t)/√10
One can also express s and t in terms of x and y by using P^{-1} = P^T. That is,
    s = (3x + y)/√10,   t = (-x + 3y)/√10
Minimal Polynomial

9.27. Let A = [4 -2 2; 6 -3 4; 3 -2 3] and B = [3 -2 2; 4 -4 6; 2 -3 5]. The characteristic polynomial of both matrices is Δ(t) = (t - 2)(t - 1)^2. Find the minimal polynomial m(t) of each matrix.

The minimal polynomial m(t) must divide Δ(t). Also, each factor of Δ(t) (i.e., t - 2 and t - 1) must also be a factor of m(t). Thus, m(t) must be exactly one of the following:
    f(t) = (t - 2)(t - 1)   or   g(t) = (t - 2)(t - 1)^2

(a) By the Cayley–Hamilton theorem, g(A) = Δ(A) = 0, so we need only test f(t). We have
    f(A) = (A - 2I)(A - I) = [2 -2 2; 6 -5 4; 3 -2 1] [3 -2 2; 6 -4 4; 3 -2 2] = [0 0 0; 0 0 0; 0 0 0]
Thus, m(t) = f(t) = (t - 2)(t - 1) = t^2 - 3t + 2 is the minimal polynomial of A.

(b) Again g(B) = Δ(B) = 0, so we need only test f(t). We get
    f(B) = (B - 2I)(B - I) = [1 -2 2; 4 -6 6; 2 -3 3] [2 -2 2; 4 -5 6; 2 -3 4] = [-2 2 -2; -4 4 -4; -2 2 -2] ≠ 0
Thus, m(t) ≠ f(t). Accordingly, m(t) = g(t) = (t - 2)(t - 1)^2 is the minimal polynomial of B. [We emphasize that we do not need to compute g(B); we know g(B) = 0 from the Cayley–Hamilton theorem.]
9.28. Find the minimal polynomial m(t) of each of the following matrices:
    (a) A = [5 1; 3 7],   (b) B = [1 2 3; 0 2 3; 0 0 3],   (c) C = [4 1; -1 2]

(a) The characteristic polynomial of A is Δ(t) = t^2 - 12t + 32 = (t - 4)(t - 8). Because Δ(t) has distinct factors, the minimal polynomial m(t) = Δ(t) = t^2 - 12t + 32.

(b) Because B is triangular, its eigenvalues are the diagonal elements 1, 2, 3, and so its characteristic polynomial is Δ(t) = (t - 1)(t - 2)(t - 3). Because Δ(t) has distinct factors, m(t) = Δ(t).

(c) The characteristic polynomial of C is Δ(t) = t^2 - 6t + 9 = (t - 3)^2. Hence the minimal polynomial of C is f(t) = t - 3 or g(t) = (t - 3)^2. However, f(C) ≠ 0; that is, C - 3I ≠ 0. Hence,
    m(t) = g(t) = Δ(t) = (t - 3)^2
9.29. Suppose S = {u_1, u_2, ..., u_n} is a basis of V, and suppose F and G are linear operators on V such that [F] has 0's on and below the diagonal, and [G] has a ≠ 0 on the superdiagonal and 0's elsewhere. That is,

    [F] = [0 a_21 a_31 ... a_n1; 0 0 a_32 ... a_n2; ...; 0 0 0 ... a_{n,n-1}; 0 0 0 ... 0],
    [G] = [0 a 0 ... 0; 0 0 a ... 0; ...; 0 0 0 ... a; 0 0 0 ... 0]

Show that (a) F^n = 0; (b) G^{n-1} ≠ 0, but G^n = 0. (These conditions also hold for [F] and [G].)

(a) We have F(u_1) = 0 and, for r > 1, F(u_r) is a linear combination of vectors preceding u_r in S. That is,
    F(u_r) = a_{r1} u_1 + a_{r2} u_2 + ··· + a_{r,r-1} u_{r-1}
Hence, F^2(u_r) = F(F(u_r)) is a linear combination of vectors preceding u_{r-1}, and so on. Hence, F^r(u_r) = 0 for each r. Thus, for each r, F^n(u_r) = F^{n-r}(0) = 0, and so F^n = 0, as claimed.

(b) We have G(u_1) = 0 and, for each k > 1, G(u_k) = a u_{k-1}. Hence, G^r(u_k) = a^r u_{k-r} for r < k. Because a ≠ 0, a^{n-1} ≠ 0. Therefore, G^{n-1}(u_n) = a^{n-1} u_1 ≠ 0, and so G^{n-1} ≠ 0. On the other hand, by (a), G^n = 0.
9.30. Let B be the matrix in Example 9.12(a) that has λ's on the diagonal, a's on the superdiagonal, where a ≠ 0, and 0's elsewhere. Show that f(t) = (t - λ)^n is both the characteristic polynomial Δ(t) and the minimal polynomial m(t) of B.

Because B is triangular with λ's on the diagonal, Δ(t) = f(t) = (t - λ)^n is its characteristic polynomial. Thus, m(t) is a power of t - λ. By Problem 9.29, (B - λI)^{n-1} ≠ 0. Hence, m(t) = Δ(t) = (t - λ)^n.
9.31. Find the characteristic polynomial Δ(t) and minimal polynomial m(t) of each matrix:
    (a) M = [4 1 0 0 0; 0 4 1 0 0; 0 0 4 0 0; 0 0 0 4 1; 0 0 0 0 4],   (b) M′ = [2 7 0 0; 0 2 0 0; 0 0 1 1; 0 0 -2 4]

(a) M is block diagonal with diagonal blocks
    A = [4 1 0; 0 4 1; 0 0 4]   and   B = [4 1; 0 4]
The characteristic and minimal polynomial of A is f(t) = (t - 4)^3, and the characteristic and minimal polynomial of B is g(t) = (t - 4)^2. Then
    Δ(t) = f(t)g(t) = (t - 4)^5   but   m(t) = LCM[f(t), g(t)] = (t - 4)^3
(where LCM means least common multiple). We emphasize that the exponent in m(t) is the size of the largest block.

(b) Here M′ is block diagonal with diagonal blocks A′ = [2 7; 0 2] and B′ = [1 1; -2 4]. The characteristic and minimal polynomial of A′ is f(t) = (t - 2)^2. The characteristic polynomial of B′ is g(t) = t^2 - 5t + 6 = (t - 2)(t - 3), which has distinct factors. Hence, g(t) is also the minimal polynomial of B′. Accordingly,
    Δ(t) = f(t)g(t) = (t - 2)^3 (t - 3)   but   m(t) = LCM[f(t), g(t)] = (t - 2)^2 (t - 3)
9.32. Find a matrix A whose minimal polynomial is f(t) = t^3 - 8t^2 + 5t + 7.

Simply let A = [0 0 -7; 1 0 -5; 0 1 8], the companion matrix of f(t) [defined in Example 9.12(b)].
9.33. Prove Theorem 9.15: The minimal polynomial m(t) of a matrix (linear operator) A divides every polynomial that has A as a zero. In particular (by the Cayley–Hamilton theorem), m(t) divides the characteristic polynomial Δ(t) of A.

Suppose f(t) is a polynomial for which f(A) = 0. By the division algorithm, there exist polynomials q(t) and r(t) for which f(t) = m(t)q(t) + r(t) and r(t) = 0 or deg r(t) < deg m(t). Substituting t = A in this equation, and using that f(A) = 0 and m(A) = 0, we obtain r(A) = 0. If r(t) ≠ 0, then r(t) is a polynomial of degree less than that of m(t) that has A as a zero. This contradicts the definition of the minimal polynomial. Thus, r(t) = 0, and so f(t) = m(t)q(t); that is, m(t) divides f(t).
9.34. Let m(t) be the minimal polynomial of an n-square matrix A. Prove that the characteristic polynomial Δ(t) of A divides [m(t)]^n.

Suppose m(t) = t^r + c_1 t^{r-1} + ··· + c_{r-1} t + c_r. Define matrices B_j as follows:
    B_0 = I,                                              so   I = B_0
    B_1 = A + c_1 I,                                      so   c_1 I = B_1 - A = B_1 - AB_0
    B_2 = A^2 + c_1 A + c_2 I,                            so   c_2 I = B_2 - A(A + c_1 I) = B_2 - AB_1
    ..........................................................................................
    B_{r-1} = A^{r-1} + c_1 A^{r-2} + ··· + c_{r-1} I,    so   c_{r-1} I = B_{r-1} - AB_{r-2}
Then
    -AB_{r-1} = c_r I - (A^r + c_1 A^{r-1} + ··· + c_{r-1} A + c_r I) = c_r I - m(A) = c_r I
Set
    B(t) = t^{r-1} B_0 + t^{r-2} B_1 + ··· + t B_{r-2} + B_{r-1}
Then
    (tI - A)B(t) = (t^r B_0 + t^{r-1} B_1 + ··· + t B_{r-1}) - (t^{r-1} AB_0 + t^{r-2} AB_1 + ··· + AB_{r-1})
                 = t^r B_0 + t^{r-1}(B_1 - AB_0) + t^{r-2}(B_2 - AB_1) + ··· + t(B_{r-1} - AB_{r-2}) - AB_{r-1}
                 = t^r I + c_1 t^{r-1} I + c_2 t^{r-2} I + ··· + c_{r-1} t I + c_r I = m(t) I
Taking the determinant of both sides gives |tI - A| |B(t)| = |m(t)I| = [m(t)]^n. Because |B(t)| is a polynomial, |tI - A| divides [m(t)]^n; that is, the characteristic polynomial of A divides [m(t)]^n.
9.35. Prove Theorem 9.16: The characteristic polynomial Δ(t) and the minimal polynomial m(t) of A have the same irreducible factors.

Suppose f(t) is an irreducible polynomial. If f(t) divides m(t), then f(t) also divides Δ(t) [because m(t) divides Δ(t)]. On the other hand, if f(t) divides Δ(t), then by Problem 9.34, f(t) also divides [m(t)]^n. But f(t) is irreducible; hence, f(t) also divides m(t). Thus, m(t) and Δ(t) have the same irreducible factors.
9.36. Prove Theorem 9.19: The minimal polynomial m(t) of a block diagonal matrix M with diagonal blocks A_i is equal to the least common multiple (LCM) of the minimal polynomials of the diagonal blocks A_i.

We prove the theorem for the case r = 2. The general theorem follows easily by induction. Suppose M = [A 0; 0 B], where A and B are square matrices. We need to show that the minimal polynomial m(t) of M is the LCM of the minimal polynomials g(t) and h(t) of A and B, respectively.
Because m(t) is the minimal polynomial of M, m(M) = [m(A) 0; 0 m(B)] = 0, and hence m(A) = 0 and m(B) = 0. Because g(t) is the minimal polynomial of A, g(t) divides m(t). Similarly, h(t) divides m(t). Thus m(t) is a multiple of g(t) and h(t).
Now let f(t) be another multiple of g(t) and h(t). Then f(M) = [f(A) 0; 0 f(B)] = [0 0; 0 0] = 0. But m(t) is the minimal polynomial of M; hence, m(t) divides f(t). Thus, m(t) is the LCM of g(t) and h(t).
9.37. Suppose m(t) = t^r + a_{r−1} t^{r−1} + ··· + a1 t + a0 is the minimal polynomial of an n-square matrix A. Prove the following:
      (a) A is nonsingular if and only if the constant term a0 ≠ 0.
      (b) If A is nonsingular, then A^{−1} is a polynomial in A of degree r − 1 < n.

      (a) The following are equivalent: (i) A is nonsingular, (ii) 0 is not a root of m(t), (iii) a0 ≠ 0. Thus, the statement is true.
      (b) Because A is nonsingular, a0 ≠ 0 by (a). We have
              m(A) = A^r + a_{r−1} A^{r−1} + ··· + a1 A + a0 I = 0
          Thus,
              −(1/a0)(A^{r−1} + a_{r−1} A^{r−2} + ··· + a1 I) A = I
          Accordingly,
              A^{−1} = −(1/a0)(A^{r−1} + a_{r−1} A^{r−2} + ··· + a1 I)
SUPPLEMENTARY PROBLEMS
Polynomials of Matrices
9.38. Let A ¼
2 3
5 1
 
and B ¼
1 2
0 3
 
. Find f ðAÞ, gðAÞ, f ðBÞ, gðBÞ, where f ðtÞ ¼ 2t2
 5t þ 6 and
gðtÞ ¼ t3
 2t2
þ t þ 3.
9.39. Let A ¼
1 2
0 1
 
. Find A2
, A3
, An
, where n  3, and A1
.
9.40. Let B ¼
8 12 0
0 8 12
0 0 8
2
4
3
5. Find a real matrix A such that B ¼ A3
.
9.41. For each matrix, find a polynomial having the following matrix as a root:
(a) A ¼
2 5
1 3
 
, (b) B ¼
2 3
7 4
 
, (c) C ¼
1 1 2
1 2 3
2 1 4
2
4
3
5
9.42. Let A be any square matrix and let f(t) be any polynomial. Prove
      (a) (P^{−1}AP)^n = P^{−1}A^n P,   (b) f(P^{−1}AP) = P^{−1} f(A) P,
      (c) f(A^T) = [f(A)]^T,   (d) If A is symmetric, then f(A) is symmetric.

9.43. Let M = diag[A1, ..., Ar] be a block diagonal matrix, and let f(t) be any polynomial. Show that f(M) is block diagonal and f(M) = diag[f(A1), ..., f(Ar)].

9.44. Let M be a block triangular matrix with diagonal blocks A1, ..., Ar, and let f(t) be any polynomial. Show that f(M) is also a block triangular matrix, with diagonal blocks f(A1), ..., f(Ar).
Eigenvalues and Eigenvectors
9.45. For each of the following matrices, find all eigenvalues and corresponding linearly independent eigen-
vectors:
(a) A ¼
2 3
2 5
 
, (b) B ¼
2 4
1 6
 
, (c) C ¼
1 4
3 7
 
When possible, find the nonsingular matrix P that diagonalizes the matrix.
9.46. Let A ¼
2 1
2 3
 
.
(a) Find eigenvalues and corresponding eigenvectors.
(b) Find a nonsingular matrix P such that D ¼ P1
AP is diagonal.
(c) Find A8
and f ðAÞ where f ðtÞ ¼ t4
 5t3
þ 7t2
 2t þ 5.
(d) Find a matrix B such that B2
¼ A.
9.47. Repeat Problem 9.46 for A ¼
5 6
2 2
 
.
9.48. For each of the following matrices, find all eigenvalues and a maximum set S of linearly independent
eigenvectors:
(a) A ¼
1 3 3
3 5 3
6 6 4
2
4
3
5, (b) B ¼
3 1 1
7 5 1
6 6 2
2
4
3
5, (c) C ¼
1 2 2
1 2 1
1 1 4
2
4
3
5
Which matrices can be diagonalized, and why?
9.49. For each of the following linear operators T: R^2 → R^2, find all eigenvalues and a basis for each eigenspace:
      (a) T(x, y) = (3x + 3y, x + 5y),   (b) T(x, y) = (3x − 13y, x − 3y).

9.50. Let A = [a, b; c, d] be a real matrix. Find necessary and sufficient conditions on a, b, c, d so that A is diagonalizable; that is, so that A has two (real) linearly independent eigenvectors.

9.51. Show that matrices A and A^T have the same eigenvalues. Give an example of a 2 × 2 matrix A where A and A^T have different eigenvectors.

9.52. Suppose v is an eigenvector of linear operators F and G. Show that v is also an eigenvector of the linear operator kF + k′G, where k and k′ are scalars.

9.53. Suppose v is an eigenvector of a linear operator T belonging to the eigenvalue λ. Prove
      (a) For n ≥ 0, v is an eigenvector of T^n belonging to λ^n.
      (b) f(λ) is an eigenvalue of f(T) for any polynomial f(t).

9.54. Suppose λ ≠ 0 is an eigenvalue of the composition F ∘ G of linear operators F and G. Show that λ is also an eigenvalue of the composition G ∘ F. [Hint: Show that G(v) is an eigenvector of G ∘ F.]

9.55. Let E: V → V be a projection mapping; that is, E^2 = E. Show that E is diagonalizable and, in fact, can be represented by the diagonal matrix M = [Ir, 0; 0, 0], where r is the rank of E.
Diagonalizing Real Symmetric Matrices and Quadratic Forms
9.56. For each of the following symmetric matrices A, find an orthogonal matrix P and a diagonal matrix D such
that D ¼ P1
AP:
(a) A ¼
5 4
4 1
 
, (b) A ¼
4 1
1 4
 
, (c) A ¼
7 3
3 1
 
9.57. For each of the following symmetric matrices B, find its eigenvalues, a maximal orthogonal set S of
eigenvectors, and an orthogonal matrix P such that D ¼ P1
BP is diagonal:
(a) B ¼
0 1 1
1 0 1
1 1 0
2
4
3
5, (b) B ¼
2 2 4
2 5 8
4 8 17
2
4
3
5
9.58. Using variables s and t, find an orthogonal substitution that diagonalizes each of the following quadratic forms:
      (a) q(x, y) = 4x^2 + 8xy − 11y^2,   (b) q(x, y) = 2x^2 − 6xy + 10y^2

9.59. For each of the following quadratic forms q(x, y, z), find an orthogonal substitution expressing x, y, z in terms of variables r, s, t, and find q(r, s, t):
      (a) q(x, y, z) = 5x^2 + 3y^2 + 12xz,   (b) q(x, y, z) = 3x^2 − 4xy + 6y^2 + 2xz − 4yz + 3z^2
9.60. Find a real 2 2 symmetric matrix A with eigenvalues:
(a) l ¼ 1 and l ¼ 4 and eigenvector u ¼ ð1; 1Þ belonging to l ¼ 1;
(b) l ¼ 2 and l ¼ 3 and eigenvector u ¼ ð1; 2Þ belonging to l ¼ 2.
In each case, find a matrix B for which B2
¼ A.
Characteristic and Minimal Polynomials
9.61. Find the characteristic and minimal polynomials of each of the following matrices:
(a) A ¼
3 1 1
2 4 2
1 1 3
2
4
3
5, (b) B ¼
3 2 1
3 8 3
3 6 1
2
4
3
5
9.62. Find the characteristic and minimal polynomials of each of the following matrices:
(a) A ¼
2 5 0 0 0
0 2 0 0 0
0 0 4 2 0
0 0 3 5 0
0 0 0 0 7
2
6
6
6
6
4
3
7
7
7
7
5
, (b) B ¼
4 1 0 0 0
1 2 0 0 0
0 0 3 1 0
0 0 0 3 1
0 0 0 0 3
2
6
6
6
6
4
3
7
7
7
7
5
, (c) C ¼
3 2 0 0 0
1 4 0 0 0
0 0 3 1 0
0 0 1 3 0
0 0 0 0 4
2
6
6
6
6
4
3
7
7
7
7
5
9.63. Let A = [1, 1, 0; 0, 2, 0; 0, 0, 1] and B = [2, 0, 0; 0, 2, 2; 0, 0, 1]. Show that A and B have different characteristic polynomials (and so are not similar) but have the same minimal polynomial. Thus, nonsimilar matrices may have the same minimal polynomial.
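As a quick check of this phenomenon (my own sketch, not part of the text, using the matrix entries as printed above), SymPy confirms that A and B have different characteristic polynomials while both are annihilated by (t − 1)(t − 2), which is therefore the minimal polynomial of each:

```python
from sympy import Matrix, symbols, factor, eye

t = symbols('t')
A = Matrix([[1, 1, 0], [0, 2, 0], [0, 0, 1]])
B = Matrix([[2, 0, 0], [0, 2, 2], [0, 0, 1]])

print(factor(A.charpoly(t).as_expr()))   # (t - 1)**2 * (t - 2)
print(factor(B.charpoly(t).as_expr()))   # (t - 1) * (t - 2)**2

# Both are zeros of (t - 1)(t - 2), and no linear polynomial annihilates them,
# so (t - 1)(t - 2) is the minimal polynomial of both.
print((A - eye(3)) * (A - 2*eye(3)))     # zero matrix
print((B - eye(3)) * (B - 2*eye(3)))     # zero matrix
```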
9.64. Let A be an n-square matrix for which A^k = 0 for some k > n. Show that A^n = 0.

9.65. Show that a matrix A and its transpose A^T have the same minimal polynomial.

9.66. Suppose f(t) is an irreducible monic polynomial for which f(A) = 0 for a matrix A. Show that f(t) is the minimal polynomial of A.

9.67. Show that A is a scalar matrix kI if and only if the minimal polynomial of A is m(t) = t − k.

9.68. Find a matrix A whose minimal polynomial is (a) t^3 − 5t^2 + 6t + 8,   (b) t^4 − 5t^3 − 2t^2 + 7t + 4.

9.69. Let f(t) and g(t) be monic polynomials (leading coefficient one) of minimal degree for which A is a root. Show f(t) = g(t). [Thus, the minimal polynomial of A is unique.]
ANSWERS TO SUPPLEMENTARY PROBLEMS

Notation: M = [R1; R2; ...] denotes a matrix M with rows R1, R2, ....
9.38. f ðAÞ ¼ ½26; 3; 5; 27, gðAÞ ¼ ½40; 39; 65; 27,
f ðBÞ ¼ ½3; 6; 0; 9, gðBÞ ¼ ½3; 12; 0; 15
9.39. A2
¼ ½1; 4; 0; 1, A3
¼ ½1; 6; 0; 1, An
¼ ½1; 2n; 0; 1, A1
¼ ½1; 2; 0; 1
9.40. Let A ¼ ½2; a; b; 0; 2; c; 0; 0; 2. Set B ¼ A3
and then a ¼ 1, b ¼  1
2, c ¼ 1
9.41. Find DðtÞ: (a) t2
þ t  11, (b) t2
þ 2t þ 13, (c) t3
 7t2
þ 6t  1
9.45. (a) l ¼ 1; u ¼ ð3; 1Þ; l ¼ 4; v ¼ ð1; 2Þ, (b) l ¼ 4; u ¼ ð2; 1Þ,
(c) l ¼ 1; u ¼ ð2; 1Þ; l ¼ 5; v ¼ ð2; 3Þ. Only A and C can be diagonalized; use P ¼ ½u; v.
9.46. (a) l ¼ 1; u ¼ ð1; 1Þ; l ¼ 4; v ¼ ð1; 2Þ,
(b) P ¼ ½u; v,
(c) f ðAÞ ¼ ½3; 1; 2; 1; A8
¼ ½21 846; 21 845; 43 690; 43 691,
(d) B ¼ 4
3 ;  1
3 ;  2
3 ; 5
3
 
9.47. (a) l ¼ 1; u ¼ ð3; 2Þ; l ¼ 2; v ¼ ð2; 1Þ, (b) P ¼ ½u; v,
(c) f ðAÞ ¼ ½2; 6; 2; 9; A8
¼ ½1021; 1530; 510; 764,
(d) B ¼ ½3 þ 4
ffiffiffi
2
p
; 6 þ 6
ffiffiffi
2
p
; 2  2
ffiffiffi
2
p
; 4  3
ffiffiffi
2
p

9.48. (a) l ¼ 2; u ¼ ð1; 1; 0Þ; v ¼ ð1; 0; 1Þ; l ¼ 4; w ¼ ð1; 1; 2Þ,
(b) l ¼ 2; u ¼ ð1; 1; 0Þ; l ¼ 4; v ¼ ð0; 1; 1Þ,
(c) l ¼ 3; u ¼ ð1; 1; 0Þ; v ¼ ð1; 0; 1Þ; l ¼ 1; w ¼ ð2; 1; 1Þ. Only A and C can be diagonalized; use
P ¼ ½u; v; w:
9.49. (a) l ¼ 2; u ¼ ð3; 1Þ; l ¼ 6; v ¼ ð1; 1Þ, (b) No real eigenvalues
9.50. We need ½trðAÞ2
 4½detðAÞ  0 or ða  dÞ2
þ 4bc  0.
9.51. A ¼ ½1; 1; 0; 1
9.56. (a) P ¼ ½2; 1; 1; 2=
ffiffiffi
5
p
, D ¼ ½7; 0; 0; 3,
(b) P ¼ ½1; 1; 1; 1=
ffiffiffi
2
p
, D ¼ ½3; 0; 0; 5,
(c) P ¼ ½3; 1; 1; 3=
ffiffiffiffiffi
10
p
, D ¼ ½8; 0; 0; 2
9.57. (a) l ¼ 1; u ¼ ð1; 1; 0Þ; v ¼ ð1; 1; 2Þ; l ¼ 2; w ¼ ð1; 1; 1Þ,
(b) l ¼ 1; u ¼ ð2; 1; 1Þ; v ¼ ð2; 3; 1Þ; l ¼ 22; w ¼ ð1; 2; 4Þ;
Normalize u; v; w, obtaining ^
u; ^
v; ^
w, and set P ¼ ½^
u; ^
v; ^
w. (Remark: u and v are not unique.)
9.58. (a) x ¼ ð4s þ tÞ=
ffiffiffiffiffi
17
p
; y ¼ ðs þ 4tÞ=
ffiffiffiffiffi
17
p
; qðs; tÞ ¼ 5s2
 12t2
,
(b) x ¼ ð3s  tÞ=
ffiffiffiffiffi
10
p
; y ¼ ðs þ 3tÞ=
ffiffiffiffiffi
10
p
; qðs; tÞ ¼ s2
þ 11t2
9.59. (a) x ¼ ð3s þ 2tÞ=
ffiffiffiffiffi
13
p
; y ¼ r; z ¼ ð2s  3tÞ=
ffiffiffiffiffi
13
p
; qðr; s; tÞ ¼ 3r2
þ 9s2
 4t2
,
(b) x ¼ 5Ks þ Lt; y ¼ Jr þ 2Ks  2Lt; z ¼ 2Jr  Ks  Lt, where J ¼ 1=
ffiffiffi
5
p
, K ¼ 1=
ffiffiffiffiffi
30
p
,
L ¼ 1=
ffiffiffi
6
p
; qðr; s; tÞ ¼ 2r2
þ 2s2
þ 8t2
9.60. (a) A ¼ 1
2 ½5; 3; 3; 5; B ¼ 1
2 ½3; 1; 1; 3,
(b) A ¼ 1
5 ½14; 2; 2; 11, B ¼ 1
5 ½
ffiffiffi
2
p
þ 4
ffiffiffi
3
p
; 2
ffiffiffi
2
p
 2
ffiffiffi
3
p
; 2
ffiffiffi
2
p
 2
ffiffiffi
3
p
; 4
ffiffiffi
2
p
þ
ffiffiffi
3
p

9.61. (a) DðtÞ ¼ mðtÞ ¼ ðt  2Þ2
ðt  6Þ, (b) DðtÞ ¼ ðt  2Þ2
ðt  6Þ; mðtÞ ¼ ðt  2Þðt  6Þ
9.62. (a) DðtÞ ¼ ðt  2Þ3
ðt  7Þ2
; mðtÞ ¼ ðt  2Þ2
ðt  7Þ,
(b) DðtÞ ¼ ðt  3Þ5
; mðtÞ ¼ ðt  3Þ3
,
(c) DðtÞ ¼ ðt  2Þ2
ðt  4Þ2
ðt  5Þ; mðtÞ ¼ ðt  2Þðt  4Þðt  5Þ
9.68. Let A be the companion matrix [Example 9.12(b)] with last column: (a) ½8; 6; 5T
, (b) ½4; 7; 2; 5T
9.69. Hint: A is a root of hðtÞ ¼ f ðtÞ  gðtÞ, where hðtÞ  0 or the degree of hðtÞ is less than the degree of f ðtÞ:
Canonical Forms
10.1 Introduction
Let T be a linear operator on a vector space of finite dimension. As seen in Chapter 6, T may not have a
diagonal matrix representation. However, it is still possible to ‘‘simplify’’ the matrix representation of T
in a number of ways. This is the main topic of this chapter. In particular, we obtain the primary
decomposition theorem, and the triangular, Jordan, and rational canonical forms.
We comment that the triangular and Jordan canonical forms exist for T if and only if the characteristic polynomial Δ(t) of T has all its roots in the base field K. This is always true if K is the complex field C but may not be true if K is the real field R.
We also introduce the idea of a quotient space. This is a very powerful tool, and it will be used in the
proof of the existence of the triangular and rational canonical forms.
10.2 Triangular Form

Let T be a linear operator on an n-dimensional vector space V. Suppose T can be represented by the triangular matrix

        A = [ a11  a12  ...  a1n ]
            [      a22  ...  a2n ]
            [            ...     ]
            [                ann ]

Then the characteristic polynomial Δ(t) of T is a product of linear factors; that is,

        Δ(t) = det(tI − A) = (t − a11)(t − a22) ··· (t − ann)

The converse is also true and is an important theorem (proved in Problem 10.28).

THEOREM 10.1:  Let T: V → V be a linear operator whose characteristic polynomial factors into linear polynomials. Then there exists a basis of V in which T is represented by a triangular matrix.

THEOREM 10.1:  (Alternative Form) Let A be a square matrix whose characteristic polynomial factors into linear polynomials. Then A is similar to a triangular matrix; that is, there exists an invertible matrix P such that P^{−1}AP is triangular.

We say that an operator T can be brought into triangular form if it can be represented by a triangular matrix. Note that in this case, the eigenvalues of T are precisely those entries appearing on the main diagonal. We give an application of this remark.
EXAMPLE 10.1  Let A be a square matrix over the complex field C. Suppose λ is an eigenvalue of A^2. Show that √λ or −√λ is an eigenvalue of A.

By Theorem 10.1, A and A^2 are similar, respectively, to triangular matrices of the form

        B = [ μ1   *  ...  *  ]               B^2 = [ μ1^2   *    ...   *   ]
            [     μ2  ...  *  ]                     [       μ2^2  ...   *   ]
            [          ...    ]      and            [              ...      ]
            [            μn   ]                     [                  μn^2 ]

Because similar matrices have the same eigenvalues, λ = μi^2 for some i. Hence, μi = √λ or μi = −√λ is an eigenvalue of A.
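For a numerical perspective (my own illustration, not part of the text): over C, one triangular matrix similar to A can be computed with SciPy's Schur decomposition A = Q T Q^H, where Q is unitary and T is upper triangular with the eigenvalues of A on its diagonal. A unitary similarity is a special case of the similarity promised by Theorem 10.1.

```python
import numpy as np
from scipy.linalg import schur

A = np.array([[0.0, -1.0],
              [1.0,  0.0]])          # rotation by 90 degrees; eigenvalues are +i and -i

T, Q = schur(A, output='complex')    # T upper triangular, Q unitary, A = Q T Q^H
print(np.round(np.diag(T), 6))       # approximately [1j, -1j] (order may vary)
print(np.allclose(Q @ T @ Q.conj().T, A))   # True
```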
10.3 Invariance
Let T: V → V be linear. A subspace W of V is said to be invariant under T or T-invariant if T maps W into itself; that is, if v ∈ W implies T(v) ∈ W. In this case, T restricted to W defines a linear operator on W; that is, T induces a linear operator T̂: W → W defined by T̂(w) = T(w) for every w ∈ W.

EXAMPLE 10.2
(a) Let T: R^3 → R^3 be the following linear operator, which rotates each vector v about the z-axis by an angle θ (shown in Fig. 10-1):
        T(x, y, z) = (x cos θ − y sin θ,  x sin θ + y cos θ,  z)
    Observe that each vector w = (a, b, 0) in the xy-plane W remains in W under the mapping T; hence, W is T-invariant. Observe also that the z-axis U is invariant under T. Furthermore, the restriction of T to W rotates each vector about the origin O, and the restriction of T to U is the identity mapping of U.

(b) Nonzero eigenvectors of a linear operator T: V → V may be characterized as generators of T-invariant one-dimensional subspaces. Suppose T(v) = λv, v ≠ 0. Then W = {kv, k ∈ K}, the one-dimensional subspace generated by v, is invariant under T because
        T(kv) = kT(v) = k(λv) = kλv ∈ W
    Conversely, suppose dim U = 1 and u ≠ 0 spans U, and U is invariant under T. Then T(u) ∈ U and so T(u) is a multiple of u; that is, T(u) = μu. Hence, u is an eigenvector of T.
The next theorem (proved in Problem 10.3) gives us an important class of invariant subspaces.

THEOREM 10.2:  Let T: V → V be any linear operator, and let f(t) be any polynomial. Then the kernel of f(T) is invariant under T.

The notion of invariance is related to matrix representations (Problem 10.5) as follows.

THEOREM 10.3:  Suppose W is an invariant subspace of T: V → V. Then T has a block matrix representation [A, B; 0, C], where A is a matrix representation of the restriction T̂ of T to W.
[Figure 10-1: rotation of R^3 about the z-axis through the angle θ; the xy-plane W and the z-axis U are invariant under T.]
10.4 Invariant Direct-Sum Decompositions

A vector space V is termed the direct sum of subspaces W1, ..., Wr, written
        V = W1 ⊕ W2 ⊕ ··· ⊕ Wr
if every vector v ∈ V can be written uniquely in the form
        v = w1 + w2 + ··· + wr,   with wi ∈ Wi

The following theorem (proved in Problem 10.7) holds.

THEOREM 10.4:  Suppose W1, W2, ..., Wr are subspaces of V, and suppose
        B1 = {w11, w12, ..., w_{1n1}}, ..., Br = {wr1, wr2, ..., w_{rnr}}
are bases of W1, W2, ..., Wr, respectively. Then V is the direct sum of the Wi if and only if the union B = B1 ∪ ··· ∪ Br is a basis of V.

Now suppose T: V → V is linear and V is the direct sum of (nonzero) T-invariant subspaces W1, W2, ..., Wr; that is,
        V = W1 ⊕ ··· ⊕ Wr   and   T(Wi) ⊆ Wi,   i = 1, ..., r
Let Ti denote the restriction of T to Wi. Then T is said to be decomposable into the operators Ti, or T is said to be the direct sum of the Ti, written T = T1 ⊕ ··· ⊕ Tr. Also, the subspaces W1, ..., Wr are said to reduce T or to form a T-invariant direct-sum decomposition of V.

Consider the special case where two subspaces U and W reduce an operator T: V → V; say dim U = 2 and dim W = 3, and suppose {u1, u2} and {w1, w2, w3} are bases of U and W, respectively. If T1 and T2 denote the restrictions of T to U and W, respectively, then
        T1(u1) = a11 u1 + a12 u2
        T1(u2) = a21 u1 + a22 u2
        T2(w1) = b11 w1 + b12 w2 + b13 w3
        T2(w2) = b21 w1 + b22 w2 + b23 w3
        T2(w3) = b31 w1 + b32 w2 + b33 w3

Accordingly, the following matrices A, B, M are the matrix representations of T1, T2, T, respectively:
        A = [a11, a21; a12, a22],    B = [b11, b21, b31; b12, b22, b32; b13, b23, b33],    M = [A, 0; 0, B]

The block diagonal matrix M results from the fact that {u1, u2, w1, w2, w3} is a basis of V (Theorem 10.4), and that T(ui) = T1(ui) and T(wj) = T2(wj).

A generalization of the above argument gives us the following theorem.

THEOREM 10.5:  Suppose T: V → V is linear and suppose V is the direct sum of T-invariant subspaces, say, W1, ..., Wr. If Ai is a matrix representation of the restriction of T to Wi, then T can be represented by the block diagonal matrix
        M = diag(A1, A2, ..., Ar)
10.5 Primary Decomposition

The following theorem shows that any operator T: V → V is decomposable into operators whose minimal polynomials are powers of irreducible polynomials. This is the first step in obtaining a canonical form for T.
THEOREM 10.6:  (Primary Decomposition Theorem) Let T: V → V be a linear operator with minimal polynomial
        m(t) = f1(t)^{n1} f2(t)^{n2} ··· fr(t)^{nr}
where the fi(t) are distinct monic irreducible polynomials. Then V is the direct sum of T-invariant subspaces W1, ..., Wr, where Wi is the kernel of fi(T)^{ni}. Moreover, fi(t)^{ni} is the minimal polynomial of the restriction of T to Wi.

The above polynomials fi(t)^{ni} are relatively prime. Therefore, the above fundamental theorem follows (Problem 10.11) from the next two theorems (proved in Problems 10.9 and 10.10, respectively).

THEOREM 10.7:  Suppose T: V → V is linear, and suppose f(t) = g(t)h(t) are polynomials such that f(T) = 0 and g(t) and h(t) are relatively prime. Then V is the direct sum of the T-invariant subspaces U and W, where U = Ker g(T) and W = Ker h(T).

THEOREM 10.8:  In Theorem 10.7, if f(t) is the minimal polynomial of T [and g(t) and h(t) are monic], then g(t) and h(t) are the minimal polynomials of the restrictions of T to U and W, respectively.

We will also use the primary decomposition theorem to prove the following useful characterization of diagonalizable operators (see Problem 10.12 for the proof).

THEOREM 10.9:  A linear operator T: V → V is diagonalizable if and only if its minimal polynomial m(t) is a product of distinct linear polynomials.

THEOREM 10.9:  (Alternative Form) A matrix A is similar to a diagonal matrix if and only if its minimal polynomial is a product of distinct linear polynomials.
EXAMPLE 10.3  Suppose A ≠ I is a square matrix for which A^3 = I. Determine whether or not A is similar to a diagonal matrix if A is a matrix over: (i) the real field R, (ii) the complex field C.

Because A^3 = I, A is a zero of the polynomial f(t) = t^3 − 1 = (t − 1)(t^2 + t + 1). The minimal polynomial m(t) of A cannot be t − 1, because A ≠ I. Hence,
        m(t) = t^2 + t + 1   or   m(t) = t^3 − 1
Because neither polynomial is a product of linear polynomials over R, A is not diagonalizable over R. On the other hand, each of the polynomials is a product of distinct linear polynomials over C. Hence, A is diagonalizable over C.
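A concrete instance of Example 10.3 can be checked in SymPy (my own sketch; the particular matrix is mine, chosen so that its minimal polynomial is t^2 + t + 1):

```python
from sympy import Matrix, eye

A = Matrix([[0, -1],
            [1, -1]])        # a real matrix with A**3 = I and A != I

print(A**3 == eye(2))        # True
print(A.eigenvals())         # the two primitive cube roots of unity (not real)
P, D = A.diagonalize()       # SymPy diagonalizes it, using complex entries
print(D)
```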
10.6 Nilpotent Operators

A linear operator T: V → V is termed nilpotent if T^n = 0 for some positive integer n; we call k the index of nilpotency of T if T^k = 0 but T^{k−1} ≠ 0. Analogously, a square matrix A is termed nilpotent if A^n = 0 for some positive integer n, and of index k if A^k = 0 but A^{k−1} ≠ 0. Clearly the minimal polynomial of a nilpotent operator (matrix) of index k is m(t) = t^k; hence, 0 is its only eigenvalue.
EXAMPLE 10.4  The following two r-square matrices will be used throughout the chapter:

        N = N(r) = [ 0 1 0 ... 0 0 ]              J(λ) = [ λ 1 0 ... 0 0 ]
                   [ 0 0 1 ... 0 0 ]                     [ 0 λ 1 ... 0 0 ]
                   [ ............. ]      and            [ ............. ]
                   [ 0 0 0 ... 0 1 ]                     [ 0 0 0 ... λ 1 ]
                   [ 0 0 0 ... 0 0 ]                     [ 0 0 0 ... 0 λ ]
The first matrix N, called a Jordan nilpotent block, consists of 1's above the diagonal (called the superdiagonal) and 0's elsewhere. It is a nilpotent matrix of index r. (The matrix N of order 1 is just the 1 × 1 zero matrix [0].)

The second matrix J(λ), called a Jordan block belonging to the eigenvalue λ, consists of λ's on the diagonal, 1's on the superdiagonal, and 0's elsewhere. Observe that
        J(λ) = λI + N
In fact, we will prove that any linear operator T can be decomposed into operators, each of which is the sum of a scalar operator and a nilpotent operator.
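The two matrices are easy to generate programmatically. The following NumPy helpers (my own names, not the book's) build N(r) and J(λ) and confirm that N(r) has index of nilpotency r:

```python
import numpy as np

def jordan_nilpotent(r: int) -> np.ndarray:
    """r x r matrix with 1's on the superdiagonal and 0's elsewhere."""
    return np.eye(r, k=1)

def jordan_block(lam: float, r: int) -> np.ndarray:
    """r x r Jordan block belonging to the eigenvalue lam: J(lam) = lam*I + N(r)."""
    return lam * np.eye(r) + jordan_nilpotent(r)

N = jordan_nilpotent(4)
print(np.linalg.matrix_power(N, 3))   # nonzero
print(np.linalg.matrix_power(N, 4))   # zero matrix, so N(4) has index 4
print(jordan_block(5.0, 3))
```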
The following (proved in Problem 10.16) is a fundamental result on nilpotent operators.

THEOREM 10.10:  Let T: V → V be a nilpotent operator of index k. Then T has a block diagonal matrix representation in which each diagonal entry is a Jordan nilpotent block N. There is at least one N of order k, and all other N are of orders ≤ k. The number of N of each possible order is uniquely determined by T. The total number of N of all orders is equal to the nullity of T.

The proof of Theorem 10.10 shows that the number of N of order i is equal to 2m_i − m_{i+1} − m_{i−1}, where m_i is the nullity of T^i.
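This count is easy to compute numerically. The sketch below (mine, not from the text) estimates the nullities m_i = dim Ker T^i with NumPy and applies the formula 2m_i − m_{i+1} − m_{i−1}; for concreteness it uses the first matrix of Problem 10.17 below and recovers one block of order 3 and two of order 1.

```python
import numpy as np

def block_counts(A: np.ndarray) -> dict:
    """Number of Jordan nilpotent blocks of each order for a nilpotent matrix A."""
    n = A.shape[0]
    m = [0]                                                   # m_0 = 0
    for i in range(1, n + 2):
        Ai = np.linalg.matrix_power(A, i)
        m.append(n - np.linalg.matrix_rank(Ai))               # nullity of A^i
    return {i: 2*m[i] - m[i+1] - m[i-1] for i in range(1, n + 1)}

A = np.array([[0, 1, 1, 0, 1],
              [0, 0, 1, 1, 1],
              [0, 0, 0, 0, 0],
              [0, 0, 0, 0, 0],
              [0, 0, 0, 0, 0]], dtype=float)
print(block_counts(A))   # {1: 2, 2: 0, 3: 1, 4: 0, 5: 0}: two N(1) and one N(3)
```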
10.7 Jordan Canonical Form

An operator T can be put into Jordan canonical form if its characteristic and minimal polynomials factor into linear polynomials. This is always true if K is the complex field C. In any case, we can always extend the base field K to a field in which the characteristic and minimal polynomials do factor into linear factors; thus, in a broad sense, every operator has a Jordan canonical form. Analogously, every matrix is similar to a matrix in Jordan canonical form.

The following theorem (proved in Problem 10.18) describes the Jordan canonical form J of a linear operator T.

THEOREM 10.11:  Let T: V → V be a linear operator whose characteristic and minimal polynomials are, respectively,
        Δ(t) = (t − λ1)^{n1} ··· (t − λr)^{nr}   and   m(t) = (t − λ1)^{m1} ··· (t − λr)^{mr}
where the λi are distinct scalars. Then T has a block diagonal matrix representation J in which each diagonal entry is a Jordan block Jij = J(λi). For each λi, the corresponding Jij have the following properties:
    (i)   There is at least one Jij of order mi; all other Jij are of order ≤ mi.
    (ii)  The sum of the orders of the Jij is ni.
    (iii) The number of Jij equals the geometric multiplicity of λi.
    (iv)  The number of Jij of each possible order is uniquely determined by T.
EXAMPLE 10.5  Suppose the characteristic and minimal polynomials of an operator T are, respectively,
        Δ(t) = (t − 2)^4 (t − 5)^3   and   m(t) = (t − 2)^2 (t − 5)^3
Then the Jordan canonical form of T is one of the following block diagonal matrices:

        diag( [2, 1; 0, 2],  [2, 1; 0, 2],  [5, 1, 0; 0, 5, 1; 0, 0, 5] )
   or   diag( [2, 1; 0, 2],  [2],  [2],  [5, 1, 0; 0, 5, 1; 0, 0, 5] )

The first matrix occurs if T has two independent eigenvectors belonging to the eigenvalue 2; and the second matrix occurs if T has three independent eigenvectors belonging to the eigenvalue 2.
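SymPy can produce the Jordan canonical form directly. In the sketch below (my own illustration, not part of the text), a matrix is built by conjugating a known Jordan matrix by an invertible change of basis, and jordan_form() recovers a Jordan matrix with the same blocks (possibly listed in a different order):

```python
from sympy import Matrix

J0 = Matrix([[2, 1, 0, 0],
             [0, 2, 0, 0],
             [0, 0, 2, 0],
             [0, 0, 0, 5]])          # blocks: J(2) of order 2, [2], [5]
P0 = Matrix([[1, 2, 0, 0],
             [0, 1, 3, 0],
             [0, 0, 1, 4],
             [0, 0, 0, 1]])          # an invertible (unit upper triangular) change of basis

M = P0 * J0 * P0.inv()
P, J = M.jordan_form()               # P**-1 * M * P = J
print(J)                             # same block structure as J0, block order may differ
```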
10.8 Cyclic Subspaces

Let T be a linear operator on a vector space V of finite dimension over K. Suppose v ∈ V and v ≠ 0. The set of all vectors of the form f(T)(v), where f(t) ranges over all polynomials over K, is a T-invariant subspace of V called the T-cyclic subspace of V generated by v; we denote it by Z(v, T) and denote the restriction of T to Z(v, T) by Tv. By Problem 10.56, we could equivalently define Z(v, T) as the intersection of all T-invariant subspaces of V containing v.

Now consider the sequence
        v, T(v), T^2(v), T^3(v), ...
of powers of T acting on v. Let k be the least integer such that T^k(v) is a linear combination of those vectors that precede it in the sequence, say,
        T^k(v) = −a_{k−1} T^{k−1}(v) − ··· − a1 T(v) − a0 v
Then
        mv(t) = t^k + a_{k−1} t^{k−1} + ··· + a1 t + a0
is the unique monic polynomial of lowest degree for which mv(T)(v) = 0. We call mv(t) the T-annihilator of v and Z(v, T).

The following theorem (proved in Problem 10.29) holds.

THEOREM 10.12:  Let Z(v, T), Tv, mv(t) be defined as above. Then
    (i)   The set {v, T(v), ..., T^{k−1}(v)} is a basis of Z(v, T); hence, dim Z(v, T) = k.
    (ii)  The minimal polynomial of Tv is mv(t).
    (iii) The matrix representation of Tv in the above basis is just the companion matrix C(mv) of mv(t); that is,

            C(mv) = [ 0 0 0 ... 0  −a0      ]
                    [ 1 0 0 ... 0  −a1      ]
                    [ 0 1 0 ... 0  −a2      ]
                    [ ...................... ]
                    [ 0 0 0 ... 0  −a_{k−2} ]
                    [ 0 0 0 ... 1  −a_{k−1} ]
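The T-annihilator can be found by exactly the process described above: apply T repeatedly to v until the new vector is a linear combination of the earlier ones, then read off the coefficients. A short SymPy sketch (the function and its name are mine, not the book's):

```python
from sympy import Matrix, symbols

def t_annihilator(A, v, t):
    """Monic polynomial m_v(t) of least degree with m_v(A)(v) = 0 (illustrative sketch)."""
    basis = [v]                                          # v, A v, A^2 v, ...
    while True:
        w = A * basis[-1]
        B = Matrix.hstack(*basis)
        if Matrix.hstack(B, w).rank() == B.rank():       # w already lies in the span
            coeffs, _ = B.gauss_jordan_solve(w)          # w = sum_i c_i A^i v
            k = len(basis)
            return (t**k - sum(coeffs[i] * t**i for i in range(k))).expand()
        basis.append(w)

t = symbols('t')
A = Matrix([[0, 0, -7], [1, 0, -5], [0, 1, 8]])          # companion matrix from Problem 9.32
v = Matrix([1, 0, 0])
print(t_annihilator(A, v, t))                            # t**3 - 8*t**2 + 5*t + 7
```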
10.9 Rational Canonical Form

In this section, we present the rational canonical form for a linear operator T: V → V. We emphasize that this form exists even when the minimal polynomial cannot be factored into linear polynomials. (Recall that this is not the case for the Jordan canonical form.)
LEMMA 10.13:  Let T: V → V be a linear operator whose minimal polynomial is f(t)^n, where f(t) is a monic irreducible polynomial. Then V is the direct sum
        V = Z(v1, T) ⊕ ··· ⊕ Z(vr, T)
of T-cyclic subspaces Z(vi, T) with corresponding T-annihilators
        f(t)^{n1}, f(t)^{n2}, ..., f(t)^{nr},      n = n1 ≥ n2 ≥ ··· ≥ nr
Any other decomposition of V into T-cyclic subspaces has the same number of components and the same set of T-annihilators.

We emphasize that the above lemma (proved in Problem 10.31) does not say that the vectors vi or other T-cyclic subspaces Z(vi, T) are uniquely determined by T, but it does say that the set of T-annihilators is uniquely determined by T. Thus, T has a unique block diagonal matrix representation
        M = diag(C1, C2, ..., Cr)
where the Ci are companion matrices. In fact, the Ci are the companion matrices of the polynomials f(t)^{ni}.

Using the Primary Decomposition Theorem and Lemma 10.13, we obtain the following result.

THEOREM 10.14:  Let T: V → V be a linear operator with minimal polynomial
        m(t) = f1(t)^{m1} f2(t)^{m2} ··· fs(t)^{ms}
where the fi(t) are distinct monic irreducible polynomials. Then T has a unique block diagonal matrix representation
        M = diag(C11, C12, ..., C1r1, ..., Cs1, Cs2, ..., Csrs)
where the Cij are companion matrices. In particular, the Cij are the companion matrices of the polynomials fi(t)^{nij}, where
        m1 = n11 ≥ n12 ≥ ··· ≥ n1r1,   ...,   ms = ns1 ≥ ns2 ≥ ··· ≥ nsrs

The above matrix representation of T is called its rational canonical form. The polynomials fi(t)^{nij} are called the elementary divisors of T.
EXAMPLE 10.6  Let V be a vector space of dimension 8 over the rational field Q, and let T be a linear operator on V whose minimal polynomial is
        m(t) = f1(t) f2(t)^2 = (t^4 − 4t^3 + 6t^2 − 4t − 7)(t − 3)^2
Thus, because dim V = 8, the characteristic polynomial is Δ(t) = f1(t) f2(t)^4. Also, the rational canonical form M of T must have one block the companion matrix of f1(t) and one block the companion matrix of f2(t)^2. There are two possibilities:

    (a) diag[ C(t^4 − 4t^3 + 6t^2 − 4t − 7),  C((t − 3)^2),  C((t − 3)^2) ]
    (b) diag[ C(t^4 − 4t^3 + 6t^2 − 4t − 7),  C((t − 3)^2),  C(t − 3),  C(t − 3) ]

That is,

    (a) diag( [0,0,0,7; 1,0,0,4; 0,1,0,−6; 0,0,1,4],  [0,−9; 1,6],  [0,−9; 1,6] )
    (b) diag( [0,0,0,7; 1,0,0,4; 0,1,0,−6; 0,0,1,4],  [0,−9; 1,6],  [3],  [3] )
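As a check (my own sketch, not part of the text), candidate (b) can be assembled from companion blocks in SymPy and its characteristic polynomial compared with Δ(t) = f1(t) f2(t)^4:

```python
from sympy import Matrix, symbols, factor, diag

t = symbols('t')
C1 = Matrix([[0, 0, 0, 7],
             [1, 0, 0, 4],
             [0, 1, 0, -6],
             [0, 0, 1, 4]])          # C(t^4 - 4t^3 + 6t^2 - 4t - 7)
C2 = Matrix([[0, -9],
             [1, 6]])                # C((t - 3)^2)

M = diag(C1, C2, 3, 3)               # candidate (b) of Example 10.6
print(factor(M.charpoly(t).as_expr()))
# (t - 3)**4 * (t**4 - 4*t**3 + 6*t**2 - 4*t - 7), i.e. Delta(t) = f1(t) * f2(t)^4
```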
10.10 Quotient Spaces

Let V be a vector space over a field K and let W be a subspace of V. If v is any vector in V, we write v + W for the set of sums v + w with w ∈ W; that is,
        v + W = {v + w : w ∈ W}
These sets are called the cosets of W in V. We show (Problem 10.22) that these cosets partition V into mutually disjoint subsets.

EXAMPLE 10.7  Let W be the subspace of R^2 defined by
        W = {(a, b) : a = b};
that is, W is the line given by the equation x − y = 0. We can view v + W as a translation of the line obtained by adding the vector v to each point in W. As shown in Fig. 10-2, the coset v + W is also a line, and it is parallel to W. Thus, the cosets of W in R^2 are precisely all the lines parallel to W.

In the following theorem, we use the cosets of a subspace W of a vector space V to define a new vector space; it is called the quotient space of V by W and is denoted by V/W.

THEOREM 10.15:  Let W be a subspace of a vector space V over a field K. Then the cosets of W in V form a vector space over K with the following operations of addition and scalar multiplication:
        (i)  (u + W) + (v + W) = (u + v) + W,      (ii)  k(u + W) = ku + W,   where k ∈ K

We note that, in the proof of Theorem 10.15 (Problem 10.24), it is first necessary to show that the operations are well defined; that is, whenever u + W = u′ + W and v + W = v′ + W, then
        (i)  (u + v) + W = (u′ + v′) + W   and   (ii)  ku + W = ku′ + W   for any k ∈ K

In the case of an invariant subspace, we have the following useful result (proved in Problem 10.27).

THEOREM 10.16:  Suppose W is a subspace invariant under a linear operator T: V → V. Then T induces a linear operator T̄ on V/W defined by T̄(v + W) = T(v) + W. Moreover, if T is a zero of any polynomial, then so is T̄. Thus, the minimal polynomial of T̄ divides the minimal polynomial of T.
SOLVED PROBLEMS
Invariant Subspaces
10.1. Suppose T: V → V is linear. Show that each of the following is invariant under T:
      (a) {0},  (b) V,  (c) kernel of T,  (d) image of T.

      (a) We have T(0) = 0 ∈ {0}; hence, {0} is invariant under T.
      (b) For every v ∈ V, T(v) ∈ V; hence, V is invariant under T.
      (c) Let u ∈ Ker T. Then T(u) = 0 ∈ Ker T because the kernel of T is a subspace of V. Thus, Ker T is invariant under T.
      (d) Because T(v) ∈ Im T for every v ∈ V, it is certainly true when v ∈ Im T. Hence, the image of T is invariant under T.

10.2. Suppose {Wi} is a collection of T-invariant subspaces of a vector space V. Show that the intersection W = ∩i Wi is also T-invariant.

      Suppose v ∈ W; then v ∈ Wi for every i. Because Wi is T-invariant, T(v) ∈ Wi for every i. Thus, T(v) ∈ W and so W is T-invariant.
10.3. Prove Theorem 10.2: Let T: V → V be linear. For any polynomial f(t), the kernel of f(T) is invariant under T.

      Suppose v ∈ Ker f(T); that is, f(T)(v) = 0. We need to show that T(v) also belongs to the kernel of f(T); that is, f(T)(T(v)) = (f(T) ∘ T)(v) = 0. Because f(t)t = tf(t), we have f(T) ∘ T = T ∘ f(T). Thus, as required,
            (f(T) ∘ T)(v) = (T ∘ f(T))(v) = T(f(T)(v)) = T(0) = 0

10.4. Find all invariant subspaces of A = [2, −5; 1, −2] viewed as an operator on R^2.

      By Problem 10.1, R^2 and {0} are invariant under A. Now if A has any other invariant subspace, it must be one-dimensional. However, the characteristic polynomial of A is
            Δ(t) = t^2 − tr(A) t + |A| = t^2 + 1
      Hence, A has no eigenvalues (in R) and so A has no eigenvectors. But the one-dimensional invariant subspaces correspond to the eigenvectors; thus, R^2 and {0} are the only subspaces invariant under A.
10.5. Prove Theorem 10.3: Suppose W is T-invariant. Then T has a triangular block representation [A, B; 0, C], where A is the matrix representation of the restriction T̂ of T to W.

      We choose a basis {w1, ..., wr} of W and extend it to a basis {w1, ..., wr, v1, ..., vs} of V. We have
            T̂(w1) = T(w1) = a11 w1 + ··· + a1r wr
            T̂(w2) = T(w2) = a21 w1 + ··· + a2r wr
            ..................................................
            T̂(wr) = T(wr) = ar1 w1 + ··· + arr wr
            T(v1) = b11 w1 + ··· + b1r wr + c11 v1 + ··· + c1s vs
            T(v2) = b21 w1 + ··· + b2r wr + c21 v1 + ··· + c2s vs
            ..................................................................
            T(vs) = bs1 w1 + ··· + bsr wr + cs1 v1 + ··· + css vs
      But the matrix of T in this basis is the transpose of the matrix of coefficients in the above system of equations (Section 6.2). Therefore, it has the form [A, B; 0, C], where A is the transpose of the matrix of coefficients for the obvious subsystem. By the same argument, A is the matrix of T̂ relative to the basis {wi} of W.
10.6. Let T̂ denote the restriction of an operator T to an invariant subspace W. Prove
      (a) For any polynomial f(t), f(T̂)(w) = f(T)(w).
      (b) The minimal polynomial of T̂ divides the minimal polynomial of T.

      (a) If f(t) = 0 or if f(t) is a constant (i.e., of degree 0), then the result clearly holds.
          Assume deg f = n ≥ 1 and that the result holds for polynomials of degree less than n. Suppose that
               f(t) = an t^n + a_{n−1} t^{n−1} + ··· + a1 t + a0
          Then
               f(T̂)(w) = (an T̂^n + a_{n−1} T̂^{n−1} + ··· + a0 I)(w)
                        = (an T̂^{n−1})(T̂(w)) + (a_{n−1} T̂^{n−1} + ··· + a0 I)(w)
                        = (an T^{n−1})(T(w)) + (a_{n−1} T^{n−1} + ··· + a0 I)(w) = f(T)(w)
      (b) Let m(t) denote the minimal polynomial of T. Then by (a), m(T̂)(w) = m(T)(w) = 0(w) = 0 for every w ∈ W; that is, T̂ is a zero of the polynomial m(t). Hence, the minimal polynomial of T̂ divides m(t).
Invariant Direct-Sum Decompositions
10.7. Prove Theorem 10.4: Suppose W1, W2, ..., Wr are subspaces of V with respective bases
            B1 = {w11, w12, ..., w_{1n1}}, ..., Br = {wr1, wr2, ..., w_{rnr}}
      Then V is the direct sum of the Wi if and only if the union B = ∪i Bi is a basis of V.

      Suppose B is a basis of V. Then, for any v ∈ V,
            v = a11 w11 + ··· + a_{1n1} w_{1n1} + ··· + ar1 wr1 + ··· + a_{rnr} w_{rnr} = w1 + w2 + ··· + wr
      where wi = ai1 wi1 + ··· + a_{ini} w_{ini} ∈ Wi. We next show that such a sum is unique. Suppose
            v = w1′ + w2′ + ··· + wr′,   where wi′ ∈ Wi
      Because {wi1, ..., w_{ini}} is a basis of Wi, wi′ = bi1 wi1 + ··· + b_{ini} w_{ini}, and so
            v = b11 w11 + ··· + b_{1n1} w_{1n1} + ··· + br1 wr1 + ··· + b_{rnr} w_{rnr}
      Because B is a basis of V, aij = bij for each i and each j. Hence, wi = wi′, and so the sum for v is unique. Accordingly, V is the direct sum of the Wi.

      Conversely, suppose V is the direct sum of the Wi. Then for any v ∈ V, v = w1 + ··· + wr, where wi ∈ Wi. Because {w_{iji}} is a basis of Wi, each wi is a linear combination of the w_{iji}, and so v is a linear combination of the elements of B. Thus, B spans V. We now show that B is linearly independent. Suppose
            a11 w11 + ··· + a_{1n1} w_{1n1} + ··· + ar1 wr1 + ··· + a_{rnr} w_{rnr} = 0
      Note that ai1 wi1 + ··· + a_{ini} w_{ini} ∈ Wi. We also have that 0 = 0 + 0 + ··· + 0, where 0 ∈ Wi. Because such a sum for 0 is unique,
            ai1 wi1 + ··· + a_{ini} w_{ini} = 0   for i = 1, ..., r
      The independence of the bases {w_{iji}} implies that all the a's are 0. Thus, B is linearly independent and is a basis of V.
10.8. Suppose T: V → V is linear and suppose T = T1 ⊕ T2 with respect to a T-invariant direct-sum decomposition V = U ⊕ W. Show that
      (a) m(t) is the least common multiple of m1(t) and m2(t), where m(t), m1(t), m2(t) are the minimal polynomials of T, T1, T2, respectively.
      (b) Δ(t) = Δ1(t) Δ2(t), where Δ(t), Δ1(t), Δ2(t) are the characteristic polynomials of T, T1, T2, respectively.

      (a) By Problem 10.6, each of m1(t) and m2(t) divides m(t). Now suppose f(t) is a multiple of both m1(t) and m2(t); then f(T1)(U) = 0 and f(T2)(W) = 0. Let v ∈ V; then v = u + w with u ∈ U and w ∈ W. Now
               f(T)v = f(T)u + f(T)w = f(T1)u + f(T2)w = 0 + 0 = 0
          That is, T is a zero of f(t). Hence, m(t) divides f(t), and so m(t) is the least common multiple of m1(t) and m2(t).
      (b) By Theorem 10.5, T has a matrix representation M = [A, 0; 0, B], where A and B are matrix representations of T1 and T2, respectively. Then, as required,
               Δ(t) = |tI − M| = det[tI − A, 0; 0, tI − B] = |tI − A| |tI − B| = Δ1(t) Δ2(t)

10.9. Prove Theorem 10.7: Suppose T: V → V is linear, and suppose f(t) = g(t)h(t) are polynomials such that f(T) = 0 and g(t) and h(t) are relatively prime. Then V is the direct sum of the T-invariant subspaces U and W, where U = Ker g(T) and W = Ker h(T).
      Note first that U and W are T-invariant by Theorem 10.2. Now, because g(t) and h(t) are relatively prime, there exist polynomials r(t) and s(t) such that
            r(t)g(t) + s(t)h(t) = 1
      Hence, for the operator T,
            r(T)g(T) + s(T)h(T) = I                                   (*)
      Let v ∈ V; then, by (*),
            v = r(T)g(T)v + s(T)h(T)v
      But the first term in this sum belongs to W = Ker h(T), because
            h(T)r(T)g(T)v = r(T)g(T)h(T)v = r(T)f(T)v = r(T)0v = 0
      Similarly, the second term belongs to U. Hence, V is the sum of U and W.
      To prove that V = U ⊕ W, we must show that a sum v = u + w with u ∈ U, w ∈ W, is uniquely determined by v. Applying the operator r(T)g(T) to v = u + w and using g(T)u = 0, we obtain
            r(T)g(T)v = r(T)g(T)u + r(T)g(T)w = r(T)g(T)w
      Also, applying (*) to w alone and using h(T)w = 0, we obtain
            w = r(T)g(T)w + s(T)h(T)w = r(T)g(T)w
      Both of the above formulas give us w = r(T)g(T)v, and so w is uniquely determined by v. Similarly, u is uniquely determined by v. Hence, V = U ⊕ W, as required.

10.10. Prove Theorem 10.8: In Theorem 10.7 (Problem 10.9), if f(t) is the minimal polynomial of T (and g(t) and h(t) are monic), then g(t) is the minimal polynomial of the restriction T1 of T to U and h(t) is the minimal polynomial of the restriction T2 of T to W.

       Let m1(t) and m2(t) be the minimal polynomials of T1 and T2, respectively. Note that g(T1) = 0 and h(T2) = 0 because U = Ker g(T) and W = Ker h(T). Thus,
            m1(t) divides g(t)   and   m2(t) divides h(t)                  (1)
       By Problem 10.9, f(t) is the least common multiple of m1(t) and m2(t). But m1(t) and m2(t) are relatively prime because g(t) and h(t) are relatively prime. Accordingly, f(t) = m1(t)m2(t). We also have that f(t) = g(t)h(t). These two equations together with (1) and the fact that all the polynomials are monic imply that g(t) = m1(t) and h(t) = m2(t), as required.
10.11. Prove the Primary Decomposition Theorem 10.6: Let T: V → V be a linear operator with minimal polynomial
            m(t) = f1(t)^{n1} f2(t)^{n2} ··· fr(t)^{nr}
       where the fi(t) are distinct monic irreducible polynomials. Then V is the direct sum of T-invariant subspaces W1, ..., Wr, where Wi is the kernel of fi(T)^{ni}. Moreover, fi(t)^{ni} is the minimal polynomial of the restriction of T to Wi.

       The proof is by induction on r. The case r = 1 is trivial. Suppose that the theorem has been proved for r − 1. By Theorem 10.7, we can write V as the direct sum of T-invariant subspaces W1 and V1, where W1 is the kernel of f1(T)^{n1} and where V1 is the kernel of f2(T)^{n2} ··· fr(T)^{nr}. By Theorem 10.8, the minimal polynomials of the restrictions of T to W1 and V1 are f1(t)^{n1} and f2(t)^{n2} ··· fr(t)^{nr}, respectively.
       Denote the restriction of T to V1 by T̂1. By the inductive hypothesis, V1 is the direct sum of subspaces W2, ..., Wr such that Wi is the kernel of fi(T̂1)^{ni} and such that fi(t)^{ni} is the minimal polynomial for the restriction of T̂1 to Wi. But the kernel of fi(T)^{ni}, for i = 2, ..., r, is necessarily contained in V1, because fi(t)^{ni} divides f2(t)^{n2} ··· fr(t)^{nr}. Thus, the kernel of fi(T)^{ni} is the same as the kernel of fi(T̂1)^{ni}, which is Wi. Also, the restriction of T to Wi is the same as the restriction of T̂1 to Wi (for i = 2, ..., r); hence, fi(t)^{ni} is also the minimal polynomial for the restriction of T to Wi. Thus, V = W1 ⊕ W2 ⊕ ··· ⊕ Wr is the desired decomposition of T.

10.12. Prove Theorem 10.9: A linear operator T: V → V has a diagonal matrix representation if and only if its minimal polynomial m(t) is a product of distinct linear polynomials.
       Suppose m(t) is a product of distinct linear polynomials, say,
            m(t) = (t − λ1)(t − λ2) ··· (t − λr)
       where the λi are distinct scalars. By the Primary Decomposition Theorem, V is the direct sum of subspaces W1, ..., Wr, where Wi = Ker(T − λi I). Thus, if v ∈ Wi, then (T − λi I)(v) = 0 or T(v) = λi v. In other words, every vector in Wi is an eigenvector belonging to the eigenvalue λi. By Theorem 10.4, the union of bases for W1, ..., Wr is a basis of V. This basis consists of eigenvectors, and so T is diagonalizable.
       Conversely, suppose T is diagonalizable (i.e., V has a basis consisting of eigenvectors of T). Let λ1, ..., λs be the distinct eigenvalues of T. Then the operator
            f(T) = (T − λ1 I)(T − λ2 I) ··· (T − λs I)
       maps each basis vector into 0. Thus, f(T) = 0, and hence the minimal polynomial m(t) of T divides the polynomial
            f(t) = (t − λ1)(t − λ2) ··· (t − λs)
       Accordingly, m(t) is a product of distinct linear polynomials.
Nilpotent Operators, Jordan Canonical Form
10.13. Let T: V → V be linear. Suppose, for v ∈ V, T^k(v) = 0 but T^{k−1}(v) ≠ 0. Prove
       (a) The set S = {v, T(v), ..., T^{k−1}(v)} is linearly independent.
       (b) The subspace W generated by S is T-invariant.
       (c) The restriction T̂ of T to W is nilpotent of index k.
       (d) Relative to the basis {T^{k−1}(v), ..., T(v), v} of W, the matrix of T̂ is the k-square Jordan nilpotent block Nk of index k (see Example 10.4).

       (a) Suppose
               a v + a1 T(v) + a2 T^2(v) + ··· + a_{k−1} T^{k−1}(v) = 0            (*)
           Applying T^{k−1} to (*) and using T^k(v) = 0, we obtain a T^{k−1}(v) = 0; because T^{k−1}(v) ≠ 0, a = 0. Now applying T^{k−2} to (*) and using T^k(v) = 0 and a = 0, we find a1 T^{k−1}(v) = 0; hence, a1 = 0. Next applying T^{k−3} to (*) and using T^k(v) = 0 and a = a1 = 0, we obtain a2 T^{k−1}(v) = 0; hence, a2 = 0. Continuing this process, we find that all the a's are 0; hence, S is independent.
       (b) Let u ∈ W. Then
               u = b v + b1 T(v) + b2 T^2(v) + ··· + b_{k−1} T^{k−1}(v)
           Using T^k(v) = 0, we have
               T(u) = b T(v) + b1 T^2(v) + ··· + b_{k−2} T^{k−1}(v) ∈ W
           Thus, W is T-invariant.
       (c) By hypothesis, T^k(v) = 0. Hence, for i = 0, ..., k − 1,
               T̂^k(T^i(v)) = T^{k+i}(v) = 0
           That is, applying T̂^k to each generator of W, we obtain 0; hence, T̂^k = 0 and so T̂ is nilpotent of index at most k. On the other hand, T̂^{k−1}(v) = T^{k−1}(v) ≠ 0; hence, T̂ is nilpotent of index exactly k.
       (d) For the basis {T^{k−1}(v), T^{k−2}(v), ..., T(v), v} of W,
               T̂(T^{k−1}(v)) = T^k(v) = 0
               T̂(T^{k−2}(v)) = T^{k−1}(v)
               T̂(T^{k−3}(v)) = T^{k−2}(v)
               .................................
               T̂(T(v)) = T^2(v)
               T̂(v) = T(v)
           Hence, as required, the matrix of T̂ in this basis is the k-square Jordan nilpotent block Nk.
10.14. Let T: V → V be linear. Let U = Ker T^i and W = Ker T^{i+1}. Show that (a) U ⊆ W, (b) T(W) ⊆ U.

       (a) Suppose u ∈ U = Ker T^i. Then T^i(u) = 0 and so T^{i+1}(u) = T(T^i(u)) = T(0) = 0. Thus, u ∈ Ker T^{i+1} = W. But this is true for every u ∈ U; hence, U ⊆ W.
       (b) Similarly, if w ∈ W = Ker T^{i+1}, then T^{i+1}(w) = 0. Thus, T^i(T(w)) = T^{i+1}(w) = 0, and so T(w) ∈ Ker T^i = U. Hence, T(W) ⊆ U.

10.15. Let T: V → V be linear. Let X = Ker T^{i−2}, Y = Ker T^{i−1}, Z = Ker T^i. Therefore (Problem 10.14), X ⊆ Y ⊆ Z. Suppose
            {u1, ..., ur},   {u1, ..., ur, v1, ..., vs},   {u1, ..., ur, v1, ..., vs, w1, ..., wt}
       are bases of X, Y, Z, respectively. Show that
            S = {u1, ..., ur, T(w1), ..., T(wt)}
       is contained in Y and is linearly independent.

       By Problem 10.14, T(Z) ⊆ Y, and hence S ⊆ Y. Now suppose S is linearly dependent. Then there exists a relation
            a1 u1 + ··· + ar ur + b1 T(w1) + ··· + bt T(wt) = 0
       where at least one coefficient is not zero. Furthermore, because {ui} is independent, at least one of the bk must be nonzero. Transposing, we find
            b1 T(w1) + ··· + bt T(wt) = −a1 u1 − ··· − ar ur ∈ X = Ker T^{i−2}
       Hence,
            T^{i−2}(b1 T(w1) + ··· + bt T(wt)) = 0
       Thus,
            T^{i−1}(b1 w1 + ··· + bt wt) = 0,   and so   b1 w1 + ··· + bt wt ∈ Y = Ker T^{i−1}
       Because {ui, vj} generates Y, we obtain a relation among the ui, vj, wk where one of the coefficients (i.e., one of the bk) is not zero. This contradicts the fact that {ui, vj, wk} is independent. Hence, S must also be independent.
10.16. Prove Theorem 10.10: Let T: V → V be a nilpotent operator of index k. Then T has a unique block diagonal matrix representation consisting of Jordan nilpotent blocks N. There is at least one N of order k, and all other N are of orders ≤ k. The total number of N of all orders is equal to the nullity of T.

       Suppose dim V = n. Let W1 = Ker T, W2 = Ker T^2, ..., Wk = Ker T^k. Let us set mi = dim Wi, for i = 1, ..., k. Because T is of index k, Wk = V and W_{k−1} ≠ V, and so m_{k−1} < m_k = n. By Problem 10.14,
            W1 ⊆ W2 ⊆ ··· ⊆ Wk = V
       Thus, by induction, we can choose a basis {u1, ..., un} of V such that {u1, ..., u_{mi}} is a basis of Wi.
       We now choose a new basis for V with respect to which T has the desired form. It will be convenient to label the members of this new basis by pairs of indices. We begin by setting
            v(1, k) = u_{m_{k−1}+1},  v(2, k) = u_{m_{k−1}+2},  ...,  v(m_k − m_{k−1}, k) = u_{m_k}
       and setting
            v(1, k−1) = Tv(1, k),  v(2, k−1) = Tv(2, k),  ...,  v(m_k − m_{k−1}, k−1) = Tv(m_k − m_{k−1}, k)
       By the preceding problem,
            S1 = {u1, ..., u_{m_{k−2}}, v(1, k−1), ..., v(m_k − m_{k−1}, k−1)}
       is a linearly independent subset of W_{k−1}. We extend S1 to a basis of W_{k−1} by adjoining new elements (if necessary), which we denote by
            v(m_k − m_{k−1} + 1, k−1),  v(m_k − m_{k−1} + 2, k−1),  ...,  v(m_{k−1} − m_{k−2}, k−1)
       Next we set
            v(1, k−2) = Tv(1, k−1),  v(2, k−2) = Tv(2, k−1),  ...,  v(m_{k−1} − m_{k−2}, k−2) = Tv(m_{k−1} − m_{k−2}, k−1)
       Again by the preceding problem,
            S2 = {u1, ..., u_{m_{k−3}}, v(1, k−2), ..., v(m_{k−1} − m_{k−2}, k−2)}
       is a linearly independent subset of W_{k−2}, which we can extend to a basis of W_{k−2} by adjoining elements
            v(m_{k−1} − m_{k−2} + 1, k−2),  v(m_{k−1} − m_{k−2} + 2, k−2),  ...,  v(m_{k−2} − m_{k−3}, k−2)
       Continuing in this manner, we get a new basis for V, which for convenient reference we arrange as follows:
            v(1, k),   ...,  v(m_k − m_{k−1}, k)
            v(1, k−1), ...,  v(m_k − m_{k−1}, k−1),  ...,  v(m_{k−1} − m_{k−2}, k−1)
            ......................................................................
            v(1, 2),   ...,  v(m_k − m_{k−1}, 2),  ...,  v(m_{k−1} − m_{k−2}, 2),  ...,  v(m_2 − m_1, 2)
            v(1, 1),   ...,  v(m_k − m_{k−1}, 1),  ...,  v(m_{k−1} − m_{k−2}, 1),  ...,  v(m_2 − m_1, 1),  ...,  v(m_1, 1)
       The bottom row forms a basis of W1, the bottom two rows form a basis of W2, and so forth. But what is important for us is that T maps each vector into the vector immediately below it in the table or into 0 if the vector is in the bottom row. That is,
            Tv(i, j) = v(i, j−1) for j > 1   and   Tv(i, j) = 0 for j = 1
       Now it is clear [see Problem 10.13(d)] that T will have the desired form if the v(i, j) are ordered lexicographically: beginning with v(1, 1) and moving up the first column to v(1, k), then jumping to v(2, 1) and moving up the second column as far as possible.
       Moreover, there will be exactly
            m_k − m_{k−1}                                                    diagonal entries of order k
            (m_{k−1} − m_{k−2}) − (m_k − m_{k−1}) = 2m_{k−1} − m_k − m_{k−2}  diagonal entries of order k − 1
            .........................................................................................
            2m_2 − m_1 − m_3                                                 diagonal entries of order 2
            2m_1 − m_2                                                       diagonal entries of order 1
       as can be read off directly from the table. In particular, because the numbers m_1, ..., m_k are uniquely determined by T, the number of diagonal entries of each order is uniquely determined by T. Finally, the identity
            m_1 = (m_k − m_{k−1}) + (2m_{k−1} − m_k − m_{k−2}) + ··· + (2m_2 − m_1 − m_3) + (2m_1 − m_2)
       shows that the nullity m_1 of T is the total number of diagonal entries of T.
10.17. Let
            A = [ 0 1 1 0 1 ]          B = [ 0 1 1 0 0 ]
                [ 0 0 1 1 1 ]              [ 0 0 1 1 1 ]
                [ 0 0 0 0 0 ]    and       [ 0 0 0 1 1 ]
                [ 0 0 0 0 0 ]              [ 0 0 0 0 0 ]
                [ 0 0 0 0 0 ]              [ 0 0 0 0 0 ]
       The reader can verify that A and B are both nilpotent of index 3; that is, A^3 = 0 but A^2 ≠ 0, and B^3 = 0 but B^2 ≠ 0. Find the nilpotent matrices MA and MB in canonical form that are similar to A and B, respectively.

       Because A and B are nilpotent of index 3, MA and MB must each contain a Jordan nilpotent block of order 3, and none greater than 3. Note that rank(A) = 2 and rank(B) = 3, so nullity(A) = 5 − 2 = 3 and nullity(B) = 5 − 3 = 2. Thus, MA must contain three diagonal blocks, which must be one of order 3 and two of order 1; and MB must contain two diagonal blocks, which must be one of order 3 and one of order 2. Namely,
            MA = [ 0 1 0 0 0 ]         MB = [ 0 1 0 0 0 ]
                 [ 0 0 1 0 0 ]              [ 0 0 1 0 0 ]
                 [ 0 0 0 0 0 ]    and       [ 0 0 0 0 0 ]
                 [ 0 0 0 0 0 ]              [ 0 0 0 0 1 ]
                 [ 0 0 0 0 0 ]              [ 0 0 0 0 0 ]
10.18. Prove Theorem 10.11 on the Jordan canonical form for an operator T.

       By the primary decomposition theorem, T is decomposable into operators T1, ..., Tr; that is, T = T1 ⊕ ··· ⊕ Tr, where (t − λi)^{mi} is the minimal polynomial of Ti. Thus, in particular,
            (T1 − λ1 I)^{m1} = 0,  ...,  (Tr − λr I)^{mr} = 0
       Set Ni = Ti − λi I. Then, for i = 1, ..., r,
            Ti = Ni + λi I,   where Ni^{mi} = 0
       That is, Ti is the sum of the scalar operator λi I and a nilpotent operator Ni, which is of index mi because (t − λi)^{mi} is the minimal polynomial of Ti.
       Now, by Theorem 10.10 on nilpotent operators, we can choose a basis so that Ni is in canonical form. In this basis, Ti = Ni + λi I is represented by a block diagonal matrix Mi whose diagonal entries are the matrices Jij. The direct sum J of the matrices Mi is in Jordan canonical form and, by Theorem 10.5, is a matrix representation of T.
       Last, we must show that the blocks Jij satisfy the required properties. Property (i) follows from the fact that Ni is of index mi. Property (ii) is true because T and J have the same characteristic polynomial. Property (iii) is true because the nullity of Ni = Ti − λi I is equal to the geometric multiplicity of the eigenvalue λi. Property (iv) follows from the fact that the Ti, and hence the Ni, are uniquely determined by T.
10.19. Determine all possible Jordan canonical forms J for a linear operator T: V → V whose characteristic polynomial is Δ(t) = (t − 2)^5 and whose minimal polynomial is m(t) = (t − 2)^2.

       J must be a 5 × 5 matrix, because Δ(t) has degree 5, and all diagonal elements must be 2, because 2 is the only eigenvalue. Moreover, because the exponent of t − 2 in m(t) is 2, J must have one Jordan block of order 2, and the others must be of order 2 or 1. Thus, there are only two possibilities:
            J = diag( [2, 1; 0, 2],  [2, 1; 0, 2],  [2] )   or   J = diag( [2, 1; 0, 2],  [2],  [2],  [2] )
10.20. Determine all possible Jordan canonical forms for a linear operator T: V → V whose characteristic polynomial is Δ(t) = (t − 2)^3 (t − 5)^2. In each case, find the minimal polynomial m(t).

       Because t − 2 has exponent 3 in Δ(t), 2 must appear three times on the diagonal. Similarly, 5 must appear twice. Thus, there are six possibilities:
            (a) diag( [2,1,0; 0,2,1; 0,0,2],  [5,1; 0,5] ),      (b) diag( [2,1,0; 0,2,1; 0,0,2],  [5],  [5] ),
            (c) diag( [2,1; 0,2],  [2],  [5,1; 0,5] ),           (d) diag( [2,1; 0,2],  [2],  [5],  [5] ),
            (e) diag( [2],  [2],  [2],  [5,1; 0,5] ),            (f) diag( [2],  [2],  [2],  [5],  [5] )
       The exponent in the minimal polynomial m(t) is equal to the size of the largest block. Thus,
            (a) m(t) = (t − 2)^3 (t − 5)^2,   (b) m(t) = (t − 2)^3 (t − 5),   (c) m(t) = (t − 2)^2 (t − 5)^2,
            (d) m(t) = (t − 2)^2 (t − 5),     (e) m(t) = (t − 2)(t − 5)^2,    (f) m(t) = (t − 2)(t − 5)
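The possibilities in Problems 10.19 and 10.20 can also be enumerated by listing the partitions of each algebraic multiplicity into Jordan-block sizes; the largest part for each eigenvalue is then the exponent of that factor in m(t). A small sketch (mine, not the book's):

```python
from itertools import product
from sympy.utilities.iterables import partitions

def parts(n):
    """All partitions of n, each returned as a sorted list of block sizes."""
    return [sorted((size for size, mult in p.items() for _ in range(mult)), reverse=True)
            for p in (q.copy() for q in partitions(n))]

# Problem 10.20: eigenvalue 2 has algebraic multiplicity 3, eigenvalue 5 has multiplicity 2.
for p2, p5 in product(parts(3), parts(2)):
    print("blocks for 2:", p2, " blocks for 5:", p5)   # six possibilities in all
```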
Quotient Space and Triangular Form
10.21. Let W be a subspace of a vector space V. Show that the following are equivalent:
            (i) u ∈ v + W,   (ii) u − v ∈ W,   (iii) v ∈ u + W.

       Suppose u ∈ v + W. Then there exists w0 ∈ W such that u = v + w0. Hence, u − v = w0 ∈ W. Conversely, suppose u − v ∈ W. Then u − v = w0 where w0 ∈ W. Hence, u = v + w0 ∈ v + W. Thus, (i) and (ii) are equivalent.
       We also have u − v ∈ W iff −(u − v) = v − u ∈ W iff v ∈ u + W. Thus, (ii) and (iii) are also equivalent.
10.22. Prove the following: The cosets of W in V partition V into mutually disjoint sets. That is,
       (a) Any two cosets u + W and v + W are either identical or disjoint.
       (b) Each v ∈ V belongs to a coset; in fact, v ∈ v + W.
       Furthermore, u + W = v + W if and only if u − v ∈ W, and so (v + w) + W = v + W for any w ∈ W.

       Let v ∈ V. Because 0 ∈ W, we have v = v + 0 ∈ v + W, which proves (b).
       Now suppose the cosets u + W and v + W are not disjoint; say, the vector x belongs to both u + W and v + W. Then u − x ∈ W and x − v ∈ W. The proof of (a) is complete if we show that u + W = v + W. Let u + w0 be any element in the coset u + W. Because u − x, x − v, w0 belong to W,
            (u + w0) − v = (u − x) + (x − v) + w0 ∈ W
       Thus, u + w0 ∈ v + W, and hence the coset u + W is contained in the coset v + W. Similarly, v + W is contained in u + W, and so u + W = v + W.
       The last statement follows from the fact that u + W = v + W if and only if u ∈ v + W, and, by Problem 10.21, this is equivalent to u − v ∈ W.

10.23. Let W be the solution space of the homogeneous equation 2x + 3y + 4z = 0. Describe the cosets of W in R^3.

       W is a plane through the origin O = (0, 0, 0), and the cosets of W are the planes parallel to W. Equivalently, the cosets of W are the solution sets of the family of equations
            2x + 3y + 4z = k,   k ∈ R
       In fact, the coset v + W, where v = (a, b, c), is the solution set of the linear equation
            2x + 3y + 4z = 2a + 3b + 4c   or   2(x − a) + 3(y − b) + 4(z − c) = 0
10.24. Suppose W is a subspace of a vector space V. Show that the operations in Theorem 10.15 are well defined; namely, show that if u + W = u′ + W and v + W = v′ + W, then
            (a) (u + v) + W = (u′ + v′) + W   and   (b) ku + W = ku′ + W for any k ∈ K

       (a) Because u + W = u′ + W and v + W = v′ + W, both u − u′ and v − v′ belong to W. But then (u + v) − (u′ + v′) = (u − u′) + (v − v′) ∈ W. Hence, (u + v) + W = (u′ + v′) + W.
       (b) Also, because u − u′ ∈ W implies k(u − u′) ∈ W, then ku − ku′ = k(u − u′) ∈ W; accordingly, ku + W = ku′ + W.

10.25. Let V be a vector space and W a subspace of V. Show that the natural map η: V → V/W, defined by η(v) = v + W, is linear.

       For any u, v ∈ V and any k ∈ K, we have
            η(u + v) = u + v + W = u + W + v + W = η(u) + η(v)   and   η(kv) = kv + W = k(v + W) = kη(v)
       Accordingly, η is linear.
10.26. Let W be a subspace of a vector space V. Suppose {w1, ..., wr} is a basis of W and the set of cosets {v̄1, ..., v̄s}, where v̄j = vj + W, is a basis of the quotient space. Show that the set of vectors B = {v1, ..., vs, w1, ..., wr} is a basis of V. Thus, dim V = dim W + dim(V/W).

       Suppose u ∈ V. Because {v̄j} is a basis of V/W,
            ū = u + W = a1 v̄1 + a2 v̄2 + ··· + as v̄s
       Hence, u = a1 v1 + ··· + as vs + w, where w ∈ W. Since {wi} is a basis of W,
            u = a1 v1 + ··· + as vs + b1 w1 + ··· + br wr
       Accordingly, B spans V.
       We now show that B is linearly independent. Suppose
            c1 v1 + ··· + cs vs + d1 w1 + ··· + dr wr = 0                    (1)
       Then
            c1 v̄1 + ··· + cs v̄s = 0̄ = W
       Because {v̄j} is independent, the c's are all 0. Substituting into (1), we find d1 w1 + ··· + dr wr = 0. Because {wi} is independent, the d's are all 0. Thus, B is linearly independent and therefore a basis of V.
10.27. Prove Theorem 10.16: Suppose W is a subspace invariant under a linear operator T: V → V. Then T induces a linear operator T̄ on V/W defined by T̄(v + W) = T(v) + W. Moreover, if T is a zero of any polynomial, then so is T̄. Thus, the minimal polynomial of T̄ divides the minimal polynomial of T.

       We first show that T̄ is well defined; that is, if u + W = v + W, then T̄(u + W) = T̄(v + W). If u + W = v + W, then u − v ∈ W, and, as W is T-invariant, T(u − v) = T(u) − T(v) ∈ W. Accordingly,
            T̄(u + W) = T(u) + W = T(v) + W = T̄(v + W)
       as required.
       We next show that T̄ is linear. We have
            T̄((u + W) + (v + W)) = T̄(u + v + W) = T(u + v) + W = T(u) + T(v) + W = T(u) + W + T(v) + W = T̄(u + W) + T̄(v + W)
       Furthermore,
            T̄(k(u + W)) = T̄(ku + W) = T(ku) + W = kT(u) + W = k(T(u) + W) = kT̄(u + W)
       Thus, T̄ is linear.
       Now, for any coset u + W in V/W, the operator induced by T^2 satisfies
            T^2(u) + W = T(T(u)) + W = T̄(T(u) + W) = T̄(T̄(u + W)) = T̄^2(u + W)
       Hence, the operator induced by T^2 is T̄^2. Similarly, the operator induced by T^n is T̄^n for any n. Thus, for any polynomial
            f(t) = an t^n + ··· + a0 = Σ ai t^i,
            f(T)(u) + W = Σ ai T^i(u) + W = Σ ai (T^i(u) + W) = Σ ai T̄^i(u + W) = (Σ ai T̄^i)(u + W) = f(T̄)(u + W)
       and so the operator induced by f(T) is f(T̄). Accordingly, if T is a root of f(t), then f(T) = 0 induces the zero operator on V/W; that is, f(T̄) = 0̄ = W, so T̄ is also a root of f(t). The theorem is proved.
10.28. Prove Theorem 10.1: Let T: V → V be a linear operator whose characteristic polynomial factors into linear polynomials. Then V has a basis in which T is represented by a triangular matrix.

       The proof is by induction on the dimension of V. If dim V = 1, then every matrix representation of T is a 1 × 1 matrix, which is triangular.
       Now suppose dim V = n > 1 and that the theorem holds for spaces of dimension less than n. Because the characteristic polynomial of T factors into linear polynomials, T has at least one eigenvalue and so at least one nonzero eigenvector v, say T(v) = a11 v. Let W be the one-dimensional subspace spanned by v. Set V̄ = V/W. Then (Problem 10.26) dim V̄ = dim V − dim W = n − 1. Note also that W is invariant under T. By Theorem 10.16, T induces a linear operator T̄ on V̄ whose minimal polynomial divides the minimal polynomial of T. Because the characteristic polynomial of T is a product of linear polynomials, so is its minimal polynomial, and hence, so are the minimal and characteristic polynomials of T̄. Thus, V̄ and T̄ satisfy the hypothesis of the theorem. Hence, by induction, there exists a basis {v̄2, ..., v̄n} of V̄ such that
            T̄(v̄2) = a22 v̄2
            T̄(v̄3) = a32 v̄2 + a33 v̄3
            ......................................
            T̄(v̄n) = an2 v̄2 + an3 v̄3 + ··· + ann v̄n
       Now let v2, ..., vn be elements of V that belong to the cosets v̄2, ..., v̄n, respectively. Then {v, v2, ..., vn} is a basis of V (Problem 10.26). Because T̄(v̄2) = a22 v̄2, we have
            T̄(v̄2) − a22 v̄2 = 0̄,   and so   T(v2) − a22 v2 ∈ W
       But W is spanned by v; hence, T(v2) − a22 v2 is a multiple of v, say,
            T(v2) − a22 v2 = a21 v,   and so   T(v2) = a21 v + a22 v2
       Similarly, for i = 3, ..., n,
            T(vi) − ai2 v2 − ai3 v3 − ··· − aii vi ∈ W,   and so   T(vi) = ai1 v + ai2 v2 + ··· + aii vi
       Thus,
            T(v)  = a11 v
            T(v2) = a21 v + a22 v2
            ......................................
            T(vn) = an1 v + an2 v2 + ··· + ann vn
       and hence the matrix of T in this basis is triangular.
Cyclic Subspaces, Rational Canonical Form
10.29. Prove Theorem 10.12: Let Z(v, T) be a T-cyclic subspace, T_v the restriction of T to Z(v, T), and m_v(t) = t^k + a_{k−1}t^{k−1} + ... + a_0 the T-annihilator of v. Then,
(i) The set {v, T(v), ..., T^{k−1}(v)} is a basis of Z(v, T); hence, dim Z(v, T) = k.
(ii) The minimal polynomial of T_v is m_v(t).
(iii) The matrix of T_v in the above basis is the companion matrix C = C(m_v) of m_v(t) [which has 1's below the diagonal, the negatives of the coefficients a_0, a_1, ..., a_{k−1} of m_v(t) in the last column, and 0's elsewhere].
(i) By definition of m_v(t), T^k(v) is the first vector in the sequence v, T(v), T^2(v), ... that is a linear combination of those vectors that precede it in the sequence; hence, the set B = {v, T(v), ..., T^{k−1}(v)} is linearly independent. We now only have to show that Z(v, T) = L(B), the linear span of B. By the above, T^k(v) ∈ L(B). We prove by induction that T^n(v) ∈ L(B) for every n. Suppose n > k and T^{n−1}(v) ∈ L(B)—that is, T^{n−1}(v) is a linear combination of v, ..., T^{k−1}(v). Then T^n(v) = T(T^{n−1}(v)) is a linear combination of T(v), ..., T^k(v). But T^k(v) ∈ L(B); hence, T^n(v) ∈ L(B) for every n. Consequently, f(T)(v) ∈ L(B) for any polynomial f(t). Thus, Z(v, T) = L(B), and so B is a basis, as claimed.
(ii) Suppose m(t) = t^s + b_{s−1}t^{s−1} + ... + b_0 is the minimal polynomial of T_v. Then, because v ∈ Z(v, T),
0 = m(T_v)(v) = m(T)(v) = T^s(v) + b_{s−1}T^{s−1}(v) + ... + b_0 v
Thus, T^s(v) is a linear combination of v, T(v), ..., T^{s−1}(v), and therefore k ≤ s. However, m_v(T) = 0 and so m_v(T_v) = 0. Then m(t) divides m_v(t), and so s ≤ k. Accordingly, k = s and hence m_v(t) = m(t).
(iii)
T_v(v) = T(v)
T_v(T(v)) = T^2(v)
........................................................................
T_v(T^{k−2}(v)) = T^{k−1}(v)
T_v(T^{k−1}(v)) = T^k(v) = −a_0 v − a_1 T(v) − a_2 T^2(v) − ... − a_{k−1} T^{k−1}(v)
By definition, the matrix of T_v in this basis is the transpose of the matrix of coefficients of the above system of equations; hence, it is C, as required.
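Part (iii) is easy to verify mechanically. The sketch below is an added illustration (not part of the original text), assuming SymPy; the annihilator m_v(t) = t^3 − 2t^2 + 4t − 3 is a sample choice, and the characteristic polynomial of its companion matrix is checked to be m_v(t) itself.

```python
import sympy as sp

t = sp.symbols('t')
coeffs = [-3, 4, -2]                 # a_0, a_1, a_2 of m_v(t) = t^3 - 2t^2 + 4t - 3 (sample)
k = len(coeffs)

# Companion matrix: 1's below the diagonal, -a_0, ..., -a_{k-1} in the last column
C = sp.zeros(k, k)
for i in range(1, k):
    C[i, i - 1] = 1
for i in range(k):
    C[i, k - 1] = -coeffs[i]

m_v = t**3 - 2*t**2 + 4*t - 3
assert sp.expand(C.charpoly(t).as_expr() - m_v) == 0
```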
10.30. Let T: V → V be linear. Let W be a T-invariant subspace of V and T̄ the induced operator on V/W. Prove
(a) The T-annihilator of v ∈ V divides the minimal polynomial of T.
(b) The T̄-annihilator of v̄ ∈ V/W divides the minimal polynomial of T.
(a) The T-annihilator of v ∈ V is the minimal polynomial of the restriction of T to Z(v, T); therefore, by Problem 10.6, it divides the minimal polynomial of T.
(b) The T̄-annihilator of v̄ ∈ V/W divides the minimal polynomial of T̄, which divides the minimal polynomial of T by Theorem 10.16.
Remark: In the case where the minimal polynomial of T is f(t)^n, where f(t) is a monic irreducible polynomial, the T-annihilator of v ∈ V and the T̄-annihilator of v̄ ∈ V/W are of the form f(t)^m, where m ≤ n.
10.31. Prove Lemma 10.13: Let T: V → V be a linear operator whose minimal polynomial is f(t)^n, where f(t) is a monic irreducible polynomial. Then V is the direct sum of T-cyclic subspaces Z_i = Z(v_i, T), i = 1, ..., r, with corresponding T-annihilators
f(t)^{n_1}, f(t)^{n_2}, ..., f(t)^{n_r},   n = n_1 ≥ n_2 ≥ ... ≥ n_r
Any other decomposition of V into the direct sum of T-cyclic subspaces has the same number of components and the same set of T-annihilators.
The proof is by induction on the dimension of V. If dim V = 1, then V is T-cyclic and the lemma holds. Now suppose dim V > 1 and that the lemma holds for those vector spaces of dimension less than that of V.
Because the minimal polynomial of T is f(t)^n, there exists v_1 ∈ V such that f(T)^{n−1}(v_1) ≠ 0; hence, the T-annihilator of v_1 is f(t)^n. Let Z_1 = Z(v_1, T) and recall that Z_1 is T-invariant. Let V̄ = V/Z_1 and let T̄ be the linear operator on V̄ induced by T. By Theorem 10.16, the minimal polynomial of T̄ divides f(t)^n; hence, the hypothesis holds for V̄ and T̄. Consequently, by induction, V̄ is the direct sum of T̄-cyclic subspaces; say,
V̄ = Z(v̄_2, T̄) ⊕ ... ⊕ Z(v̄_r, T̄)
where the corresponding T̄-annihilators are f(t)^{n_2}, ..., f(t)^{n_r}, n ≥ n_2 ≥ ... ≥ n_r.
We claim that there is a vector v_2 in the coset v̄_2 whose T-annihilator is f(t)^{n_2}, the T̄-annihilator of v̄_2. Let w be any vector in v̄_2. Then f(T)^{n_2}(w) ∈ Z_1. Hence, there exists a polynomial g(t) for which
f(T)^{n_2}(w) = g(T)(v_1)     (1)
Because f(t)^n is the minimal polynomial of T, we have, by (1),
0 = f(T)^n(w) = f(T)^{n−n_2} g(T)(v_1)
But f(t)^n is the T-annihilator of v_1; hence, f(t)^n divides f(t)^{n−n_2} g(t), and so g(t) = f(t)^{n_2} h(t) for some polynomial h(t). We set
v_2 = w − h(T)(v_1)
Because w − v_2 = h(T)(v_1) ∈ Z_1, v_2 also belongs to the coset v̄_2. Thus, the T-annihilator of v_2 is a multiple of the T̄-annihilator of v̄_2. On the other hand, by (1),
f(T)^{n_2}(v_2) = f(T)^{n_2}(w − h(T)(v_1)) = f(T)^{n_2}(w) − g(T)(v_1) = 0
Consequently, the T-annihilator of v_2 is f(t)^{n_2}, as claimed.
Similarly, there exist vectors v_3, ..., v_r ∈ V such that v_i ∈ v̄_i and the T-annihilator of v_i is f(t)^{n_i}, the T̄-annihilator of v̄_i. We set
Z_2 = Z(v_2, T), ..., Z_r = Z(v_r, T)
Let d denote the degree of f(t), so that f(t)^{n_i} has degree dn_i. Then, because f(t)^{n_i} is both the T-annihilator of v_i and the T̄-annihilator of v̄_i, we know that
{v_i, T(v_i), ..., T^{dn_i − 1}(v_i)}   and   {v̄_i, T̄(v̄_i), ..., T̄^{dn_i − 1}(v̄_i)}
are bases for Z(v_i, T) and Z(v̄_i, T̄), respectively, for i = 2, ..., r. But V̄ = Z(v̄_2, T̄) ⊕ ... ⊕ Z(v̄_r, T̄); hence,
{v̄_2, ..., T̄^{dn_2 − 1}(v̄_2), ..., v̄_r, ..., T̄^{dn_r − 1}(v̄_r)}
is a basis for V̄. Therefore, by Problem 10.26 and the relation T̄^i(v̄) = (T^i(v))‾ (see Problem 10.27),
{v_1, ..., T^{dn_1 − 1}(v_1), v_2, ..., T^{dn_2 − 1}(v_2), ..., v_r, ..., T^{dn_r − 1}(v_r)}
is a basis for V. Thus, by Theorem 10.4, V = Z(v_1, T) ⊕ ... ⊕ Z(v_r, T), as required.
It remains to show that the exponents n_1, ..., n_r are uniquely determined by T. Because d = degree of f(t),
dim V = d(n_1 + ... + n_r)   and   dim Z_i = dn_i,   i = 1, ..., r
Also, if s is any positive integer, then (Problem 10.59) f(T)^s(Z_i) is a cyclic subspace generated by f(T)^s(v_i), and it has dimension d(n_i − s) if n_i ≥ s and dimension 0 if n_i ≤ s.
Now any vector v ∈ V can be written uniquely in the form v = w_1 + ... + w_r, where w_i ∈ Z_i. Hence, any vector in f(T)^s(V) can be written uniquely in the form
f(T)^s(v) = f(T)^s(w_1) + ... + f(T)^s(w_r)
where f(T)^s(w_i) ∈ f(T)^s(Z_i). Let t be the integer, dependent on s, for which
n_1 ≥ s, ..., n_t ≥ s,   n_{t+1} < s
Then   f(T)^s(V) = f(T)^s(Z_1) ⊕ ... ⊕ f(T)^s(Z_t)
and so   dim[f(T)^s(V)] = d[(n_1 − s) + ... + (n_t − s)]     (2)
The numbers on the left of (2) are uniquely determined by T. Set s = n − 1, and (2) determines the number of n_i equal to n. Next set s = n − 2, and (2) determines the number of n_i (if any) equal to n − 1. We repeat the process until we set s = 0 and determine the number of n_i equal to 1. Thus, the n_i are uniquely determined by T and V, and the lemma is proved.
10.32. Let V be a seven-dimensional vector space over R, and let T: V → V be a linear operator with minimal polynomial m(t) = (t^2 − 2t + 5)(t − 3)^3. Find all possible rational canonical forms M of T.
Because dim V = 7, there are only two possible characteristic polynomials, Δ(t) = (t^2 − 2t + 5)^2 (t − 3)^3 or Δ(t) = (t^2 − 2t + 5)(t − 3)^5. Moreover, the sum of the orders of the companion matrices must add up to 7. Also, one companion matrix must be C(t^2 − 2t + 5) and one must be C((t − 3)^3) = C(t^3 − 9t^2 + 27t − 27). Thus, M must be one of the following block diagonal matrices (each block is written by rows, rows separated by semicolons):
(a) diag( [0 −5; 1 2], [0 −5; 1 2], [0 0 27; 1 0 −27; 0 1 9] ),
(b) diag( [0 −5; 1 2], [0 0 27; 1 0 −27; 0 1 9], [0 −9; 1 6] ),
(c) diag( [0 −5; 1 2], [0 0 27; 1 0 −27; 0 1 9], [3], [3] )
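The three candidates can be checked mechanically. The sketch below is an added illustration (not part of the original text), assuming SymPy: it assembles candidate (b) and verifies that its characteristic polynomial is (t^2 − 2t + 5)(t − 3)^5 and that it is annihilated by m(t).

```python
import sympy as sp

t = sp.symbols('t')

A = sp.Matrix([[0, -5], [1, 2]])                      # C(t^2 - 2t + 5)
B = sp.Matrix([[0, 0, 27], [1, 0, -27], [0, 1, 9]])   # C((t - 3)^3)
C2 = sp.Matrix([[0, -9], [1, 6]])                     # C((t - 3)^2)

M = sp.diag(A, B, C2)                                 # candidate (b), a 7x7 matrix

char = M.charpoly(t).as_expr()
assert sp.expand(char - (t**2 - 2*t + 5)*(t - 3)**5) == 0

m_of_M = (M**2 - 2*M + 5*sp.eye(7)) * (M - 3*sp.eye(7))**3   # m(M), m(t) = (t^2-2t+5)(t-3)^3
assert m_of_M == sp.zeros(7)
```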
Projections
10.33. Suppose V = W_1 ⊕ ... ⊕ W_r. The projection of V into its subspace W_k is the mapping E: V → V defined by E(v) = w_k, where v = w_1 + ... + w_r, w_i ∈ W_i. Show that (a) E is linear, (b) E^2 = E.
(a) Because the sum v = w_1 + ... + w_r, w_i ∈ W_i, is uniquely determined by v, the mapping E is well defined. Suppose, for u ∈ V, u = w'_1 + ... + w'_r, w'_i ∈ W_i. Then
v + u = (w_1 + w'_1) + ... + (w_r + w'_r)   and   kv = kw_1 + ... + kw_r,   where kw_i, w_i + w'_i ∈ W_i,
are the unique sums corresponding to v + u and kv. Hence,
E(v + u) = w_k + w'_k = E(v) + E(u)   and   E(kv) = kw_k = kE(v)
and therefore E is linear.
(b) We have that
w_k = 0 + ... + 0 + w_k + 0 + ... + 0
is the unique sum corresponding to w_k ∈ W_k; hence, E(w_k) = w_k. Then, for any v ∈ V,
E^2(v) = E(E(v)) = E(w_k) = w_k = E(v)
Thus, E^2 = E, as required.
10.34. Suppose E: V → V is linear and E^2 = E. Show that (a) E(u) = u for any u ∈ Im E (i.e., the restriction of E to its image is the identity mapping); (b) V is the direct sum of the image and kernel of E: V = Im E ⊕ Ker E; (c) E is the projection of V into Im E, its image. Thus, by the preceding problem, a linear mapping T: V → V is a projection if and only if T^2 = T; this characterization of a projection is frequently used as its definition.
(a) If u ∈ Im E, then there exists v ∈ V for which E(v) = u; hence, as required,
E(u) = E(E(v)) = E^2(v) = E(v) = u
(b) Let v ∈ V. We can write v in the form v = E(v) + (v − E(v)). Now E(v) ∈ Im E and, because
E(v − E(v)) = E(v) − E^2(v) = E(v) − E(v) = 0,
v − E(v) ∈ Ker E. Accordingly, V = Im E + Ker E.
Now suppose w ∈ Im E ∩ Ker E. By (a), E(w) = w because w ∈ Im E. On the other hand, E(w) = 0 because w ∈ Ker E. Thus, w = 0, and so Im E ∩ Ker E = {0}. These two conditions imply that V is the direct sum of the image and kernel of E.
(c) Let v ∈ V and suppose v = u + w, where u ∈ Im E and w ∈ Ker E. Note that E(u) = u by (a), and E(w) = 0 because w ∈ Ker E. Hence,
E(v) = E(u + w) = E(u) + E(w) = u + 0 = u
That is, E is the projection of V into its image.
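A concrete numeric check of Problem 10.34, added for this transcription (a minimal NumPy sketch with an arbitrarily chosen projection of R^3): E^2 = E, the rank of E plus the dimension of its kernel equals dim V, and E acts as the identity on its image.

```python
import numpy as np

# Projection of R^3 onto the plane z = 0 along the line spanned by (1, 1, 1)
E = np.array([[1.0, 0.0, -1.0],
              [0.0, 1.0, -1.0],
              [0.0, 0.0,  0.0]])

assert np.allclose(E @ E, E)                  # E^2 = E

rank = np.linalg.matrix_rank(E)               # dim Im E
nullity = E.shape[0] - rank                   # dim Ker E
assert rank + nullity == 3                    # dim Im E + dim Ker E = dim V

v = np.array([2.0, 3.0, 5.0])
assert np.allclose(E @ (E @ v), E @ v)        # E restricted to its image is the identity
```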
10.35. Suppose V = U ⊕ W and suppose T: V → V is linear. Show that U and W are both T-invariant if and only if TE = ET, where E is the projection of V into U.
Observe that E(v) ∈ U for every v ∈ V, and that (i) E(v) = v iff v ∈ U, (ii) E(v) = 0 iff v ∈ W.
Suppose ET = TE. Let u ∈ U. Because E(u) = u,
T(u) = T(E(u)) = (TE)(u) = (ET)(u) = E(T(u)) ∈ U
Hence, U is T-invariant. Now let w ∈ W. Because E(w) = 0,
E(T(w)) = (ET)(w) = (TE)(w) = T(E(w)) = T(0) = 0,   and so   T(w) ∈ W
Hence, W is also T-invariant.
Conversely, suppose U and W are both T-invariant. Let v ∈ V and suppose v = u + w, where u ∈ U and w ∈ W. Then T(u) ∈ U and T(w) ∈ W; hence, E(T(u)) = T(u) and E(T(w)) = 0. Thus,
(ET)(v) = (ET)(u + w) = (ET)(u) + (ET)(w) = E(T(u)) + E(T(w)) = T(u)
and   (TE)(v) = (TE)(u + w) = T(E(u + w)) = T(u)
That is, (ET)(v) = (TE)(v) for every v ∈ V; therefore, ET = TE, as required.
SUPPLEMENTARY PROBLEMS
Invariant Subspaces
10.36. Suppose W is invariant under T:V ! V. Show that W is invariant under f ðTÞ for any polynomial f ðtÞ.
10.37. Show that every subspace of V is invariant under I and 0, the identity and zero operators.
10.38. Let W be invariant under T1: V ! V and T2: V ! V. Prove W is also invariant under T1 þ T2 and T1T2.
10.39. Let T:V ! V be linear. Prove that any eigenspace, El is T-invariant.
10.40. Let V be a vector space of odd dimension (greater than 1) over the real field R. Show that any linear
operator on V has an invariant subspace other than V or f0g.
10.41. Determine the invariant subspaces of A = [2 −4; 5 −2] (rows separated by semicolons) viewed as a linear operator on (a) R^2, (b) C^2.
10.42. Suppose dim V ¼ n. Show that T:V ! V has a triangular matrix representation if and only if there exist
T-invariant subspaces W1  W2      Wn ¼ V for which dim Wk ¼ k, k ¼ 1; . . . ; n.
Invariant Direct Sums
10.43. The subspaces W1; . . . ; Wr are said to be independent if w1 þ    þ wr ¼ 0, wi 2 Wi, implies that each
wi ¼ 0. Show that spanðWiÞ ¼ W1    Wr if and only if the Wi are independent. [Here spanðWiÞ
denotes the linear span of the Wi.]
10.44. Show that V ¼ W1    Wr if and only if (i) V ¼ spanðWiÞ and (ii) for k ¼ 1; 2; . . . ; r,
Wk  spanðW1; . . . ; Wk1; Wkþ1; . . . ; WrÞ ¼ f0g.
10.45. Show that spanðWiÞ ¼ W1    Wr if and only if dim ½spanðWiÞ ¼ dim W1 þ    þ dim Wr.
10.46. Suppose the characteristic polynomial of T:V ! V is DðtÞ ¼ f1ðtÞn1
f2ðtÞn2
   frðtÞnr
, where the fiðtÞ are
distinct monic irreducible polynomials. Let V ¼ W1    Wr be the primary decomposition of V into T-
invariant subspaces. Show that fiðtÞni
is the characteristic polynomial of the restriction of T to Wi.
Nilpotent Operators
10.47. Suppose T1 and T2 are nilpotent operators that commute (i.e., T1T2 ¼ T2T1). Show that T1 þ T2 and T1T2
are also nilpotent.
10.48. Suppose A is a supertriangular matrix (i.e., all entries on and below the main diagonal are 0). Show that A is
nilpotent.
10.49. Let V be the vector space of polynomials of degree n. Show that the derivative operator on V is nilpotent
of index n þ 1.
10.50. Show that any Jordan nilpotent block matrix N is similar to its transpose NT
(the matrix with 1’s below the
diagonal and 0’s elsewhere).
10.51. Show that two nilpotent matrices of order 3 are similar if and only if they have the same index of
nilpotency. Show by example that the statement is not true for nilpotent matrices of order 4.
Jordan Canonical Form
10.52. Find all possible Jordan canonical forms for those matrices whose characteristic polynomial Δ(t) and minimal polynomial m(t) are as follows:
(a) Δ(t) = (t − 2)^4 (t − 3)^2, m(t) = (t − 2)^2 (t − 3)^2;
(b) Δ(t) = (t − 7)^5, m(t) = (t − 7)^2;  (c) Δ(t) = (t − 2)^7, m(t) = (t − 2)^3
10.53. Show that every complex matrix is similar to its transpose. (Hint: Use its Jordan canonical form.)
10.54. Show that all n n complex matrices A for which An
¼ I but Ak 6¼ I for k  n are similar.
10.55. Suppose A is a complex matrix with only real eigenvalues. Show that A is similar to a matrix with only real
entries.
Cyclic Subspaces
10.56. Suppose T:V ! V is linear. Prove that Zðv; TÞ is the intersection of all T-invariant subspaces containing v.
10.57. Let f ðtÞ and gðtÞ be the T-annihilators of u and v, respectively. Show that if f ðtÞ and gðtÞ are relatively
prime, then f ðtÞgðtÞ is the T-annihilator of u þ v.
10.58. Prove that Zðu; TÞ ¼ Zðv; TÞ if and only if gðTÞðuÞ ¼ v where gðtÞ is relatively prime to the T-annihilator of
u.
10.59. Let W = Z(v, T), and suppose the T-annihilator of v is f(t)^n, where f(t) is a monic irreducible polynomial of degree d. Show that f(T)^s(W) is a cyclic subspace generated by f(T)^s(v) and that it has dimension d(n − s) if n ≥ s and dimension 0 if n ≤ s.
Rational Canonical Form
10.60. Find all possible rational canonical forms for a 6 × 6 matrix over R with minimal polynomial:
(a) m(t) = (t^2 − 2t + 3)(t + 1)^2,  (b) m(t) = (t − 2)^3.
10.61. Let A be a 4 × 4 matrix with minimal polynomial m(t) = (t^2 + 1)(t^2 − 3). Find the rational canonical form for A if A is a matrix over (a) the rational field Q, (b) the real field R, (c) the complex field C.
10.62. Find the rational canonical form for the four-square Jordan block with λ's on the diagonal.
10.63. Prove that the characteristic polynomial of an operator T:V ! V is a product of its elementary divisors.
10.64. Prove that two 3 3 matrices with the same minimal and characteristic polynomials are similar.
10.65. Let Cð f ðtÞÞ denote the companion matrix to an arbitrary polynomial f ðtÞ. Show that f ðtÞ is the
characteristic polynomial of Cð f ðtÞÞ.
Projections
10.66. Suppose V ¼ W1    Wr. Let Ei denote the projection of V into Wi. Prove (i) EiEj ¼ 0, i 6¼ j;
(ii) I ¼ E1 þ    þ Er.
10.67. Let E1; . . . ; Er be linear operators on V such that
(i) E2
i ¼ Ei (i.e., the Ei are projections); (ii) EiEj ¼ 0, i 6¼ j; (iii) I ¼ E1 þ    þ Er
Prove that V ¼ Im E1    Im Er.
10.68. Suppose E: V → V is a projection (i.e., E^2 = E). Prove that E has a matrix representation of the form [I_r 0; 0 0], where r is the rank of E and I_r is the r-square identity matrix.
10.69. Prove that any two projections of the same rank are similar. (Hint: Use the result of Problem 10.68.)
10.70. Suppose E: V → V is a projection. Prove
(i) I − E is a projection and V = Im E ⊕ Im(I − E), (ii) I + E is invertible (if 1 + 1 ≠ 0).
Quotient Spaces
10.71. Let W be a subspace of V. Suppose the set of cosets fv1 þ W; v2 þ W; . . . ; vn þ Wg in V=W is linearly
independent. Show that the set of vectors fv1; v2; . . . ; vng in V is also linearly independent.
10.72. Let W be a subspace of V. Suppose the set of vectors {u_1, u_2, ..., u_n} in V is linearly independent, and that L(u_i) ∩ W = {0}. Show that the set of cosets {u_1 + W, ..., u_n + W} in V/W is also linearly independent.
10.73. Suppose V ¼ U W and that fu1; . . . ; ung is a basis of U. Show that fu1 þ W; . . . ; un þ Wg is a basis
of the quotient spaces V=W. (Observe that no condition is placed on the dimensionality of V or W.)
10.74. Let W be the solution space of the linear equation
a1x1 þ a2x2 þ    þ anxn ¼ 0; ai 2 K
and let v ¼ ðb1; b2; . . . ; bnÞ 2 Kn
. Prove that the coset v þ W of W in Kn
is the solution set of the linear
equation
a1x1 þ a2x2 þ    þ anxn ¼ b; where b ¼ a1b1 þ    þ anbn
10.75. Let V be the vector space of polynomials over R and let W be the subspace of polynomials divisible by t^4 (i.e., of the form a_0 t^4 + a_1 t^5 + ... + a_{n−4} t^n). Show that the quotient space V/W has dimension 4.
10.76. Let U and W be subspaces of V such that W  U  V. Note that any coset u þ W of W in U may also be
viewed as a coset of W in V, because u 2 U implies u 2 V; hence, U=W is a subset of V=W. Prove that
(i) U=W is a subspace of V=W, (ii) dimðV=WÞ  dimðU=WÞ ¼ dimðV=UÞ.
10.77. Let U and W be subspaces of V. Show that the cosets of U  W in V can be obtained by intersecting each of
the cosets of U in V by each of the cosets of W in V:
V=ðU  WÞ ¼ fðv þ UÞ  ðv0
þ WÞ : v; v0
2 Vg
10.78. Let T: V → V′ be linear with kernel W and image U. Show that the quotient space V/W is isomorphic to U under the mapping θ: V/W → U defined by θ(v + W) = T(v). Furthermore, show that T = i ∘ θ ∘ η, where η: V → V/W is the natural mapping of V into V/W (i.e., η(v) = v + W), and i: U ↪ V′ is the inclusion mapping (i.e., i(u) = u). (See diagram.)
ANSWERS TO SUPPLEMENTARY PROBLEMS
10.41. (a) R^2 and {0};  (b) C^2, {0}, W_1 = span(2, 1 − 2i), W_2 = span(2, 1 + 2i)

10.52. (Matrices are written by rows, rows separated by semicolons.)
(a) diag( [2 1; 0 2], [2 1; 0 2], [3 1; 0 3] ),  diag( [2 1; 0 2], [2], [2], [3 1; 0 3] );
(b) diag( [7 1; 0 7], [7 1; 0 7], [7] ),  diag( [7 1; 0 7], [7], [7], [7] );
(c) Let M_k denote a Jordan block with λ = 2 and order k. Then: diag(M_3, M_3, M_1), diag(M_3, M_2, M_2), diag(M_3, M_2, M_1, M_1), diag(M_3, M_1, M_1, M_1, M_1)

10.60. Let A = [0 −3; 1 2], B = [0 −1; 1 −2], C = [0 0 8; 1 0 −12; 0 1 6], D = [0 −4; 1 4].
(a) diag(A, A, B), diag(A, B, B), diag(A, B, −1, −1);  (b) diag(C, C), diag(C, D, 2), diag(C, 2, 2, 2)

10.61. Let A = [0 −1; 1 0], B = [0 3; 1 0].
(a) diag(A, B),  (b) diag(A, √3, −√3),  (c) diag(i, −i, √3, −√3)

10.62. The companion matrix with last column [−λ^4, 4λ^3, −6λ^2, 4λ]^T
CHAPTER 11
Linear Functionals
and the Dual Space
11.1 Introduction
In this chapter, we study linear mappings from a vector space V into its field K of scalars. (Unless
otherwise stated or implied, we view K as a vector space over itself.) Naturally all the theorems and
results for arbitrary mappings on V hold for this special case. However, we treat these mappings
separately because of their fundamental importance and because the special relationship of V to K gives
rise to new notions and results that do not apply in the general case.
11.2 Linear Functionals and the Dual Space
Let V be a vector space over a field K. A mapping f:V ! K is termed a linear functional (or linear form)
if, for every u; v 2 V and every a; b; 2 K,
fðau þ bvÞ ¼ afðuÞ þ bfðvÞ
In other words, a linear functional on V is a linear mapping from V into K.
EXAMPLE 11.1
(a) Let pi:Kn
! K be the ith projection mapping; that is, piða1; a2; . . . anÞ ¼ ai. Then pi is linear and so it is a linear
functional on Kn
.
(b) Let V be the vector space of polynomials in t over R. Let J:V ! R be the integral operator defined by
JðpðtÞÞ ¼
Ð1
0 pðtÞ dt. Recall that J is linear; and hence, it is a linear functional on V.
(c) Let V be the vector space of n-square matrices over K. Let T :V ! K be the trace mapping
TðAÞ ¼ a11 þ a22 þ    þ ann; where A ¼ ½aij
That is, T assigns to a matrix A the sum of its diagonal elements. This map is linear (Problem 11.24), and so it is
a linear functional on V.
By Theorem 5.10, the set of linear functionals on a vector space V over a field K is also a vector
space over K, with addition and scalar multiplication defined by
ðf þ sÞðvÞ ¼ fðvÞ þ sðvÞ and ðkfÞðvÞ ¼ kfðvÞ
where f and s are linear functionals on V and k 2 K. This space is called the dual space of V and is
denoted by V*.
EXAMPLE 11.2 Let V = K^n, the vector space of n-tuples, which we write as column vectors. Then the dual space V* can be identified with the space of row vectors. In particular, any linear functional φ = (a_1, ..., a_n) in V* has the representation
φ(x_1, x_2, ..., x_n) = [a_1, a_2, ..., a_n][x_1, x_2, ..., x_n]^T = a_1x_1 + a_2x_2 + ... + a_nx_n
Historically, the formal expression on the right was termed a linear form.
11.3 Dual Basis
Suppose V is a vector space of dimension n over K. By Theorem 5.11, the dimension of the dual space V*
is also n (because K is of dimension 1 over itself). In fact, each basis of V determines a basis of V* as
follows (see Problem 11.3 for the proof).
THEOREM 11.1: Suppose {v_1, ..., v_n} is a basis of V over K. Let φ_1, ..., φ_n ∈ V* be the linear functionals defined by
φ_i(v_j) = δ_ij = { 1 if i = j, 0 if i ≠ j }
Then {φ_1, ..., φ_n} is a basis of V*.
The above basis {φ_i} is termed the basis dual to {v_i} or the dual basis. The above formula, which uses the Kronecker delta δ_ij, is a short way of writing
φ_1(v_1) = 1, φ_1(v_2) = 0, φ_1(v_3) = 0, ..., φ_1(v_n) = 0
φ_2(v_1) = 0, φ_2(v_2) = 1, φ_2(v_3) = 0, ..., φ_2(v_n) = 0
..............................................................................
φ_n(v_1) = 0, φ_n(v_2) = 0, ..., φ_n(v_{n−1}) = 0, φ_n(v_n) = 1
By Theorem 5.2, these linear mappings φ_i are unique and well defined.
EXAMPLE 11.3 Consider the basis {v_1 = (2, 1), v_2 = (3, 1)} of R^2. Find the dual basis {φ_1, φ_2}.
We seek linear functionals φ_1(x, y) = ax + by and φ_2(x, y) = cx + dy such that
φ_1(v_1) = 1, φ_1(v_2) = 0, φ_2(v_1) = 0, φ_2(v_2) = 1
These four conditions lead to the following two systems of linear equations:
φ_1(v_1) = φ_1(2, 1) = 2a + b = 1,  φ_1(v_2) = φ_1(3, 1) = 3a + b = 0
and
φ_2(v_1) = φ_2(2, 1) = 2c + d = 0,  φ_2(v_2) = φ_2(3, 1) = 3c + d = 1
The solutions yield a = −1, b = 3 and c = 1, d = −2. Hence, φ_1(x, y) = −x + 3y and φ_2(x, y) = x − 2y form the dual basis.
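The computation in Example 11.3 amounts to a matrix inversion: if the basis vectors are placed as the columns of a matrix M, the coefficient rows of the dual functionals are the rows of M^{-1}. The NumPy sketch below is an added illustration for this transcription, not part of the original text.

```python
import numpy as np

# Columns of M are the basis vectors v1 = (2, 1) and v2 = (3, 1)
M = np.array([[2.0, 3.0],
              [1.0, 1.0]])

D = np.linalg.inv(M)           # row i of D holds the coefficients of the dual functional phi_i
print(D[0])                    # [-1.  3.]  ->  phi_1(x, y) = -x + 3y
print(D[1])                    # [ 1. -2.]  ->  phi_2(x, y) =  x - 2y

assert np.allclose(D @ M, np.eye(2))   # the defining property phi_i(v_j) = delta_ij
```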
The next two theorems (proved in Problems 11.4 and 11.5, respectively) give relationships between
bases and their duals.
THEOREM 11.2: Let {v_1, ..., v_n} be a basis of V and let {φ_1, ..., φ_n} be the dual basis in V*. Then
(i) For any vector u ∈ V, u = φ_1(u)v_1 + φ_2(u)v_2 + ... + φ_n(u)v_n.
(ii) For any linear functional σ ∈ V*, σ = σ(v_1)φ_1 + σ(v_2)φ_2 + ... + σ(v_n)φ_n.
THEOREM 11.3: Let {v_1, ..., v_n} and {w_1, ..., w_n} be bases of V and let {φ_1, ..., φ_n} and {σ_1, ..., σ_n} be the bases of V* dual to {v_i} and {w_i}, respectively. Suppose P is the change-of-basis matrix from {v_i} to {w_i}. Then (P^{-1})^T is the change-of-basis matrix from {φ_i} to {σ_i}.
11.4 Second Dual Space
We repeat: Every vector space V has a dual space V*, which consists of all the linear functionals on V.
Thus, V* has a dual space V**, called the second dual of V, which consists of all the linear functionals
on V*.
We now show that each v ∈ V determines a specific element v̂ ∈ V**. First, for any φ ∈ V*, we define
v̂(φ) = φ(v)
It remains to be shown that this map v̂: V* → K is linear. For any scalars a, b ∈ K and any linear functionals φ, σ ∈ V*, we have
v̂(aφ + bσ) = (aφ + bσ)(v) = aφ(v) + bσ(v) = av̂(φ) + bv̂(σ)
That is, v̂ is linear and so v̂ ∈ V**. The following theorem (proved in Problem 11.7) holds.
THEOREM 11.4: If V has finite dimension, then the mapping v ↦ v̂ is an isomorphism of V onto V**.
The above mapping v ↦ v̂ is called the natural mapping of V into V**. We emphasize that this mapping is never onto V** if V is not finite-dimensional. However, it is always linear, and moreover, it is always one-to-one.
Now suppose V does have finite dimension. By Theorem 11.4, the natural mapping determines an
isomorphism between V and V**. Unless otherwise stated, we will identify V with V** by this
mapping. Accordingly, we will view V as the space of linear functionals on V* and write V ¼ V**. We
remark that if ffig is the basis of V* dual to a basis fvig of V, then fvig is the basis of V** ¼ V that is
dual to ffig.
11.5 Annihilators
Let W be a subset (not necessarily a subspace) of a vector space V. A linear functional f 2 V* is called
an annihilator of W if fðwÞ ¼ 0 for every w 2 W—that is, if fðWÞ ¼ f0g. We show that the set of all
such mappings, denoted by W0
and called the annihilator of W, is a subspace of V*. Clearly, 0 2 W0
:
Now suppose φ, σ ∈ W^0. Then, for any scalars a, b ∈ K and for any w ∈ W,
(aφ + bσ)(w) = aφ(w) + bσ(w) = a·0 + b·0 = 0
Thus, aφ + bσ ∈ W^0, and so W^0 is a subspace of V*.
In the case that W is a subspace of V, we have the following relationship between W and its annihilator
W0
(see Problem 11.11 for the proof).
THEOREM 11.5: Suppose V has finite dimension and W is a subspace of V. Then
ðiÞ dim W þ dim W0
¼ dim V and ðiiÞ W00
¼ W
Here W00
¼ fv 2 V :fðvÞ ¼ 0 for every f 2 W0
g or, equivalently, W00
¼ ðW0
Þ0
, where W00
is viewed
as a subspace of V under the identification of V and V**.
11.6 Transpose of a Linear Mapping
Let T :V ! U be an arbitrary linear mapping from a vector space V into a vector space U. Now for any
linear functional f 2 U*, the composition f  T is a linear mapping from V into K:
That is, f  T 2 V*. Thus, the correspondence
f 7! f  T
is a mapping from U* into V*; we denote it by Tt
and call it the transpose of T. In other words,
Tt
:U* ! V* is defined by
Tt
ðfÞ ¼ f  T
Thus, ðTt
ðfÞÞðvÞ ¼ fðTðvÞÞ for every v 2 V.
THEOREM 11.6: The transpose mapping Tt
defined above is linear.
Proof. For any scalars a, b ∈ K and any linear functionals φ, σ ∈ U*,
T^t(aφ + bσ) = (aφ + bσ) ∘ T = a(φ ∘ T) + b(σ ∘ T) = aT^t(φ) + bT^t(σ)
That is, T^t is linear, as claimed.
We emphasize that if T is a linear mapping from V into U, then T^t is a linear mapping from U* into V*. The name "transpose" for the mapping T^t no doubt derives from the following theorem (proved in Problem 11.16).
THEOREM 11.7: Let T :V ! U be linear, and let A be the matrix representation of T relative to bases
fvig of V and fuig of U. Then the transpose matrix AT
is the matrix representation of
Tt
:U* ! V* relative to the bases dual to fuig and fvig.
SOLVED PROBLEMS
Dual Spaces and Dual Bases
11.1. Find the basis {φ_1, φ_2, φ_3} that is dual to the following basis of R^3:
{v_1 = (1, −1, 3), v_2 = (0, 1, −1), v_3 = (0, 3, −2)}
The linear functionals may be expressed in the form
φ_1(x, y, z) = a_1x + a_2y + a_3z,  φ_2(x, y, z) = b_1x + b_2y + b_3z,  φ_3(x, y, z) = c_1x + c_2y + c_3z
By definition of the dual basis, φ_i(v_j) = 0 for i ≠ j, but φ_i(v_j) = 1 for i = j.
We find φ_1 by setting φ_1(v_1) = 1, φ_1(v_2) = 0, φ_1(v_3) = 0. This yields
φ_1(1, −1, 3) = a_1 − a_2 + 3a_3 = 1,  φ_1(0, 1, −1) = a_2 − a_3 = 0,  φ_1(0, 3, −2) = 3a_2 − 2a_3 = 0
Solving the system of equations yields a_1 = 1, a_2 = 0, a_3 = 0. Thus, φ_1(x, y, z) = x.
We find φ_2 by setting φ_2(v_1) = 0, φ_2(v_2) = 1, φ_2(v_3) = 0. This yields
φ_2(1, −1, 3) = b_1 − b_2 + 3b_3 = 0,  φ_2(0, 1, −1) = b_2 − b_3 = 1,  φ_2(0, 3, −2) = 3b_2 − 2b_3 = 0
Solving the system of equations yields b_1 = 7, b_2 = −2, b_3 = −3. Thus, φ_2(x, y, z) = 7x − 2y − 3z.
We find φ_3 by setting φ_3(v_1) = 0, φ_3(v_2) = 0, φ_3(v_3) = 1. This yields
φ_3(1, −1, 3) = c_1 − c_2 + 3c_3 = 0,  φ_3(0, 1, −1) = c_2 − c_3 = 0,  φ_3(0, 3, −2) = 3c_2 − 2c_3 = 1
Solving the system of equations yields c_1 = −2, c_2 = 1, c_3 = 1. Thus, φ_3(x, y, z) = −2x + y + z.
11.2. Let V = {a + bt : a, b ∈ R}, the vector space of real polynomials of degree ≤ 1. Find the basis {v_1, v_2} of V that is dual to the basis {φ_1, φ_2} of V* defined by
φ_1(f(t)) = ∫_0^1 f(t) dt   and   φ_2(f(t)) = ∫_0^2 f(t) dt
Let v_1 = a + bt and v_2 = c + dt. By definition of the dual basis,
φ_1(v_1) = 1, φ_1(v_2) = 0   and   φ_2(v_1) = 0, φ_2(v_2) = 1
Thus,
φ_1(v_1) = ∫_0^1 (a + bt) dt = a + ½b = 1,   φ_2(v_1) = ∫_0^2 (a + bt) dt = 2a + 2b = 0
and
φ_1(v_2) = ∫_0^1 (c + dt) dt = c + ½d = 0,   φ_2(v_2) = ∫_0^2 (c + dt) dt = 2c + 2d = 1
Solving each system yields a = 2, b = −2 and c = −½, d = 1. Thus, {v_1 = 2 − 2t, v_2 = −½ + t} is the basis of V that is dual to {φ_1, φ_2}.
11.3. Prove Theorem 11.1: Suppose {v_1, ..., v_n} is a basis of V over K. Let φ_1, ..., φ_n ∈ V* be defined by φ_i(v_j) = 0 for i ≠ j, but φ_i(v_j) = 1 for i = j. Then {φ_1, ..., φ_n} is a basis of V*.
We first show that {φ_1, ..., φ_n} spans V*. Let φ be an arbitrary element of V*, and suppose
φ(v_1) = k_1, φ(v_2) = k_2, ..., φ(v_n) = k_n
Set σ = k_1φ_1 + ... + k_nφ_n. Then
σ(v_1) = (k_1φ_1 + ... + k_nφ_n)(v_1) = k_1φ_1(v_1) + k_2φ_2(v_1) + ... + k_nφ_n(v_1)
= k_1·1 + k_2·0 + ... + k_n·0 = k_1
Similarly, for i = 2, ..., n,
σ(v_i) = (k_1φ_1 + ... + k_nφ_n)(v_i) = k_1φ_1(v_i) + ... + k_iφ_i(v_i) + ... + k_nφ_n(v_i) = k_i
Thus, φ(v_i) = σ(v_i) for i = 1, ..., n. Because φ and σ agree on the basis vectors, φ = σ = k_1φ_1 + ... + k_nφ_n. Accordingly, {φ_1, ..., φ_n} spans V*.
It remains to be shown that {φ_1, ..., φ_n} is linearly independent. Suppose
a_1φ_1 + a_2φ_2 + ... + a_nφ_n = 0
Applying both sides to v_1, we obtain
0 = 0(v_1) = (a_1φ_1 + ... + a_nφ_n)(v_1) = a_1φ_1(v_1) + a_2φ_2(v_1) + ... + a_nφ_n(v_1)
= a_1·1 + a_2·0 + ... + a_n·0 = a_1
Similarly, for i = 2, ..., n,
0 = 0(v_i) = (a_1φ_1 + ... + a_nφ_n)(v_i) = a_1φ_1(v_i) + ... + a_iφ_i(v_i) + ... + a_nφ_n(v_i) = a_i
That is, a_1 = 0, ..., a_n = 0. Hence, {φ_1, ..., φ_n} is linearly independent, and so it is a basis of V*.
11.4. Prove Theorem 11.2: Let {v_1, ..., v_n} be a basis of V and let {φ_1, ..., φ_n} be the dual basis in V*. For any u ∈ V and any σ ∈ V*, (i) u = Σ_i φ_i(u)v_i, (ii) σ = Σ_i σ(v_i)φ_i.
Suppose
u = a_1v_1 + a_2v_2 + ... + a_nv_n     (1)
Then
φ_1(u) = a_1φ_1(v_1) + a_2φ_1(v_2) + ... + a_nφ_1(v_n) = a_1·1 + a_2·0 + ... + a_n·0 = a_1
Similarly, for i = 2, ..., n,
φ_i(u) = a_1φ_i(v_1) + ... + a_iφ_i(v_i) + ... + a_nφ_i(v_n) = a_i
That is, φ_1(u) = a_1, φ_2(u) = a_2, ..., φ_n(u) = a_n. Substituting these results into (1), we obtain (i).
Next we prove (ii). Applying the linear functional σ to both sides of (i),
σ(u) = φ_1(u)σ(v_1) + φ_2(u)σ(v_2) + ... + φ_n(u)σ(v_n)
= σ(v_1)φ_1(u) + σ(v_2)φ_2(u) + ... + σ(v_n)φ_n(u)
= (σ(v_1)φ_1 + σ(v_2)φ_2 + ... + σ(v_n)φ_n)(u)
Because the above holds for every u ∈ V, σ = σ(v_1)φ_1 + σ(v_2)φ_2 + ... + σ(v_n)φ_n, as claimed.
11.5. Prove Theorem 11.3: Let {v_i} and {w_i} be bases of V and let {φ_i} and {σ_i} be the respective dual bases in V*. Let P be the change-of-basis matrix from {v_i} to {w_i}. Then (P^{-1})^T is the change-of-basis matrix from {φ_i} to {σ_i}.
Suppose, for i = 1, ..., n,
w_i = a_{i1}v_1 + a_{i2}v_2 + ... + a_{in}v_n   and   σ_i = b_{i1}φ_1 + b_{i2}φ_2 + ... + b_{in}φ_n
Then P = [a_{ij}] and Q = [b_{ij}]. We seek to prove that Q = (P^{-1})^T.
Let R_i denote the ith row of Q and let C_j denote the jth column of P^T. Then
R_i = (b_{i1}, b_{i2}, ..., b_{in})   and   C_j = (a_{j1}, a_{j2}, ..., a_{jn})^T
By definition of the dual basis,
σ_i(w_j) = (b_{i1}φ_1 + b_{i2}φ_2 + ... + b_{in}φ_n)(a_{j1}v_1 + a_{j2}v_2 + ... + a_{jn}v_n)
= b_{i1}a_{j1} + b_{i2}a_{j2} + ... + b_{in}a_{jn} = R_iC_j = δ_ij
where δ_ij is the Kronecker delta. Thus,
QP^T = [R_iC_j] = [δ_ij] = I
Therefore, Q = (P^T)^{-1} = (P^{-1})^T, as claimed.
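A quick numerical check of Theorem 11.3, added for this transcription (the two bases of R^2 are arbitrary sample choices): the dual-coefficient matrices are computed as in the note after Example 11.3, and the relations QP^T = I and Q = (P^{-1})^T from the proof are verified with NumPy.

```python
import numpy as np

V = np.array([[2.0, 3.0], [1.0, 1.0]])      # columns: basis {v1, v2} of R^2
W = np.array([[1.0, 1.0], [0.0, 2.0]])      # columns: basis {w1, w2} of R^2

P = (np.linalg.inv(V) @ W).T                # P = [a_ij] with w_i = sum_j a_ij v_j

Dv = np.linalg.inv(V)                       # rows: coefficients of the dual basis {phi_i}
Dw = np.linalg.inv(W)                       # rows: coefficients of the dual basis {sigma_i}

Q = Dw @ np.linalg.inv(Dv)                  # Q = [b_ij] with sigma_i = sum_j b_ij phi_j

assert np.allclose(Q @ P.T, np.eye(2))      # Q P^T = I, as in the proof
assert np.allclose(Q, np.linalg.inv(P).T)   # hence Q = (P^{-1})^T
```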
11.6. Suppose v 2 V, v 6¼ 0, and dim V ¼ n. Show that there exists f 2 V* such that fðvÞ 6¼ 0.
We extend fvg to a basis fv; v2; . . . ; vng of V. By Theorem 5.2, there exists a unique linear mapping
f:V ! K such that fðvÞ ¼ 1 and fðviÞ ¼ 0, i ¼ 2; . . . ; n. Hence, f has the desired property.
11.7. Prove Theorem 11.4: Suppose dim V = n. Then the natural mapping v ↦ v̂ is an isomorphism of V onto V**.
We first prove that the map v ↦ v̂ is linear—that is, for any vectors v, w ∈ V and any scalars a, b ∈ K, (av + bw)^ = av̂ + bŵ. For any linear functional φ ∈ V*,
(av + bw)^(φ) = φ(av + bw) = aφ(v) + bφ(w) = av̂(φ) + bŵ(φ) = (av̂ + bŵ)(φ)
Because (av + bw)^(φ) = (av̂ + bŵ)(φ) for every φ ∈ V*, we have (av + bw)^ = av̂ + bŵ. Thus, the map v ↦ v̂ is linear.
Now suppose v ∈ V, v ≠ 0. Then, by Problem 11.6, there exists φ ∈ V* for which φ(v) ≠ 0. Hence, v̂(φ) = φ(v) ≠ 0, and thus v̂ ≠ 0. Because v ≠ 0 implies v̂ ≠ 0, the map v ↦ v̂ is nonsingular and hence an isomorphism (Theorem 5.64).
Now dim V = dim V* = dim V**, because V has finite dimension. Accordingly, the mapping v ↦ v̂ is an isomorphism of V onto V**.
Annihilators
11.8. Show that if φ ∈ V* annihilates a subset S of V, then φ annihilates the linear span L(S) of S. Hence, S^0 = [span(S)]^0.
Suppose v ∈ span(S). Then there exist w_1, ..., w_r ∈ S for which v = a_1w_1 + a_2w_2 + ... + a_rw_r. Then
φ(v) = a_1φ(w_1) + a_2φ(w_2) + ... + a_rφ(w_r) = a_1·0 + a_2·0 + ... + a_r·0 = 0
Because v was an arbitrary element of span(S), φ annihilates span(S), as claimed.
11.9. Find a basis of the annihilator W^0 of the subspace W of R^4 spanned by
v_1 = (1, 2, −3, 4)   and   v_2 = (0, 1, 4, −1)
By Problem 11.8, it suffices to find a basis of the set of linear functionals φ such that φ(v_1) = 0 and φ(v_2) = 0, where φ(x_1, x_2, x_3, x_4) = ax_1 + bx_2 + cx_3 + dx_4. Thus,
φ(1, 2, −3, 4) = a + 2b − 3c + 4d = 0   and   φ(0, 1, 4, −1) = b + 4c − d = 0
The system of two equations in the unknowns a, b, c, d is in echelon form with free variables c and d.
(1) Set c = 1, d = 0 to obtain the solution a = 11, b = −4, c = 1, d = 0.
(2) Set c = 0, d = 1 to obtain the solution a = −6, b = 1, c = 0, d = 1.
The linear functionals φ_1(x_i) = 11x_1 − 4x_2 + x_3 and φ_2(x_i) = −6x_1 + x_2 + x_4 form a basis of W^0.
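Computationally, the coefficient vectors of W^0 form the null space of the matrix whose rows are the spanning vectors of W. The short SymPy sketch below is an added illustration; it reproduces the two functionals found above, up to the usual choice of free variables.

```python
import sympy as sp

# Rows are the spanning vectors v1, v2 of W
A = sp.Matrix([[1, 2, -3, 4],
               [0, 1, 4, -1]])

for n in A.nullspace():   # each null-space vector gives coefficients (a, b, c, d) of a functional in W^0
    print(n.T)
# Expected (up to scaling / choice of free variables): (11, -4, 1, 0) and (-6, 1, 0, 1)
```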
11.10. Show that (a) For any subset S of V, S ⊆ S^00. (b) If S_1 ⊆ S_2, then S_2^0 ⊆ S_1^0.
(a) Let v ∈ S. Then for every linear functional φ ∈ S^0, v̂(φ) = φ(v) = 0. Hence, v̂ ∈ (S^0)^0. Therefore, under the identification of V and V**, v ∈ S^00. Accordingly, S ⊆ S^00.
(b) Let φ ∈ S_2^0. Then φ(v) = 0 for every v ∈ S_2. But S_1 ⊆ S_2; hence, φ annihilates every element of S_1 (i.e., φ ∈ S_1^0). Therefore, S_2^0 ⊆ S_1^0.
11.11. Prove Theorem 11.5: Suppose V has finite dimension and W is a subspace of V. Then
(i) dim W + dim W^0 = dim V,  (ii) W^00 = W.
(i) Suppose dim V = n and dim W = r ≤ n. We want to show that dim W^0 = n − r. We choose a basis {w_1, ..., w_r} of W and extend it to a basis of V, say {w_1, ..., w_r, v_1, ..., v_{n−r}}. Consider the dual basis
{φ_1, ..., φ_r, σ_1, ..., σ_{n−r}}
By definition of the dual basis, each of the above σ's annihilates each w_i; hence, σ_1, ..., σ_{n−r} ∈ W^0. We claim that {σ_j} is a basis of W^0. Now {σ_j} is part of a basis of V*, and so it is linearly independent.
We next show that {σ_j} spans W^0. Let σ ∈ W^0. By Theorem 11.2,
σ = σ(w_1)φ_1 + ... + σ(w_r)φ_r + σ(v_1)σ_1 + ... + σ(v_{n−r})σ_{n−r}
= 0φ_1 + ... + 0φ_r + σ(v_1)σ_1 + ... + σ(v_{n−r})σ_{n−r}
= σ(v_1)σ_1 + ... + σ(v_{n−r})σ_{n−r}
Consequently, {σ_1, ..., σ_{n−r}} spans W^0, and so it is a basis of W^0. Accordingly, as required,
dim W^0 = n − r = dim V − dim W.
(ii) Suppose dim V = n and dim W = r. Then dim V* = n and, by (i), dim W^0 = n − r. Thus, by (i), dim W^00 = n − (n − r) = r; therefore, dim W = dim W^00. By Problem 11.10, W ⊆ W^00. Accordingly, W = W^00.
11.12. Let U and W be subspaces of V. Prove that (U + W)^0 = U^0 ∩ W^0.
Let φ ∈ (U + W)^0. Then φ annihilates U + W, and so, in particular, φ annihilates U and W. That is, φ ∈ U^0 and φ ∈ W^0; hence, φ ∈ U^0 ∩ W^0. Thus, (U + W)^0 ⊆ U^0 ∩ W^0.
On the other hand, suppose σ ∈ U^0 ∩ W^0. Then σ annihilates U and also W. If v ∈ U + W, then v = u + w, where u ∈ U and w ∈ W. Hence, σ(v) = σ(u) + σ(w) = 0 + 0 = 0. Thus, σ annihilates U + W; that is, σ ∈ (U + W)^0. Accordingly, U^0 ∩ W^0 ⊆ (U + W)^0.
The two inclusion relations together give us the desired equality.
Remark: Observe that no dimension argument is employed in the proof; hence, the result holds for
spaces of finite or infinite dimension.
Transpose of a Linear Mapping
11.13. Let φ be the linear functional on R^2 defined by φ(x, y) = x − 2y. For each of the following linear operators T on R^2, find (T^t(φ))(x, y):
(a) T(x, y) = (x, 0),  (b) T(x, y) = (y, x + y),  (c) T(x, y) = (2x − 3y, 5x + 2y)
By definition, T^t(φ) = φ ∘ T; that is, (T^t(φ))(v) = φ(T(v)) for every v. Hence,
(a) (T^t(φ))(x, y) = φ(T(x, y)) = φ(x, 0) = x
(b) (T^t(φ))(x, y) = φ(T(x, y)) = φ(y, x + y) = y − 2(x + y) = −2x − y
(c) (T^t(φ))(x, y) = φ(T(x, y)) = φ(2x − 3y, 5x + 2y) = (2x − 3y) − 2(5x + 2y) = −8x − 7y
11.14. Let T: V → U be linear and let T^t: U* → V* be its transpose. Show that the kernel of T^t is the annihilator of the image of T—that is, Ker T^t = (Im T)^0.
Suppose φ ∈ Ker T^t; that is, T^t(φ) = φ ∘ T = 0. If u ∈ Im T, then u = T(v) for some v ∈ V; hence,
φ(u) = φ(T(v)) = (φ ∘ T)(v) = 0(v) = 0
We have that φ(u) = 0 for every u ∈ Im T; hence, φ ∈ (Im T)^0. Thus, Ker T^t ⊆ (Im T)^0.
On the other hand, suppose σ ∈ (Im T)^0; that is, σ(Im T) = {0}. Then, for every v ∈ V,
(T^t(σ))(v) = (σ ∘ T)(v) = σ(T(v)) = 0 = 0(v)
We have (T^t(σ))(v) = 0(v) for every v ∈ V; hence, T^t(σ) = 0. Thus, σ ∈ Ker T^t, and so (Im T)^0 ⊆ Ker T^t.
The two inclusion relations together give us the required equality.
11.15. Suppose V and U have finite dimension and T: V → U is linear. Prove rank(T) = rank(T^t).
Suppose dim V = n and dim U = m, and suppose rank(T) = r. By Theorem 11.5,
dim(Im T)^0 = dim U − dim(Im T) = m − rank(T) = m − r
By Problem 11.14, Ker T^t = (Im T)^0. Hence, nullity(T^t) = m − r. It then follows that, as claimed,
rank(T^t) = dim U* − nullity(T^t) = m − (m − r) = r = rank(T)
11.16. Prove Theorem 11.7: Let T: V → U be linear and let A be the matrix representation of T in the bases {v_j} of V and {u_i} of U. Then the transpose matrix A^T is the matrix representation of T^t: U* → V* in the bases dual to {u_i} and {v_j}.
Suppose, for j = 1, ..., m,
T(v_j) = a_{j1}u_1 + a_{j2}u_2 + ... + a_{jn}u_n     (1)
We want to prove that, for i = 1, ..., n,
T^t(σ_i) = a_{1i}φ_1 + a_{2i}φ_2 + ... + a_{mi}φ_m     (2)
where {σ_i} and {φ_j} are the bases dual to {u_i} and {v_j}, respectively.
Let v ∈ V and suppose v = k_1v_1 + k_2v_2 + ... + k_mv_m. Then, by (1),
T(v) = k_1T(v_1) + k_2T(v_2) + ... + k_mT(v_m)
= k_1(a_{11}u_1 + ... + a_{1n}u_n) + k_2(a_{21}u_1 + ... + a_{2n}u_n) + ... + k_m(a_{m1}u_1 + ... + a_{mn}u_n)
= (k_1a_{11} + k_2a_{21} + ... + k_ma_{m1})u_1 + ... + (k_1a_{1n} + k_2a_{2n} + ... + k_ma_{mn})u_n
= Σ_{i=1}^{n} (k_1a_{1i} + k_2a_{2i} + ... + k_ma_{mi})u_i
Hence, for j = 1, ..., n,
(T^t(σ_j))(v) = σ_j(T(v)) = σ_j( Σ_{i=1}^{n} (k_1a_{1i} + k_2a_{2i} + ... + k_ma_{mi})u_i ) = k_1a_{1j} + k_2a_{2j} + ... + k_ma_{mj}     (3)
On the other hand, for j = 1, ..., n,
(a_{1j}φ_1 + a_{2j}φ_2 + ... + a_{mj}φ_m)(v) = (a_{1j}φ_1 + a_{2j}φ_2 + ... + a_{mj}φ_m)(k_1v_1 + k_2v_2 + ... + k_mv_m)
= k_1a_{1j} + k_2a_{2j} + ... + k_ma_{mj}     (4)
Because v ∈ V was arbitrary, (3) and (4) imply that
T^t(σ_j) = a_{1j}φ_1 + a_{2j}φ_2 + ... + a_{mj}φ_m,   j = 1, ..., n
which is (2). Thus, the theorem is proved.
SUPPLEMENTARY PROBLEMS
Dual Spaces and Dual Bases
11.17. Find (a) φ + σ, (b) 3φ, (c) 2φ − 5σ, where φ: R^3 → R and σ: R^3 → R are defined by
φ(x, y, z) = 2x − 3y + z   and   σ(x, y, z) = 4x − 2y + 3z
11.18. Find the dual basis of each of the following bases of R^3: (a) {(1, 0, 0), (0, 1, 0), (0, 0, 1)},
(b) {(1, −2, 3), (1, −1, 1), (2, −4, 7)}.
11.19. Let V be the vector space of polynomials over R of degree ≤ 2. Let φ_1, φ_2, φ_3 be the linear functionals on V defined by
φ_1(f(t)) = ∫_0^1 f(t) dt,   φ_2(f(t)) = f′(1),   φ_3(f(t)) = f(0)
Here f(t) = a + bt + ct^2 ∈ V and f′(t) denotes the derivative of f(t). Find the basis {f_1(t), f_2(t), f_3(t)} of V that is dual to {φ_1, φ_2, φ_3}.
11.20. Suppose u; v 2 V and that fðuÞ ¼ 0 implies fðvÞ ¼ 0 for all f 2 V*. Show that v ¼ ku for some scalar k.
11.21. Suppose f; s 2 V* and that fðvÞ ¼ 0 implies sðvÞ ¼ 0 for all v 2 V. Show that s ¼ kf for some scalar k.
11.22. Let V be the vector space of polynomials over K. For a 2 K, define fa :V ! K by fað f ðtÞÞ ¼ f ðaÞ. Show
that (a) fa is linear; (b) if a 6¼ b, then fa 6¼ fb.
11.23. Let V be the vector space of polynomials of degree 2. Let a; b; c 2 K be distinct scalars. Let fa; fb; fc
be the linear functionals defined by fað f ðtÞÞ ¼ f ðaÞ, fbð f ðtÞÞ ¼ f ðbÞ, fcð f ðtÞÞ ¼ f ðcÞ. Show that
ffa; fb; fcg is linearly independent, and find the basis f f1ðtÞ; f2ðtÞ; f3ðtÞg of V that is its dual.
11.24. Let V be the vector space of square matrices of order n. Let T :V ! K be the trace mapping; that is,
TðAÞ ¼ a11 þ a22 þ    þ ann, where A ¼ ðaijÞ. Show that T is linear.
11.25. Let W be a subspace of V. For any linear functional f on W, show that there is a linear functional s on V
such that sðwÞ ¼ fðwÞ for any w 2 W; that is, f is the restriction of s to W.
11.26. Let fe1; . . . ; eng be the usual basis of Kn
. Show that the dual basis is fp1; . . . ; png where pi is the ith
projection mapping; that is, piða1; . . . ; anÞ ¼ ai.
11.27. Let V be a vector space over R. Let f1; f2 2 V* and suppose s:V ! R; defined by sðvÞ ¼ f1ðvÞf2ðvÞ;
also belongs to V*. Show that either f1 ¼ 0 or f2 ¼ 0.
Annihilators
11.28. Let W be the subspace of R^4 spanned by (1, 2, −3, 4), (1, 3, −2, 6), (1, 4, −1, 8). Find a basis of the annihilator of W.
11.29. Let W be the subspace of R3
spanned by ð1; 1; 0Þ and ð0; 1; 1Þ. Find a basis of the annihilator of W.
11.30. Show that, for any subset S of V; spanðSÞ ¼ S00
, where spanðSÞ is the linear span of S.
11.31. Let U and W be subspaces of a vector space V of finite dimension. Prove that (U ∩ W)^0 = U^0 + W^0.
11.32. Suppose V = U ⊕ W. Prove that V* = U^0 ⊕ W^0.
Transpose of a Linear Mapping
11.33. Let f be the linear functional on R2
defined by fðx; yÞ ¼ 3x  2y. For each of the following linear
mappings T :R3
! R2
, find ðTt
ðfÞÞðx; y; zÞ:
(a) Tðx; y; zÞ ¼ ðx þ y; y þ zÞ, (b) Tðx; y; zÞ ¼ ðx þ y þ z; 2x  yÞ
11.34. Suppose T1 :U ! V and T2 :V ! W are linear. Prove that ðT2  T1Þt
¼ Tt
1
 Tt
2.
11.35. Suppose T :V ! U is linear and V has finite dimension. Prove that Im Tt
¼ ðKer TÞ0
.
11.36. Suppose T :V ! U is linear and u 2 U. Prove that u 2 Im T or there exists f 2 V* such that Tt
ðfÞ ¼ 0
and fðuÞ ¼ 1.
11.37. Let V be of finite dimension. Show that the mapping T 7! Tt
is an isomorphism from HomðV; VÞ onto
HomðV*; V*Þ. (Here T is any linear operator on V.)
Miscellaneous Problems
11.38. Let V be a vector space over R. The line segment uv joining points u, v ∈ V is defined by uv = {tu + (1 − t)v : 0 ≤ t ≤ 1}. A subset S of V is convex if u, v ∈ S implies uv ⊆ S. Let φ ∈ V*. Define
W^+ = {v ∈ V : φ(v) > 0},   W = {v ∈ V : φ(v) = 0},   W^- = {v ∈ V : φ(v) < 0}
Prove that W^+, W, and W^- are convex.
11.39. Let V be a vector space of finite dimension. A hyperplane H of V may be defined as the kernel of a nonzero
linear functional f on V. Show that every subspace of V is the intersection of a finite number of
hyperplanes.
ANSWERS TO SUPPLEMENTARY PROBLEMS
11.17. (a) 6x − 5y + 4z,  (b) 6x − 9y + 3z,  (c) −16x + 4y − 13z
11.18. (a) φ_1 = x, φ_2 = y, φ_3 = z;  (b) φ_1 = −3x − 5y − 2z, φ_2 = 2x + y, φ_3 = x + 2y + z
11.19. f_1(t) = 3t − (3/2)t^2,  f_2(t) = −(1/2)t + (3/4)t^2,  f_3(t) = 1 − 3t + (3/2)t^2
11.22. (b) Let f(t) = t. Then φ_a(f(t)) = a ≠ b = φ_b(f(t)); therefore, φ_a ≠ φ_b
11.23. f_1(t) = (t^2 − (b + c)t + bc)/((a − b)(a − c)),  f_2(t) = (t^2 − (a + c)t + ac)/((b − a)(b − c)),  f_3(t) = (t^2 − (a + b)t + ab)/((c − a)(c − b))
11.28. {φ_1(x, y, z, t) = 5x − y + z,  φ_2(x, y, z, t) = 2y − t}
11.29. {φ(x, y, z) = x − y + z}
11.33. (a) (T^t(φ))(x, y, z) = 3x + y − 2z,  (b) (T^t(φ))(x, y, z) = −x + 5y + 3z
CHAPTER 12
Bilinear, Quadratic,
and Hermitian Forms
12.1 Introduction
This chapter generalizes the notions of linear mappings and linear functionals. Specifically, we introduce
the notion of a bilinear form. These bilinear maps also give rise to quadratic and Hermitian forms.
Although quadratic forms were discussed previously, this chapter is treated independently of the previous
results.
Although the field K is arbitrary, we will later specialize to the cases K ¼ R and K ¼ C. Furthermore,
we may sometimes need to divide by 2. In such cases, we must assume that 1 þ 1 6¼ 0, which is true when
K ¼ R or K ¼ C.
12.2 Bilinear Forms
Let V be a vector space of finite dimension over a field K. A bilinear form on V is a mapping
f :V V ! K such that, for all a; b 2 K and all ui; vi 2 V:
(i) f ðau1 þ bu2; vÞ ¼ af ðu1; vÞ þ bf ðu2; vÞ,
(ii) f ðu; av1 þ bv2Þ ¼ af ðu; v1Þ þ bf ðu; v2Þ
We express condition (i) by saying f is linear in the first variable, and condition (ii) by saying f is linear
in the second variable.
EXAMPLE 12.1
(a) Let f be the dot product on R^n; that is, for u = (a_i) and v = (b_i),
f(u, v) = u · v = a_1b_1 + a_2b_2 + ... + a_nb_n
Then f is a bilinear form on R^n. (In fact, any inner product on a real vector space V is a bilinear form on V.)
(b) Let φ and σ be arbitrary linear functionals on V. Let f: V × V → K be defined by f(u, v) = φ(u)σ(v). Then f is a bilinear form, because φ and σ are each linear.
(c) Let A = [a_{ij}] be any n × n matrix over a field K. Then A may be identified with the following bilinear form f on K^n, where X = [x_i] and Y = [y_i] are column vectors of variables:
f(X, Y) = X^T AY = Σ_{i,j} a_{ij}x_iy_j = a_{11}x_1y_1 + a_{12}x_1y_2 + ... + a_{nn}x_ny_n
The above formal expression in the variables x_i, y_i is termed the bilinear polynomial corresponding to the matrix A. Equation (12.1) below shows that, in a certain sense, every bilinear form is of this type.
Space of Bilinear Forms
Let BðVÞ denote the set of all bilinear forms on V. A vector space structure is placed on BðVÞ, where for
any f ; g 2 BðVÞ and any k 2 K, we define f þ g and kf as follows:
ð f þ gÞðu; vÞ ¼ f ðu; vÞ þ gðu; vÞ and ðkf Þðu; vÞ ¼ kf ðu; vÞ
The following theorem (proved in Problem 12.4) applies.
THEOREM 12.1: Let V be a vector space of dimension n over K. Let ff1; . . . ; fng be any basis of the
dual space V*. Then f fij : i; j ¼ 1; . . . ; ng is a basis of BðVÞ, where fij is defined by
fijðu; vÞ ¼ fiðuÞfjðvÞ. Thus, in particular, dim BðVÞ ¼ n2
.
12.3 Bilinear Forms and Matrices
Let f be a bilinear form on V and let S ¼ fu1; . . . ; ung be a basis of V. Suppose u; v 2 V and
u = a_1u_1 + ... + a_nu_n   and   v = b_1u_1 + ... + b_nu_n
Then
f(u, v) = f(a_1u_1 + ... + a_nu_n, b_1u_1 + ... + b_nu_n) = Σ_{i,j} a_ib_j f(u_i, u_j)
Thus, f is completely determined by the n^2 values f(u_i, u_j).
The matrix A = [a_{ij}] where a_{ij} = f(u_i, u_j) is called the matrix representation of f relative to the basis S or, simply, the "matrix of f in S." It "represents" f in the sense that, for all u, v ∈ V,
f(u, v) = Σ_{i,j} a_ib_j f(u_i, u_j) = [u]_S^T A [v]_S     (12.1)
[As usual, [u]_S denotes the coordinate (column) vector of u in the basis S.]
Change of Basis, Congruent Matrices
We now ask, how does a matrix representing a bilinear form transform when a new basis is selected? The
answer is given in the following theorem (proved in Problem 12.5).
THEOREM 12.2: Let P be a change-of-basis matrix from one basis S to another basis S0
. If A is the
matrix representing a bilinear form f in the original basis S, then B ¼ PT
AP is the
matrix representing f in the new basis S0
.
The above theorem motivates the following definition.
DEFINITION: A matrix B is congruent to a matrix A, written B ’ A, if there exists a nonsingular
matrix P such that B ¼ PT
AP.
Thus, by Theorem 12.2, matrices representing the same bilinear form are congruent. We remark that
congruent matrices have the same rank, because P and PT
are nonsingular; hence, the following definition
is well defined.
DEFINITION: The rank of a bilinear form f on V, written rank(f), is the rank of any matrix representation of f. We say f is degenerate or nondegenerate according to whether rank(f) < dim V or rank(f) = dim V.
12.4 Alternating Bilinear Forms
Let f be a bilinear form on V. Then f is called
(i) alternating if f(v, v) = 0 for every v ∈ V;
(ii) skew-symmetric if f(u, v) = −f(v, u) for every u, v ∈ V.
Now suppose (i) is true. Then (ii) is true, because, for any u, v ∈ V,
0 = f(u + v, u + v) = f(u, u) + f(u, v) + f(v, u) + f(v, v) = f(u, v) + f(v, u)
On the other hand, suppose (ii) is true and also 1 + 1 ≠ 0. Then (i) is true, because, for every v ∈ V, we have f(v, v) = −f(v, v). In other words, alternating and skew-symmetric are equivalent when 1 + 1 ≠ 0.
The main structure theorem of alternating bilinear forms (proved in Problem 12.23) is as follows.
THEOREM 12.3: Let f be an alternating bilinear form on V. Then there exists a basis of V in which f is represented by a block diagonal matrix M of the form
M = diag( [0 1; −1 0], [0 1; −1 0], ..., [0 1; −1 0], [0], [0], ..., [0] )
Moreover, the number of nonzero blocks is uniquely determined by f [because it is equal to ½ rank(f)].
In particular, the above theorem shows that any alternating bilinear form must have even rank.
12.5 Symmetric Bilinear Forms, Quadratic Forms
This section investigates the important notions of symmetric bilinear forms and quadratic forms and their
representation by means of symmetric matrices. The only restriction on the field K is that 1 þ 1 6¼ 0. In
Section 12.6, we will restrict K to be the real field R, which yields important special results.
Symmetric Bilinear Forms
Let f be a bilinear form on V. Then f is said to be symmetric if, for every u; v 2 V,
f ðu; vÞ ¼ f ðv; uÞ
One can easily show that f is symmetric if and only if any matrix representation A of f is a symmetric
matrix.
The main result for symmetric bilinear forms (proved in Problem 12.10) is as follows. (We emphasize
that we are assuming that 1 þ 1 6¼ 0.)
THEOREM 12.4: Let f be a symmetric bilinear form on V. Then V has a basis fv1; . . . ; vng in which f
is represented by a diagonal matrix—that is, where f ðvi; vjÞ ¼ 0 for i 6¼ j.
THEOREM 12.4: (Alternative Form) Let A be a symmetric matrix over K. Then A is congruent to a
diagonal matrix; that is, there exists a nonsingular matrix P such that PT
AP is
diagonal.
Diagonalization Algorithm
Recall that a nonsingular matrix P is a product of elementary matrices. Accordingly, one way of
obtaining the diagonal form D ¼ PT
AP is by a sequence of elementary row operations and the same
sequence of elementary column operations. This same sequence of elementary row operations on the
identity matrix I will yield PT
. This algorithm is formalized below.
ALGORITHM 12.1: (Congruence Diagonalization of a Symmetric Matrix) The input is a symmetric matrix A = [a_{ij}] of order n.
Step 1. Form the n × 2n (block) matrix M = [A_1, I], where A_1 = A is the left half of M and the identity matrix I is the right half of M.
Step 2. Examine the entry a_{11}. There are three cases.
Case I: a_{11} ≠ 0. (Use a_{11} as a pivot to put 0's below a_{11} in M and to the right of a_{11} in A_1.)
For i = 2, ..., n:
(a) Apply the row operation "Replace R_i by −a_{i1}R_1 + a_{11}R_i."
(b) Apply the corresponding column operation "Replace C_i by −a_{i1}C_1 + a_{11}C_i."
These operations reduce the matrix M to the form
M ≃ [ a_{11}  0  *  * ;  0  A_2  *  * ]     (*)
Case II: a_{11} = 0 but a_{kk} ≠ 0, for some k > 1.
(a) Apply the row operation "Interchange R_1 and R_k."
(b) Apply the corresponding column operation "Interchange C_1 and C_k."
(These operations bring a_{kk} into the first diagonal position, which reduces the matrix to Case I.)
Case III: All diagonal entries a_{ii} = 0 but some a_{ij} ≠ 0.
(a) Apply the row operation "Replace R_i by R_j + R_i."
(b) Apply the corresponding column operation "Replace C_i by C_j + C_i."
(These operations bring 2a_{ij} into the ith diagonal position, which reduces the matrix to Case II.)
Thus, M is finally reduced to the form (*), where A_2 is a symmetric matrix of order less than that of A.
Step 3. Repeat Step 2 with each new matrix A_k (by neglecting the first row and column of the preceding matrix) until A is diagonalized. Then M is transformed into the form M′ = [D, Q], where D is diagonal.
Step 4. Set P = Q^T. Then D = P^T AP.
Remark 1: We emphasize that in Step 2, the row operations will change both sides of M, but the column operations will only change the left half of M.
Remark 2: The condition 1 + 1 ≠ 0 is used in Case III, where we assume that 2a_{ij} ≠ 0 when a_{ij} ≠ 0.
The justification for the above algorithm appears in Problem 12.9.
EXAMPLE 12.2 Let A = [1 2 −3; 2 5 −4; −3 −4 8] (rows separated by semicolons). Apply Algorithm 12.1 to find a nonsingular matrix P such that D = P^T AP is diagonal.
First form the block matrix M = [A, I]; that is, let
M = [A, I] = [ 1 2 −3 | 1 0 0 ;  2 5 −4 | 0 1 0 ;  −3 −4 8 | 0 0 1 ]
Apply the row operations "Replace R_2 by −2R_1 + R_2" and "Replace R_3 by 3R_1 + R_3" to M, and then apply the corresponding column operations "Replace C_2 by −2C_1 + C_2" and "Replace C_3 by 3C_1 + C_3" to obtain
[ 1 2 −3 | 1 0 0 ;  0 1 2 | −2 1 0 ;  0 2 −1 | 3 0 1 ]   and then   [ 1 0 0 | 1 0 0 ;  0 1 2 | −2 1 0 ;  0 2 −1 | 3 0 1 ]
Next apply the row operation "Replace R_3 by −2R_2 + R_3" and then the corresponding column operation "Replace C_3 by −2C_2 + C_3" to obtain
[ 1 0 0 | 1 0 0 ;  0 1 2 | −2 1 0 ;  0 0 −5 | 7 −2 1 ]   and then   [ 1 0 0 | 1 0 0 ;  0 1 0 | −2 1 0 ;  0 0 −5 | 7 −2 1 ]
Now A has been diagonalized. Set
P = [ 1 −2 7 ;  0 1 −2 ;  0 0 1 ]   and then   D = P^T AP = [ 1 0 0 ;  0 1 0 ;  0 0 −5 ]
We emphasize that P is the transpose of the right half of the final matrix.
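Algorithm 12.1 is easy to mechanize. The sketch below is an added illustration for this transcription (SymPy assumed); it handles only Case I—that is, it assumes every pivot it meets is nonzero, which suffices for the matrix of Example 12.2—and uses the equivalent scaled operation R_i ← R_i − (a_{i1}/a_{11})R_1 in place of R_i ← −a_{i1}R_1 + a_{11}R_i.

```python
import sympy as sp

def congruence_diagonalize(A):
    """Return (D, P) with D = P.T * A * P diagonal.
    Minimal sketch: only Case I of Algorithm 12.1 (all pivots encountered are nonzero)."""
    n = A.shape[0]
    M = A.as_mutable()
    Q = sp.eye(n)                               # accumulates the row operations, so P = Q.T
    for k in range(n):
        pivot = M[k, k]
        if pivot == 0:
            raise ValueError("zero pivot: Cases II/III are not handled in this sketch")
        for i in range(k + 1, n):
            factor = M[i, k] / pivot
            M[i, :] = M[i, :] - factor * M[k, :]    # row operation (changes both "halves")
            Q[i, :] = Q[i, :] - factor * Q[k, :]
            M[:, i] = M[:, i] - factor * M[:, k]    # corresponding column operation
    return M, Q.T

A = sp.Matrix([[1, 2, -3], [2, 5, -4], [-3, -4, 8]])
D, P = congruence_diagonalize(A)
assert P.T * A * P == D                         # D = diag(1, 1, -5), P as in Example 12.2
```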
Quadratic Forms
We begin with a definition.
DEFINITION A: A mapping q:V ! K is a quadratic form if qðvÞ ¼ f ðv; vÞ for some symmetric
bilinear form f on V.
If 1 + 1 ≠ 0 in K, then the bilinear form f can be obtained from the quadratic form q by the following polar form of f:
f(u, v) = ½[q(u + v) − q(u) − q(v)]
Now suppose f is represented by a symmetric matrix A = [a_{ij}], and 1 + 1 ≠ 0. Letting X = [x_i] denote a column vector of variables, q can be represented in the form
q(X) = f(X, X) = X^T AX = Σ_{i,j} a_{ij}x_ix_j = Σ_i a_{ii}x_i^2 + 2 Σ_{i<j} a_{ij}x_ix_j
The above formal expression in the variables x_i is also called a quadratic form. Namely, we have the following second definition.
DEFINITION B: A quadratic form q in variables x_1, x_2, ..., x_n is a polynomial such that every term has degree two; that is,
q(x_1, x_2, ..., x_n) = Σ_i c_ix_i^2 + Σ_{i<j} d_{ij}x_ix_j
Using 1 + 1 ≠ 0, the quadratic form q in Definition B determines a symmetric matrix A = [a_{ij}] where a_{ii} = c_i and a_{ij} = a_{ji} = ½d_{ij}. Thus, Definitions A and B are essentially the same.
If the matrix representation A of q is diagonal, then q has the diagonal representation
q(X) = X^T AX = a_{11}x_1^2 + a_{22}x_2^2 + ... + a_{nn}x_n^2
That is, the quadratic polynomial representing q will contain no "cross product" terms. Moreover, by Theorem 12.4, every quadratic form has such a representation (when 1 + 1 ≠ 0).
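The passage between Definitions A and B is purely mechanical. The sketch below is an added illustration (SymPy assumed, sample polynomial chosen arbitrarily): it builds the symmetric matrix of q(x_1, x_2, x_3) = x_1^2 + 4x_1x_2 − 6x_2x_3 + 5x_3^2 and recovers q as X^T AX.

```python
import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')
q = x1**2 + 4*x1*x2 - 6*x2*x3 + 5*x3**2          # sample quadratic form (Definition B)

xs = [x1, x2, x3]
A = sp.zeros(3, 3)
for i in range(3):
    A[i, i] = q.coeff(xs[i], 2)                  # a_ii = c_i
    for j in range(i + 1, 3):
        d = q.coeff(xs[i], 1).coeff(xs[j], 1)    # coefficient d_ij of x_i x_j
        A[i, j] = A[j, i] = sp.Rational(1, 2) * d

X = sp.Matrix(xs)
assert sp.expand((X.T * A * X)[0] - q) == 0      # X^T A X reproduces q (Definition A)
```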
12.6 Real Symmetric Bilinear Forms, Law of Inertia
This section treats symmetric bilinear forms and quadratic forms on vector spaces V over the real field R.
The special nature of R permits an independent theory. The main result (proved in Problem 12.14) is as
follows.
THEOREM 12.5: Let f be a symmetric form on V over R. Then there exists a basis of V in which f is
represented by a diagonal matrix. Every other diagonal matrix representation of f has
the same number p of positive entries and the same number n of negative entries.
The above result is sometimes called the Law of Inertia or Sylvester's Theorem. The rank and signature of the symmetric bilinear form f are denoted and defined by
rank(f) = p + n   and   sig(f) = p − n
These are uniquely defined by Theorem 12.5.
A real symmetric bilinear form f is said to be
(i) positive definite if q(v) = f(v, v) > 0 for every v ≠ 0,
(ii) nonnegative semidefinite if q(v) = f(v, v) ≥ 0 for every v.
EXAMPLE 12.3 Let f be the dot product on R^n. Recall that f is a symmetric bilinear form on R^n. We note that f is also positive definite. That is, for any u = (a_i) ≠ 0 in R^n,
f(u, u) = a_1^2 + a_2^2 + ... + a_n^2 > 0
Section 12.5 and Chapter 13 tell us how to diagonalize a real quadratic form q or, equivalently, a real symmetric matrix A, by means of an orthogonal transition matrix P. If P is merely nonsingular, then q can be represented in diagonal form with only 1's and −1's as nonzero coefficients. Namely, we have the following corollary.
COROLLARY 12.6: Any real quadratic form q has a unique representation in the form
q(x_1, x_2, ..., x_n) = x_1^2 + ... + x_p^2 − x_{p+1}^2 − ... − x_r^2
where r = p + n is the rank of the form.
COROLLARY 12.6: (Alternative Form) Any real symmetric matrix A is congruent to the unique diagonal matrix
D = diag(I_p, −I_n, 0)
where r = p + n is the rank of A.
12.7 Hermitian Forms
Let V be a vector space of finite dimension over the complex field C. A Hermitian form on V is a mapping f: V × V → C such that, for all a, b ∈ C and all u_i, v ∈ V,
(i) f(au_1 + bu_2, v) = af(u_1, v) + bf(u_2, v),
(ii) f(u, v) = (f(v, u))‾.
(As usual, k̄ denotes the complex conjugate of k ∈ C.)
Using (i) and (ii), we get
f(u, av_1 + bv_2) = (f(av_1 + bv_2, u))‾ = (af(v_1, u) + bf(v_2, u))‾
= ā(f(v_1, u))‾ + b̄(f(v_2, u))‾ = āf(u, v_1) + b̄f(u, v_2)
That is,
(iii) f(u, av_1 + bv_2) = āf(u, v_1) + b̄f(u, v_2).
As before, we express condition (i) by saying f is linear in the first variable. On the other hand, we express condition (iii) by saying f is "conjugate linear" in the second variable. Moreover, condition (ii) tells us that f(v, v) = (f(v, v))‾, and hence, f(v, v) is real for every v ∈ V.
The results of Sections 12.5 and 12.6 for symmetric forms have their analogues for Hermitian forms. Thus, the mapping q: V → R, defined by q(v) = f(v, v), is called the Hermitian quadratic form or complex quadratic form associated with the Hermitian form f. We can obtain f from q by the polar form
f(u, v) = ¼[q(u + v) − q(u − v)] + (i/4)[q(u + iv) − q(u − iv)]
Now suppose S = {u1, ..., un} is a basis of V. The matrix H = [hij], where hij = f(ui, uj), is called the
matrix representation of f in the basis S. By (ii), f(ui, uj) = \overline{f(uj, ui)}; hence, H is Hermitian and, in
particular, the diagonal entries of H are real. Thus, any diagonal representation of f contains only real
entries.
The next theorem (to be proved in Problem 12.47) is the complex analog of Theorem 12.5 on real
symmetric bilinear forms.
THEOREM 12.7: Let f be a Hermitian form on V over C. Then there exists a basis of V in which f is
represented by a diagonal matrix. Every other diagonal matrix representation of f
has the same number p of positive entries and the same number n of negative
entries.
Again the rank and signature of the Hermitian form f are denoted and defined by
rank(f) = p + n and sig(f) = p − n
These are uniquely defined by Theorem 12.7.
Analogously, a Hermitian form f is said to be
(i) positive definite if q(v) = f(v, v) > 0 for every v ≠ 0,
(ii) nonnegative semidefinite if q(v) = f(v, v) ≥ 0 for every v.
EXAMPLE 12.4 Let f be the dot product on C^n; that is, for any u = (zi) and v = (wi) in C^n,
f(u, v) = u · v = z1\overline{w1} + z2\overline{w2} + ··· + zn\overline{wn}
Then f is a Hermitian form on C^n. Moreover, f is also positive definite, because, for any u = (zi) ≠ 0 in C^n,
f(u, u) = z1\overline{z1} + z2\overline{z2} + ··· + zn\overline{zn} = |z1|^2 + |z2|^2 + ··· + |zn|^2 > 0
SOLVED PROBLEMS
Bilinear Forms
12.1. Let u = (x1, x2, x3) and v = (y1, y2, y3). Express f in matrix notation, where
f(u, v) = 3x1y1 − 2x1y3 + 5x2y1 + 7x2y2 − 8x2y3 + 4x3y2 − 6x3y3
Let A = [aij], where aij is the coefficient of xiyj. Then
f(u, v) = X^T AY = [x1, x2, x3] [3, 0, −2; 5, 7, −8; 0, 4, −6] [y1; y2; y3]
(Here, as in the answers at the end of the chapter, [R1; R2; ...] denotes the matrix with rows R1, R2, ....)
12.2. Let A be an n × n matrix over K. Show that the mapping f defined by f(X, Y) = X^T AY is a
bilinear form on K^n.
For any a, b ∈ K and any Xi, Yi ∈ K^n,
f(aX1 + bX2, Y) = (aX1 + bX2)^T AY = (aX1^T + bX2^T)AY = aX1^T AY + bX2^T AY = a f(X1, Y) + b f(X2, Y)
Hence, f is linear in the first variable. Also,
f(X, aY1 + bY2) = X^T A(aY1 + bY2) = aX^T AY1 + bX^T AY2 = a f(X, Y1) + b f(X, Y2)
Hence, f is linear in the second variable, and so f is a bilinear form on K^n.
12.3. Let f be the bilinear form on R^2 defined by
f[(x1, x2), (y1, y2)] = 2x1y1 − 3x1y2 + 4x2y2
(a) Find the matrix A of f in the basis {u1 = (1, 0), u2 = (1, 1)}.
(b) Find the matrix B of f in the basis {v1 = (2, 1), v2 = (1, −1)}.
(c) Find the change-of-basis matrix P from the basis {ui} to the basis {vi}, and verify that B = P^T AP.
(a) Set A = [aij], where aij = f(ui, uj). This yields
a11 = f[(1, 0), (1, 0)] = 2 − 0 − 0 = 2, a21 = f[(1, 1), (1, 0)] = 2 − 0 + 0 = 2
a12 = f[(1, 0), (1, 1)] = 2 − 3 − 0 = −1, a22 = f[(1, 1), (1, 1)] = 2 − 3 + 4 = 3
Thus, A = [2, −1; 2, 3] is the matrix of f in the basis {u1, u2}.
(b) Set B = [bij], where bij = f(vi, vj). This yields
b11 = f[(2, 1), (2, 1)] = 8 − 6 + 4 = 6, b21 = f[(1, −1), (2, 1)] = 4 − 3 − 4 = −3
b12 = f[(2, 1), (1, −1)] = 4 + 6 − 4 = 6, b22 = f[(1, −1), (1, −1)] = 2 + 3 + 4 = 9
Thus, B = [6, 6; −3, 9] is the matrix of f in the basis {v1, v2}.
(c) Writing v1 and v2 in terms of the ui yields v1 = u1 + u2 and v2 = 2u1 − u2. Then
P = [1, 2; 1, −1], P^T = [1, 1; 2, −1]
and
P^T AP = [1, 1; 2, −1] [2, −1; 2, 3] [1, 2; 1, −1] = [6, 6; −3, 9] = B
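A two-line NumPy check (the matrices are exactly those computed above) confirms the congruence relation B = P^T AP:

    import numpy as np

    A = np.array([[2, -1], [2, 3]])      # matrix of f in the basis {u1, u2}
    B = np.array([[6, 6], [-3, 9]])      # matrix of f in the basis {v1, v2}
    P = np.array([[1, 2], [1, -1]])      # change-of-basis matrix from {ui} to {vi}
    print(np.array_equal(P.T @ A @ P, B))   # True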
12.4. Prove Theorem 12.1: Let V be an n-dimensional vector space over K. Let {φ1, ..., φn} be any
basis of the dual space V*. Then {fij : i, j = 1, ..., n} is a basis of B(V), where fij is defined by
fij(u, v) = φi(u)φj(v). Thus, dim B(V) = n^2.
Let {u1, ..., un} be the basis of V dual to {φi}. We first show that {fij} spans B(V). Let f ∈ B(V) and
suppose f(ui, uj) = aij. We claim that f = Σ_{i,j} aij fij. It suffices to show that
f(us, ut) = (Σ aij fij)(us, ut) for s, t = 1, ..., n
We have
(Σ aij fij)(us, ut) = Σ aij fij(us, ut) = Σ aij φi(us)φj(ut) = Σ aij δis δjt = ast = f(us, ut)
as required. Hence, {fij} spans B(V). Next, suppose Σ aij fij = 0. Then for s, t = 1, ..., n,
0 = 0(us, ut) = (Σ aij fij)(us, ut) = ast
The last step follows as above. Thus, {fij} is independent, and hence is a basis of B(V).
12.5. Prove Theorem 12.2: Let P be the change-of-basis matrix from a basis S to a basis S′. Let A be
the matrix representing a bilinear form f in the basis S. Then B = P^T AP is the matrix representing
f in the basis S′.
Let u, v ∈ V. Because P is the change-of-basis matrix from S to S′, we have P[u]_{S′} = [u]_S and also
P[v]_{S′} = [v]_S; hence, [u]_S^T = [u]_{S′}^T P^T. Thus,
f(u, v) = [u]_S^T A[v]_S = [u]_{S′}^T P^T AP[v]_{S′}
Because u and v are arbitrary elements of V, P^T AP is the matrix of f in the basis S′.
Symmetric Bilinear Forms, Quadratic Forms
12.6. Find the symmetric matrix that corresponds to each of the following quadratic forms:
(a) q(x, y, z) = 3x^2 + 4xy − y^2 + 8xz − 6yz + z^2,
(b) q′(x, y, z) = 3x^2 + xz − 2yz, (c) q″(x, y, z) = 2x^2 − 5y^2 − 7z^2
The symmetric matrix A = [aij] that represents q(x1, ..., xn) has the diagonal entry aii equal to the
coefficient of the square term xi^2 and the nondiagonal entries aij and aji each equal to half of the coefficient
of the cross-product term xixj. Thus,
(a) A = [3, 2, 4; 2, −1, −3; 4, −3, 1], (b) A′ = [3, 0, 1/2; 0, 0, −1; 1/2, −1, 0], (c) A″ = [2, 0, 0; 0, −5, 0; 0, 0, −7]
The third matrix A″ is diagonal, because the quadratic form q″ is diagonal; that is, q″ has no cross-product
terms.
12.7. Find the quadratic form q(X) that corresponds to each of the following symmetric matrices:
(a) A = [5, −3; −3, 8], (b) B = [4, −5, 7; −5, −6, 8; 7, 8, −9],
(c) C = [2, 4, −1, 5; 4, −7, −6, 8; −1, −6, 3, 9; 5, 8, 9, 1]
The quadratic form q(X) that corresponds to a symmetric matrix M is defined by q(X) = X^T MX,
where X = [xi] is the column vector of unknowns.
(a) Compute as follows:
q(x, y) = X^T AX = [x, y] [5, −3; −3, 8] [x; y] = [5x − 3y, −3x + 8y] [x; y]
= 5x^2 − 3xy − 3xy + 8y^2 = 5x^2 − 6xy + 8y^2
As expected, the coefficient 5 of the square term x^2 and the coefficient 8 of the square term y^2 are
the diagonal elements of A, and the coefficient −6 of the cross-product term xy is the sum of
the nondiagonal elements −3 and −3 of A (or twice the nondiagonal element −3, because A is
symmetric).
(b) Because B is a three-square matrix, there are three unknowns, say x, y, z or x1, x2, x3. Then
q(x, y, z) = 4x^2 − 10xy − 6y^2 + 14xz + 16yz − 9z^2
or q(x1, x2, x3) = 4x1^2 − 10x1x2 − 6x2^2 + 14x1x3 + 16x2x3 − 9x3^2
Here we use the fact that the coefficients of the square terms x1^2, x2^2, x3^2 (or x^2, y^2, z^2) are the respective
diagonal elements 4, −6, −9 of B, and the coefficient of the cross-product term xixj is the sum of the
nondiagonal elements bij and bji (or twice bij, because bij = bji).
(c) Because C is a four-square matrix, there are four unknowns. Hence,
q(x1, x2, x3, x4) = 2x1^2 − 7x2^2 + 3x3^2 + x4^2 + 8x1x2 − 2x1x3 + 10x1x4 − 12x2x3 + 16x2x4 + 18x3x4
12.8. Let A = [1, 3, −2; 3, 7, −5; −2, −5, 8]. Apply Algorithm 12.1 to find a nonsingular matrix P such that
D = P^T AP is diagonal, and find sig(A), the signature of A.
First form the block matrix M = [A, I]:
M = [A, I] = [1, 3, −2, 1, 0, 0; 3, 7, −5, 0, 1, 0; −2, −5, 8, 0, 0, 1]
Using a11 = 1 as a pivot, apply the row operations "Replace R2 by −3R1 + R2" and "Replace R3 by
2R1 + R3" to M and then apply the corresponding column operations "Replace C2 by −3C1 + C2" and
"Replace C3 by 2C1 + C3" to A to obtain
[1, 3, −2, 1, 0, 0; 0, −2, 1, −3, 1, 0; 0, 1, 4, 2, 0, 1] and then [1, 0, 0, 1, 0, 0; 0, −2, 1, −3, 1, 0; 0, 1, 4, 2, 0, 1]
Next apply the row operation "Replace R3 by R2 + 2R3" and then the corresponding column operation
"Replace C3 by C2 + 2C3" to obtain
[1, 0, 0, 1, 0, 0; 0, −2, 1, −3, 1, 0; 0, 0, 9, 1, 1, 2] and then [1, 0, 0, 1, 0, 0; 0, −2, 0, −3, 1, 0; 0, 0, 18, 1, 1, 2]
Now A has been diagonalized and the transpose of P is in the right half of M. Thus, set
P = [1, −3, 1; 0, 1, 1; 0, 0, 2] and then D = P^T AP = [1, 0, 0; 0, −2, 0; 0, 0, 18]
Note D has p = 2 positive and n = 1 negative diagonal elements. Thus, the signature of A is
sig(A) = p − n = 2 − 1 = 1.
12.9. Justify Algorithm 12.1, which diagonalizes (under congruence) a symmetric matrix A.
Consider the block matrix M = [A, I]. The algorithm applies a sequence of elementary row operations
and the corresponding column operations to the left side of M, which is the matrix A. This is equivalent to
premultiplying A by a sequence of elementary matrices, say, E1, E2, ..., Er, and postmultiplying A by the
transposes of the Ei. Thus, when the algorithm ends, the diagonal matrix D on the left side of M is equal to
D = Er ··· E2E1 A E1^T E2^T ··· Er^T = QAQ^T, where Q = Er ··· E2E1
On the other hand, the algorithm applies only the elementary row operations to the identity matrix I on the
right side of M. Thus, when the algorithm ends, the matrix on the right side of M is equal to
Er ··· E2E1 I = Er ··· E2E1 = Q
Setting P = Q^T, we get D = P^T AP, which is a diagonalization of A under congruence.
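The procedure just justified is easy to mechanize. The following Python/NumPy sketch implements the generic case of Algorithm 12.1 under the assumption that every pivot it meets is nonzero (as in Problem 12.8); it uses fractional multipliers, so the P and D it returns may differ from the hand computation above while still satisfying D = P^T AP with the same inertia.

    import numpy as np

    def congruence_diagonalize(A):
        # Simplified Algorithm 12.1: operate on the block matrix M = [A, I].
        n = A.shape[0]
        M = np.hstack([A.astype(float), np.eye(n)])
        for i in range(n):
            for j in range(i + 1, n):
                if M[i, i] == 0:
                    raise ValueError("zero pivot: the full algorithm needs an extra step here")
                c = -M[j, i] / M[i, i]
                M[j, :] += c * M[i, :]     # elementary row operation on all of [A, I]
                M[:, j] += c * M[:, i]     # corresponding column operation (left block only, since j < n)
        D = M[:, :n]                       # diagonalized left half
        P = M[:, n:].T                     # right half is Q, and P = Q^T
        return P, D

    A = np.array([[1, 3, -2], [3, 7, -5], [-2, -5, 8]])
    P, D = congruence_diagonalize(A)
    print(np.allclose(P.T @ A @ P, D))     # True; here D = diag(1, -2, 4.5), same signs as diag(1, -2, 18)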
12.10. Prove Theorem 12.4: Let f be a symmetric bilinear form on V over K (where 1 + 1 ≠ 0). Then
V has a basis in which f is represented by a diagonal matrix.
Algorithm 12.1 shows that every symmetric matrix over K is congruent to a diagonal matrix. This is
equivalent to the statement that f has a diagonal representation.
12.11. Let q be the quadratic form associated with the symmetric bilinear form f. Verify the polar
identity f(u, v) = (1/2)[q(u + v) − q(u) − q(v)]. (Assume that 1 + 1 ≠ 0.)
We have
q(u + v) − q(u) − q(v) = f(u + v, u + v) − f(u, u) − f(v, v)
= f(u, u) + f(u, v) + f(v, u) + f(v, v) − f(u, u) − f(v, v) = 2f(u, v)
If 1 + 1 ≠ 0, we can divide by 2 to obtain the required identity.
12.12. Consider the quadratic form q(x, y) = 3x^2 + 2xy − y^2 and the linear substitution
x = s − 3t, y = 2s + t
(a) Rewrite q(x, y) in matrix notation, and find the matrix A representing q(x, y).
(b) Rewrite the linear substitution using matrix notation, and find the matrix P corresponding to
the substitution.
(c) Find q(s, t) using direct substitution.
(d) Find q(s, t) using matrix notation.
(a) Here q(x, y) = [x, y] [3, 1; 1, −1] [x; y]. Thus, A = [3, 1; 1, −1], and q(X) = X^T AX, where X = [x, y]^T.
(b) Here [x; y] = [1, −3; 2, 1] [s; t]. Thus, P = [1, −3; 2, 1], and X = [x; y], Y = [s; t], and X = PY.
(c) Substitute for x and y in q to obtain
q(s, t) = 3(s − 3t)^2 + 2(s − 3t)(2s + t) − (2s + t)^2
= 3(s^2 − 6st + 9t^2) + 2(2s^2 − 5st − 3t^2) − (4s^2 + 4st + t^2) = 3s^2 − 32st + 20t^2
(d) Here q(X) = X^T AX and X = PY. Thus, X^T = Y^T P^T. Therefore,
q(s, t) = q(Y) = Y^T P^T APY = [s, t] [1, 2; −3, 1] [3, 1; 1, −1] [1, −3; 2, 1] [s; t]
= [s, t] [3, −16; −16, 20] [s; t] = 3s^2 − 32st + 20t^2
[As expected, the results in parts (c) and (d) are equal.]
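The matrix computation in part (d) can be reproduced in a few lines of NumPy (a small check, assuming NumPy is available):

    import numpy as np

    A = np.array([[3, 1], [1, -1]])     # matrix of q(x, y) = 3x^2 + 2xy - y^2
    P = np.array([[1, -3], [2, 1]])     # substitution x = s - 3t, y = 2s + t
    print(P.T @ A @ P)                  # [[3, -16], [-16, 20]], i.e., q(s, t) = 3s^2 - 32st + 20t^2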
12.13. Consider any diagonal matrix A = diag(a1, ..., an) over K. Show that for any nonzero scalars
k1, ..., kn ∈ K, A is congruent to a diagonal matrix D with diagonal entries a1k1^2, ..., ankn^2.
Furthermore, show that
(a) If K = C, then we can choose D so that its diagonal entries are only 1's and 0's.
(b) If K = R, then we can choose D so that its diagonal entries are only 1's, −1's, and 0's.
Let P = diag(k1, ..., kn). Then, as required,
D = P^T AP = diag(ki) diag(ai) diag(ki) = diag(a1k1^2, ..., ankn^2)
(a) Let P = diag(bi), where bi = 1/√ai if ai ≠ 0 and bi = 1 if ai = 0.
Then P^T AP has the required form.
(b) Let P = diag(bi), where bi = 1/√|ai| if ai ≠ 0 and bi = 1 if ai = 0.
Then P^T AP has the required form.
Remark: We emphasize that (b) is no longer true if "congruence" is replaced by
"Hermitian congruence."
12.14. Prove Theorem 12.5: Let f be a symmetric bilinear form on V over R. Then there exists a basis
of V in which f is represented by a diagonal matrix. Every other diagonal matrix representation
of f has the same number p of positive entries and the same number n of negative entries.
By Theorem 12.4, there is a basis {u1, ..., un} of V in which f is represented by a diagonal matrix
with, say, p positive and n negative entries. Now suppose {w1, ..., wn} is another basis of V, in which f is
represented by a diagonal matrix with p′ positive and n′ negative entries. We can assume without loss of
generality that the positive entries in each matrix appear first. Because rank(f) = p + n = p′ + n′, it
suffices to prove that p = p′.
Let U be the linear span of u1, ..., up and let W be the linear span of w_{p′+1}, ..., wn. Then f(v, v) > 0
for every nonzero v ∈ U, and f(v, v) ≤ 0 for every nonzero v ∈ W. Hence, U ∩ W = {0}. Note that
dim U = p and dim W = n − p′. Thus,
dim(U + W) = dim U + dim W − dim(U ∩ W) = p + (n − p′) − 0 = p − p′ + n
But dim(U + W) ≤ dim V = n; hence, p − p′ + n ≤ n or p ≤ p′. Similarly, p′ ≤ p and therefore p = p′,
as required.
Remark: The above theorem and proof depend only on the concept of positivity. Thus, the
theorem is true for any subfield K of the real field R such as the rational field Q.
Positive Definite Real Quadratic Forms
12.15. Prove that the following definitions of a positive definite quadratic form q are equivalent:
(a) The diagonal entries are all positive in any diagonal representation of q.
(b) q(Y) > 0 for any nonzero vector Y in R^n.
Suppose q(Y) = a1y1^2 + a2y2^2 + ··· + anyn^2. If all the coefficients are positive, then clearly q(Y) > 0
whenever Y ≠ 0. Thus, (a) implies (b). Conversely, suppose (a) is not true; that is, suppose some diagonal
entry ak ≤ 0. Let ek = (0, ..., 1, ..., 0) be the vector whose entries are all 0 except for a 1 in the kth position.
Then q(ek) = ak is not positive, and so (b) is not true. That is, (b) implies (a). Accordingly, (a) and (b) are
equivalent.
12.16. Determine whether each of the following quadratic forms q is positive definite:
(a) q(x, y, z) = x^2 + 2y^2 − 4xz − 4yz + 7z^2
(b) q(x, y, z) = x^2 + y^2 + 2xz + 4yz + 3z^2
Diagonalize (under congruence) the symmetric matrix A corresponding to q.
(a) Apply the operations "Replace R3 by 2R1 + R3" and "Replace C3 by 2C1 + C3," and then "Replace
R3 by R2 + R3" and "Replace C3 by C2 + C3." These yield
A = [1, 0, −2; 0, 2, −2; −2, −2, 7] ≃ [1, 0, 0; 0, 2, −2; 0, −2, 3] ≃ [1, 0, 0; 0, 2, 0; 0, 0, 1]
The diagonal representation of q contains only positive entries, 1, 2, 1, on the diagonal. Thus, q is
positive definite.
(b) We have
A = [1, 0, 1; 0, 1, 2; 1, 2, 3] ≃ [1, 0, 0; 0, 1, 2; 0, 2, 2] ≃ [1, 0, 0; 0, 1, 0; 0, 0, −2]
There is a negative entry −2 in the diagonal representation of q. Thus, q is not positive definite.
12.17. Show that q(x, y) = ax^2 + bxy + cy^2 is positive definite if and only if a > 0 and the discriminant
D = b^2 − 4ac < 0.
Suppose v = (x, y) ≠ 0. Then either x ≠ 0 or y ≠ 0; say, y ≠ 0. Let t = x/y. Then
q(v) = y^2[a(x/y)^2 + b(x/y) + c] = y^2(at^2 + bt + c)
However, the following are equivalent:
(i) s = at^2 + bt + c is positive for every value of t.
(ii) s = at^2 + bt + c lies above the t-axis.
(iii) a > 0 and D = b^2 − 4ac < 0.
Thus, q is positive definite if and only if a > 0 and D < 0. [Remark: D < 0 is the same as det(A) > 0,
where A is the symmetric matrix corresponding to q.]
12.18. Determine whether or not each of the following quadratic forms q is positive definite:
(a) q(x, y) = x^2 − 4xy + 7y^2, (b) q(x, y) = x^2 + 8xy + 5y^2, (c) q(x, y) = 3x^2 + 2xy + y^2
Compute the discriminant D = b^2 − 4ac, and then use Problem 12.17.
(a) D = 16 − 28 = −12. Because a = 1 > 0 and D < 0, q is positive definite.
(b) D = 64 − 20 = 44. Because D > 0, q is not positive definite.
(c) D = 4 − 12 = −8. Because a = 3 > 0 and D < 0, q is positive definite.
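The test of Problem 12.17 translates directly into a small helper function; the sketch below is an illustration in Python, and the function name is an arbitrary choice.

    def is_positive_definite_2x2(a, b, c):
        # q(x, y) = a x^2 + b xy + c y^2 is positive definite iff a > 0 and b^2 - 4ac < 0.
        return a > 0 and b * b - 4 * a * c < 0

    print(is_positive_definite_2x2(1, -4, 7))   # True,  as in Problem 12.18(a)
    print(is_positive_definite_2x2(1, 8, 5))    # False, as in Problem 12.18(b)
    print(is_positive_definite_2x2(3, 2, 1))    # True,  as in Problem 12.18(c)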
Hermitian Forms
12.19. Determine whether the following matrices are Hermitian:
(a) [2, 2+3i, 4−5i; 2−3i, 5, 6+2i; 4+5i, 6−2i, 7], (b) [3, 2−i, 4+i; 2−i, 6, i; 4+i, i, 7], (c) [4, 3, 5; 3, 2, 1; 5, 1, 6]
A complex matrix A = [aij] is Hermitian if A* = A—that is, if aij = \overline{a_{ji}}.
(a) Yes, because it is equal to its conjugate transpose.
(b) No, even though it is symmetric.
(c) Yes. In fact, a real matrix is Hermitian if and only if it is symmetric.
12.20. Let A be a Hermitian matrix. Show that f is a Hermitian form on C^n, where f is defined by
f(X, Y) = X^T A\overline{Y}.
For all a, b ∈ C and all X1, X2, Y ∈ C^n,
f(aX1 + bX2, Y) = (aX1 + bX2)^T A\overline{Y} = (aX1^T + bX2^T)A\overline{Y} = aX1^T A\overline{Y} + bX2^T A\overline{Y} = a f(X1, Y) + b f(X2, Y)
Hence, f is linear in the first variable. Also,
\overline{f(X, Y)} = \overline{X^T A\overline{Y}} = \overline{(X^T A\overline{Y})^T} = \overline{\overline{Y}^T A^T X} = Y^T \overline{A^T} \overline{X} = Y^T A*\overline{X} = Y^T A\overline{X} = f(Y, X)
Hence, f is a Hermitian form on C^n.
Remark: We use the fact that X^T A\overline{Y} is a scalar and so it is equal to its transpose.
12.21. Let f be a Hermitian form on V. Let H be the matrix of f in a basis S = {ui} of V. Prove the
following:
(a) f(u, v) = [u]_S^T H \overline{[v]_S} for all u, v ∈ V.
(b) If P is the change-of-basis matrix from S to a new basis S′ of V, then B = P^T H\overline{P} (or
B = Q*HQ, where Q = \overline{P}) is the matrix of f in the new basis S′.
Note that (b) is the complex analog of Theorem 12.2.
(a) Let u, v ∈ V and suppose u = a1u1 + ··· + anun and v = b1u1 + ··· + bnun. Then, as required,
f(u, v) = f(a1u1 + ··· + anun, b1u1 + ··· + bnun) = Σ_{i,j} ai\overline{bj} f(ui, uj)
= [a1, ..., an] H [\overline{b1}, ..., \overline{bn}]^T = [u]_S^T H \overline{[v]_S}
(b) Because P is the change-of-basis matrix from S to S′, we have P[u]_{S′} = [u]_S and P[v]_{S′} = [v]_S; hence,
[u]_S^T = [u]_{S′}^T P^T and \overline{[v]_S} = \overline{P} \overline{[v]_{S′}}. Thus, by (a),
f(u, v) = [u]_S^T H \overline{[v]_S} = [u]_{S′}^T P^T H\overline{P} \overline{[v]_{S′}}
But u and v are arbitrary elements of V; hence, P^T H\overline{P} is the matrix of f in the basis S′.
12.22. Let H = [1, 1+i, 2i; 1−i, 4, 2−3i; −2i, 2+3i, 7], a Hermitian matrix.
Find a nonsingular matrix P such that D = P^T H\overline{P} is diagonal. Also, find the signature of H.
Use the modified Algorithm 12.1, which applies the same row operations but the corresponding conjugate
column operations. Thus, first form the block matrix M = [H, I]:
M = [1, 1+i, 2i, 1, 0, 0; 1−i, 4, 2−3i, 0, 1, 0; −2i, 2+3i, 7, 0, 0, 1]
Apply the row operations "Replace R2 by (−1+i)R1 + R2" and "Replace R3 by 2iR1 + R3" and then the
corresponding conjugate column operations "Replace C2 by (−1−i)C1 + C2" and "Replace C3 by
−2iC1 + C3" to obtain
[1, 1+i, 2i, 1, 0, 0; 0, 2, −5i, −1+i, 1, 0; 0, 5i, 3, 2i, 0, 1] and then [1, 0, 0, 1, 0, 0; 0, 2, −5i, −1+i, 1, 0; 0, 5i, 3, 2i, 0, 1]
Next apply the row operation "Replace R3 by −5iR2 + 2R3" and the corresponding conjugate column
operation "Replace C3 by 5iC2 + 2C3" to obtain
[1, 0, 0, 1, 0, 0; 0, 2, −5i, −1+i, 1, 0; 0, 0, −19, 5+9i, −5i, 2] and then [1, 0, 0, 1, 0, 0; 0, 2, 0, −1+i, 1, 0; 0, 0, −38, 5+9i, −5i, 2]
Now H has been diagonalized, and the transpose of the right half of M is P. Thus, set
P = [1, −1+i, 5+9i; 0, 1, −5i; 0, 0, 2], and then D = P^T H\overline{P} = [1, 0, 0; 0, 2, 0; 0, 0, −38]
Note D has p = 2 positive elements and n = 1 negative element. Thus, the signature of H is
sig(H) = 2 − 1 = 1.
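As a numerical cross-check (assuming NumPy), the matrices above do satisfy D = P^T H\overline{P}:

    import numpy as np

    H = np.array([[1, 1 + 1j, 2j], [1 - 1j, 4, 2 - 3j], [-2j, 2 + 3j, 7]])
    P = np.array([[1, -1 + 1j, 5 + 9j], [0, 1, -5j], [0, 0, 2]])
    D = P.T @ H @ P.conj()                          # D = P^T H conj(P)
    print(np.allclose(D, np.diag([1, 2, -38])))     # True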
Miscellaneous Problems
12.23. Prove Theorem 12.3: Let f be an alternating form on V. Then there exists a basis of V in which f
is represented by a block diagonal matrix M with blocks of the form [0, 1; −1, 0] or 0. The number
of nonzero blocks is uniquely determined by f [because it is equal to (1/2) rank(f)].
If f = 0, then the theorem is obviously true. Also, if dim V = 1, then f(k1u, k2u) = k1k2 f(u, u) = 0
and so f = 0. Accordingly, we can assume that dim V > 1 and f ≠ 0.
Because f ≠ 0, there exist (nonzero) u1, u2 ∈ V such that f(u1, u2) ≠ 0. In fact, multiplying u1 by
an appropriate factor, we can assume that f(u1, u2) = 1 and so f(u2, u1) = −1. Now u1 and u2 are
linearly independent; because if, say, u2 = ku1, then f(u1, u2) = f(u1, ku1) = k f(u1, u1) = 0. Let
U = span(u1, u2); then,
(i) The matrix representation of the restriction of f to U in the basis {u1, u2} is [0, 1; −1, 0],
(ii) If u ∈ U, say u = au1 + bu2, then
f(u, u1) = f(au1 + bu2, u1) = −b and f(u, u2) = f(au1 + bu2, u2) = a
Let W consist of those vectors w ∈ V such that f(w, u1) = 0 and f(w, u2) = 0. Equivalently,
W = {w ∈ V : f(w, u) = 0 for every u ∈ U}
We claim that V = U ⊕ W. It is clear that U ∩ W = {0}, and so it remains to show that V = U + W. Let
v ∈ V. Set
u = f(v, u2)u1 − f(v, u1)u2 and w = v − u     (1)
Because u is a linear combination of u1 and u2, u ∈ U.
We show next that w ∈ W. By (1) and (ii), f(u, u1) = f(v, u1); hence,
f(w, u1) = f(v − u, u1) = f(v, u1) − f(u, u1) = 0
Similarly, f(u, u2) = f(v, u2) and so
f(w, u2) = f(v − u, u2) = f(v, u2) − f(u, u2) = 0
Then w ∈ W and so, by (1), v = u + w, where u ∈ U. This shows that V = U + W; therefore, V = U ⊕ W.
Now the restriction of f to W is an alternating bilinear form on W. By induction, there exists a basis
u3, ..., un of W in which the matrix representing f restricted to W has the desired form. Accordingly,
u1, u2, u3, ..., un is a basis of V in which the matrix representing f has the desired form.
SUPPLEMENTARY PROBLEMS
Bilinear Forms
12.24. Let u = (x1, x2) and v = (y1, y2). Determine which of the following are bilinear forms on R^2:
(a) f(u, v) = 2x1y2 − 3x2y1, (c) f(u, v) = 3x2y2, (e) f(u, v) = 1,
(b) f(u, v) = x1 + y2, (d) f(u, v) = x1x2 + y1y2, (f) f(u, v) = 0
12.25. Let f be the bilinear form on R^2 defined by
f[(x1, x2), (y1, y2)] = 3x1y1 − 2x1y2 + 4x2y1 − x2y2
(a) Find the matrix A of f in the basis {u1 = (1, 1), u2 = (1, 2)}.
(b) Find the matrix B of f in the basis {v1 = (1, −1), v2 = (3, 1)}.
(c) Find the change-of-basis matrix P from {ui} to {vi}, and verify that B = P^T AP.
12.26. Let V be the vector space of two-square matrices over R. Let M = [1, 2; 3, 5], and let f(A, B) = tr(A^T MB),
where A, B ∈ V and "tr" denotes trace. (a) Show that f is a bilinear form on V. (b) Find the matrix of f in
the basis {[1, 0; 0, 0], [0, 1; 0, 0], [0, 0; 1, 0], [0, 0; 0, 1]}
12.27. Let B(V) be the set of bilinear forms on V over K. Prove the following:
(a) If f, g ∈ B(V), then f + g and kg belong to B(V) for any k ∈ K.
(b) If φ and σ are linear functionals on V, then f(u, v) = φ(u)σ(v) belongs to B(V).
12.28. Let [f] denote the matrix representation of a bilinear form f on V relative to a basis {ui}. Show that the
mapping f ↦ [f] is an isomorphism of B(V) onto the vector space of n-square matrices.
12.29. Let f be a bilinear form on V. For any subset S of V, let
S^⊥ = {v ∈ V : f(u, v) = 0 for every u ∈ S} and S^⊤ = {v ∈ V : f(v, u) = 0 for every u ∈ S}
Show that: (a) S^⊥ and S^⊤ are subspaces of V; (b) S1 ⊆ S2 implies S2^⊥ ⊆ S1^⊥ and S2^⊤ ⊆ S1^⊤;
(c) {0}^⊥ = {0}^⊤ = V.
12.30. Suppose f is a bilinear form on V. Prove that rank(f) = dim V − dim V^⊥ = dim V − dim V^⊤, and hence,
dim V^⊥ = dim V^⊤.
12.31. Let f be a bilinear form on V. For each u ∈ V, let û: V → K and ũ: V → K be defined by û(x) = f(x, u) and
ũ(x) = f(u, x). Prove the following:
(a) û and ũ are each linear; i.e., û, ũ ∈ V*,
(b) u ↦ û and u ↦ ũ are each linear mappings from V into V*,
(c) rank(f) = rank(u ↦ û) = rank(u ↦ ũ).
12.32. Show that congruence of matrices (denoted by ≃) is an equivalence relation; that is,
(i) A ≃ A; (ii) If A ≃ B, then B ≃ A; (iii) If A ≃ B and B ≃ C, then A ≃ C.
Symmetric Bilinear Forms, Quadratic Forms
12.33. Find the symmetric matrix A belonging to each of the following quadratic forms:
(a) q(x, y, z) = 2x^2 − 8xy + y^2 − 16xz + 14yz + 5z^2, (c) q(x, y, z) = xy + y^2 + 4xz + z^2,
(b) q(x, y, z) = x^2 − xz + y^2, (d) q(x, y, z) = xy + yz
12.34. For each of the following symmetric matrices A, find a nonsingular matrix P such that D = P^T AP is
diagonal:
(a) A = [1, 0, 2; 0, 3, 6; 2, 6, 7], (b) A = [1, 2, 1; 2, 5, −3; 1, −3, −2],
(c) A = [1, −1, 0, −2; −1, 2, −1, 0; 0, −1, 1, 2; −2, 0, 2, −1]
12.35. Let q(x, y) = 2x^2 − 6xy − 3y^2 and x = s + 2t, y = 3s − t.
(a) Rewrite q(x, y) in matrix notation, and find the matrix A representing the quadratic form.
(b) Rewrite the linear substitution using matrix notation, and find the matrix P corresponding to the
substitution.
(c) Find q(s, t) using (i) direct substitution, (ii) matrix notation.
12.36. For each of the following quadratic forms q(x, y, z), find a nonsingular linear substitution expressing the
variables x, y, z in terms of variables r, s, t such that q(r, s, t) is diagonal:
(a) q(x, y, z) = x^2 + 6xy + 8y^2 − 4xz + 2yz − 9z^2,
(b) q(x, y, z) = 2x^2 − 3y^2 + 8xz + 12yz + 25z^2,
(c) q(x, y, z) = x^2 + 2xy + 3y^2 + 4xz + 8yz + 6z^2.
In each case, find the rank and signature.
12.37. Give an example of a quadratic form q(x, y) such that q(u) = 0 and q(v) = 0 but q(u + v) ≠ 0.
12.38. Let S(V) denote all symmetric bilinear forms on V. Show that
(a) S(V) is a subspace of B(V); (b) If dim V = n, then dim S(V) = (1/2)n(n + 1).
12.39. Consider a real quadratic polynomial q(x1, ..., xn) = Σ_{i,j=1}^n aij xi xj, where aij = aji.
(a) If a11 ≠ 0, show that the substitution
x1 = y1 − (1/a11)(a12 y2 + ··· + a1n yn), x2 = y2, ..., xn = yn
yields the equation q(x1, ..., xn) = a11 y1^2 + q′(y2, ..., yn), where q′ is also a quadratic polynomial.
(b) If a11 = 0 but, say, a12 ≠ 0, show that the substitution
x1 = y1 + y2, x2 = y1 − y2, x3 = y3, ..., xn = yn
yields the equation q(x1, ..., xn) = Σ bij yi yj, where b11 ≠ 0, which reduces this case to case (a).
Remark: This method of diagonalizing q is known as completing the square.
Positive Definite Quadratic Forms
12.40. Determine whether or not each of the following quadratic forms is positive definite:
(a) q(x, y) = 4x^2 + 5xy + 7y^2, (c) q(x, y, z) = x^2 + 4xy + 5y^2 + 6xz + 2yz + 4z^2,
(b) q(x, y) = 2x^2 − 3xy − y^2, (d) q(x, y, z) = x^2 + 2xy + 2y^2 + 4xz + 6yz + 7z^2
12.41. Find those values of k such that the given quadratic form is positive definite:
(a) q(x, y) = 2x^2 − 5xy + ky^2, (b) q(x, y) = 3x^2 − kxy + 12y^2,
(c) q(x, y, z) = x^2 + 2xy + 2y^2 + 2xz + 6yz + kz^2
12.42. Suppose A is a real symmetric positive definite matrix. Show that A = P^T P for some nonsingular matrix P.
Hermitian Forms
12.43. Modify Algorithm 12.1 so that, for a given Hermitian matrix H, it finds a nonsingular matrix P for which
D = P^T H\overline{P} is diagonal.
12.44. For each Hermitian matrix H, find a nonsingular matrix P such that D = P^T H\overline{P} is diagonal:
(a) H = [1, i; −i, 2], (b) H = [1, 2+3i; 2−3i, −1], (c) H = [1, i, 2+i; −i, 2, 1−i; 2−i, 1+i, 2]
Find the rank and signature in each case.
12.45. Let A be a complex nonsingular matrix. Show that H = A*A is Hermitian and positive definite.
12.46. We say that B is Hermitian congruent to A if there exists a nonsingular matrix P such that B = P^T A\overline{P} or,
equivalently, if there exists a nonsingular matrix Q such that B = Q*AQ. Show that Hermitian congruence
is an equivalence relation. (Note: If P = \overline{Q}, then P^T A\overline{P} = Q*AQ.)
12.47. Prove Theorem 12.7: Let f be a Hermitian form on V. Then there is a basis S of V in which f is represented
by a diagonal matrix, and every such diagonal representation has the same number p of positive entries and
the same number n of negative entries.
Miscellaneous Problems
12.48. Let e denote an elementary row operation, and let f* denote the corresponding conjugate column operation
(where each scalar k in e is replaced by \overline{k} in f*). Show that the elementary matrix corresponding to f* is
the conjugate transpose of the elementary matrix corresponding to e.
12.49. Let V and W be vector spaces over K. A mapping f: V × W → K is called a bilinear form on V and W if
(i) f(av1 + bv2, w) = a f(v1, w) + b f(v2, w),
(ii) f(v, aw1 + bw2) = a f(v, w1) + b f(v, w2)
for every a, b ∈ K, vi ∈ V, wj ∈ W. Prove the following:
(a) The set B(V, W) of bilinear forms on V and W is a subspace of the vector space of functions from
V × W into K.
(b) If {φ1, ..., φm} is a basis of V* and {σ1, ..., σn} is a basis of W*, then
{fij : i = 1, ..., m; j = 1, ..., n} is a basis of B(V, W), where fij is defined by fij(v, w) = φi(v)σj(w).
Thus, dim B(V, W) = dim V · dim W.
[Note that if V = W, then we obtain the space B(V) investigated in this chapter.]
12.50. Let V be a vector space over K. A mapping f: V × V × ··· × V (m factors) → K is called a multilinear (or
m-linear) form on V if f is linear in each variable; that is, for i = 1, ..., m,
f(..., au + bv, ...) = a f(..., u, ...) + b f(..., v, ...)
where the displayed entry occupies the ith position and the other entries are held fixed. An m-linear form f
is said to be alternating if f(v1, ..., vm) = 0 whenever vi = vj for i ≠ j. Prove the following:
(a) The set Bm(V) of m-linear forms on V is a subspace of the vector space of functions from
V × V × ··· × V into K.
(b) The set Am(V) of alternating m-linear forms on V is a subspace of Bm(V).
Remark 1: If m = 2, then we obtain the space B(V) investigated in this chapter.
Remark 2: If V = K^m, then the determinant function is an alternating m-linear form on V.
ANSWERS TO SUPPLEMENTARY PROBLEMS
Notation: M = [R1; R2; ...] denotes a matrix M with rows R1, R2, ....
12.24. (a) yes, (b) no, (c) yes, (d) no, (e) no, (f) yes
12.25. (a) A = [4, 1; 7, 3], (b) B = [0, −4; 20, 32], (c) P = [3, 5; −2, −2]
12.26. (b) [1, 0, 2, 0; 0, 1, 0, 2; 3, 0, 5, 0; 0, 3, 0, 5]
12.33. (a) [2, −4, −8; −4, 1, 7; −8, 7, 5], (b) [1, 0, −1/2; 0, 1, 0; −1/2, 0, 0],
(c) [0, 1/2, 2; 1/2, 1, 0; 2, 0, 1], (d) [0, 1/2, 0; 1/2, 0, 1/2; 0, 1/2, 0]
12.34. (a) P = [1, 0, −2; 0, 1, −2; 0, 0, 1], D = diag(1, 3, −9);
(b) P = [1, −2, −11; 0, 1, 5; 0, 0, 1], D = diag(1, 1, −28);
(c) P = [1, 1, 1, 4; 0, 1, 1, 2; 0, 0, 1, 0; 0, 0, 0, 1], D = diag(1, 1, 0, −9)
12.35. A = [2, −3; −3, −3], P = [1, 2; 3, −1], q(s, t) = −43s^2 − 4st + 17t^2
12.36. (a) x = r − 3s − 19t, y = s + 7t, z = t; q(r, s, t) = r^2 − s^2 + 36t^2;
(b) x = r − 2t, y = s + 2t, z = t; q(r, s, t) = 2r^2 − 3s^2 + 29t^2;
(c) x = r − s − t, y = s − t, z = t; q(r, s, t) = r^2 + 2s^2
12.37. q(x, y) = x^2 − y^2, u = (1, 1), v = (1, −1)
12.40. (a) yes, (b) no, (c) no, (d) yes
12.41. (a) k > 25/8, (b) −12 < k < 12, (c) k > 5
12.44. (a) P = [1, i; 0, 1], D = I, s = 2; (b) P = [1, −2+3i; 0, 1], D = diag(1, −14), s = 0;
(c) P = [1, i, −3+i; 0, 1, i; 0, 0, 1], D = diag(1, 1, −4), s = 1
CHAPTER 13
Linear Operators on Inner Product Spaces
13.1 Introduction
This chapter investigates the space A(V) of linear operators T on an inner product space V. (See
Chapter 7.) Thus, the base field K is either the real numbers R or the complex numbers C. In fact, different
terminologies will be used for the real case and the complex case. We also use the fact that the inner
products on real Euclidean space R^n and complex Euclidean space C^n may be defined, respectively, by
⟨u, v⟩ = u^T v and ⟨u, v⟩ = u^T \overline{v}
where u and v are column vectors.
The reader should review the material in Chapter 7 and be very familiar with the notions of norm
(length), orthogonality, and orthonormal bases. We also note that Chapter 7 mainly dealt with real inner
product spaces, whereas here we assume that V is a complex inner product space unless otherwise stated
or implied.
Lastly, we note that in Chapter 2, we used A^H to denote the conjugate transpose of a complex matrix A;
that is, A^H = \overline{A}^T. This notation is not standard. Many texts, especially advanced texts, use A* to denote
such a matrix; we will use that notation in this chapter. That is, now A* = \overline{A}^T.
13.2 Adjoint Operators
We begin with the following basic definition.
DEFINITION: A linear operator T on an inner product space V is said to have an adjoint operator T*
on V if ⟨T(u), v⟩ = ⟨u, T*(v)⟩ for every u, v ∈ V.
The following example shows that the adjoint operator has a simple description within the context of
matrix mappings.
EXAMPLE 13.1
(a) Let A be a real n-square matrix viewed as a linear operator on R^n. Then, for every u, v ∈ R^n,
⟨Au, v⟩ = (Au)^T v = u^T A^T v = ⟨u, A^T v⟩
Thus, the transpose A^T of A is the adjoint of A.
(b) Let B be a complex n-square matrix viewed as a linear operator on C^n. Then, for every u, v ∈ C^n,
⟨Bu, v⟩ = (Bu)^T \overline{v} = u^T B^T \overline{v} = u^T \overline{B*v} = ⟨u, B*v⟩
Thus, the conjugate transpose B* of B is the adjoint of B.
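A quick NumPy check of part (b), using randomly chosen data (the inner product below is the one used in this chapter, ⟨x, y⟩ = x^T \overline{y}):

    import numpy as np

    rng = np.random.default_rng(1)
    B = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
    u = rng.standard_normal(3) + 1j * rng.standard_normal(3)
    v = rng.standard_normal(3) + 1j * rng.standard_normal(3)
    inner = lambda x, y: x @ y.conj()              # <x, y> = x^T conj(y)
    B_star = B.conj().T                            # conjugate transpose B*
    print(np.isclose(inner(B @ u, v), inner(u, B_star @ v)))   # True: <Bu, v> = <u, B*v>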
Remark: B* may mean either the adjoint of B as a linear operator or the conjugate transpose of B
as a matrix. By Example 13.1(b), the ambiguity makes no difference, because they denote the same
object.
The following theorem (proved in Problem 13.4) is the main result in this section.
THEOREM 13.1: Let T be a linear operator on a finite-dimensional inner product space V over K.
Then
(i) There exists a unique linear operator T* on V such that ⟨T(u), v⟩ = ⟨u, T*(v)⟩
for every u, v ∈ V. (That is, T has an adjoint T*.)
(ii) If A is the matrix representation of T with respect to any orthonormal basis
S = {ui} of V, then the matrix representation of T* in the basis S is the
conjugate transpose A* of A (or the transpose A^T of A when K is real).
We emphasize that no such simple relationship exists between the matrices representing T and T* if
the basis is not orthonormal. Thus, we see one useful property of orthonormal bases. We also emphasize
that this theorem is not valid if V has infinite dimension (Problem 13.31).
The following theorem (proved in Problem 13.5) summarizes some of the properties of the adjoint.
THEOREM 13.2: Let T, T1, T2 be linear operators on V and let k ∈ K. Then
(i) (T1 + T2)* = T1* + T2*, (iii) (T1T2)* = T2*T1*,
(ii) (kT)* = \overline{k}T*, (iv) (T*)* = T.
Observe the similarity between the above theorem and Theorem 2.3 on properties of the transpose
operation on matrices.
Linear Functionals and Inner Product Spaces
Recall (Chapter 11) that a linear functional φ on a vector space V is a linear mapping φ: V → K. This
subsection contains an important result (Theorem 13.3) that is used in the proof of the above basic
Theorem 13.1.
Let V be an inner product space. Each u ∈ V determines a mapping û: V → K defined by
û(v) = ⟨v, u⟩
Now, for any a, b ∈ K and any v1, v2 ∈ V,
û(av1 + bv2) = ⟨av1 + bv2, u⟩ = a⟨v1, u⟩ + b⟨v2, u⟩ = a û(v1) + b û(v2)
That is, û is a linear functional on V. The converse is also true for spaces of finite dimension, and it is
contained in the following important theorem (proved in Problem 13.3).
THEOREM 13.3: Let φ be a linear functional on a finite-dimensional inner product space V. Then
there exists a unique vector u ∈ V such that φ(v) = ⟨v, u⟩ for every v ∈ V.
We remark that the above theorem is not valid for spaces of infinite dimension (Problem 13.24).
13.3 Analogy Between A(V) and C, Special Linear Operators
Let A(V) denote the algebra of all linear operators on a finite-dimensional inner product space V. The
adjoint mapping T ↦ T* on A(V) is quite analogous to the conjugation mapping z ↦ \overline{z} on the complex
field C. To illustrate this analogy we identify in Table 13-1 certain classes of operators T ∈ A(V) whose
behavior under the adjoint map imitates the behavior under conjugation of familiar classes of complex
numbers.
The analogy between these operators T and complex numbers z is reflected in the next theorem.
THEOREM 13.4: Let λ be an eigenvalue of a linear operator T on V.
(i) If T* = T^{-1} (i.e., T is orthogonal or unitary), then |λ| = 1.
(ii) If T* = T (i.e., T is self-adjoint), then λ is real.
(iii) If T* = −T (i.e., T is skew-adjoint), then λ is pure imaginary.
(iv) If T = S*S with S nonsingular (i.e., T is positive definite), then λ is real and
positive.
Proof. In each case let v be a nonzero eigenvector of T belonging to λ; that is, T(v) = λv with
v ≠ 0. Hence, ⟨v, v⟩ is positive.
Proof of (i). We show that λ\overline{λ}⟨v, v⟩ = ⟨v, v⟩:
λ\overline{λ}⟨v, v⟩ = ⟨λv, λv⟩ = ⟨T(v), T(v)⟩ = ⟨v, T*T(v)⟩ = ⟨v, I(v)⟩ = ⟨v, v⟩
But ⟨v, v⟩ ≠ 0; hence, λ\overline{λ} = 1 and so |λ| = 1.
Proof of (ii). We show that λ⟨v, v⟩ = \overline{λ}⟨v, v⟩:
λ⟨v, v⟩ = ⟨λv, v⟩ = ⟨T(v), v⟩ = ⟨v, T*(v)⟩ = ⟨v, T(v)⟩ = ⟨v, λv⟩ = \overline{λ}⟨v, v⟩
But ⟨v, v⟩ ≠ 0; hence, λ = \overline{λ} and so λ is real.
Proof of (iii). We show that λ⟨v, v⟩ = −\overline{λ}⟨v, v⟩:
λ⟨v, v⟩ = ⟨λv, v⟩ = ⟨T(v), v⟩ = ⟨v, T*(v)⟩ = ⟨v, −T(v)⟩ = ⟨v, −λv⟩ = −\overline{λ}⟨v, v⟩
But ⟨v, v⟩ ≠ 0; hence, λ = −\overline{λ} or \overline{λ} = −λ, and so λ is pure imaginary.
Proof of (iv). Note first that S(v) ≠ 0 because S is nonsingular; hence, ⟨S(v), S(v)⟩ is positive. We
show that λ⟨v, v⟩ = ⟨S(v), S(v)⟩:
λ⟨v, v⟩ = ⟨λv, v⟩ = ⟨T(v), v⟩ = ⟨S*S(v), v⟩ = ⟨S(v), S(v)⟩
But ⟨v, v⟩ and ⟨S(v), S(v)⟩ are positive; hence, λ is positive.
Table 13-1
Class of complex numbers | Behavior under conjugation | Class of operators in A(V) | Behavior under the adjoint map
Unit circle (|z| = 1) | \overline{z} = 1/z | Orthogonal operators (real case); unitary operators (complex case) | T* = T^{-1}
Real axis | \overline{z} = z | Self-adjoint operators; also called symmetric (real case) or Hermitian (complex case) | T* = T
Imaginary axis | \overline{z} = −z | Skew-adjoint operators; also called skew-symmetric (real case) or skew-Hermitian (complex case) | T* = −T
Positive real axis (0, ∞) | z = \overline{w}w, w ≠ 0 | Positive definite operators | T = S*S with S nonsingular
Remark: Each of the above operators T commutes with its adjoint; that is, TT* ¼ T*T. Such
operators are called normal operators.
13.4 Self-Adjoint Operators
Let T be a self-adjoint operator on an inner product space V; that is, suppose
T* ¼ T
(If T is defined by a matrix A, then A is symmetric or Hermitian according as A is real or complex.) By
Theorem 13.4, the eigenvalues of T are real. The following is another important property of T.
THEOREM 13.5: Let T be a self-adjoint operator on V. Suppose u and v are eigenvectors of T
belonging to distinct eigenvalues. Then u and v are orthogonal; that is, ⟨u, v⟩ = 0.
Proof. Suppose T(u) = λ1u and T(v) = λ2v, where λ1 ≠ λ2. We show that λ1⟨u, v⟩ = λ2⟨u, v⟩:
λ1⟨u, v⟩ = ⟨λ1u, v⟩ = ⟨T(u), v⟩ = ⟨u, T*(v)⟩ = ⟨u, T(v)⟩ = ⟨u, λ2v⟩ = \overline{λ2}⟨u, v⟩ = λ2⟨u, v⟩
(The fourth equality uses the fact that T* = T, and the last equality uses the fact that the eigenvalue λ2 is
real.) Because λ1 ≠ λ2, we get ⟨u, v⟩ = 0. Thus, the theorem is proved.
13.5 Orthogonal and Unitary Operators
Let U be a linear operator on a finite-dimensional inner product space V. Suppose
U* = U^{-1} or equivalently UU* = U*U = I
Recall that U is said to be orthogonal or unitary according as the underlying field is real or complex. The
next theorem (proved in Problem 13.10) gives alternative characterizations of these operators.
THEOREM 13.6: The following conditions on an operator U are equivalent:
(i) U* = U^{-1}; that is, UU* = U*U = I. [U is unitary (orthogonal).]
(ii) U preserves inner products; that is, for every v, w ∈ V, ⟨U(v), U(w)⟩ = ⟨v, w⟩.
(iii) U preserves lengths; that is, for every v ∈ V, ‖U(v)‖ = ‖v‖.
EXAMPLE 13.2
(a) Let T: R^3 → R^3 be the linear operator that rotates each vector v about the z-axis by a fixed angle θ as shown in
Fig. 10-1 (Section 10.3). That is, T is defined by
T(x, y, z) = (x cos θ − y sin θ, x sin θ + y cos θ, z)
We note that lengths (distances from the origin) are preserved under T. Thus, T is an orthogonal operator.
(b) Let V be l2-space (Hilbert space), defined in Section 7.3. Let T: V → V be the linear operator defined by
T(a1, a2, a3, ...) = (0, a1, a2, a3, ...)
Clearly, T preserves inner products and lengths. However, T is not surjective, because, for example, (1, 0, 0, ...)
does not belong to the image of T; hence, T is not invertible. Thus, we see that Theorem 13.6 is not valid for
spaces of infinite dimension.
An isomorphism from one inner product space into another is a bijective mapping that preserves the
three basic operations of an inner product space: vector addition, scalar multiplication, and inner
products. Thus, the above mappings (orthogonal and unitary) may also be characterized as the
isomorphisms of V into itself. Note that such a mapping U also preserves distances, because
‖U(v) − U(w)‖ = ‖U(v − w)‖ = ‖v − w‖
Hence, U is called an isometry.
13.6 Orthogonal and Unitary Matrices
Let U be a linear operator on an inner product space V. By Theorem 13.1, we obtain the following results.
THEOREM 13.7A: A complex matrix A represents a unitary operator U (relative to an orthonormal
basis) if and only if A* = A^{-1}.
THEOREM 13.7B: A real matrix A represents an orthogonal operator U (relative to an orthonormal
basis) if and only if A^T = A^{-1}.
The above theorems motivate the following definitions (which appeared in Sections 2.10 and 2.11).
DEFINITION: A complex matrix A for which A* = A^{-1} is called a unitary matrix.
DEFINITION: A real matrix A for which A^T = A^{-1} is called an orthogonal matrix.
We repeat Theorem 2.6, which characterizes the above matrices.
We repeat Theorem 2.6, which characterizes the above matrices.
THEOREM 13.8: The following conditions on a matrix A are equivalent:
(i) A is unitary (orthogonal).
(ii) The rows of A form an orthonormal set.
(iii) The columns of A form an orthonormal set.
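For instance, a plane rotation matrix satisfies all three conditions of Theorem 13.8; the short NumPy check below (with an arbitrarily chosen angle) illustrates this.

    import numpy as np

    theta = 0.7
    A = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])    # a rotation matrix
    print(np.allclose(A.T @ A, np.eye(2)))             # True: A^T = A^{-1}, so A is orthogonal
    print(np.allclose(A @ A.T, np.eye(2)))             # True: its rows (and columns) are orthonormal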
13.7 Change of Orthonormal Basis
Orthonormal bases play a special role in the theory of inner product spaces V. Thus, we are naturally
interested in the properties of the change-of-basis matrix from one such basis to another. The following
theorem (proved in Problem 13.12) holds.
THEOREM 13.9: Let {u1, ..., un} be an orthonormal basis of an inner product space V. Then the
change-of-basis matrix from {ui} into another orthonormal basis is unitary
(orthogonal). Conversely, if P = [aij] is a unitary (orthogonal) matrix, then the
following is an orthonormal basis:
{u′i = a1iu1 + a2iu2 + ··· + aniun : i = 1, ..., n}
Recall that matrices A and B representing the same linear operator T are similar; that is, B = P^{-1}AP,
where P is the (nonsingular) change-of-basis matrix. On the other hand, if V is an inner product space, we
are usually interested in the case when P is unitary (or orthogonal) as suggested by Theorem 13.9. (Recall
that P is unitary if the conjugate transpose P* = P^{-1}, and P is orthogonal if the transpose P^T = P^{-1}.) This
leads to the following definition.
DEFINITION: Complex matrices A and B are unitarily equivalent if there exists a unitary matrix P
for which B = P*AP. Analogously, real matrices A and B are orthogonally equivalent
if there exists an orthogonal matrix P for which B = P^T AP.
Note that orthogonally equivalent matrices are necessarily congruent.
13.8 Positive Definite and Positive Operators
Let P be a linear operator on an inner product space V. Then
(i) P is said to be positive definite if P = S*S for some nonsingular operator S;
(ii) P is said to be positive (or nonnegative or semidefinite) if P = S*S for some operator S.
The following theorems give alternative characterizations of these operators.
THEOREM 13.10A: The following conditions on an operator P are equivalent:
(i) P = T^2 for some nonsingular self-adjoint operator T.
(ii) P is positive definite.
(iii) P is self-adjoint and ⟨P(u), u⟩ > 0 for every u ≠ 0 in V.
The corresponding theorem for positive operators (proved in Problem 13.21) follows.
THEOREM 13.10B: The following conditions on an operator P are equivalent:
(i) P = T^2 for some self-adjoint operator T.
(ii) P is positive; that is, P = S*S.
(iii) P is self-adjoint and ⟨P(u), u⟩ ≥ 0 for every u ∈ V.
13.9 Diagonalization and Canonical Forms in Inner Product Spaces
Let T be a linear operator on a finite-dimensional inner product space V over K. Representing T by a
diagonal matrix depends upon the eigenvectors and eigenvalues of T, and hence, upon the roots of
the characteristic polynomial Δ(t) of T. Now Δ(t) always factors into linear polynomials over the
complex field C but may have no linear factors over the real field R. Thus, the situation
for real inner product spaces (sometimes called Euclidean spaces) is inherently different from the
situation for complex inner product spaces (sometimes called unitary spaces). Thus, we treat them
separately.
Real Inner Product Spaces, Symmetric and Orthogonal Operators
The following theorem (proved in Problem 13.14) holds.
THEOREM 13.11: Let T be a symmetric (self-adjoint) operator on a real finite-dimensional inner
product space V. Then there exists an orthonormal basis of V consisting of eigenvectors of
T; that is, T can be represented by a diagonal matrix relative to an orthonormal
basis.
We give the corresponding statement for matrices.
THEOREM 13.11: (Alternative Form) Let A be a real symmetric matrix. Then there exists an
orthogonal matrix P such that B = P^{-1}AP = P^T AP is diagonal.
We can choose the columns of the above matrix P to be normalized orthogonal eigenvectors of A; then
the diagonal entries of B are the corresponding eigenvalues.
On the other hand, an orthogonal operator T need not be symmetric, and so it may not be represented
by a diagonal matrix relative to an orthonormal basis. However, such an operator T does have a simple
canonical representation, as described in the following theorem (proved in Problem 13.16).
THEOREM 13.12: Let T be an orthogonal operator on a real inner product space V. Then there exists
an orthonormal basis of V in which T is represented by a block diagonal matrix M
of the form
M = diag(Is, −It, [cos θ1, −sin θ1; sin θ1, cos θ1], ..., [cos θr, −sin θr; sin θr, cos θr])
The reader may recognize that each of the 2 × 2 diagonal blocks represents a rotation in the
corresponding two-dimensional subspace, and each diagonal entry −1 represents a reflection in the
corresponding one-dimensional subspace.
Complex Inner Product Spaces, Normal and Triangular Operators
A linear operator T is said to be normal if it commutes with its adjoint—that is, if TT* ¼ T*T. We note
that normal operators include both self-adjoint and unitary operators.
Analogously, a complex matrix A is said to be normal if it commutes with its conjugate transpose—
that is, if AA* ¼ A*A.
EXAMPLE 13.3 Let A = [1, 1; i, 3+2i]. Then A* = [1, −i; 1, 3−2i].
Also AA* = [2, 3−3i; 3+3i, 14] = A*A. Thus, A is normal.
The following theorem (proved in Problem 13.19) holds.
THEOREM 13.13: Let T be a normal operator on a complex finite-dimensional inner product space V.
Then there exists an orthonormal basis of V consisting of eigenvectors of T; that
is, T can be represented by a diagonal matrix relative to an orthonormal basis.
We give the corresponding statement for matrices.
THEOREM 13.13: (Alternative Form) Let A be a normal matrix. Then there exists a unitary matrix
P such that B = P^{-1}AP = P*AP is diagonal.
The following theorem (proved in Problem 13.20) shows that even nonnormal operators on unitary
spaces have a relatively simple form.
THEOREM 13.14: Let T be an arbitrary operator on a complex finite-dimensional inner product space
V. Then T can be represented by a triangular matrix relative to an orthonormal
basis of V.
THEOREM 13.14: (Alternative Form) Let A be an arbitrary complex matrix. Then there exists a
unitary matrix P such that B = P^{-1}AP = P*AP is triangular.
13.10 Spectral Theorem
The Spectral Theorem is a reformulation of the diagonalization Theorems 13.11 and 13.13.
THEOREM 13.15: (Spectral Theorem) Let T be a normal (symmetric) operator on a complex (real)
finite-dimensional inner product space V. Then there exist linear operators
E1, ..., Er on V and scalars λ1, ..., λr such that
(i) T = λ1E1 + λ2E2 + ··· + λrEr, (iii) E1^2 = E1, E2^2 = E2, ..., Er^2 = Er,
(ii) E1 + E2 + ··· + Er = I, (iv) EiEj = 0 for i ≠ j.
The above linear operators E1, ..., Er are projections in the sense that Ei^2 = Ei. Moreover, they are
said to be orthogonal projections because they have the additional property that EiEj = 0 for i ≠ j.
The following example shows the relationship between a diagonal matrix representation and the
corresponding orthogonal projections.
EXAMPLE 13.4 Consider the following diagonal matrices A, E1, E2, E3:
A = diag(2, 3, 3, 5), E1 = diag(1, 0, 0, 0), E2 = diag(0, 1, 1, 0), E3 = diag(0, 0, 0, 1)
The reader can verify that
(i) A = 2E1 + 3E2 + 5E3, (ii) E1 + E2 + E3 = I, (iii) Ei^2 = Ei, (iv) EiEj = 0 for i ≠ j.
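These four properties can also be confirmed mechanically; the short NumPy check below reproduces them for the matrices of Example 13.4.

    import numpy as np

    A  = np.diag([2, 3, 3, 5])
    E1 = np.diag([1, 0, 0, 0]); E2 = np.diag([0, 1, 1, 0]); E3 = np.diag([0, 0, 0, 1])
    print(np.array_equal(A, 2*E1 + 3*E2 + 5*E3))                  # (i)
    print(np.array_equal(E1 + E2 + E3, np.eye(4)))                # (ii)
    print(all(np.array_equal(E @ E, E) for E in (E1, E2, E3)))    # (iii)
    print(np.array_equal(E1 @ E2, np.zeros((4, 4))))              # (iv), one representative product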
SOLVED PROBLEMS
Adjoints
13.1. Find the adjoint of F: R^3 → R^3 defined by
F(x, y, z) = (3x + 4y − 5z, 2x − 6y + 7z, 5x − 9y + z)
First find the matrix A that represents F in the usual basis of R^3—that is, the matrix A whose rows are
the coefficients of x, y, z—and then form the transpose A^T of A. This yields
A = [3, 4, −5; 2, −6, 7; 5, −9, 1] and then A^T = [3, 2, 5; 4, −6, −9; −5, 7, 1]
The adjoint F* is represented by the transpose of A; hence,
F*(x, y, z) = (3x + 2y + 5z, 4x − 6y − 9z, −5x + 7y + z)
13.2. Find the adjoint of G: C^3 → C^3 defined by
G(x, y, z) = (2x + (1 − i)y, (3 + 2i)x − 4iz, 2ix + (4 − 3i)y − 3z)
First find the matrix B that represents G in the usual basis of C^3, and then form the conjugate transpose
B* of B. This yields
B = [2, 1−i, 0; 3+2i, 0, −4i; 2i, 4−3i, −3] and then B* = [2, 3−2i, −2i; 1+i, 0, 4+3i; 0, 4i, −3]
Then G*(x, y, z) = (2x + (3 − 2i)y − 2iz, (1 + i)x + (4 + 3i)z, 4iy − 3z).
13.3. Prove Theorem 13.3: Let φ be a linear functional on an n-dimensional inner product space V.
Then there exists a unique vector u ∈ V such that φ(v) = ⟨v, u⟩ for every v ∈ V.
Let {w1, ..., wn} be an orthonormal basis of V. Set
u = \overline{φ(w1)}w1 + \overline{φ(w2)}w2 + ··· + \overline{φ(wn)}wn
Let û be the linear functional on V defined by û(v) = ⟨v, u⟩ for every v ∈ V. Then, for i = 1, ..., n,
û(wi) = ⟨wi, u⟩ = ⟨wi, \overline{φ(w1)}w1 + ··· + \overline{φ(wn)}wn⟩ = φ(wi)
Because û and φ agree on each basis vector, û = φ.
Now suppose u′ is another vector in V for which φ(v) = ⟨v, u′⟩ for every v ∈ V. Then ⟨v, u⟩ = ⟨v, u′⟩
or ⟨v, u − u′⟩ = 0. In particular, this is true for v = u − u′, and so ⟨u − u′, u − u′⟩ = 0. This yields
u − u′ = 0 and u = u′. Thus, such a vector u is unique, as claimed.
13.4. Prove Theorem 13.1: Let T be a linear operator on an n-dimensional inner product space V. Then
(a) There exists a unique linear operator T* on V such that
hTðuÞ; vi ¼ hu; T*ðvÞi for all u; v 2 V:
(b) Let A be the matrix that represents T relative to an orthonormal basis S ¼ fuig. Then the
conjugate transpose A* of A represents T* in the basis S.
(a) We first define the mapping T*. Let v be an arbitrary but fixed element of V. The map u 7! hTðuÞ; vi
is a linear functional on V. Hence, by Theorem 13.3, there exists a unique element v0
2 V such
that hTðuÞ; vi ¼ hu; v0
i for every u 2 V. We define T* : V ! V by T*ðvÞ ¼ v0
. Then
hTðuÞ; vi ¼ hu; T*ðvÞi for every u; v 2 V.
We next show that T* is linear. For any u; vi 2 V, and any a; b 2 K,
hu; T*ðav1 þ bv2Þi ¼ hTðuÞ; av1 þ bv2i ¼ 
ahTðuÞ; v1i þ 
bhTðuÞ; v2i
¼ 
ahu; T*ðv1Þi þ 
bhu; T*ðv2Þi ¼ hu; aT*ðv1Þ þ bT*ðv2Þi
But this is true for every u 2 V; hence, T*ðav1 þ bv2Þ ¼ aT*ðv1Þ þ bT*ðv2Þ. Thus, T* is linear.
(b) The matrices A ¼ ½aij and B ¼ ½bij that represent T and T*, respectively, relative to the orthonormal
basis S are given by aij ¼ hTðujÞ; uii and bij ¼ hT*ðujÞ; uii (Problem 13.67). Hence,
bij ¼ hT*ðujÞ; uii ¼ hui; T*ðujÞi ¼ hTðuiÞ; uji ¼ aji
Thus, B ¼ A*, as claimed.
13.5. Prove Theorem 13.2:
(i) ðT1 þ T2Þ* ¼ T1
* þ T2
*, (iii) ðT1T2Þ* ¼ T2
*T1
*,
(ii) ðkTÞ* ¼ 
kT*, (iv) ðT*Þ* ¼ T.
(i) For any u; v 2 V,
hðT1 þ T2ÞðuÞ; vi ¼ hT1ðuÞ þ T2ðuÞ; vi ¼ hT1ðuÞ; vi þ hT2ðuÞ; vi
¼ hu; T1
*ðvÞi þ hu; T2
*ðvÞi ¼ hu; T1
*ðvÞ þ T2
*ðvÞi
¼ hu; ðT1
* þ T2
*ÞðvÞi
The uniqueness of the adjoint implies ðT1 þ T2Þ* ¼ T1
* þ T2
*.
(ii) For any u; v 2 V,
hðkTÞðuÞ; vi ¼ hkTðuÞ; vi ¼ khTðuÞ; vi ¼ khu; T*ðvÞi ¼ hu; 
kT*ðvÞi ¼ hu; ð
kT*ÞðvÞi
The uniqueness of the adjoint implies ðkTÞ* ¼ 
kT*.
(iii) For any u; v 2 V,
hðT1T2ÞðuÞ; vi ¼ hT1ðT2ðuÞÞ; vi ¼ hT2ðuÞ; T1
*ðvÞi
¼ hu; T2
*ðT1
*ðvÞÞi ¼ hu; ðT2
*T1
*ÞðvÞi
The uniqueness of the adjoint implies ðT1T2Þ* ¼ T2
*T1
*.
(iv) For any u; v 2 V,
hT*ðuÞ; vi ¼ hv; T*ðuÞi ¼ hTðvÞ; ui ¼ hu; TðvÞi
The uniqueness of the adjoint implies ðT*Þ* ¼ T.
13.6. Show that ðaÞ I* ¼ I, and ðbÞ 0* ¼ 0.
(a) For every u; v 2 V, hIðuÞ; vi ¼ hu; vi ¼ hu; IðvÞi; hence, I* ¼ I.
(b) For every u; v 2 V, h0ðuÞ; vi ¼ h0; vi ¼ 0 ¼ hu; 0i ¼ hu; 0ðvÞi; hence, 0* ¼ 0.
13.7. Suppose T is invertible. Show that ðT1
Þ* ¼ ðT*Þ1
.
I ¼ I* ¼ ðTT1
Þ* ¼ ðT1
Þ*T*; hence; ðT1
Þ* ¼ ðT*Þ1
:
13.8. Let T be a linear operator on V, and let W be a T-invariant subspace of V. Show that W?
is
invariant under T*.
Let u 2 W?
. If w 2 W, then TðwÞ 2 W and so hw; T*ðuÞi ¼ hTðwÞ; ui ¼ 0. Thus, T*ðuÞ 2 W?
because it is orthogonal to every w 2 W. Hence, W?
is invariant under T*.
13.9. Let T be a linear operator on V. Show that each of the following conditions implies T ¼ 0:
(i) hTðuÞ; vi ¼ 0 for every u; v 2 V.
(ii) V is a complex space, and hTðuÞ; ui ¼ 0 for every u 2 V.
(iii) T is self-adjoint and hTðuÞ; ui ¼ 0 for every u 2 V.
Give an example of an operator T on a real space V for which hTðuÞ; ui ¼ 0 for every u 2 V but T 6¼ 0.
[Thus, (ii) need not hold for a real space V.]
(i) Set v ¼ TðuÞ. Then hTðuÞ; TðuÞi ¼ 0, and hence, TðuÞ ¼ 0, for every u 2 V. Accordingly, T ¼ 0.
(ii) By hypothesis, hTðv þ wÞ; v þ wi ¼ 0 for any v; w 2 V. Expanding and setting hTðvÞ; vi ¼ 0 and
hTðwÞ; wi ¼ 0, we find
hTðvÞ; wi þ hTðwÞ; vi ¼ 0 ð1Þ
Note w is arbitrary in (1). Substituting iw for w, and using hTðvÞ; iwi ¼ 
ihTðvÞ; wi ¼ ihTðvÞ; wi and
hTðiwÞ; vi ¼ hiTðwÞ; vi ¼ ihTðwÞ; vi, we find
ihTðvÞ; wi þ ihTðwÞ; vi ¼ 0
Dividing through by i and adding to (1), we obtain hTðwÞ; vi ¼ 0 for any v; w; 2 V. By (i), T ¼ 0.
(iii) By (ii), the result holds for the complex case; hence we need only consider the real case. Expanding
hTðv þ wÞ; v þ wi ¼ 0, we again obtain (1). Because T is self-adjoint and as it is a real space, we
have hTðwÞ; vi ¼ hw; TðvÞi ¼ hTðvÞ; wi. Substituting this into (1), we obtain hTðvÞ; wi ¼ 0 for any
v; w 2 V. By (i), T ¼ 0.
For an example, consider the linear operator T on R2
defined by Tðx; yÞ ¼ ðy; xÞ. Then
hTðuÞ; ui ¼ 0 for every u 2 V, but T 6¼ 0.
Orthogonal and Unitary Operators and Matrices
13.10. Prove Theorem 13.6: The following conditions on an operator U are equivalent:
(i) U* ¼ U1
; that is, U is unitary. (ii) hUðvÞ; UðwÞi ¼ hu; wi. (iii) kUðvÞk ¼ kvk.
Suppose (i) holds. Then, for every v; w; 2 V,
hUðvÞ; UðwÞi ¼ hv; U*UðwÞi ¼ hv; IðwÞi ¼ hv; wi
Thus, (i) implies (ii). Now if (ii) holds, then
kUðvÞk ¼
ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi
hUðvÞ; UðvÞi
p
¼
ffiffiffiffiffiffiffiffiffiffiffi
hv; vi
p
¼ kvk
Hence, (ii) implies (iii). It remains to show that (iii) implies (i).
Suppose (iii) holds. Then for every v 2 V,
hU*UðvÞi ¼ hUðvÞ; UðvÞi ¼ hv; vi ¼ hIðvÞ; vi
Hence, hðU*U  IÞðvÞ; vi ¼ 0 for every v 2 V. But U*U  I is self-adjoint (Prove!); then, by Problem
13.9, we have U*U  I ¼ 0 and so U*U ¼ I. Thus, U* ¼ U1
, as claimed.
13.11. Let U be a unitary (orthogonal) operator on V, and let W be a subspace invariant under U. Show
that W?
is also invariant under U.
Because U is nonsingular, UðWÞ ¼ W; that is, for any w 2 W, there exists w0
2 W such that
Uðw0
Þ ¼ w. Now let v 2 W?
. Then, for any w 2 W,
hUðvÞ; wi ¼ hUðvÞ; Uðw0
Þi ¼ hv; w0
i ¼ 0
Thus, UðvÞ belongs to W?
. Therefore, W?
is invariant under U.
13.12. Prove Theorem 13.9: The change-of-basis matrix from an orthonormal basis fu1; . . . ; ung into
another orthonormal basis is unitary (orthogonal). Conversely, if P ¼ ½aij is a unitary (ortho-
gonal) matrix, then the vectors ui0 ¼
P
j ajiuj form an orthonormal basis.
Suppose fvig is another orthonormal basis and suppose
vi ¼ bi1u1 þ bi2u2 þ    þ binun; i ¼ 1; . . . ; n ð1Þ
Because fvig is orthonormal,
dij ¼ hvi; vji ¼ bi1bj1 þ bi2bj2 þ    þ binbjn ð2Þ
Let B ¼ ½bij be the matrix of coefficients in (1). (Then BT
is the change-of-basis matrix from fuig to
fvig.) Then BB* ¼ ½cij, where cij ¼ bi1bj1 þ bi2bj2 þ    þ binbjn. By (2), cij ¼ dij, and therefore BB* ¼ I.
Accordingly, B, and hence, BT
, is unitary.
It remains to prove that fu0
ig is orthonormal. By Problem 13.67,
hu0
i; u0
ji ¼ a1ia1j þ a2ia2j þ    þ anianj ¼ hCi; Cji
where Ci denotes the ith column of the unitary (orthogonal) matrix P ¼ ½aij: Because P is unitary
(orthogonal), its columns are orthonormal; hence, hu0
i; u0
ji ¼ hCi; Cji ¼ dij. Thus, fu0
ig is an orthonormal basis.
Symmetric Operators and Canonical Forms in Euclidean Spaces
13.13. Let T be a symmetric operator. Show that (a) The characteristic polynomial DðtÞ of T is a
product of linear polynomials (over R); (b) T has a nonzero eigenvector.
(a) Let A be a matrix representing T relative to an orthonormal basis of V; then A ¼ AT
. Let DðtÞ be the
characteristic polynomial of A. Viewing A as a complex self-adjoint operator, A has only real
eigenvalues by Theorem 13.4. Thus,
DðtÞ ¼ ðt  l1Þðt  l2Þ    ðt  lnÞ
where the li are all real. In other words, DðtÞ is a product of linear polynomials over R.
(b) By (a), T has at least one (real) eigenvalue. Hence, T has a nonzero eigenvector.
13.14. Prove Theorem 13.11: Let T be a symmetric operator on a real n-dimensional inner product space V. Then there exists an orthonormal basis of V consisting of eigenvectors of T. (Hence, T can be represented by a diagonal matrix relative to an orthonormal basis.)
The proof is by induction on the dimension of V. If dim V = 1, the theorem trivially holds. Now suppose dim V = n > 1. By Problem 13.13, there exists a nonzero eigenvector v1 of T. Let W be the subspace spanned by v1, and let u1 be a unit vector in W, e.g., let u1 = v1/‖v1‖.
Because v1 is an eigenvector of T, the subspace W of V is invariant under T. By Problem 13.8, W⊥ is invariant under T* = T. Thus, the restriction T̂ of T to W⊥ is a symmetric operator. By Theorem 7.4, V = W ⊕ W⊥. Hence, dim W⊥ = n − 1, because dim W = 1. By induction, there exists an orthonormal basis {u2, ..., un} of W⊥ consisting of eigenvectors of T̂ and hence of T. But ⟨u1, ui⟩ = 0 for i = 2, ..., n, because ui ∈ W⊥. Accordingly, {u1, u2, ..., un} is an orthonormal set and consists of eigenvectors of T. Thus, the theorem is proved.
13.15. Let q(x, y) = 3x² − 6xy + 11y². Find an orthogonal change of coordinates (linear substitution) that diagonalizes the quadratic form q.
Find the symmetric matrix A representing q and its characteristic polynomial Δ(t). We have
A = [3, −3; −3, 11]  and  Δ(t) = t² − tr(A) t + |A| = t² − 14t + 24 = (t − 2)(t − 12)
The eigenvalues are λ = 2 and λ = 12. Hence, a diagonal form of q is
q(s, t) = 2s² + 12t²
(where we use s and t as new variables). The corresponding orthogonal change of coordinates is obtained by finding an orthogonal set of eigenvectors of A.
Subtract λ = 2 down the diagonal of A to obtain the matrix
M = [1, −3; −3, 9]  corresponding to  x − 3y = 0 and −3x + 9y = 0, or simply x − 3y = 0
A nonzero solution is u1 = (3, 1). Next subtract λ = 12 down the diagonal of A to obtain the matrix
M = [−9, −3; −3, −1]  corresponding to  −9x − 3y = 0 and −3x − y = 0, or simply 3x + y = 0
A nonzero solution is u2 = (−1, 3). Normalize u1 and u2 to obtain the orthonormal basis
û1 = (3/√10, 1/√10),  û2 = (−1/√10, 3/√10)
Now let P be the matrix whose columns are û1 and û2. Then
P = [3/√10, −1/√10; 1/√10, 3/√10]  and  D = P⁻¹AP = PᵀAP = [2, 0; 0, 12]
Thus, the required orthogonal change of coordinates is
[x; y] = P[s; t],  or  x = (3s − t)/√10,  y = (s + 3t)/√10
One can also express s and t in terms of x and y by using P⁻¹ = Pᵀ; that is,
s = (3x + y)/√10,  t = (−x + 3y)/√10
13.16. Prove Theorem 13.12: Let T be an orthogonal operator on a real inner product space V. Then there exists an orthonormal basis of V in which T is represented by a block diagonal matrix M of the form
M = diag(1, ..., 1, −1, ..., −1, [cos θ1, −sin θ1; sin θ1, cos θ1], ..., [cos θr, −sin θr; sin θr, cos θr])
Let S = T + T⁻¹ = T + T*. Then S* = (T + T*)* = T* + T = S. Thus, S is a symmetric operator on V. By Theorem 13.11, there exists an orthonormal basis of V consisting of eigenvectors of S. If λ1, ..., λm denote the distinct eigenvalues of S, then V can be decomposed into the direct sum V = V1 ⊕ V2 ⊕ ... ⊕ Vm, where Vi consists of the eigenvectors of S belonging to λi. We claim that each Vi is invariant under T. For suppose v ∈ Vi; then S(v) = λi v and
S(T(v)) = (T + T⁻¹)T(v) = T(T + T⁻¹)(v) = TS(v) = T(λi v) = λi T(v)
That is, T(v) ∈ Vi. Hence, Vi is invariant under T. Because the Vi are orthogonal to each other, we can restrict our investigation to the way that T acts on each individual Vi.
On a given Vi, we have (T + T⁻¹)(v) = S(v) = λi v. Multiplying by T, we get
(T² − λi T + I)(v) = 0     (1)
We consider the cases λi = ±2 and λi ≠ ±2 separately. If λi = ±2, then (T ∓ I)²(v) = 0, which leads to (T ∓ I)(v) = 0 or T(v) = ±v. Thus, T restricted to this Vi is either I or −I.
If λi ≠ ±2, then T has no eigenvectors in Vi, because, by Theorem 13.4, the only eigenvalues of T are 1 or −1. Accordingly, for v ≠ 0, the vectors v and T(v) are linearly independent. Let W be the subspace spanned by v and T(v). Then W is invariant under T, because using (1) we get
T(T(v)) = T²(v) = λi T(v) − v ∈ W
By Theorem 7.4, Vi = W ⊕ W⊥. Furthermore, by Problem 13.8, W⊥ is also invariant under T. Thus, we can decompose Vi into the direct sum of two-dimensional subspaces Wj, where the Wj are orthogonal to each other and each Wj is invariant under T. Thus, we can restrict our investigation to the way in which T acts on each individual Wj.
Because T² − λi T + I = 0, the characteristic polynomial Δ(t) of T acting on Wj is Δ(t) = t² − λi t + 1. Thus, the determinant of T restricted to Wj is 1, the constant term in Δ(t). By Theorem 2.7, the matrix A representing T acting on Wj relative to an orthonormal basis of Wj must be of the form
[cos θ, −sin θ; sin θ, cos θ]
The union of the bases of the Wj gives an orthonormal basis of Vi, and the union of the bases of the Vi gives an orthonormal basis of V in which the matrix representing T is of the desired form.
Normal Operators and Canonical Forms in Unitary Spaces
13.17. Determine which of the following matrices is normal:
(a) A = [1, i; 0, 1],  (b) B = [1, i; 1, 2 + i]
(a) AA* = [1, i; 0, 1][1, 0; −i, 1] = [2, i; −i, 1],  A*A = [1, 0; −i, 1][1, i; 0, 1] = [1, i; −i, 2]
Because AA* ≠ A*A, the matrix A is not normal.
(b) BB* = [1, i; 1, 2 + i][1, 1; −i, 2 − i] = [2, 2 + 2i; 2 − 2i, 6] = [1, 1; −i, 2 − i][1, i; 1, 2 + i] = B*B
Because BB* = B*B, the matrix B is normal.
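A direct numerical check of normality (an illustrative sketch, not from the original text) simply compares MM* with M*M:

    import numpy as np

    def is_normal(M, tol=1e-12):
        """Return True if M commutes with its conjugate transpose."""
        return np.allclose(M @ M.conj().T, M.conj().T @ M, atol=tol)

    A = np.array([[1, 1j], [0, 1]])
    B = np.array([[1, 1j], [1, 2 + 1j]])

    print(is_normal(A))   # False -- A A* != A* A
    print(is_normal(B))   # True  -- B is normal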
13.18. Let T be a normal operator. Prove the following:
(a) T(v) = 0 if and only if T*(v) = 0. (b) T − λI is normal.
(c) If T(v) = λv, then T*(v) = λ̄v; hence, any eigenvector of T is also an eigenvector of T*.
(d) If T(v) = λ1v and T(w) = λ2w where λ1 ≠ λ2, then ⟨v, w⟩ = 0; that is, eigenvectors of T belonging to distinct eigenvalues are orthogonal.
(a) We show that ⟨T(v), T(v)⟩ = ⟨T*(v), T*(v)⟩:
⟨T(v), T(v)⟩ = ⟨v, T*T(v)⟩ = ⟨v, TT*(v)⟩ = ⟨T*(v), T*(v)⟩
Hence, by [I3] in the definition of the inner product in Section 7.2, T(v) = 0 if and only if T*(v) = 0.
(b) We show that T − λI commutes with its adjoint:
(T − λI)(T − λI)* = (T − λI)(T* − λ̄I) = TT* − λ̄T − λT* + λλ̄I
= T*T − λ̄T − λT* + λ̄λI = (T* − λ̄I)(T − λI) = (T − λI)*(T − λI)
Thus, T − λI is normal.
(c) If T(v) = λv, then (T − λI)(v) = 0. Now T − λI is normal by (b); therefore, by (a), (T − λI)*(v) = 0. That is, (T* − λ̄I)(v) = 0; hence, T*(v) = λ̄v.
(d) We show that λ1⟨v, w⟩ = λ2⟨v, w⟩:
λ1⟨v, w⟩ = ⟨λ1v, w⟩ = ⟨T(v), w⟩ = ⟨v, T*(w)⟩ = ⟨v, λ̄2w⟩ = λ2⟨v, w⟩
But λ1 ≠ λ2; hence, ⟨v, w⟩ = 0.
13.19. Prove Theorem 13.13: Let T be a normal operator on a complex finite-dimensional inner product space V. Then there exists an orthonormal basis of V consisting of eigenvectors of T. (Thus, T can be represented by a diagonal matrix relative to an orthonormal basis.)
The proof is by induction on the dimension of V. If dim V = 1, then the theorem trivially holds. Now suppose dim V = n > 1. Because V is a complex vector space, T has at least one eigenvalue and hence a nonzero eigenvector v. Let W be the subspace of V spanned by v, and let u1 be a unit vector in W.
Because v is an eigenvector of T, the subspace W is invariant under T. However, v is also an eigenvector of T* by Problem 13.18; hence, W is also invariant under T*. By Problem 13.8, W⊥ is invariant under T** = T. The remainder of the proof is identical with the latter part of the proof of Theorem 13.11 (Problem 13.14).
13.20. Prove Theorem 13.14: Let T be any operator on a complex finite-dimensional inner product space V. Then T can be represented by a triangular matrix relative to an orthonormal basis of V.
The proof is by induction on the dimension of V. If dim V = 1, then the theorem trivially holds. Now suppose dim V = n > 1. Because V is a complex vector space, T has at least one eigenvalue and hence at least one nonzero eigenvector v. Let W be the subspace of V spanned by v, and let u1 be a unit vector in W. Then u1 is an eigenvector of T and, say, T(u1) = a11 u1.
By Theorem 7.4, V = W ⊕ W⊥. Let E denote the orthogonal projection of V onto W⊥. Clearly W⊥ is invariant under the operator ET. By induction, there exists an orthonormal basis {u2, ..., un} of W⊥ such that, for i = 2, ..., n,
ET(ui) = ai2 u2 + ai3 u3 + ... + aii ui
(Note that {u1, u2, ..., un} is an orthonormal basis of V.) But E is the orthogonal projection of V onto W⊥; hence, we must have
T(ui) = ai1 u1 + ai2 u2 + ... + aii ui
for i = 2, ..., n. This with T(u1) = a11 u1 gives us the desired result.
Miscellaneous Problems
13.21. Prove Theorem 13.10B: The following are equivalent:
(i) P = T² for some self-adjoint operator T.
(ii) P = S*S for some operator S; that is, P is positive.
(iii) P is self-adjoint and ⟨P(u), u⟩ ≥ 0 for every u ∈ V.
Suppose (i) holds; that is, P = T² where T = T*. Then P = TT = T*T, and so (i) implies (ii). Now suppose (ii) holds. Then P* = (S*S)* = S*S** = S*S = P, and so P is self-adjoint. Furthermore,
⟨P(u), u⟩ = ⟨S*S(u), u⟩ = ⟨S(u), S(u)⟩ ≥ 0
Thus, (ii) implies (iii), and so it remains to prove that (iii) implies (i).
Now suppose (iii) holds. Because P is self-adjoint, there exists an orthonormal basis {u1, ..., un} of V consisting of eigenvectors of P; say, P(ui) = λi ui. By Theorem 13.4, the λi are real. Using (iii), we show that the λi are nonnegative. We have, for each i,
0 ≤ ⟨P(ui), ui⟩ = ⟨λi ui, ui⟩ = λi ⟨ui, ui⟩
Thus, ⟨ui, ui⟩ > 0 forces λi ≥ 0, as claimed. Accordingly, √λi is a real number. Let T be the linear operator defined by
T(ui) = √λi ui  for i = 1, ..., n
Because T is represented by a real diagonal matrix relative to the orthonormal basis {ui}, T is self-adjoint. Moreover, for each i,
T²(ui) = T(√λi ui) = √λi T(ui) = √λi √λi ui = λi ui = P(ui)
Because T² and P agree on a basis of V, P = T². Thus, the theorem is proved.
Remark: The above operator T is the unique positive operator such that P = T²; it is called the positive square root of P.
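As a numerical companion (a sketch, not part of the text), the positive square root of a positive matrix P can be computed exactly as in the proof: diagonalize P with an orthonormal eigenbasis and take square roots of the eigenvalues.

    import numpy as np

    def positive_sqrt(P):
        """Positive square root of a positive (semi)definite Hermitian matrix P."""
        eigvals, U = np.linalg.eigh(P)            # P = U diag(eigvals) U*
        eigvals = np.clip(eigvals, 0.0, None)     # guard against tiny negative round-off
        return U @ np.diag(np.sqrt(eigvals)) @ U.conj().T

    P = np.array([[2.0, 1.0],
                  [1.0, 2.0]])                    # positive definite (eigenvalues 1 and 3)
    T = positive_sqrt(P)
    print(np.allclose(T @ T, P))                  # True: T^2 = P
    print(np.allclose(T, T.conj().T))             # True: T is self-adjoint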
13.22. Show that any operator T is the sum of a self-adjoint operator and a skew-adjoint operator.
Set S = ½(T + T*) and U = ½(T − T*). Then T = S + U, where
S* = [½(T + T*)]* = ½(T* + T**) = ½(T* + T) = S
and
U* = [½(T − T*)]* = ½(T* − T) = −½(T − T*) = −U
that is, S is self-adjoint and U is skew-adjoint.
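The decomposition is easy to verify numerically; the sketch below (illustrative only) splits a complex matrix into its self-adjoint and skew-adjoint parts.

    import numpy as np

    rng = np.random.default_rng(1)
    T = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))

    S = (T + T.conj().T) / 2        # self-adjoint part:  S* = S
    U = (T - T.conj().T) / 2        # skew-adjoint part:  U* = -U

    assert np.allclose(S, S.conj().T)
    assert np.allclose(U, -U.conj().T)
    assert np.allclose(S + U, T)    # T = S + U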
13.23. Prove: Let T be an arbitrary linear operator on a finite-dimensional inner product space V. Then
T is a product of a unitary (orthogonal) operator U and a unique positive operator P; that is,
T ¼ UP. Furthermore, if T is invertible, then U is also uniquely determined.
By Theorem 13.10, T*T is a positive operator; hence, there exists a (unique) positive operator P such that P² = T*T (Problem 13.43). Observe that
‖P(v)‖² = ⟨P(v), P(v)⟩ = ⟨P²(v), v⟩ = ⟨T*T(v), v⟩ = ⟨T(v), T(v)⟩ = ‖T(v)‖²     (1)
We now consider separately the cases when T is invertible and noninvertible.
If T is invertible, then we set Û = PT⁻¹. We show that Û is unitary:
Û* = (PT⁻¹)* = (T⁻¹)*P* = (T*)⁻¹P  and  Û*Û = (T*)⁻¹PPT⁻¹ = (T*)⁻¹T*TT⁻¹ = I
Thus, Û is unitary. We next set U = Û⁻¹. Then U is also unitary, and T = UP as required.
To prove uniqueness, we assume T = U0P0, where U0 is unitary and P0 is positive. Then
T*T = P0*U0*U0P0 = P0 I P0 = P0²
But the positive square root of T*T is unique (Problem 13.43); hence, P0 = P. (Note that the invertibility of T is not used to prove the uniqueness of P.) Now if T is invertible, then P is also invertible by (1). Multiplying U0P = UP on the right by P⁻¹ yields U0 = U. Thus, U is also unique when T is invertible.
Now suppose T is not invertible. Let W be the image of P; that is, W = Im P. We define U1: W → V by
U1(w) = T(v),  where P(v) = w     (2)
We must show that U1 is well defined; that is, that P(v) = P(v′) implies T(v) = T(v′). This follows from the fact that P(v − v′) = 0 is equivalent to ‖P(v − v′)‖ = 0, which forces ‖T(v − v′)‖ = 0 by (1). Thus, U1 is well defined. We next define U2: W⊥ → V. Note that, by (1), P and T have the same kernels. Hence, the images of P and T have the same dimension; that is, dim(Im P) = dim W = dim(Im T). Consequently, W⊥ and (Im T)⊥ also have the same dimension. We let U2 be any isomorphism between W⊥ and (Im T)⊥.
We next set U = U1 ⊕ U2. [Here U is defined as follows: If v ∈ V and v = w + w′, where w ∈ W, w′ ∈ W⊥, then U(v) = U1(w) + U2(w′).] Now U is linear (Problem 13.69), and, if v ∈ V and P(v) = w, then, by (2),
T(v) = U1(w) = U(w) = UP(v)
Thus, T = UP, as required.
It remains to show that U is unitary. Now every vector x ∈ V can be written in the form x = P(v) + w′, where w′ ∈ W⊥. Then U(x) = UP(v) + U2(w′) = T(v) + U2(w′), where ⟨T(v), U2(w′)⟩ = 0 by definition of U2. Also, ⟨T(v), T(v)⟩ = ⟨P(v), P(v)⟩ by (1). Thus,
⟨U(x), U(x)⟩ = ⟨T(v) + U2(w′), T(v) + U2(w′)⟩ = ⟨T(v), T(v)⟩ + ⟨U2(w′), U2(w′)⟩
= ⟨P(v), P(v)⟩ + ⟨w′, w′⟩ = ⟨P(v) + w′, P(v) + w′⟩ = ⟨x, x⟩
[We also used the fact that ⟨P(v), w′⟩ = 0.] Thus, U is unitary, and the theorem is proved.
13.24. Let V be the vector space of polynomials over R with inner product defined by
⟨f, g⟩ = ∫₀¹ f(t)g(t) dt
Give an example of a linear functional φ on V for which Theorem 13.3 does not hold—that is, for which there is no polynomial h(t) such that φ(f) = ⟨f, h⟩ for every f ∈ V.
Let φ: V → R be defined by φ(f) = f(0); that is, φ evaluates f(t) at 0, and hence maps f(t) into its constant term. Suppose a polynomial h(t) exists for which
φ(f) = f(0) = ∫₀¹ f(t)h(t) dt     (1)
for every polynomial f(t). Observe that φ maps the polynomial t f(t) into 0; hence, by (1),
∫₀¹ t f(t)h(t) dt = 0     (2)
for every polynomial f(t). In particular, (2) must hold for f(t) = t h(t); that is,
∫₀¹ t² h²(t) dt = 0
This integral forces h(t) to be the zero polynomial; hence, φ(f) = ⟨f, h⟩ = ⟨f, 0⟩ = 0 for every polynomial f(t). This contradicts the fact that φ is not the zero functional; hence, the polynomial h(t) does not exist.
SUPPLEMENTARY PROBLEMS
Adjoint Operators
13.25. Find the adjoint of:
(a) A = [5 − 2i, 3 + 7i; 4 − 6i, 8 + 3i],  (b) B = [3, 5i; i, 2i],  (c) C = [1, 1; 2, 3]
13.26. Let T: R³ → R³ be defined by T(x, y, z) = (x + 2y, 3x − 4z, y). Find T*(x, y, z).
13.27. Let T: C³ → C³ be defined by T(x, y, z) = (ix + (2 + 3i)y, 3x + (3 − i)z, (2 − 5i)y + iz). Find T*(x, y, z).
13.28. For each linear functional φ on V, find u ∈ V such that φ(v) = ⟨v, u⟩ for every v ∈ V:
(a) φ: R³ → R defined by φ(x, y, z) = x + 2y − 3z.
(b) φ: C³ → C defined by φ(x, y, z) = ix + (2 + 3i)y + (1 − 2i)z.
13.29. Suppose V has finite dimension. Prove that the image of T* is the orthogonal complement of the kernel of T; that is, Im T* = (Ker T)⊥. Hence, rank(T) = rank(T*).
13.30. Show that T*T = 0 implies T = 0.
13.31. Let V be the vector space of polynomials over R with inner product defined by ⟨f, g⟩ = ∫₀¹ f(t)g(t) dt. Let D be the derivative operator on V; that is, D(f) = df/dt. Show that there is no operator D* on V such that ⟨D(f), g⟩ = ⟨f, D*(g)⟩ for every f, g ∈ V. That is, D has no adjoint.
Unitary and Orthogonal Operators and Matrices
13.32. Find a unitary (orthogonal) matrix whose first row is
(a) (2/√13, 3/√13),  (b) a multiple of (1, 1 − i),  (c) a multiple of (1, i, 1 − i).
13.33. Prove that the products and inverses of orthogonal matrices are orthogonal. (Thus, the orthogonal matrices
form a group under multiplication, called the orthogonal group.)
13.34. Prove that the products and inverses of unitary matrices are unitary. (Thus, the unitary matrices form a
group under multiplication, called the unitary group.)
13.35. Show that if an orthogonal (unitary) matrix is triangular, then it is diagonal.
13.36. Recall that the complex matrices A and B are unitarily equivalent if there exists a unitary matrix P such that
B ¼ P*AP. Show that this relation is an equivalence relation.
13.37. Recall that the real matrices A and B are orthogonally equivalent if there exists an orthogonal matrix P such
that B ¼ PT
AP. Show that this relation is an equivalence relation.
13.38. Let W be a subspace of V. For any v ∈ V, let v = w + w′, where w ∈ W, w′ ∈ W⊥. (Such a sum is unique because V = W ⊕ W⊥.) Let T: V → V be defined by T(v) = w − w′. Show that T is a self-adjoint unitary operator on V.
13.39. Let V be an inner product space, and suppose U: V → V (not assumed linear) is surjective (onto) and preserves inner products; that is, ⟨U(v), U(w)⟩ = ⟨v, w⟩ for every v, w ∈ V. Prove that U is linear and hence unitary.
Positive and Positive Definite Operators
13.40. Show that the sum of two positive (positive definite) operators is positive (positive definite).
13.41. Let T be a linear operator on V and let f: V × V → K be defined by f(u, v) = ⟨T(u), v⟩. Show that f is an inner product on V if and only if T is positive definite.
13.42. Suppose E is an orthogonal projection onto some subspace W of V. Prove that kI + E is positive (positive definite) if k ≥ 0 (k > 0).
13.43. Consider the operator T defined by T(ui) = √λi ui, i = 1, ..., n, in the proof of Theorem 13.10A. Show that T is positive and that it is the only positive operator for which T² = P.
13.44. Suppose P is both positive and unitary. Prove that P ¼ I.
13.45. Determine which of the following matrices are positive (positive definite):
(i) [1, 1; 1, 1],  (ii) [0, i; −i, 0],  (iii) [0, 1; 1, 0],  (iv) [1, 1; 0, 1],  (v) [2, 1; 1, 2],  (vi) [1, 2; 2, 1]
13.46. Prove that a 2 × 2 complex matrix A = [a, b; c, d] is positive if and only if (i) A = A*, and (ii) a, d, and |A| = ad − bc are nonnegative real numbers.
13.47. Prove that a diagonal matrix A is positive (positive definite) if and only if every diagonal entry is a
nonnegative (positive) real number.
Self-adjoint and Symmetric Matrices
13.48. For any operator T, show that T þ T* is self-adjoint and T  T* is skew-adjoint.
13.49. Suppose T is self-adjoint. Show that T²(v) = 0 implies T(v) = 0. Use this to prove that Tⁿ(v) = 0 also implies T(v) = 0 for n > 0.
13.50. Let V be a complex inner product space. Suppose ⟨T(v), v⟩ is real for every v ∈ V. Show that T is self-adjoint.
13.50. Let V be a complex inner product space. Suppose hTðvÞ; vi is real for every v 2 V. Show that T is self-
adjoint.
13.51. Suppose T1 and T2 are self-adjoint. Show that T1T2 is self-adjoint if and only if T1 and T2 commute—that is, T1T2 = T2T1.
13.52. For each of the following symmetric matrices A, find an orthogonal matrix P and a diagonal matrix D such that PᵀAP is diagonal:
(a) A = [1, 2; 2, −2],  (b) A = [5, 4; 4, −1],  (c) A = [7, 3; 3, −1]
13.53. Find an orthogonal change of coordinates X = PX′ that diagonalizes each of the following quadratic forms and find the corresponding diagonal quadratic form q(x′):
(a) q(x, y) = 2x² − 6xy + 10y²,  (b) q(x, y) = x² + 8xy − 5y²,
(c) q(x, y, z) = 2x² − 4xy + 5y² + 2xz − 4yz + 2z²
Normal Operators and Matrices
13.54. Let A = [2, i; i, 2]. Verify that A is normal. Find a unitary matrix P such that P*AP is diagonal. Find P*AP.
13.55. Show that a triangular matrix is normal if and only if it is diagonal.
13.56. Prove that if T is normal on V, then kTðvÞk ¼ kT*ðvÞk for every v 2 V. Prove that the converse holds in
complex inner product spaces.
13.57. Show that self-adjoint, skew-adjoint, and unitary (orthogonal) operators are normal.
13.58. Suppose T is normal. Prove that
(a) T is self-adjoint if and only if its eigenvalues are real.
(b) T is unitary if and only if its eigenvalues have absolute value 1.
(c) T is positive if and only if its eigenvalues are nonnegative real numbers.
13.59. Show that if T is normal, then T and T* have the same kernel and the same image.
13.60. Suppose T1 and T2 are normal and commute. Show that T1 þ T2 and T1T2 are also normal.
13.61. Suppose T1 is normal and commutes with T2. Show that T1 also commutes with T2
*.
13.62. Prove the following: Let T1 and T2 be normal operators on a complex finite-dimensional vector space V.
Then there exists an orthonormal basis of V consisting of eigenvectors of both T1 and T2. (That is, T1 and
T2 can be simultaneously diagonalized.)
Isomorphism Problems for Inner Product Spaces
13.63. Let S ¼ fu1; . . . ; ung be an orthonormal basis of an inner product space V over K. Show that the mapping
v 7! ½vs is an (inner product space) isomorphism between V and Kn
. (Here ½vS denotes the coordinate
vector of v in the basis S.)
13.64. Show that inner product spaces V and W over K are isomorphic if and only if V and W have the same
dimension.
13.65. Suppose {u1, ..., un} and {u′1, ..., u′n} are orthonormal bases of V and W, respectively. Let T: V → W be the linear map defined by T(ui) = u′i for each i. Show that T is an isomorphism.
13.66. Let V be an inner product space. Recall that each u ∈ V determines a linear functional û in the dual space V* by the definition û(v) = ⟨v, u⟩ for every v ∈ V. (See the text immediately preceding Theorem 13.3.) Show that the map u ↦ û is linear and nonsingular, and hence an isomorphism from V onto V*.
Miscellaneous Problems
13.67. Suppose {u1, ..., un} is an orthonormal basis of V. Prove
(a) ⟨a1u1 + a2u2 + ... + anun, b1u1 + b2u2 + ... + bnun⟩ = a1b̄1 + a2b̄2 + ... + anb̄n
(b) Let A = [aij] be the matrix representing T: V → V in the basis {ui}. Then aij = ⟨T(ui), uj⟩.
13.68. Show that there exists an orthonormal basis {u1, ..., un} of V consisting of eigenvectors of T if and only if there exist orthogonal projections E1, ..., Er and scalars λ1, ..., λr such that
(i) T = λ1E1 + ... + λrEr,  (ii) E1 + ... + Er = I,  (iii) EiEj = 0 for i ≠ j
13.69. Suppose V = U ⊕ W and suppose T1: U → V and T2: W → V are linear. Show that T = T1 ⊕ T2 is also linear. Here T is defined as follows: If v ∈ V and v = u + w, where u ∈ U, w ∈ W, then
T(v) = T1(u) + T2(w)
ANSWERS TO SUPPLEMENTARY PROBLEMS
Notation: [R1; R2; ...; Rn] denotes a matrix with rows R1, R2, ..., Rn.
13.25. (a) [5 + 2i, 4 + 6i; 3 − 7i, 8 − 3i],  (b) [3, i; 5i, 2i],  (c) [1, 2; 1, 3]
13.26. T*(x, y, z) = (x + 3y, 2x + z, −4y)
13.27. T*(x, y, z) = (−ix + 3y, (2 − 3i)x + (2 + 5i)z, (3 + i)y − iz)
13.28. (a) u = (1, 2, −3),  (b) u = (−i, 2 − 3i, 1 + 2i)
13.32. (a) (1/√13)[2, 3; 3, −2],  (b) (1/√3)[1, 1 − i; 1 + i, −1],
(c) ½[1, i, 1 − i; √2 i, √2, 0; 1, i, −1 + i]
13.45. Only (i) and (v) are positive. Only (v) is positive definite.
13.52. (a and b) P = (1/√5)[2, 1; 1, −2],  (c) P = (1/√10)[3, 1; 1, −3];
(a) D = [2, 0; 0, −3],  (b) D = [7, 0; 0, −3],  (c) D = [8, 0; 0, −2]
13.53. (a) x = (3x′ − y′)/√10, y = (x′ + 3y′)/√10;  (b) x = (2x′ − y′)/√5, y = (x′ + 2y′)/√5;
(c) x = x′/√3 + y′/√2 + z′/√6, y = x′/√3 − 2z′/√6, z = x′/√3 − y′/√2 + z′/√6;
(a) q(x′) = diag(1, 11),  (b) q(x′) = diag(3, −7),  (c) q(x′) = diag(1, 1, 7)
13.54. P = (1/√2)[1, 1; 1, −1];  P*AP = diag(2 + i, 2 − i)
APPENDIX A
Multilinear Products
A.1 Introduction
The material in this appendix is much more abstract than that which has previously appeared. Accordingly,
many of the proofs will be omitted. Also, we motivate the material with the following observation.
Let S be a basis of a vector space V. Theorem 5.2 may be restated as follows.
THEOREM 5.2: Let g: S → V be the inclusion map of the basis S into V. Then, for any vector space U and any mapping f: S → U, there exists a unique linear mapping f*: V → U such that f = f* ∘ g.
Another way to state the fact that f = f* ∘ g is that the diagram in Fig. A-1(a) commutes.
A.2 Bilinear Mappings and Tensor Products
Let U, V, W be vector spaces over a field K. Consider a map
f: V × W → U
Then f is said to be bilinear if, for each v ∈ V, the map fv: W → U defined by fv(w) = f(v, w) is linear, and, for each w ∈ W, the map fw: V → U defined by fw(v) = f(v, w) is linear.
That is, f is linear in each of its two variables. Note that f is similar to a bilinear form except that the values of the map f are in a vector space U rather than the field K.
DEFINITION A.1: Let V and W be vector spaces over the same field K. The tensor product of V and W is a vector space T over K together with a bilinear map g: V × W → T, denoted by g(v, w) = v ⊗ w, with the following property: (*) For any vector space U over K and any bilinear map f: V × W → U there exists a unique linear map f*: T → U such that f* ∘ g = f.
The tensor product (T, g) [or simply T when g is understood] of V and W is denoted by V ⊗ W, and the element v ⊗ w is called the tensor of v and w.
Another way to state condition (*) is that the diagram in Fig. A-1(b) commutes. The fact that such a unique linear map f* exists is called the ‘‘Universal Mapping Principle’’ (UMP). As illustrated in Fig. A-1(b), condition (*) also says that any bilinear map f: V × W → U ‘‘factors through’’ the tensor product T = V ⊗ W. The uniqueness in (*) implies that the image of g spans T; that is, span({v ⊗ w}) = T.
Figure A-1
THEOREM A.1: (Uniqueness of Tensor Products) Let (T, g) and (T′, g′) be tensor products of V and W. Then there exists a unique isomorphism h: T → T′ such that hg = g′.
Proof. Because T is a tensor product and g′: V × W → T′ is bilinear, there exists a unique linear map h: T → T′ such that hg = g′. Similarly, because T′ is a tensor product and g: V × W → T is bilinear, there exists a unique linear map h′: T′ → T such that h′g′ = g. Using hg = g′, we get h′hg = g. Also, because T is a tensor product and g: V × W → T is bilinear, there exists a unique linear map h*: T → T such that h*g = g. But 1_T g = g; thus, by this uniqueness, h′h = h* = 1_T. Similarly, hh′ = 1_{T′}. Therefore, h is an isomorphism from T to T′.
THEOREM A.2: (Existence of Tensor Products) The tensor product T = V ⊗ W of vector spaces V and W over K exists. Let {v1, ..., vm} be a basis of V and let {w1, ..., wn} be a basis of W. Then the mn vectors
vi ⊗ wj  (i = 1, ..., m; j = 1, ..., n)
form a basis of T. Thus, dim T = mn = (dim V)(dim W).
Outline of Proof. Suppose {v1, ..., vm} is a basis of V, and suppose {w1, ..., wn} is a basis of W. Consider the mn symbols {tij | i = 1, ..., m; j = 1, ..., n}. Let T be the vector space generated by the tij. That is, T consists of all linear combinations of the tij with coefficients in K. [See Problem 4.137.]
Let v ∈ V and w ∈ W. Say
v = a1v1 + a2v2 + ... + amvm  and  w = b1w1 + b2w2 + ... + bnwn
Let g: V × W → T be defined by
g(v, w) = Σi Σj ai bj tij
Then g is bilinear. [Proof left to reader.]
Now let f: V × W → U be bilinear. Because the tij form a basis of T, Theorem 5.2 (stated above) tells us that there exists a unique linear map f*: T → U such that f*(tij) = f(vi, wj). Then, for v = Σi ai vi and w = Σj bj wj, we have
f(v, w) = f(Σi ai vi, Σj bj wj) = Σi Σj ai bj f(vi, wj) = Σi Σj ai bj f*(tij) = f*(g(v, w))
Therefore, f = f* ∘ g, where f* is the required map in Definition A.1. Thus, T is a tensor product.
Let {v′1, ..., v′m} be any basis of V and {w′1, ..., w′n} be any basis of W. Let v ∈ V and w ∈ W and say
v = a′1v′1 + ... + a′mv′m  and  w = b′1w′1 + ... + b′nw′n
Then
v ⊗ w = g(v, w) = Σi Σj a′i b′j g(v′i, w′j) = Σi Σj a′i b′j (v′i ⊗ w′j)
Thus, the elements v′i ⊗ w′j span T. There are mn such elements. They cannot be linearly dependent because {tij} is a basis of T, and hence, dim T = mn. Thus, the v′i ⊗ w′j form a basis of T.
Next we give two concrete examples of tensor products.
EXAMPLE A.1 Let V be the vector space of polynomials P_{r−1}(x) and let W be the vector space of polynomials P_{s−1}(y). Then the following form bases of V and W, respectively:
1, x, x², ..., x^{r−1}  and  1, y, y², ..., y^{s−1}
In particular, dim V = r and dim W = s. Let T be the vector space of polynomials in the variables x and y with basis
{x^i y^j},  where i = 0, 1, ..., r − 1 and j = 0, 1, ..., s − 1
Then T is the tensor product V ⊗ W under the mapping
x^i ⊗ y^j = x^i y^j
For example, suppose v = 2 − 5x + 3x³ and w = 7y + 4y². Then
v ⊗ w = 14y + 8y² − 35xy − 20xy² + 21x³y + 12x³y²
Note, dim T = rs = (dim V)(dim W).
EXAMPLE A.2
Let V be the vector space of m × n matrices over a field K and let W be the vector space of p × q matrices over K. Suppose A = [aij] belongs to V and B belongs to W. Let T be the vector space of mp × nq matrices over K. Then T is the tensor product of V and W, where A ⊗ B is the block matrix
A ⊗ B = [aij B] = [a11B, a12B, ..., a1nB; a21B, a22B, ..., a2nB; ...; am1B, am2B, ..., amnB]
For example, suppose A = [1, 2; 3, 4] and B = [1, 2, 3; 4, 5, 6]. Then
A ⊗ B = [1, 2, 3, 2, 4, 6; 4, 5, 6, 8, 10, 12; 3, 6, 9, 4, 8, 12; 12, 15, 18, 16, 20, 24]
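This block construction is exactly the Kronecker product, available in NumPy as np.kron; the sketch below (illustrative only) reproduces the example.

    import numpy as np

    A = np.array([[1, 2],
                  [3, 4]])
    B = np.array([[1, 2, 3],
                  [4, 5, 6]])

    # Tensor (Kronecker) product: a 4 x 6 block matrix [a_ij * B]
    print(np.kron(A, B))

    # Dimensions multiply: (2*2) x (2*3) = 4 x 6, i.e., dim(V (x) W) = dim V * dim W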
Isomorphisms of Tensor Products
First we note that tensoring is associative in a canonical way. Namely,
THEOREM A.3: Let U, V, W be vector spaces over a field K. Then there exists a unique isomorphism
(U ⊗ V) ⊗ W → U ⊗ (V ⊗ W)
such that, for every u ∈ U, v ∈ V, w ∈ W,
(u ⊗ v) ⊗ w ↦ u ⊗ (v ⊗ w)
Accordingly, we may omit parentheses when tensoring any number of factors. Specifically, given vector spaces V1, V2, ..., Vm over a field K, we may unambiguously form their tensor product
V1 ⊗ V2 ⊗ ... ⊗ Vm
and, for vectors vj in Vj, we may unambiguously form the tensor product
v1 ⊗ v2 ⊗ ... ⊗ vm
Moreover, given a vector space V over K, we may unambiguously define the following tensor product:
⊗^r V = V ⊗ V ⊗ ... ⊗ V  (r factors)
Also, there is a canonical isomorphism
(⊗^r V) ⊗ (⊗^s V) → ⊗^{r+s} V
Furthermore, viewing K as a vector space over itself, we have the canonical isomorphism
K ⊗ V → V
where we define a ⊗ v = av.
A.3 Alternating Multilinear Maps
Let f: V^r → U, where V and U are vector spaces over K. [Recall V^r = V × V × ... × V, r factors.]
(1) The mapping f is said to be multilinear or r-linear if f(v1, ..., vr) is linear as a function of each vj when the other vi's are held fixed. That is,
f(..., vj + v′j, ...) = f(..., vj, ...) + f(..., v′j, ...)
f(..., kvj, ...) = k f(..., vj, ...)
where only the jth position changes.
(2) The mapping f is said to be alternating if
f(v1, ..., vr) = 0  whenever vi = vj with i ≠ j
One can easily show (Prove!) that if f is an alternating multilinear mapping on V^r, then
f(..., vi, ..., vj, ...) = −f(..., vj, ..., vi, ...)
That is, if two of the vectors are interchanged, then the associated value changes sign.
EXAMPLE A.3 (Determinants)
The determinant function D: M → K on the space M of n × n matrices may be viewed as an n-variable function
D(A) = D(R1, R2, ..., Rn)
defined on the rows R1, R2, ..., Rn of A. Recall (Chapter 8) that, in this context, D is both n-linear and alternating.
We now need some additional notation. Let K = [k1, k2, ..., kr] denote an r-list (r-tuple) of elements from In = {1, 2, ..., n}. We will then use the following notation, where the vk's denote vectors and the aik's denote scalars:
vK = (v_{k1}, v_{k2}, ..., v_{kr})  and  aK = a_{1k1} a_{2k2} ... a_{rkr}
Note vK is a list of r vectors, and aK is a product of r scalars.
Now suppose the elements in K = [k1, k2, ..., kr] are distinct. Then K is a permutation σK of an r-list J = [i1, i2, ..., ir] in standard form, that is, where i1 < i2 < ... < ir. The number of such standard-form r-lists J from In is the binomial coefficient
C(n, r) = n! / [r!(n − r)!]
[Recall sign(σK) = (−1)^{mK}, where mK is the number of interchanges that transforms K into J.]
Now suppose A = [aij] is an r × n matrix. For a given ordered r-list J, we define
DJ(A) = |a_{1i1}, a_{1i2}, ..., a_{1ir}; a_{2i1}, a_{2i2}, ..., a_{2ir}; ...; a_{ri1}, a_{ri2}, ..., a_{rir}|
That is, DJ(A) is the determinant of the r × r submatrix of A whose column subscripts belong to J.
Our main theorem below uses the following ‘‘shuffling’’ lemma.
LEMMA A.4: Let V and U be vector spaces over K, and let f: V^r → U be an alternating r-linear mapping. Let v1, v2, ..., vn be vectors in V and let A = [aij] be an r × n matrix over K, where r ≤ n. For i = 1, 2, ..., r, let
ui = ai1 v1 + ai2 v2 + ... + ain vn
Then
f(u1, ..., ur) = ΣJ DJ(A) f(v_{i1}, v_{i2}, ..., v_{ir})
where the sum is over all standard-form r-lists J = [i1, i2, ..., ir].
The proof is technical but straightforward. The linearity of f gives us the sum
f(u1, ..., ur) = ΣK aK f(vK)
where the sum is over all r-lists K from {1, ..., n}. The alternating property of f tells us that f(vK) = 0 when K does not contain distinct integers. The proof now mainly uses the fact that as we interchange the vectors to transform
f(vK) = f(v_{k1}, v_{k2}, ..., v_{kr})  into  f(vJ) = f(v_{i1}, v_{i2}, ..., v_{ir})
with i1 < ... < ir, the sign associated with f(vK) changes in the same way as the sign of the corresponding permutation σK changes when it is transformed to the identity permutation using transpositions.
We illustrate the lemma below for r = 2 and n = 3.
EXAMPLE A.4 Suppose f: V² → U is an alternating multilinear function. Let v1, v2, v3 ∈ V and let u, w ∈ V. Suppose
u = a1v1 + a2v2 + a3v3  and  w = b1v1 + b2v2 + b3v3
Consider
f(u, w) = f(a1v1 + a2v2 + a3v3, b1v1 + b2v2 + b3v3)
Using multilinearity, we get nine terms:
f(u, w) = a1b1 f(v1, v1) + a1b2 f(v1, v2) + a1b3 f(v1, v3)
+ a2b1 f(v2, v1) + a2b2 f(v2, v2) + a2b3 f(v2, v3)
+ a3b1 f(v3, v1) + a3b2 f(v3, v2) + a3b3 f(v3, v3)
(Note that J = [1, 2], J′ = [1, 3], and J″ = [2, 3] are the three standard-form 2-lists of In = [1, 2, 3].) The alternating property of f tells us that each f(vi, vi) = 0; hence, three of the above nine terms are equal to 0. The alternating property also tells us that f(vi, vj) = −f(vj, vi). Thus, three of the terms can be transformed so their subscripts form a standard-form 2-list by a single interchange. Finally we obtain
f(u, w) = (a1b2 − a2b1) f(v1, v2) + (a1b3 − a3b1) f(v1, v3) + (a2b3 − a3b2) f(v2, v3)
= |a1, a2; b1, b2| f(v1, v2) + |a1, a3; b1, b3| f(v1, v3) + |a2, a3; b2, b3| f(v2, v3)
which is the content of Lemma A.4.
A.4 Exterior Products
The following definition applies.
DEFINITION A.2: Let V be an n-dimensional vector space over a field K, and let r be an integer such that 1 ≤ r ≤ n. The r-fold exterior product (or simply exterior product when r is understood) is a vector space E over K together with an alternating r-linear mapping g: V^r → E, denoted by g(v1, ..., vr) = v1 ∧ ... ∧ vr, with the following property: (*) For any vector space U over K and any alternating r-linear map f: V^r → U there exists a unique linear map f*: E → U such that f* ∘ g = f.
The r-fold exterior product (E, g) (or simply E when g is understood) of V is denoted by ∧^r V, and the element v1 ∧ ... ∧ vr is called the exterior product or wedge product of the vi's.
Another way to state condition (*) is that the diagram in Fig. A-1(c) commutes. Again, the fact that such a unique linear map f* exists is called the ‘‘Universal Mapping Principle (UMP)’’. As illustrated in Fig. A-1(c), condition (*) also says that any alternating r-linear map f: V^r → U ‘‘factors through’’ the exterior product E = ∧^r V. Again, the uniqueness in (*) implies that the image of g spans E; that is, span({v1 ∧ ... ∧ vr}) = E.
THEOREM A.5: (Uniqueness of Exterior Products) Let (E, g) and (E′, g′) be r-fold exterior products of V. Then there exists a unique isomorphism h: E → E′ such that hg = g′.
The proof is the same as the proof of Theorem A.1, which uses the UMP.
THEOREM A.6: (Existence of Exterior Products) Let V be an n-dimensional vector space over K. Then the exterior product E = ∧^r V exists. If r > n, then E = {0}. If r ≤ n, then dim E = C(n, r). Moreover, if {v1, ..., vn} is a basis of V, then the vectors
v_{i1} ∧ v_{i2} ∧ ... ∧ v_{ir},  where 1 ≤ i1 < i2 < ... < ir ≤ n,
form a basis of E.
We give a concrete example of an exterior product.
EXAMPLE A.5 (Cross Product)
Consider V = R³ with the usual basis {i, j, k}. Let E = ∧² V. Note dim V = 3; thus, dim E = 3 with basis i ∧ j, i ∧ k, j ∧ k. We identify E with R³ under the correspondence
i = j ∧ k,  j = k ∧ i = −i ∧ k,  k = i ∧ j
Let u and w be arbitrary vectors in V = R³, say
u = (a1, a2, a3) = a1i + a2j + a3k  and  w = (b1, b2, b3) = b1i + b2j + b3k
Then, as in Example A.4,
u ∧ w = (a1b2 − a2b1)(i ∧ j) + (a1b3 − a3b1)(i ∧ k) + (a2b3 − a3b2)(j ∧ k)
Using the above identification, we get
u ∧ w = (a2b3 − a3b2)i − (a1b3 − a3b1)j + (a1b2 − a2b1)k
= |a2, a3; b2, b3| i − |a1, a3; b1, b3| j + |a1, a2; b1, b2| k
The reader may recognize that the above exterior product is precisely the well-known cross product in R³.
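Numerically, the identification of ∧²R³ with R³ is the familiar cross product; the sketch below (illustrative only) checks the coordinate formula against NumPy's built-in np.cross.

    import numpy as np

    def wedge_r3(u, w):
        """Coordinates of u ^ w in the basis (j^k, k^i, i^j), i.e., the cross product."""
        a1, a2, a3 = u
        b1, b2, b3 = w
        return np.array([a2*b3 - a3*b2,      # coefficient of i = j^k
                         -(a1*b3 - a3*b1),   # coefficient of j = k^i
                         a1*b2 - a2*b1])     # coefficient of k = i^j

    u = np.array([1.0, 2.0, 3.0])
    w = np.array([4.0, 5.0, 6.0])
    print(wedge_r3(u, w))          # [-3.  6. -3.]
    print(np.cross(u, w))          # same result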
Our last theorem tells us that we are actually able to ‘‘multiply’’ exterior products, which allows us to
form an ‘‘exterior algebra’’ that is illustrated below.
THEOREM A.7: Let V be a vector space over K. Let r and s be positive integers. Then there is a unique bilinear mapping
(∧^r V) × (∧^s V) → ∧^{r+s} V
such that, for any vectors ui, wj in V,
(u1 ∧ ... ∧ ur, w1 ∧ ... ∧ ws) ↦ u1 ∧ ... ∧ ur ∧ w1 ∧ ... ∧ ws
EXAMPLE A.6
We form an exterior algebra A over a field K using noncommuting variables x, y, z. Because it is an exterior algebra, our variables satisfy
x ∧ x = 0,  y ∧ y = 0,  z ∧ z = 0,  and  y ∧ x = −x ∧ y,  z ∧ x = −x ∧ z,  z ∧ y = −y ∧ z
Every element of A is a linear combination of the eight elements
1, x, y, z, x ∧ y, x ∧ z, y ∧ z, x ∧ y ∧ z
We multiply two ‘‘polynomials’’ in A using the usual distributive law, but now we also use the above conditions. For example,
[3 + 4y − 5x ∧ y + 6x ∧ z] ∧ [5x − 2y] = 15x − 6y − 20x ∧ y + 12x ∧ y ∧ z
Observe we use the fact that
[4y] ∧ [5x] = 20 y ∧ x = −20 x ∧ y  and  [6x ∧ z] ∧ [−2y] = −12 x ∧ z ∧ y = 12 x ∧ y ∧ z
APPENDIX B
Algebraic Structures
B.1 Introduction
We define here algebraic structures that occur in almost all branches of mathematics. In particular, we
will define a field that appears in the definition of a vector space. We begin with the definition of a group,
which is a relatively simple algebraic structure with only one operation and is used as a building block for
many other algebraic systems.
B.2 Groups
Let G be a nonempty set with a binary operation; that is, to each pair of elements a, b ∈ G there is assigned an element ab ∈ G. Then G is called a group if the following axioms hold:
[G1] For any a, b, c ∈ G, we have (ab)c = a(bc) (the associative law).
[G2] There exists an element e ∈ G, called the identity element, such that ae = ea = a for every a ∈ G.
[G3] For each a ∈ G there exists an element a⁻¹ ∈ G, called the inverse of a, such that aa⁻¹ = a⁻¹a = e.
A group G is said to be abelian (or commutative) if the commutative law holds—that is, if ab = ba for every a, b ∈ G.
When the binary operation is denoted by juxtaposition as above, the group G is said to be written multiplicatively. Sometimes, when G is abelian, the binary operation is denoted by + and G is said to be written additively. In such a case, the identity element is denoted by 0 and is called the zero element; the inverse of a is denoted by −a and is called the negative of a.
If A and B are subsets of a group G, then we write
AB = {ab | a ∈ A, b ∈ B}  or  A + B = {a + b | a ∈ A, b ∈ B}
We also write a for {a}.
A subset H of a group G is called a subgroup of G if H itself forms a group under the operation of G. If H is a subgroup of G and a ∈ G, then the set Ha is called a right coset of H and the set aH is called a left coset of H.
DEFINITION: A subgroup H of G is called a normal subgroup if a⁻¹Ha ⊆ H for every a ∈ G. Equivalently, H is normal if aH = Ha for every a ∈ G—that is, if the right and left cosets of H coincide.
Note that every subgroup of an abelian group is normal.
THEOREM B.1: Let H be a normal subgroup of G. Then the cosets of H in G form a group under coset multiplication. This group is called the quotient group and is denoted by G/H.
EXAMPLE B.1 The set Z of integers forms an abelian group under addition. (We remark that the even integers form a subgroup of Z but the odd integers do not.) Let H denote the set of multiples of 5; that is, H = {..., −10, −5, 0, 5, 10, ...}. Then H is a subgroup (necessarily normal) of Z. The cosets of H in Z follow:
0̄ = 0 + H = H = {..., −10, −5, 0, 5, 10, ...}
1̄ = 1 + H = {..., −9, −4, 1, 6, 11, ...}
2̄ = 2 + H = {..., −8, −3, 2, 7, 12, ...}
3̄ = 3 + H = {..., −7, −2, 3, 8, 13, ...}
4̄ = 4 + H = {..., −6, −1, 4, 9, 14, ...}
For any other integer n ∈ Z, n̄ = n + H coincides with one of the above cosets. Thus, by the above theorem, Z/H = {0̄, 1̄, 2̄, 3̄, 4̄} forms a group under coset addition; its addition table follows:
+  | 0̄  1̄  2̄  3̄  4̄
0̄  | 0̄  1̄  2̄  3̄  4̄
1̄  | 1̄  2̄  3̄  4̄  0̄
2̄  | 2̄  3̄  4̄  0̄  1̄
3̄  | 3̄  4̄  0̄  1̄  2̄
4̄  | 4̄  0̄  1̄  2̄  3̄
This quotient group Z/H is referred to as the integers modulo 5 and is frequently denoted by Z5. Analogously, for any positive integer n, there exists the quotient group Zn, called the integers modulo n.
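A tiny computational sketch (not part of the text) reproduces this addition table by working with coset representatives modulo 5; the function name is illustrative.

    def coset_add(a, b, n=5):
        """Add the cosets a + H and b + H in Z/H, where H = nZ."""
        return (a + b) % n

    # Print the addition table of Z5 using representatives 0, 1, 2, 3, 4
    reps = range(5)
    for a in reps:
        print([coset_add(a, b) for b in reps])
    # Row a lists a+0, a+1, ..., a+4 modulo 5, matching the table above.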
EXAMPLE B.2 The permutations of n symbols (see page 267) form a group under composition of mappings; it is called the symmetric group of degree n and is denoted by Sn. We investigate S3 here; its elements are
ε = (1 2 3; 1 2 3),  σ1 = (1 2 3; 1 3 2),  σ2 = (1 2 3; 3 2 1),  σ3 = (1 2 3; 2 1 3),  φ1 = (1 2 3; 2 3 1),  φ2 = (1 2 3; 3 1 2)
Here (1 2 3; i j k) denotes the permutation that maps 1 ↦ i, 2 ↦ j, 3 ↦ k. The multiplication table of S3 is
      ε   σ1  σ2  σ3  φ1  φ2
ε   | ε   σ1  σ2  σ3  φ1  φ2
σ1  | σ1  ε   φ1  φ2  σ2  σ3
σ2  | σ2  φ2  ε   φ1  σ3  σ1
σ3  | σ3  φ1  φ2  ε   σ1  σ2
φ1  | φ1  σ3  σ1  σ2  φ2  ε
φ2  | φ2  σ2  σ3  σ1  ε   φ1
(The element in the ath row and bth column is ab.) The set H = {ε, σ1} is a subgroup of S3; its right and left cosets are
Right cosets: H = {ε, σ1},  Hφ1 = {φ1, σ2},  Hφ2 = {φ2, σ3}
Left cosets:  H = {ε, σ1},  φ1H = {φ1, σ3},  φ2H = {φ2, σ2}
Observe that the right cosets and the left cosets are distinct; hence, H is not a normal subgroup of S3.
A mapping f from a group G into a group G′ is called a homomorphism if f(ab) = f(a)f(b) for every a, b ∈ G. (If f is also bijective, i.e., one-to-one and onto, then f is called an isomorphism and G and G′ are said to be isomorphic.) If f: G → G′ is a homomorphism, then the kernel of f is the set of elements of G that map into the identity element e′ ∈ G′:
kernel of f = {a ∈ G | f(a) = e′}
(As usual, f(G) is called the image of the mapping f: G → G′.) The following theorem applies.
THEOREM B.2: Let f: G → G′ be a homomorphism with kernel K. Then K is a normal subgroup of G, and the quotient group G/K is isomorphic to the image of f.
EXAMPLE B.3 Let G be the group of real numbers under addition, and let G′ be the group of positive real numbers under multiplication. The mapping f: G → G′ defined by f(a) = 2^a is a homomorphism, because
f(a + b) = 2^{a+b} = 2^a 2^b = f(a)f(b)
In particular, f is bijective; hence, G and G′ are isomorphic.
EXAMPLE B.4 Let G be the group of nonzero complex numbers under multiplication, and let G′ be the group of nonzero real numbers under multiplication. The mapping f: G → G′ defined by f(z) = |z| is a homomorphism, because
f(z1z2) = |z1z2| = |z1||z2| = f(z1)f(z2)
The kernel K of f consists of those complex numbers z on the unit circle—that is, for which |z| = 1. Thus, G/K is isomorphic to the image of f—that is, to the group of positive real numbers under multiplication.
B.3 Rings, Integral Domains, and Fields
Let R be a nonempty set with two binary operations, an operation of addition (denoted by +) and an operation of multiplication (denoted by juxtaposition). Then R is called a ring if the following axioms are satisfied:
[R1] For any a, b, c ∈ R, we have (a + b) + c = a + (b + c).
[R2] There exists an element 0 ∈ R, called the zero element, such that a + 0 = 0 + a = a for every a ∈ R.
[R3] For each a ∈ R there exists an element −a ∈ R, called the negative of a, such that a + (−a) = (−a) + a = 0.
[R4] For any a, b ∈ R, we have a + b = b + a.
[R5] For any a, b, c ∈ R, we have (ab)c = a(bc).
[R6] For any a, b, c ∈ R, we have
(i) a(b + c) = ab + ac, and (ii) (b + c)a = ba + ca.
Observe that the axioms [R1] through [R4] may be summarized by saying that R is an abelian group under addition.
Subtraction is defined in R by a − b ≡ a + (−b).
It can be shown (see Problem B.25) that a · 0 = 0 · a = 0 for every a ∈ R.
R is called a commutative ring if ab = ba for every a, b ∈ R. We also say that R is a ring with a unit element if there exists a nonzero element 1 ∈ R such that a · 1 = 1 · a = a for every a ∈ R.
A nonempty subset S of R is called a subring of R if S forms a ring under the operations of R. We note that S is a subring of R if and only if a, b ∈ S implies a − b ∈ S and ab ∈ S.
A nonempty subset I of R is called a left ideal in R if (i) a − b ∈ I whenever a, b ∈ I, and (ii) ra ∈ I whenever r ∈ R, a ∈ I. Note that a left ideal I in R is also a subring of R. Similarly, we can define a right ideal and a two-sided ideal. Clearly all ideals in commutative rings are two-sided. The term ideal shall mean two-sided ideal unless otherwise specified.
THEOREM B.3: Let I be a (two-sided) ideal in a ring R. Then the cosets {a + I | a ∈ R} form a ring under coset addition and coset multiplication. This ring is denoted by R/I and is called the quotient ring.
Now let R be a commutative ring with a unit element. For any a ∈ R, the set (a) = {ra | r ∈ R} is an ideal; it is called the principal ideal generated by a. If every ideal in R is a principal ideal, then R is called a principal ideal ring.
DEFINITION: A commutative ring R with a unit element is called an integral domain if R has no zero divisors—that is, if ab = 0 implies a = 0 or b = 0.
DEFINITION: A commutative ring R with a unit element is called a field if every nonzero a ∈ R has a multiplicative inverse; that is, there exists an element a⁻¹ ∈ R such that aa⁻¹ = a⁻¹a = 1.
A field is necessarily an integral domain; for if ab = 0 and a ≠ 0, then
b = 1 · b = a⁻¹ab = a⁻¹ · 0 = 0
We remark that a field may also be viewed as a commutative ring in which the nonzero elements form a group under multiplication.
EXAMPLE B.5 The set Z of integers with the usual operations of addition and multiplication is the classical example of an integral domain with a unit element. Every ideal I in Z is a principal ideal; that is, I = (n) for some integer n. The quotient ring Zn = Z/(n) is called the ring of integers modulo n. If n is prime, then Zn is a field. On the other hand, if n is not prime, then Zn has zero divisors. For example, in the ring Z6, 2̄ · 3̄ = 0̄, yet 2̄ ≠ 0̄ and 3̄ ≠ 0̄.
EXAMPLE B.6 The rational numbers Q and the real numbers R each form a field with respect to the usual operations of addition and multiplication.
EXAMPLE B.7 Let C denote the set of ordered pairs of real numbers with addition and multiplication defined by
(a, b) + (c, d) = (a + c, b + d)
(a, b) · (c, d) = (ac − bd, ad + bc)
Then C satisfies all the required properties of a field. In fact, C is just the field of complex numbers (see page 4).
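These two rules are easy to encode directly; the short sketch below (an illustration, not from the text) implements the pair arithmetic and checks it against Python's built-in complex numbers.

    def pair_add(p, q):
        """(a, b) + (c, d) = (a + c, b + d)"""
        return (p[0] + q[0], p[1] + q[1])

    def pair_mul(p, q):
        """(a, b) * (c, d) = (ac - bd, ad + bc)"""
        a, b = p
        c, d = q
        return (a*c - b*d, a*d + b*c)

    p, q = (1.0, 2.0), (3.0, -4.0)            # represents 1 + 2i and 3 - 4i
    print(pair_add(p, q))                      # (4.0, -2.0)
    print(pair_mul(p, q))                      # (11.0, 2.0)
    print((1 + 2j) * (3 - 4j))                 # (11+2j), agreeing with pair_mul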
EXAMPLE B.8 The set M of all 2 × 2 matrices with real entries forms a noncommutative ring with zero divisors under the operations of matrix addition and matrix multiplication.
EXAMPLE B.9 Let R be any ring. Then the set R[x] of all polynomials over R forms a ring with respect to the usual operations of addition and multiplication of polynomials. Moreover, if R is an integral domain, then R[x] is also an integral domain.
Now let D be an integral domain. We say that b divides a in D if a ¼ bc for some c 2 D. An element
u 2 D is called a unit if u divides 1—that is, if u has a multiplicative inverse. An element b 2 D is called
an associate of a 2 D if b ¼ ua for some unit u 2 D. A nonunit p 2 D is said to be irreducible if p ¼ ab
implies a or b is a unit.
An integral domain D is called a unique factorization domain if every nonunit a 2 D can be written
uniquely (up to associates and order) as a product of irreducible elements.
EXAMPLE B.10 The ring Z of integers is the classical example of a unique factorization domain. The units of Z are 1 and −1. The only associates of n ∈ Z are n and −n. The irreducible elements of Z are the prime numbers.
EXAMPLE B.11 The set D = {a + b√13 | a, b integers} is an integral domain. The elements ±1, 18 ± 5√13, and −18 ± 5√13 are units of D. The elements 2, 3 + √13, and −3 + √13 are irreducible in D. Observe that
4 = 2 · 2 = (3 + √13)(−3 + √13)
Thus, D is not a unique factorization domain. (See Problem B.40.)
406 Appendix B Algebraic Structures
B.4 Modules
Let M be an additive abelian group and let R be a ring with a unit element. Then M is said to be a (left) R-module if there exists a mapping R × M → M that satisfies the following axioms:
[M1] r(m1 + m2) = rm1 + rm2
[M2] (r + s)m = rm + sm
[M3] (rs)m = r(sm)
[M4] 1 · m = m
for any r, s ∈ R and any mi ∈ M.
We emphasize that an R-module is a generalization of a vector space where we allow the scalars to come from a ring rather than a field.
EXAMPLE B.12 Let G be any additive abelian group. We make G into a module over the ring Z of integers by defining
ng = g + g + ... + g (n times),  0g = 0,  (−n)g = −(ng)
where n is any positive integer.
EXAMPLE B.13 Let R be a ring and let I be an ideal in R. Then I may be viewed as a module over R.
EXAMPLE B.14 Let V be a vector space over a field K and let T: V → V be a linear mapping. We make V into a module over the ring K[x] of polynomials over K by defining f(x)v = f(T)(v). The reader should check that a scalar multiplication has been defined.
Let M be a module over R. An additive subgroup N of M is called a submodule of M if u ∈ N and k ∈ R imply ku ∈ N. (Note that N is then a module over R.)
Let M and M′ be R-modules. A mapping T: M → M′ is called a homomorphism (or R-homomorphism or R-linear map) if
(i) T(u + v) = T(u) + T(v)  and  (ii) T(ku) = kT(u)
for every u, v ∈ M and every k ∈ R.
PROBLEMS
Groups
B.1. Determine whether each of the following systems forms a group G:
(i) G = set of integers, operation subtraction;
(ii) G = {1, −1}, operation multiplication;
(iii) G = set of nonzero rational numbers, operation division;
(iv) G = set of nonsingular n × n matrices, operation matrix multiplication;
(v) G = {a + bi : a, b ∈ Z}, operation addition.
B.2. Show that in a group G:
(i) the identity element of G is unique;
(ii) each a ∈ G has a unique inverse a⁻¹ ∈ G;
(iii) (a⁻¹)⁻¹ = a, and (ab)⁻¹ = b⁻¹a⁻¹;
(iv) ab = ac implies b = c, and ba = ca implies b = c.
B.3. In a group G, the powers of a ∈ G are defined by
a⁰ = e,  aⁿ = a aⁿ⁻¹,  a⁻ⁿ = (aⁿ)⁻¹,  where n ∈ N
Show that the following formulas hold for any integers r, s, t ∈ Z: (i) a^r a^s = a^{r+s}, (ii) (a^r)^s = a^{rs}, (iii) (a^{r+s})^t = a^{rt+st}.
B.4. Show that if G is an abelian group, then (ab)ⁿ = aⁿbⁿ for any a, b ∈ G and any integer n ∈ Z.
B.5. Suppose G is a group such that (ab)² = a²b² for every a, b ∈ G. Show that G is abelian.
B.6. Suppose H is a subset of a group G. Show that H is a subgroup of G if and only if (i) H is nonempty, and (ii) a, b ∈ H implies ab⁻¹ ∈ H.
B.7. Prove that the intersection of any number of subgroups of G is also a subgroup of G.
B.8. Show that the set of all powers of a ∈ G is a subgroup of G; it is called the cyclic group generated by a.
B.9. A group G is said to be cyclic if G is generated by some a ∈ G; that is, G = {aⁿ : n ∈ Z}. Show that every subgroup of a cyclic group is cyclic.
B.10. Suppose G is a cyclic group. Show that G is isomorphic to the set Z of integers under addition or to the set Zn of the integers modulo n under addition.
B.11. Let H be a subgroup of G. Show that the right (left) cosets of H partition G into mutually disjoint
subsets.
B.12. The order of a group G, denoted by jGj; is the number of elements of G. Prove Lagrange’s
theorem: If H is a subgroup of a finite group G, then jHj divides jGj.
B.13. Suppose |G| = p, where p is prime. Show that G is cyclic.
B.14. Suppose H and N are subgroups of G with N normal. Show that (i) HN is a subgroup of G and (ii) H ∩ N is a normal subgroup of H.
B.15. Let H be a subgroup of G with only two right (left) cosets. Show that H is a normal subgroup of G.
B.16. Prove Theorem B.1: Let H be a normal subgroup of G. Then the cosets of H in G form a group
G=H under coset multiplication.
B.17. Suppose G is an abelian group. Show that any factor group G=H is also abelian.
B.18. Let f: G → G′ be a group homomorphism. Show that
(i) f(e) = e′, where e and e′ are the identity elements of G and G′, respectively;
(ii) f(a⁻¹) = f(a)⁻¹ for any a ∈ G.
B.19. Prove Theorem B.2: Let f: G → G′ be a group homomorphism with kernel K. Then K is a normal subgroup of G, and the quotient group G/K is isomorphic to the image of f.
B.20. Let G be the multiplicative group of complex numbers z such that |z| = 1, and let R be the additive group of real numbers. Prove that G is isomorphic to R/Z.
B.21. For a fixed g ∈ G, let ĝ: G → G be defined by ĝ(a) = g⁻¹ag. Show that ĝ is an isomorphism of G onto G.
B.22. Let G be the multiplicative group of n × n nonsingular matrices over R. Show that the mapping A ↦ |A| is a homomorphism of G into the multiplicative group of nonzero real numbers.
B.23. Let G be an abelian group. For a fixed n ∈ Z, show that the map a ↦ aⁿ is a homomorphism of G into G.
B.24. Suppose H and N are subgroups of G with N normal. Prove that H ∩ N is normal in H and H/(H ∩ N) is isomorphic to HN/N.
Rings
B.25. Show that in a ring R:
(i) a · 0 = 0 · a = 0;  (ii) a(−b) = (−a)b = −(ab);  (iii) (−a)(−b) = ab.
B.26. Show that in a ring R with a unit element: (i) (−1)a = −a; (ii) (−1)(−1) = 1.
B.27. Let R be a ring. Suppose a² = a for every a ∈ R. Prove that R is a commutative ring. (Such a ring is called a Boolean ring.)
B.28. Let R be a ring with a unit element. We make R into another ring R̂ by defining a ⊕ b = a + b + 1 and a ∘ b = ab + a + b. (i) Verify that R̂ is a ring. (ii) Determine the 0-element and the 1-element of R̂.
B.29. Let G be any (additive) abelian group. Define a multiplication in G by a · b = 0. Show that this makes G into a ring.
B.30. Prove Theorem B.3: Let I be a (two-sided) ideal in a ring R. Then the cosets {a + I | a ∈ R} form a ring under coset addition and coset multiplication.
B.31. Let I1 and I2 be ideals in R. Prove that I1 + I2 and I1 ∩ I2 are also ideals in R.
B.32. Let R and R′ be rings. A mapping f: R → R′ is called a homomorphism (or ring homomorphism) if
(i) f(a + b) = f(a) + f(b)  and  (ii) f(ab) = f(a)f(b)
for every a, b ∈ R. Prove that if f: R → R′ is a homomorphism, then the set K = {r ∈ R | f(r) = 0} is an ideal in R. (The set K is called the kernel of f.)
Integral Domains and Fields
B.33. Prove that in an integral domain D, if ab = ac and a ≠ 0, then b = c.
B.34. Prove that F = {a + b√2 | a, b rational} is a field.
B.35. Prove that D = {a + b√2 | a, b integers} is an integral domain but not a field.
B.36. Prove that a finite integral domain D is a field.
B.37. Show that the only ideals in a field K are {0} and K.
B.38. A complex number a + bi, where a and b are integers, is called a Gaussian integer. Show that the set G of Gaussian integers is an integral domain. Also show that the units in G are ±1 and ±i.
B.39. Let D be an integral domain and let I be an ideal in D. Prove that the factor ring D=I is an integral
domain if and only if I is a prime ideal. (An ideal I is prime if ab 2 I implies a 2 I or b 2 I:)
B.40. Consider the integral domain D = {a + b√13 | a, b integers} (see Example B.11). If α = a + b√13, we define N(α) = a² − 13b². Prove: (i) N(αβ) = N(α)N(β); (ii) α is a unit if and only if N(α) = ±1; (iii) the elements ±1, 18 ± 5√13, and −18 ± 5√13 are units of D; (iv) the numbers 2, 3 + √13, and −3 + √13 are irreducible in D.
Modules
B.41. Let M be an R-module and let A and B be submodules of M. Show that A + B and A ∩ B are also submodules of M.
B.42. Let M be an R-module with submodule N. Show that the cosets {u + N : u ∈ M} form an R-module under coset addition and the scalar multiplication defined by r(u + N) = ru + N. (This module is denoted by M/N and is called the quotient module.)
B.43. Let M and M′ be R-modules and let f: M → M′ be an R-homomorphism. Show that the set K = {u ∈ M : f(u) = 0} is a submodule of M. (The set K is called the kernel of f.)
B.44. Let M be an R-module and let E(M) denote the set of all R-homomorphisms of M into itself. Define appropriate operations of addition and multiplication in E(M) so that E(M) becomes a ring.
APPENDIX C
Polynomials over a Field
C.1 Introduction
We will investigate polynomials over a field K and show that they have many properties that are
analogous to properties of the integers. These results play an important role in obtaining canonical forms
for a linear operator T on a vector space V over K.
C.2 Ring of Polynomials
Let K be a field. Formally, a polynomial f over K is an infinite sequence of elements from K in which all except a finite number of them are 0:

f = (..., 0, aₙ, ..., a₁, a₀)

(We write the sequence so that it extends to the left instead of to the right.) The entry aₖ is called the kth coefficient of f. If n is the largest integer for which aₙ ≠ 0, then we say that the degree of f is n, written

deg f = n

We also call aₙ the leading coefficient of f, and if aₙ = 1 we call f a monic polynomial. On the other hand, if every coefficient of f is 0, then f is called the zero polynomial, written f = 0. The degree of the zero polynomial is not defined.

Now if g is another polynomial over K, say

g = (..., 0, bₘ, ..., b₁, b₀)

then the sum f + g is the polynomial obtained by adding corresponding coefficients. That is, if m ≤ n, then

f + g = (..., 0, aₙ, ..., aₘ + bₘ, ..., a₁ + b₁, a₀ + b₀)

Furthermore, the product fg is the polynomial

fg = (..., 0, aₙbₘ, ..., a₁b₀ + a₀b₁, a₀b₀)

that is, the kth coefficient cₖ of fg is

cₖ = Σₜ₌₀ᵏ aₜbₖ₋ₜ = a₀bₖ + a₁bₖ₋₁ + ⋯ + aₖb₀
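The coefficient formula above is simply a convolution of the two coefficient sequences. As a small illustration (not part of the text), the following Python sketch multiplies two polynomials stored as coefficient lists [a₀, a₁, ..., aₙ]; the function name poly_mul is just a label chosen here.

    # Multiply polynomials given as coefficient lists [a0, a1, ..., an],
    # using the convolution formula c_k = sum_{t=0}^{k} a_t * b_{k-t}.
    def poly_mul(a, b):
        c = [0] * (len(a) + len(b) - 1)
        for i, ai in enumerate(a):
            for j, bj in enumerate(b):
                c[i + j] += ai * bj      # a_i * b_j contributes to c_{i+j}
        return c

    # (1 + 2t)(3 + t + t^2) = 3 + 7t + 3t^2 + 2t^3
    print(poly_mul([1, 2], [3, 1, 1]))   # [3, 7, 3, 2]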
The following theorem applies.
THEOREM C.1: The set P of polynomials over a field K under the above operations of addition and
multiplication forms a commutative ring with a unit element and with no zero
divisors—an integral domain. If f and g are nonzero polynomials in P, then
deg(fg) = deg(f) + deg(g).
Notation
We identify the scalar a₀ ∈ K with the polynomial

a₀ = (..., 0, a₀)

We also choose a symbol, say t, to denote the polynomial

t = (..., 0, 1, 0)

We call the symbol t an indeterminate. Multiplying t with itself, we obtain

t² = (..., 0, 1, 0, 0),   t³ = (..., 0, 1, 0, 0, 0),   ...

Thus, the above polynomial f can be written uniquely in the usual form

f = aₙtⁿ + ⋯ + a₁t + a₀

When the symbol t is selected as the indeterminate, the ring of polynomials over K is denoted by

K[t]

and a polynomial f is frequently denoted by f(t).

We also view the field K as a subset of K[t] under the above identification. This is possible because the operations of addition and multiplication of elements of K are preserved under this identification:

(..., 0, a₀) + (..., 0, b₀) = (..., 0, a₀ + b₀)
(..., 0, a₀) · (..., 0, b₀) = (..., 0, a₀b₀)

We remark that the nonzero elements of K are the units of the ring K[t].

We also remark that every nonzero polynomial is an associate of a unique monic polynomial. Hence, if d and d′ are monic polynomials for which d divides d′ and d′ divides d, then d = d′. (A polynomial g divides a polynomial f if there is a polynomial h such that f = hg.)
C.3 Divisibility
The following theorem formalizes the process known as ‘‘long division.’’
THEOREM C.2 (Division Algorithm): Let f and g be polynomials over a field K with g ≠ 0. Then there exist polynomials q and r such that

f = qg + r

where either r = 0 or deg r < deg g.

Proof: If f = 0 or if deg f < deg g, then we have the required representation

f = 0g + f

Now suppose deg f ≥ deg g, say

f = aₙtⁿ + ⋯ + a₁t + a₀   and   g = bₘtᵐ + ⋯ + b₁t + b₀

where aₙ, bₘ ≠ 0 and n ≥ m. We form the polynomial

f₁ = f − (aₙ/bₘ)tⁿ⁻ᵐ g          (1)

Then deg f₁ < deg f. By induction, there exist polynomials q₁ and r such that

f₁ = q₁g + r

where either r = 0 or deg r < deg g. Substituting this into (1) and solving for f,

f = (q₁ + (aₙ/bₘ)tⁿ⁻ᵐ) g + r

which is the desired representation.
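The proof is constructive, and the construction is exactly the usual long-division loop: repeatedly cancel the leading term of the dividend. Below is a rough Python sketch of that loop (not from the text), using exact rational arithmetic; poly_divmod and its coefficient-list convention are assumptions made here for illustration.

    from fractions import Fraction

    # Divide f by g (g nonzero), with coefficients listed from the constant term up,
    # i.e. f = a0 + a1*t + ... + an*t^n.  Returns q, r with f = q*g + r and either
    # r = [] or deg r < deg g, mirroring the step f1 = f - (an/bm) t^(n-m) g.
    def poly_divmod(f, g):
        f = [Fraction(x) for x in f]
        g = [Fraction(x) for x in g]
        q = [Fraction(0)] * max(len(f) - len(g) + 1, 1)
        r = f[:]
        while len(r) >= len(g) and any(r):
            shift = len(r) - len(g)              # n - m
            coeff = r[-1] / g[-1]                # a_n / b_m
            q[shift] = coeff
            for i, gi in enumerate(g):           # subtract coeff * t^shift * g
                r[shift + i] -= coeff * gi
            while r and r[-1] == 0:              # drop the cancelled leading terms
                r.pop()
        return q, r

    # t^3 - 1 = (t^2 + t + 1)(t - 1) + 0
    print(poly_divmod([-1, 0, 0, 1], [-1, 1]))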
THEOREM C.3: The ring K[t] of polynomials over a field K is a principal ideal ring. If I is an ideal in K[t], then there exists a unique monic polynomial d that generates I, such that d divides every polynomial f ∈ I.

Proof. Let d be a polynomial of lowest degree in I. Because we can multiply d by a nonzero scalar and still remain in I, we can assume without loss of generality that d is a monic polynomial. Now suppose f ∈ I. By Theorem C.2 there exist polynomials q and r such that

f = qd + r   where either r = 0 or deg r < deg d

Now f, d ∈ I implies qd ∈ I, and hence r = f − qd ∈ I. But d is a polynomial of lowest degree in I. Accordingly, r = 0 and f = qd; that is, d divides f. It remains to show that d is unique. If d′ is another monic polynomial that generates I, then d divides d′ and d′ divides d. This implies that d = d′, because d and d′ are monic. Thus, the theorem is proved.
THEOREM C.4: Let f and g be nonzero polynomials in K[t]. Then there exists a unique monic polynomial d such that (i) d divides f and g; and (ii) if d′ divides f and g, then d′ divides d.

DEFINITION: The above polynomial d is called the greatest common divisor of f and g. If d = 1, then f and g are said to be relatively prime.

Proof of Theorem C.4. The set I = {mf + ng | m, n ∈ K[t]} is an ideal. Let d be the monic polynomial that generates I. Note f, g ∈ I; hence, d divides f and g. Now suppose d′ divides f and g. Let J be the ideal generated by d′. Then f, g ∈ J, and hence, I ⊆ J. Accordingly, d ∈ J and so d′ divides d, as claimed. It remains to show that d is unique. If d₁ is another (monic) greatest common divisor of f and g, then d divides d₁ and d₁ divides d. This implies that d = d₁ because d and d₁ are monic. Thus, the theorem is proved.
COROLLARY C.5: Let d be the greatest common divisor of the polynomials f and g. Then there exist polynomials m and n such that d = mf + ng. In particular, if f and g are relatively prime, then there exist polynomials m and n such that mf + ng = 1.

The corollary follows directly from the fact that d generates the ideal

I = {mf + ng | m, n ∈ K[t]}
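As a quick illustration of Corollary C.5 (not part of the text), SymPy's extended Euclidean algorithm produces the cofactors m, n together with the monic greatest common divisor d; this assumes SymPy is available.

    from sympy import symbols, gcdex, expand

    t = symbols('t')
    f = t**3 - 1
    g = t**2 - 1
    m, n, d = gcdex(f, g)         # m*f + n*g = d = gcd(f, g), monic
    print(d)                      # t - 1
    print(expand(m*f + n*g - d))  # 0, confirming d = m*f + n*g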
C.4 Factorization
A polynomial p ∈ K[t] of positive degree is said to be irreducible if p = fg implies f or g is a scalar.

LEMMA C.6: Suppose p ∈ K[t] is irreducible. If p divides the product fg of polynomials f, g ∈ K[t], then p divides f or p divides g. More generally, if p divides the product of n polynomials f₁f₂⋯fₙ, then p divides one of them.

Proof. Suppose p divides fg but not f. Because p is irreducible, the polynomials f and p must then be relatively prime. Thus, there exist polynomials m, n ∈ K[t] such that mf + np = 1. Multiplying this equation by g, we obtain mfg + npg = g. But p divides fg and so mfg, and p divides npg; hence, p divides the sum g = mfg + npg.

Now suppose p divides f₁f₂⋯fₙ. If p divides f₁, then we are through. If not, then by the above result p divides the product f₂⋯fₙ. By induction on n, p divides one of the polynomials f₂, ..., fₙ. Thus, the lemma is proved.
THEOREM C.7 (Unique Factorization Theorem): Let f be a nonzero polynomial in K[t]. Then f can be written uniquely (except for order) as a product

f = kp₁p₂⋯pₙ

where k ∈ K and the pᵢ are monic irreducible polynomials in K[t].

Proof: We prove the existence of such a product first. If f is irreducible or if f ∈ K, then such a product clearly exists. On the other hand, suppose f = gh where g and h are nonscalars. Then g and h have degrees less than that of f. By induction, we can assume

g = k₁g₁g₂⋯gᵣ   and   h = k₂h₁h₂⋯hₛ

where k₁, k₂ ∈ K and the gᵢ and hⱼ are monic irreducible polynomials. Accordingly,

f = (k₁k₂)g₁g₂⋯gᵣh₁h₂⋯hₛ

is our desired representation.

We next prove uniqueness (except for order) of such a product for f. Suppose

f = kp₁p₂⋯pₙ = k′q₁q₂⋯qₘ

where k, k′ ∈ K and the p₁, ..., pₙ, q₁, ..., qₘ are monic irreducible polynomials. Now p₁ divides k′q₁⋯qₘ. Because p₁ is irreducible, it must divide one of the qᵢ by the above lemma. Say p₁ divides q₁. Because p₁ and q₁ are both irreducible and monic, p₁ = q₁. Accordingly,

kp₂⋯pₙ = k′q₂⋯qₘ

By induction, we have that n = m and p₂ = q₂, ..., pₙ = qₘ for some rearrangement of the qᵢ. We also have that k = k′. Thus, the theorem is proved.
If the field K is the complex field C, then we have the following result, known as the fundamental theorem of algebra; its proof lies beyond the scope of this text.

THEOREM C.8 (Fundamental Theorem of Algebra): Let f(t) be a nonzero polynomial over the complex field C. Then f(t) can be written uniquely (except for order) as a product

f(t) = k(t − r₁)(t − r₂)⋯(t − rₙ)

where k, rᵢ ∈ C; that is, as a product of linear polynomials.

In the case of the real field R, we have the following result.

THEOREM C.9: Let f(t) be a nonzero polynomial over the real field R. Then f(t) can be written uniquely (except for order) as a product

f(t) = kp₁(t)p₂(t)⋯pₘ(t)

where k ∈ R and the pᵢ(t) are monic irreducible polynomials of degree one or two.
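A numerical way to see Theorems C.8 and C.9 side by side (an illustration, not from the text): the complex roots of a real polynomial occur in conjugate pairs, so over C the polynomial splits into linear factors, while over R each conjugate pair combines into a monic quadratic factor.

    import numpy as np

    # f(t) = t^4 - 1 = (t - 1)(t + 1)(t^2 + 1) over R
    #                = (t - 1)(t + 1)(t - i)(t + i) over C
    roots = np.roots([1, 0, 0, 0, -1])   # coefficients from t^4 down to the constant
    print(np.sort_complex(roots))        # approximately [-1, -i, i, 1]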
APPENDIX D
Odds and Ends
D.1 Introduction
This appendix discusses various topics, such as equivalence relations, determinants and block matrices,
and the generalized MP (Moore–Penrose) inverse.
D.2 Relations and Equivalence Relations
A binary relation, or simply relation, R from a set A to a set B assigns to each ordered pair (a, b) ∈ A × B exactly one of the following statements:

(i) ‘‘a is related to b,’’ written a R b;  (ii) ‘‘a is not related to b,’’ written a R̸ b.

A relation from a set A to the same set A is called a relation on A.

Observe that any relation R from A to B uniquely defines a subset R̂ of A × B as follows:

R̂ = {(a, b) | a R b}

Conversely, any subset R̂ of A × B defines a relation from A to B as follows:

a R b if and only if (a, b) ∈ R̂

In view of the above correspondence between relations from A to B and subsets of A × B, we redefine a relation from A to B as follows:

DEFINITION D.1: A relation R from A to B is a subset of A × B.
Equivalence Relations
Consider a nonempty set S. A relation R on S is called an equivalence relation if R is reflexive, symmetric, and transitive; that is, if R satisfies the following three axioms:

[E1] (Reflexivity) Every a ∈ S is related to itself. That is, for every a ∈ S, a R a.
[E2] (Symmetry) If a is related to b, then b is related to a. That is, if a R b, then b R a.
[E3] (Transitivity) If a is related to b and b is related to c, then a is related to c. That is, if a R b and b R c, then a R c.

The general idea behind an equivalence relation is that it is a classification of objects that are in some way ‘‘alike.’’ Clearly, the relation of equality is an equivalence relation. For this reason, one frequently uses ~ or ≡ to denote an equivalence relation.
EXAMPLE D.1

(a) In Euclidean geometry, similarity of triangles is an equivalence relation. Specifically, suppose α, β, γ are triangles. Then (i) α is similar to itself. (ii) If α is similar to β, then β is similar to α. (iii) If α is similar to β and β is similar to γ, then α is similar to γ.
(b) The relation ⊆ of set inclusion is not an equivalence relation. It is reflexive and transitive, but it is not symmetric because A ⊆ B does not imply B ⊆ A.
Equivalence Relations and Partitions
Let S be a nonempty set. Recall first that a partition P of S is a subdivision of S into nonempty, nonoverlapping subsets; that is, a collection P = {Aⱼ} of nonempty subsets of S such that (i) each a ∈ S belongs to one of the Aⱼ, and (ii) the sets {Aⱼ} are mutually disjoint.

The subsets in a partition P are called cells. Thus, each a ∈ S belongs to exactly one of the cells. Also, any element b ∈ Aⱼ is called a representative of the cell Aⱼ, and a subset B of S is called a system of representatives if B contains exactly one element from each of the cells in {Aⱼ}.

Now suppose R is an equivalence relation on the nonempty set S. For each a ∈ S, the equivalence class of a, denoted by [a], is the set of elements of S to which a is related:

[a] = {x | a R x}

The collection of equivalence classes, denoted by S/R, is called the quotient of S by R:

S/R = {[a] | a ∈ S}
The fundamental property of an equivalence relation and its quotient set is contained in the following
theorem:
THEOREM D.1: Let R be an equivalence relation on a nonempty set S. Then the quotient set S/R is a partition of S.
EXAMPLE D.2 Let ≡ be the relation on the set Z of integers defined by

x ≡ y (mod 5)

which reads ‘‘x is congruent to y modulo 5’’ and which means that the difference x − y is divisible by 5. Then ≡ is an equivalence relation on Z.

There are exactly five equivalence classes in the quotient set Z/≡, as follows:

A₀ = {..., −10, −5, 0, 5, 10, ...}
A₁ = {..., −9, −4, 1, 6, 11, ...}
A₂ = {..., −8, −3, 2, 7, 12, ...}
A₃ = {..., −7, −2, 3, 8, 13, ...}
A₄ = {..., −6, −1, 4, 9, 14, ...}

Note that any integer x, which can be expressed uniquely in the form x = 5q + r where 0 ≤ r < 5, is a member of the equivalence class Aᵣ, where r is the remainder. As expected, the equivalence classes are disjoint and their union is Z:

Z = A₀ ∪ A₁ ∪ A₂ ∪ A₃ ∪ A₄

This quotient set Z/≡, called the integers modulo 5, is denoted by

Z/5Z or simply Z₅

Usually one chooses {0, 1, 2, 3, 4} or {−2, −1, 0, 1, 2} as a system of representatives of the equivalence classes.

Analogously, for any positive integer m, there exists the congruence relation ≡ defined by

x ≡ y (mod m)

and the quotient set Z/≡ is called the integers modulo m.
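A tiny computational sketch of this partition (not from the text): grouping a block of integers by their remainder on division by 5 reproduces the five cells above, since Python's % operator already returns a remainder in {0, 1, 2, 3, 4}.

    classes = {r: [] for r in range(5)}
    for x in range(-10, 15):
        classes[x % 5].append(x)      # x belongs to the cell of its remainder

    print(classes[1])   # [-9, -4, 1, 6, 11], the cell A1 restricted to this range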
D.3 Determinants and Block Matrices
Recall first:

THEOREM 8.12: Suppose M is an upper (lower) triangular block matrix with diagonal blocks A₁, A₂, ..., Aₙ. Then

det(M) = det(A₁) det(A₂) ⋯ det(Aₙ)

Accordingly, if

M = [ A  B ]
    [ 0  D ]

where A is r × r and D is s × s, then det(M) = det(A) det(D).

THEOREM D.2: Consider the block matrix

M = [ A  B ]
    [ C  D ]

where A is nonsingular, A is r × r, and D is s × s. Then

det(M) = det(A) det(D − CA⁻¹B)

Proof: This follows from the factorization

M = [ I      0 ] [ A   B          ]
    [ CA⁻¹   I ] [ 0   D − CA⁻¹B  ]

and the above result.
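A quick numerical check of Theorem D.2 (the matrices below are made-up data, not from the text):

    import numpy as np

    A = np.array([[2.0, 1.0], [1.0, 3.0]])      # nonsingular r x r block
    B = np.array([[1.0, 0.0], [2.0, 1.0]])
    C = np.array([[0.0, 1.0], [1.0, 1.0]])
    D = np.array([[4.0, 2.0], [1.0, 5.0]])

    M = np.block([[A, B], [C, D]])
    lhs = np.linalg.det(M)
    rhs = np.linalg.det(A) * np.linalg.det(D - C @ np.linalg.inv(A) @ B)
    print(np.isclose(lhs, rhs))                 # True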
D.4 Full Rank Factorization
A matrix B is said to have full row rank r if B has r rows that are linearly independent, and a matrix C is said to have full column rank r if C has r columns that are linearly independent.

DEFINITION D.2: Let A be an m × n matrix of rank r. Then A is said to have the full rank factorization

A = BC

where B has full column rank r and C has full row rank r.

THEOREM D.3: Every matrix A with rank r > 0 has a full rank factorization.

There are many full rank factorizations of a matrix A. Fig. D-1 gives an algorithm to find one such factorization.

EXAMPLE D.3 Let

A = [ 1  1  1  2 ]
    [ 2  2  1  3 ]
    [ 1  1  2  3 ]

whose row canonical form is

M = [ 1  1  0  1 ]
    [ 0  0  1  1 ]
    [ 0  0  0  0 ]

We set

B = [ 1  1 ]
    [ 2  1 ]      and      C = [ 1  1  0  1 ]
    [ 1  2 ]                   [ 0  0  1  1 ]

Then A = BC is a full rank factorization of A.

Algorithm D-1: The input is a matrix A of rank r > 0. The output is a full rank factorization of A.

Step 1. Find the row canonical form M of A.

Step 2. Let B be the matrix whose columns are the columns of A corresponding to the columns of M with pivots.

Step 3. Let C be the matrix whose rows are the nonzero rows of M.

Then A = BC is a full rank factorization of A.

Figure D-1
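Algorithm D-1 can be sketched in a few lines with SymPy (assumed available here), whose rref method returns both the row canonical form and the pivot columns; applied to the matrix of Example D.3 it reproduces the factorization found there.

    from sympy import Matrix

    A = Matrix([[1, 1, 1, 2],
                [2, 2, 1, 3],
                [1, 1, 2, 3]])

    M, pivots = A.rref()                               # Step 1: row canonical form and pivots
    B = A.extract(list(range(A.rows)), list(pivots))   # Step 2: pivot columns of A
    C = M[:len(pivots), :]                             # Step 3: nonzero rows of M
    print(B * C == A)                                  # True: A = BC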
D.5 Generalized (Moore–Penrose) Inverse
Here we assume that the field of scalars is the complex field C, and we write Aᴴ for the conjugate transpose of a matrix A. [If A is a real matrix, then Aᴴ = Aᵀ.]

DEFINITION D.3: Let A be an m × n matrix over C. A matrix, denoted by A⁺, is called the pseudoinverse or Moore–Penrose inverse or MP-inverse of A if X = A⁺ satisfies the following four equations:

[MP1] AXA = A;      [MP3] (AX)ᴴ = AX;
[MP2] XAX = X;      [MP4] (XA)ᴴ = XA.

Clearly, A⁺ is an n × m matrix. Also, A⁺ = A⁻¹ if A is nonsingular.
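The four conditions are easy to test numerically. The helper below (an illustrative sketch, not from the text) checks whether a candidate X satisfies [MP1] through [MP4] for a given A, up to floating-point tolerance.

    import numpy as np

    def satisfies_mp(A, X, tol=1e-10):
        H = lambda M: M.conj().T
        return (np.allclose(A @ X @ A, A, atol=tol) and      # [MP1]
                np.allclose(X @ A @ X, X, atol=tol) and      # [MP2]
                np.allclose(H(A @ X), A @ X, atol=tol) and   # [MP3]
                np.allclose(H(X @ A), X @ A, atol=tol))      # [MP4]

    A = np.array([[1.0, 2.0], [2.0, 4.0]])        # a singular matrix
    print(satisfies_mp(A, np.linalg.pinv(A)))     # True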
LEMMA D.4: A⁺ is unique (when it exists).

Proof. Suppose X and Y satisfy the four MP equations. Then

AY = (AY)ᴴ = (AXAY)ᴴ = (AY)ᴴ(AX)ᴴ = AYAX = (AYA)X = AX

The first and fourth equations use [MP3], and the second and last equations use [MP1]. Similarly, YA = XA (which uses [MP4] and [MP1]). Then

Y = YAY = (YA)Y = (XA)Y = X(AY) = X(AX) = X

where the first equation uses [MP2].
LEMMA D.5: A⁺ exists for any matrix A.

Fig. D-2 gives an algorithm that finds an MP-inverse for any matrix A. Combining the above two lemmas, we obtain:

THEOREM D.6: Every matrix A over C has a unique Moore–Penrose inverse A⁺.

There are special cases when A has full row rank or full column rank.
THEOREM D.7: Let A be a matrix over C.

(a) If A has full column rank (its columns are linearly independent), then A⁺ = (AᴴA)⁻¹Aᴴ.

(b) If A has full row rank (its rows are linearly independent), then A⁺ = Aᴴ(AAᴴ)⁻¹.
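A quick check of Theorem D.7(a) with made-up data (not from the text), comparing the closed-form expression with NumPy's built-in pseudoinverse:

    import numpy as np

    A = np.array([[1.0, 0.0],
                  [1.0, 1.0],
                  [0.0, 2.0]])                      # full column rank

    A_plus = np.linalg.inv(A.conj().T @ A) @ A.conj().T
    print(np.allclose(A_plus, np.linalg.pinv(A)))   # True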
THEOREM D.8: Let A be a matrix over C. Suppose A = BC is a full rank factorization of A. Then

A⁺ = C⁺B⁺ = Cᴴ(CCᴴ)⁻¹(BᴴB)⁻¹Bᴴ

Moreover, AA⁺ = BB⁺ and A⁺A = C⁺C.

Algorithm D-2: The input is an m × n matrix A over C of rank r. The output is A⁺.

Step 1. Interchange rows and columns of A so that

PAQ = [ A₁₁  A₁₂ ]
      [ A₂₁  A₂₂ ]

where A₁₁ is a nonsingular r × r block. [Here P and Q are the products of elementary matrices corresponding to the interchanges of the rows and columns.]

Step 2. Set

B = [ A₁₁ ]      and      C = [ Iᵣ   A₁₁⁻¹A₁₂ ]
    [ A₂₁ ]

where Iᵣ is the r × r identity matrix.

Step 3. Set A⁺ = Q[Cᴴ(CCᴴ)⁻¹(BᴴB)⁻¹Bᴴ]P.

Figure D-2
EXAMPLE D.4 Consider the full rank factorization A = BC of Example D.3; that is,

A = [ 1  1  1  2 ]    [ 1  1 ]
    [ 2  2  1  3 ] =  [ 2  1 ] · [ 1  1  0  1 ]  = BC
    [ 1  1  2  3 ]    [ 1  2 ]   [ 0  0  1  1 ]

Then

(CCᴴ)⁻¹ = (1/5) [  2  −1 ]
                [ −1   3 ]

Cᴴ(CCᴴ)⁻¹ = (1/5) [  2  −1 ]
                  [  2  −1 ]
                  [ −1   3 ]
                  [  1   2 ]

(BᴴB)⁻¹ = (1/11) [  6  −5 ]
                 [ −5   6 ]

(BᴴB)⁻¹Bᴴ = (1/11) [ 1   7  −4 ]
                   [ 1  −4   7 ]

Accordingly, the following is the Moore–Penrose inverse of A:

A⁺ = (1/55) [ 1   18  −15 ]
            [ 1   18  −15 ]
            [ 2  −19   25 ]
            [ 3   −1   10 ]
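As a cross-check (not in the text), the formula of Theorem D.8 applied to this B and C can be compared numerically with NumPy's pinv; both give the same A⁺ because the Moore–Penrose inverse is unique.

    import numpy as np

    A = np.array([[1.0, 1.0, 1.0, 2.0],
                  [2.0, 2.0, 1.0, 3.0],
                  [1.0, 1.0, 2.0, 3.0]])
    B = np.array([[1.0, 1.0],
                  [2.0, 1.0],
                  [1.0, 2.0]])
    C = np.array([[1.0, 1.0, 0.0, 1.0],
                  [0.0, 0.0, 1.0, 1.0]])

    A_plus = C.T @ np.linalg.inv(C @ C.T) @ np.linalg.inv(B.T @ B) @ B.T
    print(np.allclose(A, A @ A_plus @ A))           # [MP1] holds
    print(np.allclose(A_plus, np.linalg.pinv(A)))   # agrees with numpy's pseudoinverse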
D.6 Least-Square Solution
Consider a system AX = B
Linear_Algebra,_4th_Edition.pdf
Linear_Algebra,_4th_Edition.pdf
Linear_Algebra,_4th_Edition.pdf
Linear_Algebra,_4th_Edition.pdf
Linear_Algebra,_4th_Edition.pdf
Linear_Algebra,_4th_Edition.pdf

More Related Content

DOCX
Complex VariablesThis page intentionally left blan.docx
PDF
ADVANCED CALCULUS-SCHAUMSOUTLINE SERIES.pdf
PDF
Nonlinear Conjugate Gradient Methods For Unconstrained Optimization Paginatio...
PDF
Arfken_Weber_Harris__Instructors_manual_Mathemati.pdf
PDF
Differential Equations Mechanic And Computation Richard S Palais
PDF
Functional Analysis Softcover Reprint Of Hardcover 1st Ed 2003 Erdogan Suhubi
PDF
Math-2e.pdf
PDF
A First Course In With Applications Complex Analysis
Complex VariablesThis page intentionally left blan.docx
ADVANCED CALCULUS-SCHAUMSOUTLINE SERIES.pdf
Nonlinear Conjugate Gradient Methods For Unconstrained Optimization Paginatio...
Arfken_Weber_Harris__Instructors_manual_Mathemati.pdf
Differential Equations Mechanic And Computation Richard S Palais
Functional Analysis Softcover Reprint Of Hardcover 1st Ed 2003 Erdogan Suhubi
Math-2e.pdf
A First Course In With Applications Complex Analysis

Similar to Linear_Algebra,_4th_Edition.pdf (20)

PDF
Balachandran_Magrab_2009_Vibrations_Seco.pdf
PDF
Stability Of Linear Delay Differential Equations A Numerical Approach With Ma...
PDF
Fractional Calculus Roy Abi Zeid Daou Xavier Moreau
DOCX
Engineering VibrationFourth EditionDaniEl J. inman.docx
PDF
Lectures On Factorization Homology Categories And Topological Field Theories ...
PDF
Robot And Multibody Dynamics Analysis And Algorithms 1st Edition Abhinandan J...
PDF
Quantum Quadratic Operators And Processes 1st Edition Farrukh Mukhamedov
PDF
Advanced Linear Algebra (Third Edition) By Steven Roman
PDF
Principles Of Real Analysis 2nd Edition S C Malik
PDF
Solution Manual for Precalculus, 1st Edition, Julie Miller, Donna Gerken
PDF
Lectures On The Fourier Transform And Its Applications Brad Osgood
PDF
Bertimas
PDF
An Introduction To Complex Analysis 1st Edition Ravi P Agarwal
PDF
Basic calculus
PDF
The Homotopy Theory Of Infinity1categories Julia E Bergner
PDF
lipschutz_s_set_theory_and_related_topics_schaums_outline_se.pdf
PDF
Theory And Applications Of Special Functions For Scientists And Engineers Xia...
PDF
Algebra Lineal.pdf
PDF
A Second Course In Mathematical Analysis J C Burkill H Burkill
Balachandran_Magrab_2009_Vibrations_Seco.pdf
Stability Of Linear Delay Differential Equations A Numerical Approach With Ma...
Fractional Calculus Roy Abi Zeid Daou Xavier Moreau
Engineering VibrationFourth EditionDaniEl J. inman.docx
Lectures On Factorization Homology Categories And Topological Field Theories ...
Robot And Multibody Dynamics Analysis And Algorithms 1st Edition Abhinandan J...
Quantum Quadratic Operators And Processes 1st Edition Farrukh Mukhamedov
Advanced Linear Algebra (Third Edition) By Steven Roman
Principles Of Real Analysis 2nd Edition S C Malik
Solution Manual for Precalculus, 1st Edition, Julie Miller, Donna Gerken
Lectures On The Fourier Transform And Its Applications Brad Osgood
Bertimas
An Introduction To Complex Analysis 1st Edition Ravi P Agarwal
Basic calculus
The Homotopy Theory Of Infinity1categories Julia E Bergner
lipschutz_s_set_theory_and_related_topics_schaums_outline_se.pdf
Theory And Applications Of Special Functions For Scientists And Engineers Xia...
Algebra Lineal.pdf
A Second Course In Mathematical Analysis J C Burkill H Burkill
Ad

Recently uploaded (20)

PDF
Integrating Fractal Dimension and Time Series Analysis for Optimized Hyperspe...
PPTX
Information Storage and Retrieval Techniques Unit III
PDF
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
PPTX
Fundamentals of Mechanical Engineering.pptx
PPT
Occupational Health and Safety Management System
PPTX
MET 305 2019 SCHEME MODULE 2 COMPLETE.pptx
PDF
keyrequirementskkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk
PDF
PREDICTION OF DIABETES FROM ELECTRONIC HEALTH RECORDS
PDF
R24 SURVEYING LAB MANUAL for civil enggi
PPTX
Safety Seminar civil to be ensured for safe working.
PDF
Enhancing Cyber Defense Against Zero-Day Attacks using Ensemble Neural Networks
PDF
The CXO Playbook 2025 – Future-Ready Strategies for C-Suite Leaders Cerebrai...
PPT
Total quality management ppt for engineering students
PDF
Visual Aids for Exploratory Data Analysis.pdf
PPTX
6ME3A-Unit-II-Sensors and Actuators_Handouts.pptx
PDF
BIO-INSPIRED ARCHITECTURE FOR PARSIMONIOUS CONVERSATIONAL INTELLIGENCE : THE ...
PDF
A SYSTEMATIC REVIEW OF APPLICATIONS IN FRAUD DETECTION
PDF
Categorization of Factors Affecting Classification Algorithms Selection
PPTX
Artificial Intelligence
PPTX
UNIT - 3 Total quality Management .pptx
Integrating Fractal Dimension and Time Series Analysis for Optimized Hyperspe...
Information Storage and Retrieval Techniques Unit III
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
Fundamentals of Mechanical Engineering.pptx
Occupational Health and Safety Management System
MET 305 2019 SCHEME MODULE 2 COMPLETE.pptx
keyrequirementskkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk
PREDICTION OF DIABETES FROM ELECTRONIC HEALTH RECORDS
R24 SURVEYING LAB MANUAL for civil enggi
Safety Seminar civil to be ensured for safe working.
Enhancing Cyber Defense Against Zero-Day Attacks using Ensemble Neural Networks
The CXO Playbook 2025 – Future-Ready Strategies for C-Suite Leaders Cerebrai...
Total quality management ppt for engineering students
Visual Aids for Exploratory Data Analysis.pdf
6ME3A-Unit-II-Sensors and Actuators_Handouts.pptx
BIO-INSPIRED ARCHITECTURE FOR PARSIMONIOUS CONVERSATIONAL INTELLIGENCE : THE ...
A SYSTEMATIC REVIEW OF APPLICATIONS IN FRAUD DETECTION
Categorization of Factors Affecting Classification Algorithms Selection
Artificial Intelligence
UNIT - 3 Total quality Management .pptx
Ad

Linear_Algebra,_4th_Edition.pdf

  • 2. SCHAUM’S outlines Linear Algebra Fourth Edition Seymour Lipschutz, Ph.D. Temple University Marc Lars Lipson, Ph.D. University of Virginia Schaum’s Outline Series New York Chicago San Francisco Lisbon London Madrid Mexico City Milan New Delhi San Juan Seoul Singapore Sydney Toronto SCHAUM’S outlines
  • 3. Copyright © 2009, 2001, 1991, 1968 by The McGraw-Hill Companies, Inc. All rights reserved. Except as permitted under the United States Copyright Act of 1976, no part of this publication may be reproduced or distributed in any form or by any means, or stored in a database or retrieval system, without the prior writ- ten permission of the publisher. ISBN: 978-0-07-154353-8 MHID: 0-07-154353-8 The material in this eBook also appears in the print version of this title: ISBN: 978-0-07-154352-1, MHID: 0-07-154352-X. All trademarks are trademarks of their respective owners. Rather than put a trademark symbol after every occurrence of a trademarked name, we use names in an editorial fashion only, and to the benefit of the trademark owner, with no intention of infringement of the trademark. Where such designations appear in this book, they have been printed with initial caps. McGraw-Hill eBooks are available at special quantity discounts to use as premiums and sales promotions, or for use in corporate training programs. To contact a representative please e-mail us at bulksales@mcgraw-hill.com. TERMS OF USE This is a copyrighted work and The McGraw-Hill Companies, Inc. (“McGraw-Hill”) and its licensors reserve all rights in and to the work. Use of this work is subject to these terms. Except as permitted under the Copyright Act of 1976 and the right to store and retrieve one copy of the work, you may not decompile, disassemble, reverse engineer, reproduce, modify, create derivative works based upon, transmit, distribute, disseminate, sell, publish or sublicense the work or any part of it without McGraw-Hill’s prior consent. You may use the work for your own noncommercial and personal use; any other use of the work is strictly prohibited. Your right to use the work may be terminated if you fail to comply with these terms. THE WORK IS PROVIDED “AS IS.” McGRAW-HILL AND ITS LICENSORS MAKE NO GUARANTEES OR WARRANTIES AS TO THE ACCURACY, ADEQUACY OR COMPLETENESS OF OR RESULTS TO BE OBTAINED FROM USING THE WORK, INCLUDING ANY INFORMATION THAT CAN BE ACCESSED THROUGH THE WORK VIA HYPERLINK OR OTHERWISE, AND EXPRESSLY DISCLAIM ANY WARRANTY, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. McGraw-Hill and its licensors do not warrant or guarantee that the functions contained in the work will meet your requirements or that its operation will be uninterrupted or error free. Neither McGraw-Hill nor its licensors shall be liable to you or anyone else for any inaccuracy, error or omission, regardless of cause, in the work or for any damages resulting therefrom. McGraw-Hill has no responsibility for the content of any information accessed through the work. Under no circumstances shall McGraw-Hill and/or its licensors be liable for any indirect, incidental, special, punitive, consequential or similar damages that result from the use of or inability to use the work, even if any of them has been advised of the possibility of such damages. This limitation of liability shall apply to any claim or cause whatsoever whether such claim or cause arises in contract, tort or otherwise.
  • 4. Preface Linear algebra has in recent years become an essential part of the mathematical background required by mathematicians and mathematics teachers, engineers, computer scientists, physicists, economists, and statisticians, among others. This requirement reflects the importance and wide applications of the subject matter. This book is designed for use as a textbook for a formal course in linear algebra or as a supplement to all current standard texts. It aims to present an introduction to linear algebra which will be found helpful to all readers regardless of their fields of specification. More material has been included than can be covered in most first courses. This has been done to make the book more flexible, to provide a useful book of reference, and to stimulate further interest in the subject. Each chapter begins with clear statements of pertinent definitions, principles, and theorems together with illustrative and other descriptive material. This is followed by graded sets of solved and supplementary problems. The solved problems serve to illustrate and amplify the theory, and to provide the repetition of basic principles so vital to effective learning. Numerous proofs, especially those of all essential theorems, are included among the solved problems. The supplementary problems serve as a complete review of the material of each chapter. The first three chapters treat vectors in Euclidean space, matrix algebra, and systems of linear equations. These chapters provide the motivation and basic computational tools for the abstract investigations of vector spaces and linear mappings which follow. After chapters on inner product spaces and orthogonality and on determinants, there is a detailed discussion of eigenvalues and eigenvectors giving conditions for representing a linear operator by a diagonal matrix. This naturally leads to the study of various canonical forms, specifically, the triangular, Jordan, and rational canonical forms. Later chapters cover linear functions and the dual space V*, and bilinear, quadratic, and Hermitian forms. The last chapter treats linear operators on inner product spaces. The main changes in the fourth edition have been in the appendices. First of all, we have expanded Appendix A on the tensor and exterior products of vector spaces where we have now included proofs on the existence and uniqueness of such products. We also added appendices covering algebraic structures, including modules, and polynomials over a field. Appendix D, ‘‘Odds and Ends,’’ includes the Moore–Penrose generalized inverse which appears in various applications, such as statistics. There are also many additional solved and supplementary problems. Finally, we wish to thank the staff of the McGraw-Hill Schaum’s Outline Series, especially Charles Wall, for their unfailing cooperation. SEYMOUR LIPSCHUTZ MARC LARS LIPSON iii
  • 6. Contents CHAPTER 1 Vectors in Rn and Cn , Spatial Vectors 1 1.1 Introduction 1.2 Vectors in Rn 1.3 Vector Addition and Scalar Multi- plication 1.4 Dot (Inner) Product 1.5 Located Vectors, Hyperplanes, Lines, Curves in Rn 1.6 Vectors in R3 (Spatial Vectors), ijk Notation 1.7 Complex Numbers 1.8 Vectors in Cn CHAPTER 2 Algebra of Matrices 27 2.1 Introduction 2.2 Matrices 2.3 Matrix Addition and Scalar Multiplica- tion 2.4 Summation Symbol 2.5 Matrix Multiplication 2.6 Transpose of a Matrix 2.7 Square Matrices 2.8 Powers of Matrices, Polynomials in Matrices 2.9 Invertible (Nonsingular) Matrices 2.10 Special Types of Square Matrices 2.11 Complex Matrices 2.12 Block Matrices CHAPTER 3 Systems of Linear Equations 57 3.1 Introduction 3.2 Basic Definitions, Solutions 3.3 Equivalent Systems, Elementary Operations 3.4 Small Square Systems of Linear Equations 3.5 Systems in Triangular and Echelon Forms 3.6 Gaussian Elimination 3.7 Echelon Matrices, Row Canonical Form, Row Equivalence 3.8 Gaussian Elimination, Matrix Formulation 3.9 Matrix Equation of a System of Linear Equations 3.10 Systems of Linear Equations and Linear Combinations of Vectors 3.11 Homogeneous Systems of Linear Equations 3.12 Elementary Matrices 3.13 LU Decomposition CHAPTER 4 Vector Spaces 112 4.1 Introduction 4.2 Vector Spaces 4.3 Examples of Vector Spaces 4.4 Linear Combinations, Spanning Sets 4.5 Subspaces 4.6 Linear Spans, Row Space of a Matrix 4.7 Linear Dependence and Independence 4.8 Basis and Dimension 4.9 Application to Matrices, Rank of a Matrix 4.10 Sums and Direct Sums 4.11 Coordinates CHAPTER 5 Linear Mappings 164 5.1 Introduction 5.2 Mappings, Functions 5.3 Linear Mappings (Linear Transformations) 5.4 Kernel and Image of a Linear Mapping 5.5 Singular and Nonsingular Linear Mappings, Isomorphisms 5.6 Operations with Linear Mappings 5.7 Algebra A(V ) of Linear Operators CHAPTER 6 Linear Mappings and Matrices 195 6.1 Introduction 6.2 Matrix Representation of a Linear Operator 6.3 Change of Basis 6.4 Similarity 6.5 Matrices and General Linear Mappings CHAPTER 7 Inner Product Spaces, Orthogonality 226 7.1 Introduction 7.2 Inner Product Spaces 7.3 Examples of Inner Product Spaces 7.4 Cauchy–Schwarz Inequality, Applications 7.5 Orthogonal- ity 7.6 Orthogonal Sets and Bases 7.7 Gram–Schmidt Orthogonalization Process 7.8 Orthogonal and Positive Definite Matrices 7.9 Complex Inner Product Spaces 7.10 Normed Vector Spaces (Optional) v
  • 7. CHAPTER 8 Determinants 264 8.1 Introduction 8.2 Determinants of Orders 1 and 2 8.3 Determinants of Order 3 8.4 Permutations 8.5 Determinants of Arbitrary Order 8.6 Proper- ties of Determinants 8.7 Minors and Cofactors 8.8 Evaluation of Determi- nants 8.9 Classical Adjoint 8.10 Applications to Linear Equations, Cramer’s Rule 8.11 Submatrices, Minors, Principal Minors 8.12 Block Matrices and Determinants 8.13 Determinants and Volume 8.14 Determi- nant of a Linear Operator 8.15 Multilinearity and Determinants CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors 292 9.1 Introduction 9.2 Polynomials of Matrices 9.3 Characteristic Polyno- mial, Cayley–Hamilton Theorem 9.4 Diagonalization, Eigenvalues and Eigenvectors 9.5 Computing Eigenvalues and Eigenvectors, Diagonalizing Matrices 9.6 Diagonalizing Real Symmetric Matrices and Quadratic Forms 9.7 Minimal Polynomial 9.8 Characteristic and Minimal Polyno- mials of Block Matrices CHAPTER 10 Canonical Forms 325 10.1 Introduction 10.2 Triangular Form 10.3 Invariance 10.4 Invariant Direct-Sum Decompositions 10.5 Primary Decomposition 10.6 Nilpotent Operators 10.7 Jordan Canonical Form 10.8 Cyclic Subspaces 10.9 Rational Canonical Form 10.10 Quotient Spaces CHAPTER 11 Linear Functionals and the Dual Space 349 11.1 Introduction 11.2 Linear Functionals and the Dual Space 11.3 Dual Basis 11.4 Second Dual Space 11.5 Annihilators 11.6 Transpose of a Linear Mapping CHAPTER 12 Bilinear, Quadratic, and Hermitian Forms 359 12.1 Introduction 12.2 Bilinear Forms 12.3 Bilinear Forms and Matrices 12.4 Alternating Bilinear Forms 12.5 Symmetric Bilinear Forms, Quadratic Forms 12.6 Real Symmetric Bilinear Forms, Law of Inertia 12.7 Hermitian Forms CHAPTER 13 Linear Operators on Inner Product Spaces 377 13.1 Introduction 13.2 Adjoint Operators 13.3 Analogy Between A(V ) and C, Special Linear Operators 13.4 Self-Adjoint Operators 13.5 Orthogonal and Unitary Operators 13.6 Orthogonal and Unitary Matrices 13.7 Change of Orthonormal Basis 13.8 Positive Definite and Positive Operators 13.9 Diagonalization and Canonical Forms in Inner Product Spaces 13.10 Spectral Theorem APPENDIX A Multilinear Products 396 APPENDIX B Algebraic Structures 403 APPENDIX C Polynomials over a Field 411 APPENDIX D Odds and Ends 415 List of Symbols 420 Index 421 vi Contents
  • 8. CHAPTER 1 Vectors in Rn and Cn , Spatial Vectors 1.1 Introduction There are two ways to motivate the notion of a vector: one is by means of lists of numbers and subscripts, and the other is by means of certain objects in physics. We discuss these two ways below. Here we assume the reader is familiar with the elementary properties of the field of real numbers, denoted by R. On the other hand, we will review properties of the field of complex numbers, denoted by C. In the context of vectors, the elements of our number fields are called scalars. Although we will restrict ourselves in this chapter to vectors whose elements come from R and then from C, many of our operations also apply to vectors whose entries come from some arbitrary field K. Lists of Numbers Suppose the weights (in pounds) of eight students are listed as follows: 156; 125; 145; 134; 178; 145; 162; 193 One can denote all the values in the list using only one symbol, say w, but with different subscripts; that is, w1; w2; w3; w4; w5; w6; w7; w8 Observe that each subscript denotes the position of the value in the list. For example, w1 ¼ 156; the first number; w2 ¼ 125; the second number; . . . Such a list of values, w ¼ ðw1; w2; w3; . . . ; w8Þ is called a linear array or vector. Vectors in Physics Many physical quantities, such as temperature and speed, possess only ‘‘magnitude.’’ These quantities can be represented by real numbers and are called scalars. On the other hand, there are also quantities, such as force and velocity, that possess both ‘‘magnitude’’ and ‘‘direction.’’ These quantities, which can be represented by arrows having appropriate lengths and directions and emanating from some given reference point O, are called vectors. Now we assume the reader is familiar with the space R3 where all the points in space are represented by ordered triples of real numbers. Suppose the origin of the axes in R3 is chosen as the reference point O for the vectors discussed above. Then every vector is uniquely determined by the coordinates of its endpoint, and vice versa. There are two important operations, vector addition and scalar multiplication, associated with vectors in physics. The definition of these operations and the relationship between these operations and the endpoints of the vectors are as follows. 1 CHAPTER 1
  • 9. (i) Vector Addition: The resultant u þ v of two vectors u and v is obtained by the parallelogram law; that is, u þ v is the diagonal of the parallelogram formed by u and v. Furthermore, if ða; b; cÞ and ða0 ; b0 ; c0 Þ are the endpoints of the vectors u and v, then ða þ a0 ; b þ b0 ; c þ c0 Þ is the endpoint of the vector u þ v. These properties are pictured in Fig. 1-1(a). (ii) Scalar Multiplication: The product ku of a vector u by a real number k is obtained by multiplying the magnitude of u by k and retaining the same direction if k > 0 or the opposite direction if k < 0. Also, if ða; b; cÞ is the endpoint of the vector u, then ðka; kb; kcÞ is the endpoint of the vector ku. These properties are pictured in Fig. 1-1(b). Mathematically, we identify the vector u with its ða; b; cÞ and write u ¼ ða; b; cÞ. Moreover, we call the ordered triple ða; b; cÞ of real numbers a point or vector depending upon its interpretation. We generalize this notion and call an n-tuple ða1; a2; . . . ; anÞ of real numbers a vector. However, special notation may be used for the vectors in R3 called spatial vectors (Section 1.6). 1.2 Vectors in Rn The set of all n-tuples of real numbers, denoted by Rn , is called n-space. A particular n-tuple in Rn , say u ¼ ða1; a2; . . . ; anÞ is called a point or vector. The numbers ai are called the coordinates, components, entries, or elements of u. Moreover, when discussing the space Rn , we use the term scalar for the elements of R. Two vectors, u and v, are equal, written u ¼ v, if they have the same number of components and if the corresponding components are equal. Although the vectors ð1; 2; 3Þ and ð2; 3; 1Þ contain the same three numbers, these vectors are not equal because corresponding entries are not equal. The vector ð0; 0; . . . ; 0Þ whose entries are all 0 is called the zero vector and is usually denoted by 0. EXAMPLE 1.1 (a) The following are vectors: ð2; 5Þ; ð7; 9Þ; ð0; 0; 0Þ; ð3; 4; 5Þ The first two vectors belong to R2 , whereas the last two belong to R3 . The third is the zero vector in R3 . (b) Find x; y; z such that ðx y; x þ y; z 1Þ ¼ ð4; 2; 3Þ. By definition of equality of vectors, corresponding entries must be equal. Thus, x y ¼ 4; x þ y ¼ 2; z 1 ¼ 3 Solving the above system of equations yields x ¼ 3, y ¼ 1, z ¼ 4. Figure 1-1 2 CHAPTER 1 Vectors in Rn and Cn , Spatial Vectors
  • 10. Column Vectors Sometimes a vector in n-space Rn is written vertically rather than horizontally. Such a vector is called a column vector, and, in this context, the horizontally written vectors in Example 1.1 are called row vectors. For example, the following are column vectors with 2; 2; 3, and 3 components, respectively: 1 2 ; 3 4 ; 1 5 6 2 4 3 5; 1:5 2 3 15 2 6 4 3 7 5 We also note that any operation defined for row vectors is defined analogously for column vectors. 1.3 Vector Addition and Scalar Multiplication Consider two vectors u and v in Rn , say u ¼ ða1; a2; . . . ; anÞ and v ¼ ðb1; b2; . . . ; bnÞ Their sum, written u þ v, is the vector obtained by adding corresponding components from u and v. That is, u þ v ¼ ða1 þ b1; a2 þ b2; . . . ; an þ bnÞ The scalar product or, simply, product, of the vector u by a real number k, written ku, is the vector obtained by multiplying each component of u by k. That is, ku ¼ kða1; a2; . . . ; anÞ ¼ ðka1; ka2; . . . ; kanÞ Observe that u þ v and ku are also vectors in Rn . The sum of vectors with different numbers of components is not defined. Negatives and subtraction are defined in Rn as follows: u ¼ ð1Þu and u v ¼ u þ ðvÞ The vector u is called the negative of u, and u v is called the difference of u and v. Now suppose we are given vectors u1; u2; . . . ; um in Rn and scalars k1; k2; . . . ; km in R. We can multiply the vectors by the corresponding scalars and then add the resultant scalar products to form the vector v ¼ k1u1 þ k2u2 þ k3u3 þ þ kmum Such a vector v is called a linear combination of the vectors u1; u2; . . . ; um. EXAMPLE 1.2 (a) Let u ¼ ð2; 4; 5Þ and v ¼ ð1; 6; 9Þ. Then u þ v ¼ ð2 þ 1; 4 þ ð5Þ; 5 þ 9Þ ¼ ð3; 1; 4Þ 7u ¼ ð7ð2Þ; 7ð4Þ; 7ð5ÞÞ ¼ ð14; 28; 35Þ v ¼ ð1Þð1; 6; 9Þ ¼ ð1; 6; 9Þ 3u 5v ¼ ð6; 12; 15Þ þ ð5; 30; 45Þ ¼ ð1; 42; 60Þ (b) The zero vector 0 ¼ ð0; 0; . . . ; 0Þ in Rn is similar to the scalar 0 in that, for any vector u ¼ ða1; a2; . . . ; anÞ. u þ 0 ¼ ða1 þ 0; a2 þ 0; . . . ; an þ 0Þ ¼ ða1; a2; . . . ; anÞ ¼ u (c) Let u ¼ 2 3 4 2 4 3 5 and v ¼ 3 1 2 2 4 3 5. Then 2u 3v ¼ 4 6 8 2 4 3 5 þ 9 3 6 2 4 3 5 ¼ 5 9 2 2 4 3 5. CHAPTER 1 Vectors in Rn and Cn , Spatial Vectors 3
  • 11. Basic properties of vectors under the operations of vector addition and scalar multiplication are described in the following theorem. THEOREM 1.1: For any vectors u; v; w in Rn and any scalars k; k0 in R, (i) ðu þ vÞ þ w ¼ u þ ðv þ wÞ, (v) kðu þ vÞ ¼ ku þ kv, (ii) u þ 0 ¼ u; (vi) ðk þ k0 Þu ¼ ku þ k0 u, (iii) u þ ðuÞ ¼ 0; (vii) (kk’)u=k(k’u); (iv) u þ v ¼ v þ u, (viii) 1u ¼ u. We postpone the proof of Theorem 1.1 until Chapter 2, where it appears in the context of matrices (Problem 2.3). Suppose u and v are vectors in Rn for which u ¼ kv for some nonzero scalar k in R. Then u is called a multiple of v. Also, u is said to be in the same or opposite direction as v according to whether k 0 or k 0. 1.4 Dot (Inner) Product Consider arbitrary vectors u and v in Rn ; say, u ¼ ða1; a2; . . . ; anÞ and v ¼ ðb1; b2; . . . ; bnÞ The dot product or inner product or scalar product of u and v is denoted and defined by u v ¼ a1b1 þ a2b2 þ þ anbn That is, u v is obtained by multiplying corresponding components and adding the resulting products. The vectors u and v are said to be orthogonal (or perpendicular) if their dot product is zero—that is, if u v ¼ 0. EXAMPLE 1.3 (a) Let u ¼ ð1; 2; 3Þ, v ¼ ð4; 5; 1Þ, w ¼ ð2; 7; 4Þ. Then, u v ¼ 1ð4Þ 2ð5Þ þ 3ð1Þ ¼ 4 10 3 ¼ 9 u w ¼ 2 14 þ 12 ¼ 0; v w ¼ 8 þ 35 4 ¼ 39 Thus, u and w are orthogonal. (b) Let u ¼ 2 3 4 2 4 3 5 and v ¼ 3 1 2 2 4 3 5. Then u v ¼ 6 3 þ 8 ¼ 11. (c) Suppose u ¼ ð1; 2; 3; 4Þ and v ¼ ð6; k; 8; 2Þ. Find k so that u and v are orthogonal. First obtain u v ¼ 6 þ 2k 24 þ 8 ¼ 10 þ 2k. Then set u v ¼ 0 and solve for k: 10 þ 2k ¼ 0 or 2k ¼ 10 or k ¼ 5 Basic properties of the dot product in Rn (proved in Problem 1.13) follow. THEOREM 1.2: For any vectors u; v; w in Rn and any scalar k in R: (i) ðu þ vÞ w ¼ u w þ v w; (iii) u v ¼ v u, (ii) ðkuÞ v ¼ kðu vÞ, (iv) u u 0; and u u ¼ 0 iff u ¼ 0. Note that (ii) says that we can ‘‘take k out’’ from the first position in an inner product. By (iii) and (ii), u ðkvÞ ¼ ðkvÞ u ¼ kðv uÞ ¼ kðu vÞ 4 CHAPTER 1 Vectors in Rn and Cn , Spatial Vectors
  • 12. That is, we can also ‘‘take k out’’ from the second position in an inner product. The space Rn with the above operations of vector addition, scalar multiplication, and dot product is usually called Euclidean n-space. Norm (Length) of a Vector The norm or length of a vector u in Rn , denoted by kuk, is defined to be the nonnegative square root of u u. In particular, if u ¼ ða1; a2; . . . ; anÞ, then kuk ¼ ffiffiffiffiffiffiffiffiffi u u p ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi a2 1 þ a2 2 þ þ a2 n q That is, kuk is the square root of the sum of the squares of the components of u. Thus, kuk 0, and kuk ¼ 0 if and only if u ¼ 0. A vector u is called a unit vector if kuk ¼ 1 or, equivalently, if u u ¼ 1. For any nonzero vector v in Rn , the vector ^ v ¼ 1 kvk v ¼ v kvk is the unique unit vector in the same direction as v. The process of finding ^ v from v is called normalizing v. EXAMPLE 1.4 (a) Suppose u ¼ ð1; 2; 4; 5; 3Þ. To find kuk, we can first find kuk2 ¼ u u by squaring each component of u and adding, as follows: kuk2 ¼ 12 þ ð2Þ2 þ ð4Þ2 þ 52 þ 32 ¼ 1 þ 4 þ 16 þ 25 þ 9 ¼ 55 Then kuk ¼ ffiffiffiffiffi 55 p . (b) Let v ¼ ð1; 3; 4; 2Þ and w ¼ ð1 2 ; 1 6 ; 5 6 ; 1 6Þ. Then kvk ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 1 þ 9 þ 16 þ 4 p ¼ ffiffiffiffiffi 30 p and kwk ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 9 36 þ 1 36 þ 25 36 þ 1 36 r ¼ ffiffiffiffiffi 36 36 r ¼ ffiffiffi 1 p ¼ 1 Thus w is a unit vector, but v is not a unit vector. However, we can normalize v as follows: ^ v ¼ v kvk ¼ 1 ffiffiffiffiffi 30 p ; 3 ffiffiffiffiffi 30 p ; 4 ffiffiffiffiffi 30 p ; 2 ffiffiffiffiffi 30 p This is the unique unit vector in the same direction as v. The following formula (proved in Problem 1.14) is known as the Schwarz inequality or Cauchy– Schwarz inequality. It is used in many branches of mathematics. THEOREM 1.3 (Schwarz): For any vectors u; v in Rn , ju vj kukkvk. Using the above inequality, we also prove (Problem 1.15) the following result known as the ‘‘triangle inequality’’ or Minkowski’s inequality. THEOREM 1.4 (Minkowski): For any vectors u; v in Rn , ku þ vk kuk þ kvk. Distance, Angles, Projections The distance between vectors u ¼ ða1; a2; . . . ; anÞ and v ¼ ðb1; b2; . . . ; bnÞ in Rn is denoted and defined by dðu; vÞ ¼ ku vk ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ða1 b1Þ2 þ ða2 b2Þ2 þ þ ðan bnÞ2 q One can show that this definition agrees with the usual notion of distance in the Euclidean plane R2 or space R3 . CHAPTER 1 Vectors in Rn and Cn , Spatial Vectors 5
  • 13. The angle y between nonzero vectors u; v in Rn is defined by cos y ¼ u v kukkvk This definition is well defined, because, by the Schwarz inequality (Theorem 1.3), 1 u v kukkvk 1 Note that if u v ¼ 0, then y ¼ 90 (or y ¼ p=2). This then agrees with our previous definition of orthogonality. The projection of a vector u onto a nonzero vector v is the vector denoted and defined by projðu; vÞ ¼ u v kvk2 v ¼ u v v v v We show below that this agrees with the usual notion of vector projection in physics. EXAMPLE 1.5 (a) Suppose u ¼ ð1; 2; 3Þ and v ¼ ð2; 4; 5Þ. Then dðu; vÞ ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ð1 2Þ2 þ ð2 4Þ2 þ ð3 5Þ2 q ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 1 þ 36 þ 4 p ¼ ffiffiffiffiffi 41 p To find cos y, where y is the angle between u and v, we first find u v ¼ 2 8 þ 15 ¼ 9; kuk2 ¼ 1 þ 4 þ 9 ¼ 14; kvk2 ¼ 4 þ 16 þ 25 ¼ 45 Then cos y ¼ u v kukkvk ¼ 9 ffiffiffiffiffi 14 p ffiffiffiffiffi 45 p Also, projðu; vÞ ¼ u v kvk2 v ¼ 9 45 ð2; 4; 5Þ ¼ 1 5 ð2; 4; 5Þ ¼ 2 5 ; 4 5 ; 1 (b) Consider the vectors u and v in Fig. 1-2(a) (with respective endpoints A and B). The (perpendicular) projection of u onto v is the vector u* with magnitude ku*k ¼ kuk cos y ¼ kuk u v kukvk ¼ u v kvk To obtain u*, we multiply its magnitude by the unit vector in the direction of v, obtaining u* ¼ ku*k v kvk ¼ u v kvk v kvk ¼ u v kvk2 v This is the same as the above definition of projðu; vÞ. Figure 1-2 z y x 0 u ( ) b B b b b ( , , ) 1 2 3 u = B – A A a a a ( , , ) 1 2 3 P b a b a b a ( – , – , – ) 1 1 2 2 3 3 0 u ( ) a Projection of onto u* u A u* B C θ 6 CHAPTER 1 Vectors in Rn and Cn , Spatial Vectors
  • 14. 1.5 Located Vectors, Hyperplanes, Lines, Curves in Rn This section distinguishes between an n-tuple PðaiÞ Pða1; a2; . . . ; anÞ viewed as a point in Rn and an n-tuple u ¼ ½c1; c2; . . . ; cn viewed as a vector (arrow) from the origin O to the point Cðc1; c2; . . . ; cnÞ. Located Vectors Any pair of points AðaiÞ and BðbiÞ in Rn defines the located vector or directed line segment from A to B, written AB ! . We identify AB ! with the vector u ¼ B A ¼ ½b1 a1; b2 a2; . . . ; bn an because AB ! and u have the same magnitude and direction. This is pictured in Fig. 1-2(b) for the points Aða1; a2; a3Þ and Bðb1; b2; b3Þ in R3 and the vector u ¼ B A which has the endpoint Pðb1 a1, b2 a2, b3 a3Þ. Hyperplanes A hyperplane H in Rn is the set of points ðx1; x2; . . . ; xnÞ that satisfy a linear equation a1x1 þ a2x2 þ þ anxn ¼ b where the vector u ¼ ½a1; a2; . . . ; an of coefficients is not zero. Thus a hyperplane H in R2 is a line, and a hyperplane H in R3 is a plane. We show below, as pictured in Fig. 1-3(a) for R3 , that u is orthogonal to any directed line segment PQ ! , where Pð piÞ and QðqiÞ are points in H: [For this reason, we say that u is normal to H and that H is normal to u:] Because Pð piÞ and QðqiÞ belong to H; they satisfy the above hyperplane equation—that is, a1 p1 þ a2 p2 þ þ an pn ¼ b and a1q1 þ a2q2 þ þ anqn ¼ b v ¼ PQ ! ¼ Q P ¼ ½q1 p1; q2 p2; . . . ; qn pn Let Then u v ¼ a1ðq1 p1Þ þ a2ðq2 p2Þ þ þ anðqn pnÞ ¼ ða1q1 þ a2q2 þ þ anqnÞ ða1 p1 þ a2 p2 þ þ an pnÞ ¼ b b ¼ 0 Thus v ¼ PQ ! is orthogonal to u; as claimed. Figure 1-3 CHAPTER 1 Vectors in Rn and Cn , Spatial Vectors 7
  • 15. Lines in Rn The line L in Rn passing through the point Pðb1; b2; . . . ; bnÞ and in the direction of a nonzero vector u ¼ ½a1; a2; . . . ; an consists of the points Xðx1; x2; . . . ; xnÞ that satisfy X ¼ P þ tu or x1 ¼ a1t þ b1 x2 ¼ a2t þ b2 :::::::::::::::::::: xn ¼ ant þ bn or LðtÞ ¼ ðait þ biÞ 8 : where the parameter t takes on all real values. Such a line L in R3 is pictured in Fig. 1-3(b). EXAMPLE 1.6 (a) Let H be the plane in R3 corresponding to the linear equation 2x 5y þ 7z ¼ 4. Observe that Pð1; 1; 1Þ and Qð5; 4; 2Þ are solutions of the equation. Thus P and Q and the directed line segment v ¼ PQ ! ¼ Q P ¼ ½5 1; 4 1; 2 1 ¼ ½4; 3; 1 lie on the plane H. The vector u ¼ ½2; 5; 7 is normal to H, and, as expected, u v ¼ ½2; 5; 7 ½4; 3; 1 ¼ 8 15 þ 7 ¼ 0 That is, u is orthogonal to v. (b) Find an equation of the hyperplane H in R4 that passes through the point Pð1; 3; 4; 2Þ and is normal to the vector u ¼ ½4; 2; 5; 6. The coefficients of the unknowns of an equation of H are the components of the normal vector u; hence, the equation of H must be of the form 4x1 2x2 þ 5x3 þ 6x4 ¼ k Substituting P into this equation, we obtain 4ð1Þ 2ð3Þ þ 5ð4Þ þ 6ð2Þ ¼ k or 4 6 20 þ 12 ¼ k or k ¼ 10 Thus, 4x1 2x2 þ 5x3 þ 6x4 ¼ 10 is the equation of H. (c) Find the parametric representation of the line L in R4 passing through the point Pð1; 2; 3; 4Þ and in the direction of u ¼ ½5; 6; 7; 8. Also, find the point Q on L when t ¼ 1. Substitution in the above equation for L yields the following parametric representation: x1 ¼ 5t þ 1; x2 ¼ 6t þ 2; x3 ¼ 7t þ 3; x4 ¼ 8t 4 or, equivalently, LðtÞ ¼ ð5t þ 1; 6t þ 2; 7t þ 3; 8t 4Þ Note that t ¼ 0 yields the point P on L. Substitution of t ¼ 1 yields the point Qð6; 8; 4; 4Þ on L. Curves in Rn Let D be an interval (finite or infinite) on the real line R. A continuous function F: D ! Rn is a curve in Rn . Thus, to each point t 2 D there is assigned the following point in Rn : FðtÞ ¼ ½F1ðtÞ; F2ðtÞ; . . . ; FnðtÞ Moreover, the derivative (if it exists) of FðtÞ yields the vector VðtÞ ¼ dFðtÞ dt ¼ dF1ðtÞ dt ; dF2ðtÞ dt ; . . . ; dFnðtÞ dt 8 CHAPTER 1 Vectors in Rn and Cn , Spatial Vectors
  • 16. which is tangent to the curve. Normalizing VðtÞ yields TðtÞ ¼ VðtÞ kVðtÞk Thus, TðtÞ is the unit tangent vector to the curve. (Unit vectors with geometrical significance are often presented in bold type.) EXAMPLE 1.7 Consider the curve FðtÞ ¼ ½sin t; cos t; t in R3 . Taking the derivative of FðtÞ [or each component of FðtÞ] yields VðtÞ ¼ ½cos t; sin t; 1 which is a vector tangent to the curve. We normalize VðtÞ. First we obtain kVðtÞk2 ¼ cos2 t þ sin2 t þ 1 ¼ 1 þ 1 ¼ 2 Then the unit tangent vection TðtÞ to the curve follows: TðtÞ ¼ VðtÞ kVðtÞk ¼ cos t ffiffiffi 2 p ; sin t ffiffiffi 2 p ; 1 ffiffiffi 2 p 1.6 Vectors in R3 (Spatial Vectors), ijk Notation Vectors in R3 , called spatial vectors, appear in many applications, especially in physics. In fact, a special notation is frequently used for such vectors as follows: i ¼ ½1; 0; 0 denotes the unit vector in the x direction: j ¼ ½0; 1; 0 denotes the unit vector in the y direction: k ¼ ½0; 0; 1 denotes the unit vector in the z direction: Then any vector u ¼ ½a; b; c in R3 can be expressed uniquely in the form u ¼ ½a; b; c ¼ ai þ bj þ cj Because the vectors i; j; k are unit vectors and are mutually orthogonal, we obtain the following dot products: i i ¼ 1; j j ¼ 1; k k ¼ 1 and i j ¼ 0; i k ¼ 0; j k ¼ 0 Furthermore, the vector operations discussed above may be expressed in the ijk notation as follows. Suppose u ¼ a1i þ a2j þ a3k and v ¼ b1i þ b2j þ b3k Then u þ v ¼ ða1 þ b1Þi þ ða2 þ b2Þj þ ða3 þ b3Þk and cu ¼ ca1i þ ca2j þ ca3k where c is a scalar. Also, u v ¼ a1b1 þ a2b2 þ a3b3 and kuk ¼ ffiffiffiffiffiffiffiffiffi u u p ¼ a2 1 þ a2 2 þ a2 3 EXAMPLE 1.8 Suppose u ¼ 3i þ 5j 2k and v ¼ 4i 8j þ 7k. (a) To find u þ v, add corresponding components, obtaining u þ v ¼ 7i 3j þ 5k (b) To find 3u 2v, first multiply by the scalars and then add: 3u 2v ¼ ð9i þ 13j 6kÞ þ ð8i þ 16j 14kÞ ¼ i þ 29j 20k CHAPTER 1 Vectors in Rn and Cn , Spatial Vectors 9
  • 17. (c) To find u v, multiply corresponding components and then add: u v ¼ 12 40 14 ¼ 42 (d) To find kuk, take the square root of the sum of the squares of the components: kuk ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 9 þ 25 þ 4 p ¼ ffiffiffiffiffi 38 p Cross Product There is a special operation for vectors u and v in R3 that is not defined in Rn for n 6¼ 3. This operation is called the cross product and is denoted by u v. One way to easily remember the formula for u v is to use the determinant (of order two) and its negative, which are denoted and defined as follows: a b c d ¼ ad bc and a b c d ¼ bc ad Here a and d are called the diagonal elements and b and c are the nondiagonal elements. Thus, the determinant is the product ad of the diagonal elements minus the product bc of the nondiagonal elements, but vice versa for the negative of the determinant. Now suppose u ¼ a1i þ a2j þ a3k and v ¼ b1i þ b2j þ b3k. Then u v ¼ ða2b3 a3b2Þi þ ða3b1 a1b3Þj þ ða1b2 a2b1Þk ¼ a1 a2 a3 b1 b2 b3 i a1 a2 a3 b1 b2 b3 j þ a1 a2 a3 b1 b2 b3 i That is, the three components of u v are obtained from the array a1 a2 a3 b1 b2 b3 (which contain the components of u above the component of v) as follows: (1) Cover the first column and take the determinant. (2) Cover the second column and take the negative of the determinant. (3) Cover the third column and take the determinant. Note that u v is a vector; hence, u v is also called the vector product or outer product of u and v. EXAMPLE 1.9 Find u v where: (a) u ¼ 4i þ 3j þ 6k, v ¼ 2i þ 5j 3k, (b) u ¼ ½2; 1; 5, v ¼ ½3; 7; 6. (a) Use 4 3 6 2 5 3 to get u v ¼ ð9 30Þi þ ð12 þ 12Þj þ ð20 6Þk ¼ 39i þ 24j þ 14k (b) Use 2 1 5 3 7 6 to get u v ¼ ½6 35; 15 12; 14 þ 3 ¼ ½41; 3; 17 Remark: The cross products of the vectors i; j; k are as follows: i j ¼ k; j k ¼ i; k i ¼ j j i ¼ k; k j ¼ i; i k ¼ j Thus, if we view the triple ði; j; kÞ as a cyclic permutation, where i follows k and hence k precedes i, then the product of two of them in the given direction is the third one, but the product of two of them in the opposite direction is the negative of the third one. Two important properties of the cross product are contained in the following theorem. 10 CHAPTER 1 Vectors in Rn and Cn , Spatial Vectors
  • 18. THEOREM 1.5: Let u; v; w be vectors in R3 . (a) The vector u v is orthogonal to both u and v. (b) The absolute value of the ‘‘triple product’’ u v w represents the volume of the parallelopiped formed by the vectors u; v, w. [See Fig. 1-4(a).] We note that the vectors u; v, u v form a right-handed system, and that the following formula gives the magnitude of u v: ku vk ¼ kukkvk sin y where y is the angle between u and v. 1.7 Complex Numbers The set of complex numbers is denoted by C. Formally, a complex number is an ordered pair ða; bÞ of real numbers where equality, addition, and multiplication are defined as follows: ða; bÞ ¼ ðc; dÞ if and only if a ¼ c and b ¼ d ða; bÞ þ ðc; dÞ ¼ ða þ c; b þ dÞ ða; bÞ ðc; dÞ ¼ ðac bd; ad þ bcÞ We identify the real number a with the complex number ða; 0Þ; that is, a $ ða; 0Þ This is possible because the operations of addition and multiplication of real numbers are preserved under the correspondence; that is, ða; 0Þ þ ðb; 0Þ ¼ ða þ b; 0Þ and ða; 0Þ ðb; 0Þ ¼ ðab; 0Þ Thus we view R as a subset of C, and replace ða; 0Þ by a whenever convenient and possible. We note that the set C of complex numbers with the above operations of addition and multiplication is a field of numbers, like the set R of real numbers and the set Q of rational numbers. Figure 1-4 CHAPTER 1 Vectors in Rn and Cn , Spatial Vectors 11
  • 19. The complex number ð0; 1Þ is denoted by i. It has the important property that i2 ¼ ii ¼ ð0; 1Þð0; 1Þ ¼ ð1; 0Þ ¼ 1 or i ¼ ffiffiffiffiffiffiffi 1 p Accordingly, any complex number z ¼ ða; bÞ can be written in the form z ¼ ða; bÞ ¼ ða; 0Þ þ ð0; bÞ ¼ ða; 0Þ þ ðb; 0Þ ð0; 1Þ ¼ a þ bi The above notation z ¼ a þ bi, where a Re z and b Im z are called, respectively, the real and imaginary parts of z, is more convenient than ða; bÞ. In fact, the sum and product of complex numbers z ¼ a þ bi and w ¼ c þ di can be derived by simply using the commutative and distributive laws and i2 ¼ 1: z þ w ¼ ða þ biÞ þ ðc þ diÞ ¼ a þ c þ bi þ di ¼ ða þ bÞ þ ðc þ dÞi zw ¼ ða þ biÞðc þ diÞ ¼ ac þ bci þ adi þ bdi2 ¼ ðac bdÞ þ ðbc þ adÞi We also define the negative of z and subtraction in C by z ¼ 1z and w z ¼ w þ ðzÞ Warning: The letter i representing ffiffiffiffiffiffiffi 1 p has no relationship whatsoever to the vector i ¼ ½1; 0; 0 in Section 1.6. Complex Conjugate, Absolute Value Consider a complex number z ¼ a þ bi. The conjugate of z is denoted and defined by z ¼ a þ bi ¼ a bi Then z z ¼ ða þ biÞða biÞ ¼ a2 b2 i2 ¼ a2 þ b2 . Note that z is real if and only if z ¼ z. The absolute value of z, denoted by jzj, is defined to be the nonnegative square root of z z. Namely, jzj ¼ ffiffiffiffi z z p ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi a2 þ b2 p Note that jzj is equal to the norm of the vector ða; bÞ in R2 . Suppose z 6¼ 0. Then the inverse z1 of z and division in C of w by z are given, respectively, by z1 ¼ z z z ¼ a a2 þ b2 b a2 þ b2 i and w z w z z z ¼ wz1 EXAMPLE 1.10 Suppose z ¼ 2 þ 3i and w ¼ 5 2i. Then z þ w ¼ ð2 þ 3iÞ þ ð5 2iÞ ¼ 2 þ 5 þ 3i 2i ¼ 7 þ i zw ¼ ð2 þ 3iÞð5 2iÞ ¼ 10 þ 15i 4i 6i2 ¼ 16 þ 11i z ¼ 2 þ 3i ¼ 2 3i and w ¼ 5 2i ¼ 5 þ 2i w z ¼ 5 2i 2 þ 3i ¼ ð5 2iÞð2 3iÞ ð2 þ 3iÞð2 3iÞ ¼ 4 19i 13 ¼ 4 13 19 13 i jzj ¼ ffiffiffiffiffiffiffiffiffiffiffi 4 þ 9 p ¼ ffiffiffiffiffi 13 p and jwj ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffi ffi 25 þ 4 p ¼ ffiffiffiffiffi 29 p Complex Plane Recall that the real numbers R can be represented by points on a line. Analogously, the complex numbers C can be represented by points in the plane. Specifically, we let the point ða; bÞ in the plane represent the complex number a þ bi as shown in Fig. 1-4(b). In such a case, jzj is the distance from the origin O to the point z. The plane with this representation is called the complex plane, just like the line representing R is called the real line. 12 CHAPTER 1 Vectors in Rn and Cn , Spatial Vectors
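The conjugate, absolute value, and quotient formulas above are easy to verify with Python's built-in complex type (a sketch for illustration only; Python is not part of the text). The values are those of Example 1.10, where z = 2 + 3i and w = 5 − 2i.

```python
from math import sqrt

z = 2 + 3j           # Python writes the imaginary unit as j
w = 5 - 2j

print(z + w)                          # (7+1j)
print(z * w)                          # (16+11j)
print(z.conjugate(), w.conjugate())   # (2-3j) (5+2j)

# |z| = sqrt(z * conj(z)) = sqrt(a^2 + b^2), the norm of (a, b) in R^2
print(abs(z), sqrt(13))               # both 3.6055...
print(abs(w), sqrt(29))               # both 5.3851...

# w / z = (w * conj(z)) / (z * conj(z)); Example 1.10 gives (4 - 19i) / 13
print(w / z)                          # (0.3076...-1.4615...j)
print(4 / 13, -19 / 13)               # 0.3076...  -1.4615...
```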
  • 20. 1.8 Vectors in Cn The set of all n-tuples of complex numbers, denoted by Cn , is called complex n-space. Just as in the real case, the elements of Cn are called points or vectors, the elements of C are called scalars, and vector addition in Cn and scalar multiplication on Cn are given by ½z1; z2; . . . ; zn þ ½w1; w2; . . . ; wn ¼ ½z1 þ w1; z2 þ w2; . . . ; zn þ wn z½z1; z2; . . . ; zn ¼ ½zz1; zz2; . . . ; zzn where the zi, wi, and z belong to C. EXAMPLE 1.11 Consider vectors u ¼ ½2 þ 3i; 4 i; 3 and v ¼ ½3 2i; 5i; 4 6i in C3 . Then u þ v ¼ ½2 þ 3i; 4 i; 3 þ ½3 2i; 5i; 4 6i ¼ ½5 þ i; 4 þ 4i; 7 6i ð5 2iÞu ¼ ½ð5 2iÞð2 þ 3iÞ; ð5 2iÞð4 iÞ; ð5 2iÞð3Þ ¼ ½16 þ 11i; 18 13i; 15 6i Dot (Inner) Product in Cn Consider vectors u ¼ ½z1; z2; . . . ; zn and v ¼ ½w1; w2; . . . ; wn in Cn . The dot or inner product of u and v is denoted and defined by u v ¼ z1 w1 þ z2 w2 þ þ zn wn This definition reduces to the real case because wi ¼ wi when wi is real. The norm of u is defined by kuk ¼ ffiffiffiffiffiffiffiffiffi u u p ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi z1 z1 þ z2 z2 þ þ zn zn p ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi jz1j2 þ jz2j2 þ þ jvnj2 q We emphasize that u u and so kuk are real and positive when u 6¼ 0 and 0 when u ¼ 0. EXAMPLE 1.12 Consider vectors u ¼ ½2 þ 3i; 4 i; 3 þ 5i and v ¼ ½3 4i; 5i; 4 2i in C3. Then u v ¼ ð2 þ 3iÞð3 4iÞ þ ð4 iÞð5iÞ þ ð3 þ 5iÞð4 2iÞ ¼ ð2 þ 3iÞð3 þ 4iÞ þ ð4 iÞð5iÞ þ ð3 þ 5iÞð4 þ 2iÞ ¼ ð6 þ 13iÞ þ ð5 20iÞ þ ð2 þ 26iÞ ¼ 9 þ 19i u u ¼ j2 þ 3ij2 þ j4 ij2 þ j3 þ 5ij2 ¼ 4 þ 9 þ 16 þ 1 þ 9 þ 25 ¼ 64 kuk ¼ ffiffiffiffiffi 64 p ¼ 8 The space Cn with the above operations of vector addition, scalar multiplication, and dot product, is called complex Euclidean n-space. Theorem 1.2 for Rn also holds for Cn if we replace u v ¼ v u by u v ¼ u v On the other hand, the Schwarz inequality (Theorem 1.3) and Minkowski’s inequality (Theorem 1.4) are true for Cn with no changes. SOLVED PROBLEMS Vectors in Rn 1.1. Determine which of the following vectors are equal: u1 ¼ ð1; 2; 3Þ; u2 ¼ ð2; 3; 1Þ; u3 ¼ ð1; 3; 2Þ; u4 ¼ ð2; 3; 1Þ Vectors are equal only when corresponding entries are equal; hence, only u2 ¼ u4. CHAPTER 1 Vectors in Rn and Cn , Spatial Vectors 13
  • 21. 1.2. Let u ¼ ð2; 7; 1Þ, v ¼ ð3; 0; 4Þ, w ¼ ð0; 5; 8Þ. Find: (a) 3u 4v, (b) 2u þ 3v 5w. First perform the scalar multiplication and then the vector addition. (a) 3u 4v ¼ 3ð2; 7; 1Þ 4ð3; 0; 4Þ ¼ ð6; 21; 3Þ þ ð12; 0; 16Þ ¼ ð18; 21; 13Þ (b) 2u þ 3v 5w ¼ ð4; 14; 2Þ þ ð9; 0; 12Þ þ ð0; 25; 40Þ ¼ ð5; 39; 54Þ 1.3. Let u ¼ 5 3 4 2 4 3 5; v ¼ 1 5 2 2 4 3 5; w ¼ 3 1 2 2 4 3 5. Find: (a) 5u 2v, (b) 2u þ 4v 3w. First perform the scalar multiplication and then the vector addition: (a) 5u 2v ¼ 5 5 3 4 2 4 3 5 2 1 5 2 2 4 3 5 ¼ 25 15 20 2 4 3 5 þ 2 10 4 2 4 3 5 ¼ 27 5 24 2 4 3 5 (b) 2u þ 4v 3w ¼ 10 6 8 2 4 3 5 þ 4 20 8 2 4 3 5 þ 9 3 6 2 4 3 5 ¼ 23 17 22 2 4 3 5 1.4. Find x and y, where: (a) ðx; 3Þ ¼ ð2; x þ yÞ, (b) ð4; yÞ ¼ xð2; 3Þ. (a) Because the vectors are equal, set the corresponding entries equal to each other, yielding x ¼ 2; 3 ¼ x þ y Solve the linear equations, obtaining x ¼ 2; y ¼ 1: (b) First multiply by the scalar x to obtain ð4; yÞ ¼ ð2x; 3xÞ. Then set corresponding entries equal to each other to obtain 4 ¼ 2x; y ¼ 3x Solve the equations to yield x ¼ 2, y ¼ 6. 1.5. Write the vector v ¼ ð1; 2; 5Þ as a linear combination of the vectors u1 ¼ ð1; 1; 1Þ, u2 ¼ ð1; 2; 3Þ, u3 ¼ ð2; 1; 1Þ. We want to express v in the form v ¼ xu1 þ yu2 þ zu3 with x; y; z as yet unknown. First we have 1 2 5 2 4 3 5 ¼ x 1 1 1 2 4 3 5 þ y 1 2 3 2 4 3 5 þ z 2 1 1 2 4 3 5 ¼ x þ y þ 2z x þ 2y z x þ 3y þ z 2 4 3 5 (It is more convenient to write vectors as columns than as rows when forming linear combinations.) Set corresponding entries equal to each other to obtain x þ y þ 2z ¼ 1 x þ 2y z ¼ 2 x þ 3y þ z ¼ 5 or x þ y þ 2z ¼ 1 y 3z ¼ 3 2y z ¼ 4 or x þ y þ 2z ¼ 1 y 3z ¼ 3 5z ¼ 10 This unique solution of the triangular system is x ¼ 6, y ¼ 3, z ¼ 2. Thus, v ¼ 6u1 þ 3u2 þ 2u3. 14 CHAPTER 1 Vectors in Rn and Cn , Spatial Vectors
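Writing v as a linear combination of u1, u2, u3, as in Problem 1.5, amounts to solving the linear system set up there. A minimal NumPy sketch (NumPy is an assumption of this example, not part of the text), using the vectors of Problem 1.5:

```python
import numpy as np

# Columns of M are u1, u2, u3 (written as columns, as suggested in Problem 1.5)
u1 = np.array([1, 1, 1])
u2 = np.array([1, 2, 3])
u3 = np.array([2, -1, 1])
v  = np.array([1, -2, 5])

M = np.column_stack([u1, u2, u3])
x, y, z = np.linalg.solve(M, v)       # solves M @ [x, y, z] = v
print(x, y, z)                        # -6.0  3.0  2.0

# Check: v = -6*u1 + 3*u2 + 2*u3
print(-6 * u1 + 3 * u2 + 2 * u3)      # [ 1 -2  5]
```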
  • 22. 1.6. Write v ¼ ð2; 5; 3Þ as a linear combination of u1 ¼ ð1; 3; 2Þ; u2 ¼ ð2; 4; 1Þ; u3 ¼ ð1; 5; 7Þ: Find the equivalent system of linear equations and then solve. First, 2 5 3 2 4 3 5 ¼ x 1 3 2 2 4 3 5 þ y 2 4 1 2 4 3 5 þ z 1 5 7 2 4 3 5 ¼ x þ 2y þ z 3x 4y 5z 2x y þ 7z 2 4 3 5 Set the corresponding entries equal to each other to obtain x þ 2y þ z ¼ 2 3x 4y 5z ¼ 5 2x y þ 7z ¼ 3 or x þ 2y þ z ¼ 2 2y 2z ¼ 1 5y þ 5z ¼ 1 or x þ 2y þ z ¼ 2 2y 2z ¼ 1 0 ¼ 3 The third equation, 0x þ 0y þ 0z ¼ 3, indicates that the system has no solution. Thus, v cannot be written as a linear combination of the vectors u1, u2, u3. Dot (Inner) Product, Orthogonality, Norm in Rn 1.7. Find u v where: (a) u ¼ ð2; 5; 6Þ and v ¼ ð8; 2; 3Þ, (b) u ¼ ð4; 2; 3; 5; 1Þ and v ¼ ð2; 6; 1; 4; 8Þ. Multiply the corresponding components and add: (a) u v ¼ 2ð8Þ 5ð2Þ þ 6ð3Þ ¼ 16 10 18 ¼ 12 (b) u v ¼ 8 þ 12 þ 3 20 8 ¼ 5 1.8. Let u ¼ ð5; 4; 1Þ, v ¼ ð3; 4; 1Þ, w ¼ ð1; 2; 3Þ. Which pair of vectors, if any, are perpendicular (orthogonal)? Find the dot product of each pair of vectors: u v ¼ 15 16 þ 1 ¼ 0; v w ¼ 3 þ 8 þ 3 ¼ 14; u w ¼ 5 8 þ 3 ¼ 0 Thus, u and v are orthogonal, u and w are orthogonal, but v and w are not. 1.9. Find k so that u and v are orthogonal, where: (a) u ¼ ð1; k; 3Þ and v ¼ ð2; 5; 4Þ, (b) u ¼ ð2; 3k; 4; 1; 5Þ and v ¼ ð6; 1; 3; 7; 2kÞ. Compute u v, set u v equal to 0, and then solve for k: (a) u v ¼ 1ð2Þ þ kð5Þ 3ð4Þ ¼ 5k 10. Then 5k 10 ¼ 0, or k ¼ 2. (b) u v ¼ 12 3k 12 þ 7 þ 10k ¼ 7k þ 7. Then 7k þ 7 ¼ 0, or k ¼ 1. 1.10. Find kuk, where: (a) u ¼ ð3; 12; 4Þ, (b) u ¼ ð2; 3; 8; 7Þ. First find kuk2 ¼ u u by squaring the entries and adding. Then kuk ¼ ffiffiffiffiffiffiffiffiffiffi kuk2 q . (a) kuk2 ¼ ð3Þ2 þ ð12Þ2 þ ð4Þ2 ¼ 9 þ 144 þ 16 ¼ 169. Then kuk ¼ ffiffiffiffiffiffiffiffi 169 p ¼ 13. (b) kuk2 ¼ 4 þ 9 þ 64 þ 49 ¼ 126. Then kuk ¼ ffiffiffiffiffiffiffiffi 126 p . CHAPTER 1 Vectors in Rn and Cn , Spatial Vectors 15
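Because u · v in Problem 1.9 is linear in k, the value of k that makes the vectors orthogonal can be read off from two evaluations. A short plain-Python sketch (illustrative, not part of the text); the vectors are those of Problems 1.9(a) and 1.10(a), with signs chosen to match the dot product −5k − 10 computed in the solution.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return dot(u, u) ** 0.5

# Problem 1.9(a): u = (1, k, -3), v = (2, -5, 4), so u.v = -5k - 10 is linear in k
def f(k):
    return dot((1, k, -3), (2, -5, 4))

# For a linear function f(k) = a*k + b, the root is k = -b / a
a = f(1) - f(0)
b = f(0)
k = -b / a
print(k, f(k))               # -2.0  0.0  -> u and v are orthogonal for k = -2

# Problem 1.10(a): the squares 9 + 144 + 16 = 169 give ||u|| = 13
# (the signs of the entries do not affect the norm)
print(norm((3, -12, -4)))    # 13.0
```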
  • 23. 1.11. Recall that normalizing a nonzero vector v means finding the unique unit vector ^ v in the same direction as v, where ^ v ¼ 1 kvk v Normalize: (a) u ¼ ð3; 4Þ, (b) v ¼ ð4; 2; 3; 8Þ, (c) w ¼ ð1 2, 2 3, 1 4). (a) First find kuk ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffi 9 þ 16 p ¼ ffiffiffiffiffi 25 p ¼ 5. Then divide each entry of u by 5, obtaining ^ u ¼ ð3 5, 4 5). (b) Here kvk ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 16 þ 4 þ 9 þ 64 p ¼ ffiffiffiffiffi 93 p . Then ^ v ¼ 4 ffiffiffiffiffi 93 p ; 2 ffiffiffiffiffi 93 p ; 3 ffiffiffiffiffi 93 p ; 8 ffiffiffiffiffi 93 p (c) Note that w and any positive multiple of w will have the same normalized form. Hence, first multiply w by 12 to ‘‘clear fractions’’—that is, first find w0 ¼ 12w ¼ ð6; 8; 3Þ. Then kw0 k ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 36 þ 64 þ 9 p ¼ ffiffiffiffiffiffiffiffi 109 p and ^ w ¼ b w0 ¼ 6 ffiffiffiffiffiffiffiffi 109 p ; 8 ffiffiffiffiffiffiffiffi 109 p ; 3 ffiffiffiffiffiffiffiffi 109 p 1.12. Let u ¼ ð1; 3; 4Þ and v ¼ ð3; 4; 7Þ. Find: (a) cos y, where y is the angle between u and v; (b) projðu; vÞ, the projection of u onto v; (c) dðu; vÞ, the distance between u and v. First find u v ¼ 3 12 þ 28 ¼ 19, kuk2 ¼ 1 þ 9 þ 16 ¼ 26, kvk2 ¼ 9 þ 16 þ 49 ¼ 74. Then (a) cos y ¼ u v kukkvk ¼ 19 ffiffiffiffiffi 26 p ffiffiffiffiffi 74 p , (b) projðu; vÞ ¼ u v kvk2 v ¼ 19 74 ð3; 4; 7Þ ¼ 57 74 ; 76 74 ; 133 74 ¼ 57 74 ; 38 37 ; 133 74 ; (c) dðu; vÞ ¼ ku vk ¼ kð2; 7 3Þk ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 4 þ 49 þ 9 p ¼ ffiffiffiffiffi 62 p : 1.13. Prove Theorem 1.2: For any u; v; w in Rn and k in R: (i) ðu þ vÞ w ¼ u w þ v w, (ii) ðkuÞ v ¼ kðu vÞ, (iii) u v ¼ v u, (iv) u u 0, and u u ¼ 0 iff u ¼ 0. Let u ¼ ðu1; u2; . . . ; unÞ, v ¼ ðv1; v2; . . . ; vnÞ, w ¼ ðw1; w2; . . . ; wnÞ. (i) Because u þ v ¼ ðu1 þ v1; u2 þ v2; . . . ; un þ vnÞ, ðu þ vÞ w ¼ ðu1 þ v1Þw1 þ ðu2 þ v2Þw2 þ þ ðun þ vnÞwn ¼ u1w1 þ v1w1 þ u2w2 þ þ unwn þ vnwn ¼ ðu1w1 þ u2w2 þ þ unwnÞ þ ðv1w1 þ v2w2 þ þ vnwnÞ ¼ u w þ v w (ii) Because ku ¼ ðku1; ku2; . . . ; kunÞ, ðkuÞ v ¼ ku1v1 þ ku2v2 þ þ kunvn ¼ kðu1v1 þ u2v2 þ þ unvnÞ ¼ kðu vÞ (iii) u v ¼ u1v1 þ u2v2 þ þ unvn ¼ v1u1 þ v2u2 þ þ vnun ¼ v u (iv) Because u2 i is nonnegative for each i, and because the sum of nonnegative real numbers is nonnegative, u u ¼ u2 1 þ u2 2 þ þ u2 n 0 Furthermore, u u ¼ 0 iff ui ¼ 0 for each i, that is, iff u ¼ 0. 16 CHAPTER 1 Vectors in Rn and Cn , Spatial Vectors
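The three quantities of Problem 1.12 (cosine of the angle, projection, distance) can be computed in a few lines. A NumPy sketch (NumPy is an assumption, not part of the text), using u = (1, −3, 4) and v = (3, 4, 7) from that problem:

```python
import numpy as np

u = np.array([1, -3, 4])
v = np.array([3, 4, 7])

dot = u @ v                                   # 19
cos_theta = dot / (np.linalg.norm(u) * np.linalg.norm(v))
proj_uv = (dot / (v @ v)) * v                 # proj(u, v) = (u.v / ||v||^2) v
dist = np.linalg.norm(u - v)                  # d(u, v) = ||u - v||

print(dot)          # 19
print(cos_theta)    # 19 / (sqrt(26) * sqrt(74)) = 0.4332...
print(proj_uv)      # [57/74, 76/74, 133/74] = [0.7702..., 1.0270..., 1.7972...]
print(dist)         # sqrt(62) = 7.8740...
```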
  • 24. 1.14. Prove Theorem 1.3 (Schwarz): ju vj kukkvk. For any real number t, and using Theorem 1.2, we have 0 ðtu þ vÞ ðtu þ vÞ ¼ t2 ðu uÞ þ 2tðu vÞ þ ðv vÞ ¼ kuk2 t2 þ 2ðu vÞt þ kvk2 Let a ¼ kuk2 , b ¼ 2ðu vÞ, c ¼ kvk2 . Then, for every value of t, at2 þ bt þ c 0. This means that the quadratic polynomial cannot have two real roots. This implies that the discriminant D ¼ b2 4ac 0 or, equivalently, b2 4ac. Thus, 4ðu vÞ2 4kuk2 kvk2 Dividing by 4 gives us our result. 1.15. Prove Theorem 1.4 (Minkowski): ku þ vk kuk þ kvk. By the Schwarz inequality and other properties of the dot product, ku þ vk2 ¼ ðu þ vÞ ðu þ vÞ ¼ ðu uÞ þ 2ðu vÞ þ ðv vÞ kuk2 þ 2kukkvk þ kvk2 ¼ ðkuk þ kvkÞ2 Taking the square root of both sides yields the desired inequality. Points, Lines, Hyperplanes in Rn Here we distinguish between an n-tuple Pða1; a2; . . . ; anÞ viewed as a point in Rn and an n-tuple u ¼ ½c1; c2; . . . ; cn viewed as a vector (arrow) from the origin O to the point Cðc1; c2; . . . ; cnÞ. 1.16. Find the vector u identified with the directed line segment PQ ! for the points: (a) Pð1; 2; 4Þ and Qð6; 1; 5Þ in R3 , (b) Pð2; 3; 6; 5Þ and Qð7; 1; 4; 8Þ in R4 . (a) u ¼ PQ ! ¼ Q P ¼ ½6 1; 1 ð2Þ; 5 4 ¼ ½5; 3; 9 (b) u ¼ PQ ! ¼ Q P ¼ ½7 2; 1 3; 4 þ 6; 8 5 ¼ ½5; 2; 10; 13 1.17. Find an equation of the hyperplane H in R4 that passes through Pð3; 4; 1; 2Þ and is normal to u ¼ ½2; 5; 6; 3. The coefficients of the unknowns of an equation of H are the components of the normal vector u. Thus, an equation of H is of the form 2x1 þ 5x2 6x3 3x4 ¼ k. Substitute P into this equation to obtain k ¼ 26. Thus, an equation of H is 2x1 þ 5x2 6x3 3x4 ¼ 26. 1.18. Find an equation of the plane H in R3 that contains Pð1; 3; 4Þ and is parallel to the plane H0 determined by the equation 3x 6y þ 5z ¼ 2. The planes H and H0 are parallel if and only if their normal directions are parallel or antiparallel (opposite direction). Hence, an equation of H is of the form 3x 6y þ 5z ¼ k. Substitute P into this equation to obtain k ¼ 1. Then an equation of H is 3x 6y þ 5z ¼ 1. 1.19. Find a parametric representation of the line L in R4 passing through Pð4; 2; 3; 1Þ in the direction of u ¼ ½2; 5; 7; 8. Here L consists of the points XðxiÞ that satisfy X ¼ P þ tu or xi ¼ ait þ bi or LðtÞ ¼ ðait þ biÞ where the parameter t takes on all real values. Thus we obtain x1 ¼ 4 þ 2t; x2 ¼ 2 þ 2t; x3 ¼ 3 7t; x4 ¼ 1 þ 8t or LðtÞ ¼ ð4 þ 2t; 2 þ 2t; 3 7t; 1 þ 8tÞ CHAPTER 1 Vectors in Rn and Cn , Spatial Vectors 17
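The two inequalities just proved (Problems 1.14 and 1.15) can be spot-checked numerically on random vectors. An illustrative NumPy sketch (a sanity test, not a proof; NumPy and the random seed are assumptions of this example):

```python
import numpy as np

rng = np.random.default_rng(0)

for _ in range(1000):
    n = int(rng.integers(1, 6))            # random dimension 1..5
    u = rng.normal(size=n)
    v = rng.normal(size=n)

    # Schwarz:    |u.v| <= ||u|| ||v||
    assert abs(u @ v) <= np.linalg.norm(u) * np.linalg.norm(v) + 1e-12

    # Minkowski:  ||u + v|| <= ||u|| + ||v||
    assert np.linalg.norm(u + v) <= np.linalg.norm(u) + np.linalg.norm(v) + 1e-12

print("Schwarz and Minkowski hold on all sampled vectors")
```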
  • 25. 1.20. Let C be the curve FðtÞ ¼ ðt2 ; 3t 2; t3 ; t2 þ 5Þ in R4 , where 0 t 4. (a) Find the point P on C corresponding to t ¼ 2. (b) Find the initial point Q and terminal point Q 0 of C. (c) Find the unit tangent vector T to the curve C when t ¼ 2. (a) Substitute t ¼ 2 into FðtÞ to get P ¼ f ð2Þ ¼ ð4; 4; 8; 9Þ. (b) The parameter t ranges from t ¼ 0 to t ¼ 4. Hence, Q ¼ f ð0Þ ¼ ð0; 2; 0; 5Þ and Q 0 ¼ Fð4Þ ¼ ð16; 10; 64; 21Þ. (c) Take the derivative of FðtÞ—that is, of each component of FðtÞ—to obtain a vector V that is tangent to the curve: VðtÞ ¼ dFðtÞ dt ¼ ½2t; 3; 3t2 ; 2t Now find V when t ¼ 2; that is, substitute t ¼ 2 in the equation for VðtÞ to obtain V ¼ Vð2Þ ¼ ½4; 3; 12; 4. Then normalize V to obtain the desired unit tangent vector T. We have kVk ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 16 þ 9 þ 144 þ 16 p ¼ ffiffiffiffiffiffiffiffi 185 p and T ¼ 4 ffiffiffiffiffiffiffiffi 185 p ; 3 ffiffiffiffiffiffiffiffi 185 p ; 12 ffiffiffiffiffiffiffiffi 185 p ; 4 ffiffiffiffiffiffiffiffi 185 p Spatial Vectors (Vectors in R3 ), ijk Notation, Cross Product 1.21. Let u ¼ 2i 3j þ 4k, v ¼ 3i þ j 2k, w ¼ i þ 5j þ 3k. Find: (a) u þ v, (b) 2u 3v þ 4w, (c) u v and u w, (d) kuk and kvk. Treat the coefficients of i, j, k just like the components of a vector in R3 . (a) Add corresponding coefficients to get u þ v ¼ 5i 2j 2k. (b) First perform the scalar multiplication and then the vector addition: 2u 3v þ 4w ¼ ð4i 6j þ 8kÞ þ ð9i þ 3j þ 6kÞ þ ð4i þ 20j þ 12kÞ ¼ i þ 17j þ 26k (c) Multiply corresponding coefficients and then add: u v ¼ 6 3 8 ¼ 5 and u w ¼ 2 15 þ 12 ¼ 1 (d) The norm is the square root of the sum of the squares of the coefficients: kuk ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 4 þ 9 þ 16 p ¼ ffiffiffiffiffi 29 p and kvk ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 9 þ 1 þ 4 p ¼ ffiffiffiffiffi 14 p 1.22. Find the (parametric) equation of the line L: (a) through the points Pð1; 3; 2Þ and Qð2; 5; 6Þ; (b) containing the point Pð1; 2; 4Þ and perpendicular to the plane H given by the equation 3x þ 5y þ 7z ¼ 15: (a) First find v ¼ PQ ! ¼ Q P ¼ ½1; 2; 8 ¼ i þ 2j 8k. Then LðtÞ ¼ ðt þ 1; 2t þ 3; 8t þ 2Þ ¼ ðt þ 1Þi þ ð2t þ 3Þj þ ð8t þ 2Þk (b) Because L is perpendicular to H, the line L is in the same direction as the normal vector N ¼ 3i þ 5j þ 7k to H. Thus, LðtÞ ¼ ð3t þ 1; 5t 2; 7t þ 4Þ ¼ ð3t þ 1Þi þ ð5t 2Þj þ ð7t þ 4Þk 1.23. Let S be the surface xy2 þ 2yz ¼ 16 in R3 . (a) Find the normal vector Nðx; y; zÞ to the surface S. (b) Find the tangent plane H to S at the point Pð1; 2; 3Þ. 18 CHAPTER 1 Vectors in Rn and Cn , Spatial Vectors
  • 26. (a) The formula for the normal vector to a surface Fðx; y; zÞ ¼ 0 is Nðx; y; zÞ ¼ Fxi þ Fyj þ Fzk where Fx, Fy, Fz are the partial derivatives. Using Fðx; y; zÞ ¼ xy2 þ 2yz 16, we obtain Fx ¼ y2 ; Fy ¼ 2xy þ 2z; Fz ¼ 2y Thus, Nðx; y; zÞ ¼ y2 i þ ð2xy þ 2zÞj þ 2yk. (b) The normal to the surface S at the point P is NðPÞ ¼ Nð1; 2; 3Þ ¼ 4i þ 10j þ 4k Hence, N ¼ 2i þ 5j þ 2k is also normal to S at P. Thus an equation of H has the form 2x þ 5y þ 2z ¼ c. Substitute P in this equation to obtain c ¼ 18. Thus the tangent plane H to S at P is 2x þ 5y þ 2z ¼ 18. 1.24. Evaluate the following determinants and negative of determinants of order two: (a) (i) 3 4 5 9 , (ii) 2 1 4 3 , (iii) 4 5 3 2 (b) (i) 3 6 4 2 , (ii) 7 5 3 2 , (iii) 4 1 8 3 Use a b c d ¼ ad bc and a b c d ¼ bc ad. Thus, (a) (i) 27 20 ¼ 7, (ii) 6 þ 4 ¼ 10, (iii) 8 þ 15 ¼ 7: (b) (i) 24 6 ¼ 18, (ii) 15 14 ¼ 29, (iii) 8 þ 12 ¼ 4: 1.25. Let u ¼ 2i 3j þ 4k, v ¼ 3i þ j 2k, w ¼ i þ 5j þ 3k. Find: (a) u v, (b) u w (a) Use 2 3 4 3 1 2 to get u v ¼ ð6 4Þi þ ð12 þ 4Þj þ ð2 þ 9Þk ¼ 2i þ 16j þ 11k: (b) Use 2 3 4 1 5 3 to get u w ¼ ð9 20Þi þ ð4 6Þj þ ð10 þ 3Þk ¼ 29i 2j þ 13k: 1.26. Find u v, where: (a) u ¼ ð1; 2; 3Þ, v ¼ ð4; 5; 6Þ; (b) u ¼ ð4; 7; 3Þ, v ¼ ð6; 5; 2Þ. (a) Use 1 2 3 4 5 6 to get u v ¼ ½12 15; 12 6; 5 8 ¼ ½3; 6; 3: (b) Use 4 7 3 6 5 2 to get u v ¼ ½14 þ 15; 18 þ 8; 20 42 ¼ ½29; 26; 22: 1.27. Find a unit vector u orthogonal to v ¼ ½1; 3; 4 and w ¼ ½2; 6; 5. First find v w, which is orthogonal to v and w. The array 1 3 4 2 6 5 gives v w ¼ ½15 þ 24; 8 þ 5; 6 61 ¼ ½9; 13; 12: Normalize v w to get u ¼ ½9= ffiffiffiffiffiffiffiffi 394 p , 13= ffiffiffiffiffiffiffiffi 394 p , 12= ffiffiffiffiffiffiffiffi 394 p : 1.28. Let u ¼ ða1; a2; a3Þ and v ¼ ðb1; b2; b3Þ so u v ¼ ða2b3 a3b2; a3b1 a1b3; a1b2 a2b1Þ. Prove: (a) u v is orthogonal to u and v [Theorem 1.5(a)]. (b) ku vk2 ¼ ðu uÞðv vÞ ðu vÞ2 (Lagrange’s identity). CHAPTER 1 Vectors in Rn and Cn , Spatial Vectors 19
  • 27. (a) We have u ðu vÞ ¼ a1ða2b3 a3b2Þ þ a2ða3b1 a1b3Þ þ a3ða1b2 a2b1Þ ¼ a1a2b3 a1a3b2 þ a2a3b1 a1a2b3 þ a1a3b2 a2a3b1 ¼ 0 Thus, u v is orthogonal to u. Similarly, u v is orthogonal to v. (b) We have ku vk2 ¼ ða2b3 a3b2Þ2 þ ða3b1 a1b3Þ2 þ ða1b2 a2b1Þ2 ð1Þ ðu uÞðv vÞ ðu vÞ2 ¼ ða2 1 þ a2 2 þ a2 3Þðb2 1 þ b2 2 þ b2 3Þ ða1b1 þ a2b2 þ a3b3Þ2 ð2Þ Expansion of the right-hand sides of (1) and (2) establishes the identity. Complex Numbers, Vectors in Cn 1.29. Suppose z ¼ 5 þ 3i and w ¼ 2 4i. Find: (a) z þ w, (b) z w, (c) zw. Use the ordinary rules of algebra together with i2 ¼ 1 to obtain a result in the standard form a þ bi. (a) z þ w ¼ ð5 þ 3iÞ þ ð2 4iÞ ¼ 7 i (b) z w ¼ ð5 þ 3iÞ ð2 4iÞ ¼ 5 þ 3i 2 þ 4i ¼ 3 þ 7i (c) zw ¼ ð5 þ 3iÞð2 4iÞ ¼ 10 14i 12i2 ¼ 10 14i þ 12 ¼ 22 14i 1.30. Simplify: (a) ð5 þ 3iÞð2 7iÞ, (b) ð4 3iÞ2 , (c) ð1 þ 2iÞ3 . (a) ð5 þ 3iÞð2 7iÞ ¼ 10 þ 6i 35i 21i2 ¼ 31 29i (b) ð4 3iÞ2 ¼ 16 24i þ 9i2 ¼ 7 24i (c) ð1 þ 2iÞ3 ¼ 1 þ 6i þ 12i2 þ 8i3 ¼ 1 þ 6i 12 8i ¼ 11 2i 1.31. Simplify: (a) i0 ; i3 ; i4 , (b) i5 ; i6 ; i7 ; i8 , (c) i39 ; i174 , i252 , i317 : (a) i0 ¼ 1, i3 ¼ i2 ðiÞ ¼ ð1ÞðiÞ ¼ i; i4 ¼ ði2 Þði2 Þ ¼ ð1Þð1Þ ¼ 1 (b) i5 ¼ ði4 ÞðiÞ ¼ ð1ÞðiÞ ¼ i, i6 ¼ ði4 Þði2 Þ ¼ ð1Þði2 Þ ¼ i2 ¼ 1, i7 ¼ i3 ¼ i, i8 ¼ i4 ¼ 1 (c) Using i4 ¼ 1 and in ¼ i4qþr ¼ ði4 Þq ir ¼ 1q ir ¼ ir , divide the exponent n by 4 to obtain the remainder r: i39 ¼ i4ð9Þþ3 ¼ ði4 Þ9 i3 ¼ 19 i3 ¼ i3 ¼ i; i174 ¼ i2 ¼ 1; i252 ¼ i0 ¼ 1; i317 ¼ i1 ¼ i 1.32. Find the complex conjugate of each of the following: (a) 6 þ 4i, 7 5i, 4 þ i, 3 i, (b) 6, 3, 4i, 9i. (a) 6 þ 4i ¼ 6 4i, 7 5i ¼ 7 þ 5i, 4 þ i ¼ 4 i, 3 i ¼ 3 þ i (b) 6 ¼ 6, 3 ¼ 3, 4i ¼ 4i, 9i ¼ 9i (Note that the conjugate of a real number is the original number, but the conjugate of a pure imaginary number is the negative of the original number.) 1.33. Find z z and jzj when z ¼ 3 þ 4i. For z ¼ a þ bi, use z z ¼ a2 þ b2 and z ¼ ffiffiffiffi z z p ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi a2 þ b2 p . z z ¼ 9 þ 16 ¼ 25; jzj ¼ ffiffiffiffiffi 25 p ¼ 5 1.34. Simpify 2 7i 5 þ 3i : To simplify a fraction z=w of complex numbers, multiply both numerator and denominator by w, the conjugate of the denominator: 2 7i 5 þ 3i ¼ ð2 7iÞð5 3iÞ ð5 þ 3iÞð5 3iÞ ¼ 11 41i 34 ¼ 11 34 41 34 i 20 CHAPTER 1 Vectors in Rn and Cn , Spatial Vectors
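Lagrange's identity from Problem 1.28(b), together with the orthogonality of u × v from part (a), is easy to check numerically. A short NumPy sketch (an illustration, not a proof; NumPy is not part of the text):

```python
import numpy as np

rng = np.random.default_rng(1)

for _ in range(1000):
    u = rng.normal(size=3)
    v = rng.normal(size=3)
    w = np.cross(u, v)

    lhs = w @ w                                  # ||u x v||^2
    rhs = (u @ u) * (v @ v) - (u @ v) ** 2       # Lagrange's identity
    assert abs(lhs - rhs) < 1e-9 * max(1.0, abs(rhs))

    # and u x v is orthogonal to u and v (Problem 1.28(a))
    assert abs(u @ w) < 1e-9 and abs(v @ w) < 1e-9

print("Lagrange's identity verified on all sampled vectors")
```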
  • 28. 1.35. Prove: For any complex numbers z, w 2 C, (i) z þ w ¼ z þ w, (ii) zw ¼ z w, (iii) z ¼ z. Suppose z ¼ a þ bi and w ¼ c þ di where a; b; c; d 2 R. (i) z þ w ¼ ða þ biÞ þ ðc þ diÞ ¼ ða þ cÞ þ ðb þ dÞi ¼ ða þ cÞ ðb þ dÞi ¼ a þ c bi di ¼ ða biÞ þ ðc diÞ ¼ z þ w (ii) zw ¼ ða þ biÞðc þ diÞ ¼ ðac bdÞ þ ðad þ bcÞi ¼ ðac bdÞ ðad þ bcÞi ¼ ða biÞðc diÞ ¼ z w (iii) z ¼ a þ bi ¼ a bi ¼ a ðbÞi ¼ a þ bi ¼ z 1.36. Prove: For any complex numbers z; w 2 C, jzwj ¼ jzjjwj. By (ii) of Problem 1.35, jzwj2 ¼ ðzwÞðzwÞ ¼ ðzwÞð z wÞ ¼ ðz zÞðw wÞ ¼ jzj2 jwj2 The square root of both sides gives us the desired result. 1.37. Prove: For any complex numbers z; w 2 C, jz þ wj jzj þ jwj. Suppose z ¼ a þ bi and w ¼ c þ di where a; b; c; d 2 R. Consider the vectors u ¼ ða; bÞ and v ¼ ðc; dÞ in R2 . Note that jzj ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi a2 þ b2 p ¼ kuk; jwj ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi c2 þ d2 p ¼ kvk and jz þ wj ¼ jða þ cÞ þ ðb þ dÞij ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ða þ cÞ2 þ ðb þ dÞ2 q ¼ kða þ c; b þ dÞk ¼ ku þ vk By Minkowski’s inequality (Problem 1.15), ku þ vk kuk þ kvk, and so jz þ wj ¼ ku þ vk kuk þ kvk ¼ jzj þ jwj 1.38. Find the dot products u v and v u where: (a) u ¼ ð1 2i; 3 þ iÞ, v ¼ ð4 þ 2i; 5 6iÞ; (b) u ¼ ð3 2i; 4i; 1 þ 6iÞ, v ¼ ð5 þ i; 2 3i; 7 þ 2iÞ. Recall that conjugates of the second vector appear in the dot product ðz1; . . . ; znÞ ðw1; . . . ; wnÞ ¼ z1 w1 þ þ zn wn (a) u v ¼ ð1 2iÞð4 þ 2iÞ þ ð3 þ iÞð5 6iÞ ¼ ð1 2iÞð4 2iÞ þ ð3 þ iÞð5 þ 6iÞ ¼ 10i þ 9 þ 23i ¼ 9 þ 13i v u ¼ ð4 þ 2iÞð1 2iÞ þ ð5 6iÞð3 þ iÞ ¼ ð4 þ 2iÞð1 þ 2iÞ þ ð5 6iÞð3 iÞ ¼ 10i þ 9 23i ¼ 9 13i (b) u v ¼ ð3 2iÞð5 þ iÞ þ ð4iÞð2 3iÞ þ ð1 þ 6iÞð7 þ 2iÞ ¼ ð3 2iÞð5 iÞ þ ð4iÞð2 þ 3iÞ þ ð1 þ 6iÞð7 2iÞ ¼ 20 þ 35i v u ¼ ð5 þ iÞð3 2iÞ þ ð2 3iÞð4iÞ þ ð7 þ 2iÞð1 þ 6iÞ ¼ ð5 þ iÞð3 þ 2iÞ þ ð2 3iÞð4iÞ þ ð7 þ 2iÞð1 6iÞ ¼ 20 35i In both cases, v u ¼ u v. This holds true in general, as seen in Problem 1.40. 1.39. Let u ¼ ð7 2i; 2 þ 5iÞ and v ¼ ð1 þ i; 3 6iÞ. Find: (a) u þ v, (b) 2iu, (c) ð3 iÞv, (d) u v, (e) kuk and kvk. (a) u þ v ¼ ð7 2i þ 1 þ i; 2 þ 5i 3 6iÞ ¼ ð8 i; 1 iÞ (b) 2iu ¼ ð14i 4i2 ; 4i þ 10i2 Þ ¼ ð4 þ 14i; 10 þ 4iÞ (c) ð3 iÞv ¼ ð3 þ 3i i i2 ; 9 18i þ 3i þ 6i2 Þ ¼ ð4 þ 2i; 15 15iÞ CHAPTER 1 Vectors in Rn and Cn , Spatial Vectors 21
  • 29. (d) u v ¼ ð7 2iÞð1 þ iÞ þ ð2 þ 5iÞð3 6iÞ ¼ ð7 2iÞð1 iÞ þ ð2 þ 5iÞð3 þ 6iÞ ¼ 5 9i 36 3i ¼ 31 12i (e) kuk ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 72 þ ð2Þ2 þ 22 þ 52 q ¼ ffiffiffiffiffi 82 p and kvk ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 12 þ 12 þ ð3Þ2 þ ð6Þ2 q ¼ ffiffiffiffiffi 47 p 1.40. Prove: For any vectors u; v 2 Cn and any scalar z 2 C, (i) u v ¼ v u, (ii) ðzuÞ v ¼ zðu vÞ, (iii) u ðzvÞ ¼ zðu vÞ. Suppose u ¼ ðz1; z2; . . . ; znÞ and v ¼ ðw1; w2; . . . ; wnÞ. (i) Using the properties of the conjugate, v u ¼ w1 z1 þ w2 z2 þ þ wn zn ¼ w1 z1 þ w2 z2 þ þ wn zn ¼ w1z1 þ w2z2 þ þ wnzn ¼ z1 w1 þ z2 w2 þ þ zn wn ¼ u v (ii) Because zu ¼ ðzz1; zz2; . . . ; zznÞ, ðzuÞ v ¼ zz1 w1 þ zz2 w2 þ þ zzn wn ¼ zðz1 w1 þ z2 w2 þ þ zn wnÞ ¼ zðu vÞ (Compare with Theorem 1.2 on vectors in Rn .) (iii) Using (i) and (ii), u ðzvÞ ¼ ðzvÞ u ¼ zðv uÞ ¼ zðv uÞ ¼ zðu vÞ SUPPLEMENTARY PROBLEMS Vectors in Rn 1.41. Let u ¼ ð1; 2; 4Þ, v ¼ ð3; 5; 1Þ, w ¼ ð2; 1; 3Þ. Find: (a) 3u 2v; (b) 5u þ 3v 4w; (c) u v, u w, v w; (d) kuk, kvk; (e) cos y, where y is the angle between u and v; (f ) dðu; vÞ; (g) projðu; vÞ. 1.42. Repeat Problem 1.41 for vectors u ¼ 1 3 4 2 4 3 5, v ¼ 2 1 5 2 4 3 5, w ¼ 3 2 6 2 4 3 5. 1.43. Let u ¼ ð2; 5; 4; 6; 3Þ and v ¼ ð5; 2; 1; 7; 4Þ. Find: (a) 4u 3v; (b) 5u þ 2v; (c) u v; (d) kuk and kvk; (e) projðu; vÞ; ( f ) dðu; vÞ. 1.44. Normalize each vector: (a) u ¼ ð5; 7Þ; (b) v ¼ ð1; 2; 2; 4Þ; (c) w ¼ 1 2 ; 1 3 ; 3 4 . 1.45. Let u ¼ ð1; 2; 2Þ, v ¼ ð3; 12; 4Þ, and k ¼ 3. (a) Find kuk, kvk, ku þ vk, kkuk: (b) Verify that kkuk ¼ jkjkuk and ku þ vk kuk þ kvk. 1.46. Find x and y where: (a) ðx; y þ 1Þ ¼ ðy 2; 6Þ; (b) xð2; yÞ ¼ yð1; 2Þ. 1.47. Find x; y; z where ðx; y þ 1; y þ zÞ ¼ ð2x þ y; 4; 3zÞ. 22 CHAPTER 1 Vectors in Rn and Cn , Spatial Vectors
  • 30. 1.48. Write v ¼ ð2; 5Þ as a linear combination of u1 and u2, where: (a) u1 ¼ ð1; 2Þ and u2 ¼ ð3; 5Þ; (b) u1 ¼ ð3; 4Þ and u2 ¼ ð2; 3Þ. 1.49. Write v ¼ 9 3 16 2 4 3 5 as a linear combination of u1 ¼ 1 3 3 2 4 3 5, u2 ¼ 2 5 1 2 4 3 5, u3 ¼ 4 2 3 2 4 3 5. 1.50. Find k so that u and v are orthogonal, where: (a) u ¼ ð3; k; 2Þ, v ¼ ð6; 4; 3Þ; (b) u ¼ ð5; k; 4; 2Þ, v ¼ ð1; 3; 2; 2kÞ; (c) u ¼ ð1; 7; k þ 2; 2Þ, v ¼ ð3; k; 3; kÞ. Located Vectors, Hyperplanes, Lines in Rn 1.51. Find the vector v identified with the directed line segment PQ ! for the points: (a) Pð2; 3; 7Þ and Qð1; 6; 5Þ in R3 ; (b) Pð1; 8; 4; 6Þ and Qð3; 5; 2; 4Þ in R4 . 1.52. Find an equation of the hyperplane H in R4 that: (a) contains Pð1; 2; 3; 2Þ and is normal to u ¼ ½2; 3; 5; 6; (b) contains Pð3; 1; 2; 5Þ and is parallel to 2x1 3x2 þ 5x3 7x4 ¼ 4. 1.53. Find a parametric representation of the line in R4 that: (a) passes through the points Pð1; 2; 1; 2Þ and Qð3; 5; 7; 9Þ; (b) passes through Pð1; 1; 3; 3Þ and is perpendicular to the hyperplane 2x1 þ 4x2 þ 6x3 8x4 ¼ 5. Spatial Vectors (Vectors in R3 ), ijk Notation 1.54. Given u ¼ 3i 4j þ 2k, v ¼ 2i þ 5j 3k, w ¼ 4i þ 7j þ 2k. Find: (a) 2u 3v; (b) 3u þ 4v 2w; (c) u v, u w, v w; (d) kuk, kvk, kwk. 1.55. Find the equation of the plane H: (a) with normal N ¼ 3i 4j þ 5k and containing the point Pð1; 2; 3Þ; (b) parallel to 4x þ 3y 2z ¼ 11 and containing the point Qð2; 1; 3Þ. 1.56. Find the (parametric) equation of the line L: (a) through the point Pð2; 5; 3Þ and in the direction of v ¼ 4i 5j þ 7k; (b) perpendicular to the plane 2x 3y þ 7z ¼ 4 and containing Pð1; 5; 7Þ. 1.57. Consider the following curve C in R3 where 0 t 5: FðtÞ ¼ t3 i t2 j þ ð2t 3Þk (a) Find the point P on C corresponding to t ¼ 2. (b) Find the initial point Q and the terminal point Q 0 . (c) Find the unit tangent vector T to the curve C when t ¼ 2. 1.58. Consider a moving body B whose position at time t is given by RðtÞ ¼ t2 i þ t3 j þ 3tk. [Then VðtÞ ¼ dRðtÞ=dt and AðtÞ ¼ dVðtÞ=dt denote, respectively, the velocity and acceleration of B.] When t ¼ 1, find for the body B: (a) position; (b) velocity v; (c) speed s; (d) acceleration a. CHAPTER 1 Vectors in Rn and Cn , Spatial Vectors 23
  • 31. 1.59. Find a normal vector N and the tangent plane H to each surface at the given point: (a) surface x2 y þ 3yz ¼ 20 and point Pð1; 3; 2Þ; (b) surface x2 þ 3y2 5z2 ¼ 160 and point Pð3; 2; 1Þ: Cross Product 1.60. Evaluate the following determinants and negative of determinants of order two: (a) 2 5 3 6 ; 3 6 1 4 ; 4 2 7 3 (b) 6 4 7 5 ; 1 3 2 4 ; 8 3 6 2 1.61. Given u ¼ 3i 4j þ 2k, v ¼ 2i þ 5j 3k, w ¼ 4i þ 7j þ 2k, find: (a) u v, (b) u w, (c) v w. 1.62. Given u ¼ ½2; 1; 3, v ¼ ½4; 2; 2, w ¼ ½1; 1; 5, find: (a) u v, (b) u w, (c) v w. 1.63. Find the volume V of the parallelopiped formed by the vectors u; v; w appearing in: (a) Problem 1.60 (b) Problem 1.61. 1.64. Find a unit vector u orthogonal to: (a) v ¼ ½1; 2; 3 and w ¼ ½1; 1; 2; (b) v ¼ 3i j þ 2k and w ¼ 4i 2j k. 1.65. Prove the following properties of the cross product: (a) u v ¼ ðv uÞ (d) u ðv þ wÞ ¼ ðu vÞ þ ðu wÞ (b) u u ¼ 0 for any vector u (e) ðv þ wÞ u ¼ ðv uÞ þ ðw uÞ (c) ðkuÞ v ¼ kðu vÞ ¼ u ðkvÞ ( f ) ðu vÞ w ¼ ðu wÞv ðv wÞu Complex Numbers 1.66. Simplify: (a) ð4 7iÞð9 þ 2iÞ; (b) ð3 5iÞ 2 ; (c) 1 4 7i ; (d) 9 þ 2i 3 5i ; (e) ð1 iÞ 3 . 1.67. Simplify: (a) 1 2i ; (b) 2 þ 3i 7 3i ; (c) i15 ; i25 ; i34 ; (d) 1 3 i 2 . 1.68. Let z ¼ 2 5i and w ¼ 7 þ 3i. Find: (a) v þ w; (b) zw; (c) z=w; (d) z; w; (e) jzj, jwj. 1.69. Show that for complex numbers z and w: (a) Re z ¼ 1 2 ðz þ zÞ, (b) Im z ¼ 1 2 ðz z), (c) zw ¼ 0 implies z ¼ 0 or w ¼ 0. Vectors in Cn 1.70. Let u ¼ ð1 þ 7i; 2 6iÞ and v ¼ ð5 2i; 3 4iÞ. Find: (a) u þ v (b) ð3 þ iÞu (c) 2iu þ ð4 þ 7iÞv (d) u v (e) kuk and kvk. 24 CHAPTER 1 Vectors in Rn and Cn , Spatial Vectors
  • 32. 1.71. Prove: For any vectors u; v; w in Cn : (a) ðu þ vÞ w ¼ u w þ v w, (b) w ðu þ vÞ ¼ w u þ w v. 1.72. Prove that the norm in Cn satisfies the following laws: ½N1 For any vector u, kuk 0; and kuk ¼ 0 if and only if u ¼ 0. ½N2 For any vector u and complex number z, kzuk ¼ jzjkuk. ½N3 For any vectors u and v, ku þ vk kuk þ kvk. ANSWERS TO SUPPLEMENTARY PROBLEMS 1.41. (a) ð3; 16; 4Þ; (b) (6,1,35); (c) 3; 12; 8; (d) ffiffiffiffiffi 21 p , ffiffiffiffiffi 35 p , ffiffiffiffiffi 14 p ; (e) 3= ffiffiffiffiffi 21 p ffiffiffiffiffi 35 p ; ( f ) ffiffiffiffiffi 62 p ; (g) 3 35 ð3; 5; 1Þ ¼ ð 9 35, 15 35, 3 35) 1.42. (Column vectors) (a) ð1; 7; 22Þ; (b) ð1; 26; 29Þ; (c) 15; 27; 34; (d) ffiffiffiffiffi 26 p , ffiffiffiffiffi 30 p ; (e) 15=ð ffiffiffiffiffi 26 p ffiffiffiffiffi 30 p Þ; ( f ) ffiffiffiffiffi 86 p ; (g) 15 30 v ¼ ð1; 1 2 ; 5 2Þ 1.43. (a) ð13; 14; 13; 45; 0Þ; (b) ð20; 29; 22; 16; 23Þ; (c) 6; (d) ffiffiffiffiffi 90 p ; ffiffiffiffiffi 95 p ; (e) 6 95 v; ( f ) ffiffiffiffiffiffiffiffi 167 p 1.44. (a) ð5= ffiffiffiffiffi 76 p ; 9= ffiffiffiffiffi 76 p Þ; (b) ð1 5 ; 2 5 ; 2 5 ; 4 5Þ; (c) ð6= ffiffiffiffiffiffiffiffi 133 p ; 4 ffiffiffiffiffiffiffiffi 133 p ; 9 ffiffiffiffiffiffiffiffi 133 p Þ 1.45. (a) 3; 13; ffiffiffiffiffiffiffiffi 120 p ; 9 1.46. (a) x ¼ 3; y ¼ 5; (b) x ¼ 0; y ¼ 0, and x ¼ 1; y ¼ 2 1.47. x ¼ 3; y ¼ 3; z ¼ 3 2 1.48. (a) v ¼ 5u1 u2; (b) v ¼ 16u1 23u2 1.49. v ¼ 3u1 u2 þ 2u3 1.50. (a) 6; (b) 3; (c) 3 2 1.51. (a) v ¼ ½1; 9; 2; (b) [2; 3; 6; 10] 1.52. (a) 2x1 þ 3x2 5x3 þ 6x4 ¼ 35; (b) 2x1 3x2 þ 5x3 7x4 ¼ 16 1.53. (a) ½2t þ 1; 7t þ 2; 6t þ 1; 11t þ 2; (b) ½2t þ 1; 4t þ 1; 6t þ 3; 8t þ 3 1.54. (a) 23j þ 13k; (b) 9i 6j 10k; (c) 20; 12; 37; (d) ffiffiffiffiffi 29 p ; ffiffiffiffiffi 38 p ; ffiffiffiffiffi 69 p 1.55. (a) 3x 4y þ 5z ¼ 20; (b) 4x þ 3y 2z ¼ 1 1.56. (a) ½4t þ 2; 5t þ 5; 7t 3; (b) ½2t þ 1; 3t 5; 7t þ 7 1.57. (a) P ¼ Fð2Þ ¼ 8i 4j þ k; (b) Q ¼ Fð0Þ ¼ 3k, Q0 ¼ Fð5Þ ¼ 125i 25j þ 7k; (c) T ¼ ð6i 2j þ kÞ= ffiffiffiffiffi 41 p 1.58. (a) i þ j þ 2k; (b) 2i þ 3j þ 2k; (c) ffiffiffiffiffi 17 p ; (d) 2i þ 6j 1.59. (a) N ¼ 6i þ 7j þ 9k, 6x þ 7y þ 9z ¼ 45; (b) N ¼ 6i 12j 10k, 3x 6y 5z ¼ 16 CHAPTER 1 Vectors in Rn and Cn , Spatial Vectors 25
  • 33. 1.60. (a) 3; 6; 26; (b) 2; 10; 34 1.61. (a) 2i þ 13j þ 23k; (b) 22i þ 2j þ 37k; (c) 31i 16j 6k 1.62. (a) ½5; 8; 6; (b) ½2; 7; 1; (c) ½7; 18; 5 1.63. (a) 143; (b) 17 1.64. (a) ð7; 1; 3Þ= ffiffiffiffiffi 59 p ; (b) ð5i þ 11j 2kÞ= ffiffiffiffiffiffiffiffi 150 p 1.66. (a) 50 55i; (b) 16 30i; (c) 1 65 ð4 þ 7iÞ; (d) 1 2 ð1 þ 3iÞ; (e) 2 2i 1.67. (a) 1 2 i; (b) 1 58 ð5 þ 27iÞ; (c) 1; i; 1; (d) 1 50 ð4 þ 3iÞ 1.68. (a) 9 2i; (b) 29 29i; (c) 1 61 ð1 41iÞ; (d) 2 þ 5i, 7 3i; (e) ffiffiffiffiffi 29 p , ffiffiffiffiffi 58 p 1.69. (c) Hint: If zw ¼ 0, then jzwj ¼ jzjjwj ¼ j0j ¼ 0 1.70. (a) ð6 þ 5i, 5 10iÞ; (b) ð4 þ 22i, 12 16iÞ; (c) ð8 41i, 4 33iÞ; (d) 12 þ 2i; (e) ffiffiffiffiffi 90 p , ffiffiffiffiffi 54 p 26 CHAPTER 1 Vectors in Rn and Cn , Spatial Vectors
CHAPTER 2

Algebra of Matrices

2.1 Introduction

This chapter investigates matrices and algebraic operations defined on them. These matrices may be viewed as rectangular arrays of elements where each entry depends on two subscripts (as compared with vectors, where each entry depended on only one subscript). Systems of linear equations and their solutions (Chapter 3) may be efficiently investigated using the language of matrices. Furthermore, certain abstract objects introduced in later chapters, such as "change of basis," "linear transformations," and "quadratic forms," can be represented by these matrices (rectangular arrays). On the other hand, the abstract treatment of linear algebra presented later on will give us new insight into the structure of these matrices.

The entries in our matrices will come from some arbitrary, but fixed, field K. The elements of K are called numbers or scalars. Nothing essential is lost if the reader assumes that K is the real field R.

2.2 Matrices

A matrix A over a field K or, simply, a matrix A (when K is implicit) is a rectangular array of scalars usually presented in the following form:

    A = [ a11  a12  ...  a1n ]
        [ a21  a22  ...  a2n ]
        [ ..................  ]
        [ am1  am2  ...  amn ]

The rows of such a matrix A are the m horizontal lists of scalars:
    (a11, a12, ..., a1n), (a21, a22, ..., a2n), ..., (am1, am2, ..., amn)
and the columns of A are the n vertical lists of scalars:
    (a11, a21, ..., am1), (a12, a22, ..., am2), ..., (a1n, a2n, ..., amn)
Note that the element aij, called the ij-entry or ij-element, appears in row i and column j. We frequently denote such a matrix by simply writing A = [aij].

A matrix with m rows and n columns is called an m by n matrix, written m × n. The pair of numbers m and n is called the size of the matrix. Two matrices A and B are equal, written A = B, if they have the same size and if corresponding elements are equal. Thus, the equality of two m × n matrices is equivalent to a system of mn equalities, one for each corresponding pair of elements.

A matrix with only one row is called a row matrix or row vector, and a matrix with only one column is called a column matrix or column vector. A matrix whose entries are all zero is called a zero matrix and will usually be denoted by 0.
  • 35. Matrices whose entries are all real numbers are called real matrices and are said to be matrices over R. Analogously, matrices whose entries are all complex numbers are called complex matrices and are said to be matrices over C. This text will be mainly concerned with such real and complex matrices. EXAMPLE 2.1 (a) The rectangular array A ¼ 1 4 5 0 3 2 is a 2 3 matrix. Its rows are ð1; 4; 5Þ and ð0; 3; 2Þ, and its columns are 1 0 ; 4 3 ; 5 2 (b) The 2 4 zero matrix is the matrix 0 ¼ 0 0 0 0 0 0 0 0 . (c) Find x; y; z; t such that x þ y 2z þ t x y z t ¼ 3 7 1 5 By definition of equality of matrices, the four corresponding entries must be equal. Thus, x þ y ¼ 3; x y ¼ 1; 2z þ t ¼ 7; z t ¼ 5 Solving the above system of equations yields x ¼ 2, y ¼ 1, z ¼ 4, t ¼ 1. 2.3 Matrix Addition and Scalar Multiplication Let A ¼ ½aij and B ¼ ½bij be two matrices with the same size, say m n matrices. The sum of A and B, written A þ B, is the matrix obtained by adding corresponding elements from A and B. That is, A þ B ¼ a11 þ b11 a12 þ b12 . . . a1n þ b1n a21 þ b21 a22 þ b22 . . . a2n þ b2n am1 þ bm1 am2 þ bm2 . . . amn þ bmn 2 6 6 4 3 7 7 5 The product of the matrix A by a scalar k, written k A or simply kA, is the matrix obtained by multiplying each element of A by k. That is, kA ¼ ka11 ka12 . . . ka1n ka21 ka22 . . . ka2n kam1 kam2 . . . kamn 2 6 6 4 3 7 7 5 Observe that A þ B and kA are also m n matrices. We also define A ¼ ð1ÞA and A B ¼ A þ ðBÞ The matrix A is called the negative of the matrix A, and the matrix A B is called the difference of A and B. The sum of matrices with different sizes is not defined. 28 CHAPTER 2 Algebra of Matrices
  • 36. EXAMPLE 2.2 Let A ¼ 1 2 3 0 4 5 and B ¼ 4 6 8 1 3 7 . Then A þ B ¼ 1 þ 4 2 þ 6 3 þ 8 0 þ 1 4 þ ð3Þ 5 þ ð7Þ # ¼ 5 4 11 1 1 2 # 3A ¼ 3ð1Þ 3ð2Þ 3ð3Þ 3ð0Þ 3ð4Þ 3ð5Þ # ¼ 3 6 9 0 12 15 # 2A 3B ¼ 2 4 6 0 8 10 # þ 12 18 24 3 9 21 # ¼ 10 22 18 3 17 31 # The matrix 2A 3B is called a linear combination of A and B. Basic properties of matrices under the operations of matrix addition and scalar multiplication follow. THEOREM 2.1: Consider any matrices A; B; C (with the same size) and any scalars k and k0 . Then (i) ðA þ BÞ þ C ¼ A þ ðB þ CÞ, (v) kðA þ BÞ ¼ kA þ kB, (ii) A þ 0 ¼ 0 þ A ¼ A, (vi) ðk þ k0 ÞA ¼ kA þ k0 A, (iii) A þ ðAÞ ¼ ðAÞ þ A ¼ 0; (vii) ðkk0 ÞA ¼ kðk0 AÞ, (iv) A þ B ¼ B þ A, (viii) 1 A ¼ A. Note first that the 0 in (ii) and (iii) refers to the zero matrix. Also, by (i) and (iv), any sum of matrices A1 þ A2 þ þ An requires no parentheses, and the sum does not depend on the order of the matrices. Furthermore, using (vi) and (viii), we also have A þ A ¼ 2A; A þ A þ A ¼ 3A; . . . and so on. The proof of Theorem 2.1 reduces to showing that the ij-entries on both sides of each matrix equation are equal. (See Problem 2.3.) Observe the similarity between Theorem 2.1 for matrices and Theorem 1.1 for vectors. In fact, the above operations for matrices may be viewed as generalizations of the corresponding operations for vectors. 2.4 Summation Symbol Before we define matrix multiplication, it will be instructive to first introduce the summation symbol S (the Greek capital letter sigma). Suppose f ðkÞ is an algebraic expression involving the letter k. Then the expression P n k¼1 f ðkÞ or equivalently Pn k¼1 f ðkÞ has the following meaning. First we set k ¼ 1 in f ðkÞ, obtaining f ð1Þ Then we set k ¼ 2 in f ðkÞ, obtaining f ð2Þ, and add this to f ð1Þ, obtaining f ð1Þ þ f ð2Þ CHAPTER 2 Algebra of Matrices 29
  • 37. Then we set k ¼ 3 in f ðkÞ, obtaining f ð3Þ, and add this to the previous sum, obtaining f ð1Þ þ f ð2Þ þ f ð3Þ We continue this process until we obtain the sum f ð1Þ þ f ð2Þ þ þ f ðnÞ Observe that at each step we increase the value of k by 1 until we reach n. The letter k is called the index, and 1 and n are called, respectively, the lower and upper limits. Other letters frequently used as indices are i and j. We also generalize our definition by allowing the sum to range from any integer n1 to any integer n2. That is, we define P n2 k¼n1 f ðkÞ ¼ f ðn1Þ þ f ðn1 þ 1Þ þ f ðn1 þ 2Þ þ þ f ðn2Þ EXAMPLE 2.3 (a) P 5 k¼1 xk ¼ x1 þ x2 þ x3 þ x4 þ x5 and P n i¼1 aibi ¼ a1b1 þ a2b2 þ þ anbn (b) P 5 j¼2 j2 ¼ 22 þ 32 þ 42 þ 52 ¼ 54 and P n i¼0 aixi ¼ a0 þ a1x þ a2x2 þ þ anxn (c) P p k¼1 aikbkj ¼ ai1b1j þ ai2b2j þ ai3b3j þ þ aipbpj 2.5 Matrix Multiplication The product of matrices A and B, written AB, is somewhat complicated. For this reason, we first begin with a special case. The product AB of a row matrix A ¼ ½ai and a column matrix B ¼ ½bi with the same number of elements is defined to be the scalar (or 1 1 matrix) obtained by multiplying corresponding entries and adding; that is, AB ¼ ½a1; a2; . . . ; an b1 b2 . . . bn 2 6 6 4 3 7 7 5 ¼ a1b1 þ a2b2 þ þ anbn ¼ P n k¼1 akbk We emphasize that AB is a scalar (or a 1 1 matrix). The product AB is not defined when A and B have different numbers of elements. EXAMPLE 2.4 (a) ½7; 4; 5 3 2 1 2 4 3 5 ¼ 7ð3Þ þ ð4Þð2Þ þ 5ð1Þ ¼ 21 8 5 ¼ 8 (b) ½6; 1; 8; 3 4 9 2 5 2 6 6 4 3 7 7 5 ¼ 24 þ 9 16 þ 15 ¼ 32 We are now ready to define matrix multiplication in general. 30 CHAPTER 2 Algebra of Matrices
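Before that general definition, the summation notation and the row-times-column product translate almost literally into code. A plain-Python sketch (illustrative; Python and the helper names are not part of the text). The last line reproduces Example 2.4(a), whose entries multiply as 21 − 8 − 5.

```python
# Summation symbol: sum of f(k) for k = n1, n1 + 1, ..., n2
def sigma(f, n1, n2):
    return sum(f(k) for k in range(n1, n2 + 1))

print(sigma(lambda j: j ** 2, 2, 5))    # 2^2 + 3^2 + 4^2 + 5^2 = 54, as in Example 2.3(b)

# Product of a row matrix A = [a1, ..., an] and a column matrix B = [b1, ..., bn]:
# a scalar, the sum of a_k * b_k for k = 1..n
def row_times_column(A, B):
    if len(A) != len(B):
        raise ValueError("AB is not defined: different numbers of elements")
    return sum(a * b for a, b in zip(A, B))

print(row_times_column([7, -4, 5], [3, 2, -1]))   # 21 - 8 - 5 = 8, as in Example 2.4(a)
```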
  • 38. DEFINITION: Suppose A ¼ ½aik and B ¼ ½bkj are matrices such that the number of columns of A is equal to the number of rows of B; say, A is an m p matrix and B is a p n matrix. Then the product AB is the m n matrix whose ij-entry is obtained by multiplying the ith row of A by the jth column of B. That is, a11 . . . a1p : . . . : ai1 . . . aip : . . . : am1 . . . amp 2 6 6 6 6 4 3 7 7 7 7 5 b11 . . . b1j . . . b1n : . . . : . . . : : . . . : . . . : : . . . : . . . : bp1 . . . bpj . . . bpn 2 6 6 6 6 4 3 7 7 7 7 5 ¼ c11 . . . c1n : . . . : : cij : : . . . : cm1 . . . cmn 2 6 6 6 6 4 3 7 7 7 7 5 where cij ¼ ai1b1j þ ai2b2j þ þ aipbpj ¼ P p k¼1 aikbkj The product AB is not defined if A is an m p matrix and B is a q n matrix, where p 6¼ q. EXAMPLE 2.5 (a) Find AB where A ¼ 1 3 2 1 and B ¼ 2 0 4 5 2 6 . Because A is 2 2 and B is 2 3, the product AB is defined and AB is a 2 3 matrix. To obtain the first row of the product matrix AB, multiply the first row [1, 3] of A by each column of B, 2 5 ; 0 2 ; 4 6 respectively. That is, AB ¼ 2 þ 15 0 6 4 þ 18 ¼ 17 6 14 To obtain the second row of AB, multiply the second row ½2; 1 of A by each column of B. Thus, AB ¼ 17 6 14 4 5 0 þ 2 8 6 ¼ 17 6 14 1 2 14 (b) Suppose A ¼ 1 2 3 4 and B ¼ 5 6 0 2 . Then AB ¼ 5 þ 0 6 4 15 þ 0 18 8 ¼ 5 2 15 10 and BA ¼ 5 þ 18 10 þ 24 0 6 0 8 ¼ 23 34 6 8 The above example shows that matrix multiplication is not commutative—that is, in general, AB 6¼ BA. However, matrix multiplication does satisfy the following properties. THEOREM 2.2: Let A; B; C be matrices. Then, whenever the products and sums are defined, (i) ðABÞC ¼ AðBCÞ (associative law), (ii) AðB þ CÞ ¼ AB þ AC (left distributive law), (iii) ðB þ CÞA ¼ BA þ CA (right distributive law), (iv) kðABÞ ¼ ðkAÞB ¼ AðkBÞ, where k is a scalar. We note that 0A ¼ 0 and B0 ¼ 0, where 0 is the zero matrix. CHAPTER 2 Algebra of Matrices 31
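The ij-entry formula above is all that is needed to code matrix multiplication. A plain-Python sketch (illustrative, not part of the text), replayed on the matrices of Example 2.5:

```python
def matmul(A, B):
    """Product of an m x p matrix A and a p x n matrix B, entry by entry:
    c_ij = sum over k of a_ik * b_kj."""
    m, p = len(A), len(A[0])
    p2, n = len(B), len(B[0])
    if p != p2:
        raise ValueError("AB is not defined: columns of A != rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(p)) for j in range(n)]
            for i in range(m)]

# Example 2.5(a)
A = [[1, 3], [2, -1]]
B = [[2, 0, -4], [5, -2, 6]]
print(matmul(A, B))     # [[17, -6, 14], [-1, 2, -14]]

# Example 2.5(b): matrix multiplication is not commutative
A = [[1, 2], [3, 4]]
B = [[5, 6], [0, -2]]
print(matmul(A, B))     # [[5, 2], [15, 10]]
print(matmul(B, A))     # [[23, 34], [-6, -8]]
```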
  • 39. 2.6 Transpose of a Matrix The transpose of a matrix A, written AT , is the matrix obtained by writing the columns of A, in order, as rows. For example, 1 2 3 4 5 6 T ¼ 1 4 2 5 3 6 2 4 3 5 and ½1; 3; 5T ¼ 1 3 5 2 4 3 5 In other words, if A ¼ ½aij is an m n matrix, then AT ¼ ½bij is the n m matrix where bij ¼ aji. Observe that the tranpose of a row vector is a column vector. Similarly, the transpose of a column vector is a row vector. The next theorem lists basic properties of the transpose operation. THEOREM 2.3: Let A and B be matrices and let k be a scalar. Then, whenever the sum and product are defined, (i) ðA þ BÞT ¼ AT þ BT , (iii) ðkAÞT ¼ kAT , (ii) ðAT ÞT ¼ A; (iv) ðABÞT ¼ BT AT . We emphasize that, by (iv), the transpose of a product is the product of the transposes, but in the reverse order. 2.7 Square Matrices A square matrix is a matrix with the same number of rows as columns. An n n square matrix is said to be of order n and is sometimes called an n-square matrix. Recall that not every two matrices can be added or multiplied. However, if we only consider square matrices of some given order n, then this inconvenience disappears. Specifically, the operations of addition, multiplication, scalar multiplication, and transpose can be performed on any n n matrices, and the result is again an n n matrix. EXAMPLE 2.6 The following are square matrices of order 3: A ¼ 1 2 3 4 4 4 5 6 7 2 4 3 5 and B ¼ 2 5 1 0 3 2 1 2 4 2 4 3 5 The following are also matrices of order 3: A þ B ¼ 3 3 4 4 1 6 6 8 3 2 6 4 3 7 5; 2A ¼ 2 4 6 8 8 8 10 12 14 2 6 4 3 7 5; AT ¼ 1 4 5 2 4 6 3 4 7 2 6 4 3 7 5 AB ¼ 5 7 15 12 0 20 17 7 35 2 6 4 3 7 5; BA ¼ 27 30 33 22 24 26 27 30 33 2 6 4 3 7 5 Diagonal and Trace Let A ¼ ½aij be an n-square matrix. The diagonal or main diagonal of A consists of the elements with the same subscripts—that is, a11; a22; a33; . . . ; ann 32 CHAPTER 2 Algebra of Matrices
The trace of A, written tr(A), is the sum of the diagonal elements. Namely,
    tr(A) = a11 + a22 + a33 + ... + ann
The following theorem applies.

THEOREM 2.4: Suppose A = [aij] and B = [bij] are n-square matrices and k is a scalar. Then
    (i) tr(A + B) = tr(A) + tr(B),  (ii) tr(kA) = k tr(A),  (iii) tr(A^T) = tr(A),  (iv) tr(AB) = tr(BA).

EXAMPLE 2.7 Let A and B be the matrices A and B in Example 2.6. Then
    diagonal of A = {1, −4, 7} and tr(A) = 1 − 4 + 7 = 4
    diagonal of B = {2, 3, −4} and tr(B) = 2 + 3 − 4 = 1
Moreover,
    tr(A + B) = 3 − 1 + 3 = 5,  tr(2A) = 2 − 8 + 14 = 8,  tr(A^T) = 1 − 4 + 7 = 4
    tr(AB) = 5 + 0 − 35 = −30,  tr(BA) = 27 − 24 − 33 = −30
As expected from Theorem 2.4,
    tr(A + B) = tr(A) + tr(B),  tr(A^T) = tr(A),  tr(2A) = 2 tr(A)
Furthermore, although AB ≠ BA, the traces are equal.

Identity Matrix, Scalar Matrices

The n-square identity or unit matrix, denoted by I_n, or simply I, is the n-square matrix with 1's on the diagonal and 0's elsewhere. The identity matrix I is similar to the scalar 1 in that, for any n-square matrix A,
    AI = IA = A
More generally, if B is an m × n matrix, then B I_n = I_m B = B.

For any scalar k, the matrix kI that contains k's on the diagonal and 0's elsewhere is called the scalar matrix corresponding to the scalar k. Observe that
    (kI)A = k(IA) = kA
That is, multiplying a matrix A by the scalar matrix kI is equivalent to multiplying A by the scalar k.

EXAMPLE 2.8 The following are the identity matrices of orders 3 and 4 and the corresponding scalar matrices for k = 5:

    [ 1 0 0 ]    [ 1          ]    [ 5 0 0 ]    [ 5          ]
    [ 0 1 0 ]    [    1       ]    [ 0 5 0 ]    [    5       ]
    [ 0 0 1 ]    [       1    ]    [ 0 0 5 ]    [       5    ]
                 [          1 ]                 [          5 ]

Remark 1: It is common practice to omit blocks or patterns of 0's when there is no ambiguity, as in the above second and fourth matrices.

Remark 2: The Kronecker delta function δ_ij is defined by
    δ_ij = 0 if i ≠ j,  and  δ_ij = 1 if i = j
Thus, the identity matrix may be defined by I = [δ_ij].
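The least obvious part of Theorem 2.4 is (iv), tr(AB) = tr(BA), since AB and BA themselves generally differ. A quick NumPy sanity check on random integer matrices (an illustration, not a proof; NumPy and the seed are assumptions of this example):

```python
import numpy as np

rng = np.random.default_rng(2)

for _ in range(100):
    n = int(rng.integers(2, 6))
    A = rng.integers(-5, 6, size=(n, n))
    B = rng.integers(-5, 6, size=(n, n))
    k = int(rng.integers(-5, 6))

    assert np.trace(A + B) == np.trace(A) + np.trace(B)    # (i)
    assert np.trace(k * A) == k * np.trace(A)              # (ii)
    assert np.trace(A.T) == np.trace(A)                    # (iii)
    assert np.trace(A @ B) == np.trace(B @ A)              # (iv), although A @ B != B @ A in general

print("Theorem 2.4 (i)-(iv) hold on all sampled matrices")
```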
  • 41. 2.8 Powers of Matrices, Polynomials in Matrices Let A be an n-square matrix over a field K. Powers of A are defined as follows: A2 ¼ AA; A3 ¼ A2 A; . . . ; Anþ1 ¼ An A; . . . ; and A0 ¼ I Polynomials in the matrix A are also defined. Specifically, for any polynomial f ðxÞ ¼ a0 þ a1x þ a2x2 þ þ anxn where the ai are scalars in K, f ðAÞ is defined to be the following matrix: f ðAÞ ¼ a0I þ a1A þ a2A2 þ þ anAn [Note that f ðAÞ is obtained from f ðxÞ by substituting the matrix A for the variable x and substituting the scalar matrix a0I for the scalar a0.] If f ðAÞ is the zero matrix, then A is called a zero or root of f ðxÞ. EXAMPLE 2.9 Suppose A ¼ 1 2 3 4 . Then A2 ¼ 1 2 3 4 1 2 3 4 ¼ 7 6 9 22 and A3 ¼ A2 A ¼ 7 6 9 22 1 2 3 4 ¼ 11 38 57 106 Suppose f ðxÞ ¼ 2x2 3x þ 5 and gðxÞ ¼ x2 þ 3x 10. Then f ðAÞ ¼ 2 7 6 9 22 3 1 2 3 4 þ 5 1 0 0 1 ¼ 16 18 27 61 gðAÞ ¼ 7 6 9 22 þ 3 1 2 3 4 10 1 0 0 1 ¼ 0 0 0 0 Thus, A is a zero of the polynomial gðxÞ. 2.9 Invertible (Nonsingular) Matrices A square matrix A is said to be invertible or nonsingular if there exists a matrix B such that AB ¼ BA ¼ I where I is the identity matrix. Such a matrix B is unique. That is, if AB1 ¼ B1A ¼ I and AB2 ¼ B2A ¼ I, then B1 ¼ B1I ¼ B1ðAB2Þ ¼ ðB1AÞB2 ¼ IB2 ¼ B2 We call such a matrix B the inverse of A and denote it by A1 . Observe that the above relation is symmetric; that is, if B is the inverse of A, then A is the inverse of B. EXAMPLE 2.10 Suppose that A ¼ 2 5 1 3 and B ¼ 3 5 1 2 . Then AB ¼ 6 5 10 þ 10 3 3 5 þ 6 ¼ 1 0 0 1 and BA ¼ 6 5 15 15 2 þ 2 5 þ 6 ¼ 1 0 0 1 Thus, A and B are inverses. It is known (Theorem 3.16) that AB ¼ I if and only if BA ¼ I. Thus, it is necessary to test only one product to determine whether or not two given matrices are inverses. (See Problem 2.17.) Now suppose A and B are invertible. Then AB is invertible and ðABÞ1 ¼ B1 A1 . More generally, if A1; A2; . . . ; Ak are invertible, then their product is invertible and ðA1A2 . . . AkÞ1 ¼ A1 k . . . A1 2 A1 1 the product of the inverses in the reverse order. 34 CHAPTER 2 Algebra of Matrices
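The polynomial evaluation of Section 2.8 follows the same recipe as for numbers, with the constant a0 replaced by a0 I. A NumPy sketch (illustrative; NumPy and the helper name are not part of the text) that reproduces Example 2.9, where A is a zero of g(x) = x^2 + 3x − 10:

```python
import numpy as np

def poly_of_matrix(coeffs, A):
    """f(A) = a0*I + a1*A + a2*A^2 + ... for f(x) = a0 + a1*x + a2*x^2 + ..."""
    n = A.shape[0]
    result = np.zeros((n, n), dtype=A.dtype)
    power = np.eye(n, dtype=A.dtype)          # A^0 = I
    for a in coeffs:                          # coeffs = [a0, a1, a2, ...]
        result = result + a * power
        power = power @ A
    return result

A = np.array([[1, 2], [3, -4]])

print(A @ A)                                  # [[ 7 -6] [-9 22]]
print(poly_of_matrix([5, -3, 2], A))          # f(x) = 2x^2 - 3x + 5 -> [[ 16 -18] [-27  61]]
print(poly_of_matrix([-10, 3, 1], A))         # g(x) = x^2 + 3x - 10 -> the zero matrix
```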
  • 42. Inverse of a 2 2 Matrix Let A be an arbitrary 2 2 matrix, say A ¼ a b c d . We want to derive a formula for A1 , the inverse of A. Specifically, we seek 22 ¼ 4 scalars, say x1, y1, x2, y2, such that a b c d x1 x2 y1 y2 ¼ 1 0 0 1 or ax1 þ by1 ax2 þ by2 cx1 þ dy1 cx2 þ dy2 ¼ 1 0 0 1 Setting the four entries equal to the corresponding entries in the identity matrix yields four equations, which can be partitioned into two 2 2 systems as follows: ax1 þ by1 ¼ 1; ax2 þ by2 ¼ 0 cx1 þ dy1 ¼ 0; cx2 þ dy2 ¼ 1 Suppose we let jAj ¼ ab bc (called the determinant of A). Assuming jAj 6¼ 0, we can solve uniquely for the above unknowns x1, y1, x2, y2, obtaining x1 ¼ d jAj ; y1 ¼ c jAj ; x2 ¼ b jAj ; y2 ¼ a jAj Accordingly, A1 ¼ a b c d 1 ¼ d=jAj b=jAj c=jAj a=jAj ¼ 1 jAj d b c a In other words, when jAj 6¼ 0, the inverse of a 2 2 matrix A may be obtained from A as follows: (1) Interchange the two elements on the diagonal. (2) Take the negatives of the other two elements. (3) Multiply the resulting matrix by 1=jAj or, equivalently, divide each element by jAj. In case jAj ¼ 0, the matrix A is not invertible. EXAMPLE 2.11 Find the inverse of A ¼ 2 3 4 5 and B ¼ 1 3 2 6 . First evaluate jAj ¼ 2ð5Þ 3ð4Þ ¼ 10 12 ¼ 2. Because jAj 6¼ 0, the matrix A is invertible and A1 ¼ 1 2 5 3 4 2 ¼ 5 2 3 2 2 1 Now evaluate jBj ¼ 1ð6Þ 3ð2Þ ¼ 6 6 ¼ 0. Because jBj ¼ 0, the matrix B has no inverse. Remark: The above property that a matrix is invertible if and only if A has a nonzero determinant is true for square matrices of any order. (See Chapter 8.) Inverse of an n n Matrix Suppose A is an arbitrary n-square matrix. Finding its inverse A1 reduces, as above, to finding the solution of a collection of n n systems of linear equations. The solution of such systems and an efficient way of solving such a collection of systems is treated in Chapter 3. 2.10 Special Types of Square Matrices This section describes a number of special kinds of square matrices. Diagonal and Triangular Matrices A square matrix D ¼ ½dij is diagonal if its nondiagonal entries are all zero. Such a matrix is sometimes denoted by D ¼ diagðd11; d22; . . . ; dnnÞ CHAPTER 2 Algebra of Matrices 35
  • 43. where some or all the dii may be zero. For example, 3 0 0 0 7 0 0 0 2 2 4 3 5; 4 0 0 5 ; 6 0 9 8 2 6 6 4 3 7 7 5 are diagonal matrices, which may be represented, respectively, by diagð3; 7; 2Þ; diagð4; 5Þ; diagð6; 0; 9; 8Þ (Observe that patterns of 0’s in the third matrix have been omitted.) A square matrix A ¼ ½aij is upper triangular or simply triangular if all entries below the (main) diagonal are equal to 0—that is, if aij ¼ 0 for i j. Generic upper triangular matrices of orders 2, 3, 4 are as follows: a11 a12 0 a22 ; b11 b12 b13 b22 b23 b33 2 4 3 5; c11 c12 c13 c14 c22 c23 c24 c33 c34 c44 2 6 6 4 3 7 7 5 (As with diagonal matrices, it is common practice to omit patterns of 0’s.) The following theorem applies. THEOREM 2.5: Suppose A ¼ ½aij and B ¼ ½bij are n n (upper) triangular matrices. Then (i) A þ B, kA, AB are triangular with respective diagonals: ða11 þ b11; . . . ; ann þ bnnÞ; ðka11; . . . ; kannÞ; ða11b11; . . . ; annbnnÞ (ii) For any polynomial f ðxÞ, the matrix f ðAÞ is triangular with diagonal ð f ða11Þ; f ða22Þ; . . . ; f ðannÞÞ (iii) A is invertible if and only if each diagonal element aii 6¼ 0, and when A1 exists it is also triangular. A lower triangular matrix is a square matrix whose entries above the diagonal are all zero. We note that Theorem 2.5 is true if we replace ‘‘triangular’’ by either ‘‘lower triangular’’ or ‘‘diagonal.’’ Remark: A nonempty collection A of matrices is called an algebra (of matrices) if A is closed under the operations of matrix addition, scalar multiplication, and matrix multiplication. Clearly, the square matrices with a given order form an algebra of matrices, but so do the scalar, diagonal, triangular, and lower triangular matrices. Special Real Square Matrices: Symmetric, Orthogonal, Normal [Optional until Chapter 12] Suppose now A is a square matrix with real entries—that is, a real square matrix. The relationship between A and its transpose AT yields important kinds of matrices. (a) Symmetric Matrices A matrix A is symmetric if AT ¼ A. Equivalently, A ¼ ½aij is symmetric if symmetric elements (mirror elements with respect to the diagonal) are equal—that is, if each aij ¼ aji. A matrix A is skew-symmetric if AT ¼ A or, equivalently, if each aij ¼ aji. Clearly, the diagonal elements of such a matrix must be zero, because aii ¼ aii implies aii ¼ 0. (Note that a matrix A must be square if AT ¼ A or AT ¼ A.) 36 CHAPTER 2 Algebra of Matrices
  • 44. EXAMPLE 2.12 Let A ¼ 2 3 5 3 6 7 5 7 8 2 4 3 5; B ¼ 0 3 4 3 0 5 4 5 0 2 4 3 5; C ¼ 1 0 0 0 0 1 : (a) By inspection, the symmetric elements in A are equal, or AT ¼ A. Thus, A is symmetric. (b) The diagonal elements of B are 0 and symmetric elements are negatives of each other, or BT ¼ B. Thus, B is skew-symmetric. (c) Because C is not square, C is neither symmetric nor skew-symmetric. (b) Orthogonal Matrices A real matrix A is orthogonal if AT ¼ A1 —that is, if AAT ¼ AT A ¼ I. Thus, A must necessarily be square and invertible. EXAMPLE 2.13 Let A ¼ 1 9 8 9 4 9 4 9 4 9 7 9 8 9 1 9 4 9 2 6 4 3 7 5. Multiplying A by AT yields I; that is, AAT ¼ I. This means AT A ¼ I, as well. Thus, AT ¼ A1 ; that is, A is orthogonal. Now suppose A is a real orthogonal 3 3 matrix with rows u1 ¼ ða1; a2; a3Þ; u2 ¼ ðb1; b2; b3Þ; u3 ¼ ðc1; c2; c3Þ Because A is orthogonal, we must have AAT ¼ I. Namely, AAT ¼ a1 a2 a3 b1 b2 b3 c1 c2 c3 2 4 3 5 a1 b1 c1 a2 b2 c2 a3 b3 c3 2 4 3 5 ¼ 1 0 0 0 1 0 0 0 1 2 4 3 5 ¼ I Multiplying A by AT and setting each entry equal to the corresponding entry in I yields the following nine equations: a2 1 þ a2 2 þ a2 3 ¼ 1; a1b1 þ a2b2 þ a3b3 ¼ 0; a1c1 þ a2c2 þ a3c3 ¼ 0 b1a1 þ b2a2 þ b3a3 ¼ 0; b2 1 þ b2 2 þ b2 3 ¼ 1; b1c1 þ b2c2 þ b3c3 ¼ 0 c1a1 þ c2a2 þ c3a3 ¼ 0; c1b1 þ c2b2 þ c3b3 ¼ 0; c2 1 þ c2 2 þ c2 3 ¼ 1 Accordingly, u1 u1 ¼ 1, u2 u2 ¼ 1, u3 u3 ¼ 1, and ui uj ¼ 0 for i 6¼ j. Thus, the rows u1, u2, u3 are unit vectors and are orthogonal to each other. Generally speaking, vectors u1, u2; . . . ; um in Rn are said to form an orthonormal set of vectors if the vectors are unit vectors and are orthogonal to each other; that is, ui uj ¼ 0 if i 6¼ j 1 if i ¼ j In other words, ui uj ¼ dij where dij is the Kronecker delta function: We have shown that the condition AAT ¼ I implies that the rows of A form an orthonormal set of vectors. The condition AT A ¼ I similarly implies that the columns of A also form an orthonormal set of vectors. Furthermore, because each step is reversible, the converse is true. The above results for 3 3 matrices are true in general. That is, the following theorem holds. THEOREM 2.6: Let A be a real matrix. Then the following are equivalent: (a) A is orthogonal. (b) The rows of A form an orthonormal set. (c) The columns of A form an orthonormal set. For n ¼ 2, we have the following result (proved in Problem 2.28). CHAPTER 2 Algebra of Matrices 37
  • 45. THEOREM 2.7: Let A be a real 2 2 orthogonal matrix. Then, for some real number y, A ¼ cos y sin y sin y cos y or A ¼ cos y sin y sin y cos y (c) Normal Matrices A real matrix A is normal if it commutes with its transpose AT —that is, if AAT ¼ AT A. If A is symmetric, orthogonal, or skew-symmetric, then A is normal. There are also other normal matrices. EXAMPLE 2.14 Let A ¼ 6 3 3 6 . Then AAT ¼ 6 3 3 6 6 3 3 6 ¼ 45 0 0 45 and AT A ¼ 6 3 3 6 6 3 3 6 ¼ 45 0 0 45 Because AAT ¼ AT A, the matrix A is normal. 2.11 Complex Matrices Let A be a complex matrix—that is, a matrix with complex entries. Recall (Section 1.7) that if z ¼ a þ bi is a complex number, then z ¼ a bi is its conjugate. The conjugate of a complex matrix A, written A, is the matrix obtained from A by taking the conjugate of each entry in A. That is, if A ¼ ½aij, then A ¼ ½bij, where bij ¼ aij. (We denote this fact by writing A ¼ ½ aij.) The two operations of transpose and conjugation commute for any complex matrix A, and the special notation AH is used for the conjugate transpose of A. That is, AH ¼ ð AÞT ¼ ðAT Þ Note that if A is real, then AH ¼ AT . [Some texts use A* instead of AH :] EXAMPLE 2.15 Let A ¼ 2 þ 8i 5 3i 4 7i 6i 1 4i 3 þ 2i . Then AH ¼ 2 8i 6i 5 þ 3i 1 þ 4i 4 þ 7i 3 2i 2 4 3 5. Special Complex Matrices: Hermitian, Unitary, Normal [Optional until Chapter 12] Consider a complex matrix A. The relationship between A and its conjugate transpose AH yields important kinds of complex matrices (which are analogous to the kinds of real matrices described above). A complex matrix A is said to be Hermitian or skew-Hermitian according as to whether AH ¼ A or AH ¼ A: Clearly, A ¼ ½aij is Hermitian if and only if symmetric elements are conjugate—that is, if each aij ¼ aji—in which case each diagonal element aii must be real. Similarly, if A is skew-symmetric, then each diagonal element aii ¼ 0. (Note that A must be square if AH ¼ A or AH ¼ A.) A complex matrix A is unitary if AH A1 ¼ A1 AH ¼ I—that is, if AH ¼ A1 : Thus, A must necessarily be square and invertible. We note that a complex matrix A is unitary if and only if its rows (columns) form an orthonormal set relative to the dot product of complex vectors. A complex matrix A is said to be normal if it commutes with AH —that is, if AAH ¼ AH A 38 CHAPTER 2 Algebra of Matrices
  • 46. (Thus, A must be a square matrix.) This definition reduces to that for real matrices when A is real. EXAMPLE 2.16 Consider the following complex matrices: A ¼ 3 1 2i 4 þ 7i 1 þ 2i 4 2i 4 7i 2i 5 2 4 3 5 B ¼ 1 2 1 i 1 þ i i 1 1 þ i 1 þ i 1 þ i 0 2 4 3 5 C ¼ 2 þ 3i 1 i 1 þ 2i (a) By inspection, the diagonal elements of A are real, and the symmetric elements 1 2i and 1 þ 2i are conjugate, 4 þ 7i and 4 7i are conjugate, and 2i and 2i are conjugate. Thus, A is Hermitian. (b) Multiplying B by BH yields I; that is, BBH ¼ I. This implies BH B ¼ I, as well. Thus, BH ¼ B1 , which means B is unitary. (c) To show C is normal, we evaluate CCH and CH C: CCH ¼ 2 þ 3i 1 i 1 þ 2i 2 3i i 1 1 2i ¼ 14 4 4i 4 þ 4i 6 and similarly CH C ¼ 14 4 4i 4 þ 4i 6 . Because CCH ¼ CH C, the complex matrix C is normal. We note that when a matrix A is real, Hermitian is the same as symmetric, and unitary is the same as orthogonal. 2.12 Block Matrices Using a system of horizontal and vertical (dashed) lines, we can partition a matrix A into submatrices called blocks (or cells) of A. Clearly a given matrix may be divided into blocks in different ways. For example, 1 2 0 1 3 2 3 5 7 2 3 1 4 5 9 4 6 3 1 8 2 6 6 4 3 7 7 5; 1 2 0 1 3 2 3 5 7 2 3 1 4 5 9 4 6 3 1 8 2 6 6 4 3 7 7 5; 1 2 0 1 3 2 3 5 7 2 3 1 4 5 9 4 6 3 1 8 2 6 6 4 3 7 7 5 The convenience of the partition of matrices, say A and B, into blocks is that the result of operations on A and B can be obtained by carrying out the computation with the blocks, just as if they were the actual elements of the matrices. This is illustrated below, where the notation A ¼ ½Aij will be used for a block matrix A with blocks Aij. Suppose that A ¼ ½Aij and B ¼ ½Bij are block matrices with the same numbers of row and column blocks, and suppose that corresponding blocks have the same size. Then adding the corresponding blocks of A and B also adds the corresponding elements of A and B, and multiplying each block of A by a scalar k multiplies each element of A by k. Thus, A þ B ¼ A11 þ B11 A12 þ B12 . . . A1n þ B1n A21 þ B21 A22 þ B22 . . . A2n þ B2n . . . . . . . . . . . . Am1 þ Bm1 Am2 þ Bm2 . . . Amn þ Bmn 2 6 6 6 4 3 7 7 7 5 and kA ¼ kA11 kA12 . . . kA1n kA21 kA22 . . . kA2n . . . . . . . . . . . . kAm1 kAm2 . . . kAmn 2 6 6 4 3 7 7 5 CHAPTER 2 Algebra of Matrices 39
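The three checks in Example 2.16 all reduce to comparing a matrix with its conjugate transpose A^H. A NumPy sketch (an illustration; NumPy and the helper names are assumptions of this example), using the matrix C of part (c) with entries as read from the example:

```python
import numpy as np

def H(M):
    """Conjugate transpose M^H = (conjugate of M) transposed."""
    return M.conj().T

def is_hermitian(M):
    return np.allclose(H(M), M)

def is_unitary(M):
    n = M.shape[0]
    return np.allclose(M @ H(M), np.eye(n)) and np.allclose(H(M) @ M, np.eye(n))

def is_normal(M):
    return np.allclose(M @ H(M), H(M) @ M)

C = np.array([[2 + 3j, 1],
              [1j, 1 + 2j]])

print(C @ H(C))         # [[14, 4-4j], [4+4j, 6]]
print(H(C) @ C)         # [[14, 4-4j], [4+4j, 6]]
print(is_normal(C))     # True  (C commutes with C^H)
print(is_hermitian(C))  # False
print(is_unitary(C))    # False
```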
  • 47. The case of matrix multiplication is less obvious, but still true. That is, suppose that U ¼ ½Uik and V ¼ ½Vkj are block matrices such that the number of columns of each block Uik is equal to the number of rows of each block Vkj. (Thus, each product UikVkj is defined.) Then UV ¼ W11 W12 . . . W1n W21 W22 . . . W2n . . . . . . . . . . . . Wm1 Wm2 . . . Wmn 2 6 6 4 3 7 7 5; where Wij ¼ Ui1V1j þ Ui2V2j þ þ UipVpj The proof of the above formula for UV is straightforward but detailed and lengthy. It is left as an exercise (Problem 2.85). Square Block Matrices Let M be a block matrix. Then M is called a square block matrix if (i) M is a square matrix. (ii) The blocks form a square matrix. (iii) The diagonal blocks are also square matrices. The latter two conditions will occur if and only if there are the same number of horizontal and vertical lines and they are placed symmetrically. Consider the following two block matrices: A ¼ 1 2 3 4 5 1 1 1 1 1 9 8 7 6 5 4 4 4 4 4 3 5 3 5 3 2 6 6 6 6 4 3 7 7 7 7 5 and B ¼ 1 2 3 4 5 1 1 1 1 1 9 8 7 6 5 4 4 4 4 4 3 5 3 5 3 2 6 6 6 6 4 3 7 7 7 7 5 The block matrix A is not a square block matrix, because the second and third diagonal blocks are not square. On the other hand, the block matrix B is a square block matrix. Block Diagonal Matrices Let M ¼ ½Aij be a square block matrix such that the nondiagonal blocks are all zero matrices; that is, Aij ¼ 0 when i 6¼ j. Then M is called a block diagonal matrix. We sometimes denote such a block diagonal matrix by writing M ¼ diagðA11; A22; . . . ; ArrÞ or M ¼ A11 A22 Arr The importance of block diagonal matrices is that the algebra of the block matrix is frequently reduced to the algebra of the individual blocks. Specifically, suppose f ðxÞ is a polynomial and M is the above block diagonal matrix. Then f ðMÞ is a block diagonal matrix, and f ðMÞ ¼ diagð f ðA11Þ; f ðA22Þ; . . . ; f ðArrÞÞ Also, M is invertible if and only if each Aii is invertible, and, in such a case, M1 is a block diagonal matrix, and M1 ¼ diagðA1 11 ; A1 22 ; . . . ; A1 rr Þ Analogously, a square block matrix is called a block upper triangular matrix if the blocks below the diagonal are zero matrices and a block lower triangular matrix if the blocks above the diagonal are zero matrices. 40 CHAPTER 2 Algebra of Matrices
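The block-multiplication rule UV = [W_ij] with W_ij = U_i1 V_1j + ... + U_ip V_pj can be spot-checked numerically. The sketch below is my own illustration, not an example from the text: it partitions two random matrices into compatible 2-by-2 grids of blocks, multiplies blockwise, and compares the result with the ordinary product.

```python
import numpy as np

rng = np.random.default_rng(0)
U = rng.integers(-3, 4, size=(4, 5)).astype(float)
V = rng.integers(-3, 4, size=(5, 3)).astype(float)

# Split U's columns and V's rows at the same index (here 2) so every
# block product U_ik @ V_kj is defined; the outer splits are arbitrary.
U_blocks = [[U[:2, :2], U[:2, 2:]],
            [U[2:, :2], U[2:, 2:]]]
V_blocks = [[V[:2, :2], V[:2, 2:]],
            [V[2:, :2], V[2:, 2:]]]

# W_ij = sum_k U_ik V_kj, reassembled into one matrix with np.block
W = np.block([[sum(U_blocks[i][k] @ V_blocks[k][j] for k in range(2))
               for j in range(2)]
              for i in range(2)])

print(np.allclose(W, U @ V))   # True: blockwise product equals the ordinary product
```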
  • 48. EXAMPLE 2.17 Determine which of the following square block matrices are upper diagonal, lower diagonal, or diagonal: A ¼ 1 2 0 3 4 5 0 0 6 2 4 3 5; B ¼ 1 0 0 0 2 3 4 0 5 0 6 0 0 7 8 9 2 6 6 4 3 7 7 5; C ¼ 1 0 0 0 2 3 0 4 5 2 4 3 5; D ¼ 1 2 0 3 4 5 0 6 7 2 4 3 5 (a) A is upper triangular because the block below the diagonal is a zero block. (b) B is lower triangular because all blocks above the diagonal are zero blocks. (c) C is diagonal because the blocks above and below the diagonal are zero blocks. (d) D is neither upper triangular nor lower triangular. Also, no other partitioning of D will make it into either a block upper triangular matrix or a block lower triangular matrix. SOLVED PROBLEMS Matrix Addition and Scalar Multiplication 2.1 Given A ¼ 1 2 3 4 5 6 and B ¼ 3 0 2 7 1 8 , find: (a) A þ B, (b) 2A 3B. (a) Add the corresponding elements: A þ B ¼ 1 þ 3 2 þ 0 3 þ 2 4 7 5 þ 1 6 þ 8 ¼ 4 2 5 3 6 2 (b) First perform the scalar multiplication and then a matrix addition: 2A 3B ¼ 2 4 6 8 10 12 þ 9 0 6 21 3 24 ¼ 7 4 0 29 7 36 (Note that we multiply B by 3 and then add, rather than multiplying B by 3 and subtracting. This usually prevents errors.) 2.2. Find x; y; z; t where 3 x y z t ¼ x 6 1 2t þ 4 x þ y z þ t 3 : Write each side as a single equation: 3x 3y 3z 3t ¼ x þ 4 x þ y þ 6 z þ t 1 2t þ 3 Set corresponding entries equal to each other to obtain the following system of four equations: 3x ¼ x þ 4; 3y ¼ x þ y þ 6; 3z ¼ z þ t 1; 3t ¼ 2t þ 3 or 2x ¼ 4; 2y ¼ 6 þ x; 2z ¼ t 1; t ¼ 3 The solution is x ¼ 2, y ¼ 4, z ¼ 1, t ¼ 3. 2.3. Prove Theorem 2.1 (i) and (v): (i) ðA þ BÞ þ C ¼ A þ ðB þ CÞ, (v) kðA þ BÞ ¼ kA þ kB. Suppose A ¼ ½aij, B ¼ ½bij, C ¼ ½cij. The proof reduces to showing that corresponding ij-entries in each side of each matrix equation are equal. [We prove only (i) and (v), because the other parts of Theorem 2.1 are proved similarly.] CHAPTER 2 Algebra of Matrices 41
  • 49. (i) The ij-entry of A þ B is aij þ bij; hence, the ij-entry of ðA þ BÞ þ C is ðaij þ bijÞ þ cij. On the other hand, the ij-entry of B þ C is bij þ cij; hence, the ij-entry of A þ ðB þ CÞ is aij þ ðbij þ cijÞ. However, for scalars in K, ðaij þ bijÞ þ cij ¼ aij þ ðbij þ cijÞ Thus, ðA þ BÞ þ C and A þ ðB þ CÞ have identical ij-entries. Therefore, ðA þ BÞ þ C ¼ A þ ðB þ CÞ. (v) The ij-entry of A þ B is aij þ bij; hence, kðaij þ bijÞ is the ij-entry of kðA þ BÞ. On the other hand, the ij- entries of kA and kB are kaij and kbij, respectively. Thus, kaij þ kbij is the ij-entry of kA þ kB. However, for scalars in K, kðaij þ bijÞ ¼ kaij þ kbij Thus, kðA þ BÞ and kA þ kB have identical ij-entries. Therefore, kðA þ BÞ ¼ kA þ kB. Matrix Multiplication 2.4. Calculate: (a) ½8; 4; 5 3 2 1 2 4 3 5, (b) ½6; 1; 7; 5 4 9 3 2 2 6 6 4 3 7 7 5, (c) ½3; 8; 2; 4 5 1 6 2 4 3 5 (a) Multiply the corresponding entries and add: ½8; 4; 5 3 2 1 2 4 3 5 ¼ 8ð3Þ þ ð4Þð2Þ þ 5ð1Þ ¼ 24 8 5 ¼ 11 (b) Multiply the corresponding entries and add: ½6; 1; 7; 5 4 9 3 2 2 6 6 6 4 3 7 7 7 5 ¼ 24 þ 9 21 þ 10 ¼ 22 (c) The product is not defined when the row matrix and the column matrix have different numbers of elements. 2.5. Let ðr sÞ denote an r s matrix. Find the sizes of those matrix products that are defined: (a) ð2 3Þð3 4Þ; (c) ð1 2Þð3 1Þ; (e) ð4 4Þð3 3Þ (b) ð4 1Þð1 2Þ, (d) ð5 2Þð2 3Þ, (f) ð2 2Þð2 4Þ In each case, the product is defined if the inner numbers are equal, and then the product will have the size of the outer numbers in the given order. (a) 2 4, (c) not defined, (e) not defined (b) 4 2, (d) 5 3, (f) 2 4 2.6. Let A ¼ 1 3 2 1 and B ¼ 2 0 4 3 2 6 . Find: (a) AB, (b) BA. (a) Because A is a 2 2 matrix and B a 2 3 matrix, the product AB is defined and is a 2 3 matrix. To obtain the entries in the first row of AB, multiply the first row ½1; 3 of A by the columns 2 3 ; 0 2 ; 4 6 of B, respectively, as follows: AB ¼ 1 3 2 1 2 0 4 3 2 6 ¼ 2 þ 9 0 6 4 þ 18 ¼ 11 6 14 42 CHAPTER 2 Algebra of Matrices
  • 50. To obtain the entries in the second row of AB, multiply the second row ½2; 1 of A by the columns of B: AB ¼ 1 3 2 1 2 0 4 3 2 6 ¼ 11 6 14 4 3 0 þ 2 8 6 Thus, AB ¼ 11 6 14 1 2 14 : (b) The size of B is 2 3 and that of A is 2 2. The inner numbers 3 and 2 are not equal; hence, the product BA is not defined. 2.7. Find AB, where A ¼ 2 3 1 4 2 5 and B ¼ 2 1 0 6 1 3 5 1 4 1 2 2 2 4 3 5. Because A is a 2 3 matrix and B a 3 4 matrix, the product AB is defined and is a 2 4 matrix. Multiply the rows of A by the columns of B to obtain AB ¼ 4 þ 3 4 2 þ 9 1 0 15 þ 2 12 þ 3 2 8 2 þ 20 4 6 þ 5 0 þ 10 10 24 2 þ 10 ¼ 3 6 13 13 26 5 0 32 : 2.8. Find: (a) 1 6 3 5 2 7 , (b) 2 7 1 6 3 5 , (c) ½2; 7 1 6 3 5 . (a) The first factor is 2 2 and the second is 2 1, so the product is defined as a 2 1 matrix: 1 6 3 5 2 7 ¼ 2 42 6 35 ¼ 40 41 (b) The product is not defined, because the first factor is 2 1 and the second factor is 2 2. (c) The first factor is 1 2 and the second factor is 2 2, so the product is defined as a 1 2 (row) matrix: ½2; 7 1 6 3 5 ¼ ½2 þ 21; 12 35 ¼ ½23; 23 2.9. Clearly, 0A ¼ 0 and A0 ¼ 0, where the 0’s are zero matrices (with possibly different sizes). Find matrices A and B with no zero entries such that AB ¼ 0. Let A ¼ 1 2 2 4 and B ¼ 6 2 3 1 . Then AB ¼ 0 0 0 0 . 2.10. Prove Theorem 2.2(i): ðABÞC ¼ AðBCÞ. Let A ¼ ½aij, B ¼ ½bjk, C ¼ ½ckl, and let AB ¼ S ¼ ½sik, BC ¼ T ¼ ½tjl. Then sik ¼ P m j¼1 aijbjk and tjl ¼ P n k¼1 bjkckl Multiplying S ¼ AB by C, the il-entry of ðABÞC is si1c1l þ si2c2l þ þ sincnl ¼ P n k¼1 sikckl ¼ P n k¼1 P m j¼1 ðaijbjkÞckl On the other hand, multiplying A by T ¼ BC, the il-entry of AðBCÞ is ai1t1l þ ai2t2l þ þ aintnl ¼ P m j¼1 aijtjl ¼ P m j¼1 P n k¼1 aijðbjkcklÞ The above sums are equal; that is, corresponding elements in ðABÞC and AðBCÞ are equal. Thus, ðABÞC ¼ AðBCÞ. CHAPTER 2 Algebra of Matrices 43
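Problem 2.9's point, that AB can be the zero matrix even though neither A nor B has a zero entry, is quick to confirm. The check below is mine; the displayed solution has lost its minus signs, so the B used here is one consistent reading (any nonzero B whose columns lie in the null space of A would do).

```python
import numpy as np

A = np.array([[1, 2],
              [2, 4]])
B = np.array([[ 6,  2],
              [-3, -1]])   # signs inferred; each column of B is annihilated by A

print(A @ B)               # [[0 0], [0 0]] -> AB = 0 although A and B have no zero entries
```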
  • 51. 2.11. Prove Theorem 2.2(ii): AðB þ CÞ ¼ AB þ AC. Let A ¼ ½aij, B ¼ ½bjk, C ¼ ½cjk, and let D ¼ B þ C ¼ ½djk, E ¼ AB ¼ ½eik, F ¼ AC ¼ ½ fik. Then djk ¼ bjk þ cjk; eik ¼ P m j¼1 aijbjk; fik ¼ P m j¼1 aijcjk Thus, the ik-entry of the matrix AB þ AC is eik þ fik ¼ P m j¼1 aijbjk þ P m j¼1 aijcjk ¼ P m j¼1 aijðbjk þ cjkÞ On the other hand, the ik-entry of the matrix AD ¼ AðB þ CÞ is ai1d1k þ ai2d2k þ þ aimdmk ¼ P m j¼1 aijdjk ¼ P m j¼1 aijðbjk þ cjkÞ Thus, AðB þ CÞ ¼ AB þ AC, because the corresponding elements are equal. Transpose 2.12. Find the transpose of each matrix: A ¼ 1 2 3 7 8 9 ; B ¼ 1 2 3 2 4 5 3 5 6 2 4 3 5; C ¼ ½1; 3; 5; 7; D ¼ 2 4 6 2 4 3 5 Rewrite the rows of each matrix as columns to obtain the transpose of the matrix: AT ¼ 1 7 2 8 3 9 2 4 3 5; BT ¼ 1 2 3 2 4 5 3 5 6 2 4 3 5; CT ¼ 1 3 5 7 2 6 6 4 3 7 7 5; DT ¼ ½2; 4; 6 (Note that BT ¼ B; such a matrix is said to be symmetric. Note also that the transpose of the row vector C is a column vector, and the transpose of the column vector D is a row vector.) 2.13. Prove Theorem 2.3(iv): ðABÞT ¼ BT AT . Let A ¼ ½aik and B ¼ ½bkj. Then the ij-entry of AB is ai1b1j þ ai2b2j þ þ aimbmj This is the ji-entry (reverse order) of ðABÞT . Now column j of B becomes row j of BT , and row i of A becomes column i of AT . Thus, the ij-entry of BT AT is ½b1j; b2j; . . . ; bmj½ai1; ai2; . . . ; aim T ¼ b1jai1 þ b2jai2 þ þ bmjaim Thus, ðABÞT ¼ BT AT on because the corresponding entries are equal. Square Matrices 2.14. Find the diagonal and trace of each matrix: (a) A ¼ 1 3 6 2 5 8 4 2 9 2 4 3 5, (b) B ¼ 2 4 8 3 7 9 5 0 2 2 4 3 5, (c) C ¼ 1 2 3 4 5 6 . (a) The diagonal of A consists of the elements from the upper left corner of A to the lower right corner of A or, in other words, the elements a11, a22, a33. Thus, the diagonal of A consists of the numbers 1; 5, and 9. The trace of A is the sum of the diagonal elements. Thus, trðAÞ ¼ 1 5 þ 9 ¼ 5 (b) The diagonal of B consists of the numbers 2; 7, and 2. Hence, trðBÞ ¼ 2 7 þ 2 ¼ 3 (c) The diagonal and trace are only defined for square matrices. 44 CHAPTER 2 Algebra of Matrices
  • 52. 2.15. Let A ¼ 1 2 4 3 , and let f ðxÞ ¼ 2x3 4x þ 5 and gðxÞ ¼ x2 þ 2x þ 11. Find (a) A2 , (b) A3 , (c) f ðAÞ, (d) gðAÞ. (a) A2 ¼ AA ¼ 1 2 4 3 1 2 4 3 ¼ 1 þ 8 2 6 4 12 8 þ 9 ¼ 9 4 8 17 (b) A3 ¼ AA2 ¼ 1 2 4 3 9 4 8 17 ¼ 9 16 4 þ 34 36 þ 24 16 51 ¼ 7 30 60 67 (c) First substitute A for x and 5I for the constant in f ðxÞ, obtaining f ðAÞ ¼ 2A3 4A þ 5I ¼ 2 7 30 60 67 4 1 2 4 3 þ 5 1 0 0 1 Now perform the scalar multiplication and then the matrix addition: f ðAÞ ¼ 14 60 120 134 þ 4 8 16 12 þ 5 0 0 5 ¼ 13 52 104 117 (d) Substitute A for x and 11I for the constant in gðxÞ, and then calculate as follows: gðAÞ ¼ A2 þ 2A 11I ¼ 9 4 8 17 þ 2 1 2 4 3 11 1 0 0 1 ¼ 9 4 8 17 þ 2 4 8 6 þ 11 0 0 11 ¼ 0 0 0 0 Because gðAÞ is the zero matrix, A is a root of the polynomial gðxÞ. 2.16. Let A ¼ 1 3 4 3 . (a) Find a nonzero column vector u ¼ x y such that Au ¼ 3u. (b) Describe all such vectors. (a) First set up the matrix equation Au ¼ 3u, and then write each side as a single matrix (column vector) as follows: 1 3 4 3 x y ¼ 3 x y ; and then x þ 3y 4x 3y ¼ 3x 3y Set the corresponding elements equal to each other to obtain a system of equations: x þ 3y ¼ 3x 4x 3y ¼ 3y or 2x 3y ¼ 0 4x 6y ¼ 0 or 2x 3y ¼ 0 The system reduces to one nondegenerate linear equation in two unknowns, and so has an infinite number of solutions. To obtain a nonzero solution, let, say, y ¼ 2; then x ¼ 3. Thus, u ¼ ð3; 2ÞT is a desired nonzero vector. (b) To find the general solution, set y ¼ a, where a is a parameter. Substitute y ¼ a into 2x 3y ¼ 0 to obtain x ¼ 3 2 a. Thus, u ¼ ð3 2 a; aÞT represents all such solutions. Invertible Matrices, Inverses 2.17. Show that A ¼ 1 0 2 2 1 3 4 1 8 2 4 3 5 and B ¼ 11 2 2 4 0 1 6 1 1 2 4 3 5 are inverses. Compute the product AB, obtaining AB ¼ 11 þ 0 þ 12 2 þ 0 2 2 þ 0 2 22 þ 4 þ 18 4 þ 0 3 4 1 3 44 4 þ 48 8 þ 0 8 8 þ 1 8 2 4 3 5 ¼ 1 0 0 0 1 0 0 0 1 2 4 3 5 ¼ I Because AB ¼ I, we can conclude (Theorem 3.16) that BA ¼ I. Accordingly, A and B are inverses. CHAPTER 2 Algebra of Matrices 45
  • 53. 2.18. Find the inverse, if possible, of each matrix: (a) A ¼ 5 3 4 2 ; (b) B ¼ 2 3 1 3 ; (c) 2 6 3 9 : Use the formula for the inverse of a 2 2 matrix appearing in Section 2.9. (a) First find jAj ¼ 5ð2Þ 3ð4Þ ¼ 10 12 ¼ 2. Next interchange the diagonal elements, take the negatives of the nondiagonal elements, and multiply by 1=jAj: A1 ¼ 1 2 2 3 4 5 ¼ 1 3 2 2 5 2 # (b) First find jBj ¼ 2ð3Þ ð3Þð1Þ ¼ 6 þ 3 ¼ 9. Next interchange the diagonal elements, take the negatives of the nondiagonal elements, and multiply by 1=jBj: B1 ¼ 1 9 3 3 1 2 ¼ 1 3 1 3 1 9 2 9 # (c) First find jCj ¼ 2ð9Þ 6ð3Þ ¼ 18 18 ¼ 0. Because jCj ¼ 0; C has no inverse. 2.19. Let A ¼ 1 1 1 0 1 2 1 2 4 2 6 6 4 3 7 7 5. Find A1 ¼ x1 x2 x3 y1 y2 y3 z1 z2 z3 2 4 3 5. Multiplying A by A1 and setting the nine entries equal to the nine entries of the identity matrix I yields the following three systems of three equations in three of the unknowns: x1 þ y1 þ z1 ¼ 1 x2 þ y2 þ z2 ¼ 0 x3 þ y3 þ z3 ¼ 0 y1 þ 2z1 ¼ 0 y2 þ 2z2 ¼ 1 y3 þ 2z3 ¼ 0 x1 þ 2y1 þ 4z1 ¼ 0 x2 þ 2y2 þ 4z2 ¼ 0 x3 þ 2y3 þ 4z3 ¼ 1 [Note that A is the coefficient matrix for all three systems.] Solving the three systems for the nine unknowns yields x1 ¼ 0; y1 ¼ 2; z1 ¼ 1; x2 ¼ 2; y2 ¼ 3; z2 ¼ 1; x3 ¼ 1; y3 ¼ 2; z3 ¼ 1 Thus; A1 ¼ 0 2 1 2 3 2 1 1 1 2 6 4 3 7 5 (Remark: Chapter 3 gives an efficient way to solve the three systems.) 2.20. Let A and B be invertible matrices (with the same size). Show that AB is also invertible and ðABÞ1 ¼ B1 A1 . [Thus, by induction, ðA1A2 . . . AmÞ1 ¼ A1 m . . . A1 2 A1 1 .] Using the associativity of matrix multiplication, we get ðABÞðB1 A1 Þ ¼ AðBB1 ÞA1 ¼ AIA1 ¼ AA1 ¼ I ðB1 A1 ÞðABÞ ¼ B1 ðA1 AÞB ¼ A1 IB ¼ B1 B ¼ I Thus, ðABÞ1 ¼ B1 A1 . 46 CHAPTER 2 Algebra of Matrices
  • 54. Diagonal and Triangular Matrices 2.21. Write out the diagonal matrices A ¼ diagð4; 3; 7Þ, B ¼ diagð2; 6Þ, C ¼ diagð3; 8; 0; 5Þ. Put the given scalars on the diagonal and 0’s elsewhere: A ¼ 4 0 0 0 3 0 0 0 7 2 4 3 5; B ¼ 2 0 0 6 ; C ¼ 3 8 0 5 2 6 6 4 3 7 7 5 2.22. Let A ¼ diagð2; 3; 5Þ and B ¼ diagð7; 0; 4Þ. Find (a) AB, A2 , B2 ; (b) f ðAÞ, where f ðxÞ ¼ x2 þ 3x 2; (c) A1 and B1 . (a) The product matrix AB is a diagonal matrix obtained by multiplying corresponding diagonal entries; hence, AB ¼ diagð2ð7Þ; 3ð0Þ; 5ð4ÞÞ ¼ diagð14; 0; 20Þ Thus, the squares A2 and B2 are obtained by squaring each diagonal entry; hence, A2 ¼ diagð22 ; 32 ; 52 Þ ¼ diagð4; 9; 25Þ and B2 ¼ diagð49; 0; 16Þ (b) f ðAÞ is a diagonal matrix obtained by evaluating f ðxÞ at each diagonal entry. We have f ð2Þ ¼ 4 þ 6 2 ¼ 8; f ð3Þ ¼ 9 þ 9 2 ¼ 16; f ð5Þ ¼ 25 þ 15 2 ¼ 38 Thus, f ðAÞ ¼ diagð8; 16; 38Þ. (c) The inverse of a diagonal matrix is a diagonal matrix obtained by taking the inverse (reciprocal) of each diagonal entry. Thus, A1 ¼ diagð1 2 ; 1 3 ; 1 5Þ, but B has no inverse because there is a 0 on the diagonal. 2.23. Find a 2 2 matrix A such that A2 is diagonal but not A. Let A ¼ 1 2 3 1 . Then A2 ¼ 7 0 0 7 , which is diagonal. 2.24. Find an upper triangular matrix A such that A3 ¼ 8 57 0 27 . Set A ¼ x y 0 z . Then x3 ¼ 8, so x ¼ 2; and z3 ¼ 27, so z ¼ 3. Next calculate A3 using x ¼ 2 and y ¼ 3: A2 ¼ 2 y 0 3 2 y 0 3 ¼ 4 5y 0 9 and A3 ¼ 2 y 0 3 4 5y 0 9 ¼ 8 19y 0 27 Thus, 19y ¼ 57, or y ¼ 3. Accordingly, A ¼ 2 3 0 3 . 2.25. Let A ¼ ½aij and B ¼ ½bij be upper triangular matrices. Prove that AB is upper triangular with diagonal a11b11, a22b22; . . . ; annbnn. Let AB ¼ ½cij. Then cij ¼ Pn k¼1 aikbkj and cii ¼ Pn k¼1 aikbki. Suppose i j. Then, for any k, either i k or k j, so that either aik ¼ 0 or bkj ¼ 0. Thus, cij ¼ 0, and AB is upper triangular. Suppose i ¼ j. Then, for k i, we have aik ¼ 0; and, for k i, we have bki ¼ 0. Hence, cii ¼ aiibii, as claimed. [This proves one part of Theorem 2.5(i); the statements for A þ B and kA are left as exercises.] CHAPTER 2 Algebra of Matrices 47
  • 55. Special Real Matrices: Symmetric and Orthogonal 2.26. Determine whether or not each of the following matrices is symmetric—that is, AT ¼ A—or skew-symmetric—that is, AT ¼ A: (a) A ¼ 5 7 1 7 8 2 1 2 4 2 4 3 5; (b) B ¼ 0 4 3 4 0 5 3 5 0 2 4 3 5; (c) C ¼ 0 0 0 0 0 0 (a) By inspection, the symmetric elements (mirror images in the diagonal) are 7 and 7, 1 and 1, 2 and 2. Thus, A is symmetric, because symmetric elements are equal. (b) By inspection, the diagonal elements are all 0, and the symmetric elements, 4 and 4, 3 and 3, and 5 and 5, are negatives of each other. Hence, B is skew-symmetric. (c) Because C is not square, C is neither symmetric nor skew-symmetric. 2.27. Suppose B ¼ 4 x þ 2 2x 3 x þ 1 is symmetric. Find x and B. Set the symmetric elements x þ 2 and 2x 3 equal to each other, obtaining 2x 3 ¼ x þ 2 or x ¼ 5. Hence, B ¼ 4 7 7 6 . 2.28. Let A be an arbitrary 2 2 (real) orthogonal matrix. (a) Prove: If ða; bÞ is the first row of A, then a2 þ b2 ¼ 1 and A ¼ a b b a or A ¼ a b b a : (b) Prove Theorem 2.7: For some real number y, A ¼ cos y sin y sin y cos y or A ¼ cos y sin y sin y cos y (a) Suppose ðx; yÞ is the second row of A. Because the rows of A form an orthonormal set, we get a2 þ b2 ¼ 1; x2 þ y2 ¼ 1; ax þ by ¼ 0 Similarly, the columns form an orthogonal set, so a2 þ x2 ¼ 1; b2 þ y2 ¼ 1; ab þ xy ¼ 0 Therefore, x2 ¼ 1 a2 ¼ b2 , whence x ¼ b: Case (i): x ¼ b. Then bða þ yÞ ¼ 0, so y ¼ a. Case (ii): x ¼ b. Then bðy aÞ ¼ 0, so y ¼ a. This means, as claimed, A ¼ a b b a or A ¼ a b b a (b) Because a2 þ b2 ¼ 1, we have 1 a 1. Let a ¼ cos y. Then b2 ¼ 1 cos2 y, so b ¼ sin y. This proves the theorem. 2.29. Find a 2 2 orthogonal matrix A whose first row is a (positive) multiple of ð3; 4Þ. Normalize ð3; 4Þ to get ð3 5 ; 4 5Þ. Then, by Problem 2.28, A ¼ 3 5 4 5 4 5 3 5 # or A ¼ 3 5 4 5 4 5 3 5 # : 2.30. Find a 3 3 orthogonal matrix P whose first two rows are multiples of u1 ¼ ð1; 1; 1Þ and u2 ¼ ð0; 1; 1Þ, respectively. (Note that, as required, u1 and u2 are orthogonal.) 48 CHAPTER 2 Algebra of Matrices
  • 56. First find a nonzero vector u3 orthogonal to u1 and u2; say (cross product) u3 ¼ u1 u2 ¼ ð2; 1; 1Þ. Let A be the matrix whose rows are u1; u2; u3; and let P be the matrix obtained from A by normalizing the rows of A. Thus, A ¼ 1 1 1 0 1 1 2 1 1 2 6 4 3 7 5 and P ¼ 1= ffiffiffi 3 p 1= ffiffiffi 3 p 1= ffiffiffi 3 p 0 1= ffiffiffi 2 p 1= ffiffiffi 2 p 2= ffiffiffi 6 p 1= ffiffiffi 6 p 1= ffiffiffi 6 p 2 6 6 6 4 3 7 7 7 5 Complex Matrices: Hermitian and Unitary Matrices 2.31. Find AH where (a) A ¼ 3 5i 2 þ 4i 6 þ 7i 1 þ 8i , (b) A ¼ 2 3i 5 þ 8i 4 3 7i 6 i 5i 2 4 3 5 Recall that AH ¼ AT , the conjugate tranpose of A. Thus, (a) AH ¼ 3 þ 5i 6 7i 2 4i 1 8i , (b) AH ¼ 2 þ 3i 4 6 þ i 5 8i 3 þ 7i 5i 2.32. Show that A ¼ 1 3 2 3 i 2 3 i 2 3 i 1 3 2 3 i # is unitary. The rows of A form an orthonormal set: 1 3 2 3 i; 2 3 i 1 3 2 3 i; 2 3 i ¼ 1 9 þ 4 9 þ 4 9 ¼ 1 1 3 2 3 i; 2 3 i 2 3 i; 1 3 2 3 i ¼ 2 9 i þ 4 9 þ 2 9 i 4 9 ¼ 0 2 3 i; 1 3 2 3 i 2 3 i; 1 3 2 3 i ¼ 4 9 þ 1 9 þ 4 9 ¼ 1 Thus, A is unitary. 2.33. Prove the complex analogue of Theorem 2.6: Let A be a complex matrix. Then the following are equivalent: (i) A is unitary. (ii) The rows of A form an orthonormal set. (iii) The columns of A form an orthonormal set. (The proof is almost identical to the proof on page 37 for the case when A is a 3 3 real matrix.) First recall that the vectors u1; u2; . . . ; un in Cn form an orthonormal set if they are unit vectors and are orthogonal to each other, where the dot product in Cn is defined by ða1; a2; . . . ; anÞ ðb1; b2; . . . ; bnÞ ¼ a1 b1 þ a2 b2 þ þ an bn Suppose A is unitary, and R1; R2; . . . ; Rn are its rows. Then RT 1 ; RT 2 ; . . . ; RT n are the columns of AH . Let AAH ¼ ½cij. By matrix multiplication, cij ¼ Ri RT j ¼ Ri Rj. Because A is unitary, we have AAH ¼ I. Multi- plying A by AH and setting each entry cij equal to the corresponding entry in I yields the following n2 equations: R1 R1 ¼ 1; R2 R2 ¼ 1; . . . ; Rn Rn ¼ 1; and Ri Rj ¼ 0; for i 6¼ j Thus, the rows of A are unit vectors and are orthogonal to each other; hence, they form an orthonormal set of vectors. The condition AT A ¼ I similarly shows that the columns of A also form an orthonormal set of vectors. Furthermore, because each step is reversible, the converse is true. This proves the theorem. Block Matrices 2.34. Consider the following block matrices (which are partitions of the same matrix): (a) 1 2 0 1 3 2 3 5 7 2 3 1 4 5 9 2 4 3 5, (b) 1 2 0 1 3 2 3 5 7 2 3 1 4 5 9 2 4 3 5 CHAPTER 2 Algebra of Matrices 49
  • 57. Find the size of each block matrix and also the size of each block. (a) The block matrix has two rows of matrices and three columns of matrices; hence, its size is 2 3. The block sizes are 2 2, 2 2, and 2 1 for the first row; and 1 2, 1 2, and 1 1 for the second row. (b) The size of the block matrix is 3 2; and the block sizes are 1 3 and 1 2 for each of the three rows. 2.35. Compute AB using block multiplication, where A ¼ 1 2 1 3 4 0 0 0 2 2 4 3 5 and B ¼ 1 2 3 1 4 5 6 1 0 0 0 1 2 4 3 5 Here A ¼ E F 01 2 G and B ¼ R S 01 3 T , where E; F; G; R; S; T are the given blocks, and 01 2 and 01 3 are zero matrices of the indicated sites. Hence, AB ¼ ER ES þ FT 01 3 GT ¼ ½ 0 0 0 9 12 15 19 26 33 2 3 7 þ 1 0 2 6 4 3 7 5 ¼ 9 12 15 4 19 26 33 7 0 0 0 2 2 4 3 5 2.36. Let M ¼ diagðA; B; CÞ, where A ¼ 1 2 3 4 , B ¼ ½5, C ¼ 1 3 5 7 . Find M2 . Because M is block diagonal, square each block: A2 ¼ 7 10 15 22 ; B2 ¼ ½25; C2 ¼ 16 24 40 64 ; so M2 ¼ 7 10 15 22 25 16 24 40 64 2 6 6 6 6 4 3 7 7 7 7 5 Miscellaneous Problem 2.37. Let f ðxÞ and gðxÞ be polynomials and let A be a square matrix. Prove (a) ð f þ gÞðAÞ ¼ f ðAÞ þ gðAÞ, (b) ð f gÞðAÞ ¼ f ðAÞgðAÞ, (c) f ðAÞgðAÞ ¼ gðAÞ f ðAÞ. Suppose f ðxÞ ¼ Pr i¼1 aixi and gðxÞ ¼ Ps j¼1 bjxj . (a) We can assume r ¼ s ¼ n by adding powers of x with 0 as their coefficients. Then f ðxÞ þ gðxÞ ¼ P n i¼1 ðai þ biÞxi Hence, ð f þ gÞðAÞ ¼ P n i¼1 ðai þ biÞAi ¼ P n i¼1 ai Ai þ P n i¼1 biAi ¼ f ðAÞ þ gðAÞ (b) We have f ðxÞgðxÞ ¼ P i;j aibjxiþj . Then f ðAÞgðAÞ ¼ P i aiAi ! P j bjAj ! ¼ P i;j aibjAiþj ¼ ð fgÞðAÞ (c) Using f ðxÞgðxÞ ¼ gðxÞf ðxÞ, we have f ðAÞgðAÞ ¼ ð fgÞðAÞ ¼ ðg f ÞðAÞ ¼ gðAÞ f ðAÞ 50 CHAPTER 2 Algebra of Matrices
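Problem 2.36 squares a block diagonal matrix block by block. The sketch below is mine; it assembles M = diag(A, B, C) with SciPy's block_diag helper and confirms that M^2 is the block diagonal matrix of the squared blocks, using the entries I read from the problem (they match the answers listed later).

```python
import numpy as np
from scipy.linalg import block_diag

A = np.array([[1, 2], [3, 4]])
B = np.array([[5]])
C = np.array([[1, 3], [5, 7]])

M = block_diag(A, B, C)          # M = diag(A, B, C)
M2 = M @ M

# Squaring a block diagonal matrix squares each diagonal block.
print(np.allclose(M2, block_diag(A @ A, B @ B, C @ C)))   # True
print(A @ A)                     # [[ 7 10], [15 22]] -> the first diagonal block of M^2
```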
  • 58. SUPPLEMENTARY PROBLEMS Algebra of Matrices Problems 2.38–2.41 refer to the following matrices: A ¼ 1 2 3 4 ; B ¼ 5 0 6 7 ; C ¼ 1 3 4 2 6 5 ; D ¼ 3 7 1 4 8 9 2.38. Find (a) 5A 2B, (b) 2A þ 3B, (c) 2C 3D. 2.39. Find (a) AB and ðABÞC, (b) BC and AðBCÞ. [Note that ðABÞC ¼ AðBCÞ.] 2.40. Find (a) A2 and A3 , (b) AD and BD, (c) CD. 2.41. Find (a) AT , (b) BT , (c) ðABÞT , (d) AT BT . [Note that AT BT 6¼ ðABÞT .] Problems 2.42 and 2.43 refer to the following matrices: A ¼ 1 1 2 0 3 4 ; B ¼ 4 0 3 1 2 3 ; C ¼ 2 3 0 1 5 1 4 2 1 0 0 3 2 4 3 5; D ¼ 2 1 3 2 4 3 5: 2.42. Find (a) 3A 4B, (b) AC, (c) BC, (d) AD, (e) BD, ( f ) CD. 2.43. Find (a) AT , (b) AT B, (c) AT C. 2.44. Let A ¼ 1 2 3 6 . Find a 2 3 matrix B with distinct nonzero entries such that AB ¼ 0. 2.45 Let e1 ¼ ½1; 0; 0, e2 ¼ ½0; 1; 0, e3 ¼ ½0; 0; 1, and A ¼ a1 a2 a3 a4 b1 b2 b3 b4 c1 c2 c3 c4 2 4 3 5. Find e1A, e2A, e3A. 2.46. Let ei ¼ ½0; . . . ; 0; 1; 0; . . . ; 0, where 1 is the ith entry. Show (a) eiA ¼ Ai, ith row of A. (c) If eiA ¼ eiB, for each i, then A ¼ B. (b) BeT j ¼ Bj , jth column of B. (d) If AeT j ¼ BeT j , for each j, then A ¼ B. 2.47. Prove Theorem 2.2(iii) and (iv): (iii) ðB þ CÞA ¼ BA þ CA, (iv) kðABÞ ¼ ðkAÞB ¼ AðkBÞ. 2.48. Prove Theorem 2.3: (i) ðA þ BÞT ¼ AT þ BT , (ii) ðAT ÞT ¼ A, (iii) ðkAÞT ¼ kAT . 2.49. Show (a) If A has a zero row, then AB has a zero row. (b) If B has a zero column, then AB has a zero column. Square Matrices, Inverses 2.50. Find the diagonal and trace of each of the following matrices: (a) A ¼ 2 5 8 3 6 7 4 0 1 2 4 3 5, (b) B ¼ 1 3 4 6 1 7 2 5 1 2 4 3 5, (c) C ¼ 4 3 6 2 5 0 Problems 2.51–2.53 refer to A ¼ 2 5 3 1 , B ¼ 4 2 1 6 , C ¼ 6 4 3 2 . 2.51. Find (a) A2 and A3 , (b) f ðAÞ and gðAÞ, where f ðxÞ ¼ x3 2x2 5; gðxÞ ¼ x2 3x þ 17: CHAPTER 2 Algebra of Matrices 51
  • 59. 2.52. Find (a) B2 and B3 , (b) f ðBÞ and gðBÞ, where f ðxÞ ¼ x2 þ 2x 22; gðxÞ ¼ x2 3x 6: 2.53. Find a nonzero column vector u such that Cu ¼ 4u. 2.54. Find the inverse of each of the following matrices (if it exists): A ¼ 7 4 5 3 ; B ¼ 2 3 4 5 ; C ¼ 4 6 2 3 ; D ¼ 5 2 6 3 2.55. Find the inverses of A ¼ 1 1 2 1 2 5 1 3 7 2 4 3 5 and B ¼ 1 1 1 0 1 1 1 3 2 2 4 3 5. [Hint: See Problem 2.19.] 2.56. Suppose A is invertible. Show that if AB ¼ AC, then B ¼ C. Give an example of a nonzero matrix A such that AB ¼ AC but B 6¼ C. 2.57. Find 2 2 invertible matrices A and B such that A þ B 6¼ 0 and A þ B is not invertible. 2.58. Show (a) A is invertible if and only if AT is invertible. (b) The operations of inversion and transpose commute; that is, ðAT Þ1 ¼ ðA1 ÞT . (c) If A has a zero row or zero column, then A is not invertible. Diagonal and triangular matrices 2.59. Let A ¼ diagð1; 2; 3Þ and B ¼ diagð2; 5; 0Þ. Find (a) AB, A2 , B2 ; (b) f ðAÞ, where f ðxÞ ¼ x2 þ 4x 3; (c) A1 and B1 . 2.60. Let A ¼ 1 2 0 1 and B ¼ 1 1 0 0 1 1 0 0 1 2 4 3 5. (a) Find An . (b) Find Bn . 2.61. Find all real triangular matrices A such that A2 ¼ B, where (a) B ¼ 4 21 0 25 , (b) B ¼ 1 4 0 9 . 2.62. Let A ¼ 5 2 0 k . Find all numbers k for which A is a root of the polynomial: (a) f ðxÞ ¼ x2 7x þ 10, (b) gðxÞ ¼ x2 25, (c) hðxÞ ¼ x2 4. 2.63. Let B ¼ 1 0 26 27 : Find a matrix A such that A3 ¼ B. 2.64. Let B ¼ 1 8 5 0 9 5 0 0 4 2 4 3 5. Find a triangular matrix A with positive diagonal entries such that A2 ¼ B. 2.65. Using only the elements 0 and 1, find the number of 3 3 matrices that are (a) diagonal, (b) upper triangular, (c) nonsingular and upper triangular. Generalize to n n matrices. 2.66. Let Dk ¼ kI, the scalar matrix belonging to the scalar k. Show (a) DkA ¼ kA, (b) BDk ¼ kB, (c) Dk þ Dk0 ¼ Dkþk0 , (d) DkDk0 ¼ Dkk0 2.67. Suppose AB ¼ C, where A and C are upper triangular. (a) Find 2 2 nonzero matrices A; B; C, where B is not upper triangular. (b) Suppose A is also invertible. Show that B must also be upper triangular. 52 CHAPTER 2 Algebra of Matrices
  • 60. Special Types of Real Matrices 2.68. Find x; y; z such that A is symmetric, where (a) A ¼ 2 x 3 4 5 y z 1 7 2 4 3 5, (b) A ¼ 7 6 2x y z 2 x 2 5 2 4 3 5. 2.69. Suppose A is a square matrix. Show (a) A þ AT is symmetric, (b) A AT is skew-symmetric, (c) A ¼ B þ C, where B is symmetric and C is skew-symmetric. 2.70. Write A ¼ 4 5 1 3 as the sum of a symmetric matrix B and a skew-symmetric matrix C. 2.71. Suppose A and B are symmetric. Show that the following are also symmetric: (a) A þ B; (b) kA, for any scalar k; (c) A2 ; (d) An , for n 0; (e) f ðAÞ, for any polynomial f ðxÞ. 2.72. Find a 2 2 orthogonal matrix P whose first row is a multiple of (a) ð3; 4Þ, (b) ð1; 2Þ. 2.73. Find a 3 3 orthogonal matrix P whose first two rows are multiples of (a) ð1; 2; 3Þ and ð0; 2; 3Þ, (b) ð1; 3; 1Þ and ð1; 0; 1Þ. 2.74. Suppose A and B are orthogonal matrices. Show that AT , A1 , AB are also orthogonal. 2.75. Which of the following matrices are normal? A ¼ 3 4 4 3 , B ¼ 1 2 2 3 , C ¼ 1 1 1 0 1 1 0 0 1 2 4 3 5. Complex Matrices 2.76. Find real numbers x; y; z such that A is Hermitian, where A ¼ 3 x þ 2i yi 3 2i 0 1 þ zi yi 1 xi 1 2 4 3 5: 2.77. Suppose A is a complex matrix. Show that AAH and AH A are Hermitian. 2.78. Let A be a square matrix. Show that (a) A þ AH is Hermitian, (b) A AH is skew-Hermitian, (c) A ¼ B þ C, where B is Hermitian and C is skew-Hermitian. 2.79. Determine which of the following matrices are unitary: A ¼ i=2 ffiffiffi 3 p =2 ffiffiffi 3 p =2 i=2 ; B ¼ 1 2 1 þ i 1 i 1 i 1 þ i ; C ¼ 1 2 1 i 1 þ i i 1 1 þ i 1 þ i 1 þ i 0 2 4 3 5 2.80. Suppose A and B are unitary. Show that AH , A1 , AB are unitary. 2.81. Determine which of the following matrices are normal: A ¼ 3 þ 4i 1 i 2 þ 3i and B ¼ 1 0 1 i i . CHAPTER 2 Algebra of Matrices 53
  • 61. Block Matrices 2.82. Let U ¼ 1 2 0 0 0 3 4 0 0 0 0 0 5 1 2 0 0 3 4 1 2 6 6 4 3 7 7 5 and V ¼ 3 2 0 0 2 4 0 0 0 0 1 2 0 0 2 3 0 0 4 1 2 6 6 6 6 4 3 7 7 7 7 5 . (a) Find UV using block multiplication. (b) Are U and V block diagonal matrices? (c) Is UV block diagonal? 2.83. Partition each of the following matrices so that it becomes a square block matrix with as many diagonal blocks as possible: A ¼ 1 0 0 0 0 2 0 0 3 2 4 3 5; B ¼ 1 2 0 0 0 3 0 0 0 0 0 0 4 0 0 0 0 5 0 0 0 0 0 0 6 2 6 6 6 6 4 3 7 7 7 7 5 ; C ¼ 0 1 0 0 0 0 2 0 0 2 4 3 5 2.84. Find M2 and M3 for (a) M ¼ 2 0 0 0 0 1 4 0 0 2 1 0 0 0 0 3 2 6 6 4 3 7 7 5, (b) M ¼ 1 1 0 0 2 3 0 0 0 0 1 2 0 0 4 5 2 6 6 4 3 7 7 5. 2.85. For each matrix M in Problem 2.84, find f ðMÞ where f ðxÞ ¼ x2 þ 4x 5. 2.86. Suppose U ¼ ½Uik and V ¼ ½Vkj are block matrices for which UV is defined and the number of columns of each block Uik is equal to the number of rows of each block Vkj. Show that UV ¼ ½Wij, where Wij ¼ P k UikVkj. 2.87. Suppose M and N are block diagonal matrices where corresponding blocks have the same size, say M ¼ diagðAiÞ and N ¼ diagðBiÞ. Show (i) M þ N ¼ diagðAi þ BiÞ, (iii) MN ¼ diagðAiBiÞ, (ii) kM ¼ diagðkAiÞ, (iv) f ðMÞ ¼ diagð f ðAiÞÞ for any polynomial f ðxÞ. ANSWERS TO SUPPLEMENTARY PROBLEMS Notation: A ¼ ½R1; R2; . . . denotes a matrix A with rows R1; R2; . . . . 2.38. (a) ½5; 10; 27; 34, (b) ½17; 4; 12; 13, (c) ½7; 27; 11; 8; 36; 37 2.39. (a) ½7; 14; 39; 28, ½21; 105; 98; 17; 285; 296 (b) ½5; 15; 20; 8; 60; 59, ½21; 105; 98; 17; 285; 296 2.40. (a) ½7; 6; 9; 22, ½11; 38; 57; 106; (b) ½11; 9; 17; 7; 53; 39, ½15; 35; 5; 10; 98; 69; (c) not defined 2.41. (a) ½1; 3; 2; 4, (b) ½5; 6; 0; 7, (c) ½7; 39; 14; 28; (d) ½5; 15; 10; 40 2.42. (a) ½13; 3; 18; 4; 17; 0, (b) ½5; 2; 4; 5; 11; 3; 12; 18, (c) ½11; 12; 0; 5; 15; 5; 8; 4, (d) ½9; 9, (e) ½1; 9, (f ) not defined 54 CHAPTER 2 Algebra of Matrices
  • 62. 2.43. (a) ½1; 0; 1; 3; 2; 4, (b) ½4; 0; 3; 7; 6; 12; 4; 8; 6], (c) not defined 2.44. ½2; 4; 6; 1; 2; 3 2.45. ½a1; a2; a3; a4, ½b1; b2; b3; b4, ½c1; c2; c3; c4 2.50. (a) 2; 6; 1; trðAÞ ¼ 5, (b) 1; 1; 1; trðBÞ ¼ 1, (c) not defined 2.51. (a) ½11; 15; 9; 14, ½67; 40; 24; 59, (b) ½50; 70; 42; 36, gðAÞ ¼ 0 2.52. (a) ½14; 4; 2; 34, ½60; 52; 26; 200, (b) f ðBÞ ¼ 0, ½4; 10; 5; 46 2.53. u ¼ ½2a; aT 2.54. ½3; 4; 5; 7, ½ 5 2 ; 3 2; 2; 1, not defined, ½1; 2 3; 2; 5 3 2.55. ½1; 1; 1; 2; 5; 3; 1; 2; 1, ½1; 1; 0; 1; 3; 1; 1; 4; 1 2.56. A ¼ ½1; 2; 1; 2, B ¼ ½0; 0; 1; 1, C ¼ ½2; 2; 0; 0 2.57. A ¼ ½1; 2; 0; 3; B ¼ ½4; 3; 3; 0 2.58. (c) Hint: Use Problem 2.48 2.59. (a) AB ¼ diagð2; 10; 0Þ, A2 ¼ diagð1; 4; 9Þ, B2 ¼ diagð4; 25; 0Þ; (b) f ðAÞ ¼ diagð2; 9; 6Þ; (c) A1 ¼ diagð1; 1 2 ; 1 3Þ, C1 does not exist 2.60. (a) ½1; 2n; 0; 1, (b) ½1; n; 1 2 nðn 1Þ; 0; 1; n; 0; 0; 1 2.61. (a) ½2; 3; 0; 5, ½2; 3; 0; 5, ½2; 7; 0; 5, ½2; 7; 0; 5, (b) none 2.62. (a) k ¼ 2, (b) k ¼ 5, (c) none 2.63. ½1; 0; 2; 3 2.64. ½1; 2; 1; 0; 3; 1; 0; 0; 2 2.65. All entries below the diagonal must be 0 to be upper triangular, and all diagonal entries must be 1 to be nonsingular. (a) 8 ð2n Þ, (b) 26 ð2nðnþ1Þ=2 Þ, (c) 23 ð2nðn1Þ=2 Þ 2.67. (a) A ¼ ½1; 1; 0; 0, B ¼ ½1; 2; 3; 4, C ¼ ½4; 6; 0; 0 2.68. (a) x ¼ 4, y ¼ 1, z ¼ 3; (b) x ¼ 0, y ¼ 6, z any real number 2.69. (c) Hint: Let B ¼ 1 2 ðA þ AT Þ and C ¼ 1 2 ðA AT Þ: 2.70. B ¼ ½4; 3; 3; 3, C ¼ ½0; 2; 2; 0 2.72. (a) ½3 5, 4 5; 4 5, 3 5], (b) ½1= ffiffiffi 5 p , 2= ffiffiffi 5 p ; 2= ffiffiffi 5 p , 1= ffiffiffi 5 p 2.73. (a) ½1= ffiffiffiffiffi 14 p , 2= ffiffiffiffiffi 14 p , 3= ffiffiffiffiffi 14 p ; 0; 2= ffiffiffiffiffi 13 p , 3= ffiffiffiffiffi 13 p ; 12= ffiffiffiffiffiffiffiffi 157 p , 3= ffiffiffiffiffiffiffiffi 157 p , 2= ffiffiffiffiffiffiffiffi 157 p (b) ½1= ffiffiffiffiffi 11 p , 3= ffiffiffiffiffi 11 p , 1= ffiffiffiffiffi 11 p ; 1= ffiffiffi 2 p , 0; 1= ffiffiffi 2 p ; 3= ffiffiffiffiffi 22 p , 2= ffiffiffiffiffi 22 p , 3= ffiffiffiffiffi 22 p 2.75. A; C CHAPTER 2 Algebra of Matrices 55
  • 63. 2.76. x ¼ 3, y ¼ 0, z ¼ 3 2.78. (c) Hint: Let B ¼ 1 2 ðA þ AH Þ and C ¼ 1 2 ðA AH Þ. 2.79. A; B; C 2.81. A 2.82. (a) UV ¼ diagð½7; 6; 17; 10; ½1; 9; 7; 5); (b) no; (c) yes 2.83. A: line between first and second rows (columns); B: line between second and third rows (columns) and between fourth and fifth rows (columns); C: C itself—no further partitioning of C is possible. 2.84. (a) M2 ¼ diagð½4, ½9; 8; 4; 9, ½9Þ, M3 ¼ diagð½8; ½25; 44; 22; 25, ½27Þ (b) M2 ¼ diagð½3; 4; 8; 11, ½9; 12; 24; 33Þ M3 ¼ diagð½11; 15; 30; 41, ½57; 78; 156; 213Þ 2.85. (a) diagð½7, ½8; 24; 12; 8, ½16Þ, (b) diagð½2; 8; 16; 181], ½8; 20; 40; 48Þ 56 CHAPTER 2 Algebra of Matrices
• 64. CHAPTER 3 Systems of Linear Equations

3.1 Introduction

Systems of linear equations play an important and motivating role in the subject of linear algebra. In fact, many problems in linear algebra reduce to finding the solution of a system of linear equations. Thus, the techniques introduced in this chapter will be applicable to abstract ideas introduced later. On the other hand, some of the abstract results will give us new insights into the structure and properties of systems of linear equations.

All our systems of linear equations involve scalars as both coefficients and constants, and such scalars may come from any number field K. There is almost no loss in generality if the reader assumes that all our scalars are real numbers, that is, that they come from the real field R.

3.2 Basic Definitions, Solutions

This section gives basic definitions connected with the solutions of systems of linear equations. The actual algorithms for finding such solutions will be treated later.

Linear Equation and Solutions

A linear equation in unknowns x1, x2, ..., xn is an equation that can be put in the standard form

    a1 x1 + a2 x2 + ... + an xn = b        (3.1)

where a1, a2, ..., an and b are constants. The constant ak is called the coefficient of xk, and b is called the constant term of the equation.

A solution of the linear equation (3.1) is a list of values for the unknowns or, equivalently, a vector u in K^n, say

    x1 = k1, x2 = k2, ..., xn = kn    or    u = (k1, k2, ..., kn)

such that the following statement (obtained by substituting ki for xi in the equation) is true:

    a1 k1 + a2 k2 + ... + an kn = b

In such a case we say that u satisfies the equation.

Remark: Equation (3.1) implicitly assumes there is an ordering of the unknowns. In order to avoid subscripts, we will usually use x, y for two unknowns; x, y, z for three unknowns; and x, y, z, t for four unknowns; they will be ordered as shown.
  • 65. EXAMPLE 3.1 Consider the following linear equation in three unknowns x; y; z: x þ 2y 3z ¼ 6 We note that x ¼ 5; y ¼ 2; z ¼ 1, or, equivalently, the vector u ¼ ð5; 2; 1Þ is a solution of the equation. That is, 5 þ 2ð2Þ 3ð1Þ ¼ 6 or 5 þ 4 3 ¼ 6 or 6 ¼ 6 On the other hand, w ¼ ð1; 2; 3Þ is not a solution, because on substitution, we do not get a true statement: 1 þ 2ð2Þ 3ð3Þ ¼ 6 or 1 þ 4 9 ¼ 6 or 4 ¼ 6 System of Linear Equations A system of linear equations is a list of linear equations with the same unknowns. In particular, a system of m linear equations L1; L2; . . . ; Lm in n unknowns x1; x2; . . . ; xn can be put in the standard form a11x1 þ a12x2 þ þ a1nxn ¼ b1 a21x1 þ a22x2 þ þ a2nxn ¼ b2 ð3:2Þ ::::::::::::::::::::::::::::::::::::::::::::::::::: am1x1 þ am2x2 þ þ amnxn ¼ bm where the aij and bi are constants. The number aij is the coefficient of the unknown xj in the equation Li, and the number bi is the constant of the equation Li. The system (3.2) is called an m n (read: m by n) system. It is called a square system if m ¼ n—that is, if the number m of equations is equal to the number n of unknowns. The system (3.2) is said to be homogeneous if all the constant terms are zero—that is, if b1 ¼ 0, b2 ¼ 0; . . . ; bm ¼ 0. Otherwise the system is said to be nonhomogeneous. A solution (or a particular solution) of the system (3.2) is a list of values for the unknowns or, equivalently, a vector u in Kn , which is a solution of each of the equations in the system. The set of all solutions of the system is called the solution set or the general solution of the system. EXAMPLE 3.2 Consider the following system of linear equations: x1 þ x2 þ 4x3 þ 3x4 ¼ 5 2x1 þ 3x2 þ x3 2x4 ¼ 1 x1 þ 2x2 5x3 þ 4x4 ¼ 3 It is a 3 4 system because it has three equations in four unknowns. Determine whether (a) u ¼ ð8; 6; 1; 1Þ and (b) v ¼ ð10; 5; 1; 2Þ are solutions of the system. (a) Substitute the values of u in each equation, obtaining 8 þ 6 þ 4ð1Þ þ 3ð1Þ ¼ 5 or 8 þ 6 þ 4 þ 3 ¼ 5 or 5 ¼ 5 2ð8Þ þ 3ð6Þ þ 1 2ð1Þ ¼ 1 or 16 þ 18 þ 1 2 ¼ 1 or 1 ¼ 1 8 þ 2ð6Þ 5ð1Þ þ 4ð1Þ ¼ 3 or 8 þ 12 5 þ 4 ¼ 3 or 3 ¼ 3 Yes, u is a solution of the system because it is a solution of each equation. (b) Substitute the values of v into each successive equation, obtaining 10 þ 5 þ 4ð1Þ þ 3ð2Þ ¼ 5 or 10 þ 5 þ 4 þ 6 ¼ 5 or 5 ¼ 5 2ð10Þ þ 3ð5Þ þ 1 2ð2Þ ¼ 1 or 20 þ 15 þ 1 4 ¼ 1 or 8 ¼ 1 No, v is not a solution of the system, because it is not a solution of the second equation. (We do not need to substitute v into the third equation.) 58 CHAPTER 3 Systems of Linear Equations
  • 66. The system (3.2) of linear equations is said to be consistent if it has one or more solutions, and it is said to be inconsistent if it has no solution. If the field K of scalars is infinite, such as when K is the real field R or the complex field C, then we have the following important result. THEOREM 3.1: Suppose the field K is infinite. Then any system l of linear equations has (i) a unique solution, (ii) no solution, or (iii) an infinite number of solutions. This situation is pictured in Fig. 3-1. The three cases have a geometrical description when the system l consists of two equations in two unknowns (Section 3.4). Augmented and Coefficient Matrices of a System Consider again the general system (3.2) of m equations in n unknowns. Such a system has associated with it the following two matrices: M ¼ a11 a12 . . . a1n b1 a21 a22 . . . a2n b2 ::::::::::::::::::::::::::::::::::::::: am1 am2 . . . amn bn 2 6 6 4 3 7 7 5 and A ¼ a11 a12 . . . a1n a21 a22 . . . a2n ::::::::::::::::::::::::::::::: am1 am2 . . . amn 2 6 6 4 3 7 7 5 The first matrix M is called the augmented matrix of the system, and the second matrix A is called the coefficient matrix. The coefficient matrix A is simply the matrix of coefficients, which is the augmented matrix M without the last column of constants. Some texts write M ¼ ½A; B to emphasize the two parts of M, where B denotes the column vector of constants. The augmented matrix M and the coefficient matrix A of the system in Example 3.2 are as follows: M ¼ 1 1 4 3 5 2 3 1 2 1 1 2 5 4 3 2 4 3 5 and A ¼ 1 1 4 3 2 3 1 2 1 2 5 4 2 4 3 5 As expected, A consists of all the columns of M except the last, which is the column of constants. Clearly, a system of linear equations is completely determined by its augmented matrix M, and vice versa. Specifically, each row of M corresponds to an equation of the system, and each column of M corresponds to the coefficients of an unknown, except for the last column, which corresponds to the constants of the system. Degenerate Linear Equations A linear equation is said to be degenerate if all the coefficients are zero—that is, if it has the form 0x1 þ 0x2 þ þ 0xn ¼ b ð3:3Þ Figure 3-1 CHAPTER 3 Systems of Linear Equations 59
  • 67. The solution of such an equation depends only on the value of the constant b. Specifically, (i) If b 6¼ 0, then the equation has no solution. (ii) If b ¼ 0, then every vector u ¼ ðk1; k2; . . . ; knÞ in Kn is a solution. The following theorem applies. THEOREM 3.2: Let l be a system of linear equations that contains a degenerate equation L, say with constant b. (i) If b 6¼ 0, then the system l has no solution. (ii) If b ¼ 0, then L may be deleted from the system without changing the solution set of the system. Part (i) comes from the fact that the degenerate equation has no solution, so the system has no solution. Part (ii) comes from the fact that every element in Kn is a solution of the degenerate equation. Leading Unknown in a Nondegenerate Linear Equation Now let L be a nondegenerate linear equation. This means one or more of the coefficients of L are not zero. By the leading unknown of L, we mean the first unknown in L with a nonzero coefficient. For example, x3 and y are the leading unknowns, respectively, in the equations 0x1 þ 0x2 þ 5x3 þ 6x4 þ 0x5 þ 8x6 ¼ 7 and 0x þ 2y 4z ¼ 5 We frequently omit terms with zero coefficients, so the above equations would be written as 5x3 þ 6x4 þ 8x6 ¼ 7 and 2y 4z ¼ 5 In such a case, the leading unknown appears first. 3.3 Equivalent Systems, Elementary Operations Consider the system (3.2) of m linear equations in n unknowns. Let L be the linear equation obtained by multiplying the m equations by constants c1; c2; . . . ; cm, respectively, and then adding the resulting equations. Specifically, let L be the following linear equation: ðc1a11 þ þ cmam1Þx1 þ þ ðc1a1n þ þ cmamnÞxn ¼ c1b1 þ þ cmbm Then L is called a linear combination of the equations in the system. One can easily show (Problem 3.43) that any solution of the system (3.2) is also a solution of the linear combination L. EXAMPLE 3.3 Let L1, L2, L3 denote, respectively, the three equations in Example 3.2. Let L be the equation obtained by multiplying L1, L2, L3 by 3; 2; 4, respectively, and then adding. Namely, 3L1: 3x1 þ 3x2 þ 12x3 þ 9x4 ¼ 15 2L2: 4x1 6x2 2x3 þ 4x4 ¼ 2 4L1: 4x1 þ 8x2 20x3 þ 16x4 ¼ 12 ðSumÞ L: 3x1 þ 5x2 10x3 þ 29x4 ¼ 25 60 CHAPTER 3 Systems of Linear Equations
  • 68. Then L is a linear combination of L1, L2, L3. As expected, the solution u ¼ ð8; 6; 1; 1Þ of the system is also a solution of L. That is, substituting u in L, we obtain a true statement: 3ð8Þ þ 5ð6Þ 10ð1Þ þ 29ð1Þ ¼ 25 or 24 þ 30 10 þ 29 ¼ 25 or 9 ¼ 9 The following theorem holds. THEOREM 3.3: Two systems of linear equations have the same solutions if and only if each equation in each system is a linear combination of the equations in the other system. Two systems of linear equations are said to be equivalent if they have the same solutions. The next subsection shows one way to obtain equivalent systems of linear equations. Elementary Operations The following operations on a system of linear equations L1; L2; . . . ; Lm are called elementary operations. ½E1 Interchange two of the equations. We indicate that the equations Li and Lj are interchanged by writing: ‘‘Interchange Li and Lj’’ or ‘‘Li ! Lj’’ ½E2 Replace an equation by a nonzero multiple of itself. We indicate that equation Li is replaced by kLi (where k 6¼ 0) by writing ‘‘Replace Li by kLi’’ or ‘‘kLi ! Li’’ ½E3 Replace an equation by the sum of a multiple of another equation and itself. We indicate that equation Lj is replaced by the sum of kLi and Lj by writing ‘‘Replace Lj by kLi þ Lj’’ or ‘‘kLi þ Lj ! Lj’’ The arrow ! in ½E2 and ½E3 may be read as ‘‘replaces.’’ The main property of the above elementary operations is contained in the following theorem (proved in Problem 3.45). THEOREM 3.4: Suppose a system of m of linear equations is obtained from a system l of linear equations by a finite sequence of elementary operations. Then m and l have the same solutions. Remark: Sometimes (say to avoid fractions when all the given scalars are integers) we may apply ½E2 and ½E3 in one step; that is, we may apply the following operation: ½E Replace equation Lj by the sum of kLi and k0 Lj (where k0 6¼ 0), written ‘‘Replace Lj by kLi þ k0 Lj’’ or ‘‘kLi þ k0 Lj ! Lj’’ We emphasize that in operations ½E3 and [E], only equation Lj is changed. Gaussian elimination, our main method for finding the solution of a given system of linear equations, consists of using the above operations to transform a given system into an equivalent system whose solution can be easily obtained. The details of Gaussian elimination are discussed in subsequent sections. 3.4 Small Square Systems of Linear Equations This section considers the special case of one equation in one unknown, and two equations in two unknowns. These simple systems are treated separately because their solution sets can be described geometrically, and their properties motivate the general case. CHAPTER 3 Systems of Linear Equations 61
• 69. Linear Equation in One Unknown

The following simple basic result is proved in Problem 3.5.

THEOREM 3.5: Consider the linear equation ax = b.
(i) If a ≠ 0, then x = b/a is a unique solution of ax = b.
(ii) If a = 0, but b ≠ 0, then ax = b has no solution.
(iii) If a = 0 and b = 0, then every scalar k is a solution of ax = b.

EXAMPLE 3.4 Solve (a) 4x − 1 = x + 6, (b) 2x − 5 − x = x + 3, (c) 4 + x − 3 = 2x + 1 − x.
(a) Rewrite the equation in standard form, obtaining 3x = 7. Then x = 7/3 is the unique solution [Theorem 3.5(i)].
(b) Rewrite the equation in standard form, obtaining 0x = 8. The equation has no solution [Theorem 3.5(ii)].
(c) Rewrite the equation in standard form, obtaining 0x = 0. Then every scalar k is a solution [Theorem 3.5(iii)].

System of Two Linear Equations in Two Unknowns (2 × 2 System)

Consider a system of two nondegenerate linear equations in two unknowns x and y, which can be put in the standard form

    A1 x + B1 y = C1
    A2 x + B2 y = C2        (3.4)

Because the equations are nondegenerate, A1 and B1 are not both zero, and A2 and B2 are not both zero.

The general solution of the system (3.4) belongs to one of three types as indicated in Fig. 3-1. If R is the field of scalars, then the graph of each equation is a line in the plane R^2, and the three types may be described geometrically as pictured in Fig. 3-2. Specifically,

(1) The system has exactly one solution. Here the two lines intersect in one point [Fig. 3-2(a)]. This occurs when the lines have distinct slopes or, equivalently, when the coefficients of x and y are not proportional:

    A1/A2 ≠ B1/B2    or, equivalently,    A1 B2 − A2 B1 ≠ 0

For example, in Fig. 3-2(a), 1/3 ≠ −1/2.

[Figure 3-2: graphs of the three cases. (a) L1: x − y = −1 and L2: 3x + 2y = 12 intersect in a single point; (b) L1: x + 3y = 3 and L2: 2x + 6y = −8 are parallel; (c) L1: x + 2y = 4 and L2: 2x + 4y = 8 coincide.]
  • 70. (2) The system has no solution. Here the two lines are parallel [Fig. 3-2(b)]. This occurs when the lines have the same slopes but different y intercepts, or when A1 A2 ¼ B1 B2 6¼ C1 C2 For example, in Fig. 3-2(b), 1=2 ¼ 3=6 6¼ 3=8. (3) The system has an infinite number of solutions. Here the two lines coincide [Fig. 3-2(c)]. This occurs when the lines have the same slopes and same y intercepts, or when the coefficients and constants are proportional, A1 A2 ¼ B1 B2 ¼ C1 C2 For example, in Fig. 3-2(c), 1=2 ¼ 2=4 ¼ 4=8. Remark: The following expression and its value is called a determinant of order two: A1 B1 A2 B2 ¼ A1B2 A2B1 Determinants will be studied in Chapter 8. Thus, the system (3.4) has a unique solution if and only if the determinant of its coefficients is not zero. (We show later that this statement is true for any square system of linear equations.) Elimination Algorithm The solution to system (3.4) can be obtained by the process of elimination, whereby we reduce the system to a single equation in only one unknown. Assuming the system has a unique solution, this elimination algorithm has two parts. ALGORITHM 3.1: The input consists of two nondegenerate linear equations L1 and L2 in two unknowns with a unique solution. Part A. (Forward Elimination) Multiply each equation by a constant so that the resulting coefficients of one unknown are negatives of each other, and then add the two equations to obtain a new equation L that has only one unknown. Part B. (Back-Substitution) Solve for the unknown in the new equation L (which contains only one unknown), substitute this value of the unknown into one of the original equations, and then solve to obtain the value of the other unknown. Part A of Algorithm 3.1 can be applied to any system even if the system does not have a unique solution. In such a case, the new equation L will be degenerate and Part B will not apply. EXAMPLE 3.5 (Unique Case). Solve the system L1: 2x 3y ¼ 8 L2: 3x þ 4y ¼ 5 The unknown x is eliminated from the equations by forming the new equation L ¼ 3L1 þ 2L2. That is, we multiply L1 by 3 and L2 by 2 and add the resulting equations as follows: 3L1: 6x þ 9y ¼ 24 2L2: 6x þ 8y ¼ 10 Addition : 17y ¼ 34 CHAPTER 3 Systems of Linear Equations 63
  • 71. We now solve the new equation for y, obtaining y ¼ 2. We substitute y ¼ 2 into one of the original equations, say L1, and solve for the other unknown x, obtaining 2x 3ð2Þ ¼ 8 or 2x 6 ¼ 8 or 2x ¼ 2 or x ¼ 1 Thus, x ¼ 1, y ¼ 2, or the pair u ¼ ð1; 2Þ is the unique solution of the system. The unique solution is expected, because 2=3 6¼ 3=4. [Geometrically, the lines corresponding to the equations intersect at the point ð1; 2Þ.] EXAMPLE 3.6 (Nonunique Cases) (a) Solve the system L1: x 3y ¼ 4 L2: 2x þ 6y ¼ 5 We eliminated x from the equations by multiplying L1 by 2 and adding it to L2—that is, by forming the new equation L ¼ 2L1 þ L2. This yields the degenerate equation 0x þ 0y ¼ 13 which has a nonzero constant b ¼ 13. Thus, this equation and the system have no solution. This is expected, because 1=ð2Þ ¼ 3=6 6¼ 4=5. (Geometrically, the lines corresponding to the equations are parallel.) (b) Solve the system L1: x 3y ¼ 4 L2: 2x þ 6y ¼ 8 We eliminated x from the equations by multiplying L1 by 2 and adding it to L2—that is, by forming the new equation L ¼ 2L1 þ L2. This yields the degenerate equation 0x þ 0y ¼ 0 where the constant term is also zero. Thus, the system has an infinite number of solutions, which correspond to the solutions of either equation. This is expected, because 1=ð2Þ ¼ 3=6 ¼ 4=ð8Þ. (Geometrically, the lines corresponding to the equations coincide.) To find the general solution, let y ¼ a, and substitute into L1 to obtain x 3a ¼ 4 or x ¼ 3a þ 4 Thus, the general solution of the system is x ¼ 3a þ 4; y ¼ a or u ¼ ð3a þ 4; aÞ where a (called a parameter) is any scalar. 3.5 Systems in Triangular and Echelon Forms The main method for solving systems of linear equations, Gaussian elimination, is treated in Section 3.6. Here we consider two simple types of systems of linear equations: systems in triangular form and the more general systems in echelon form. Triangular Form Consider the following system of linear equations, which is in triangular form: 2x1 3x2 þ 5x3 2x4 ¼ 9 5x2 x3 þ 3x4 ¼ 1 7x3 x4 ¼ 3 2x4 ¼ 8 64 CHAPTER 3 Systems of Linear Equations
  • 72. That is, the first unknown x1 is the leading unknown in the first equation, the second unknown x2 is the leading unknown in the second equation, and so on. Thus, in particular, the system is square and each leading unknown is directly to the right of the leading unknown in the preceding equation. Such a triangular system always has a unique solution, which may be obtained by back-substitution. That is, (1) First solve the last equation for the last unknown to get x4 ¼ 4. (2) Then substitute this value x4 ¼ 4 in the next-to-last equation, and solve for the next-to-last unknown x3 as follows: 7x3 4 ¼ 3 or 7x3 ¼ 7 or x3 ¼ 1 (3) Now substitute x3 ¼ 1 and x4 ¼ 4 in the second equation, and solve for the second unknown x2 as follows: 5x2 1 þ 12 ¼ 1 or 5x2 þ 11 ¼ 1 or 5x2 ¼ 10 or x2 ¼ 2 (4) Finally, substitute x2 ¼ 2, x3 ¼ 1, x4 ¼ 4 in the first equation, and solve for the first unknown x1 as follows: 2x1 þ 6 þ 5 8 ¼ 9 or 2x1 þ 3 ¼ 9 or 2x1 ¼ 6 or x1 ¼ 3 Thus, x1 ¼ 3 , x2 ¼ 2, x3 ¼ 1, x4 ¼ 4, or, equivalently, the vector u ¼ ð3; 2; 1; 4Þ is the unique solution of the system. Remark: There is an alternative form for back-substitution (which will be used when solving a system using the matrix format). Namely, after first finding the value of the last unknown, we substitute this value for the last unknown in all the preceding equations before solving for the next-to-last unknown. This yields a triangular system with one less equation and one less unknown. For example, in the above triangular system, we substitute x4 ¼ 4 in all the preceding equations to obtain the triangular system 2x1 3x2 þ 5x3 ¼ 17 5x2 x3 ¼ 1 7x3 ¼ 7 We then repeat the process using the new last equation. And so on. Echelon Form, Pivot and Free Variables The following system of linear equations is said to be in echelon form: 2x1 þ 6x2 x3 þ 4x4 2x5 ¼ 15 x3 þ 2x4 þ 2x5 ¼ 5 3x4 9x5 ¼ 6 That is, no equation is degenerate and the leading unknown in each equation other than the first is to the right of the leading unknown in the preceding equation. The leading unknowns in the system, x1, x3, x4, are called pivot variables, and the other unknowns, x2 and x5, are called free variables. Generally speaking, an echelon system or a system in echelon form has the following form: a11x1 þ a12x2 þ a13x3 þ a14x4 þ þ a1nxn ¼ b1 a2j2 xj2 þ a2;j2þ1xj2þ1 þ þ a2nxn ¼ b2 :::::::::::::::::::::::::::::::::::::::::::::: arjr xjr þ þ arnxn ¼ br ð3:5Þ where 1 j2 jr and a11, a2j2 ; . . . ; arjr are not zero. The pivot variables are x1, xj2 ; . . . ; xjr . Note that r n. The solution set of any echelon system is described in the following theorem (proved in Problem 3.10). CHAPTER 3 Systems of Linear Equations 65
  • 73. THEOREM 3.6: Consider a system of linear equations in echelon form, say with r equations in n unknowns. There are two cases: (i) r ¼ n. That is, there are as many equations as unknowns (triangular form). Then the system has a unique solution. (ii) r n. That is, there are more unknowns than equations. Then we can arbitrarily assign values to the n r free variables and solve uniquely for the r pivot variables, obtaining a solution of the system. Suppose an echelon system contains more unknowns than equations. Assuming the field K is infinite, the system has an infinite number of solutions, because each of the n r free variables may be assigned any scalar. The general solution of a system with free variables may be described in either of two equivalent ways, which we illustrate using the above echelon system where there are r ¼ 3 equations and n ¼ 5 unknowns. One description is called the ‘‘Parametric Form’’ of the solution, and the other description is called the ‘‘Free-Variable Form.’’ Parametric Form Assign arbitrary values, called parameters, to the free variables x2 and x5, say x2 ¼ a and x5 ¼ b, and then use back-substitution to obtain values for the pivot variables x1, x3, x5 in terms of the parameters a and b. Specifically, (1) Substitute x5 ¼ b in the last equation, and solve for x4: 3x4 9b ¼ 6 or 3x4 ¼ 6 þ 9b or x4 ¼ 2 þ 3b (2) Substitute x4 ¼ 2 þ 3b and x5 ¼ b into the second equation, and solve for x3: x3 þ 2ð2 þ 3bÞ þ 2b ¼ 5 or x3 þ 4 þ 8b ¼ 5 or x3 ¼ 1 8b (3) Substitute x2 ¼ a, x3 ¼ 1 8b, x4 ¼ 2 þ 3b, x5 ¼ b into the first equation, and solve for x1: 2x1 þ 6a ð1 8bÞ þ 4ð2 þ 3bÞ 2b ¼ 15 or x1 ¼ 4 3a 9b Accordingly, the general solution in parametric form is x1 ¼ 4 3a 9b; x2 ¼ a; x3 ¼ 1 8b; x4 ¼ 2 þ 3b; x5 ¼ b or, equivalently, v ¼ ð4 3a 9b; a; 1 8b; 2 þ 3b; bÞ where a and b are arbitrary numbers. Free-Variable Form Use back-substitution to solve for the pivot variables x1, x3, x4 directly in terms of the free variables x2 and x5. That is, the last equation gives x4 ¼ 2 þ 3x5. Substitution in the second equation yields x3 ¼ 1 8x5, and then substitution in the first equation yields x1 ¼ 4 3x2 9x5. Accordingly, x1 ¼ 4 3x2 9x5; x2 ¼ free variable; x3 ¼ 1 8x5; x4 ¼ 2 þ 3x5; x5 ¼ free variable or, equivalently, v ¼ ð4 3x2 9x5; x2; 1 8x5; 2 þ 3x5; x5Þ is the free-variable form for the general solution of the system. We emphasize that there is no difference between the above two forms of the general solution, and the use of one or the other to represent the general solution is simply a matter of taste. Remark: A particular solution of the above system can be found by assigning any values to the free variables and then solving for the pivot variables by back-substitution. For example, setting x2 ¼ 1 and x5 ¼ 1, we obtain x4 ¼ 2 þ 3 ¼ 5; x3 ¼ 1 8 ¼ 7; x1 ¼ 4 3 9 ¼ 8 Thus, u ¼ ð8; 1; 7; 5; 1Þ is the particular solution corresponding to x2 ¼ 1 and x5 ¼ 1. 66 CHAPTER 3 Systems of Linear Equations
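Back-substitution, used throughout this section, is a short loop in code. The sketch below is mine; it solves the 4 × 4 triangular system from the start of this section, with the dropped minus signs restored as 2x1 − 3x2 + 5x3 − 2x4 = 9, 5x2 − x3 + 3x4 = 1, 7x3 − x4 = 3, 2x4 = 8, and it reproduces the solution (3, −2, 1, 4) obtained above.

```python
import numpy as np

def back_substitute(U, b):
    """Solve Ux = b for an upper triangular U with nonzero diagonal entries."""
    n = len(b)
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):               # last equation first
        x[i] = (b[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
    return x

U = np.array([[2.0, -3.0,  5.0, -2.0],
              [0.0,  5.0, -1.0,  3.0],
              [0.0,  0.0,  7.0, -1.0],
              [0.0,  0.0,  0.0,  2.0]])
b = np.array([9.0, 1.0, 3.0, 8.0])

print(back_substitute(U, b))                     # [ 3. -2.  1.  4.]
```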
• 74. 3.6 Gaussian Elimination

The main method for solving the general system (3.2) of linear equations is called Gaussian elimination. It essentially consists of two parts:

Part A. (Forward Elimination) Step-by-step reduction of the system yielding either a degenerate equation with no solution (which indicates the system has no solution) or an equivalent simpler system in triangular or echelon form.

Part B. (Backward Elimination) Step-by-step back-substitution to find the solution of the simpler system.

Part B has already been investigated in Section 3.4. Accordingly, we need only give the algorithm for Part A, which is as follows.

ALGORITHM 3.2 (for Part A): Input: The m × n system (3.2) of linear equations.

ELIMINATION STEP: Find the first unknown in the system with a nonzero coefficient (which now must be x1).
(a) Arrange so that a11 ≠ 0. That is, if necessary, interchange equations so that the first unknown x1 appears with a nonzero coefficient in the first equation.
(b) Use a11 as a pivot to eliminate x1 from all equations except the first equation. That is, for i > 1: (1) Set m = −a_i1/a11; (2) Replace Li by m L1 + Li.
The system now has the following form:

    a11 x1 + a12 x2 + a13 x3 + ... + a1n xn = b1
             a2j2 xj2 + ...       + a2n xn = b2
             ...........................................
             amj2 xj2 + ...       + amn xn = bm

where x1 does not appear in any equation except the first, a11 ≠ 0, and xj2 denotes the first unknown with a nonzero coefficient in any equation other than the first.
(c) Examine each new equation L.
(1) If L has the form 0x1 + 0x2 + ... + 0xn = b with b ≠ 0, then STOP. The system is inconsistent and has no solution.
(2) If L has the form 0x1 + 0x2 + ... + 0xn = 0, or if L is a multiple of another equation, then delete L from the system.

RECURSION STEP: Repeat the Elimination Step with each new "smaller" subsystem formed by all the equations excluding the first equation.

OUTPUT: Finally, the system is reduced to triangular or echelon form, or a degenerate equation with no solution is obtained, indicating an inconsistent system.

The next remarks refer to the Elimination Step in Algorithm 3.2.
(1) The number m in (b) is called the multiplier:

    m = −a_i1/a11 = −(coefficient to be deleted)/(pivot)

(2) One could alternatively apply the following operation in (b):

    Replace Li by −a_i1 L1 + a11 Li

This would avoid fractions if all the scalars were originally integers.
  • 75. Gaussian Elimination Example Here we illustrate in detail Gaussian elimination using the following system of linear equations: L1: x 3y 2z ¼ 6 L2: 2x 4y 3z ¼ 8 L3: 3x þ 6y þ 8z ¼ 5 Part A. We use the coefficient 1 of x in the first equation L1 as the pivot in order to eliminate x from the second equation L2 and from the third equation L3. This is accomplished as follows: (1) Multiply L1 by the multiplier m ¼ 2 and add it to L2; that is, ‘‘Replace L2 by 2L1 þ L2.’’ (2) Multiply L1 by the multiplier m ¼ 3 and add it to L3; that is, ‘‘Replace L3 by 3L1 þ L3.’’ These steps yield ð2ÞL1: 2x þ 6y þ 4z ¼ 12 L2: 2x 4y 3z ¼ 8 New L2: 2y þ z ¼ 4 3L1: 3x 9y 6z ¼ 18 L3: 3x þ 6y þ 8z ¼ 5 New L3: 3y þ 2z ¼ 13 Thus, the original system is replaced by the following system: L1: x 3y 2z ¼ 6 L2: 2y þ z ¼ 4 L3: 3y þ 2z ¼ 13 (Note that the equations L2 and L3 form a subsystem with one less equation and one less unknown than the original system.) Next we use the coefficient 2 of y in the (new) second equation L2 as the pivot in order to eliminate y from the (new) third equation L3. This is accomplished as follows: (3) Multiply L2 by the multiplier m ¼ 3 2 and add it to L3; that is, ‘‘Replace L3 by 3 2 L2 þ L3:’’ (Alternately, ‘‘Replace L3 by 3L2 þ 2L3,’’ which will avoid fractions.) This step yields 3 2 L2: 3y þ 3 2 z ¼ 6 L3: 3y þ 2z ¼ 13 New L3: 7 2 z ¼ 7 or 3L2: 6y þ 3z ¼ 12 2L3: 6y þ 4z ¼ 26 New L3: 7z ¼ 14 Thus, our system is replaced by the following system: L1: x 3y 2z ¼ 6 L2: 2y þ z ¼ 4 L3: 7z ¼ 14 ðor 7 2 z ¼ 7Þ The system is now in triangular form, so Part A is completed. Part B. The values for the unknowns are obtained in reverse order, z; y; x, by back-substitution. Specifically, (1) Solve for z in L3 to get z ¼ 2. (2) Substitute z ¼ 2 in L2, and solve for y to get y ¼ 3. (3) Substitute y ¼ 3 and z ¼ 2 in L1, and solve for x to get x ¼ 1. Thus, the solution of the triangular system and hence the original system is as follows: x ¼ 1; y ¼ 3; z ¼ 2 or; equivalently; u ¼ ð1; 3; 2Þ: 68 CHAPTER 3 Systems of Linear Equations
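As a complement to Part A, here is a small back-substitution sketch for Part B; the code is ours, but the triangular system and the solution (1, -3, 2) are those of the example above.

```python
# Back-substitution (Part B) on the triangular augmented matrix obtained above:
# x - 3y - 2z = 6,  2y + z = -4,  (7/2)z = 7.
def back_substitute(tri):
    n = len(tri)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):             # solve from the last equation up
        s = sum(tri[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (tri[i][-1] - s) / tri[i][i]
    return x

tri = [[1, -3, -2, 6],
       [0, 2, 1, -4],
       [0, 0, 3.5, 7]]
print(back_substitute(tri))   # [1.0, -3.0, 2.0]
```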
  • 76. Condensed Format The Gaussian elimination algorithm involves rewriting systems of linear equations. Sometimes we can avoid excessive recopying of some of the equations by adopting a ‘‘condensed format.’’ This format for the solution of the above system follows: Number Equation Operation ð1Þ x 3y 2z ¼ 6 ð2Þ 2x 4y 3z ¼ 8 ð3Þ 3x þ 6y þ 8z ¼ 5 ð20 Þ 2y þ z ¼ 4 Replace L2 by 2L1 þ L2 ð30 Þ 3y þ 2z ¼ 13 Replace L3 by 3L1 þ L3 ð300 Þ 7z ¼ 14 Replace L3 by 3L2 þ 2L3 That is, first we write down the number of each of the original equations. As we apply the Gaussian elimination algorithm to the system, we only write down the new equations, and we label each new equation using the same number as the original corresponding equation, but with an added prime. (After each new equation, we will indicate, for instructional purposes, the elementary operation that yielded the new equation.) The system in triangular form consists of equations (1), ð20 Þ, and ð300 Þ, the numbers with the largest number of primes. Applying back-substitution to these equations again yields x ¼ 1, y ¼ 3, z ¼ 2. Remark: If two equations need to be interchanged, say to obtain a nonzero coefficient as a pivot, then this is easily accomplished in the format by simply renumbering the two equations rather than changing their positions. EXAMPLE 3.7 Solve the following system: x þ 2y 3z ¼ 1 2x þ 5y 8z ¼ 4 3x þ 8y 13z ¼ 7 We solve the system by Gaussian elimination. Part A. (Forward Elimination) We use the coefficient 1 of x in the first equation L1 as the pivot in order to eliminate x from the second equation L2 and from the third equation L3. This is accomplished as follows: (1) Multiply L1 by the multiplier m ¼ 2 and add it to L2; that is, ‘‘Replace L2 by 2L1 þ L2.’’ (2) Multiply L1 by the multiplier m ¼ 3 and add it to L3; that is, ‘‘Replace L3 by 3L1 þ L3.’’ The two steps yield x þ 2y 3z ¼ 1 y 2z ¼ 2 2y 4z ¼ 4 or x þ 2y 3z ¼ 1 y 2z ¼ 2 (The third equation is deleted, because it is a multiple of the second equation.) The system is now in echelon form with free variable z. Part B. (Backward Elimination) To obtain the general solution, let the free variable z ¼ a, and solve for x and y by back-substitution. Substitute z ¼ a in the second equation to obtain y ¼ 2 þ 2a. Then substitute z ¼ a and y ¼ 2 þ 2a into the first equation to obtain x þ 2ð2 þ 2aÞ 3a ¼ 1 or x þ 4 þ 4a 3a ¼ 1 or x ¼ 3 a Thus, the following is the general solution where a is a parameter: x ¼ 3 a; y ¼ 2 þ 2a; z ¼ a or u ¼ ð3 a; 2 þ 2a; aÞ CHAPTER 3 Systems of Linear Equations 69
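SymPy's linsolve reproduces the one-parameter family found in Example 3.7; the check below is not part of the text, and SymPy expresses the free variable as z itself rather than as a parameter a.

```python
from sympy import Matrix, linsolve, symbols

x, y, z = symbols('x y z')
M = Matrix([[1, 2, -3, 1],      # augmented matrix of Example 3.7
            [2, 5, -8, 4],
            [3, 8, -13, 7]])
print(linsolve(M, (x, y, z)))   # {(-z - 3, 2*z + 2, z)}, i.e. x = -3 - a, y = 2 + 2a, z = a
```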
  • 77. EXAMPLE 3.8 Solve the following system: x1 þ 3x2 2x3 þ 5x4 ¼ 4 2x1 þ 8x2 x3 þ 9x4 ¼ 9 3x1 þ 5x2 12x3 þ 17x4 ¼ 7 We use Gaussian elimination. Part A. (Forward Elimination) We use the coefficient 1 of x1 in the first equation L1 as the pivot in order to eliminate x1 from the second equation L2 and from the third equation L3. This is accomplished by the following operations: (1) ‘‘Replace L2 by 2L1 þ L2’’ and (2) ‘‘Replace L3 by 3L1 þ L3’’ These yield: x1 þ 3x2 2x3 þ 5x4 ¼ 4 2x2 þ 3x3 x4 ¼ 1 4x2 6x3 þ 2x4 ¼ 5 We now use the coefficient 2 of x2 in the second equation L2 as the pivot and the multiplier m ¼ 2 in order to eliminate x2 from the third equation L3. This is accomplished by the operation ‘‘Replace L3 by 2L2 þ L3,’’ which then yields the degenerate equation 0x1 þ 0x2 þ 0x3 þ 0x4 ¼ 3 This equation and, hence, the original system have no solution: DO NOT CONTINUE Remark 1: As in the above examples, Part A of Gaussian elimination tells us whether or not the system has a solution—that is, whether or not the system is consistent. Accordingly, Part B need never be applied when a system has no solution. Remark 2: If a system of linear equations has more than four unknowns and four equations, then it may be more convenient to use the matrix format for solving the system. This matrix format is discussed later. 3.7 Echelon Matrices, Row Canonical Form, Row Equivalence One way to solve a system of linear equations is by working with its augmented matrix M rather than the system itself. This section introduces the necessary matrix concepts for such a discussion. These concepts, such as echelon matrices and elementary row operations, are also of independent interest. Echelon Matrices A matrix A is called an echelon matrix, or is said to be in echelon form, if the following two conditions hold (where a leading nonzero element of a row of A is the first nonzero element in the row): (1) All zero rows, if any, are at the bottom of the matrix. (2) Each leading nonzero entry in a row is to the right of the leading nonzero entry in the preceding row. That is, A ¼ ½aij is an echelon matrix if there exist nonzero entries a1j1 ; a2j2 ; . . . ; arjr ; where j1 j2 jr 70 CHAPTER 3 Systems of Linear Equations
  • 78. with the property that aij ¼ 0 for ðiÞ i r; j ji ðiiÞ i r The entries a1j1 , a2j2 ; . . . ; arjr , which are the leading nonzero elements in their respective rows, are called the pivots of the echelon matrix. EXAMPLE 3.9 The following is an echelon matrix whose pivots have been circled: A ¼ 0 2 3 4 5 9 0 7 0 0 0 3 4 1 2 5 0 0 0 0 0 5 7 2 0 0 0 0 0 0 8 6 0 0 0 0 0 0 0 0 2 6 6 6 6 4 3 7 7 7 7 5 Observe that the pivots are in columns C2; C4; C6; C7, and each is to the right of the one above. Using the above notation, the pivots are a1j1 ¼ 2; a2j2 ¼ 3; a3j3 ¼ 5; a4j4 ¼ 8 where j1 ¼ 2, j2 ¼ 4, j3 ¼ 6, j4 ¼ 7. Here r ¼ 4. Row Canonical Form A matrix A is said to be in row canonical form (or row-reduced echelon form) if it is an echelon matrix— that is, if it satisfies the above properties (1) and (2), and if it satisfies the following additional two properties: (3) Each pivot (leading nonzero entry) is equal to 1. (4) Each pivot is the only nonzero entry in its column. The major difference between an echelon matrix and a matrix in row canonical form is that in an echelon matrix there must be zeros below the pivots [Properties (1) and (2)], but in a matrix in row canonical form, each pivot must also equal 1 [Property (3)] and there must also be zeros above the pivots [Property (4)]. The zero matrix 0 of any size and the identity matrix I of any size are important special examples of matrices in row canonical form. EXAMPLE 3.10 The following are echelon matrices whose pivots have been circled: 2 3 2 0 4 5 6 0 0 0 1 3 2 0 0 0 0 0 0 6 2 0 0 0 0 0 0 0 2 6 6 4 3 7 7 5; 1 2 3 0 0 1 0 0 0 2 4 3 5; 0 1 3 0 0 4 0 0 0 1 0 3 0 0 0 0 1 2 2 4 3 5 The third matrix is also an example of a matrix in row canonical form. The second matrix is not in row canonical form, because it does not satisfy property (4); that is, there is a nonzero entry above the second pivot in the third column. The first matrix is not in row canonical form, because it satisfies neither property (3) nor property (4); that is, some pivots are not equal to 1 and there are nonzero entries above the pivots. CHAPTER 3 Systems of Linear Equations 71
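As a bridge back to Example 3.8, the row canonical form just defined makes inconsistency visible at a glance: reducing that example's augmented matrix leaves the row (0, 0, 0, 0, 1), the impossible equation 0 = 1. The check below is ours, and the coefficient signs are reconstructed from the elimination steps shown in Example 3.8.

```python
from sympy import Matrix

M = Matrix([[1, 3, -2, 5, 4],      # augmented matrix of Example 3.8 (reconstructed signs)
            [2, 8, -1, 9, 9],
            [3, 5, -12, 17, 7]])
R, pivots = M.rref()
print(R.row(2))   # Matrix([[0, 0, 0, 0, 1]])  -> the degenerate equation 0 = 1
print(pivots)     # (0, 1, 4): the last pivot sits in the constants column
```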
  • 79. Elementary Row Operations Suppose A is a matrix with rows R1; R2; . . . ; Rm. The following operations on A are called elementary row operations. ½E1 (Row Interchange): Interchange rows Ri and Rj. This may be written as ‘‘Interchange Ri and Rj’’ or ‘‘Ri ! Rj’’ ½E2 (Row Scaling): Replace row Ri by a nonzero multiple kRi of itself. This may be written as ‘‘Replace Ri by kRi ðk 6¼ 0Þ’’ or ‘‘kRi ! Ri’’ ½E3 (Row Addition): Replace row Rj by the sum of a multiple kRi of a row Ri and itself. This may be written as ‘‘Replace Rj by kRi þ Rj’’ or ‘‘kRi þ Rj ! Rj’’ The arrow ! in E2 and E3 may be read as ‘‘replaces.’’ Sometimes (say to avoid fractions when all the given scalars are integers) we may apply ½E2 and ½E3 in one step; that is, we may apply the following operation: ½E Replace Rj by the sum of a multiple kRi of a row Ri and a nonzero multiple k0 Rj of itself. This may be written as ‘‘Replace Rj by kRi þ k0 Rj ðk0 6¼ 0Þ’’ or ‘‘kRi þ k0 Rj ! Rj’’ We emphasize that in operations ½E3 and ½E only row Rj is changed. Row Equivalence, Rank of a Matrix A matrix A is said to be row equivalent to a matrix B, written A B if B can be obtained from A by a sequence of elementary row operations. In the case that B is also an echelon matrix, B is called an echelon form of A. The following are two basic results on row equivalence. THEOREM 3.7: Suppose A ¼ ½aij and B ¼ ½bij are row equivalent echelon matrices with respective pivot entries a1j1 ; a2j2 ; . . . arjr and b1k1 ; b2k2 ; . . . bsks Then A and B have the same number of nonzero rows—that is, r ¼ s—and the pivot entries are in the same positions—that is, j1 ¼ k1, j2 ¼ k2; . . . ; jr ¼ kr. THEOREM 3.8: Every matrix A is row equivalent to a unique matrix in row canonical form. The proofs of the above theorems will be postponed to Chapter 4. The unique matrix in Theorem 3.8 is called the row canonical form of A. Using the above theorems, we can now give our first definition of the rank of a matrix. DEFINITION: The rank of a matrix A, written rankðAÞ, is equal to the number of pivots in an echelon form of A. The rank is a very important property of a matrix and, depending on the context in which the matrix is used, it will be defined in many different ways. Of course, all the definitions lead to the same number. The next section gives the matrix format of Gaussian elimination, which finds an echelon form of any matrix A (and hence the rank of A), and also finds the row canonical form of A. 72 CHAPTER 3 Systems of Linear Equations
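The three elementary row operations above translate into one-line NumPy routines. The sketch below is an illustration, not the book's code, and the sample matrix is our own.

```python
import numpy as np

def row_interchange(A, i, j):      # [E1]  Ri <-> Rj
    A[[i, j]] = A[[j, i]]

def row_scale(A, i, k):            # [E2]  kRi -> Ri, with k != 0
    A[i] = k * A[i]

def row_addition(A, i, j, k):      # [E3]  kRi + Rj -> Rj  (only Rj changes)
    A[j] = k * A[i] + A[j]

A = np.array([[1., 2., 3.],
              [4., 5., 6.]])
row_addition(A, 0, 1, -4.0)        # "Replace R2 by -4R1 + R2"
print(A)                           # [[ 1.  2.  3.]  [ 0. -3. -6.]]
```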
  • 80. One can show that row equivalence is an equivalence relation. That is, (1) A A for any matrix A. (2) If A B, then B A. (3) If A B and B C, then A C. Property (2) comes from the fact that each elementary row operation has an inverse operation of the same type. Namely, (i) ‘‘Interchange Ri and Rj’’ is its own inverse. (ii) ‘‘Replace Ri by kRi’’ and ‘‘Replace Ri by ð1=kÞRi’’ are inverses. (iii) ‘‘Replace Rj by kRi þ Rj’’ and ‘‘Replace Rj by kRi þ Rj’’ are inverses. There is a similar result for operation [E] (Problem 3.73). 3.8 Gaussian Elimination, Matrix Formulation This section gives two matrix algorithms that accomplish the following: (1) Algorithm 3.3 transforms any matrix A into an echelon form. (2) Algorithm 3.4 transforms the echelon matrix into its row canonical form. These algorithms, which use the elementary row operations, are simply restatements of Gaussian elimination as applied to matrices rather than to linear equations. (The term ‘‘row reduce’’ or simply ‘‘reduce’’ will mean to transform a matrix by the elementary row operations.) ALGORITHM 3.3 (Forward Elimination): The input is any matrix A. (The algorithm puts 0’s below each pivot, working from the ‘‘top-down.’’) The output is an echelon form of A. Step 1. Find the first column with a nonzero entry. Let j1 denote this column. (a) Arrange so that a1j1 6¼ 0. That is, if necessary, interchange rows so that a nonzero entry appears in the first row in column j1. (b) Use a1j1 as a pivot to obtain 0’s below a1j1 . Specifically, for i 1: ð1Þ Set m ¼ aij1 =a1j1 ; ð2Þ Replace Ri by mR1 þ Ri [That is, apply the operation ðaij1 =a1j1 ÞR1 þ Ri ! Ri:] Step 2. Repeat Step 1 with the submatrix formed by all the rows excluding the first row. Here we let j2 denote the first column in the subsystem with a nonzero entry. Hence, at the end of Step 2, we have a2j2 6¼ 0. Steps 3 to r. Continue the above process until a submatrix has only zero rows. We emphasize that at the end of the algorithm, the pivots will be a1j1 ; a2j2 ; . . . ; arjr where r denotes the number of nonzero rows in the final echelon matrix. Remark 1: The following number m in Step 1(b) is called the multiplier: m ¼ aij1 a1j1 ¼ entry to be deleted pivot CHAPTER 3 Systems of Linear Equations 73
  • 81. Remark 2: One could replace the operation in Step 1(b) by the following which would avoid fractions if all the scalars were originally integers. Replace Ri by aij1 R1 þ a1j1 Ri: ALGORITHM 3.4 (Backward Elimination): The input is a matrix A ¼ ½aij in echelon form with pivot entries a1j1 ; a2j2 ; . . . ; arjr The output is the row canonical form of A. Step 1. (a) (Use row scaling so the last pivot equals 1.) Multiply the last nonzero row Rr by 1=arjr . (b) (Use arjr ¼ 1 to obtain 0’s above the pivot.) For i ¼ r 1; r 2; . . . ; 2; 1: ð1Þ Set m ¼ aijr ; ð2Þ Replace Ri by mRr þ Ri (That is, apply the operations aijr Rr þ Ri ! Ri.) Steps 2 to r1. Repeat Step 1 for rows Rr1, Rr2; . . . ; R2. Step r. (Use row scaling so the first pivot equals 1.) Multiply R1 by 1=a1j1 . There is an alternative form of Algorithm 3.4, which we describe here in words. The formal description of this algorithm is left to the reader as a supplementary problem. ALTERNATIVE ALGORITHM 3.4 Puts 0’s above the pivots row by row from the bottom up (rather than column by column from right to left). The alternative algorithm, when applied to an augmented matrix M of a system of linear equations, is essentially the same as solving for the pivot unknowns one after the other from the bottom up. Remark: We emphasize that Gaussian elimination is a two-stage process. Specifically, Stage A (Algorithm 3.3). Puts 0’s below each pivot, working from the top row R1 down. Stage B (Algorithm 3.4). Puts 0’s above each pivot, working from the bottom row Rr up. There is another algorithm, called Gauss–Jordan, that also row reduces a matrix to its row canonical form. The difference is that Gauss–Jordan puts 0’s both below and above each pivot as it works its way from the top row R1 down. Although Gauss–Jordan may be easier to state and understand, it is much less efficient than the two-stage Gaussian elimination algorithm. EXAMPLE 3.11 Consider the matrix A ¼ 1 2 3 1 2 2 4 4 6 10 3 6 6 9 13 2 4 3 5. (a) Use Algorithm 3.3 to reduce A to an echelon form. (b) Use Algorithm 3.4 to further reduce A to its row canonical form. (a) First use a11 ¼ 1 as a pivot to obtain 0’s below a11; that is, apply the operations ‘‘Replace R2 by 2R1 þ R2’’ and ‘‘Replace R3 by 3R1 þ R3.’’ Then use a23 ¼ 2 as a pivot to obtain 0 below a23; that is, apply the operation ‘‘Replace R3 by 3 2 R2 þ R3.’’ This yields A 1 2 3 1 2 0 0 2 4 6 0 0 3 6 7 2 4 3 5 1 2 3 1 2 0 0 2 4 6 0 0 0 0 2 2 4 3 5 The matrix is now in echelon form. 74 CHAPTER 3 Systems of Linear Equations
  • 82. (b) Multiply R3 by 1 2 so the pivot entry a35 ¼ 1, and then use a35 ¼ 1 as a pivot to obtain 0’s above it by the operations ‘‘Replace R2 by 6R3 þ R2’’ and then ‘‘Replace R1 by 2R3 þ R1.’’ This yields A 1 2 3 1 2 0 0 2 4 6 0 0 0 0 1 2 4 3 5 1 2 3 1 0 0 0 2 4 0 0 0 0 0 1 2 4 3 5: Multiply R2 by 1 2 so the pivot entry a23 ¼ 1, and then use a23 ¼ 1 as a pivot to obtain 0’s above it by the operation ‘‘Replace R1 by 3R2 þ R1.’’ This yields A 1 2 3 1 0 0 0 1 2 0 0 0 0 0 1 2 4 3 5 1 2 0 7 0 0 0 1 2 0 0 0 0 0 1 2 4 3 5: The last matrix is the row canonical form of A. Application to Systems of Linear Equations One way to solve a system of linear equations is by working with its augmented matrix M rather than the equations themselves. Specifically, we reduce M to echelon form (which tells us whether the system has a solution), and then further reduce M to its row canonical form (which essentially gives the solution of the original system of linear equations). The justification for this process comes from the following facts: (1) Any elementary row operation on the augmented matrix M of the system is equivalent to applying the corresponding operation on the system itself. (2) The system has a solution if and only if the echelon form of the augmented matrix M does not have a row of the form ð0; 0; . . . ; 0; bÞ with b 6¼ 0. (3) In the row canonical form of the augmented matrix M (excluding zero rows), the coefficient of each basic variable is a pivot entry equal to 1, and it is the only nonzero entry in its respective column; hence, the free-variable form of the solution of the system of linear equations is obtained by simply transferring the free variables to the other side. This process is illustrated below. EXAMPLE 3.12 Solve each of the following systems: (a) x1 þ x2 2x3 þ 4x4 ¼ 5 2x1 þ 2x2 3x3 þ x4 ¼ 3 3x1 þ 3x2 4x3 2x4 ¼ 1 (b) x1 þ x2 2x3 þ 3x4 ¼ 4 2x1 þ 3x2 þ 3x3 x4 ¼ 3 5x1 þ 7x2 þ 4x3 þ x4 ¼ 5 (c) x þ 2y þ z ¼ 3 2x þ 5y z ¼ 4 3x 2y z ¼ 5 (a) Reduce its augmented matrix M to echelon form and then to row canonical form as follows: M ¼ 1 1 2 4 5 2 2 3 1 3 3 3 4 2 1 2 4 3 5 1 1 2 4 5 0 0 1 7 7 0 0 2 14 14 2 4 3 5 1 1 0 10 9 0 0 1 7 7 0 0 0 0 0 2 4 3 5 Rewrite the row canonical form in terms of a system of linear equations to obtain the free variable form of the solution. That is, x1 þ x2 10x4 ¼ 9 x3 7x4 ¼ 7 or x1 ¼ 9 x2 þ 10x4 x3 ¼ 7 þ 7x4 (The zero row is omitted in the solution.) Observe that x1 and x3 are the pivot variables, and x2 and x4 are the free variables. CHAPTER 3 Systems of Linear Equations 75
  • 83. (b) First reduce its augmented matrix M to echelon form as follows: M ¼ 1 1 2 3 4 2 3 3 1 3 5 7 4 1 5 2 4 3 5 1 1 2 3 4 0 1 7 7 5 0 2 14 14 15 2 4 3 5 1 1 2 3 4 0 1 7 7 5 0 0 0 0 5 2 4 3 5 There is no need to continue to find the row canonical form of M, because the echelon form already tells us that the system has no solution. Specifically, the third row of the echelon matrix corresponds to the degenerate equation 0x1 þ 0x2 þ 0x3 þ 0x4 ¼ 5 which has no solution. Thus, the system has no solution. (c) Reduce its augmented matrix M to echelon form and then to row canonical form as follows: M ¼ 1 2 1 3 2 5 1 4 3 2 1 5 2 6 4 3 7 5 1 2 1 3 0 1 3 10 0 8 4 4 2 6 4 3 7 5 1 2 1 3 0 1 3 10 0 0 28 84 2 6 4 3 7 5 1 2 1 3 0 1 3 10 0 0 1 3 2 6 4 3 7 5 1 2 0 0 0 1 0 1 0 0 1 3 2 6 4 3 7 5 1 0 0 2 0 1 0 1 0 0 1 3 2 6 4 3 7 5 Thus, the system has the unique solution x ¼ 2, y ¼ 1, z ¼ 3, or, equivalently, the vector u ¼ ð2; 1; 3Þ. We note that the echelon form of M already indicated that the solution was unique, because it corresponded to a triangular system. Application to Existence and Uniqueness Theorems This subsection gives theoretical conditions for the existence and uniqueness of a solution of a system of linear equations using the notion of the rank of a matrix. THEOREM 3.9: Consider a system of linear equations in n unknowns with augmented matrix M ¼ ½A; B. Then, (a) The system has a solution if and only if rankðAÞ ¼ rankðMÞ. (b) The solution is unique if and only if rankðAÞ ¼ rankðMÞ ¼ n. Proof of (a). The system has a solution if and only if an echelon form of M ¼ ½A; B does not have a row of the form ð0; 0; . . . ; 0; bÞ; with b 6¼ 0 If an echelon form of M does have such a row, then b is a pivot of M but not of A, and hence, rankðMÞ rankðAÞ. Otherwise, the echelon forms of A and M have the same pivots, and hence, rankðAÞ ¼ rankðMÞ. This proves (a). Proof of (b). The system has a unique solution if and only if an echelon form has no free variable. This means there is a pivot for each unknown. Accordingly, n ¼ rankðAÞ ¼ rankðMÞ. This proves (b). The above proof uses the fact (Problem 3.74) that an echelon form of the augmented matrix M ¼ ½A; B also automatically yields an echelon form of A. 76 CHAPTER 3 Systems of Linear Equations
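Theorem 3.9 is easy to apply by machine. The sketch below, not from the text, classifies each system of Example 3.12 by comparing rank(A), rank(M), and the number of unknowns; the coefficient signs are read off from the worked reductions above, so treat them as a reconstruction.

```python
from sympy import Matrix

def classify(A, B):
    A, B = Matrix(A), Matrix(B)
    M = A.row_join(B)                       # augmented matrix M = [A, B]
    rA, rM, n = A.rank(), M.rank(), A.cols
    if rA != rM:
        return 'no solution'
    return 'unique solution' if rA == n else 'infinitely many solutions'

print(classify([[1, 1, -2, 4], [2, 2, -3, 1], [3, 3, -4, -2]], [[5], [3], [1]]))  # (a) infinitely many solutions
print(classify([[1, 1, -2, 3], [2, 3, 3, -1], [5, 7, 4, 1]], [[4], [3], [5]]))    # (b) no solution
print(classify([[1, 2, 1], [2, 5, -1], [3, -2, -1]], [[3], [-4], [5]]))           # (c) unique solution
```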
  • 84. 3.9 Matrix Equation of a System of Linear Equations The general system (3.2) of m linear equations in n unknowns is equivalent to the matrix equation a11 a12 . . . a1n a21 a22 . . . a2n ::::::::::::::::::::::::::::::: am1 am2 . . . amn 2 6 6 4 3 7 7 5 x1 x2 x3 . . . xn 2 6 6 6 6 4 3 7 7 7 7 5 ¼ b1 b2 . . . bm 2 6 6 4 3 7 7 5 or AX ¼ B where A ¼ ½aij is the coefficient matrix, X ¼ ½xj is the column vector of unknowns, and B ¼ ½bi is the column vector of constants. (Some texts write Ax ¼ b rather than AX ¼ B, in order to emphasize that x and b are simply column vectors.) The statement that the system of linear equations and the matrix equation are equivalent means that any vector solution of the system is a solution of the matrix equation, and vice versa. EXAMPLE 3.13 The following system of linear equations and matrix equation are equivalent: x1 þ 2x2 4x3 þ 7x4 ¼ 4 3x1 5x2 þ 6x3 8x4 ¼ 8 4x1 3x2 2x3 þ 6x4 ¼ 11 and 1 2 4 7 3 5 6 8 4 3 2 6 2 4 3 5 x1 x2 x3 x4 2 6 6 4 3 7 7 5 ¼ 4 8 11 2 4 3 5 We note that x1 ¼ 3, x2 ¼ 1, x3 ¼ 2, x4 ¼ 1, or, in other words, the vector u ¼ ½3; 1; 2; 1 is a solution of the system. Thus, the (column) vector u is also a solution of the matrix equation. The matrix form AX ¼ B of a system of linear equations is notationally very convenient when discussing and proving properties of systems of linear equations. This is illustrated with our first theorem (described in Fig. 3-1), which we restate for easy reference. THEOREM 3.1: Suppose the field K is infinite. Then the system AX ¼ B has: (a) a unique solution, (b) no solution, or (c) an infinite number of solutions. Proof. It suffices to show that if AX ¼ B has more than one solution, then it has infinitely many. Suppose u and v are distinct solutions of AX ¼ B; that is, Au ¼ B and Av ¼ B. Then, for any k 2 K, A½u þ kðu vÞ ¼ Au þ kðAu AvÞ ¼ B þ kðB BÞ ¼ B Thus, for each k 2 K, the vector u þ kðu vÞ is a solution of AX ¼ B. Because all such solutions are distinct (Problem 3.47), AX ¼ B has an infinite number of solutions. Observe that the above theorem is true when K is the real field R (or the complex field C). Section 3.3 shows that the theorem has a geometrical description when the system consists of two equations in two unknowns, where each equation represents a line in R2 . The theorem also has a geometrical description when the system consists of three nondegenerate equations in three unknowns, where the three equations correspond to planes H1, H2, H3 in R3 . That is, (a) Unique solution: Here the three planes intersect in exactly one point. (b) No solution: Here the planes may intersect pairwise but with no common point of intersection, or two of the planes may be parallel. (c) Infinite number of solutions: Here the three planes may intersect in a line (one free variable), or they may coincide (two free variables). These three cases are pictured in Fig. 3-3. Matrix Equation of a Square System of Linear Equations A system AX ¼ B of linear equations is square if and only if the matrix A of coefficients is square. In such a case, we have the following important result. CHAPTER 3 Systems of Linear Equations 77
  • 85. THEOREM 3.10: A square system AX ¼ B of linear equations has a unique solution if and only if the matrix A is invertible. In such a case, A1 B is the unique solution of the system. We only prove here that if A is invertible, then A1 B is a unique solution. If A is invertible, then AðA1 BÞ ¼ ðAA1 ÞB ¼ IB ¼ B and hence, A1 B is a solution. Now suppose v is any solution, so Av ¼ B. Then v ¼ Iv ¼ ðA1 AÞv ¼ A1 ðAvÞ ¼ A1 B Thus, the solution A1 B is unique. EXAMPLE 3.14 Consider the following system of linear equations, whose coefficient matrix A and inverse A1 are also given: x þ 2y þ 3z ¼ 1 x þ 3y þ 6z ¼ 3 2x þ 6y þ 13z ¼ 5 ; A ¼ 1 2 3 1 3 6 2 6 13 2 4 3 5; A1 ¼ 3 8 3 1 7 3 0 2 1 2 4 3 5 By Theorem 3.10, the unique solution of the system is A1 B ¼ 3 8 3 1 7 3 0 2 1 2 4 3 5 1 3 5 2 4 3 5 ¼ 6 5 1 2 4 3 5 That is, x ¼ 6, y ¼ 5, z ¼ 1. Remark: We emphasize that Theorem 3.10 does not usually help us to find the solution of a square system. That is, finding the inverse of a coefficient matrix A is not usually any easier than solving the system directly. Thus, unless we are given the inverse of a coefficient matrix A, as in Example 3.14, we usually solve a square system by Gaussian elimination (or some iterative method whose discussion lies beyond the scope of this text). ( ) Unique solution a H2 H3 H1 H1 H2 H3 ( ) Infinite number of solutions c H3 H H 1 2 and (i) (ii) (iii) H H H 1 2 3 , , and (i) ( ) No solutions b H3 H2 H1 (ii) (iii) (i ) v H1 H2 H3 H2 H3 H1 H3 Figure 3-3 78 CHAPTER 3 Systems of Linear Equations
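A quick numerical check of Example 3.14, not part of the text, with the signs taken from the worked arithmetic: the stated matrix really is the inverse of A, and multiplying it by B reproduces the unique solution.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [1, 3, 6],
              [2, 6, 13]])
A_inv = np.array([[3, -8, 3],
                  [-1, 7, -3],
                  [0, -2, 1]])
B = np.array([1, 3, 5])
print(np.allclose(A @ A_inv, np.eye(3)))   # True: the matrices are inverses
print(A_inv @ B)                           # [-6  5 -1], i.e. x = -6, y = 5, z = -1
```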
  • 86. 3.10 Systems of Linear Equations and Linear Combinations of Vectors The general system (3.2) of linear equations may be rewritten as the following vector equation: x1 a11 a21 . . . am1 2 6 6 4 3 7 7 5 þ x2 a12 a22 . . . am2 2 6 6 4 3 7 7 5 þ þ xn a1n a2n . . . amn 2 6 6 4 3 7 7 5 ¼ b1 b2 . . . bm 2 6 6 4 3 7 7 5 Recall that a vector v in Kn is said to be a linear combination of vectors u1; u2; . . . ; um in Kn if there exist scalars a1; a2; . . . ; am in K such that v ¼ a1u1 þ a2u2 þ þ amum Accordingly, the general system (3.2) of linear equations and the above equivalent vector equation have a solution if and only if the column vector of constants is a linear combination of the columns of the coefficient matrix. We state this observation formally. THEOREM 3.11: A system AX ¼ B of linear equations has a solution if and only if B is a linear combination of the columns of the coefficient matrix A. Thus, the answer to the problem of expressing a given vector v in Kn as a linear combination of vectors u1; u2; . . . ; um in Kn reduces to solving a system of linear equations. Linear Combination Example Suppose we want to write the vector v ¼ ð1; 2; 5Þ as a linear combination of the vectors u1 ¼ ð1; 1; 1Þ; u2 ¼ ð1; 2; 3Þ; u3 ¼ ð2; 1; 1Þ First we write v ¼ xu1 þ yu2 þ zu3 with unknowns x; y; z, and then we find the equivalent system of linear equations which we solve. Specifically, we first write 1 2 5 2 4 3 5 ¼ x 1 1 1 2 4 3 5 þ y 1 2 3 2 4 3 5 þ z 2 1 1 2 4 3 5 ð*Þ Then 1 2 5 2 4 3 5 ¼ x x x 2 4 3 5 þ y 2y 3y 2 4 3 5 þ 2z z z 2 4 3 5 ¼ x þ y þ 2z x þ 2y z x þ 3y þ z 2 4 3 5 Setting corresponding entries equal to each other yields the following equivalent system: x þ y þ 2z ¼ 1 x þ 2y z ¼ 2 x þ 3y þ z ¼ 5 ð**Þ For notational convenience, we have written the vectors in Rn as columns, because it is then easier to find the equivalent system of linear equations. In fact, one can easily go from the vector equation (*) directly to the system (**). Now we solve the equivalent system of linear equations by reducing the system to echelon form. This yields x þ y þ 2z ¼ 1 y 3z ¼ 3 2y z ¼ 4 and then x þ y þ 2z ¼ 1 y 3z ¼ 3 5z ¼ 10 Back-substitution yields the solution x ¼ 6, y ¼ 3, z ¼ 2. Thus, v ¼ 6u1 þ 3u2 þ 2u3. CHAPTER 3 Systems of Linear Equations 79
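The linear-combination problem above is just a system whose coefficient columns are u1, u2, u3, so it can be checked in two lines; this verification is ours, not the book's.

```python
import numpy as np

u1, u2, u3 = np.array([1, 1, 1]), np.array([1, 2, 3]), np.array([2, -1, 1])
v = np.array([1, -2, 5])
coeffs = np.linalg.solve(np.column_stack([u1, u2, u3]), v)   # columns are u1, u2, u3
print(coeffs)                                  # [-6.  3.  2.]
print(np.allclose(-6*u1 + 3*u2 + 2*u3, v))     # True: v = -6u1 + 3u2 + 2u3
```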
  • 87. EXAMPLE 3.15 (a) Write the vector v ¼ ð4; 9; 19Þ as a linear combination of u1 ¼ ð1; 2; 3Þ; u2 ¼ ð3; 7; 10Þ; u3 ¼ ð2; 1; 9Þ: Find the equivalent system of linear equations by writing v ¼ xu1 þ yu2 þ zu3, and reduce the system to an echelon form. We have x þ 3y þ 2z ¼ 4 2x 7y þ z ¼ 9 3x þ 10y þ 9z ¼ 19 or x þ 3y þ 2z ¼ 4 y þ 5z ¼ 17 y þ 3z ¼ 7 or x þ 3y þ 2z ¼ 4 y þ 5z ¼ 17 8z ¼ 24 Back-substitution yields the solution x ¼ 4, y ¼ 2, z ¼ 3. Thus, v is a linear combination of u1; u2; u3. Specifically, v ¼ 4u1 2u2 þ 3u3. (b) Write the vector v ¼ ð2; 3; 5Þ as a linear combination of u1 ¼ ð1; 2; 3Þ; u2 ¼ ð2; 3; 4Þ; u3 ¼ ð1; 3; 5Þ Find the equivalent system of linear equations by writing v ¼ xu1 þ yu2 þ zu3, and reduce the system to an echelon form. We have x þ 2y þ z ¼ 2 2x þ 3y þ 3z ¼ 3 3x 4y 5z ¼ 5 or x þ 2y þ z ¼ 2 y þ z ¼ 1 2y 2z ¼ 1 or x þ 2y þ z ¼ 2 5y þ 5z ¼ 1 0 ¼ 3 The system has no solution. Thus, it is impossible to write v as a linear combination of u1; u2; u3. Linear Combinations of Orthogonal Vectors, Fourier Coefficients Recall first (Section 1.4) that the dot (inner) product u v of vectors u ¼ ða1; . . . ; anÞ and v ¼ ðb1; . . . ; bnÞ in Rn is defined by u v ¼ a1b1 þ a2b2 þ þ anbn Furthermore, vectors u and v are said to be orthogonal if their dot product u v ¼ 0. Suppose that u1; u2; . . . ; un in Rn are n nonzero pairwise orthogonal vectors. This means ðiÞ ui uj ¼ 0 for i 6¼ j and ðiiÞ ui ui 6¼ 0 for each i Then, for any vector v in Rn , there is an easy way to write v as a linear combination of u1; u2; . . . ; un, which is illustrated in the next example. EXAMPLE 3.16 Consider the following three vectors in R3 : u1 ¼ ð1; 1; 1Þ; u2 ¼ ð1; 3; 2Þ; u3 ¼ ð5; 1; 4Þ These vectors are pairwise orthogonal; that is, u1 u2 ¼ 1 3 þ 2 ¼ 0; u1 u3 ¼ 5 1 4 ¼ 0; u2 u3 ¼ 5 þ 3 8 ¼ 0 Suppose we want to write v ¼ ð4; 14; 9Þ as a linear combination of u1; u2; u3. Method 1. Find the equivalent system of linear equations as in Example 3.14 and then solve, obtaining v ¼ 3u1 4u2 þ u3. Method 2. (This method uses the fact that the vectors u1; u2; u3 are mutually orthogonal, and hence, the arithmetic is much simpler.) Set v as a linear combination of u1; u2; u3 using unknown scalars x; y; z as follows: ð4; 14; 9Þ ¼ xð1; 1; 1Þ þ yð1; 3; 2Þ þ zð5; 1; 4Þ ð*Þ 80 CHAPTER 3 Systems of Linear Equations
Take the dot product of (*) with respect to u1 to get

(4, 14, -9) . (1, 1, 1) = x(1, 1, 1) . (1, 1, 1)   or   9 = 3x   or   x = 3

(The last two terms drop out, because u1 is orthogonal to u2 and to u3.) Next take the dot product of (*) with respect to u2 to obtain

(4, 14, -9) . (1, -3, 2) = y(1, -3, 2) . (1, -3, 2)   or   -56 = 14y   or   y = -4

Finally, take the dot product of (*) with respect to u3 to get

(4, 14, -9) . (5, -1, -4) = z(5, -1, -4) . (5, -1, -4)   or   42 = 42z   or   z = 1

Thus, v = 3u1 - 4u2 + u3.

The procedure in Method 2 in Example 3.16 is valid in general. Namely,

THEOREM 3.12: Suppose u1, u2, ..., un are nonzero mutually orthogonal vectors in R^n. Then, for any vector v in R^n,

v = [(v . u1)/(u1 . u1)] u1 + [(v . u2)/(u2 . u2)] u2 + ... + [(v . un)/(un . un)] un

We emphasize that there must be n such orthogonal vectors ui in R^n for the formula to be used. Note also that each ui . ui is nonzero, because each ui is a nonzero vector.

Remark: The following scalar ki (appearing in Theorem 3.12) is called the Fourier coefficient of v with respect to ui:

ki = (v . ui)/(ui . ui) = (v . ui)/||ui||^2

It is analogous to a coefficient in the celebrated Fourier series of a function.

3.11 Homogeneous Systems of Linear Equations

A system of linear equations is said to be homogeneous if all the constant terms are zero. Thus, a homogeneous system has the form AX = 0. Clearly, such a system always has the zero vector 0 = (0, 0, ..., 0) as a solution, called the zero or trivial solution. Accordingly, we are usually interested in whether or not the system has a nonzero solution.

Because a homogeneous system AX = 0 has at least the zero solution, it can always be put in an echelon form, say

a11 x1 + a12 x2 + a13 x3 + a14 x4 + ... + a1n xn = 0
            a2j2 xj2 + a2,j2+1 xj2+1 + ... + a2n xn = 0
            ....................................................
                               arjr xjr + ... + arn xn = 0

Here r denotes the number of equations in echelon form and n denotes the number of unknowns. Thus, the echelon system has n - r free variables. The question of nonzero solutions reduces to the following two cases:

(i) r = n. The system has only the zero solution.
(ii) r < n. The system has a nonzero solution.

Accordingly, if we begin with fewer equations than unknowns, then, in echelon form, r < n, and the system has a nonzero solution. This proves the following important result.

THEOREM 3.13: A homogeneous system AX = 0 with more unknowns than equations has a nonzero solution.
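Before moving on, Theorem 3.12 amounts to one dot-product quotient per coefficient, so it is easy to check by machine. The sketch below is ours, not the book's, and it recomputes the Fourier coefficients for Example 3.16 above.

```python
import numpy as np

u = [np.array([1, 1, 1]), np.array([1, -3, 2]), np.array([5, -1, -4])]
v = np.array([4, 14, -9])
coeffs = [np.dot(v, ui) / np.dot(ui, ui) for ui in u]    # k_i = (v . u_i)/(u_i . u_i)
print(coeffs)                                            # [3.0, -4.0, 1.0]
print(np.allclose(sum(k * ui for k, ui in zip(coeffs, u)), v))   # True: v = 3u1 - 4u2 + u3
```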
  • 89. EXAMPLE 3.17 Determine whether or not each of the following homogeneous systems has a nonzero solution: (a) x þ y z ¼ 0 2x 3y þ z ¼ 0 x 4y þ 2z ¼ 0 (b) x þ y z ¼ 0 2x þ 4y z ¼ 0 3x þ 2y þ 2z ¼ 0 (c) x1 þ 2x2 3x3 þ 4x4 ¼ 0 2x1 3x2 þ 5x3 7x4 ¼ 0 5x1 þ 6x2 9x3 þ 8x4 ¼ 0 (a) Reduce the system to echelon form as follows: x þ y z ¼ 0 5y þ 3z ¼ 0 5y þ 3z ¼ 0 and then x þ y z ¼ 0 5y þ 3z ¼ 0 The system has a nonzero solution, because there are only two equations in the three unknowns in echelon form. Here z is a free variable. Let us, say, set z ¼ 5. Then, by back-substitution, y ¼ 3 and x ¼ 2. Thus, the vector u ¼ ð2; 3; 5Þ is a particular nonzero solution. (b) Reduce the system to echelon form as follows: x þ y z ¼ 0 2y þ z ¼ 0 y þ 5z ¼ 0 and then x þ y z ¼ 0 2y þ z ¼ 0 11z ¼ 0 In echelon form, there are three equations in three unknowns. Thus, the system has only the zero solution. (c) The system must have a nonzero solution (Theorem 3.13), because there are four unknowns but only three equations. (Here we do not need to reduce the system to echelon form.) Basis for the General Solution of a Homogeneous System Let W denote the general solution of a homogeneous system AX ¼ 0. A list of nonzero solution vectors u1; u2; . . . ; us of the system is said to be a basis for W if each solution vector w 2 W can be expressed uniquely as a linear combination of the vectors u1; u2; . . . ; us; that is, there exist unique scalars a1; a2; . . . ; as such that w ¼ a1u1 þ a2u2 þ þ asus The number s of such basis vectors is equal to the number of free variables. This number s is called the dimension of W, written as dim W ¼ s. When W ¼ f0g—that is, the system has only the zero solution— we define dim W ¼ 0. The following theorem, proved in Chapter 5, page 171, tells us how to find such a basis. THEOREM 3.14: Let W be the general solution of a homogeneous system AX ¼ 0, and suppose that the echelon form of the homogeneous system has s free variables. Let u1; u2; . . . ; us be the solutions obtained by setting one of the free variables equal to 1 (or any nonzero constant) and the remaining free variables equal to 0. Then dim W ¼ s, and the vectors u1; u2; . . . ; us form a basis of W. We emphasize that the general solution W may have many bases, and that Theorem 3.12 only gives us one such basis. EXAMPLE 3.18 Find the dimension and a basis for the general solution W of the homogeneous system x1 þ 2x2 3x3 þ 2x4 4x5 ¼ 0 2x1 þ 4x2 5x3 þ x4 6x5 ¼ 0 5x1 þ 10x2 13x3 þ 4x4 16x5 ¼ 0 82 CHAPTER 3 Systems of Linear Equations
First reduce the system to echelon form. Apply the following operations:

"Replace L2 by -2L1 + L2" and "Replace L3 by -5L1 + L3," and then "Replace L3 by -2L2 + L3"

These operations yield

x1 + 2x2 - 3x3 + 2x4 - 4x5 = 0                    x1 + 2x2 - 3x3 + 2x4 - 4x5 = 0
              x3 - 3x4 + 2x5 = 0        and                     x3 - 3x4 + 2x5 = 0
            2x3 - 6x4 + 4x5 = 0

The system in echelon form has three free variables, x2, x4, x5; hence, dim W = 3. Three solution vectors that form a basis for W are obtained as follows:

(1) Set x2 = 1, x4 = 0, x5 = 0. Back-substitution yields the solution u1 = (-2, 1, 0, 0, 0).
(2) Set x2 = 0, x4 = 1, x5 = 0. Back-substitution yields the solution u2 = (7, 0, 3, 1, 0).
(3) Set x2 = 0, x4 = 0, x5 = 1. Back-substitution yields the solution u3 = (-2, 0, -2, 0, 1).

The vectors u1 = (-2, 1, 0, 0, 0), u2 = (7, 0, 3, 1, 0), u3 = (-2, 0, -2, 0, 1) form a basis for W.

Remark: Any solution of the system in Example 3.18 can be written in the form

au1 + bu2 + cu3 = a(-2, 1, 0, 0, 0) + b(7, 0, 3, 1, 0) + c(-2, 0, -2, 0, 1) = (-2a + 7b - 2c, a, 3b - 2c, b, c)

or

x1 = -2a + 7b - 2c,   x2 = a,   x3 = 3b - 2c,   x4 = b,   x5 = c

where a, b, c are arbitrary constants. Observe that this representation is nothing more than the parametric form of the general solution under the choice of parameters x2 = a, x4 = b, x5 = c.

Nonhomogeneous and Associated Homogeneous Systems

Let AX = B be a nonhomogeneous system of linear equations. Then AX = 0 is called the associated homogeneous system. For example,

x + 2y - 4z = 7            x + 2y - 4z = 0
3x - 5y + 6z = 8    and    3x - 5y + 6z = 0

show a nonhomogeneous system and its associated homogeneous system.

The relationship between the solution U of a nonhomogeneous system AX = B and the solution W of its associated homogeneous system AX = 0 is contained in the following theorem.

THEOREM 3.15: Let v0 be a particular solution of AX = B and let W be the general solution of AX = 0. Then the following is the general solution of AX = B:

U = v0 + W = {v0 + w : w in W}

That is, U = v0 + W is obtained by adding v0 to each element in W. We note that this theorem has a geometrical interpretation in R^3. Specifically, suppose W is a line through the origin O. Then, as pictured in Fig. 3-4, U = v0 + W is the line parallel to W obtained by adding v0 to each element of W. Similarly, whenever W is a plane through the origin O, then U = v0 + W is a plane parallel to W.
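Closing out this section, Example 3.18 can be cross-checked with SymPy's nullspace routine. The check below is not in the text, and SymPy's basis vectors need not coincide with u1, u2, u3, although both sets span the same solution space W.

```python
from sympy import Matrix

A = Matrix([[1, 2, -3, 2, -4],
            [2, 4, -5, 1, -6],
            [5, 10, -13, 4, -16]])
print(len(A.nullspace()))          # 3, so dim W = 3

u1 = Matrix([-2, 1, 0, 0, 0])
u2 = Matrix([7, 0, 3, 1, 0])
u3 = Matrix([-2, 0, -2, 0, 1])
print(all(list(A * u) == [0, 0, 0] for u in (u1, u2, u3)))   # True: each u_i solves AX = 0
```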
  • 91. 3.12 Elementary Matrices Let e denote an elementary row operation and let eðAÞ denote the results of applying the operation e to a matrix A. Now let E be the matrix obtained by applying e to the identity matrix I; that is, E ¼ eðIÞ Then E is called the elementary matrix corresponding to the elementary row operation e. Note that E is always a square matrix. EXAMPLE 3.19 Consider the following three elementary row operations: ð1Þ Interchange R2 and R3: ð2Þ Replace R2 by 6R2: ð3Þ Replace R3 by 4R1 þ R3: The 3 3 elementary matrices corresponding to the above elementary row operations are as follows: E1 ¼ 1 0 0 0 0 1 0 1 0 2 4 3 5; E2 ¼ 1 0 0 0 6 0 0 0 1 2 4 3 5; E3 ¼ 1 0 0 0 1 0 4 0 1 2 4 3 5 The following theorem, proved in Problem 3.34, holds. THEOREM 3.16: Let e be an elementary row operation and let E be the corresponding m m elementary matrix. Then eðAÞ ¼ EA where A is any m n matrix. In other words, the result of applying an elementary row operation e to a matrix A can be obtained by premultiplying A by the corresponding elementary matrix E. Now suppose e0 is the inverse of an elementary row operation e, and let E0 and E be the corresponding matrices. We note (Problem 3.33) that E is invertible and E0 is its inverse. This means, in particular, that any product P ¼ Ek . . . E2E1 of elementary matrices is invertible. Figure 3-4 84 CHAPTER 3 Systems of Linear Equations
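Theorem 3.16 can be seen numerically: build E by applying a row operation to I, and premultiplication by E then reproduces that operation on any A. The operation and the sample matrix below are our own illustration, not those of Example 3.19.

```python
import numpy as np

def e(M):                        # the elementary row operation "Replace R3 by -4R1 + R3"
    M = M.astype(float)          # work on a copy
    M[2] = -4 * M[0] + M[2]
    return M

E = e(np.eye(3))                 # corresponding elementary matrix E = e(I)
A = np.array([[1., 2., 3.],
              [4., 5., 6.],
              [7., 8., 9.]])
print(np.allclose(e(A), E @ A))  # True: e(A) = EA
```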
  • 92. Applications of Elementary Matrices Using Theorem 3.16, we are able to prove (Problem 3.35) the following important properties of matrices. THEOREM 3.17: Let A be a square matrix. Then the following are equivalent: (a) A is invertible (nonsingular). (b) A is row equivalent to the identity matrix I. (c) A is a product of elementary matrices. Recall that square matrices A and B are inverses if AB ¼ BA ¼ I. The next theorem (proved in Problem 3.36) demonstrates that we need only show that one of the products is true, say AB ¼ I, to prove that matrices are inverses. THEOREM 3.18: Suppose AB ¼ I. Then BA ¼ I, and hence, B ¼ A1 . Row equivalence can also be defined in terms of matrix multiplication. Specifically, we will prove (Problem 3.37) the following. THEOREM 3.19: B is row equivalent to A if and only if there exists a nonsingular matrix P such that B ¼ PA. Application to Finding the Inverse of an n n Matrix The following algorithm finds the inverse of a matrix. ALGORITHM 3.5: The input is a square matrix A. The output is the inverse of A or that the inverse does not exist. Step 1. Form the n 2n (block) matrix M ¼ ½A; I, where A is the left half of M and the identity matrix I is the right half of M. Step 2. Row reduce M to echelon form. If the process generates a zero row in the A half of M, then STOP A has no inverse. (Otherwise A is in triangular form.) Step 3. Further row reduce M to its row canonical form M ½I; B where the identity matrix I has replaced A in the left half of M. Step 4. Set A1 ¼ B, the matrix that is now in the right half of M. The justification for the above algorithm is as follows. Suppose A is invertible and, say, the sequence of elementary row operations e1; e2; . . . ; eq applied to M ¼ ½A; I reduces the left half of M, which is A, to the identity matrix I. Let Ei be the elementary matrix corresponding to the operation ei. Then, by applying Theorem 3.16. we get Eq . . . E2E1A ¼ I or ðEq . . . E2E1IÞA ¼ I; so A1 ¼ Eq . . . E2E1I That is, A1 can be obtained by applying the elementary row operations e1; e2; . . . ; eq to the identity matrix I, which appears in the right half of M. Thus, B ¼ A1 , as claimed. EXAMPLE 3.20 Find the inverse of the matrix A ¼ 1 0 2 2 1 3 4 1 8 2 4 3 5. CHAPTER 3 Systems of Linear Equations 85
  • 93. First form the (block) matrix M ¼ ½A; I and row reduce M to an echelon form: M ¼ 1 0 2 1 0 0 2 1 3 0 1 0 4 1 8 0 0 1 2 4 3 5 1 0 2 1 0 0 0 1 1 2 1 0 0 1 0 4 0 1 2 4 3 5 1 0 2 1 0 0 0 1 1 2 1 0 0 0 1 6 1 1 2 4 3 5 In echelon form, the left half of M is in triangular form; hence, A has an inverse. Next we further row reduce M to its row canonical form: M 1 0 0 11 2 2 0 1 0 4 0 1 0 0 1 6 1 1 2 4 3 5 1 0 0 11 2 2 0 1 0 4 0 1 0 0 1 6 1 1 2 4 3 5 The identity matrix is now in the left half of the final matrix; hence, the right half is A1 . In other words, A1 ¼ 11 2 2 4 0 1 6 1 1 2 4 3 5 Elementary Column Operations Now let A be a matrix with columns C1; C2; . . . ; Cn. The following operations on A, analogous to the elementary row operations, are called elementary column operations: ½F1 (Column Interchange): Interchange columns Ci and Cj. ½F2 (Column Scaling): Replace Ci by kCi (where k 6¼ 0). ½F3 (Column Addition): Replace Cj by kCi þ Cj. We may indicate each of the column operations by writing, respectively, ð1Þ Ci $ Cj; ð2Þ kCi ! Ci; ð3Þ ðkCi þ CjÞ ! Cj Moreover, each column operation has an inverse operation of the same type, just like the corresponding row operation. Now let f denote an elementary column operation, and let F be the matrix obtained by applying f to the identity matrix I; that is, F ¼ f ðIÞ Then F is called the elementary matrix corresponding to the elementary column operation f . Note that F is always a square matrix. EXAMPLE 3.21 Consider the following elementary column operations: ð1Þ Interchange C1 and C3; ð2Þ Replace C3 by 2C3; ð3Þ Replace C3 by 3C2 þ C3 The corresponding three 3 3 elementary matrices are as follows: F1 ¼ 0 0 1 0 1 0 1 0 0 2 4 3 5; F2 ¼ 1 0 0 0 1 0 0 0 2 2 4 3 5; F3 ¼ 1 0 0 0 1 3 0 0 1 2 4 3 5 The following theorem is analogous to Theorem 3.16 for the elementary row operations. THEOREM 3.20: For any matrix A; f ðAÞ ¼ AF. That is, the result of applying an elementary column operation f on a matrix A can be obtained by postmultiplying A by the corresponding elementary matrix F. 86 CHAPTER 3 Systems of Linear Equations
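A numerical check of Example 3.20, ours rather than the book's, with the signs of A taken from the worked row reduction: NumPy's inverse agrees with the matrix found by reducing [A, I].

```python
import numpy as np

A = np.array([[1, 0, 2],
              [2, -1, 3],
              [4, 1, 8]])
A_inv = np.array([[-11, 2, 2],
                  [-4, 0, 1],
                  [6, -1, -1]])
print(np.allclose(np.linalg.inv(A), A_inv))   # True
print(np.allclose(A @ A_inv, np.eye(3)))      # True
```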
  • 94. Matrix Equivalence A matrix B is equivalent to a matrix A if B can be obtained from A by a sequence of row and column operations. Alternatively, B is equivalent to A, if there exist nonsingular matrices P and Q such that B ¼ PAQ. Just like row equivalence, equivalence of matrices is an equivalence relation. The main result of this subsection (proved in Problem 3.38) is as follows. THEOREM 3.21: Every m n matrix A is equivalent to a unique block matrix of the form Ir 0 0 0 where Ir is the r-square identity matrix. The following definition applies. DEFINITION: The nonnegative integer r in Theorem 3.18 is called the rank of A, written rankðAÞ. Note that this definition agrees with the previous definition of the rank of a matrix. 3.13 LU DECOMPOSITION Suppose A is a nonsingular matrix that can be brought into (upper) triangular form U using only row- addition operations; that is, suppose A can be triangularized by the following algorithm, which we write using computer notation. ALGORITHM 3.6: The input is a matrix A and the output is a triangular matrix U. Step 1. Repeat for i ¼ 1; 2; . . . ; n 1: Step 2. Repeat for j ¼ i þ 1, i þ 2; . . . ; n (a) Set mij : ¼ aij=aii. (b) Set Rj : ¼ mijRi þ Rj [End of Step 2 inner loop.] [End of Step 1 outer loop.] The numbers mij are called multipliers. Sometimes we keep track of these multipliers by means of the following lower triangular matrix L: L ¼ 1 0 0 . . . 0 0 m21 1 0 . . . 0 0 m31 m32 1 . . . 0 0 mn1 mn2 mn3 . . . mn;n1 1 2 6 6 6 6 4 3 7 7 7 7 5 That is, L has 1’s on the diagonal, 0’s above the diagonal, and the negative of the multiplier mij as its ij-entry below the diagonal. The above matrix L and the triangular matrix U obtained in Algorithm 3.6 give us the classical LU factorization of such a matrix A. Namely, THEOREM 3.22: Let A be a nonsingular matrix that can be brought into triangular form U using only row-addition operations. Then A ¼ LU, where L is the above lower triangular matrix with 1’s on the diagonal, and U is an upper triangular matrix with no 0’s on the diagonal. ......................................................... CHAPTER 3 Systems of Linear Equations 87
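Algorithm 3.6 takes only a few lines of code when, as Theorem 3.22 assumes, no row interchanges are needed. The sketch below is ours, not the book's; it stores in L the quotient of each eliminated entry by its pivot, which is the negative of the multiplier used in the corresponding row operation, and it is demonstrated on the matrix of Example 3.22 (which follows), with that matrix's signs reconstructed from its stated row operations.

```python
import numpy as np

def lu_no_pivot(A):
    """LU factorization using row-addition operations only (no row interchanges)."""
    U = A.astype(float)               # astype copies, so A is left untouched
    n = U.shape[0]
    L = np.eye(n)
    for i in range(n - 1):
        for j in range(i + 1, n):
            m = U[j, i] / U[i, i]     # negative of the multiplier in Algorithm 3.6
            U[j] = U[j] - m * U[i]    # the row-addition operation on row j
            L[j, i] = m               # record it below the diagonal of L
    return L, U

A = np.array([[1, 2, 3],              # Example 3.22's matrix, signs reconstructed
              [-3, -4, -13],
              [2, 1, 5]])
L, U = lu_no_pivot(A)
print(L)                       # [[1, 0, 0], [-3, 1, 0], [2, -1.5, 1]]
print(U)                       # [[1, 2, 3], [0, 2, -4], [0, 0, -7]]
print(np.allclose(L @ U, A))   # True: A = LU
```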
  • 95. EXAMPLE 3.22 SupposeA ¼ 1 2 3 3 4 13 2 1 5 2 4 3 5.WenotethatA maybereducedtotriangularformbytheoperations ‘‘Replace R2 by 3R1 þ R2’’; ‘‘Replace R3 by 2R1 þ R3’’; and then ‘‘Replace R3 by 3 2 R2 þ R3’’ That is, A 1 2 3 0 2 4 0 3 1 2 4 3 5 1 2 3 0 2 4 0 0 7 2 4 3 5 This gives us the classical factorization A ¼ LU, where L ¼ 1 0 0 3 1 0 2 3 2 1 2 6 4 3 7 5 and U ¼ 1 2 3 0 2 4 0 0 7 2 6 4 3 7 5 We emphasize: (1) The entries 3; 2; 3 2 in L are the negatives of the multipliers in the above elementary row operations. (2) U is the triangular form of A. Application to Systems of Linear Equations Consider a computer algorithm M. Let CðnÞ denote the running time of the algorithm as a function of the size n of the input data. [The function CðnÞ is sometimes called the time complexity or simply the complexity of the algorithm M.] Frequently, CðnÞ simply counts the number of multiplications and divisions executed by M, but does not count the number of additions and subtractions because they take much less time to execute. Now consider a square system of linear equations AX ¼ B, where A ¼ ½aij; X ¼ ½x1; . . . ; xnT ; B ¼ ½b1; . . . ; bnT and suppose A has an LU factorization. Then the system can be brought into triangular form (in order to apply back-substitution) by applying Algorithm 3.6 to the augmented matrix M ¼ ½A; B of the system. The time complexity of Algorithm 3.6 and back-substitution are, respectively, CðnÞ 1 2 n3 and CðnÞ 1 2 n2 where n is the number of equations. On the other hand, suppose we already have the factorization A ¼ LU. Then, to triangularize the system, we need only apply the row operations in the algorithm (retained by the matrix L) to the column vector B. In this case, the time complexity is CðnÞ 1 2 n2 Of course, to obtain the factorization A ¼ LU requires the original algorithm where CðnÞ 1 2 n3 . Thus, nothing may be gained by first finding the LU factorization when a single system is involved. However, there are situations, illustrated below, where the LU factorization is useful. Suppose, for a given matrix A, we need to solve the system AX ¼ B 88 CHAPTER 3 Systems of Linear Equations
  • 96. repeatedly for a sequence of different constant vectors, say B1; B2; . . . ; Bk. Also, suppose some of the Bi depend upon the solution of the system obtained while using preceding vectors Bj. In such a case, it is more efficient to first find the LU factorization of A, and then to use this factorization to solve the system for each new B. EXAMPLE 3.23 Consider the following system of linear equations: x þ 2y þ z ¼ k1 2x þ 3y þ 3z ¼ k2 3x þ 10y þ 2z ¼ k3 or AX ¼ B; where A ¼ 1 2 1 2 3 3 3 10 2 2 4 3 5 and B ¼ k1 k2 k3 2 4 3 5 Suppose we want to solve the system three times where B is equal, say, to B1; B2; B3. Furthermore, suppose B1 ¼ ½1; 1; 1T , and suppose Bjþ1 ¼ Bj þ Xj ðfor j ¼ 1; 2Þ where Xj is the solution of AX ¼ Bj. Here it is more efficient to first obtain the LU factorization of A and then use the LU factorization to solve the system for each of the B’s. (This is done in Problem 3.42.) SOLVED PROBLEMS Linear Equations, Solutions, 2 2 Systems 3.1. Determine whether each of the following equations is linear: (a) 5x þ 7y 8yz ¼ 16, (b) x þ py þ ez ¼ log 5, (c) 3x þ ky 8z ¼ 16 (a) No, because the product yz of two unknowns is of second degree. (b) Yes, because p; e, and log 5 are constants. (c) As it stands, there are four unknowns: x; y; z; k. Because of the term ky it is not a linear equation. However, assuming k is a constant, the equation is linear in the unknowns x; y; z. 3.2. Determine whether the following vectors are solutions of x1 þ 2x2 4x3 þ 3x4 ¼ 15: (a) u ¼ ð3; 2; 1; 4Þ and (b) v ¼ ð1; 2; 4; 5Þ: (a) Substitute to obtain 3 þ 2ð2Þ 4ð1Þ þ 3ð4Þ ¼ 15, or 15 ¼ 15; yes, it is a solution. (b) Substitute to obtain 1 þ 2ð2Þ 4ð4Þ þ 3ð5Þ ¼ 15, or 4 ¼ 15; no, it is not a solution. 3.3. Solve (a) ex ¼ p, (b) 3x 4 x ¼ 2x þ 3, (c) 7 þ 2x 4 ¼ 3x þ 3 x (a) Because e 6¼ 0, multiply by 1=e to obtain x ¼ p=e. (b) Rewrite in standard form, obtaining 0x ¼ 7. The equation has no solution. (c) Rewrite in standard form, obtaining 0x ¼ 0. Every scalar k is a solution. 3.4. Prove Theorem 3.4: Consider the equation ax ¼ b. (i) If a 6¼ 0, then x ¼ b=a is a unique solution of ax ¼ b. (ii) If a ¼ 0 but b 6¼ 0, then ax ¼ b has no solution. (iii) If a ¼ 0 and b ¼ 0, then every scalar k is a solution of ax ¼ b. Suppose a 6¼ 0. Then the scalar b=a exists. Substituting b=a in ax ¼ b yields aðb=aÞ ¼ b, or b ¼ b; hence, b=a is a solution. On the other hand, suppose x0 is a solution to ax ¼ b, so that ax0 ¼ b. Multiplying both sides by 1=a yields x0 ¼ b=a. Hence, b=a is the unique solution of ax ¼ b. Thus, (i) is proved. On the other hand, suppose a ¼ 0. Then, for any scalar k, we have ak ¼ 0k ¼ 0. If b 6¼ 0, then ak 6¼ b. Accordingly, k is not a solution of ax ¼ b, and so (ii) is proved. If b ¼ 0, then ak ¼ b. That is, any scalar k is a solution of ax ¼ b, and so (iii) is proved. CHAPTER 3 Systems of Linear Equations 89
  • 97. 3.5. Solve each of the following systems: (a) 2x 5y ¼ 11 3x þ 4y ¼ 5 (b) 2x 3y ¼ 8 6x þ 9y ¼ 6 (c) 2x 3y ¼ 8 4x þ 6y ¼ 16 (a) Eliminate x from the equations by forming the new equation L ¼ 3L1 þ 2L2. This yields the equation 23y ¼ 23; and so y ¼ 1 Substitute y ¼ 1 in one of the original equations, say L1, to get 2x 5ð1Þ ¼ 11 or 2x þ 5 ¼ 11 or 2x ¼ 6 or x ¼ 3 Thus, x ¼ 3, y ¼ 1 or the pair u ¼ ð3; 1Þ is the unique solution of the system. (b) Eliminate x from the equations by forming the new equation L ¼ 3L1 þ L2. This yields the equation 0x þ 0y ¼ 30 This is a degenerate equation with a nonzero constant; hence, this equation and the system have no solution. (Geometrically, the lines corresponding to the equations are parallel.) (c) Eliminate x from the equations by forming the new equation L ¼ 2L1 þ L2. This yields the equation 0x þ 0y ¼ 0 This is a degenerate equation where the constant term is also zero. Thus, the system has an infinite number of solutions, which correspond to the solution of either equation. (Geometrically, the lines corresponding to the equations coincide.) To find the general solution, set y ¼ a and substitute in L1 to obtain 2x 3a ¼ 8 or 2x ¼ 3a þ 8 or x ¼ 3 2 a þ 4 Thus, the general solution is x ¼ 3 2 a þ 4; y ¼ a or u ¼ 3 2 a þ 4; a where a is any scalar. 3.6. Consider the system x þ ay ¼ 4 ax þ 9y ¼ b (a) For which values of a does the system have a unique solution? (b) Find those pairs of values (a; b) for which the system has more than one solution. (a) Eliminate x from the equations by forming the new equation L ¼ aL1 þ L2. This yields the equation ð9 a2 Þy ¼ b 4a ð1Þ The system has a unique solution if and only if the coefficient of y in (1) is not zero—that is, if 9 a2 6¼ 0 or if a 6¼ 3. (b) The system has more than one solution if both sides of (1) are zero. The left-hand side is zero when a ¼ 3. When a ¼ 3, the right-hand side is zero when b 12 ¼ 0 or b ¼ 12. When a ¼ 3, the right- hand side is zero when b þ 12 0 or b ¼ 12. Thus, (3; 12) and ð3; 12Þ are the pairs for which the system has more than one solution. Systems in Triangular and Echelon Form 3.7. Determine the pivot and free variables in each of the following systems: 2x1 3x2 6x3 5x4 þ 2x5 ¼ 7 x3 þ 3x4 7x5 ¼ 6 x4 2x5 ¼ 1 (a) 2x 6y þ 7z ¼ 1 4y þ 3z ¼ 8 2z ¼ 4 (b) x þ 2y 3z ¼ 2 2x þ 3y þ z ¼ 4 3x þ 4y þ 5z ¼ 8 (c) (a) In echelon form, the leading unknowns are the pivot variables, and the others are the free variables. Here x1, x3, x4 are the pivot variables, and x2 and x5 are the free variables. 90 CHAPTER 3 Systems of Linear Equations
  • 98. (b) The leading unknowns are x; y; z, so they are the pivot variables. There are no free variables (as in any triangular system). (c) The notion of pivot and free variables applies only to a system in echelon form. 3.8. Solve the triangular system in Problem 3.7(b). Because it is a triangular system, solve by back-substitution. (i) The last equation gives z ¼ 2. (ii) Substitute z ¼ 2 in the second equation to get 4y þ 6 ¼ 8 or y ¼ 1 2. (iii) Substitute z ¼ 2 and y ¼ 1 2 in the first equation to get 2x 6 1 2 þ 7ð2Þ ¼ 1 or 2x þ 11 ¼ 1 or x ¼ 5 Thus, x ¼ 5, y ¼ 1 2, z ¼ 2 or u ¼ ð5; 1 2 ; 2Þ is the unique solution to the system. 3.9. Solve the echelon system in Problem 3.7(a). Assign parameters to the free variables, say x2 ¼ a and x5 ¼ b, and solve for the pivot variables by back- substitution. (i) Substitute x5 ¼ b in the last equation to get x4 2b ¼ 1 or x4 ¼ 2b þ 1. (ii) Substitute x5 ¼ b and x4 ¼ 2b þ 1 in the second equation to get x3 þ 3ð2b þ 1Þ 7b ¼ 6 or x3 b þ 3 ¼ 6 or x3 ¼ b þ 3 (iii) Substitute x5 ¼ b, x4 ¼ 2b þ 1, x3 ¼ b þ 3, x2 ¼ a in the first equation to get 2x1 3a 6ðb þ 3Þ 5ð2b þ 1Þ þ 2b ¼ 7 or 2x1 3a 14b 23 ¼ 7 or x1 ¼ 3 2 a þ 7b þ 15 Thus, x1 ¼ 3 2 a þ 7b þ 15; x2 ¼ a; x3 ¼ b þ 3; x4 ¼ 2b þ 1; x5 ¼ b or u ¼ 3 2 a þ 7b þ 15; a; b þ 3; 2b þ 1; b is the parametric form of the general solution. Alternatively, solving for the pivot variable x1; x3; x4 in terms of the free variables x2 and x5 yields the following free-variable form of the general solution: x1 ¼ 3 2 x2 þ 7x5 þ 15; x3 ¼ x5 þ 3; x4 ¼ 2x5 þ 1 3.10. Prove Theorem 3.6. Consider the system (3.4) of linear equations in echelon form with r equations and n unknowns. (i) If r ¼ n, then the system has a unique solution. (ii) If r n, then we can arbitrarily assign values to the n r free variable and solve uniquely for the r pivot variables, obtaining a solution of the system. (i) Suppose r ¼ n. Then we have a square system AX ¼ B where the matrix A of coefficients is (upper) triangular with nonzero diagonal elements. Thus, A is invertible. By Theorem 3.10, the system has a unique solution. (ii) Assigning values to the n r free variables yields a triangular system in the pivot variables, which, by (i), has a unique solution. CHAPTER 3 Systems of Linear Equations 91
  • 99. Gaussian Elimination 3.11. Solve each of the following systems: x þ 2y 4z ¼ 4 2x þ 5y 9z ¼ 10 3x 2y þ 3z ¼ 11 (a) x þ 2y 3z ¼ 1 3x þ y 2z ¼ 7 5x þ 3y 4z ¼ 2 (b) x þ 2y 3z ¼ 1 2x þ 5y 8z ¼ 4 3x þ 8y 13z ¼ 7 (c) Reduce each system to triangular or echelon form using Gaussian elimination: (a) Apply ‘‘Replace L2 by 2L1 þ L2’’ and ‘‘Replace L3 by 3L1 þ L3’’ to eliminate x from the second and third equations, and then apply ‘‘Replace L3 by 8L2 þ L3’’ to eliminate y from the third equation. These operations yield x þ 2y 4z ¼ 4 y z ¼ 2 8y þ 15z ¼ 23 and then x þ 2y 4z ¼ 4 y z ¼ 2 7z ¼ 7 The system is in triangular form. Solve by back-substitution to obtain the unique solution u ¼ ð2; 1; 1Þ. (b) Eliminate x from the second and third equations by the operations ‘‘Replace L2 by 3L1 þ L2’’ and ‘‘Replace L3 by 5L1 þ L3.’’ This gives the equivalent system x þ 2y 3z ¼ 1 7y 11z ¼ 10 7y þ 11z ¼ 7 The operation ‘‘Replace L3 by L2 þ L3’’ yields the following degenerate equation with a nonzero constant: 0x þ 0y þ 0z ¼ 3 This equation and hence the system have no solution. (c) Eliminate x from the second and third equations by the operations ‘‘Replace L2 by 2L1 þ L2’’ and ‘‘Replace L3 by 3L1 þ L3.’’ This yields the new system x þ 2y 3z ¼ 1 y 2z ¼ 2 2y 4z ¼ 4 or x þ 2y 3z ¼ 1 y 2z ¼ 2 (The third equation is deleted, because it is a multiple of the second equation.) The system is in echelon form with pivot variables x and y and free variable z. To find the parametric form of the general solution, set z ¼ a and solve for x and y by back- substitution. Substitute z ¼ a in the second equation to get y ¼ 2 þ 2a. Then substitute z ¼ a and y ¼ 2 þ 2a in the first equation to get x þ 2ð2 þ 2aÞ 3a ¼ 1 or x þ 4 þ a ¼ 1 or x ¼ 3 a Thus, the general solution is x ¼ 3 a; y ¼ 2 þ 2a; z ¼ a or u ¼ ð3 a; 2 þ 2a; aÞ where a is a parameter. 3.12. Solve each of the following systems: x1 3x2 þ 2x3 x4 þ 2x5 ¼ 2 3x1 9x2 þ 7x3 x4 þ 3x5 ¼ 7 2x1 6x2 þ 7x3 þ 4x4 5x5 ¼ 7 (a) x1 þ 2x2 3x3 þ 4x4 ¼ 2 2x1 þ 5x2 2x3 þ x4 ¼ 1 5x1 þ 12x2 7x3 þ 6x4 ¼ 3 (b) Reduce each system to echelon form using Gaussian elimination: 92 CHAPTER 3 Systems of Linear Equations
  • 100. (a) Apply ‘‘Replace L2 by 3L1 þ L2’’ and ‘‘Replace L3 by 2L1 þ L3’’ to eliminate x from the second and third equations. This yields x1 3x2 þ 2x3 x4 þ 2x5 ¼ 2 x3 þ 2x4 3x5 ¼ 1 3x3 þ 6x4 9x5 ¼ 3 or x1 3x2 þ 2x3 x4 þ 2x5 ¼ 2 x3 þ 2x4 3x5 ¼ 1 (We delete L3, because it is a multiple of L2.) The system is in echelon form with pivot variables x1 and x3 and free variables x2; x4; x5. To find the parametric form of the general solution, set x2 ¼ a, x4 ¼ b, x5 ¼ c, where a; b; c are parameters. Back-substitution yields x3 ¼ 1 2b þ 3c and x1 ¼ 3a þ 5b 8c. The general solution is x1 ¼ 3a þ 5b 8c; x2 ¼ a; x3 ¼ 1 2b þ 3c; x4 ¼ b; x5 ¼ c or, equivalently, u ¼ ð3a þ 5b 8c; a; 1 2b þ 3c; b; cÞ. (b) Eliminate x1 from the second and third equations by the operations ‘‘Replace L2 by 2L1 þ L2’’ and ‘‘Replace L3 by 5L1 þ L3.’’ This yields the system x1 þ 2x2 3x3 þ 4x4 ¼ 2 x2 þ 4x3 7x4 ¼ 3 2x2 þ 8x3 14x4 ¼ 7 The operation ‘‘Replace L3 by 2L2 þ L3’’ yields the degenerate equation 0 ¼ 1. Thus, the system has no solution (even though the system has more unknowns than equations). 3.13. Solve using the condensed format: 2y þ 3z ¼ 3 x þ y þ z ¼ 4 4x þ 8y 3z ¼ 35 The condensed format follows: Number Equation Operation ð2Þ ð 1 =Þ 2y þ 3z ¼ 3 L1 $ L2 ð1Þ ð 2 =Þ x þ y þ z ¼ 4 L1 $ L2 ð3Þ 4x þ 8y 3z ¼ 35 ð30 Þ 4y 7z ¼ 19 Replace L3 by 4L1 þ L3 ð300 Þ 13z ¼ 13 Replace L3 by 2L2 þ L3 Here (1), (2), and (300 ) form a triangular system. (We emphasize that the interchange of L1 and L2 is accomplished by simply renumbering L1 and L2 as above.) Using back-substitution with the triangular system yields z ¼ 1 from L3, y ¼ 3 from L2, and x ¼ 2 from L1. Thus, the unique solution of the system is x ¼ 2, y ¼ 3, z ¼ 1 or the triple u ¼ ð2; 3; 1Þ. 3.14. Consider the system x þ 2y þ z ¼ 3 ay þ 5z ¼ 10 2x þ 7y þ az ¼ b (a) Find those values of a for which the system has a unique solution. (b) Find those pairs of values ða; bÞ for which the system has more than one solution. Reduce the system to echelon form. That is, eliminate x from the third equation by the operation ‘‘Replace L3 by 2L1 þ L3’’ and then eliminate y from the third equation by the operation CHAPTER 3 Systems of Linear Equations 93
  • 101. ‘‘Replace L3 by 3L2 þ aL3.’’ This yields x þ 2y þ z ¼ 3 ay þ 5z ¼ 10 3y þ ða 2Þz ¼ b 6 and then x þ 2y þ z ¼ 3 ay þ 5z ¼ 10 ða2 2a 15Þz ¼ ab 6a 30 Examine the last equation ða2 2a 15Þz ¼ ab 6a 30. (a) The system has a unique solution if and only if the coefficient of z is not zero; that is, if a2 2a 15 ¼ ða 5Þða þ 3Þ 6¼ 0 or a 6¼ 5 and a 6¼ 3: (b) The system has more than one solution if both sides are zero. The left-hand side is zero when a ¼ 5 or a ¼ 3. When a ¼ 5, the right-hand side is zero when 5b 60 ¼ 0, or b ¼ 12. When a ¼ 3, the right- hand side is zero when 3b 12 ¼ 0, or b ¼ 4. Thus, ð5; 12Þ and ð3; 4Þ are the pairs for which the system has more than one solution. Echelon Matrices, Row Equivalence, Row Canonical Form 3.15. Row reduce each of the following matrices to echelon form: (a) A ¼ 1 2 3 0 2 4 2 2 3 6 4 3 2 4 3 5; (b) B ¼ 4 1 6 1 2 5 6 3 4 2 4 3 5 (a) Use a11 ¼ 1 as a pivot to obtain 0’s below a11; that is, apply the row operations ‘‘Replace R2 by 2R1 þ R2’’ and ‘‘Replace R3 by 3R1 þ R3:’’ Then use a23 ¼ 4 as a pivot to obtain a 0 below a23; that is, apply the row operation ‘‘Replace R3 by 5R2 þ 4R3.’’ These operations yield A 1 2 3 0 0 0 4 2 0 0 5 3 2 4 3 5 1 2 3 0 0 0 4 2 0 0 0 2 2 4 3 5 The matrix is now in echelon form. (b) Hand calculations are usually simpler if the pivot element equals 1. Therefore, first interchange R1 and R2. Next apply the operations ‘‘Replace R2 by 4R1 þ R2’’ and ‘‘Replace R3 by 6R1 þ R3’’; and then apply the operation ‘‘Replace R3 by R2 þ R3.’’ These operations yield B 1 2 5 4 1 6 6 3 4 2 4 3 5 1 2 5 0 9 26 0 9 26 2 4 3 5 1 2 5 0 9 26 0 0 0 2 4 3 5 The matrix is now in echelon form. 3.16. Describe the pivoting row-reduction algorithm. Also describe the advantages, if any, of using this pivoting algorithm. The row-reduction algorithm becomes a pivoting algorithm if the entry in column j of greatest absolute value is chosen as the pivot a1j1 and if one uses the row operation ðaij1 =a1j1 ÞR1 þ Ri ! Ri The main advantage of the pivoting algorithm is that the above row operation involves division by the (current) pivot a1j1 , and, on the computer, roundoff errors may be substantially reduced when one divides by a number as large in absolute value as possible. 3.17. Let A ¼ 2 2 2 1 3 6 0 1 1 7 10 2 2 4 3 5. Reduce A to echelon form using the pivoting algorithm. 94 CHAPTER 3 Systems of Linear Equations
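Before working through the pivoting reduction of Problem 3.17 below, note that the parameter analysis of Problem 3.14 can be checked symbolically. The SymPy sketch that follows is an illustration, not part of the text: it repeats the two row operations of the solution above on the augmented matrix and confirms that the last equation becomes (a^2 - 2a - 15)z = ab - 6a - 30, giving a unique solution exactly when a ≠ 5 and a ≠ -3, and more than one solution for the pairs (5, 12) and (-3, -4).

    import sympy as sp

    a, b = sp.symbols('a b')

    # Augmented matrix of Problem 3.14
    M = sp.Matrix([[1, 2, 1, 3],
                   [0, a, 5, 10],
                   [2, 7, a, b]])

    M[2, :] = M[2, :] - 2 * M[0, :]        # Replace L3 by -2L1 + L3
    M[2, :] = a * M[2, :] - 3 * M[1, :]    # Replace L3 by -3L2 + a*L3

    c = sp.expand(M[2, 2])                 # coefficient of z in the last equation
    print(c, '=', sp.factor(c))            # a**2 - 2*a - 15 = (a - 5)*(a + 3)
    print(sp.solve(c, a))                  # [-3, 5]: no unique solution for these a

    rhs = sp.expand(M[2, 3])               # right-hand side a*b - 6*a - 30
    print(sp.solve(rhs.subs(a, 5), b))     # [12]
    print(sp.solve(rhs.subs(a, -3), b))    # [-4]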
  • 102. First interchange R1 and R2 so that 3 can be used as the pivot, and then apply the operations ‘‘Replace R2 by 2 3 R1 þ R2’’ and ‘‘Replace R3 by 1 3 R1 þ R3.’’ These operations yield A 3 6 0 1 2 2 2 1 1 7 10 2 2 4 3 5 3 6 0 1 0 2 2 1 3 0 5 10 5 3 2 6 4 3 7 5 Now interchange R2 and R3 so that 5 can be used as the pivot, and then apply the operation ‘‘Replace R3 by 2 5 R2 þ R3.’’ We obtain A 3 6 0 1 0 5 10 5 3 0 2 2 1 3 2 4 3 5 3 6 0 1 0 5 10 5 3 0 0 6 1 2 4 3 5 The matrix has been brought to echelon form using partial pivoting. 3.18. Reduce each of the following matrices to row canonical form: (a) A ¼ 2 2 1 6 4 4 4 1 10 13 8 8 1 26 23 2 4 3 5; (b) B ¼ 5 9 6 0 2 3 0 0 7 2 4 3 5 (a) First reduce A to echelon form by applying the operations ‘‘Replace R2 by 2R1 þ R2’’ and ‘‘Replace R3 by 4R1 þ R3,’’ and then applying the operation ‘‘Replace R3 by R2 þ R3.’’ These operations yield A 2 2 1 6 4 0 0 3 2 5 0 0 3 2 7 2 4 3 5 2 2 1 6 4 0 0 3 2 5 0 0 0 4 2 2 4 3 5 Now use back-substitution on the echelon matrix to obtain the row canonical form of A. Specifically, first multiply R3 by 1 4 to obtain the pivot a34 ¼ 1, and then apply the operations ‘‘Replace R2 by 2R3 þ R2’’ and ‘‘Replace R1 by 6R3 þ R1.’’ These operations yield A 2 2 1 6 4 0 0 3 2 5 0 0 0 1 1 2 2 4 3 5 2 2 1 0 1 0 0 3 0 6 0 0 0 1 1 2 2 4 3 5 Now multiply R2 by 1 3 , making the pivot a23 ¼ 1, and then apply ‘‘Replace R1 by R2 þ R1,’’ yielding A 2 2 1 0 1 0 0 1 0 2 0 0 0 1 1 2 2 4 3 5 2 2 0 0 3 0 0 1 0 2 0 0 0 1 1 2 2 4 3 5 Finally, multiply R1 by 1 2 , so the pivot a11 ¼ 1. Thus, we obtain the following row canonical form of A: A 1 1 0 0 3 2 0 0 1 0 2 0 0 0 1 1 2 2 4 3 5 (b) Because B is in echelon form, use back-substitution to obtain B 5 9 6 0 2 3 0 0 1 2 6 4 3 7 5 5 9 0 0 2 0 0 0 1 2 6 4 3 7 5 5 9 0 0 1 0 0 0 1 2 6 4 3 7 5 5 0 0 0 1 0 0 0 1 2 6 4 3 7 5 1 0 0 0 1 0 0 0 1 2 6 4 3 7 5 The last matrix, which is the identity matrix I, is the row canonical form of B. (This is expected, because B is invertible, and so its row canonical form must be I.) 3.19. Describe the Gauss–Jordan elimination algorithm, which also row reduces an arbitrary matrix A to its row canonical form. CHAPTER 3 Systems of Linear Equations 95
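The pivoting strategy of Problems 3.16 and 3.17 can be written as a short routine. The NumPy sketch below is illustrative (the function name is mine, and it is not claimed to be the book's exact algorithm): in each column it takes the entry of greatest absolute value as the pivot, interchanges rows, and eliminates below. Run on the matrix of Problem 3.17, it reproduces the echelon form with pivots -3, -5, 6 obtained above.

    import numpy as np

    def echelon_partial_pivot(A):
        """Row reduce A to echelon form, choosing as pivot the entry of
        largest absolute value in the current column (partial pivoting)."""
        A = A.astype(float)
        m, n = A.shape
        r = 0                                       # current pivot row
        for j in range(n):
            p = r + np.argmax(np.abs(A[r:, j]))     # row holding the largest |entry|
            if np.isclose(A[p, j], 0.0):
                continue                            # no pivot in this column
            A[[r, p]] = A[[p, r]]                   # interchange Rr and Rp
            for i in range(r + 1, m):
                A[i, j:] -= (A[i, j] / A[r, j]) * A[r, j:]
            r += 1
            if r == m:
                break
        return A

    A = np.array([[2, -2, 2, 1], [-3, 6, 0, -1], [1, -7, 10, 2]])
    print(echelon_partial_pivot(A))
    # rows (-3, 6, 0, -1), (0, -5, 10, 5/3), (0, 0, 6, 1), as in the reduction above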
  • 103. The Gauss–Jordan algorithm is similar in some ways to the Gaussian elimination algorithm, except that here each pivot is used to place 0’s both below and above the pivot, not just below the pivot, before working with the next pivot. Also, one variation of the algorithm first normalizes each row—that is, obtains a unit pivot—before it is used to produce 0’s in the other rows, rather than normalizing the rows at the end of the algorithm. 3.20. Let A ¼ 1 2 3 1 2 1 1 4 1 3 2 5 9 2 8 2 4 3 5. Use Gauss–Jordan to find the row canonical form of A. Use a11 ¼ 1 as a pivot to obtain 0’s below a11 by applying the operations ‘‘Replace R2 by R1 þ R2’’ and ‘‘Replace R3 by 2R1 þ R3.’’ This yields A 1 2 3 1 2 0 3 1 2 1 0 9 3 4 4 2 4 3 5 Multiply R2 by 1 3 to make the pivot a22 ¼ 1, and then produce 0’s below and above a22 by applying the operations ‘‘Replace R3 by 9R2 þ R3’’ and ‘‘Replace R1 by 2R2 þ R1.’’ These operations yield A 1 2 3 1 2 0 1 1 3 2 3 1 3 0 9 3 4 4 2 6 6 4 3 7 7 5 1 0 11 3 1 3 8 3 0 1 1 3 2 3 1 3 0 0 0 2 1 2 6 6 4 3 7 7 5 Finally, multiply R3 by 1 2 to make the pivot a34 ¼ 1, and then produce 0’s above a34 by applying the operations ‘‘Replace R2 by 2 3 R3 þ R2’’ and ‘‘Replace R1 by 1 3 R3 þ R1.’’ These operations yield A 1 0 11 3 1 3 8 3 0 1 1 3 2 3 1 3 0 0 0 1 1 2 2 6 6 4 3 7 7 5 1 0 11 3 0 17 6 0 1 1 3 0 2 3 0 0 0 1 1 2 2 6 6 4 3 7 7 5 which is the row canonical form of A. Systems of Linear Equations in Matrix Form 3.21. Find the augmented matrix M and the coefficient matrix A of the following system: x þ 2y 3z ¼ 4 3y 4z þ 7x ¼ 5 6z þ 8x 9y ¼ 1 First align the unknowns in the system, and then use the aligned system to obtain M and A. We have x þ 2y 3z ¼ 4 7x þ 3y 4z ¼ 5 8x 9y þ 6z ¼ 1 ; then M ¼ 1 2 3 4 7 3 4 5 8 9 6 1 2 4 3 5 and A ¼ 1 2 3 7 3 4 8 9 6 2 4 3 5 3.22. Solve each of the following systems using its augmented matrix M: x þ 2y z ¼ 3 x þ 3y þ z ¼ 5 3x þ 8y þ 4z ¼ 17 (a) x 2y þ 4z ¼ 2 2x 3y þ 5z ¼ 3 3x 4y þ 6z ¼ 7 (b) x þ y þ 3z ¼ 1 2x þ 3y z ¼ 3 5x þ 7y þ z ¼ 7 (c) (a) Reduce the augmented matrix M to echelon form as follows: M ¼ 1 2 1 3 1 3 1 5 3 8 4 17 2 4 3 5 1 2 1 3 0 1 2 2 0 2 7 8 2 4 3 5 1 2 1 3 0 1 2 2 0 0 3 4 2 4 3 5 96 CHAPTER 3 Systems of Linear Equations
  • 104. Now write down the corresponding triangular system x þ 2y z ¼ 3 y þ 2z ¼ 2 3z ¼ 4 and solve by back-substitution to obtain the unique solution x ¼ 17 3 ; y ¼ 2 3 ; z ¼ 4 3 or u ¼ ð17 3 ; 2 3 ; 4 3Þ Alternately, reduce the echelon form of M to row canonical form, obtaining M 1 2 1 3 0 1 2 2 0 0 1 4 3 2 6 6 4 3 7 7 5 1 2 0 13 3 0 1 0 2 3 0 0 1 4 3 2 6 6 4 3 7 7 5 1 0 0 17 3 0 1 0 2 3 0 0 1 4 3 2 6 6 4 3 7 7 5 This also corresponds to the above solution. (b) First reduce the augmented matrix M to echelon form as follows: M ¼ 1 2 4 2 2 3 5 3 3 4 6 7 2 4 3 5 1 2 4 2 0 1 3 1 0 2 6 1 2 4 3 5 1 2 4 2 0 1 3 1 0 0 0 3 2 4 3 5 The third row corresponds to the degenerate equation 0x þ 0y þ 0z ¼ 3, which has no solution. Thus, ‘‘DO NOT CONTINUE.’’ The original system also has no solution. (Note that the echelon form indicates whether or not the system has a solution.) (c) Reduce the augmented matrix M to echelon form and then to row canonical form: M ¼ 1 1 3 1 2 3 1 3 5 7 1 7 2 4 3 5 1 1 3 1 0 1 7 1 0 2 14 2 2 4 3 5 1 0 10 0 0 1 7 1 (The third row of the second matrix is deleted, because it is a multiple of the second row and will result in a zero row.) Write down the system corresponding to the row canonical form of M and then transfer the free variables to the other side to obtain the free-variable form of the solution: x þ 10z ¼ 0 y 7z ¼ 1 and x ¼ 10z y ¼ 1 þ 7z Here z is the only free variable. The parametric solution, using z ¼ a, is as follows: x ¼ 10a; y ¼ 1 þ 7a; z ¼ a or u ¼ ð10a; 1 þ 7a; aÞ 3.23. Solve the following system using its augmented matrix M: x1 þ 2x2 3x3 2x4 þ 4x5 ¼ 1 2x1 þ 5x2 8x3 x4 þ 6x5 ¼ 4 x1 þ 4x2 7x3 þ 5x4 þ 2x5 ¼ 8 Reduce the augmented matrix M to echelon form and then to row canonical form: M ¼ 1 2 3 2 4 1 2 5 8 1 6 4 1 4 7 5 2 8 2 6 4 3 7 5 1 2 3 2 4 1 0 1 2 3 2 2 0 2 4 7 2 7 2 6 4 3 7 5 1 2 3 2 4 1 0 1 2 3 2 2 0 0 0 1 2 3 2 6 4 3 7 5 1 2 3 0 8 7 0 1 2 0 8 7 0 0 0 1 2 3 2 6 4 3 7 5 1 0 1 0 24 21 0 1 2 0 8 7 0 0 0 1 2 3 2 6 4 3 7 5 Write down the system corresponding to the row canonical form of M and then transfer the free variables to the other side to obtain the free-variable form of the solution: x1 þ x3 þ 24x5 ¼ 21 x2 2x3 8x5 ¼ 7 x4 þ 2x5 ¼ 3 and x1 ¼ 21 x3 24x5 x2 ¼ 7 þ 2x3 þ 8x5 x4 ¼ 3 2x5 CHAPTER 3 Systems of Linear Equations 97
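The reductions in Problems 3.22 and 3.23 can be reproduced in exact arithmetic with SymPy, whose rref method returns both the row canonical form and the pivot columns. The sketch below is an illustration, not part of the text; it is run on the augmented matrix of Problem 3.23 and prints each pivot variable in terms of the free variables x3 and x5, matching the free-variable form displayed above.

    import sympy as sp

    # Augmented matrix M of Problem 3.23
    M = sp.Matrix([[1, 2, -3, -2, 4, 1],
                   [2, 5, -8, -1, 6, 4],
                   [1, 4, -7, 5, 2, 8]])

    R, pivots = M.rref()              # row canonical form and pivot column indices
    print(pivots)                     # (0, 1, 3): pivot variables x1, x2, x4

    x = sp.symbols('x1:6')            # x1, ..., x5
    for row, j in enumerate(pivots):  # solve each pivot variable for the free ones
        expr = R[row, 5] - sum(R[row, k] * x[k] for k in range(5) if k not in pivots)
        print(sp.Eq(x[j], sp.expand(expr)))
    # x1 = 21 - x3 - 24*x5,  x2 = -7 + 2*x3 + 8*x5,  x4 = 3 - 2*x5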
  • 105. Here x1; x2; x4 are the pivot variables and x3 and x5 are the free variables. Recall that the parametric form of the solution can be obtained from the free-variable form of the solution by simply setting the free variables equal to parameters, say x3 ¼ a, x5 ¼ b. This process yields x1 ¼ 21 a 24b; x2 ¼ 7 þ 2a þ 8b; x3 ¼ a; x4 ¼ 3 2b; x5 ¼ b or u ¼ ð21 a 24b; 7 þ 2a þ 8b; a; 3 2b; bÞ which is another form of the solution. Linear Combinations, Homogeneous Systems 3.24. Write v as a linear combination of u1; u2; u3, where (a) v ¼ ð3; 10; 7Þ and u1 ¼ ð1; 3; 2Þ; u2 ¼ ð1; 4; 2Þ; u3 ¼ ð2; 8; 1Þ; (b) v ¼ ð2; 7; 10Þ and u1 ¼ ð1; 2; 3Þ, u2 ¼ ð1; 3; 5Þ, u3 ¼ ð1; 5; 9Þ; (c) v ¼ ð1; 5; 4Þ and u1 ¼ ð1; 3; 2Þ, u2 ¼ ð2; 7; 1Þ, u3 ¼ ð1; 6; 7Þ. Find the equivalent system of linear equations by writing v ¼ xu1 þ yu2 þ zu3. Alternatively, use the augmented matrix M of the equivalent system, where M ¼ ½u1; u2; u3; v. (Here u1; u2; u3; v are the columns of M.) (a) The vector equation v ¼ xu1 þ yu2 þ zu3 for the given vectors is as follows: 3 10 7 2 4 3 5 ¼ x 1 3 2 2 4 3 5 þ y 1 4 2 2 4 3 5 þ z 2 8 1 2 4 3 5 ¼ x þ y þ 2z 3x þ 4y þ 8z 2x þ 2y þ z 2 4 3 5 Form the equivalent system of linear equations by setting corresponding entries equal to each other, and then reduce the system to echelon form: x þ y þ 2z ¼ 3 3x þ 4y þ 8z ¼ 10 2x þ 2y þ z ¼ 7 or x þ y þ 2z ¼ 3 y þ 2z ¼ 1 4y þ 5z ¼ 13 or x þ y þ 2z ¼ 3 y þ 2z ¼ 1 3z ¼ 9 The system is in triangular form. Back-substitution yields the unique solution x ¼ 2, y ¼ 7, z ¼ 3. Thus, v ¼ 2u1 þ 7u2 3u3. Alternatively, form the augmented matrix M ¼ [u1; u2; u3; v] of the equivalent system, and reduce M to echelon form: M ¼ 1 1 2 3 3 4 8 10 2 2 1 7 2 4 3 5 1 1 2 3 0 1 2 1 0 4 5 13 2 4 3 5 1 1 2 3 0 1 2 1 0 0 3 9 2 4 3 5 The last matrix corresponds to a triangular system that has a unique solution. Back-substitution yields the solution x ¼ 2, y ¼ 7, z ¼ 3. Thus, v ¼ 2u1 þ 7u2 3u3. (b) Form the augmented matrix M ¼ ½u1; u2; u3; v of the equivalent system, and reduce M to the echelon form: M ¼ 1 1 1 2 2 3 5 7 3 5 9 10 2 4 3 5 1 1 1 2 0 1 3 3 0 2 6 4 2 4 3 5 1 1 1 2 0 1 3 3 0 0 0 2 2 4 3 5 The third row corresponds to the degenerate equation 0x þ 0y þ 0z ¼ 2, which has no solution. Thus, the system also has no solution, and v cannot be written as a linear combination of u1; u2; u3. (c) Form the augmented matrix M ¼ ½u1; u2; u3; v of the equivalent system, and reduce M to echelon form: M ¼ 1 2 1 1 3 7 6 5 2 1 7 4 2 4 3 5 1 2 1 1 0 1 3 2 0 3 9 6 2 4 3 5 1 2 1 1 0 1 3 2 0 0 0 0 2 4 3 5 98 CHAPTER 3 Systems of Linear Equations
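Writing v as a linear combination of u1, u2, u3, as in Problem 3.24, amounts to solving the system whose coefficient columns are u1, u2, u3. The NumPy check below is illustrative and not part of the text; the vectors are those of part (a), with the signs reconstructed from the worked solution above.

    import numpy as np

    u1 = np.array([1, 3, -2])
    u2 = np.array([1, 4, 2])
    u3 = np.array([2, 8, 1])
    v  = np.array([3, 10, 7])

    A = np.column_stack([u1, u2, u3])                 # columns are u1, u2, u3
    x, y, z = np.linalg.solve(A, v)                   # v = x*u1 + y*u2 + z*u3
    print(x, y, z)                                    # 2.0 7.0 -3.0, so v = 2u1 + 7u2 - 3u3
    print(np.allclose(x * u1 + y * u2 + z * u3, v))   # True

    # When u1, u2, u3 are dependent, np.linalg.solve raises LinAlgError instead;
    # that is the situation behind the "no solution" and "many solutions" cases
    # in parts (b) and (c).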
  • 106. The last matrix corresponds to the following system with free variable z: x þ 2y þ z ¼ 1 y þ 3z ¼ 2 Thus, v can be written as a linear combination of u1; u2; u3 in many ways. For example, let the free variable z ¼ 1, and, by back-substitution, we get y ¼ 2 and x ¼ 2. Thus, v ¼ 2u1 2u2 þ u3. 3.25. Let u1 ¼ ð1; 2; 4Þ, u2 ¼ ð2; 3; 1Þ, u3 ¼ ð2; 1; 1Þ in R3 . Show that u1; u2; u3 are orthogonal, and write v as a linear combination of u1; u2; u3, where (a) v ¼ ð7; 16; 6Þ, (b) v ¼ ð3; 5; 2Þ. Take the dot product of pairs of vectors to get u1 u2 ¼ 2 6 þ 4 ¼ 0; u1 u3 ¼ 2 þ 2 4 ¼ 0; u2 u3 ¼ 4 3 1 ¼ 0 Thus, the three vectors in R3 are orthogonal, and hence Fourier coefficients can be used. That is, v ¼ xu1 þ yu2 þ zu3, where x ¼ v u1 u1 u1 ; y ¼ v u2 u2 u2 ; z ¼ v u3 u3 u3 (a) We have x ¼ 7 þ 32 þ 24 1 þ 4 þ 16 ¼ 63 21 ¼ 3; y ¼ 14 48 þ 6 4 þ 9 þ 1 ¼ 28 14 ¼ 2; z ¼ 14 þ 16 6 4 þ 1 þ 1 ¼ 24 6 ¼ 4 Thus, v ¼ 3u1 2u2 þ 4u3. (b) We have x ¼ 3 þ 10 þ 8 1 þ 4 þ 16 ¼ 21 21 ¼ 1; y ¼ 6 15 þ 2 4 þ 9 þ 1 ¼ 7 14 ¼ 1 2 ; z ¼ 6 þ 5 2 4 þ 1 þ 1 ¼ 9 6 ¼ 3 2 Thus, v ¼ u1 1 2 u2 þ 3 2 u3. 3.26. Find the dimension and a basis for the general solution W of each of the following homogeneous systems: 2x1 þ 4x2 5x3 þ 3x4 ¼ 0 3x1 þ 6x2 7x3 þ 4x4 ¼ 0 5x1 þ 10x2 11x3 þ 6x4 ¼ 0 (a) x 2y 3z ¼ 0 2x þ y þ 3z ¼ 0 3x 4y 2z ¼ 0 (b) (a) Reduce the system to echelon form using the operations ‘‘Replace L2 by 3L1 þ 2L2,’’ ‘‘Replace L3 by 5L1 þ 2L3,’’ and then ‘‘Replace L3 by 2L2 þ L3.’’ These operations yield 2x1 þ 4x2 5x3 þ 3x4 ¼ 0 x3 x4 ¼ 0 3x3 3x4 ¼ 0 and 2x1 þ 4x2 5x3 þ 3x4 ¼ 0 x3 x4 ¼ 0 The system in echelon form has two free variables, x2 and x4, so dim W ¼ 2. A basis ½u1; u2 for W may be obtained as follows: (1) Set x2 ¼ 1, x4 ¼ 0. Back-substitution yields x3 ¼ 0, and then x1 ¼ 2. Thus, u1 ¼ ð2; 1; 0; 0Þ. (2) Set x2 ¼ 0, x4 ¼ 1. Back-substitution yields x3 ¼ 1, and then x1 ¼ 1. Thus, u2 ¼ ð1; 0; 1; 1Þ. (b) Reduce the system to echelon form, obtaining x 2y 3z ¼ 0 5y þ 9z ¼ 0 2y þ 7z ¼ 0 and x 2y 3z ¼ 0 5y þ 9z ¼ 0 17z ¼ 0 There are no free variables (the system is in triangular form). Hence, dim W ¼ 0, and W has no basis. Specifically, W consists only of the zero solution; that is, W ¼ f0g. 3.27. Find the dimension and a basis for the general solution W of the following homogeneous system using matrix notation: x1 þ 2x2 þ 3x3 2x4 þ 4x5 ¼ 0 2x1 þ 4x2 þ 8x3 þ x4 þ 9x5 ¼ 0 3x1 þ 6x2 þ 13x3 þ 4x4 þ 14x5 ¼ 0 Show how the basis gives the parametric form of the general solution of the system. When a system is homogeneous, we represent the system by its coefficient matrix A rather than by its CHAPTER 3 Systems of Linear Equations 99
  • 107. augmented matrix M, because the last column of the augmented matrix M is a zero column, and it will remain a zero column during any row-reduction process. Reduce the coefficient matrix A to echelon form, obtaining A ¼ 1 2 3 2 4 2 4 8 1 9 3 6 13 4 14 2 4 3 5 1 2 3 2 4 0 0 2 5 1 0 0 4 10 2 2 4 3 5 1 2 3 2 4 0 0 2 5 1 (The third row of the second matrix is deleted, because it is a multiple of the second row and will result in a zero row.) We can now proceed in one of two ways. (a) Write down the corresponding homogeneous system in echelon form: x1 þ 2x2 þ 3x3 2x4 þ 4x5 ¼ 0 2x3 þ 5x4 þ x5 ¼ 0 The system in echelon form has three free variables, x2; x4; x5, so dim W ¼ 3. A basis ½u1; u2; u3 for W may be obtained as follows: (1) Set x2 ¼ 1, x4 ¼ 0, x5 ¼ 0. Back-substitution yields x3 ¼ 0, and then x1 ¼ 2. Thus, u1 ¼ ð2; 1; 0; 0; 0Þ. (2) Set x2 ¼ 0, x4 ¼ 1, x5 ¼ 0. Back-substitution yields x3 ¼ 5 2, and then x1 ¼ 19 2 . Thus, u2 ¼ ð19 2 ; 0; 5 2 ; 1; 0Þ. (3) Set x2 ¼ 0, x4 ¼ 0, x5 ¼ 1. Back-substitution yields x3 ¼ 1 2, and then x1 ¼ 5 2. Thus, u3 ¼ ð 5 2, 0, 1 2 ; 0; 1Þ. [One could avoid fractions in the basis by choosing x4 ¼ 2 in (2) and x5 ¼ 2 in (3), which yields multiples of u2 and u3.] The parametric form of the general solution is obtained from the following linear combination of the basis vectors using parameters a; b; c: au1 þ bu2 þ cu3 ¼ ð2a þ 19 2 b 5 2 c; a; 5 2 b 1 2 c; b; cÞ (b) Reduce the echelon form of A to row canonical form: A 1 2 3 2 4 0 0 1 5 2 1 2 # 1 2 3 19 2 5 2 0 0 1 5 2 1 2 # Write down the corresponding free-variable solution: x1 ¼ 2x2 þ 19 2 x4 5 2 x5 x3 ¼ 5 2 x4 1 2 x5 Using these equations for the pivot variables x1 and x3, repeat the above process to obtain a basis ½u1; u2; u3 for W. That is, set x2 ¼ 1, x4 ¼ 0, x5 ¼ 0 to get u1; set x2 ¼ 0, x4 ¼ 1, x5 ¼ 0 to get u2; and set x2 ¼ 0, x4 ¼ 0, x5 ¼ 1 to get u3. 3.28. Prove Theorem 3.15. Let v0 be a particular solution of AX ¼ B, and let W be the general solution of AX ¼ 0. Then U ¼ v0 þ W ¼ fv0 þ w : w 2 Wg is the general solution of AX ¼ B. Let w be a solution of AX ¼ 0. Then Aðv0 þ wÞ ¼ Av0 þ Aw ¼ B þ 0 ¼ B Thus, the sum v0 þ w is a solution of AX ¼ B. On the other hand, suppose v is also a solution of AX ¼ B. Then Aðv v0Þ ¼ Av Av0 ¼ B B ¼ 0 Therefore, v v0 belongs to W. Because v ¼ v0 þ ðv v0Þ, we find that any solution of AX ¼ B can be obtained by adding a solution of AX ¼ 0 to a solution of AX ¼ B. Thus, the theorem is proved. 100 CHAPTER 3 Systems of Linear Equations
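The basis of the solution space W found in Problem 3.27 can be checked with SymPy, whose nullspace method carries out exactly the construction used above: one basis vector per free variable, obtained by setting that free variable to 1 and the remaining free variables to 0 in the row canonical form. The sketch is illustrative, not part of the text.

    import sympy as sp

    A = sp.Matrix([[1, 2, 3, -2, 4],
                   [2, 4, 8, 1, 9],
                   [3, 6, 13, 4, 14]])

    basis = A.nullspace()                 # one vector per free variable
    print(len(basis))                     # 3, so dim W = 3
    for u in basis:
        print(list(u))                    # printed as a list of entries
        assert A * u == sp.zeros(3, 1)    # each basis vector solves AX = 0

    # The vectors are (-2, 1, 0, 0, 0), (19/2, 0, -5/2, 1, 0), (-5/2, 0, -1/2, 0, 1),
    # corresponding to the free variables x2, x4, x5, i.e. u1, u2, u3 of Problem 3.27.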
  • 108. Elementary Matrices, Applications 3.29. Let e1; e2; e3 denote, respectively, the elementary row operations ‘‘Interchange rows R1 and R2; ’’ ‘‘Replace R3 by 7R3; ’’ ‘‘Replace R2 by 3R1 þ R2’’ Find the corresponding three-square elementary matrices E1; E2; E3. Apply each operation to the 3 3 identity matrix I3 to obtain E1 ¼ 0 1 0 1 0 0 0 0 1 2 4 3 5; E2 ¼ 1 0 0 0 1 0 0 0 7 2 4 3 5; E3 ¼ 1 0 0 3 1 0 0 0 1 2 4 3 5 3.30. Consider the elementary row operations in Problem 3.29. (a) Describe the inverse operations e1 1 , e1 2 , e1 3 . (b) Find the corresponding three-square elementary matrices E0 1, E0 2, E0 3. (c) What is the relationship between the matrices E0 1, E0 2, E0 3 and the matrices E1, E2, E3? (a) The inverses of e1, e2, e3 are, respectively, ‘‘Interchange rows R1 and R2; ’’ ‘‘Replace R3 by 1 7 R3; ’’ ‘‘Replace R2 by 3R1 þ R2:’’ (b) Apply each inverse operation to the 3 3 identity matrix I3 to obtain E0 1 ¼ 0 1 0 1 0 0 0 0 1 2 4 3 5; E0 2 ¼ 1 0 0 0 1 0 0 0 1 7 2 4 3 5; E0 3 ¼ 1 0 0 3 1 0 0 0 1 2 4 3 5 (c) The matrices E0 1, E0 2, E0 3 are, respectively, the inverses of the matrices E1, E2, E3. 3.31. Write each of the following matrices as a product of elementary matrices: (a) A ¼ 1 3 2 4 ; (b) B ¼ 1 2 3 0 1 4 0 0 1 2 4 3 5; (c) C ¼ 1 1 2 2 3 8 3 1 2 2 4 3 5 The following three steps write a matrix M as a product of elementary matrices: Step 1. Row reduce M to the identity matrix I, keeping track of the elementary row operations. Step 2. Write down the inverse row operations. Step 3. Write M as the product of the elementary matrices corresponding to the inverse operations. This gives the desired result. If a zero row appears in Step 1, then M is not row equivalent to the identity matrix I, and M cannot be written as a product of elementary matrices. (a) (1) We have A ¼ 1 3 2 4 1 3 0 2 1 3 0 1 1 0 0 1 ¼ I where the row operations are, respectively, ‘‘Replace R2 by 2R1 þ R2; ’’ ‘‘Replace R2 by 1 2 R2; ’’ ‘‘Replace R1 by 3R2 þ R1’’ (2) Inverse operations: ‘‘Replace R2 by 2R1 þ R2;’’ ‘‘Replace R2 by 2R2;’’ ‘‘Replace R1 by 3R2 þ R1’’ (3) A ¼ 1 0 2 1 1 0 0 2 1 3 0 1 CHAPTER 3 Systems of Linear Equations 101
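The factorization obtained in part (a) of Problem 3.31 is easy to verify numerically before moving on to parts (b) and (c). The NumPy check below is an illustration, not part of the text; the three matrices are the elementary matrices of step (3), corresponding to the inverse operations of step (2).

    import numpy as np

    A = np.array([[1, 3], [2, 4]])

    E1 = np.array([[1, 0], [2, 1]])    # "Replace R2 by 2R1 + R2"
    E2 = np.array([[1, 0], [0, -2]])   # "Replace R2 by -2R2"
    E3 = np.array([[1, 3], [0, 1]])    # "Replace R1 by 3R2 + R1"

    print(E1 @ E2 @ E3)                       # [[1 3] [2 4]]
    print(np.array_equal(E1 @ E2 @ E3, A))    # True

    # Each Ei is invertible and its inverse is again elementary (Problem 3.33),
    # which is why A, being row equivalent to I, is such a product (Theorem 3.17).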
  • 109. (b) (1) We have B ¼ 1 2 3 0 1 4 0 0 1 2 4 3 5 1 2 0 0 1 0 0 0 1 2 4 3 5 1 0 0 0 1 0 0 0 1 2 4 3 5 ¼ I where the row operations are, respectively, ‘‘Replace R2 by 4R3 þ R2; ’’ ‘‘Replace R1 by 3R3 þ R1; ’’ ‘‘Replace R1 by 2R2 þ R1’’ (2) Inverse operations: ‘‘Replace R2 by 4R3 þ R2; ’’ ‘‘Replace R1 by 3R3 þ R1; ’’ ‘‘Replace R1 by 2R2 þ R1’’ (3) B ¼ 1 0 0 0 1 4 0 0 1 2 4 3 5 1 0 3 0 1 0 0 0 1 2 4 3 5 1 2 0 0 1 0 0 0 1 2 4 3 5 (c) (1) First row reduce C to echelon form. We have C ¼ 1 1 2 2 3 8 3 1 2 2 4 3 5 1 1 2 0 1 4 0 2 8 2 4 3 5 1 1 2 0 1 4 0 0 0 2 4 3 5 In echelon form, C has a zero row. ‘‘STOP.’’ The matrix C cannot be row reduced to the identity matrix I, and C cannot be written as a product of elementary matrices. (We note, in particular, that C has no inverse.) 3.32. Find the inverse of (a) A ¼ 1 2 4 1 1 5 2 7 3 2 4 3 5; (b) B ¼ 1 3 4 1 5 1 3 13 6 2 4 3 5. (a) Form the matrix M ¼ [A; I] and row reduce M to echelon form: M ¼ 1 2 4 1 0 0 1 1 5 0 1 0 2 7 3 0 0 1 2 6 4 3 7 5 1 2 4 1 0 0 0 1 1 1 1 0 0 3 5 2 0 1 2 6 4 3 7 5 1 2 4 1 0 0 0 1 1 1 1 0 0 0 2 5 3 1 2 6 4 3 7 5 In echelon form, the left half of M is in triangular form; hence, A has an inverse. Further reduce M to row canonical form: M 1 2 0 9 6 2 0 1 0 7 2 5 2 1 2 0 0 1 5 2 3 2 1 2 2 6 6 4 3 7 7 5 1 0 0 16 11 3 0 1 0 7 2 5 2 1 2 0 0 1 5 2 3 2 1 2 2 6 6 4 3 7 7 5 The final matrix has the form ½I; A1 ; that is, A1 is the right half of the last matrix. Thus, A1 ¼ 16 11 3 7 2 5 2 1 2 5 2 3 2 1 2 2 6 6 4 3 7 7 5 (b) Form the matrix M ¼ ½B; I and row reduce M to echelon form: M ¼ 1 3 4 1 0 0 1 5 1 0 1 0 3 13 6 0 0 1 2 4 3 5 1 3 4 1 0 0 0 2 3 1 1 0 0 4 6 3 0 1 2 4 3 5 1 3 4 1 0 0 0 2 3 1 1 0 0 0 0 1 2 1 2 4 3 5 In echelon form, M has a zero row in its left half; that is, B is not row reducible to triangular form. Accordingly, B has no inverse. 102 CHAPTER 3 Systems of Linear Equations
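The inverse found in Problem 3.32(a) can be confirmed in exact arithmetic by the same [A, I] device. The SymPy sketch below is illustrative rather than part of the text; A is the matrix of part (a), with signs as reconstructed from the reduction above.

    import sympy as sp

    A = sp.Matrix([[1, 2, -4],
                   [-1, -1, 5],
                   [2, 7, -3]])

    M = A.row_join(sp.eye(3))          # M = [A, I]
    R, _ = M.rref()                    # row canonical form of M
    A_inv = R[:, 3:]                   # right half is A^(-1)
    print(A_inv)
    # rows (-16, -11, 3), (7/2, 5/2, -1/2), (-5/2, -3/2, 1/2), as in the solution above

    print(A_inv == A.inv())            # True
    print(A * A_inv == sp.eye(3))      # True

    # For a singular matrix such as B in part (b), rref leaves a zero row in the
    # left half, and A.inv() would raise an error instead.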
  • 110. 3.33. Show that every elementary matrix E is invertible, and its inverse is an elementary matrix. Let E be the elementary matrix corresponding to the elementary operation e; that is, eðIÞ ¼ E. Let e0 be the inverse operation of e and let E0 be the corresponding elementary matrix; that is, e0 ðIÞ ¼ E0 . Then I ¼ e0 ðeðIÞÞ ¼ e0 ðEÞ ¼ E0 E and I ¼ eðe0 ðIÞÞ ¼ eðE0 Þ ¼ EE0 Therefore, E0 is the inverse of E. 3.34. Prove Theorem 3.16: Let e be an elementary row operation and let E be the corresponding m-square elementary matrix; that is, E ¼ eðIÞ. Then eðAÞ ¼ EA, where A is any m n matrix. Let Ri be the row i of A; we denote this by writing A ¼ ½R1; . . . ; Rm. If B is a matrix for which AB is defined then AB ¼ ½R1B; . . . ; RmB. We also let ei ¼ ð0; . . . ; 0; ^ 1; 0; . . . ; 0Þ; ^¼ i Here ^¼ i means 1 is the ith entry. One can show (Problem 2.45) that eiA ¼ Ri. We also note that I ¼ ½e1; e2; . . . ; em is the identity matrix. (i) Let e be the elementary row operation ‘‘Interchange rows Ri and Rj.’’ Then, for ^¼ i and ^ ^ ¼ j, E ¼ eðIÞ ¼ ½e1; . . . ; b ej; . . . ; b b ei; . . . ; em and eðAÞ ¼ ½R1; . . . ; b Rj; . . . ; b b Ri; . . . ; Rm Thus, EA ¼ ½e1A; . . . ; c ejA; . . . ; c c eiA; . . . ; emA ¼ ½R1; . . . ; b Rj; . . . ; b b Ri; . . . ; Rm ¼ eðAÞ (ii) Let e be the elementary row operation ‘‘Replace Ri by kRi ðk 6¼ 0Þ.’’ Then, for^¼ i, E ¼ eðIÞ ¼ ½e1; . . . ; b kei; . . . ; em and eðAÞ ¼ ½R1; . . . ; c kRi; . . . ; Rm Thus, EA ¼ ½e1A; . . . ; d keiA; . . . ; emA ¼ ½R1; . . . ; c kRi; . . . ; Rm ¼ eðAÞ (iii) Let e be the elementary row operation ‘‘Replace Ri by kRj þ Ri.’’ Then, for^¼ i, E ¼ eðIÞ ¼ ½e1; . . . ; d kej þ ei; . . . ; em and eðAÞ ¼ ½R1; . . . ; d kRj þ Ri; . . . ; Rm Using ðkej þ eiÞA ¼ kðejAÞ þ eiA ¼ kRj þ Ri, we have EA ¼ ½e1A; . . . ; ðkej þ eiÞA; . . . ; emA ¼ ½R1; . . . ; d kRj þ Ri; . . . ; Rm ¼ eðAÞ 3.35. Prove Theorem 3.17: Let A be a square matrix. Then the following are equivalent: (a) A is invertible (nonsingular). (b) A is row equivalent to the identity matrix I. (c) A is a product of elementary matrices. Suppose A is invertible and suppose A is row equivalent to matrix B in row canonical form. Then there exist elementary matrices E1; E2; . . . ; Es such that Es . . . E2E1A ¼ B. Because A is invertible and each elementary matrix is invertible, B is also invertible. But if B 6¼ I, then B has a zero row; whence B is not invertible. Thus, B ¼ I, and (a) implies (b). CHAPTER 3 Systems of Linear Equations 103
LU Factorization

3.39. Find the LU factorization of (a) A = [1, 3, 5; 2, 4, 7; 1, -2, -1]; (b) B = [1, 4, 3; 2, 8, -1; 5, 9, 7].

(a) Reduce A to triangular form by the following operations:

"Replace R2 by -2R1 + R2," "Replace R3 by -R1 + R3," and then "Replace R3 by -(5/2)R2 + R3"

These operations yield the following, where the triangular form is U:

A ~ [1, 3, 5; 0, -2, -3; 0, -5, -6] ~ [1, 3, 5; 0, -2, -3; 0, 0, 3/2] = U and L = [1, 0, 0; 2, 1, 0; 1, 5/2, 1]

The entries 2, 1, 5/2 in L are the negatives of the multipliers -2, -1, -5/2 in the above row operations. (As a check, multiply L and U to verify A = LU.)
  • 112. (b) Reduce B to triangular form by first applying the operations ‘‘Replace R2 by 2R1 þ R2’’ and ‘‘Replace R3 by 5R1 þ R3.’’ These operations yield B 1 4 3 0 0 7 0 11 8 2 4 3 5: Observe that the second diagonal entry is 0. Thus, B cannot be brought into triangular form without row interchange operations. Accordingly, B is not LU-factorable. (There does exist a PLU factorization of such a matrix B, where P is a permutation matrix, but such a factorization lies beyond the scope of this text.) 3.40. Find the LDU factorization of the matrix A in Problem 3.39. The A ¼ LDU factorization refers to the situation where L is a lower triangular matrix with 1’s on the diagonal (as in the LU factorization of A), D is a diagonal matrix, and U is an upper triangular matrix with 1’s on the diagonal. Thus, simply factor out the diagonal entries in the matrix U in the above LU factorization of A to obtain D and L. That is, L ¼ 1 0 0 2 1 0 1 5 2 1 2 4 3 5; D ¼ 1 0 0 0 2 0 0 0 3 2 2 4 3 5; U ¼ 1 3 5 0 1 3 0 0 1 2 4 3 5 3.41. Find the LU factorization of the matrix A ¼ 1 2 1 2 3 3 3 10 2 2 4 3 5. Reduce A to triangular form by the following operations: ð1Þ ‘‘Replace R2 by 2R1 þ R2; ’’ ð2Þ ‘‘Replace R3 by 3R1 þ R3; ’’ ð3Þ ‘‘Replace R3 by 4R2 þ R3’’ These operations yield the following, where the triangular form is U: A 1 2 1 0 1 1 0 4 5 2 4 3 5 1 2 1 0 1 1 0 0 1 2 4 3 5 ¼ U and L ¼ 1 0 0 2 1 0 3 4 1 2 4 3 5 The entries 2; 3; 4 in L are the negatives of the multipliers 2; 3; 4 in the above row operations. (As a check, multiply L and U to verify A ¼ LU.) 3.42. Let A be the matrix in Problem 3.41. Find X1; X2; X3, where Xi is the solution of AX ¼ Bi for (a) B1 ¼ ð1; 1; 1Þ, (b) B2 ¼ B1 þ X1, (c) B3 ¼ B2 þ X2. (a) Find L1 B1 by applying the row operations (1), (2), and then (3) in Problem 3.41 to B1: B1 ¼ 1 1 1 2 4 3 5 ! ð1Þ and ð2Þ 1 1 4 2 4 3 5 ! ð3Þ 1 1 8 2 4 3 5 Solve UX ¼ B for B ¼ ð1; 1; 8Þ by back-substitution to obtain X1 ¼ ð25; 9; 8Þ. (b) First find B2 ¼ B1 þ X1 ¼ ð1; 1; 1Þ þ ð25; 9; 8Þ ¼ ð24; 10; 9Þ. Then as above B2 ¼ ½24; 10; 9T ! ð1Þ and ð2Þ ½24; 58; 63T ! ð3Þ ½24; 58; 295T Solve UX ¼ B for B ¼ ð24; 58; 295Þ by back-substitution to obtain X2 ¼ ð943; 353; 295Þ. (c) First find B3 ¼ B2 þ X2 ¼ ð24; 10; 9Þ þ ð943; 353; 295Þ ¼ ð919; 343; 286Þ. Then, as above B3 ¼ ½943; 353; 295T ! ð1Þ and ð2Þ ½919; 2181; 2671T ! ð3Þ ½919; 2181; 11 395T Solve UX ¼ B for B ¼ ð919; 2181; 11 395Þ by back-substitution to obtain X3 ¼ ð37 628; 13 576; 11 395Þ. CHAPTER 3 Systems of Linear Equations 105
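Problems 3.39 to 3.42 fit together as follows: factor A = LU with no row interchanges, then solve AX = B by a forward pass that applies the recorded operations to B (equivalently, solves LY = B), followed by back-substitution in UX = Y. The sketch below is a minimal Python illustration, not the book's procedure verbatim, and the helper names are mine; run on the matrix of Problem 3.41 it reproduces L, U, and the solution X1 = (-25, 9, 8) of Problem 3.42(a).

    import numpy as np

    def lu_no_pivot(A):
        """Doolittle factorization A = LU with unit diagonal in L. Assumes no
        zero pivot appears (otherwise, as for B in Problem 3.39(b), row
        interchanges and a PLU factorization would be needed)."""
        n = A.shape[0]
        U = A.astype(float)
        L = np.eye(n)
        for j in range(n - 1):
            for i in range(j + 1, n):
                L[i, j] = U[i, j] / U[j, j]        # multiplier, recorded in L
                U[i, j:] -= L[i, j] * U[j, j:]     # Replace Ri by -L[i,j]*Rj + Ri
        return L, U

    def lu_solve(L, U, b):
        n = len(b)
        y = np.zeros(n)
        for i in range(n):                         # forward substitution, LY = b
            y[i] = b[i] - L[i, :i] @ y[:i]
        x = np.zeros(n)
        for i in range(n - 1, -1, -1):             # back-substitution, UX = Y
            x[i] = (y[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
        return x

    A = np.array([[1, 2, 1], [2, 3, 3], [-3, -10, 2]])
    L, U = lu_no_pivot(A)
    print(L)                          # [[1, 0, 0], [2, 1, 0], [-3, 4, 1]]
    print(U)                          # [[1, 2, 1], [0, -1, 1], [0, 0, 1]]
    print(np.allclose(L @ U, A))      # True

    B1 = np.array([1.0, 1.0, 1.0])
    X1 = lu_solve(L, U, B1)
    print(X1)                         # [-25.   9.   8.]
    B2 = B1 + X1                      # the iteration of Problem 3.42(b), (c)
    print(lu_solve(L, U, B2))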
  • 113. Miscellaneous Problems 3.43. Let L be a linear combination of the m equations in n unknowns in the system (3.2). Say L is the equation ðc1a11 þ þ cmam1Þx1 þ þ ðc1a1n þ þ cmamnÞxn ¼ c1b1 þ þ cmbm ð1Þ Show that any solution of the system (3.2) is also a solution of L. Let u ¼ ðk1; . . . ; knÞ be a solution of (3.2). Then ai1k1 þ ai2k2 þ þ ainkn ¼ bi ði ¼ 1; 2; . . . ; mÞ ð2Þ Substituting u in the left-hand side of (1) and using (2), we get ðc1a11 þ þ cmam1Þk1 þ þ ðc1a1n þ þ cmamnÞkn ¼ c1ða11k1 þ þ a1nknÞ þ þ cmðam1k1 þ þ amnknÞ ¼ c1b1 þ þ cmbm This is the right-hand side of (1); hence, u is a solution of (1). 3.44. Suppose a system m of linear equations is obtained from a system l by applying an elementary operation (page 64). Show that m and l have the same solutions. Each equation L in m is a linear combination of equations in l. Hence, by Problem 3.43, any solution of l will also be a solution of m. On the other hand, each elementary operation has an inverse elementary operation, so l can be obtained from m by an elementary operation. This means that any solution of m is a solution of l. Thus, l and m have the same solutions. 3.45. Prove Theorem 3.4: Suppose a system m of linear equations is obtained from a system l by a sequence of elementary operations. Then m and l have the same solutions. Each step of the sequence does not change the solution set (Problem 3.44). Thus, the original system l and the final system m (and any system in between) have the same solutions. 3.46. A system l of linear equations is said to be consistent if no linear combination of its equations is a degenerate equation L with a nonzero constant. Show that l is consistent if and only if l is reducible to echelon form. Suppose l is reducible to echelon form. Then l has a solution, which must also be a solution of every linear combination of its equations. Thus, L, which has no solution, cannot be a linear combination of the equations in l. Thus, l is consistent. On the other hand, suppose l is not reducible to echelon form. Then, in the reduction process, it must yield a degenerate equation L with a nonzero constant, which is a linear combination of the equations in l. Therefore, l is not consistent; that is, l is inconsistent. 3.47. Suppose u and v are distinct vectors. Show that, for distinct scalars k, the vectors u þ kðu vÞ are distinct. Suppose u þ k1ðu vÞ ¼ u þ k2ðu vÞ: We need only show that k1 ¼ k2. We have k1ðu vÞ ¼ k2ðu vÞ; and so ðk1 k2Þðu vÞ ¼ 0 Because u and v are distinct, u v 6¼ 0. Hence, k1 k2 ¼ 0, and so k1 ¼ k2. 3.48. Suppose AB is defined. Prove (a) Suppose A has a zero row. Then AB has a zero row. (b) Suppose B has a zero column. Then AB has a zero column. 106 CHAPTER 3 Systems of Linear Equations
  • 114. (a) Let Ri be the zero row of A, and C1; . . . ; Cn the columns of B. Then the ith row of AB is ðRiC1; RiC2; . . . ; RiCnÞ ¼ ð0; 0; 0; . . . ; 0Þ (b) BT has a zero row, and so BT AT ¼ ðABÞT has a zero row. Hence, AB has a zero column. SUPPLEMENTARY PROBLEMS Linear Equations, 2 2 Systems 3.49. Determine whether each of the following systems is linear: (a) 3x 4y þ 2yz ¼ 8, (b) ex þ 3y ¼ p, (c) 2x 3y þ kz ¼ 4 3.50. Solve (a) px ¼ 2, (b) 3x þ 2 ¼ 5x þ 7 2x, (c) 6x þ 2 4x ¼ 5 þ 2x 3 3.51. Solve each of the following systems: (a) 2x þ 3y ¼ 1 5x þ 7y ¼ 3 (b) 4x 2y ¼ 5 6x þ 3y ¼ 1 (c) 2x 4 ¼ 3y 5y x ¼ 5 (d) 2x 4y ¼ 10 3x 6y ¼ 15 3.52. Consider each of the following systems in unknowns x and y: (a) x ay ¼ 1 ax 4y ¼ b (b) ax þ 3y ¼ 2 12x þ ay ¼ b (c) x þ ay ¼ 3 2x þ 5y ¼ b For which values of a does each system have a unique solution, and for which pairs of values ða; bÞ does each system have more than one solution? General Systems of Linear Equations 3.53. Solve (a) x þ y þ 2z ¼ 4 2x þ 3y þ 6z ¼ 10 3x þ 6y þ 10z ¼ 17 (b) x 2y þ 3z ¼ 2 2x 3y þ 8z ¼ 7 3x 4y þ 13z ¼ 8 (c) x þ 2y þ 3z ¼ 3 2x þ 3y þ 8z ¼ 4 5x þ 8y þ 19z ¼ 11 3.54. Solve (a) x 2y ¼ 5 2x þ 3y ¼ 3 3x þ 2y ¼ 7 (b) x þ 2y 3z þ 2t ¼ 2 2x þ 5y 8z þ 6t ¼ 5 3x þ 4y 5z þ 2t ¼ 4 (c) x þ 2y þ 4z 5t ¼ 3 3x y þ 5z þ 2t ¼ 4 5x 4y þ 6z þ 9t ¼ 2 3.55. Solve (a) 2x y 4z ¼ 2 4x 2y 6z ¼ 5 6x 3y 8z ¼ 8 (b) x þ 2y z þ 3t ¼ 3 2x þ 4y þ 4z þ 3t ¼ 9 3x þ 6y z þ 8t ¼ 10 3.56. Consider each of the following systems in unknowns x; y; z: (a) x 2y ¼ 1 x y þ az ¼ 2 ay þ 9z ¼ b (b) x þ 2y þ 2z ¼ 1 x þ ay þ 3z ¼ 3 x þ 11y þ az ¼ b (c) x þ y þ az ¼ 1 x þ ay þ z ¼ 4 ax þ y þ z ¼ b For which values of a does the system have a unique solution, and for which pairs of values ða; bÞ does the system have more than one solution? The value of b does not have any effect on whether the system has a unique solution. Why? CHAPTER 3 Systems of Linear Equations 107
  • 115. Linear Combinations, Homogeneous Systems 3.57. Write v as a linear combination of u1; u2; u3, where (a) v ¼ ð4; 9; 2Þ, u1 ¼ ð1; 2; 1Þ, u2 ¼ ð1; 4; 2Þ, u3 ¼ ð1; 3; 2Þ; (b) v ¼ ð1; 3; 2Þ, u1 ¼ ð1; 2; 1Þ, u2 ¼ ð2; 6; 5Þ, u3 ¼ ð1; 7; 8Þ; (c) v ¼ ð1; 4; 6Þ, u1 ¼ ð1; 1; 2Þ, u2 ¼ ð2; 3; 5Þ, u3 ¼ ð3; 5; 8Þ. 3.58. Let u1 ¼ ð1; 1; 2Þ, u2 ¼ ð1; 3; 2Þ, u3 ¼ ð4; 2; 1Þ in R3 . Show that u1; u2; u3 are orthogonal, and write v as a linear combination of u1; u2; u3, where (a) v ¼ ð5; 5; 9Þ, (b) v ¼ ð1; 3; 3Þ, (c) v ¼ ð1; 1; 1Þ. (Hint: Use Fourier coefficients.) 3.59. Find the dimension and a basis of the general solution W of each of the following homogeneous systems: (a) x y þ 2z ¼ 0 2x þ y þ z ¼ 0 5x þ y þ 4z ¼ 0 (b) x þ 2y 3z ¼ 0 2x þ 5y þ 2z ¼ 0 3x y 4z ¼ 0 (c) x þ 2y þ 3z þ t ¼ 0 2x þ 4y þ 7z þ 4t ¼ 0 3x þ 6y þ 10z þ 5t ¼ 0 3.60. Find the dimension and a basis of the general solution W of each of the following systems: (a) x1 þ 3x2 þ 2x3 x4 x5 ¼ 0 2x1 þ 6x2 þ 5x3 þ x4 x5 ¼ 0 5x1 þ 15x2 þ 12x3 þ x4 3x5 ¼ 0 (b) 2x1 4x2 þ 3x3 x4 þ 2x5 ¼ 0 3x1 6x2 þ 5x3 2x4 þ 4x5 ¼ 0 5x1 10x2 þ 7x3 3x4 þ 18x5 ¼ 0 Echelon Matrices, Row Canonical Form 3.61. Reduce each of the following matrices to echelon form and then to row canonical form: (a) 1 1 2 2 4 9 1 5 12 2 4 3 5; (b) 1 2 1 2 1 2 4 1 2 5 3 6 3 7 7 2 4 3 5; (c) 2 4 2 2 5 1 3 6 2 2 0 4 4 8 2 6 5 7 2 4 3 5 3.62. Reduce each of the following matrices to echelon form and then to row canonical form: (a) 1 2 1 2 1 2 2 4 3 5 5 7 3 6 4 9 10 11 1 2 4 3 6 9 2 6 6 4 3 7 7 5; (b) 0 1 2 3 0 3 8 12 0 0 4 6 0 2 7 10 2 6 6 4 3 7 7 5; (c) 1 3 1 3 2 8 5 10 1 7 7 11 3 11 7 15 2 6 6 4 3 7 7 5 3.63. Using only 0’s and 1’s, list all possible 2 2 matrices in row canonical form. 3.64. Using only 0’s and 1’s, find the number n of possible 3 3 matrices in row canonical form. Elementary Matrices, Applications 3.65. Let e1; e2; e3 denote, respectively, the following elementary row operations: ‘‘Interchange R2 and R3; ’’ ‘‘Replace R2 by 3R2; ’’ ‘‘Replace R1 by 2R3 þ R1’’ (a) Find the corresponding elementary matrices E1; E2; E3. (b) Find the inverse operations e1 1 , e1 2 , e1 3 ; their corresponding elementary matrices E0 1, E0 2, E0 3; and the relationship between them and E1; E2; E3. (c) Describe the corresponding elementary column operations f1; f2; f3. (d) Find elementary matrices F1; F2; F3 corresponding to f1; f2; f3, and the relationship between them and E1; E2; E3. 108 CHAPTER 3 Systems of Linear Equations
  • 116. 3.66. Express each of the following matrices as a product of elementary matrices: A ¼ 1 2 3 4 ; B ¼ 3 6 2 4 ; C ¼ 2 6 3 7 ; D ¼ 1 2 0 0 1 3 3 8 7 2 4 3 5 3.67. Find the inverse of each of the following matrices (if it exists): A ¼ 1 2 1 2 3 1 3 4 4 2 4 3 5; B ¼ 1 2 3 2 6 1 3 10 1 2 4 3 5; C ¼ 1 3 2 2 8 3 1 7 1 2 4 3 5; D ¼ 2 1 1 5 2 3 0 2 1 2 4 3 5 3.68. Find the inverse of each of the following n n matrices: (a) A has 1’s on the diagonal and superdiagonal (entries directly above the diagonal) and 0’s elsewhere. (b) B has 1’s on and above the diagonal, and 0’s below the diagonal. Lu Factorization 3.69. Find the LU factorization of each of the following matrices: (a) 1 1 1 3 4 2 2 3 2 2 4 3 5, (b) 1 3 1 2 5 1 3 4 2 2 4 3 5, (c) 2 3 6 4 7 9 3 5 4 2 4 3 5, (d) 1 2 3 2 4 7 3 7 10 2 4 3 5 3.70. Let A be the matrix in Problem 3.69(a). Find X1; X2; X3; X4, where (a) X1 is the solution of AX ¼ B1, where B1 ¼ ð1; 1; 1ÞT . (b) For k 1, Xk is the solution of AX ¼ Bk, where Bk ¼ Bk1 þ Xk1. 3.71. Let B be the matrix in Problem 3.69(b). Find the LDU factorization of B. Miscellaneous Problems 3.72. Consider the following systems in unknowns x and y: ðaÞ ax þ by ¼ 1 cx þ dy ¼ 0 ðbÞ ax þ by ¼ 0 cx þ dy ¼ 1 Suppose D ¼ ad bc 6¼ 0. Show that each system has the unique solution: (a) x ¼ d=D, y ¼ c=D, (b) x ¼ b=D, y ¼ a=D. 3.73. Find the inverse of the row operation ‘‘Replace Ri by kRj þ k0 Ri ðk0 6¼ 0Þ.’’ 3.74. Prove that deleting the last column of an echelon form (respectively, the row canonical form) of an augmented matrix M ¼ ½A; B yields an echelon form (respectively, the row canonical form) of A. 3.75. Let e be an elementary row operation and E its elementary matrix, and let f be the corresponding elementary column operation and F its elementary matrix. Prove (a) f ðAÞ ¼ ðeðAT ÞÞT , (b) F ¼ ET , (c) f ðAÞ ¼ AF. 3.76. Matrix A is equivalent to matrix B, written A B, if there exist nonsingular matrices P and Q such that B ¼ PAQ. Prove that is an equivalence relation; that is, (a) A A, (b) If A B, then B A, (c) If A B and B C, then A C. CHAPTER 3 Systems of Linear Equations 109
  • 117. ANSWERS TO SUPPLEMENTARY PROBLEMS Notation: A ¼ ½R1; R2; . . . denotes the matrix A with rows R1; R2; . . . . The elements in each row are separated by commas (which may be omitted with single digits), the rows are separated by semicolons, and 0 denotes a zero row. For example, A ¼ ½1; 2; 3; 4; 5; 6; 7; 8; 0 ¼ 1 2 3 4 5 6 7 8 0 0 0 0 2 4 3 5 3.49. (a) no, (b) yes, (c) linear in x; y; z, not linear in x; y; z; k 3.50. (a) x ¼ 2=p, (b) no solution, (c) every scalar k is a solution 3.51. (a) ð2; 1Þ, (b) no solution, (c) ð5; 2Þ, (d) ð5 2a; aÞ 3.52. (a) a 6¼ 2; ð2; 2Þ; ð2; 2Þ, (b) a 6¼ 6; ð6; 4Þ; ð6; 4Þ, (c) a 6¼ 5 2 ; ð5 2 ; 6Þ 3.53. (a) ð2; 1; 1 2Þ, (b) no solution, (c) u ¼ ð7a 1; 2a þ 2; aÞ. 3.54. (a) ð3; 1Þ, (b) u ¼ ða þ 2b; 1 þ 2a 2b; a; bÞ, (c) no solution 3.55. (a) u ¼ ð1 2 a þ 2; a; 1 2Þ, (b) u ¼ ð1 2 ð7 5b 4aÞ; a; 1 2 ð1 þ bÞ; bÞ 3.56. (a) a 6¼ 3; ð3; 3Þ; ð3; 3Þ, (b) a 6¼ 5 and a 6¼ 1; ð5; 7Þ; ð1; 5Þ, (c) a 6¼ 1 and a 6¼ 2; ð2; 5Þ 3.57. (a) 2; 1; 3, (b) 6; 3; 1, (c) not possible 3.58. (a) 3; 2; 1, (b) 2 3 ; 1; 1 3, (c) 2 3 ; 1 7 ; 1 21 3.59. (a) dim W ¼ 1; u1 ¼ ð1; 1; 1Þ, (b) dim W ¼ 0, no basis, (c) dim W ¼ 2; u1 ¼ ð2; 1; 0; 0Þ; u2 ¼ ð5; 0; 2; 1Þ 3.60. (a) dim W ¼ 3; u1 ¼ ð3; 1; 0; 0; 0Þ, u2 ¼ ð7; 0; 3; 1; 0Þ, u3 ¼ ð3; 0; 1; 0; 1Þ, (b) dim W ¼ 2, u1 ¼ ð2; 1; 0; 0; 0Þ, u2 ¼ ð5; 0; 5; 3; 1Þ 3.61. (a) ½1; 0; 1 2 ; 0; 1; 5 2 ; 0, (b) ½1; 2; 0; 0; 2; 0; 0; 1; 0; 5; 0; 0; 0; 1; 2, (c) ½1; 2; 0; 4; 5; 3; 0; 0; 1; 5; 15 2 ; 5 2 ; 0 3.62. (a) ½1; 2; 0; 0; 4; 2; 0; 0; 1; 0; 1; 2; 0; 0; 0; 1; 2; 1; 0, (b) ½0; 1; 0; 0; 0; 0; 1; 0; 0; 0; 0; 1; 0, (c) ½1; 0; 0; 4; 0; 1; 0; 1; 0; 0; 1; 2; 0 3.63. 5: ½1; 0; 0; 1, ½1; 1; 0; 0, ½1; 0; 0; 0, ½0; 1; 0; 0; 0 3.64. 16 3.65. (a) ½1; 0; 0; 0; 0; 1; 0; 1; 0, ½1; 0; 0; 0; 3; 0; 0; 0; 1, ½1; 0; 2; 0; 1; 0; 0; 0; 1, (b) R2 $ R3; 1 3 R2 ! R2; 2R3 þ R1 ! R1; each E0 i ¼ E1 i , (c) C2 $ C3; 3C2 ! C2; 2C3 þ C1 ! C1, (d) each Fi ¼ ET i . 3.66. A ¼ ½1; 0; 3; 1½1; 0; 0; 2½1; 2; 0; 1, B is not invertible, C ¼ ½1; 0; 3 2 ; 1½1; 0; 0; 2½1; 6; 0; 1½2; 0; 0; 1, D ¼ ½100; 010; 301½100; 010; 021½100; 013; 001½120; 010; 001 3.67. A1 ¼ ½8; 12; 5; 5; 7; 3; 1; 2; 1, B has no inverse, C1 ¼ ½29 2 ; 17 2 ; 7 2 ; 5 2 ; 3 2 ; 1 2 ; 3; 2; 1; D1 ¼ ½8; 3; 1; 5; 2; 1; 10; 4; 1 110 CHAPTER 3 Systems of Linear Equations
  • 118. 3.68. A1 ¼ ½1; 1; 1; 1; . . . ; 0; 1; 1; 1; 1; . . . ; 0; 0; 1; 1; 1; 1; 1; . . . ; . . . ; . . . ; 0; . . . 0; 1 B1 has 1’s on diagonal, 1’s on superdiagonal, and 0’s elsewhere. 3.69. (a) ½100; 310; 211½1; 1; 1; 0; 1; 1; 0; 0; 1, (b) ½100; 210; 351½1; 3; 1; 0; 1; 3; 0; 0; 10, (c) ½100; 210; 3 2 ; 1 2 ; 1½2; 3; 6; 0; 1; 3; 0; 0; 7 2, (d) There is no LU decomposition. 3.70. X1 ¼ ½1; 1; 1T ; B2 ¼ ½2; 2; 0T , X2 ¼ ½6; 4; 0T , B3 ¼ ½8; 6; 0T , X3 ¼ ½22; 16; 2T , B4 ¼ ½30; 22; 2T , X4 ¼ ½86; 62; 6T 3.71. B ¼ ½100; 210; 351 diagð1; 1; 10Þ ½1; 3; 1; 0; 1; 3; 0; 0; 1 3.73. Replace Ri by kRj þ ð1=k0 ÞRi. 3.75. (c) f ðAÞ ¼ ðeðAT ÞÞT ¼ ðEAT ÞT ¼ ðAT ÞT ET ¼ AF 3.76. (a) A ¼ IAI: (b) If A ¼ PBQ, then B ¼ P1 AQ1 . (c) If A ¼ PBQ and B ¼ P0 CQ0 , then A ¼ ðPP0 ÞCðQ 0 QÞ. CHAPTER 3 Systems of Linear Equations 111
Vector Spaces

4.1 Introduction

This chapter introduces the underlying structure of linear algebra, that of a finite-dimensional vector space. The definition of a vector space V, whose elements are called vectors, involves an arbitrary field K, whose elements are called scalars. The following notation will be used (unless otherwise stated or implied):

V             the given vector space
u, v, w       vectors in V
K             the given number field
a, b, c, or k    scalars in K

Almost nothing essential is lost if the reader assumes that K is the real field R or the complex field C. The reader might suspect that the real line R has "dimension" one, the cartesian plane R^2 has "dimension" two, and the space R^3 has "dimension" three. This chapter formalizes the notion of "dimension," and this definition will agree with the reader's intuition.

Throughout this text, we will use the following set notation:

a ∈ A         Element a belongs to set A
a, b ∈ A      Elements a and b belong to A
∀x ∈ A        For every x in A
∃x ∈ A        There exists an x in A
A ⊆ B         A is a subset of B
A ∩ B         Intersection of A and B
A ∪ B         Union of A and B
∅             Empty set

4.2 Vector Spaces

The following defines the notion of a vector space V, where K is the field of scalars.

DEFINITION: Let V be a nonempty set with two operations:
(i) Vector Addition: This assigns to any u, v ∈ V a sum u + v in V.
(ii) Scalar Multiplication: This assigns to any u ∈ V, k ∈ K a product ku ∈ V.

Then V is called a vector space (over the field K) if the following axioms hold for any vectors u, v, w ∈ V:
[A1] (u + v) + w = u + (v + w)
[A2] There is a vector in V, denoted by 0 and called the zero vector, such that, for any u ∈ V, u + 0 = 0 + u = u.
[A3] For each u ∈ V, there is a vector in V, denoted by -u and called the negative of u, such that u + (-u) = (-u) + u = 0.
[A4] u + v = v + u.
[M1] k(u + v) = ku + kv, for any scalar k ∈ K.
[M2] (a + b)u = au + bu, for any scalars a, b ∈ K.
[M3] (ab)u = a(bu), for any scalars a, b ∈ K.
[M4] 1u = u, for the unit scalar 1 ∈ K.

The above axioms naturally split into two sets (as indicated by the labeling of the axioms). The first four are concerned only with the additive structure of V and can be summarized by saying V is a commutative group under addition. This means

(a) Any sum v1 + v2 + ... + vm of vectors requires no parentheses and does not depend on the order of the summands.
(b) The zero vector 0 is unique, and the negative -u of a vector u is unique.
(c) (Cancellation Law) If u + w = v + w, then u = v.

Also, subtraction in V is defined by u - v = u + (-v), where -v is the unique negative of v.

On the other hand, the remaining four axioms are concerned with the "action" of the field K of scalars on the vector space V. Using these additional axioms, we prove (Problem 4.2) the following simple properties of a vector space.

THEOREM 4.1: Let V be a vector space over a field K.
(i) For any scalar k ∈ K and 0 ∈ V, k0 = 0.
(ii) For 0 ∈ K and any vector u ∈ V, 0u = 0.
(iii) If ku = 0, where k ∈ K and u ∈ V, then k = 0 or u = 0.
(iv) For any k ∈ K and any u ∈ V, (-k)u = k(-u) = -ku.

4.3 Examples of Vector Spaces

This section lists important examples of vector spaces that will be used throughout the text.

Space K^n

Let K be an arbitrary field. The notation K^n is frequently used to denote the set of all n-tuples of elements in K. Here K^n is a vector space over K using the following operations:

(i) Vector Addition: (a1, a2, ..., an) + (b1, b2, ..., bn) = (a1 + b1, a2 + b2, ..., an + bn)
(ii) Scalar Multiplication: k(a1, a2, ..., an) = (ka1, ka2, ..., kan)

The zero vector in K^n is the n-tuple of zeros, 0 = (0, 0, ..., 0), and the negative of a vector is defined by

-(a1, a2, ..., an) = (-a1, -a2, ..., -an)

Observe that these are the same as the operations defined for R^n in Chapter 1. The proof that K^n is a vector space is identical to the proof of Theorem 1.1, which we now regard as stating that R^n with the operations defined there is a vector space over R.
  • 121. Polynomial Space PðtÞ Let PðtÞ denote the set of all polynomials of the form pðtÞ ¼ a0 þ a1t þ a2t2 þ þ asts ðs ¼ 1; 2; . . .Þ where the coefficients ai belong to a field K. Then PðtÞ is a vector space over K using the following operations: (i) Vector Addition: Here pðtÞ þ qðtÞ in PðtÞ is the usual operation of addition of polynomials. (ii) Scalar Multiplication: Here kpðtÞ in PðtÞ is the usual operation of the product of a scalar k and a polynomial pðtÞ. The zero polynomial 0 is the zero vector in PðtÞ. Polynomial Space PnðtÞ Let PnðtÞ denote the set of all polynomials pðtÞ over a field K, where the degree of pðtÞ is less than or equal to n; that is, pðtÞ ¼ a0 þ a1t þ a2t2 þ þ asts where s n. Then PnðtÞ is a vector space over K with respect to the usual operations of addition of polynomials and of multiplication of a polynomial by a constant (just like the vector space PðtÞ above). We include the zero polynomial 0 as an element of PnðtÞ, even though its degree is undefined. Matrix Space Mm;n The notation Mm;n, or simply M; will be used to denote the set of all m n matrices with entries in a field K. Then Mm;n is a vector space over K with respect to the usual operations of matrix addition and scalar multiplication of matrices, as indicated by Theorem 2.1. Function Space FðXÞ Let X be a nonempty set and let K be an arbitrary field. Let FðXÞ denote the set of all functions of X into K. [Note that FðXÞ is nonempty, because X is nonempty.] Then FðXÞ is a vector space over K with respect to the following operations: (i) Vector Addition: The sum of two functions f and g in FðXÞ is the function f þ g in FðXÞ defined by ð f þ gÞðxÞ ¼ f ðxÞ þ gðxÞ 8x 2 X (ii) Scalar Multiplication: The product of a scalar k 2 K and a function f in FðXÞ is the function kf in FðXÞ defined by ðkf ÞðxÞ ¼ kf ðxÞ 8x 2 X The zero vector in FðXÞ is the zero function 0, which maps every x 2 X into the zero element 0 2 K; 0ðxÞ ¼ 0 8x 2 X Also, for any function f in FðXÞ, negative of f is the function f in FðXÞ defined by ðf ÞðxÞ ¼ f ðxÞ 8x 2 X Fields and Subfields Suppose a field E is an extension of a field K; that is, suppose E is a field that contains K as a subfield. Then E may be viewed as a vector space over K using the following operations: (i) Vector Addition: Here u þ v in E is the usual addition in E. (ii) Scalar Multiplication: Here ku in E, where k 2 K and u 2 E, is the usual product of k and u as elements of E. That is, the eight axioms of a vector space are satisfied by E and its subfield K with respect to the above two operations. 114 CHAPTER 4 Vector Spaces
  • 122. 4.4 Linear Combinations, Spanning Sets Let V be a vector space over a field K. A vector v in V is a linear combination of vectors u1; u2; . . . ; um in V if there exist scalars a1; a2; . . . ; am in K such that v ¼ a1u1 þ a2u2 þ þ amum Alternatively, v is a linear combination of u1; u2; . . . ; um if there is a solution to the vector equation v ¼ x1u1 þ x2u2 þ þ xmum where x1; x2; . . . ; xm are unknown scalars. EXAMPLE 4.1 (Linear Combinations in Rn ) Suppose we want to express v ¼ ð3; 7; 4Þ in R3 as a linear combination of the vectors u1 ¼ ð1; 2; 3Þ; u2 ¼ ð2; 3; 7Þ; u3 ¼ ð3; 5; 6Þ We seek scalars x, y, z such that v ¼ xu1 þ yu2 þ zu3; that is, 3 3 4 2 4 3 5 ¼ x 1 2 3 2 4 3 5 þ y 2 3 7 2 4 3 5 þ z 3 5 6 2 4 3 5 or x þ 2y þ 3z ¼ 3 2x þ 3y þ 5z ¼ 7 3x þ 7y þ 6z ¼ 4 (For notational convenience, we have written the vectors in R3 as columns, because it is then easier to find the equivalent system of linear equations.) Reducing the system to echelon form yields x þ 2y þ 3z ¼ 3 y z ¼ 1 y 3z ¼ 13 and then x þ 2y þ 3z ¼ 3 y z ¼ 1 4z ¼ 12 Back-substitution yields the solution x ¼ 2, y ¼ 4, z ¼ 3. Thus, v ¼ 2u1 4u2 þ 3u3. Remark: Generally speaking, the question of expressing a given vector v in Kn as a linear combination of vectors u1; u2; . . . ; um in Kn is equivalent to solving a system AX ¼ B of linear equations, where v is the column B of constants, and the u’s are the columns of the coefficient matrix A. Such a system may have a unique solution (as above), many solutions, or no solution. The last case—no solution—means that v cannot be written as a linear combination of the u’s. EXAMPLE 4.2 (Linear combinations in PðtÞ) Suppose we want to express the polynomial v ¼ 3t2 þ 5t 5 as a linear combination of the polynomials p1 ¼ t2 þ 2t þ 1; p2 ¼ 2t2 þ 5t þ 4; p3 ¼ t2 þ 3t þ 6 We seek scalars x, y, z such that v ¼ xp1 þ yp2 þ zp3; that is, 3t2 þ 5t 5 ¼ xðt2 þ 2t þ 1Þ þ yð2t2 þ 5t þ 4Þ þ zðt2 þ 3t þ 6Þ ð*Þ There are two ways to proceed from here. (1) Expand the right-hand side of (*) obtaining: 3t2 þ 5t 5 ¼ xt2 þ 2xt þ x þ 2yt2 þ 5yt þ 4y þ zt2 þ 3zt þ 6z ¼ ðx þ 2y þ zÞt2 þ ð2x þ 5y þ 3zÞt þ ðx þ 4y þ 6zÞ Set coefficients of the same powers of t equal to each other, and reduce the system to echelon form: x þ 2y þ z ¼ 3 2x þ 5y þ 3z ¼ 5 x þ 4y þ 6z ¼ 5 or x þ 2y þ z ¼ 3 y þ z ¼ 1 2y þ 5z ¼ 8 or x þ 2y þ z ¼ 3 y þ z ¼ 1 3z ¼ 6 CHAPTER 4 Vector Spaces 115
  • 123. The system is in triangular form and has a solution. Back-substitution yields the solution x ¼ 3, y ¼ 1, z ¼ 2. Thus, v ¼ 3p1 þ p2 2p3 (2) The equation (*) is actually an identity in the variable t; that is, the equation holds for any value of t. We can obtain three equations in the unknowns x, y, z by setting t equal to any three values. For example, Set t ¼ 0 in ð1Þ to obtain: x þ 4y þ 6z ¼ 5 Set t ¼ 1 in ð1Þ to obtain: 4x þ 11y þ 10z ¼ 3 Set t ¼ 1 in ð1Þ to obtain: y þ 4z ¼ 7 Reducing this system to echelon form and solving by back-substitution again yields the solution x ¼ 3, y ¼ 1, z ¼ 2. Thus (again), v ¼ 3p1 þ p2 2p3. Spanning Sets Let V be a vector space over K. Vectors u1; u2; . . . ; um in V are said to span V or to form a spanning set of V if every v in V is a linear combination of the vectors u1; u2; . . . ; um—that is, if there exist scalars a1; a2; . . . ; am in K such that v ¼ a1u1 þ a2u2 þ þ amum The following remarks follow directly from the definition. Remark 1: Suppose u1; u2; . . . ; um span V. Then, for any vector w, the set w; u1; u2; . . . ; um also spans V. Remark 2: Suppose u1; u2; . . . ; um span V and suppose uk is a linear combination of some of the other u’s. Then the u’s without uk also span V. Remark 3: Suppose u1; u2; . . . ; um span V and suppose one of the u’s is the zero vector. Then the u’s without the zero vector also span V. EXAMPLE 4.3 Consider the vector space V ¼ R3 . (a) We claim that the following vectors form a spanning set of R3 : e1 ¼ ð1; 0; 0Þ; e2 ¼ ð0; 1; 0Þ; e3 ¼ ð0; 0; 1Þ Specifically, if v ¼ ða; b; cÞ is any vector in R3 , then v ¼ ae1 þ be2 þ ce3 For example, v ¼ ð5; 6; 2Þ ¼ 5e1 6e2 þ 2e3. (b) We claim that the following vectors also form a spanning set of R3 : w1 ¼ ð1; 1; 1Þ; w2 ¼ ð1; 1; 0Þ; w3 ¼ ð1; 0; 0Þ Specifically, if v ¼ ða; b; cÞ is any vector in R3 , then (Problem 4.62) v ¼ ða; b; cÞ ¼ cw1 þ ðb cÞw2 þ ða bÞw3 For example, v ¼ ð5; 6; 2Þ ¼ 2w1 8w2 þ 11w3. (c) One can show (Problem 3.24) that v ¼ ð2; 7; 8Þ cannot be written as a linear combination of the vectors u1 ¼ ð1; 2; 3Þ; u2 ¼ ð1; 3; 5Þ; u3 ¼ ð1; 5; 9Þ Accordingly, u1, u2, u3 do not span R3 . 116 CHAPTER 4 Vector Spaces
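Whether three given vectors span R^3, as in Example 4.3, can be decided from the rank of the matrix having those vectors as columns: they span R^3 exactly when the rank is 3 (the rank criterion is standard, although rank itself is developed later in the text). A small NumPy check, illustrative rather than part of the text:

    import numpy as np

    def spans_R3(vectors):
        """True if the given vectors span R^3, i.e. their matrix has rank 3."""
        return np.linalg.matrix_rank(np.column_stack(vectors)) == 3

    e = [np.array([1, 0, 0]), np.array([0, 1, 0]), np.array([0, 0, 1])]
    w = [np.array([1, 1, 1]), np.array([1, 1, 0]), np.array([1, 0, 0])]
    u = [np.array([1, 2, 3]), np.array([1, 3, 5]), np.array([1, 5, 9])]

    print(spans_R3(e))   # True,  as in Example 4.3(a)
    print(spans_R3(w))   # True,  as in Example 4.3(b)
    print(spans_R3(u))   # False, the rank is only 2, as in Example 4.3(c)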
  • 124. EXAMPLE 4.4 Consider the vector space V ¼ PnðtÞ consisting of all polynomials of degree n. (a) Clearly every polynomial in PnðtÞ can be expressed as a linear combination of the n þ 1 polynomials 1; t; t2 ; t3 ; . . . ; tn Thus, these powers of t (where 1 ¼ t0 ) form a spanning set for PnðtÞ. (b) One can also show that, for any scalar c, the following n þ 1 powers of t c, 1; t c; ðt cÞ2 ; ðt cÞ3 ; . . . ; ðt cÞn (where ðt cÞ0 ¼ 1), also form a spanning set for PnðtÞ. EXAMPLE 4.5 Consider the vector space M ¼ M2;2 consisting of all 2 2 matrices, and consider the following four matrices in M: E11 ¼ 1 0 0 0 ; E12 ¼ 0 1 0 0 ; E21 ¼ 0 0 1 0 ; E22 ¼ 0 0 0 1 Then clearly any matrix A in M can be written as a linear combination of the four matrices. For example, A ¼ 5 6 7 8 ¼ 5E11 6E12 þ 7E21 þ 8E22 Accordingly, the four matrices E11, E12, E21, E22 span M. 4.5 Subspaces This section introduces the important notion of a subspace. DEFINITION: Let V be a vector space over a field K and let W be a subset of V. Then W is a subspace of V if W is itself a vector space over K with respect to the operations of vector addition and scalar multiplication on V. The way in which one shows that any set W is a vector space is to show that W satisfies the eight axioms of a vector space. However, if W is a subset of a vector space V, then some of the axioms automatically hold in W, because they already hold in V. Simple criteria for identifying subspaces follow. THEOREM 4.2: Suppose W is a subset of a vector space V. Then W is a subspace of V if the following two conditions hold: (a) The zero vector 0 belongs to W. (b) For every u; v 2 W; k 2 K: (i) The sum u þ v 2 W. (ii) The multiple ku 2 W. Property (i) in (b) states that W is closed under vector addition, and property (ii) in (b) states that W is closed under scalar multiplication. Both properties may be combined into the following equivalent single statement: (b0 ) For every u; v 2 W; a; b 2 K, the linear combination au þ bv 2 W. Now let V be any vector space. Then V automatically contains two subspaces: the set {0} consisting of the zero vector alone and the whole space V itself. These are sometimes called the trivial subspaces of V. Examples of nontrivial subspaces follow. EXAMPLE 4.6 Consider the vector space V ¼ R3 . (a) Let U consist of all vectors in R3 whose entries are equal; that is, U ¼ fða; b; cÞ : a ¼ b ¼ cg For example, (1, 1, 1), (73, 73, 73), (7, 7, 7), (72, 72, 72) are vectors in U. Geometrically, U is the line through the origin O and the point (1, 1, 1) as shown in Fig. 4-1(a). Clearly 0 ¼ ð0; 0; 0Þ belongs to U, because CHAPTER 4 Vector Spaces 117
  • 125. all entries in 0 are equal. Further, suppose u and v are arbitrary vectors in U, say, u ¼ ða; a; aÞ and v ¼ ðb; b; bÞ. Then, for any scalar k 2 R, the following are also vectors in U: u þ v ¼ ða þ b; a þ b; a þ bÞ and ku ¼ ðka; ka; kaÞ Thus, U is a subspace of R3 . (b) Let W be any plane in R3 passing through the origin, as pictured in Fig. 4-1(b). Then 0 ¼ ð0; 0; 0Þ belongs to W, because we assumed W passes through, the origin O. Further, suppose u and v are vectors in W. Then u and v may be viewed as arrows in the plane W emanating from the origin O, as in Fig. 4-1(b). The sum u þ v and any multiple ku of u also lie in the plane W. Thus, W is a subspace of R3 . EXAMPLE 4.7 (a) Let V ¼ Mn;n, the vector space of n n matrices. Let W1 be the subset of all (upper) triangular matrices and let W2 be the subset of all symmetric matrices. Then W1 is a subspace of V, because W1 contains the zero matrix 0 and W1 is closed under matrix addition and scalar multiplication; that is, the sum and scalar multiple of such triangular matrices are also triangular. Similarly, W2 is a subspace of V. (b) Let V ¼ PðtÞ, the vector space PðtÞ of polynomials. Then the space PnðtÞ of polynomials of degree at most n may be viewed as a subspace of PðtÞ. Let QðtÞ be the collection of polynomials with only even powers of t. For example, the following are polynomials in QðtÞ: p1 ¼ 3 þ 4t2 5t6 and p2 ¼ 6 7t4 þ 9t6 þ 3t12 (We assume that any constant k ¼ kt0 is an even power of t.) Then QðtÞ is a subspace of PðtÞ. (c) Let V be the vector space of real-valued functions. Then the collection W1 of continuous functions and the collection W2 of differentiable functions are subspaces of V. Intersection of Subspaces Let U and W be subspaces of a vector space V. We show that the intersection U W is also a subspace of V. Clearly, 0 2 U and 0 2 W, because U and W are subspaces; whence 0 2 U W. Now suppose u and v belong to the intersection U W. Then u; v 2 U and u; v 2 W. Further, because U and W are subspaces, for any scalars a; b 2 K, au þ bv 2 U and au þ bv 2 W Thus, au þ bv 2 U W. Therefore, U W is a subspace of V. The above result generalizes as follows. THEOREM 4.3: The intersection of any number of subspaces of a vector space V is a subspace of V. Figure 4-1 118 CHAPTER 4 Vector Spaces
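Beyond knowing that U ∩ W is a subspace, one can compute it when U and W are given by spanning sets. A hedged sketch in Python (sympy), using the two coordinate planes of R^3 as an illustrative choice of our own: a vector lies in both subspaces exactly when it equals U·x and W·y for some coefficient vectors x, y, which leads to the null space of the block matrix [U | -W].

    from sympy import Matrix

    U = Matrix([[1, 0], [0, 1], [0, 0]])    # columns span U (the xy-plane), illustrative choice
    W = Matrix([[0, 0], [1, 0], [0, 1]])    # columns span W (the yz-plane)

    # U*x = W*y  <=>  [U | -W] * [x; y] = 0, so push each null-space vector back through U
    null = U.row_join(-W).nullspace()
    basis = [U * n[:U.cols, :] for n in null]
    print(basis)    # one basis vector, (0, 1, 0): U ∩ W is the y-axis, again a subspace (dim 1)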
  • 126. Solution Space of a Homogeneous System Consider a system AX ¼ B of linear equations in n unknowns. Then every solution u may be viewed as a vector in Kn . Thus, the solution set of such a system is a subset of Kn . Now suppose the system is homogeneous; that is, suppose the system has the form AX ¼ 0. Let W be its solution set. Because A0 ¼ 0, the zero vector 0 2 W. Moreover, suppose u and v belong to W. Then u and v are solutions of AX ¼ 0, or, in other words, Au ¼ 0 and Av ¼ 0. Therefore, for any scalars a and b, we have Aðau þ bvÞ ¼ aAu þ bAv ¼ a0 þ b0 ¼ 0 þ 0 ¼ 0 Thus, au þ bv belongs to W, because it is a solution of AX ¼ 0. Accordingly, W is a subspace of Kn . We state the above result formally. THEOREM 4.4: The solution set W of a homogeneous system AX ¼ 0 in n unknowns is a subspace of Kn . We emphasize that the solution set of a nonhomogeneous system AX ¼ B is not a subspace of Kn . In fact, the zero vector 0 does not belong to its solution set. 4.6 Linear Spans, Row Space of a Matrix Suppose u1; u2; . . . ; um are any vectors in a vector space V. Recall (Section 4.4) that any vector of the form a1u1 þ a2u2 þ þ amum, where the ai are scalars, is called a linear combination of u1; u2; . . . ; um. The collection of all such linear combinations, denoted by spanðu1; u2; . . . ; umÞ or spanðuiÞ is called the linear span of u1; u2; . . . ; um. Clearly the zero vector 0 belongs to spanðuiÞ, because 0 ¼ 0u1 þ 0u2 þ þ 0um Furthermore, suppose v and v0 belong to spanðuiÞ, say, v ¼ a1u1 þ a2u2 þ þ amum and v0 ¼ b1u1 þ b2u2 þ þ bmum Then, v þ v0 ¼ ða1 þ b1Þu1 þ ða2 þ b2Þu2 þ þ ðam þ bmÞum and, for any scalar k 2 K, kv ¼ ka1u1 þ ka2u2 þ þ kamum Thus, v þ v0 and kv also belong to spanðuiÞ. Accordingly, spanðuiÞ is a subspace of V. More generally, for any subset S of V, spanðSÞ consists of all linear combinations of vectors in S or, when S ¼ f, span(S) ¼ f0g. Thus, in particular, S is a spanning set (Section 4.4) of spanðSÞ. The following theorem, which was partially proved above, holds. THEOREM 4.5: Let S be a subset of a vector space V. (i) Then spanðSÞ is a subspace of V that contains S. (ii) If W is a subspace of V containing S, then spanðSÞ W. Condition (ii) in theorem 4.5 may be interpreted as saying that spanðSÞ is the ‘‘smallest’’ subspace of V containing S. EXAMPLE 4.8 Consider the vector space V ¼ R3 . (a) Let u be any nonzero vector in R3 . Then spanðuÞ consists of all scalar multiples of u. Geometrically, spanðuÞ is the line through the origin O and the endpoint of u, as shown in Fig. 4-2(a). CHAPTER 4 Vector Spaces 119
  • 127. (b) Let u and v be vectors in R3 that are not multiples of each other. Then spanðu; vÞ is the plane through the origin O and the endpoints of u and v as shown in Fig. 4-2(b). (c) Consider the vectors e1 ¼ ð1; 0; 0Þ, e2 ¼ ð0; 1; 0Þ, e3 ¼ ð0; 0; 1Þ in R3 . Recall [Example 4.1(a)] that every vector in R3 is a linear combination of e1, e2, e3. That is, e1, e2, e3 form a spanning set of R3 . Accordingly, spanðe1; e2; e3Þ ¼ R3 . Row Space of a Matrix Let A ¼ ½aij be an arbitrary m n matrix over a field K. The rows of A, R1 ¼ ða11; a12; . . . ; a1nÞ; R2 ¼ ða21; a22; . . . ; a2nÞ; . . . ; Rm ¼ ðam1; am2; . . . ; amnÞ may be viewed as vectors in Kn ; hence, they span a subspace of Kn called the row space of A and denoted by rowsp(A). That is, rowspðAÞ ¼ spanðR1; R2; . . . ; RmÞ Analagously, the columns of A may be viewed as vectors in Km called the column space of A and denoted by colsp(A). Observe that colspðAÞ ¼ rowspðAT Þ. Recall that matrices A and B are row equivalent, written A B, if B can be obtained from A by a sequence of elementary row operations. Now suppose M is the matrix obtained by applying one of the following elementary row operations on a matrix A: ð1Þ Interchange Ri and Rj; ð2Þ Replace Ri by kRi; ð3Þ Replace Rj by kRi þ Rj Then each row of M is a row of A or a linear combination of rows of A. Hence, the row space of M is contained in the row space of A. On the other hand, we can apply the inverse elementary row operation on M to obtain A; hence, the row space of A is contained in the row space of M. Accordingly, A and M have the same row space. This will be true each time we apply an elementary row operation. Thus, we have proved the following theorem. THEOREM 4.6: Row equivalent matrices have the same row space. We are now able to prove (Problems 4.45–4.47) basic results on row equivalence (which first appeared as Theorems 3.7 and 3.8 in Chapter 3). THEOREM 4.7: Suppose A ¼ ½aij and B ¼ ½bij are row equivalent echelon matrices with respective pivot entries a1j1 ; a2j2 ; . . . ; arjr and b1k1 ; b2k2 ; . . . ; bsks Then A and B have the same number of nonzero rows—that is, r ¼ s—and their pivot entries are in the same positions—that is, j1 ¼ k1; j2 ¼ k2; . . . ; jr ¼ kr. THEOREM 4.8: Suppose A and B are row canonical matrices. Then A and B have the same row space if and only if they have the same nonzero rows. 0 (a) u Figure 4-2 0 (b) u 120 CHAPTER 4 Vector Spaces
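Theorem 4.6 is easy to check on examples: perform an elementary row operation and compare row canonical forms, which by Theorem 4.8 determine the row space. A small sketch in Python (sympy) on an illustrative matrix of our own:

    from sympy import Matrix

    A = Matrix([[1, 2, 1], [2, 5, 3], [3, 7, 4]])       # illustrative matrix
    # apply the elementary operation "replace R3 by -3R1 + R3"
    M = Matrix([list(A.row(0)), list(A.row(1)), list(A.row(2) - 3*A.row(0))])
    print(A.rref()[0] == M.rref()[0])   # True: row-equivalent matrices have the same row
                                        # canonical form, hence the same row space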
  • 128. COROLLARY 4.9: Every matrix A is row equivalent to a unique matrix in row canonical form. We apply the above results in the next example. EXAMPLE 4.9 Consider the following two sets of vectors in R4 : u1 ¼ ð1; 2; 1; 3Þ; u2 ¼ ð2; 4; 1; 2Þ; u3 ¼ ð3; 6; 3; 7Þ w1 ¼ ð1; 2; 4; 11Þ; w2 ¼ ð2; 4; 5; 14Þ Let U ¼ spanðuiÞ and W ¼ spanðwiÞ. There are two ways to show that U ¼ W. (a) Show that each ui is a linear combination of w1 and w2, and show that each wi is a linear combination of u1, u2, u3. Observe that we have to show that six systems of linear equations are consistent. (b) Form the matrix A whose rows are u1, u2, u3 and row reduce A to row canonical form, and form the matrix B whose rows are w1 and w2 and row reduce B to row canonical form: A ¼ 1 2 1 3 2 4 1 2 3 6 3 7 2 6 4 3 7 5 1 2 1 3 0 0 3 8 0 0 6 16 2 6 4 3 7 5 1 2 0 1 3 0 0 1 8 3 0 0 0 0 2 6 4 3 7 5 B ¼ 1 2 4 11 2 4 5 14 1 2 4 11 0 0 3 8 1 2 0 1 3 0 0 1 8 3 # Because the nonzero rows of the matrices in row canonical form are identical, the row spaces of A and B are equal. Therefore, U ¼ W. Clearly, the method in (b) is more efficient than the method in (a). 4.7 Linear Dependence and Independence Let V be a vector space over a field K. The following defines the notion of linear dependence and independence of vectors over K. (One usually suppresses mentioning K when the field is understood.) This concept plays an essential role in the theory of linear algebra and in mathematics in general. DEFINITION: We say that the vectors v1; v2; . . . ; vm in V are linearly dependent if there exist scalars a1; a2; . . . ; am in K, not all of them 0, such that a1v1 þ a2v2 þ þ amvm ¼ 0 Otherwise, we say that the vectors are linearly independent. The above definition may be restated as follows. Consider the vector equation x1v1 þ x2v2 þ þ xmvm ¼ 0 ð*Þ where the x’s are unknown scalars. This equation always has the zero solution x1 ¼ 0; x2 ¼ 0; . . . ; xm ¼ 0. Suppose this is the only solution; that is, suppose we can show: x1v1 þ x2v2 þ þ xmvm ¼ 0 implies x1 ¼ 0; x2 ¼ 0; . . . ; xm ¼ 0 Then the vectors v1; v2; . . . ; vm are linearly independent, On the other hand, suppose the equation (*) has a nonzero solution; then the vectors are linearly dependent. A set S ¼ fv1; v2; . . . ; vmg of vectors in V is linearly dependent or independent according to whether the vectors v1; v2; . . . ; vm are linearly dependent or independent. An infinite set S of vectors is linearly dependent or independent according to whether there do or do not exist vectors v1; v2; . . . ; vk in S that are linearly dependent. Warning: The set S ¼ fv1; v2; . . . ; vmg above represents a list or, in other words, a finite sequence of vectors where the vectors are ordered and repetition is permitted. CHAPTER 4 Vector Spaces 121
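The comparison of method (b) in Example 4.9 takes only a few lines by machine. A sketch in Python (sympy), taking the vectors of that example to be u1 = (1, 2, -1, 3), u2 = (2, 4, 1, -2), u3 = (3, 6, 3, -7) and w1 = (1, 2, -4, 11), w2 = (2, 4, -5, 14):

    from sympy import Matrix

    A = Matrix([[1, 2, -1, 3], [2, 4, 1, -2], [3, 6, 3, -7]])   # rows u1, u2, u3
    B = Matrix([[1, 2, -4, 11], [2, 4, -5, 14]])                # rows w1, w2
    print(A.rref()[0])   # nonzero rows (1, 2, 0, 1/3) and (0, 0, 1, -8/3)
    print(B.rref()[0])   # the same two nonzero rows, so rowsp(A) = rowsp(B), that is, U = W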
The following remarks follow directly from the above definition.

Remark 1: Suppose 0 is one of the vectors v1, v2, ..., vm, say v1 = 0. Then the vectors must be linearly dependent, because we have the following linear combination where the coefficient of v1 is not 0:
1v1 + 0v2 + ... + 0vm = 1·0 + 0 + ... + 0 = 0

Remark 2: Suppose v is a nonzero vector. Then v, by itself, is linearly independent, because
kv = 0, v ≠ 0 implies k = 0

Remark 3: Suppose two of the vectors v1, v2, ..., vm are equal, or one is a scalar multiple of the other, say v1 = kv2. Then the vectors must be linearly dependent, because we have the following linear combination where the coefficient of v1 is not 0:
v1 − kv2 + 0v3 + ... + 0vm = 0

Remark 4: Two vectors v1 and v2 are linearly dependent if and only if one of them is a multiple of the other.

Remark 5: If the set {v1, ..., vm} is linearly independent, then any rearrangement of the vectors {v_{i1}, v_{i2}, ..., v_{im}} is also linearly independent.

Remark 6: If a set S of vectors is linearly independent, then any subset of S is linearly independent. Alternatively, if S contains a linearly dependent subset, then S is linearly dependent.

EXAMPLE 4.10
(a) Let u = (1, 1, 0), v = (1, 3, 2), w = (4, 9, 5). Then u, v, w are linearly dependent, because
3u + 5v − 2w = 3(1, 1, 0) + 5(1, 3, 2) − 2(4, 9, 5) = (0, 0, 0) = 0

(b) We show that the vectors u = (1, 2, 3), v = (2, 5, 7), w = (1, 3, 5) are linearly independent. We form the vector equation xu + yv + zw = 0, where x, y, z are unknown scalars. This yields
x(1, 2, 3) + y(2, 5, 7) + z(1, 3, 5) = (0, 0, 0)
or the system
x + 2y + z = 0, 2x + 5y + 3z = 0, 3x + 7y + 5z = 0,   which reduces to   x + 2y + z = 0, y + z = 0, 2z = 0
Back-substitution yields x = 0, y = 0, z = 0. We have shown that
xu + yv + zw = 0 implies x = 0, y = 0, z = 0
Accordingly, u, v, w are linearly independent.

(c) Let V be the vector space of functions from R into R. We show that the functions f(t) = sin t, g(t) = e^t, h(t) = t^2 are linearly independent. We form the vector (function) equation xf + yg + zh = 0, where x, y, z are unknown scalars. This function equation means that, for every value of t,
x sin t + y e^t + z t^2 = 0
Thus, in this equation, we choose appropriate values of t to easily get x = 0, y = 0, z = 0. For example:
(i) Substitute t = 0 to obtain x(0) + y(1) + z(0) = 0, or y = 0.
(ii) Substitute t = π (using y = 0) to obtain x(0) + 0(e^π) + z(π^2) = 0, or z = 0.
(iii) Substitute t = π/2 (using y = 0 and z = 0) to obtain x(1) + 0(e^(π/2)) + 0(π^2/4) = 0, or x = 0.
We have shown that
xf + yg + zh = 0 implies x = 0, y = 0, z = 0
Accordingly, f, g, h are linearly independent.
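The checks in parts (a) and (b) of Example 4.10 can also be run mechanically, which is how such questions are usually settled in practice. A brief sketch in Python (sympy):

    from sympy import Matrix

    # (a) verify the dependence 3u + 5v - 2w = 0
    u, v, w = Matrix([1, 1, 0]), Matrix([1, 3, 2]), Matrix([4, 9, 5])
    print(3*u + 5*v - 2*w)                          # the zero vector

    # (b) xu + yv + zw = 0 has only the zero solution, so u, v, w are independent
    M = Matrix([[1, 2, 1], [2, 5, 3], [3, 7, 5]])   # columns are u, v, w
    print(M.nullspace())                            # [] : no nonzero solution exists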
  • 130. Linear Dependence in R3 Linear dependence in the vector space V ¼ R3 can be described geometrically as follows: (a) Any two vectors u and v in R3 are linearly dependent if and only if they lie on the same line through the origin O, as shown in Fig. 4-3(a). (b) Any three vectors u, v, w in R3 are linearly dependent if and only if they lie on the same plane through the origin O, as shown in Fig. 4-3(b). Later, we will be able to show that any four or more vectors in R3 are automatically linearly dependent. Linear Dependence and Linear Combinations The notions of linear dependence and linear combinations are closely related. Specifically, for more than one vector, we show that the vectors v1; v2; . . . ; vm are linearly dependent if and only if one of them is a linear combination of the others. Suppose, say, vi is a linear combination of the others, vi ¼ a1v1 þ þ ai1vi1 þ aiþ1viþ1 þ þ amvm Then by adding vi to both sides, we obtain a1v1 þ þ ai1vi1 vi þ aiþ1viþ1 þ þ amvm ¼ 0 where the coefficient of vi is not 0. Hence, the vectors are linearly dependent. Conversely, suppose the vectors are linearly dependent, say, b1v1 þ þ bjvj þ þ bmvm ¼ 0; where bj 6¼ 0 Then we can solve for vj obtaining vj ¼ b1 j b1v1 b1 j bj1vj1 b1 j bjþ1vjþ1 b1 j bmvm and so vj is a linear combination of the other vectors. We now state a slightly stronger statement than the one above. This result has many important consequences. LEMMA 4.10: Suppose two or more nonzero vectors v1; v2; . . . ; vm are linearly dependent. Then one of the vectors is a linear combination of the preceding vectors; that is, there exists k 1 such that vk ¼ c1v1 þ c2v2 þ þ ck1vk1 Figure 4-3 CHAPTER 4 Vector Spaces 123
  • 131. Linear Dependence and Echelon Matrices Consider the following echelon matrix A, whose pivots have been circled: A ¼ 0 2 3 4 5 6 7 0 0 4 3 2 3 4 0 0 0 0 7 8 9 0 0 0 0 0 6 7 0 0 0 0 0 0 0 2 6 6 6 6 4 3 7 7 7 7 5 Observe that the rows R2, R3, R4 have 0’s in the second column below the nonzero pivot in R1, and hence any linear combination of R2, R3, R4 must have 0 as its second entry. Thus, R1 cannot be a linear combination of the rows below it. Similarly, the rows R3 and R4 have 0’s in the third column below the nonzero pivot in R2, and hence R2 cannot be a linear combination of the rows below it. Finally, R3 cannot be a multiple of R4, because R4 has a 0 in the fifth column below the nonzero pivot in R3. Viewing the nonzero rows from the bottom up, R4, R3, R2, R1, no row is a linear combination of the preceding rows. Thus, the rows are linearly independent by Lemma 4.10. The argument used with the above echelon matrix A can be used for the nonzero rows of any echelon matrix. Thus, we have the following very useful result. THEOREM 4.11: The nonzero rows of a matrix in echelon form are linearly independent. 4.8 Basis and Dimension First we state two equivalent ways to define a basis of a vector space V. (The equivalence is proved in Problem 4.28.) DEFINITION A: A set S ¼ fu1; u2; . . . ; ung of vectors is a basis of V if it has the following two properties: (1) S is linearly independent. (2) S spans V. DEFINITION B: A set S ¼ fu1; u2; . . . ; ung of vectors is a basis of V if every v 2 V can be written uniquely as a linear combination of the basis vectors. The following is a fundamental result in linear algebra. THEOREM 4.12: Let V be a vector space such that one basis has m elements and another basis has n elements. Then m ¼ n. A vector space V is said to be of finite dimension n or n-dimensional, written dim V ¼ n if V has a basis with n elements. Theorem 4.12 tells us that all bases of V have the same number of elements, so this definition is well defined. The vector space {0} is defined to have dimension 0. Suppose a vector space V does not have a finite basis. Then V is said to be of infinite dimension or to be infinite-dimensional. The above fundamental Theorem 4.12 is a consequence of the following ‘‘replacement lemma’’ (proved in Problem 4.35). LEMMA 4.13: Suppose fv1; v2; . . . ; vng spans V, and suppose fw1; w2; . . . ; wmg is linearly indepen- dent. Then m n, and V is spanned by a set of the form fw1; w2; . . . ; wm; vi1 ; vi2 ; . . . ; vinm g Thus, in particular, n þ 1 or more vectors in V are linearly dependent. Observe in the above lemma that we have replaced m of the vectors in the spanning set of V by the m independent vectors and still retained a spanning set. 124 CHAPTER 4 Vector Spaces
  • 132. Examples of Bases This subsection presents important examples of bases of some of the main vector spaces appearing in this text. (a) Vector space Kn : Consider the following n vectors in Kn : e1 ¼ ð1; 0; 0; 0; . . . ; 0; 0Þ; e2 ¼ ð0; 1; 0; 0; . . . ; 0; 0Þ; . . . ; en ¼ ð0; 0; 0; 0; . . . ; 0; 1Þ These vectors are linearly independent. (For example, they form a matrix in echelon form.) Furthermore, any vector u ¼ ða1; a2; . . . ; anÞ in Kn can be written as a linear combination of the above vectors. Specifically, v ¼ a1e1 þ a2e2 þ þ anen Accordingly, the vectors form a basis of Kn called the usual or standard basis of Kn . Thus (as one might expect), Kn has dimension n. In particular, any other basis of Kn has n elements. (b) Vector space M ¼ Mr;s of all r s matrices: The following six matrices form a basis of the vector space M2;3 of all 2 3 matrices over K: 1 0 0 0 0 0 ; 0 1 0 0 0 0 ; 0 0 1 0 0 0 ; 0 0 0 1 0 0 ; 0 0 0 0 1 0 ; 0 0 0 0 0 1 More generally, in the vector space M ¼ Mr;s of all r s matrices, let Eij be the matrix with ij-entry 1 and 0’s elsewhere. Then all such matrices form a basis of Mr;s called the usual or standard basis of Mr;s. Accordingly, dim Mr;s ¼ rs. (c) Vector space PnðtÞ of all polynomials of degree n: The set S ¼ f1; t; t2 ; t3 ; . . . ; tn g of n þ 1 polynomials is a basis of PnðtÞ. Specifically, any polynomial f ðtÞ of degree n can be expessed as a linear combination of these powers of t, and one can show that these polynomials are linearly independent. Therefore, dim PnðtÞ ¼ n þ 1. (d) Vector space PðtÞ of all polynomials: Consider any finite set S ¼ ff1ðtÞ; f2ðtÞ; . . . ; fmðtÞg of polynomials in PðtÞ, and let m denote the largest of the degrees of the polynomials. Then any polynomial gðtÞ of degree exceeding m cannot be expressed as a linear combination of the elements of S. Thus, S cannot be a basis of PðtÞ. This means that the dimension of PðtÞ is infinite. We note that the infinite set S0 ¼ f1; t; t2 ; t3 ; . . .g, consisting of all the powers of t, spans PðtÞ and is linearly independent. Accordingly, S0 is an infinite basis of PðtÞ. Theorems on Bases The following three theorems (proved in Problems 4.37, 4.38, and 4.39) will be used frequently. THEOREM 4.14: Let V be a vector space of finite dimension n. Then: (i) Any n þ 1 or more vectors in V are linearly dependent. (ii) Any linearly independent set S ¼ fu1; u2; . . . ; ung with n elements is a basis of V. (iii) Any spanning set T ¼ fv1; v2; . . . ; vng of V with n elements is a basis of V. THEOREM 4.15: Suppose S spans a vector space V. Then: (i) Any maximum number of linearly independent vectors in S form a basis of V. (ii) Suppose one deletes from S every vector that is a linear combination of preceding vectors in S. Then the remaining vectors form a basis of V. CHAPTER 4 Vector Spaces 125
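The dimension counts in the examples of bases above are easy to confirm by coordinates: identifying each r×s matrix with the rs-tuple of its entries turns the standard basis of M_{r,s} into the standard basis of K^{rs}. A sketch in Python (sympy) for M_{2,3}:

    from sympy import Matrix, eye, zeros

    basis = []                      # the six matrices E_ij with a single entry 1
    for i in range(2):
        for j in range(3):
            E = zeros(2, 3)
            E[i, j] = 1
            basis.append(E)

    flat = Matrix([list(E) for E in basis])   # flatten each E_ij to a row of length 6
    print(flat == eye(6), flat.rank())        # True 6, so dim M_{2,3} = 2*3 = 6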
  • 133. THEOREM 4.16: Let V be a vector space of finite dimension and let S ¼ fu1; u2; . . . ; urg be a set of linearly independent vectors in V. Then S is part of a basis of V; that is, S may be extended to a basis of V. EXAMPLE 4.11 (a) The following four vectors in R4 form a matrix in echelon form: ð1; 1; 1; 1Þ; ð0; 1; 1; 1Þ; ð0; 0; 1; 1Þ; ð0; 0; 0; 1Þ Thus, the vectors are linearly independent, and, because dim R4 ¼ 4, the four vectors form a basis of R4 . (b) The following n þ 1 polynomials in PnðtÞ are of increasing degree: 1; t 1; ðt 1Þ2 ; . . . ; ðt 1Þn Therefore, no polynomial is a linear combination of preceding polynomials; hence, the polynomials are linear independent. Furthermore, they form a basis of PnðtÞ, because dim PnðtÞ ¼ n þ 1. (c) Consider any four vectors in R3 , say ð257; 132; 58Þ; ð43; 0; 17Þ; ð521; 317; 94Þ; ð328; 512; 731Þ By Theorem 4.14(i), the four vectors must be linearly dependent, because they come from the three-dimensional vector space R3 . Dimension and Subspaces The following theorem (proved in Problem 4.40) gives the basic relationship between the dimension of a vector space and the dimension of a subspace. THEOREM 4.17: Let W be a subspace of an n-dimensional vector space V. Then dim W n. In particular, if dim W ¼ n, then W ¼ V. EXAMPLE 4.12 Let W be a subspace of the real space R3 . Note that dim R3 ¼ 3. Theorem 4.17 tells us that the dimension of W can only be 0, 1, 2, or 3. The following cases apply: (a) If dim W ¼ 0, then W ¼ f0g, a point. (b) If dim W ¼ 1, then W is a line through the origin 0. (c) If dim W ¼ 2, then W is a plane through the origin 0. (d) If dim W ¼ 3, then W is the entire space R3 . 4.9 Application to Matrices, Rank of a Matrix Let A be any m n matrix over a field K. Recall that the rows of A may be viewed as vectors in Kn and that the row space of A, written rowsp(A), is the subspace of Kn spanned by the rows of A. The following definition applies. DEFINITION: The rank of a matrix A, written rank(A), is equal to the maximum number of linearly independent rows of A or, equivalently, the dimension of the row space of A. Recall, on the other hand, that the columns of an m n matrix A may be viewed as vectors in Km and that the column space of A, written colsp(A), is the subspace of Km spanned by the columns of A. Although m may not be equal to n—that is, the rows and columns of A may belong to different vector spaces—we have the following fundamental result. THEOREM 4.18: The maximum number of linearly independent rows of any matrix A is equal to the maximum number of linearly independent columns of A. Thus, the dimension of the row space of A is equal to the dimension of the column space of A. Accordingly, one could restate the above definition of the rank of A using columns instead of rows. 126 CHAPTER 4 Vector Spaces
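Theorem 4.18 can be spot-checked on any matrix: the rank computed from the rows agrees with the rank computed from the columns, that is, rank(A) = rank(A^T). A minimal sketch in Python (sympy) with an illustrative 3×4 matrix of our own:

    from sympy import Matrix

    A = Matrix([[1, 2, 0, 1],
                [2, 4, 1, 3],
                [3, 6, 1, 4]])      # illustrative matrix
    print(A.rank(), A.T.rank())     # 2 and 2: row rank equals column rank
    print(len(A.rref()[1]))         # 2 again, the number of pivots in an echelon form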
  • 134. Basis-Finding Problems This subsection shows how an echelon form of any matrix A gives us the solution to certain problems about A itself. Specifically, let A and B be the following matrices, where the echelon matrix B (whose pivots are circled) is an echelon form of A: A ¼ 1 2 1 3 1 2 2 5 5 6 4 5 3 7 6 11 6 9 1 5 10 8 9 9 2 6 8 11 9 12 2 6 6 6 6 4 3 7 7 7 7 5 and B ¼ 1 2 1 3 1 2 0 1 3 1 2 1 0 0 0 1 1 2 0 0 0 0 0 0 0 0 0 0 0 0 2 6 6 6 6 4 3 7 7 7 7 5 We solve the following four problems about the matrix A, where C1; C2; . . . ; C6 denote its columns: (a) Find a basis of the row space of A. (b) Find each column Ck of A that is a linear combination of preceding columns of A. (c) Find a basis of the column space of A. (d) Find the rank of A. (a) We are given that A and B are row equivalent, so they have the same row space. Moreover, B is in echelon form, so its nonzero rows are linearly independent and hence form a basis of the row space of B. Thus, they also form a basis of the row space of A. That is, basis of rowspðAÞ: ð1; 2; 1; 3; 1; 2Þ; ð0; 1; 3; 1; 2; 1Þ; ð0; 0; 0; 1; 1; 2Þ (b) Let Mk ¼ ½C1; C2; . . . ; Ck, the submatrix of A consisting of the first k columns of A. Then Mk1 and Mk are, respectively, the coefficient matrix and augmented matrix of the vector equation x1C1 þ x2C2 þ þ xk1Ck1 ¼ Ck Theorem 3.9 tells us that the system has a solution, or, equivalently, Ck is a linear combination of the preceding columns of A if and only if rankðMkÞ ¼ rankðMk1Þ, where rankðMkÞ means the number of pivots in an echelon form of Mk. Now the first k column of the echelon matrix B is also an echelon form of Mk. Accordingly, rankðM2Þ ¼ rankðM3Þ ¼ 2 and rankðM4Þ ¼ rankðM5Þ ¼ rankðM6Þ ¼ 3 Thus, C3, C5, C6 are each a linear combination of the preceding columns of A. (c) The fact that the remaining columns C1, C2, C4 are not linear combinations of their respective preceding columns also tells us that they are linearly independent. Thus, they form a basis of the column space of A. That is, basis of colspðAÞ: ½1; 2; 3; 1; 2T ; ½2; 5; 7; 5; 6T ; ½3; 6; 11; 8; 11T Observe that C1, C2, C4 may also be characterized as those columns of A that contain the pivots in any echelon form of A. (d) Here we see that three possible definitions of the rank of A yield the same value. (i) There are three pivots in B, which is an echelon form of A. (ii) The three pivots in B correspond to the nonzero rows of B, which form a basis of the row space of A. (iii) The three pivots in B correspond to the columns of A, which form a basis of the column space of A. Thus, rankðAÞ ¼ 3. CHAPTER 4 Vector Spaces 127
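All four questions come out of a single reduction to row canonical form, and the same recipe works for any matrix. A sketch in Python (sympy) on a small illustrative matrix of our own: the nonzero rows of the rref are a basis of the row space, the pivot columns mark the columns of the original matrix that form a basis of the column space, the remaining columns are combinations of preceding columns, and the number of pivots is the rank.

    from sympy import Matrix

    A = Matrix([[1, 2, 1, 3],
                [2, 4, 0, 4],
                [3, 6, 1, 7]])                          # illustrative matrix
    R, pivots = A.rref()
    print([list(R.row(i)) for i in range(len(pivots))]) # (a) basis of rowsp(A)
    print(pivots)                                       # (b) columns not listed here are
                                                        #     combinations of preceding columns
    print([list(A.col(j)) for j in pivots])             # (c) basis of colsp(A)
    print(len(pivots))                                  # (d) rank(A)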
  • 135. Application to Finding a Basis for W ¼ spanðu1; u2; . . . ; urÞ Frequently, we are given a list S ¼ fu1; u2; . . . ; urg of vectors in Kn and we want to find a basis for the subspace W of Kn spanned by the given vectors—that is, a basis of W ¼ spanðSÞ ¼ spanðu1; u2; . . . ; urÞ The following two algorithms, which are essentially described in the above subsection, find such a basis (and hence the dimension) of W. Algorithm 4.1 (Row space algorithm) Step 1. Form the matrix M whose rows are the given vectors. Step 2. Row reduce M to echelon form. Step 3. Output the nonzero rows of the echelon matrix. Sometimes we want to find a basis that only comes from the original given vectors. The next algorithm accomplishes this task. Algorithm 4.2 (Casting-out algorithm) Step 1. Form the matrix M whose columns are the given vectors. Step 2. Row reduce M to echelon form. Step 3. For each column Ck in the echelon matrix without a pivot, delete (cast out) the vector uk from the list S of given vectors. Step 4. Output the remaining vectors in S (which correspond to columns with pivots). We emphasize that in the first algorithm we form a matrix whose rows are the given vectors, whereas in the second algorithm we form a matrix whose columns are the given vectors. EXAMPLE 4.13 Let W be the subspace of R5 spanned by the following vectors: u1 ¼ ð1; 2; 1; 3; 2Þ; u2 ¼ ð1; 3; 3; 5; 3Þ; u3 ¼ ð3; 8; 7; 13; 8Þ u4 ¼ ð1; 4; 6; 9; 7Þ; u5 ¼ ð5; 13; 13; 25; 19Þ Find a basis of W consisting of the original given vectors, and find dim W. Form the matrix M whose columns are the given vectors, and reduce M to echelon form: M ¼ 1 1 3 1 5 2 3 8 4 13 1 3 7 6 13 3 5 13 9 25 2 3 8 7 19 2 6 6 6 6 4 3 7 7 7 7 5 1 1 3 1 5 0 1 2 2 3 0 0 0 1 2 0 0 0 0 0 0 0 0 0 0 2 6 6 6 6 4 3 7 7 7 7 5 The pivots in the echelon matrix appear in columns C1, C2, C4. Accordingly, we ‘‘cast out’’ the vectors u3 and u5 from the original five vectors. The remaining vectors u1, u2, u4, which correspond to the columns in the echelon matrix with pivots, form a basis of W. Thus, in particular, dim W ¼ 3. Remark: The justification of the casting-out algorithm is essentially described above, but we repeat it again here for emphasis. The fact that column C3 in the echelon matrix in Example 4.13 does not have a pivot means that the vector equation xu1 þ yu2 ¼ u3 has a solution, and hence u3 is a linear combination of u1 and u2. Similarly, the fact that C5 does not have a pivot means that u5 is a linear combination of the preceding vectors. We have deleted each vector in the original spanning set that is a linear combination of preceding vectors. Thus, the remaining vectors are linearly independent and form a basis of W. 128 CHAPTER 4 Vector Spaces
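Both algorithms amount to a single rref computation, as the following sketch in Python (sympy) shows, run on the five vectors of Example 4.13:

    from sympy import Matrix

    u = [(1, 2, 1, 3, 2), (1, 3, 3, 5, 3), (3, 8, 7, 13, 8),
         (1, 4, 6, 9, 7), (5, 13, 13, 25, 19)]          # u1, ..., u5 of Example 4.13

    # Algorithm 4.1 (row space algorithm): rows are the vectors; keep the nonzero rows
    R, piv = Matrix(u).rref()
    print([list(R.row(i)) for i in range(len(piv))])    # a basis of W; dim W = 3

    # Algorithm 4.2 (casting-out algorithm): columns are the vectors; keep pivot columns
    print(Matrix(u).T.rref()[1])    # (0, 1, 3): cast out u3 and u5, keep u1, u2, u4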
Application to Homogeneous Systems of Linear Equations

Consider again a homogeneous system AX = 0 of linear equations over K with n unknowns. By Theorem 4.4, the solution set W of such a system is a subspace of K^n, and hence W has a dimension. The following theorem, whose proof is postponed until Chapter 5, holds.

THEOREM 4.19: The dimension of the solution space W of a homogeneous system AX = 0 is n − r, where n is the number of unknowns and r is the rank of the coefficient matrix A.

In the case where the system AX = 0 is in echelon form, it has precisely n − r free variables, say x_{i1}, x_{i2}, ..., x_{i(n−r)}. Let v_j be the solution obtained by setting x_{ij} = 1 (or any nonzero constant) and the remaining free variables equal to 0. We show (Problem 4.50) that the solutions v_1, v_2, ..., v_{n−r} are linearly independent; hence, they form a basis of the solution space W.

We have already used the above process to find a basis of the solution space W of a homogeneous system AX = 0 in Section 3.11. Problem 4.48 gives three other examples.

4.10 Sums and Direct Sums

Let U and W be subsets of a vector space V. The sum of U and W, written U + W, consists of all sums u + w where u ∈ U and w ∈ W. That is,
U + W = {v : v = u + w, where u ∈ U and w ∈ W}
Now suppose U and W are subspaces of V. Then one can easily show (Problem 4.53) that U + W is a subspace of V. Recall that U ∩ W is also a subspace of V. The following theorem (proved in Problem 4.58) relates the dimensions of these subspaces.

THEOREM 4.20: Suppose U and W are finite-dimensional subspaces of a vector space V. Then U + W has finite dimension and
dim(U + W) = dim U + dim W − dim(U ∩ W)

EXAMPLE 4.14 Let V = M_{2,2}, the vector space of 2×2 matrices. Let U consist of those matrices whose second row is zero, and let W consist of those matrices whose second column is zero. Writing [a b; c d] for the matrix with rows (a, b) and (c, d), we have
U = { [a b; 0 0] },   W = { [a 0; c 0] }   and   U + W = { [a b; c 0] },   U ∩ W = { [a 0; 0 0] }
That is, U + W consists of those matrices whose lower right entry is 0, and U ∩ W consists of those matrices whose second row and second column are zero. Note that dim U = 2, dim W = 2, and dim(U ∩ W) = 1. Also, dim(U + W) = 3, which is expected from Theorem 4.20. That is,
dim(U + W) = dim U + dim W − dim(U ∩ W) = 2 + 2 − 1 = 3

Direct Sums

The vector space V is said to be the direct sum of its subspaces U and W, denoted by
V = U ⊕ W
if every v ∈ V can be written in one and only one way as v = u + w, where u ∈ U and w ∈ W.

The following theorem (proved in Problem 4.59) characterizes such a decomposition.

THEOREM 4.21: The vector space V is the direct sum of its subspaces U and W if and only if: (i) V = U + W, (ii) U ∩ W = {0}.
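Example 4.14 and the two theorems above can be checked numerically by flattening each 2×2 matrix to a vector of length 4: dim(U + W) is the rank of the stacked spanning vectors, and dim(U ∩ W) then falls out of Theorem 4.20. A sketch in Python (sympy):

    from sympy import Matrix

    # Flatten [[a, b], [c, d]] to (a, b, c, d); spanning vectors of U and W:
    U = [(1, 0, 0, 0), (0, 1, 0, 0)]     # second row zero
    W = [(1, 0, 0, 0), (0, 0, 1, 0)]     # second column zero

    dim_U, dim_W = Matrix(U).rank(), Matrix(W).rank()   # 2 and 2
    dim_sum = Matrix(U + W).rank()                      # 3 = dim(U + W)
    dim_int = dim_U + dim_W - dim_sum                   # 1 = dim(U ∩ W), by Theorem 4.20
    print(dim_U, dim_W, dim_sum, dim_int)
    # Since U ∩ W is not {0}, M_{2,2} is not the direct sum of U and W (Theorem 4.21).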
  • 137. EXAMPLE 4.15 Consider the vector space V ¼ R3 : (a) Let U be the xy-plane and let W be the yz-plane; that is, U ¼ fða; b; 0Þ : a; b 2 Rg and W ¼ fð0; b; cÞ : b; c 2 Rg Then R3 ¼ U þ W, because every vector in R3 is the sum of a vector in U and a vector in W. However, R3 is not the direct sum of U and W, because such sums are not unique. For example, ð3; 5; 7Þ ¼ ð3; 1; 0Þ þ ð0; 4; 7Þ and also ð3; 5; 7Þ ¼ ð3; 4; 0Þ þ ð0; 9; 7Þ (b) Let U be the xy-plane and let W be the z-axis; that is, U ¼ fða; b; 0Þ : a; b 2 Rg and W ¼ fð0; 0; cÞ : c 2 Rg Now any vector ða; b; cÞ 2 R3 can be written as the sum of a vector in U and a vector in V in one and only one way: ða; b; cÞ ¼ ða; b; 0Þ þ ð0; 0; cÞ Accordingly, R3 is the direct sum of U and W; that is, R3 ¼ U W. General Direct Sums The notion of a direct sum is extended to more than one factor in the obvious way. That is, V is the direct sum of subspaces W1; W2; . . . ; Wr, written V ¼ W1 W2 Wr if every vector v 2 V can be written in one and only one way as v ¼ w1 þ w2 þ þ wr where w1 2 W1; w2 2 W2; . . . ; wr 2 Wr. The following theorems hold. THEOREM 4.22: Suppose V ¼ W1 W2 Wr. Also, for each k, suppose Sk is a linearly independent subset of Wk. Then (a) The union S ¼ S k Sk is linearly independent in V. (b) If each Sk is a basis of Wk, then S k Sk is a basis of V. (c) dim V ¼ dim W1 þ dim W2 þ þ dim Wr. THEOREM 4.23: Suppose V ¼ W1 þ W2 þ þ Wr and dim V ¼ P k dim Wk. Then V ¼ W1 W2 Wr: 4.11 Coordinates Let V be an n-dimensional vector space over K with basis S ¼ fu1; u2; . . . ; ung. Then any vector v 2 V can be expressed uniquely as a linear combination of the basis vectors in S, say v ¼ a1u1 þ a2u2 þ þ anun These n scalars a1; a2; . . . ; an are called the coordinates of v relative to the basis S, and they form a vector [a1; a2; . . . ; an] in Kn called the coordinate vector of v relative to S. We denote this vector by ½vS, or simply ½v; when S is understood. Thus, ½vS ¼ ½a1; a2; . . . ; an For notational convenience, brackets ½. . ., rather than parentheses ð. . .Þ, are used to denote the coordinate vector. 130 CHAPTER 4 Vector Spaces
  • 138. Remark: The above n scalars a1; a2; . . . ; an also form the coordinate column vector ½a1; a2; . . . ; anT of v relative to S. The choice of the column vector rather than the row vector to represent v depends on the context in which it is used. The use of such column vectors will become clear later in Chapter 6. EXAMPLE 4.16 Consider the vector space P2ðtÞ of polynomials of degree 2. The polynomials p1 ¼ t þ 1; p2 ¼ t 1; p3 ¼ ðt 1Þ2 ¼ t2 2t þ 1 form a basis S of P2ðtÞ. The coordinate vector [v] of v ¼ 2t2 5t þ 9 relative to S is obtained as follows. Set v ¼ xp1 þ yp2 þ zp3 using unknown scalars x, y, z, and simplify: 2t2 5t þ 9 ¼ xðt þ 1Þ þ yðt 1Þ þ zðt2 2t þ 1Þ ¼ xt þ x þ yt y þ zt2 2zt þ z ¼ zt2 þ ðx þ y 2zÞt þ ðx y þ zÞ Then set the coefficients of the same powers of t equal to each other to obtain the system z ¼ 2; x þ y 2z ¼ 5; x y þ z ¼ 9 The solution of the system is x ¼ 3, y ¼ 4, z ¼ 2. Thus, v ¼ 3p1 4p2 þ 2p3; and hence; ½v ¼ ½3; 4; 2 EXAMPLE 4.17 Consider real space R3 . The following vectors form a basis S of R3 : u1 ¼ ð1; 1; 0Þ; u2 ¼ ð1; 1; 0Þ; u3 ¼ ð0; 1; 1Þ The coordinates of v ¼ ð5; 3; 4Þ relative to the basis S are obtained as follows. Set v ¼ xv1 þ yv2 þ zv3; that is, set v as a linear combination of the basis vectors using unknown scalars x, y, z. This yields 5 3 4 2 4 3 5 ¼ x 1 1 0 2 4 3 5 þ y 1 1 0 2 4 3 5 þ z 0 1 1 2 4 3 5 The equivalent system of linear equations is as follows: x þ y ¼ 5; x þ y þ z ¼ 3; z ¼ 4 The solution of the system is x ¼ 3, y ¼ 2, z ¼ 4. Thus, v ¼ 3u1 þ 2u2 þ 4u3; and so ½vs ¼ ½3; 2; 4 Remark 1: There is a geometrical interpretation of the coordinates of a vector v relative to a basis S for the real space Rn , which we illustrate using the basis S of R3 in Example 4.17. First consider the space R3 with the usual x, y, z axes. Then the basis vectors determine a new coordinate system of R3 , say with x0 , y0 , z0 axes, as shown in Fig. 4-4. That is, (1) The x0 -axis is in the direction of u1 with unit length ku1k. (2) The y0 -axis is in the direction of u2 with unit length ku2k. (3) The z0 -axis is in the direction of u3 with unit length ku3k. Then each vector v ¼ ða; b; cÞ or, equivalently, the point Pða; b; cÞ in R3 will have new coordinates with respect to the new x0 , y0 , z0 axes. These new coordinates are precisely ½vS, the coordinates of v with respect to the basis S. Thus, as shown in Example 4.17, the coordinates of the point Pð5; 3; 4Þ with the new axes form the vector [3, 2, 4]. Remark 2: Consider the usual basis E ¼ fe1; e2; . . . ; eng of Kn defined by e1 ¼ ð1; 0; 0; . . . ; 0; 0Þ; e2 ¼ ð0; 1; 0; . . . ; 0; 0Þ; . . . ; en ¼ ð0; 0; 0; . . . ; 0; 1Þ CHAPTER 4 Vector Spaces 131
  • 139. Let v ¼ ða1; a2; . . . ; anÞ be any vector in Kn . Then one can easily show that v ¼ a1e1 þ a2e2 þ þ anen; and so ½vE ¼ ½a1; a2; . . . ; an That is, the coordinate vector ½vE of any vector v relative to the usual basis E of Kn is identical to the original vector v. Isomorphism of V and Kn Let V be a vector space of dimension n over K, and suppose S ¼ fu1; u2; . . . ; ung is a basis of V. Then each vector v 2 V corresponds to a unique n-tuple ½vS in Kn . On the other hand, each n-tuple [c1; c2; . . . ; cn] in Kn corresponds to a unique vector c1u1 þ c2u2 þ þ cnun in V. Thus, the basis S induces a one-to-one correspondence between V and Kn . Furthermore, suppose v ¼ a1u1 þ a2u2 þ þ anun and w ¼ b1u1 þ b2u2 þ þ bnun Then v þ w ¼ ða1 þ b1Þu1 þ ða2 þ b2Þu2 þ þ ðan þ bnÞun kv ¼ ðka1Þu1 þ ðka2Þu2 þ þ ðkanÞun where k is a scalar. Accordingly, ½v þ wS ¼ ½a1 þ b1; . . . ; an þ bn ¼ ½a1; . . . ; an þ ½b1; . . . ; bn ¼ ½vS þ ½wS ½kvS ¼ ½ka1; ka2; . . . ; kan ¼ k½a1; a2; . . . ; an ¼ k½vS Thus, the above one-to-one correspondence between V and Kn preserves the vector space operations of vector addition and scalar multiplication. We then say that V and Kn are isomorphic, written V ffi Kn We state this result formally. Figure 4-4 132 CHAPTER 4 Vector Spaces
  • 140. THEOREM 4.24: Let V be an n-dimensional vector space over a field K. Then V and Kn are isomorphic. The next example gives a practical application of the above result. EXAMPLE 4.18 Suppose we want to determine whether or not the following matrices in V ¼ M2;3 are linearly dependent: A ¼ 1 2 3 4 0 1 ; B ¼ 1 3 4 6 5 4 ; C ¼ 3 8 11 16 10 9 The coordinate vectors of the matrices in the usual basis of M2;3 are as follows: ½A ¼ ½1; 2; 3; 4; 0; 1; ½B ¼ ½1; 3; 4; 6; 5; 4; ½C ¼ ½3; 8; 11; 16; 10; 9 Form the matrix M whose rows are the above coordinate vectors and reduce M to an echelon form: M ¼ 1 2 3 4 0 1 1 3 4 6 5 4 3 8 11 16 10 9 2 4 3 5 1 2 3 4 0 1 0 1 1 2 5 3 0 2 2 4 10 6 2 4 3 5 1 2 3 4 0 1 0 1 1 2 5 3 0 0 0 0 0 0 2 4 3 5 Because the echelon matrix has only two nonzero rows, the coordinate vectors [A], [B], [C] span a subspace of dimension 2 and so are linearly dependent. Accordingly, the original matrices A, B, C are linearly dependent. SOLVED PROBLEMS Vector Spaces, Linear Combinations 4.1. Suppose u and v belong to a vector space V. Simplify each of the following expressions: (a) E1 ¼ 3ð2u 4vÞ þ 5u þ 7v, (c) E3 ¼ 2uv þ 3ð2u þ 4vÞ (b) E2 ¼ 3u 6ð3u 5vÞ þ 7u, (d) E4 ¼ 5u 3 v þ 5u Multiply out and collect terms: (a) E1 ¼ 6u 12v þ 5u þ 7v ¼ 11u 5v (b) E2 ¼ 3u 18u þ 30v þ 7u ¼ 8u þ 30v (c) E3 is not defined because the product uv of vectors is not defined. (d) E4 is not defined because division by a vector is not defined. 4.2. Prove Theorem 4.1: Let V be a vector space over a field K. (i) k0 ¼ 0. (ii) 0u ¼ 0. (iii) If ku ¼ 0, then k ¼ 0 or u ¼ 0. (iv) ðkÞu ¼ kðuÞ ¼ ku. (i) By Axiom [A2] with u ¼ 0, we have 0 þ 0 ¼ 0. Hence, by Axiom [M1], we have k0 ¼ kð0 þ 0Þ ¼ k0 þ k0 Adding k0 to both sides gives the desired result. (ii) For scalars, 0 þ 0 ¼ 0. Hence, by Axiom [M2], we have 0u ¼ ð0 þ 0Þu ¼ 0u þ 0u Adding 0u to both sides gives the desired result. (iii) Suppose ku ¼ 0 and k 6¼ 0. Then there exists a scalar k1 such that k1 k ¼ 1. Thus, u ¼ 1u ¼ ðk1 kÞu ¼ k1 ðkuÞ ¼ k1 0 ¼ 0 (iv) Using u þ ðuÞ ¼ 0 and k þ ðkÞ ¼ 0 yields 0 ¼ k0 ¼ k½u þ ðuÞ ¼ ku þ kðuÞ and 0 ¼ 0u ¼ ½k þ ðkÞu ¼ ku þ ðkÞu Adding ku to both sides of the first equation gives ku ¼ kðuÞ; and adding ku to both sides of the second equation gives ku ¼ ðkÞu. Thus, ðkÞu ¼ kðuÞ ¼ ku. CHAPTER 4 Vector Spaces 133
  • 141. 4.3. Show that (a) kðu vÞ ¼ ku kv, (b) u þ u ¼ 2u. (a) Using the definition of subtraction, that u v ¼ u þ ðvÞ, and Theorem 4.1(iv), that kðvÞ ¼ kv, we have kðu vÞ ¼ k½u þ ðvÞ ¼ ku þ kðvÞ ¼ ku þ ðkvÞ ¼ ku kv (b) Using Axiom [M4] and then Axiom [M2], we have u þ u ¼ 1u þ 1u ¼ ð1 þ 1Þu ¼ 2u 4.4. Express v ¼ ð1; 2; 5Þ in R3 as a linear combination of the vectors u1 ¼ ð1; 1; 1Þ; u2 ¼ ð1; 2; 3Þ; u3 ¼ ð2; 1; 1Þ We seek scalars x, y, z, as yet unknown, such that v ¼ xu1 þ yu2 þ zu3. Thus, we require 1 2 5 2 4 3 5 ¼ x 1 1 1 2 4 3 5 þ y 1 2 3 2 4 3 5 þ z 2 1 1 2 4 3 5 or x þ y þ 2z ¼ 1 x þ 2y z ¼ 2 x þ 3y þ z ¼ 5 (For notational convenience, we write the vectors in R3 as columns, because it is then easier to find the equivalent system of linear equations.) Reducing the system to echelon form yields the triangular system x þ y þ 2z ¼ 1; y 3z ¼ 3; 5z ¼ 10 The system is consistent and has a solution. Solving by back-substitution yields the solution x ¼ 6, y ¼ 3, z ¼ 2. Thus, v ¼ 6u1 þ 3u2 þ 2u3. Alternatively, write down the augmented matrix M of the equivalent system of linear equations, where u1, u2, u3 are the first three columns of M and v is the last column, and then reduce M to echelon form: M ¼ 1 1 2 1 1 2 1 2 1 3 1 5 2 4 3 5 1 1 2 1 0 1 3 3 0 2 1 4 2 4 3 5 1 1 2 1 0 1 3 3 0 0 5 10 2 4 3 5 The last matrix corresponds to a triangular system, which has a solution. Solving the triangular system by back-substitution yields the solution x ¼ 6, y ¼ 3, z ¼ 2. Thus, v ¼ 6u1 þ 3u2 þ 2u3. 4.5. Express v ¼ ð2; 5; 3Þ in R3 as a linear combination of the vectors u1 ¼ ð1; 3; 2Þ; u2 ¼ ð2; 4; 1Þ; u3 ¼ ð1; 5; 7Þ We seek scalars x, y, z, as yet unknown, such that v ¼ xu1 þ yu2 þ zu3. Thus, we require 2 5 3 2 4 3 5 ¼ x 1 3 2 2 4 3 5 þ y 2 4 1 2 4 3 5 þ z 1 5 7 2 4 3 5 or x þ 2y þ z ¼ 2 3x 4y 5z ¼ 5 2x y þ 7z ¼ 3 Reducing the system to echelon form yields the system x þ 2y þ z ¼ 2; 2y 2z ¼ 1; 0 ¼ 3 The system is inconsistent and so has no solution. Thus, v cannot be written as a linear combination of u1, u2, u3. 4.6. Express the polynomial v ¼ t2 þ 4t 3 in PðtÞ as a linear combination of the polynomials p1 ¼ t2 2t þ 5; p2 ¼ 2t2 3t; p3 ¼ t þ 1 Set v as a linear combination of p1, p2, p3 using unknowns x, y, z to obtain t2 þ 4t 3 ¼ xðt2 2t þ 5Þ þ yð2t2 3tÞ þ zðt þ 1Þ ð*Þ We can proceed in two ways. 134 CHAPTER 4 Vector Spaces
  • 142. Method 1. Expand the right side of (*) and express it in terms of powers of t as follows: t2 þ 4t 3 ¼ xt2 2xt þ 5x þ 2yt2 3yt þ zt þ z ¼ ðx þ 2yÞt2 þ ð2x 3y þ zÞt þ ð5x þ 3zÞ Set coefficients of the same powers of t equal to each other, and reduce the system to echelon form. This yields x þ 2y ¼ 1 2x 3y þ z ¼ 4 5x þ 3z ¼ 3 or x þ 2y ¼ 1 y þ z ¼ 6 10y þ 3z ¼ 8 or x þ 2y ¼ 1 y þ z ¼ 6 13z ¼ 52 The system is consistent and has a solution. Solving by back-substitution yields the solution x ¼ 3, y ¼ 2, z ¼ 4. Thus, v ¼ 3p1 þ 2p2 þ 4p2. Method 2. The equation (*) is an identity in t; that is, the equation holds for any value of t. Thus, we can set t equal to any numbers to obtain equations in the unknowns. (a) Set t ¼ 0 in (*) to obtain the equation 3 ¼ 5x þ z. (b) Set t ¼ 1 in (*) to obtain the equation 2 ¼ 4x y þ 2z. (c) Set t ¼ 1 in (*) to obtain the equation 6 ¼ 8x þ 5y. Solve the system of the three equations to again obtain the solution x ¼ 3, y ¼ 2, z ¼ 4. Thus, v ¼ 3p1 þ 2p2 þ 4p3. 4.7. Express M as a linear combination of the matrices A, B, C, where M ¼ 4 7 7 9 ; and A ¼ 1 1 1 1 ; B ¼ 1 2 3 4 ; C ¼ 1 1 4 5 Set M as a linear combination of A, B, C using unknown scalars x, y, z; that is, set M ¼ xA þ yB þ zC. This yields 4 7 7 9 ¼ x 1 1 1 1 þ y 1 2 3 4 þ z 1 1 4 5 ¼ x þ y þ z x þ 2y þ z x þ 3y þ 4z x þ 4y þ 5z Form the equivalent system of equations by setting corresponding entries equal to each other: x þ y þ z ¼ 4; x þ 2y þ z ¼ 7; x þ 3y þ 4z ¼ 7; x þ 4y þ 5z ¼ 9 Reducing the system to echelon form yields x þ y þ z ¼ 4; y ¼ 3; 3z ¼ 3; 4z ¼ 4 The last equation drops out. Solving the system by back-substitution yields z ¼ 1, y ¼ 3, x ¼ 2. Thus, M ¼ 2A þ 3B C. Subspaces 4.8. Prove Theorem 4.2: W is a subspace of V if the following two conditions hold: (a) 0 2 W. (b) If u; v 2 W, then u þ v, ku 2 W. By (a), W is nonempty, and, by (b), the operations of vector addition and scalar multiplication are well defined for W. Axioms [A1], [A4], [M1], [M2], [M3], [M4] hold in W because the vectors in W belong to V. Thus, we need only show that [A2] and [A3] also hold in W. Now [A2] holds because the zero vector in V belongs to W by (a). Finally, if v 2 W, then ð1Þv ¼ v 2 W, and v þ ðvÞ ¼ 0. Thus [A3] holds. 4.9. Let V ¼ R3 . Show that W is not a subspace of V, where (a) W ¼ fða; b; cÞ : a 0g, (b) W ¼ fða; b; cÞ : a2 þ b2 þ c2 1g. In each case, show that Theorem 4.2 does not hold. CHAPTER 4 Vector Spaces 135
  • 143. (a) W consists of those vectors whose first entry is nonnegative. Thus, v ¼ ð1; 2; 3Þ belongs to W. Let k ¼ 3. Then kv ¼ ð3; 6; 9Þ does not belong to W, because 3 is negative. Thus, W is not a subspace of V. (b) W consists of vectors whose length does not exceed 1. Hence, u ¼ ð1; 0; 0Þ and v ¼ ð0; 1; 0Þ belong to W, but u þ v ¼ ð1; 1; 0Þ does not belong to W, because 12 þ 12 þ 02 ¼ 2 1. Thus, W is not a subspace of V. 4.10. Let V ¼ PðtÞ, the vector space of real polynomials. Determine whether or not W is a subspace of V, where (a) W consists of all polynomials with integral coefficients. (b) W consists of all polynomials with degree 6 and the zero polynomial. (c) W consists of all polynomials with only even powers of t. (a) No, because scalar multiples of polynomials in W do not always belong to W. For example, f ðtÞ ¼ 3 þ 6t þ 7t2 2 W but 1 2 f ðtÞ ¼ 3 2 þ 3t þ 7 2 t2 62 W (b and c) Yes. In each case, W contains the zero polynomial, and sums and scalar multiples of polynomials in W belong to W. 4.11. Let V be the vector space of functions f : R ! R. Show that W is a subspace of V, where (a) W ¼ f f ðxÞ : f ð1Þ ¼ 0g, all functions whose value at 1 is 0. (b) W ¼ f f ðxÞ : f ð3Þ ¼ f ð1Þg, all functions assigning the same value to 3 and 1. (c) W ¼ f f ðtÞ : f ðxÞ ¼ f ðxÞg, all odd functions. Let ^ 0 denote the zero function, so ^ 0ðxÞ ¼ 0 for every value of x. (a) ^ 0 2 W, because ^ 0ð1Þ ¼ 0. Suppose f ; g 2 W. Then f ð1Þ ¼ 0 and gð1Þ ¼ 0. Also, for scalars a and b, we have ðaf þ bgÞð1Þ ¼ af ð1Þ þ bgð1Þ ¼ a0 þ b0 ¼ 0 Thus, af þ bg 2 W, and hence W is a subspace. (b) ^ 0 2 W, because ^ 0ð3Þ ¼ 0 ¼ ^ 0ð1Þ. Suppose f; g 2 W. Then f ð3Þ ¼ f ð1Þ and gð3Þ ¼ gð1Þ. Thus, for any scalars a and b, we have ðaf þ bgÞð3Þ ¼ af ð3Þ þ bgð3Þ ¼ af ð1Þ þ bgð1Þ ¼ ðaf þ bgÞð1Þ Thus, af þ bg 2 W, and hence W is a subspace. (c) ^ 0 2 W, because ^ 0ðxÞ ¼ 0 ¼ 0 ¼ ^ 0ðxÞ. Suppose f; g 2 W. Then f ðxÞ ¼ f ðxÞ and gðxÞ ¼ gðxÞ. Also, for scalars a and b, ðaf þ bgÞðxÞ ¼ af ðxÞ þ bgðxÞ ¼ af ðxÞ bgðxÞ ¼ ðaf þ bgÞðxÞ Thus, ab þ gf 2 W, and hence W is a subspace of V. 4.12. Prove Theorem 4.3: The intersection of any number of subspaces of V is a subspace of V. Let fWi : i 2 Ig be a collection of subspaces of V and let W ¼ ðWi : i 2 IÞ. Because each Wi is a subspace of V, we have 0 2 Wi, for every i 2 I. Hence, 0 2 W. Suppose u; v 2 W. Then u; v 2 Wi, for every i 2 I. Because each Wi is a subspace, au þ bv 2 Wi , for every i 2 I. Hence, au þ bv 2 W. Thus, W is a subspace of V. Linear Spans 4.13. Show that the vectors u1 ¼ ð1; 1; 1Þ, u2 ¼ ð1; 2; 3Þ, u3 ¼ ð1; 5; 8Þ span R3 . We need to show that an arbitrary vector v ¼ ða; b; cÞ in R3 is a linear combination of u1, u2, u3. Set v ¼ xu1 þ yu2 þ zu3; that is, set ða; b; cÞ ¼ xð1; 1; 1Þ þ yð1; 2; 3Þ þ zð1; 5; 8Þ ¼ ðx þ y þ z; x þ 2y þ 5z; x þ 3y þ 8zÞ 136 CHAPTER 4 Vector Spaces
  • 144. Form the equivalent system and reduce it to echelon form: x þ y þ z ¼ a x þ 2y þ 5z ¼ b x þ 3y þ 8z ¼ c or x þ y þ z ¼ a y þ 4z ¼ b a 2y þ 7c ¼ c a or x þ y þ z ¼ a y þ 4z ¼ b a z ¼ c 2b þ a The above system is in echelon form and is consistent; in fact, x ¼ a þ 5b 3c; y ¼ 3a 7b þ 4c; z ¼ a þ 2b c is a solution. Thus, u1, u2, u3 span R3 . 4.14. Find conditions on a, b, c so that v ¼ ða; b; cÞ in R3 belongs to W ¼ spanðu1; u2; u3Þ; where u1 ¼ ð1; 2; 0Þ; u2 ¼ ð1; 1; 2Þ; u3 ¼ ð3; 0; 4Þ Set v as a linear combination of u1, u2, u3 using unknowns x, y, z; that is, set v ¼ xu1 þ yu2 þ zu3: This yields ða; b; cÞ ¼ xð1; 2; 0Þ þ yð1; 1; 2Þ þ zð3; 0; 4Þ ¼ ðx y þ 3z; 2x þ y; 2y 4zÞ Form the equivalent system of linear equations and reduce it to echelon form: x y þ 3z ¼ a 2x þ y ¼ b 2y 4z ¼ c or x y þ 3z ¼ a 3y 6z ¼ b 2a 2y 4z ¼ c or x y þ 3z ¼ a 3y 6z ¼ b 2a 0 ¼ 4a 2b þ 3c The vector v ¼ ða; b; cÞ belongs to W if and only if the system is consistent, and it is consistent if and only if 4a 2b þ 3c ¼ 0. Note, in particular, that u1, u2, u3 do not span the whole space R3 . 4.15. Show that the vector space V ¼ PðtÞ of real polynomials cannot be spanned by a finite number of polynomials. Any finite set S of polynomials contains a polynomial of maximum degree, say m. Then the linear span span(S) of S cannot contain a polynomial of degree greater than m. Thus, spanðSÞ 6¼ V, for any finite set S. 4.16. Prove Theorem 4.5: Let S be a subset of V. (i) Then span(S) is a subspace of V containing S. (ii) If W is a subspace of V containing S, then spanðSÞ W. (i) Suppose S is empty. By definition, spanðSÞ ¼ f0g. Hence spanðSÞ ¼ f0g is a subspace of V and S spanðSÞ. Suppose S is not empty and v 2 S. Then v ¼ 1v 2 spanðSÞ; hence, S spanðSÞ. Also 0 ¼ 0v 2 spanðSÞ. Now suppose u; w 2 spanðSÞ, say u ¼ a1u1 þ þ arur ¼ P i aiui and w ¼ b1w1 þ þ bsws ¼ P j bjwj where ui, wj 2 S and ai; bj 2 K. Then u þ v ¼ P i aiui þ P j bjwj and ku ¼ k P i aiui ¼ P i kaiui belong to span(S) because each is a linear combination of vectors in S. Thus, span(S) is a subspace of V. (ii) Suppose u1; u2; . . . ; ur 2 S. Then all the ui belong to W. Thus, all multiples a1u1; a2u2; . . . ; arur 2 W, and so the sum a1u1 þ a2u2 þ þ arur 2 W. That is, W contains all linear combinations of elements in S, or, in other words, spanðSÞ W, as claimed. Linear Dependence 4.17. Determine whether or not u and v are linearly dependent, where (a) u ¼ ð1; 2Þ, v ¼ ð3; 5Þ, (c) u ¼ ð1; 2; 3Þ, v ¼ ð4; 5; 6Þ (b) u ¼ ð1; 3Þ, v ¼ ð2; 6Þ, (d) u ¼ ð2; 4; 8Þ, v ¼ ð3; 6; 12Þ Two vectors u and v are linearly dependent if and only if one is a multiple of the other. (a) No. (b) Yes; for v ¼ 2u. (c) No. (d) Yes, for v ¼ 3 2 u. CHAPTER 4 Vector Spaces 137
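The test in Problem 4.17 reduces to a rank computation: two vectors are linearly dependent exactly when the matrix having them as rows has rank at most 1, and the same test handles the polynomials and matrices of the next problem after flattening them to coordinate vectors. A quick sketch in Python (sympy), taking two of the pairs as printed above:

    from sympy import Matrix

    print(Matrix([(1, 2), (3, 5)]).rank())          # 2 -> independent          (Problem 4.17(a))
    print(Matrix([(2, 4, 8), (3, 6, 12)]).rank())   # 1 -> dependent, v = (3/2)u (Problem 4.17(d))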
  • 145. 4.18. Determine whether or not u and v are linearly dependent, where (a) u ¼ 2t2 þ 4t 3, v ¼ 4t2 þ 8t 6, (b) u ¼ 2t2 3t þ 4, v ¼ 4t2 3t þ 2, (c) u ¼ 1 3 4 5 0 1 ; v ¼ 4 12 16 20 0 4 , (d) u ¼ 1 1 1 2 2 2 ; v ¼ 2 2 2 3 3 3 Two vectors u and v are linearly dependent if and only if one is a multiple of the other. (a) Yes; for v ¼ 2u. (b) No. (c) Yes, for v ¼ 4u. (d) No. 4.19. Determine whether or not the vectors u ¼ ð1; 1; 2Þ, v ¼ ð2; 3; 1Þ, w ¼ ð4; 5; 5Þ in R3 are linearly dependent. Method 1. Set a linear combination of u, v, w equal to the zero vector using unknowns x, y, z to obtain the equivalent homogeneous system of linear equations and then reduce the system to echelon form. This yields x 1 1 1 2 4 3 5 þ y 2 3 1 2 4 3 5 þ z 4 5 5 2 4 3 5 ¼ 0 0 0 2 4 3 5 or x þ 2y þ 4z ¼ 0 x þ 3y þ 5z ¼ 0 2x þ y þ 5z ¼ 0 or x þ 2y þ 4z ¼ 0 y þ z ¼ 0 The echelon system has only two nonzero equations in three unknowns; hence, it has a free variable and a nonzero solution. Thus, u, v, w are linearly dependent. Method 2. Form the matrix A whose columns are u, v, w and reduce to echelon form: A ¼ 1 2 4 1 3 5 2 1 5 2 4 3 5 1 2 4 0 1 1 0 3 3 2 4 3 5 1 2 4 0 1 1 0 0 0 2 4 3 5 The third column does not have a pivot; hence, the third vector w is a linear combination of the first two vectors u and v. Thus, the vectors are linearly dependent. (Observe that the matrix A is also the coefficient matrix in Method 1. In other words, this method is essentially the same as the first method.) Method 3. Form the matrix B whose rows are u, v, w, and reduce to echelon form: B ¼ 1 1 2 2 3 1 4 5 5 2 4 3 5 0 1 2 0 1 3 0 1 3 2 4 3 5 1 1 2 0 1 3 0 0 0 2 4 3 5 Because the echelon matrix has only two nonzero rows, the three vectors are linearly dependent. (The three given vectors span a space of dimension 2.) 4.20. Determine whether or not each of the following lists of vectors in R3 is linearly dependent: (a) u1 ¼ ð1; 2; 5Þ, u2 ¼ ð1; 3; 1Þ, u3 ¼ ð2; 5; 7Þ, u4 ¼ ð3; 1; 4Þ, (b) u ¼ ð1; 2; 5Þ, v ¼ ð2; 5; 1Þ, w ¼ ð1; 5; 2Þ, (c) u ¼ ð1; 2; 3Þ, v ¼ ð0; 0; 0Þ, w ¼ ð1; 5; 6Þ. (a) Yes, because any four vectors in R3 are linearly dependent. (b) Use Method 2 above; that is, form the matrix A whose columns are the given vectors, and reduce the matrix to echelon form: A ¼ 1 2 1 2 5 5 5 1 2 2 4 3 5 1 2 1 0 1 3 0 9 3 2 4 3 5 1 2 1 0 1 3 0 0 24 2 4 3 5 Every column has a pivot entry; hence, no vector is a linear combination of the previous vectors. Thus, the vectors are linearly independent. (c) Because 0 ¼ ð0; 0; 0Þ is one of the vectors, the vectors are linearly dependent. 138 CHAPTER 4 Vector Spaces
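Problem 4.19 can be confirmed in one step, and the row canonical form even exhibits the dependence explicitly. A sketch in Python (sympy):

    from sympy import Matrix

    A = Matrix([[1, 2, 4], [1, 3, 5], [2, 1, 5]])   # columns are u, v, w of Problem 4.19
    print(A.rref())
    # The rref is [1 0 2; 0 1 1; 0 0 0] with pivots (0, 1): only two pivots, so the vectors
    # are dependent, and the last column of the rref shows that w = 2u + v.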
  • 146. 4.21. Show that the functions f ðtÞ ¼ sin t, gðtÞ cos t, hðtÞ ¼ t from R into R are linearly independent. Set a linear combination of the functions equal to the zero function 0 using unknown scalars x, y, z; that is, set xf þ yg þ zh ¼ 0. Then show x ¼ 0, y ¼ 0, z ¼ 0. We emphasize that xf þ yg þ zh ¼ 0 means that, for every value of t, we have xf ðtÞ þ ygðtÞ þ zhðtÞ ¼ 0. Thus, in the equation x sin t þ y cos t þ zt ¼ 0: ðiÞ Set t ¼ 0 ðiiÞ Set t ¼ p=2 ðiiiÞ Set t ¼ p to obtain to obtain to obtain xð0Þ þ yð1Þ þ zð0Þ ¼ 0 xð1Þ þ yð0Þ þ zp=2 ¼ 0 xð0Þ þ yð1Þ þ zðpÞ ¼ 0 or or or y ¼ 0: x þ pz=2 ¼ 0: y þ pz ¼ 0: The three equations have only the zero solution; that is, x ¼ 0, y ¼ 0, z ¼ 0. Thus, f , g, h are linearly independent. 4.22. Suppose the vectors u, v, w are linearly independent. Show that the vectors u þ v, u v, u 2v þ w are also linearly independent. Suppose xðu þ vÞ þ yðu vÞ þ zðu 2v þ wÞ ¼ 0. Then xu þ xv þ yu yv þ zu 2zv þ zw ¼ 0 or ðx þ y þ zÞu þ ðx y 2zÞv þ zw ¼ 0 Because u, v, w are linearly independent, the coefficients in the above equation are each 0; hence, x þ y þ z ¼ 0; x y 2z ¼ 0; z ¼ 0 The only solution to the above homogeneous system is x ¼ 0, y ¼ 0, z ¼ 0. Thus, u þ v, u v, u 2v þ w are linearly independent. 4.23. Show that the vectors u ¼ ð1 þ i; 2iÞ and w ¼ ð1; 1 þ iÞ in C2 are linearly dependent over the complex field C but linearly independent over the real field R. Recall that two vectors are linearly dependent (over a field K) if and only if one of them is a multiple of the other (by an element in K). Because ð1 þ iÞw ¼ ð1 þ iÞð1; 1 þ iÞ ¼ ð1 þ i; 2iÞ ¼ u u and w are linearly dependent over C. On the other hand, u and w are linearly independent over R, as no real multiple of w can equal u. Specifically, when k is real, the first component of kw ¼ ðk; k þ kiÞ must be real, and it can never equal the first component 1 þ i of u, which is complex. Basis and Dimension 4.24. Determine whether or not each of the following form a basis of R3 : (a) (1, 1, 1), (1, 0, 1); (c) (1, 1, 1), (1, 2, 3), ð2; 1; 1Þ; (b) (1, 2, 3), (1, 3, 5), (1, 0, 1), (2, 3, 0); (d) (1, 1, 2), (1, 2, 5), (5, 3, 4). (a and b) No, because a basis of R3 must contain exactly three elements because dim R3 ¼ 3. (c) The three vectors form a basis if and only if they are linearly independent. Thus, form the matrix whose rows are the given vectors, and row reduce the matrix to echelon form: 1 1 1 1 2 3 2 1 1 2 4 3 5 1 1 1 0 1 2 0 3 1 2 4 3 5 1 1 1 0 1 2 0 0 5 2 4 3 5 The echelon matrix has no zero rows; hence, the three vectors are linearly independent, and so they do form a basis of R3 . CHAPTER 4 Vector Spaces 139
  • 147. (d) Form the matrix whose rows are the given vectors, and row reduce the matrix to echelon form: 1 1 2 1 2 5 5 3 4 2 4 3 5 1 1 2 0 1 3 0 2 6 2 4 3 5 1 1 2 0 1 3 0 0 0 2 4 3 5 The echelon matrix has a zero row; hence, the three vectors are linearly dependent, and so they do not form a basis of R3 . 4.25. Determine whether (1, 1, 1, 1), (1, 2, 3, 2), (2, 5, 6, 4), (2, 6, 8, 5) form a basis of R4 . If not, find the dimension of the subspace they span. Form the matrix whose rows are the given vectors, and row reduce to echelon form: B ¼ 1 1 1 1 1 2 3 2 2 5 6 4 2 6 8 5 2 6 6 4 3 7 7 5 1 1 1 1 0 1 2 1 0 3 4 2 0 4 6 3 2 6 6 4 3 7 7 5 1 1 1 1 0 1 2 1 0 0 2 1 0 0 2 1 2 6 6 4 3 7 7 5 1 1 1 1 0 1 2 1 0 0 2 1 0 0 0 0 2 6 6 4 3 7 7 5 The echelon matrix has a zero row. Hence, the four vectors are linearly dependent and do not form a basis of R4 . Because the echelon matrix has three nonzero rows, the four vectors span a subspace of dimension 3. 4.26. Extend fu1 ¼ ð1; 1; 1; 1Þ; u2 ¼ ð2; 2; 3; 4Þg to a basis of R4 . First form the matrix with rows u1 and u2, and reduce to echelon form: 1 1 1 1 2 2 3 4 1 1 1 1 0 0 1 2 Then w1 ¼ ð1; 1; 1; 1Þ and w2 ¼ ð0; 0; 1; 2Þ span the same set of vectors as spanned by u1 and u2. Let u3 ¼ ð0; 1; 0; 0Þ and u4 ¼ ð0; 0; 0; 1Þ. Then w1, u3, w2, u4 form a matrix in echelon form. Thus, they are linearly independent, and they form a basis of R4 . Hence, u1, u2, u3, u4 also form a basis of R4 . 4.27. Consider the complex field C, which contains the real field R, which contains the rational field Q. (Thus, C is a vector space over R, and R is a vector space over Q.) (a) Show that f1; ig is a basis of C over R; hence, C is a vector space of dimension 2 over R. (b) Show that R is a vector space of infinite dimension over Q. (a) For any v 2 C, we have v ¼ a þ bi ¼ að1Þ þ bðiÞ, where a; b 2 R. Hence, f1; ig spans C over R. Furthermore, if xð1Þ þ yðiÞ ¼ 0 or x þ yi ¼ 0, where x, y 2 R, then x ¼ 0 and y ¼ 0. Hence, f1; ig is linearly independent over R. Thus, f1; ig is a basis for C over R. (b) It can be shown that p is a transcendental number; that is, p is not a root of any polynomial over Q. Thus, for any n, the n þ 1 real numbers 1; p; p2 ; . . . ; pn are linearly independent over Q. R cannot be of dimension n over Q. Accordingly, R is of infinite dimension over Q. 4.28. Suppose S ¼ fu1; u2; . . . ; ung is a subset of V. Show that the following Definitions A and B of a basis of V are equivalent: (A) S is linearly independent and spans V. (B) Every v 2 V is a unique linear combination of vectors in S. Suppose (A) holds. Because S spans V, the vector v is a linear combination of the ui, say u ¼ a1u1 þ a2u2 þ þ anun and u ¼ b1u1 þ b2u2 þ þ bnun Subtracting, we get 0 ¼ v v ¼ ða1 b1Þu1 þ ða2 b2Þu2 þ þ ðan bnÞun 140 CHAPTER 4 Vector Spaces
  • 148. But the ui are linearly independent. Hence, the coefficients in the above relation are each 0: a1 b1 ¼ 0; a2 b2 ¼ 0; . . . ; an bn ¼ 0 Therefore, a1 ¼ b1; a2 ¼ b2; . . . ; an ¼ bn. Hence, the representation of v as a linear combination of the ui is unique. Thus, (A) implies (B). Suppose (B) holds. Then S spans V. Suppose 0 ¼ c1u1 þ c2u2 þ þ cnun However, we do have 0 ¼ 0u1 þ 0u2 þ þ 0un By hypothesis, the representation of 0 as a linear combination of the ui is unique. Hence, each ci ¼ 0 and the ui are linearly independent. Thus, (B) implies (A). Dimension and Subspaces 4.29. Find a basis and dimension of the subspace W of R3 where (a) W ¼ fða; b; cÞ : a þ b þ c ¼ 0g, (b) W ¼ fða; b; cÞ : ða ¼ b ¼ cÞg (a) Note that W 6¼ R3 , because, for example, ð1; 2; 3Þ 62 W. Thus, dim W 3. Note that u1 ¼ ð1; 0; 1Þ and u2 ¼ ð0; 1; 1Þ are two independent vectors in W. Thus, dim W ¼ 2, and so u1 and u2 form a basis of W. (b) The vector u ¼ ð1; 1; 1Þ 2 W. Any vector w 2 W has the form w ¼ ðk; k; kÞ. Hence, w ¼ ku. Thus, u spans W and dim W ¼ 1. 4.30. Let W be the subspace of R4 spanned by the vectors u1 ¼ ð1; 2; 5; 3Þ; u2 ¼ ð2; 3; 1; 4Þ; u3 ¼ ð3; 8; 3; 5Þ (a) Find a basis and dimension of W. (b) Extend the basis of W to a basis of R4 . (a) Apply Algorithm 4.1, the row space algorithm. Form the matrix whose rows are the given vectors, and reduce it to echelon form: A ¼ 1 2 5 3 2 3 1 4 3 8 3 5 2 4 3 5 1 2 5 3 0 7 9 2 0 14 18 4 2 4 3 5 1 2 5 3 0 7 9 2 0 0 0 0 2 4 3 5 The nonzero rows ð1; 2; 5; 3Þ and ð0; 7; 9; 2Þ of the echelon matrix form a basis of the row space of A and hence of W. Thus, in particular, dim W ¼ 2. (b) We seek four linearly independent vectors, which include the above two vectors. The four vectors ð1; 2; 5; 3Þ, ð0; 7; 9; 2Þ, (0, 0, 1, 0), and (0, 0, 0, 1) are linearly independent (because they form an echelon matrix), and so they form a basis of R4 , which is an extension of the basis of W. 4.31. Let W be the subspace of R5 spanned by u1 ¼ ð1; 2; 1; 3; 4Þ, u2 ¼ ð2; 4; 2; 6; 8Þ, u3 ¼ ð1; 3; 2; 2; 6Þ, u4 ¼ ð1; 4; 5; 1; 8Þ, u5 ¼ ð2; 7; 3; 3; 9Þ. Find a subset of the vectors that form a basis of W. Here we use Algorithm 4.2, the casting-out algorithm. Form the matrix M whose columns (not rows) are the given vectors, and reduce it to echelon form: M ¼ 1 2 1 1 2 2 4 3 4 7 1 2 2 5 3 3 6 2 1 3 4 8 6 8 9 2 6 6 6 6 4 3 7 7 7 7 5 1 2 1 1 2 0 0 1 2 3 0 0 3 6 5 0 0 1 2 3 0 0 2 4 1 2 6 6 6 6 4 3 7 7 7 7 5 1 2 1 1 2 0 0 1 2 3 0 0 0 0 4 0 0 0 0 0 0 0 0 0 0 2 6 6 6 6 4 3 7 7 7 7 5 The pivot positions are in columns C1, C3, C5. Hence, the corresponding vectors u1, u3, u5 form a basis of W, and dim W ¼ 3. CHAPTER 4 Vector Spaces 141
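Problem 4.31's casting-out algorithm is mechanical enough to automate. In the sketch below (SymPy; the minus signs in u1 and u2, lost in this reproduction, are restored from the worked elimination) the pivot columns reported by rref identify u1, u3, u5 as a basis of W.

```python
from sympy import Matrix

# Casting-out algorithm: place the given vectors as COLUMNS, row reduce,
# and keep the vectors that sit over pivot columns.
vectors = [
    [1, 2, -1, 3, 4],   # u1
    [2, 4, -2, 6, 8],   # u2 = 2*u1
    [1, 3,  2, 2, 6],   # u3
    [1, 4,  5, 1, 8],   # u4 = -u1 + 2*u3
    [2, 7,  3, 3, 9],   # u5
]
M = Matrix(vectors).T        # columns are u1, ..., u5
_, pivots = M.rref()
print(pivots)                # (0, 2, 4)  ->  u1, u3, u5 form a basis of W
print("dim W =", len(pivots))
```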
  • 149. 4.32. Let V be the vector space of 2 2 matrices over K. Let W be the subspace of symmetric matrices. Show that dim W ¼ 3, by finding a basis of W. Recall that a matrix A ¼ ½aij is symmetric if AT ¼ A, or, equivalently, each aij ¼ aji. Thus, A ¼ a b b d denotes an arbitrary 2 2 symmetric matrix. Setting (i) a ¼ 1, b ¼ 0, d ¼ 0; (ii) a ¼ 0, b ¼ 1, d ¼ 0; (iii) a ¼ 0, b ¼ 0, d ¼ 1, we obtain the respective matrices: E1 ¼ 1 0 0 0 ; E2 ¼ 0 1 1 0 ; E3 ¼ 0 0 0 1 We claim that S ¼ fE1; E2; E3g is a basis of W; that is, (a) S spans W and (b) S is linearly independent. (a) The above matrix A ¼ a b b d ¼ aE1 þ bE2 þ dE3. Thus, S spans W. (b) Suppose xE1 þ yE2 þ zE3 ¼ 0, where x, y, z are unknown scalars. That is, suppose x 1 0 0 0 þ y 0 1 1 0 þ z 0 0 0 1 ¼ 0 0 0 0 or x y y z ¼ 0 0 0 0 Setting corresponding entries equal to each other yields x ¼ 0, y ¼ 0, z ¼ 0. Thus, S is linearly independent. Therefore, S is a basis of W, as claimed. Theorems on Linear Dependence, Basis, and Dimension 4.33. Prove Lemma 4.10: Suppose two or more nonzero vectors v1; v2; . . . ; vm are linearly dependent. Then one of them is a linear combination of the preceding vectors. Because the vi are linearly dependent, there exist scalars a1; . . . ; am, not all 0, such that a1v1 þ þ amvm ¼ 0. Let k be the largest integer such that ak 6¼ 0. Then a1v1 þ þ akvk þ 0vkþ1 þ þ 0vm ¼ 0 or a1v1 þ þ akvk ¼ 0 Suppose k ¼ 1; then a1v1 ¼ 0, a1 6¼ 0, and so v1 ¼ 0. But the vi are nonzero vectors. Hence, k 1 and vk ¼ a1 k a1v1 a1 k ak1vk1 That is, vk is a linear combination of the preceding vectors. 4.34. Suppose S ¼ fv1; v2; . . . ; vmg spans a vector space V. (a) If w 2 V, then fw; v1; . . . ; vmg is linearly dependent and spans V. (b) If vi is a linear combination of v1; . . . ; vi1, then S without vi spans V. (a) The vector w is a linear combination of the vi, because fvig spans V. Accordingly, fw; v1; . . . ; vmg is linearly dependent. Clearly, w with the vi span V, as the vi by themselves span V; that is, fw; v1; . . . ; vmg spans V. (b) Suppose vi ¼ k1v1 þ þ ki1vi1. Let u 2 V. Because fvig spans V, u is a linear combination of the vj’s, say u ¼ a1v1 þ þ amvm: Substituting for vi, we obtain u ¼ a1v1 þ þ ai1vi1 þ aiðk1v1 þ þ ki1vi1Þ þ aiþ1viþ1 þ þ amvm ¼ ða1 þ aik1Þv1 þ þ ðai1 þ aiki1Þvi1 þ aiþ1viþ1 þ þ amvm Thus, fv1; . . . ; vi1; viþ1; . . . ; vmg spans V. In other words, we can delete vi from the spanning set and still retain a spanning set. 4.35. Prove Lemma 4.13: Suppose fv1; v2; . . . ; vng spans V, and suppose fw1; w2; . . . ; wmg is linearly independent. Then m n, and V is spanned by a set of the form fw1; w2; . . . ; wm; vi1 ; vi2 ; . . . ; vinm g Thus, any n þ 1 or more vectors in V are linearly dependent. 142 CHAPTER 4 Vector Spaces
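Problem 4.32 can also be checked by flattening each symmetric matrix into the vector of its entries read row by row, the same device used for coordinates in M2,2 later in the chapter. A minimal SymPy sketch, offered only as an illustration:

```python
from sympy import Matrix, symbols

# E1, E2, E3 of Problem 4.32, written as coordinate vectors (a, b, c, d).
E1 = Matrix([1, 0, 0, 0])
E2 = Matrix([0, 1, 1, 0])
E3 = Matrix([0, 0, 0, 1])

print(Matrix.hstack(E1, E2, E3).rank())   # 3 -> E1, E2, E3 are independent, dim W = 3

# An arbitrary symmetric matrix [[a, b], [b, d]] flattens to (a, b, b, d).
a, b, d = symbols('a b d')
A = Matrix([a, b, b, d])
print(A - (a*E1 + b*E2 + d*E3))           # the zero vector -> {E1, E2, E3} spans W
```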
  • 150. It suffices to prove the lemma in the case that the vi are all not 0. (Prove!) Because fvig spans V, we have by Problem 4.34 that fw1; v1; . . . ; vng ð1Þ is linearly dependent and also spans V. By Lemma 4.10, one of the vectors in (1) is a linear combination of the preceding vectors. This vector cannot be w1, so it must be one of the v’s, say vj: Thus by Problem 4.34, we can delete vj from the spanning set (1) and obtain the spanning set fw1; v1; . . . ; vj1; vjþ1; . . . ; vng ð2Þ Now we repeat the argument with the vector w2. That is, because (2) spans V, the set fw1; w2; v1; . . . ; vj1; vjþ1; . . . ; vng ð3Þ is linearly dependent and also spans V. Again by Lemma 4.10, one of the vectors in (3) is a linear combination of the preceding vectors. We emphasize that this vector cannot be w1 or w2, because fw1; . . . ; wmg is independent; hence, it must be one of the v’s, say vk. Thus, by Problem 4.34, we can delete vk from the spanning set (3) and obtain the spanning set fw1; w2; v1; . . . ; vj1; vjþ1; . . . ; vk1; vkþ1; . . . ; vng We repeat the argument with w3, and so forth. At each step, we are able to add one of the w’s and delete one of the v’s in the spanning set. If m n, then we finally obtain a spanning set of the required form: fw1; . . . ; wm; vi1 ; . . . ; vinm g Finally, we show that m n is not possible. Otherwise, after n of the above steps, we obtain the spanning set fw1; . . . ; wng. This implies that wnþ1 is a linear combination of w1; . . . ; wn, which contradicts the hypothesis that fwig is linearly independent. 4.36. Prove Theorem 4.12: Every basis of a vector space V has the same number of elements. Suppose fu1; u2; . . . ; ung is a basis of V, and suppose fv1; v2; . . .g is another basis of V. Because fuig spans V, the basis fv1; v2; . . .g must contain n or less vectors, or else it is linearly dependent by Problem 4.35—Lemma 4.13. On the other hand, if the basis fv1; v2; . . .g contains less than n elements, then fu1; u2; . . . ; ung is linearly dependent by Problem 4.35. Thus, the basis fv1; v2; . . .g contains exactly n vectors, and so the theorem is true. 4.37. Prove Theorem 4.14: Let V be a vector space of finite dimension n. Then (i) Any n þ 1 or more vectors must be linearly dependent. (ii) Any linearly independent set S ¼ fu1; u2; . . . ung with n elements is a basis of V. (iii) Any spanning set T ¼ fv1; v2; . . . ; vng of V with n elements is a basis of V. Suppose B ¼ fw1; w2; . . . ; wng is a basis of V. (i) Because B spans V, any n þ 1 or more vectors are linearly dependent by Lemma 4.13. (ii) By Lemma 4.13, elements from B can be adjoined to S to form a spanning set of V with n elements. Because S already has n elements, S itself is a spanning set of V. Thus, S is a basis of V. (iii) Suppose T is linearly dependent. Then some vi is a linear combination of the preceding vectors. By Problem 4.34, V is spanned by the vectors in T without vi and there are n 1 of them. By Lemma 4.13, the independent set B cannot have more than n 1 elements. This contradicts the fact that B has n elements. Thus, T is linearly independent, and hence T is a basis of V. 4.38. Prove Theorem 4.15: Suppose S spans a vector space V. Then (i) Any maximum number of linearly independent vectors in S form a basis of V. (ii) Suppose one deletes from S every vector that is a linear combination of preceding vectors in S. Then the remaining vectors form a basis of V. (i) Suppose fv1; . . . ; vmg is a maximum linearly independent subset of S, and suppose w 2 S. Accord- ingly, fv1; . . . 
; vm; wg is linearly dependent. No vk can be a linear combination of preceding vectors.
  • 151. Hence, w is a linear combination of the vi. Thus, w 2 spanðviÞ, and hence S spanðviÞ. This leads to V ¼ spanðSÞ spanðviÞ V Thus, fvig spans V, and, as it is linearly independent, it is a basis of V. (ii) The remaining vectors form a maximum linearly independent subset of S; hence, by (i), it is a basis of V. 4.39. Prove Theorem 4.16: Let V be a vector space of finite dimension and let S ¼ fu1; u2; . . . ; urg be a set of linearly independent vectors in V. Then S is part of a basis of V; that is, S may be extended to a basis of V. Suppose B ¼ fw1; w2; . . . ; wng is a basis of V. Then B spans V, and hence V is spanned by S [ B ¼ fu1; u2; . . . ; ur; w1; w2; . . . ; wng By Theorem 4.15, we can delete from S [ B each vector that is a linear combination of preceding vectors to obtain a basis B0 for V. Because S is linearly independent, no uk is a linear combination of preceding vectors. Thus, B0 contains every vector in S, and S is part of the basis B0 for V. 4.40. Prove Theorem 4.17: Let W be a subspace of an n-dimensional vector space V. Then dim W n. In particular, if dim W ¼ n, then W ¼ V. Because V is of dimension n, any n þ 1 or more vectors are linearly dependent. Furthermore, because a basis of W consists of linearly independent vectors, it cannot contain more than n elements. Accordingly, dim W n. In particular, if fw1; . . . ; wng is a basis of W, then, because it is an independent set with n elements, it is also a basis of V. Thus, W ¼ V when dim W ¼ n. Rank of a Matrix, Row and Column Spaces 4.41. Find the rank and basis of the row space of each of the following matrices: (a) A ¼ 1 2 0 1 2 6 3 3 3 10 6 5 2 4 3 5, (b) B ¼ 1 3 1 2 3 1 4 3 1 4 2 3 4 7 3 3 8 1 7 8 2 6 6 4 3 7 7 5. (a) Row reduce A to echelon form: A 1 2 0 1 0 2 3 1 0 4 6 2 2 4 3 5 1 2 0 1 0 2 3 1 0 0 0 0 2 4 3 5 The two nonzero rows ð1; 2; 0; 1Þ and ð0; 2; 3; 1Þ of the echelon form of A form a basis for rowsp(A). In particular, rankðAÞ ¼ 2. (b) Row reduce B to echelon form: B 1 3 1 2 3 0 1 2 1 1 0 3 6 3 3 0 1 2 1 1 2 6 6 4 3 7 7 5 1 3 1 2 3 0 1 2 1 1 0 0 0 0 0 0 0 0 0 0 2 6 6 4 3 7 7 5 The two nonzero rows ð1; 3; 1; 2; 3Þ and ð0; 1; 2; 1; 1Þ of the echelon form of B form a basis for rowsp(B). In particular, rankðBÞ ¼ 2. 4.42. Show that U ¼ W, where U and W are the following subspaces of R3 : U ¼ spanðu1; u2; u3Þ ¼ spanð1; 1; 1Þ; ð2; 3; 1Þ; ð3; 1; 5Þg W ¼ spanðw1; w2; w3Þ ¼ spanð1; 1; 3Þ; ð3; 2; 8Þ; ð2; 1; 3Þg 144 CHAPTER 4 Vector Spaces
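Problem 4.41(a) is a one-line computation by machine. The sketch below (SymPy, not part of the text) restores the minus signs in A from the worked echelon form; the row-space basis it returns may differ from the text's by row operations but spans the same space.

```python
from sympy import Matrix

A = Matrix([[1,  2,  0, -1],
            [2,  6, -3, -3],
            [3, 10, -6, -5]])

print(A.rank())           # 2
for r in A.rowspace():    # a basis of rowsp(A), read off an echelon form of A
    print(r)
```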
  • 152. Form the matrix A whose rows are the ui, and row reduce A to row canonical form: A ¼ 1 1 1 2 3 1 3 1 5 2 4 3 5 1 1 1 0 1 1 0 2 2 2 4 3 5 1 0 2 0 1 1 0 0 0 2 4 3 5 Next form the matrix B whose rows are the wj, and row reduce B to row canonical form: B ¼ 1 1 3 3 2 8 2 1 3 2 4 3 5 1 1 3 0 1 1 0 3 3 2 4 3 5 1 0 2 0 1 1 0 0 0 2 4 3 5 Because A and B have the same row canonical form, the row spaces of A and B are equal, and so U ¼ W. 4.43. Let A ¼ 1 2 1 2 3 1 2 4 3 7 7 4 1 2 2 5 5 6 3 6 6 15 14 15 2 6 6 4 3 7 7 5. (a) Find rankðMkÞ, for k ¼ 1; 2; . . . ; 6, where Mk is the submatrix of A consisting of the first k columns C1; C2; . . . ; Ck of A. (b) Which columns Ckþ1 are linear combinations of preceding columns C1; . . . ; Ck? (c) Find columns of A that form a basis for the column space of A. (d) Express column C4 as a linear combination of the columns in part (c). (a) Row reduce A to echelon form: A 1 2 1 2 3 1 0 0 1 3 1 2 0 0 1 3 2 5 0 0 3 9 5 12 2 6 6 4 3 7 7 5 1 2 1 2 3 1 0 0 1 3 1 2 0 0 0 0 1 3 0 0 0 0 0 0 2 6 6 4 3 7 7 5 Observe that this simultaneously reduces all the matrices Mk to echelon form; for example, the first four columns of the echelon form of A are an echelon form of M4. We know that rankðMkÞ is equal to the number of pivots or, equivalently, the number of nonzero rows in an echelon form of Mk. Thus, rankðM1Þ ¼ rankðM2Þ ¼ 1; rankðM3Þ ¼ rankðM4Þ ¼ 2 rankðM5Þ ¼ rankðM6Þ ¼ 3 (b) The vector equation x1C1 þ x2C2 þ þ xkCk ¼ Ckþ1 yields the system with coefficient matrix Mk and augmented Mkþ1. Thus, Ckþ1 is a linear combination of C1; . . . ; Ck if and only if rankðMkÞ ¼ rankðMkþ1Þ or, equivalently, if Ckþ1 does not contain a pivot. Thus, each of C2, C4, C6 is a linear combination of preceding columns. (c) In the echelon form of A, the pivots are in the first, third, and fifth columns. Thus, columns C1, C3, C5 of A form a basis for the columns space of A. Alternatively, deleting columns C2, C4, C6 from the spanning set of columns (they are linear combinations of other columns), we obtain, again, C1, C3, C5. (d) The echelon matrix tells us that C4 is a linear combination of columns C1 and C3. The augmented matrix M of the vector equation C4 ¼ xC1 þ yC2 consists of the columns C1, C3, C4 of A which, when reduced to echelon form, yields the matrix (omitting zero rows) 1 1 2 0 1 3 or x þ y ¼ 2 y ¼ 3 or x ¼ 1; y ¼ 3 Thus, C4 ¼ C1 þ 3C3 ¼ C1 þ 3C3 þ 0C5. 4.44. Suppose u ¼ ða1; a2; . . . ; anÞ is a linear combination of the rows R1; R2; . . . ; Rm of a matrix B ¼ ½bij, say u ¼ k1R1 þ k2R2 þ þ kmRm: Prove that ai ¼ k1b1i þ k2b2i þ þ kmbmi; i ¼ 1; 2; . . . ; n where b1i; b2i; . . . ; bmi are the entries in the ith column of B. CHAPTER 4 Vector Spaces 145
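Both results just obtained can be checked mechanically: Problem 4.42 by comparing row canonical forms, and Problem 4.43 by reading off pivot columns and then solving for C4. The sketch below restores the minus signs in the 4.42 vectors from the worked reduction; note that the arithmetic in 4.43(d) actually gives x = -1 (the sign is lost in the printed value), so C4 = -C1 + 3C3.

```python
from sympy import Matrix, linsolve, symbols

# Problem 4.42: U = W iff the two spanning sets have the same row canonical form.
A = Matrix([[1,  1, -1], [2,  3, -1], [3, 1, -5]])   # rows span U
B = Matrix([[1, -1, -3], [3, -2, -8], [2, 1, -3]])   # rows span W
print(A.rref()[0] == B.rref()[0])                    # True -> U = W

# Problem 4.43: pivot columns give a basis of the column space ...
M = Matrix([
    [1, 2, 1,  2,  3,  1],
    [2, 4, 3,  7,  7,  4],
    [1, 2, 2,  5,  5,  6],
    [3, 6, 6, 15, 14, 15],
])
print(M.rref()[1])                                   # (0, 2, 4) -> C1, C3, C5

# ... and C4 is expressed in terms of C1 and C3 by solving x*C1 + y*C3 = C4.
x, y = symbols('x y')
print(linsolve((Matrix.hstack(M.col(0), M.col(2)), M.col(3)), x, y))   # {(-1, 3)}
```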
  • 153. We are given that u ¼ k1R1 þ k2R2 þ þ kmRm. Hence, ða1; a2; . . . ; anÞ ¼ k1ðb11; . . . ; b1nÞ þ þ kmðbm1; . . . ; bmnÞ ¼ ðk1b11 þ þ kmbm1; . . . ; k1b1n þ þ kmbmnÞ Setting corresponding components equal to each other, we obtain the desired result. 4.45. Prove Theorem 4.7: Suppose A ¼ ½aij and B ¼ ½bij are row equivalent echelon matrices with respective pivot entries a1j1 ; a2j2 ; . . . ; arjr and b1k1 ; b2k2 ; . . . ; bsks (pictured in Fig. 4-5). Then A and B have the same number of nonzero rows—that is, r ¼ s—and their pivot entries are in the same positions; that is, j1 ¼ k1; j2 ¼ k2; . . . ; jr ¼ kr. Clearly A ¼ 0 if and only if B ¼ 0, and so we need only prove the theorem when r 1 and s 1. We first show that j1 ¼ k1. Suppose j1 k1. Then the j1th column of B is zero. Because the first row R* of A is in the row space of B, we have R* ¼ c1R1 þ c1R2 þ þ cmRm, where the Ri are the rows of B. Because the j1th column of B is zero, we have a1j1 ¼ c10 þ c20 þ þ cm0 ¼ 0 But this contradicts the fact that the pivot entry a1j1 6¼ 0. Hence, j1 k1 and, similarly, k1 j1. Thus j1 ¼ k1. Now let A0 be the submatrix of A obtained by deleting the first row of A, and let B0 be the submatrix of B obtained by deleting the first row of B. We prove that A0 and B0 have the same row space. The theorem will then follow by induction, because A0 and B0 are also echelon matrices. Let R ¼ ða1; a2; ... ; anÞ be any row of A0 and let R1; ... ; Rm be the rows of B. Because R is in the row space of B, there exist scalars d1; ... ; dm such that R ¼ d1R1 þ d2R2 þ þ dmRm. Because A is in echelon form and R is not the first row of A, the j1th entry of R is zero: ai ¼ 0 for i ¼ j1 ¼ k1. Furthermore, because B is in echelon form, all the entries in the k1th column of B are 0 except the first: b1k1 6¼ 0, but b2k1 ¼ 0; ... ; bmk1 ¼ 0. Thus, 0 ¼ ak1 ¼ d1b1k1 þ d20 þ þ dm0 ¼ d1b1k1 Now b1k1 6¼ 0 and so d1 ¼ 0. Thus, R is a linear combination of R2; . . . ; Rm and so is in the row space of B0 . Because R was any row of A0 , the row space of A0 is contained in the row space of B0 . Similarly, the row space of B0 is contained in the row space of A0 . Thus, A0 and B0 have the same row space, and so the theorem is proved. 4.46. Prove Theorem 4.8: Suppose A and B are row canonical matrices. Then A and B have the same row space if and only if they have the same nonzero rows. Obviously, if A and B have the same nonzero rows, then they have the same row space. Thus we only have to prove the converse. Suppose A and B have the same row space, and suppose R 6¼ 0 is the ith row of A. Then there exist scalars c1; . . . ; cs such that R ¼ c1R1 þ c2R2 þ þ csRs ð1Þ where the Ri are the nonzero rows of B. The theorem is proved if we show that R ¼ Ri; that is, that ci ¼ 1 but ck ¼ 0 for k 6¼ i. A ¼ a1j1 a2j2 :::::::::::::::::::::::::::::::::::::: arjr 2 6 6 4 3 7 7 5; b ¼ b1k1 b2k2 :::::::::::::::::::::::::::::::::::::: bsks 2 6 6 4 3 7 7 5 Figure 4-5 146 CHAPTER 4 Vector Spaces
  • 154. Let aij, be the pivot entry in R—that is, the first nonzero entry of R. By (1) and Problem 4.44, aiji ¼ c1b1ji þ c2b2ji þ þ csbsji ð2Þ But, by Problem 4.45, biji is a pivot entry of B, and, as B is row reduced, it is the only nonzero entry in the jth column of B. Thus, from (2), we obtain aiji ¼ cibiji . However, aiji ¼ 1 and biji ¼ 1, because A and B are row reduced; hence, ci ¼ 1. Now suppose k 6¼ i, and bkjk is the pivot entry in Rk. By (1) and Problem 4.44, aijk ¼ c1b1jk þ c2b2jk þ þ csbsjk ð3Þ Because B is row reduced, bkjk is the only nonzero entry in the jth column of B. Hence, by (3), aijk ¼ ckbkjk . Furthermore, by Problem 4.45, akjk is a pivot entry of A, and because A is row reduced, aijk ¼ 0. Thus, ckbkjk ¼ 0, and as bkjk ¼ 1, ck ¼ 0. Accordingly R ¼ Ri; and the theorem is proved. 4.47. Prove Corollary 4.9: Every matrix A is row equivalent to a unique matrix in row canonical form. Suppose A is row equivalent to matrices A1 and A2, where A1 and A2 are in row canonical form. Then rowspðAÞ ¼ rowspðA1Þ and rowspðAÞ ¼ rowspðA2Þ. Hence, rowspðA1Þ ¼ rowspðA2Þ. Because A1 and A2 are in row canonical form, A1 ¼ A2 by Theorem 4.8. Thus, the corollary is proved. 4.48. Suppose RB and AB are defined, where R is a row vector and A and B are matrices. Prove (a) RB is a linear combination of the rows of B. (b) The row space of AB is contained in the row space of B. (c) The column space of AB is contained in the column space of A. (d) If C is a column vector and AC is defined, then AC is a linear combination of the columns of A: (e) rankðABÞ rankðBÞ and rankðABÞ rankðAÞ. (a) Suppose R ¼ ða1; a2; . . . ; amÞ and B ¼ ½bij. Let B1; . . . ; Bm denote the rows of B and B1 ; . . . ; Bn its columns. Then RB ¼ ðRB1 ; RB2 ; . . . ; RBn Þ ¼ ða1b11 þ a2b21 þ þ ambm1; . . . ; a1b1n þ a2b2n þ þ ambmnÞ ¼ a1ðb11; b12; . . . ; b1nÞ þ a2ðb21; b22; . . . ; b2nÞ þ þ amðbm1; bm2; . . . ; bmnÞ ¼ a1B1 þ a2B2 þ þ amBm Thus, RB is a linear combination of the rows of B, as claimed. (b) The rows of AB are RiB, where Ri is the ith row of A. Thus, by part (a), each row of AB is in the row space of B. Thus, rowspðABÞ rowspðBÞ, as claimed. (c) Using part (b), we have colspðABÞ ¼ rowspðABÞT ¼ rowspðBT AT Þ rowspðAT Þ ¼ colspðAÞ: (d) Follows from ðcÞ where C replaces B: (e) The row space of AB is contained in the row space of B; hence, rankðABÞ rankðBÞ. Furthermore, the column space of AB is contained in the column space of A; hence, rankðABÞ rankðAÞ. 4.49. Let A be an n-square matrix. Show that A is invertible if and only if rankðAÞ ¼ n. Note that the rows of the n-square identity matrix In are linearly independent, because In is in echelon form; hence, rankðInÞ ¼ n. Now if A is invertible, then A is row equivalent to In; hence, rankðAÞ ¼ n. But if A is not invertible, then A is row equivalent to a matrix with a zero row; hence, rankðAÞ n; that is, A is invertible if and only if rankðAÞ ¼ n. CHAPTER 4 Vector Spaces 147
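The inequalities of Problem 4.48(e) and the criterion of Problem 4.49 are easy to spot-check numerically; the matrices below are hypothetical, chosen only for illustration.

```python
from sympy import Matrix

A = Matrix([[1, 2, 0], [0, 1, 1]])        # rank 2
B = Matrix([[1, 1], [1, 1], [2, 2]])      # rank 1

print((A * B).rank(), "<=", min(A.rank(), B.rank()))   # rank(AB) <= rank(A), rank(B)

C = Matrix([[1, 2], [3, 4]])
print(C.rank() == C.shape[0], C.det() != 0)   # a square matrix: rank n  <=>  invertible
```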
  • 155. Applications to Linear Equations 4.50. Find the dimension and a basis of the solution space W of each homogeneous system: x þ 2y þ 2z s þ 3t ¼ 0 x þ 2y þ 3z þ s þ t ¼ 0 3x þ 6y þ 8z þ s þ 5t ¼ 0 (a) x þ 2y þ z 2t ¼ 0 2x þ 4y þ 4z 3t ¼ 0 3x þ 6y þ 7z 4t ¼ 0 (b) x þ y þ 2z ¼ 0 2x þ 3y þ 3z ¼ 0 x þ 3y þ 5z ¼ 0 (c) (a) Reduce the system to echelon form: x þ 2y þ 2z s þ 3t ¼ 0 z þ 2s 2t ¼ 0 2z þ 4s 4t ¼ 0 or x þ 2y þ 2z s þ 3t ¼ 0 z þ 2s 2t ¼ 0 The system in echelon form has two (nonzero) equations in five unknowns. Hence, the system has 5 2 ¼ 3 free variables, which are y, s, t. Thus, dim W ¼ 3. We obtain a basis for W: ð1Þ Set y ¼ 1; s ¼ 0; t ¼ 0 to obtain the solution v1 ¼ ð2; 1; 0; 0; 0Þ: ð2Þ Set y ¼ 0; s ¼ 1; t ¼ 0 to obtain the solution v2 ¼ ð5; 0; 2; 1; 0Þ: ð3Þ Set y ¼ 0; s ¼ 0; t ¼ 1 to obtain the solution v3 ¼ ð7; 0; 2; 0; 1Þ: The set fv1; v2; v3g is a basis of the solution space W. (b) (Here we use the matrix format of our homogeneous system.) Reduce the coefficient matrix A to echelon form: A ¼ 1 2 1 2 2 4 4 3 3 6 7 4 2 4 3 5 1 2 1 2 0 0 2 1 0 0 4 2 2 4 3 5 1 2 1 2 0 0 2 1 0 0 0 0 2 4 3 5 This corresponds to the system x þ 2y þ 2z 2t ¼ 0 2z þ t ¼ 0 The free variables are y and t, and dim W ¼ 2. (i) Set y ¼ 1, z ¼ 0 to obtain the solution u1 ¼ ð2; 1; 0; 0Þ. (ii) Set y ¼ 0, z ¼ 2 to obtain the solution u2 ¼ ð6; 0; 1; 2Þ. Then fu1; u2g is a basis of W. (c) Reduce the coefficient matrix A to echelon form: A ¼ 1 1 2 2 3 3 1 3 5 2 4 3 5 1 1 2 0 1 1 0 2 3 2 4 3 5 1 1 2 0 1 1 0 0 5 2 4 3 5 This corresponds to a triangular system with no free variables. Thus, 0 is the only solution; that is, W ¼ f0g. Hence, dim W ¼ 0. 4.51. Find a homogeneous system whose solution set W is spanned by fu1; u2; u3g ¼ fð1; 2; 0; 3Þ; ð1; 1; 1; 4Þ; ð1; 0; 2; 5Þg Let v ¼ ðx; y; z; tÞ. Then v 2 W if and only if v is a linear combination of the vectors u1, u2, u3 that span W. Thus, form the matrix M whose first columns are u1, u2, u3 and whose last column is v, and then row reduce M to echelon form. This yields M ¼ 1 1 1 x 2 1 0 y 0 1 2 z 3 4 5 t 2 6 6 4 3 7 7 5 1 1 1 x 0 1 2 2x þ y 0 1 2 z 0 1 2 3x þ t 2 6 6 4 3 7 7 5 1 1 1 x 0 1 2 2x þ y 0 0 0 2x þ y þ z 0 0 0 5x y þ t 2 6 6 4 3 7 7 5 148 CHAPTER 4 Vector Spaces
  • 156. Then v is a linear combination of u1, u2, u3 if rankðMÞ ¼ rankðAÞ, where A is the submatrix without column v. Thus, set the last two entries in the fourth column on the right equal to zero to obtain the required homogeneous system: 2x þ y þ z ¼ 0 5x þ y t ¼ 0 4.52. Let xi1 ; xi2 ; . . . ; xik be the free variables of a homogeneous system of linear equations with n unknowns. Let vj be the solution for which xij ¼ 1, and all other free variables equal 0. Show that the solutions v1; v2; . . . ; vk are linearly independent. Let A be the matrix whose rows are the vi. We interchange column 1 and column i1, then column 2 and column i2; . . . ; then column k and column ik, and we obtain the k n matrix B ¼ ½I; C ¼ 1 0 0 . . . 0 0 c1;kþ1 . . . c1n 0 1 0 . . . 0 0 c2;kþ1 . . . c2n ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: 0 0 0 . . . 0 1 ck;kþ1 . . . ckn 2 6 6 4 3 7 7 5 The above matrix B is in echelon form, and so its rows are independent; hence, rankðBÞ ¼ k. Because A and B are column equivalent, they have the same rank—rankðAÞ ¼ k. But A has k rows; hence, these rows (i.e., the vi) are linearly independent, as claimed. Sums, Direct Sums, Intersections 4.53. Let U and W be subspaces of a vector space V. Show that (a) U þ V is a subspace of V. (b) U and W are contained in U þ W. (c) U þ W is the smallest subspace containing U and W; that is, U þ W ¼ spanðU; WÞ. (d) W þ W ¼ W. (a) Because U and W are subspaces, 0 2 U and 0 2 W. Hence, 0 ¼ 0 þ 0 belongs to U þ W. Now suppose v; v0 2 U þ W. Then v ¼ u þ w and v0 ¼ u0 þ v0 , where u; u0 2 U and w; w0 2 W. Then av þ bv0 ¼ ðau þ bu0 Þ þ ðaw þ bw0 Þ 2 U þ W Thus, U þ W is a subspace of V. (b) Let u 2 U. Because W is a subspace, 0 2 W. Hence, u ¼ u þ 0 belongs to U þ W. Thus, U U þ W. Similarly, W U þ W. (c) Because U þ W is a subspace of V containing U and W, it must also contain the linear span of U and W. That is, spanðU; WÞ U þ W. On the other hand, if v 2 U þ W, then v ¼ u þ w ¼ 1u þ 1w, where u 2 U and w 2 W. Thus, v is a linear combination of elements in U [ W, and so v 2 spanðU; WÞ. Hence, U þ W spanðU; WÞ. The two inclusion relations give the desired result. (d) Because W is a subspace of V, we have that W is closed under vector addition; hence, W þ W W. By part (a), W W þ W. Hence, W þ W ¼ W. 4.54. Consider the following subspaces of R5 : U ¼ spanðu1; u2; u3Þ ¼ spanfð1; 3; 2; 2; 3Þ; ð1; 4; 3; 4; 2Þ; ð2; 3; 1; 2; 9Þg W ¼ spanðw1; w2; w3Þ ¼ spanfð1; 3; 0; 2; 1Þ; ð1; 5; 6; 6; 3Þ; ð2; 5; 3; 2; 1Þg Find a basis and the dimension of (a) U þ W, (b) U W. CHAPTER 4 Vector Spaces 149
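The two constructions just carried out both reduce to a nullspace computation: Problem 4.50 finds a basis of the solution space of a given system, while Problem 4.51 goes the other way, since a coefficient row (a, b, c, d) is admissible exactly when it is orthogonal to every spanning vector u_i. A SymPy sketch, with the minus signs in the u_i of 4.51 restored from the final answer:

```python
from sympy import Matrix, symbols

# Problem 4.50(a): one basis vector of the solution space per free variable.
A = Matrix([
    [1, 2, 2, -1, 3],
    [1, 2, 3,  1, 1],
    [3, 6, 8,  1, 5],
])
for v in A.nullspace():
    print(v.T)           # (-2,1,0,0,0), (5,0,-2,1,0), (-7,0,2,0,1); dim W = 3

# Problem 4.51: a homogeneous system whose solution space is span{u1, u2, u3}.
U = Matrix([
    [1, -2,  0, 3],
    [1, -1, -1, 4],
    [1,  0, -2, 5],
])
x, y, z, t = symbols('x y z t')
for a in U.nullspace():  # each nullspace vector yields one equation
    print(sum(c * v for c, v in zip(a, (x, y, z, t))), "= 0")
# 2*x + y + z = 0   and   -5*x - y + t = 0   (equivalently 5x + y - t = 0)
```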
  • 157. (a) U þ W is the space spanned by all six vectors. Hence, form the matrix whose rows are the given six vectors, and then row reduce to echelon form: 1 3 2 2 3 1 4 3 4 2 2 3 1 2 9 1 3 0 2 1 1 5 6 6 3 2 5 3 2 1 2 6 6 6 6 6 6 4 3 7 7 7 7 7 7 5 1 3 2 2 3 0 1 1 2 1 0 3 3 6 3 0 0 2 0 2 0 2 4 4 0 0 1 7 2 5 2 6 6 6 6 6 6 4 3 7 7 7 7 7 7 5 1 3 2 2 3 0 1 1 2 1 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 6 6 6 6 6 6 4 3 7 7 7 7 7 7 5 The following three nonzero rows of the echelon matrix form a basis of U W: ð1; 3; 2; 2; 2; 3Þ; ð0; 1; 1; 2; 1Þ; ð0; 0; 1; 0; 1Þ Thus, dimðU þ WÞ ¼ 3. (b) Let v ¼ ðx; y; z; s; tÞ denote an arbitrary element in R5 . First find, say as in Problem 4.49, homogeneous systems whose solution sets are U and W, respectively. Let M be the matrix whose columns are the ui and v, and reduce M to echelon form: M ¼ 1 1 2 x 3 4 3 y 2 3 1 z 2 4 2 s 3 2 9 t 2 6 6 6 6 4 3 7 7 7 7 5 1 1 2 x 0 1 3 3x þ y 0 0 0 x þ y þ z 0 0 0 4x 2y þ s 0 0 0 6x þ y þ t 2 6 6 6 6 4 3 7 7 7 7 5 Set the last three entries in the last column equal to zero to obtain the following homogeneous system whose solution set is U: x þ y þ z ¼ 0; 4x 2y þ s ¼ 0; 6x þ y þ t ¼ 0 Now let M0 be the matrix whose columns are the wi and v, and reduce M0 to echelon form: M0 ¼ 1 1 2 x 3 5 5 y 0 6 3 z 2 6 2 s 1 3 1 t 2 6 6 6 6 4 3 7 7 7 7 5 1 1 2 x 0 2 1 3x þ y 0 0 0 9x þ 3y þ z 0 0 0 4x 2y þ s 0 0 0 2x y þ t 2 6 6 6 6 4 3 7 7 7 7 5 Again set the last three entries in the last column equal to zero to obtain the following homogeneous system whose solution set is W: 9 þ 3 þ z ¼ 0; 4x 2y þ s ¼ 0; 2x y þ t ¼ 0 Combine both of the above systems to obtain a homogeneous system, whose solution space is U W, and reduce the system to echelon form, yielding x þ y þ z ¼ 0 2y þ 4z þ s ¼ 0 8z þ 5s þ 2t ¼ 0 s 2t ¼ 0 There is one free variable, which is t; hence, dimðU WÞ ¼ 1. Setting t ¼ 2, we obtain the solution u ¼ ð1; 4; 3; 4; 2Þ, which forms our required basis of U W. 4.55. Suppose U and W are distinct four-dimensional subspaces of a vector space V, where dim V ¼ 6. Find the possible dimensions of U W. Because U and W are distinct, U þ W properly contains U and W; consequently, dimðU þ WÞ 4. But dimðU þ WÞ cannot be greater than 6, as dim V ¼ 6. Hence, we have two possibilities: (a) dimðU þ WÞ ¼ 5 or (b) dimðU þ WÞ ¼ 6. By Theorem 4.20, dimðU WÞ ¼ dim U þ dim W dimðU þ WÞ ¼ 8 dimðU þ WÞ Thus (a) dimðU WÞ ¼ 3 or (b) dimðU WÞ ¼ 2. 150 CHAPTER 4 Vector Spaces
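The dimension bookkeeping in Problems 4.54 and 4.55 rests on Theorem 4.20: dim(U ∩ W) = dim U + dim W - dim(U + W). The sketch below illustrates the computation on small hypothetical subspaces of R^4 (not the vectors of Problem 4.54, whose signs are uncertain in this reproduction).

```python
from sympy import Matrix

def dim_span(rows):
    """Dimension of the span of the given row vectors."""
    return Matrix(rows).rank()

# Hypothetical subspaces of R^4, used only to illustrate Theorem 4.20.
U = [[1, 0, 1, 0], [0, 1, 1, 0], [0, 0, 0, 1]]
W = [[1, 1, 2, 0], [1, 0, 0, 1]]

dim_U, dim_W = dim_span(U), dim_span(W)
dim_sum = dim_span(U + W)              # U + W is spanned by both lists together
dim_int = dim_U + dim_W - dim_sum      # Theorem 4.20
print(dim_U, dim_W, dim_sum, dim_int)  # 3 2 4 1
```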
  • 158. 4.56. Let U and W be the following subspaces of R3 : U ¼ fða; b; cÞ : a ¼ b ¼ cg and W ¼ fð0; b; cÞg (Note that W is the yz-plane.) Show that R3 ¼ U W. First we show that U W ¼ f0g. Suppose v ¼ ða; b; cÞ 2 U W. Then a ¼ b ¼ c and a ¼ 0. Hence, a ¼ 0, b ¼ 0, c ¼ 0. Thus, v ¼ 0 ¼ ð0; 0; 0Þ. Next we show that R3 ¼ U þ W. For, if v ¼ ða; b; cÞ 2 R3 , then v ¼ ða; a; aÞ þ ð0; b a; c aÞ where ða; a; aÞ 2 U and ð0; b a; c aÞ 2 W Both conditions U W ¼ f0g and U þ W ¼ R3 imply that R3 ¼ U W. 4.57. Suppose that U and W are subspaces of a vector space V and that S ¼ fuig spans U and S0 ¼ fwjg spans W. Show that S [ S0 spans U þ W. (Accordingly, by induction, if Si spans Wi, for i ¼ 1; 2; . . . ; n, then S1 [ . . . [ Sn spans W1 þ þ Wn.) Let v 2 U þ W. Then v ¼ u þ w, where u 2 U and w 2 W. Because S spans U, u is a linear combination of ui, and as S0 spans W, w is a linear combination of wj; say u ¼ a1ui1 þ a2ui2 þ þ aruir and v ¼ b1wj1 þ b2wj2 þ þ bswjs where ai; bj 2 K. Then v ¼ u þ w ¼ a1ui1 þ a2ui2 þ þ aruir þ b1wj1 þ b2wj2 þ þ bswjs Accordingly, S [ S0 ¼ fui; wjg spans U þ W. 4.58. Prove Theorem 4.20: Suppose U and V are finite-dimensional subspaces of a vector space V. Then U þ W has finite dimension and dimðU þ WÞ ¼ dim U þ dim W dimðU WÞ Observe that U W is a subspace of both U and W. Suppose dim U ¼ m, dim W ¼ n, dimðU WÞ ¼ r. Suppose fv1; . . . ; vrg is a basis of U W. By Theorem 4.16, we can extend fvig to a basis of U and to a basis of W; say fv1; . . . ; vr; u1; . . . ; umrg and fv1; . . . ; vr; w1; . . . ; wnrg are bases of U and W, respectively. Let B ¼ fv1; . . . ; vr; u1; . . . ; umr; w1; . . . ; wnrg Note that B has exactly m þ n r elements. Thus, the theorem is proved if we can show that B is a basis of U þ W. Because fvi; ujg spans U and fvi; wkg spans W, the union B ¼ fvi; uj; wkg spans U þ W. Thus, it suffices to show that B is independent. Suppose a1v1 þ þ arvr þ b1u1 þ þ bmrumr þ c1w1 þ þ cnrwnr ¼ 0 ð1Þ where ai, bj, ck are scalars. Let v ¼ a1v1 þ þ arvr þ b1u1 þ þ bmrumr ð2Þ By (1), we also have v ¼ c1w1 cnrwnr ð3Þ Because fvi; ujg U, v 2 U by (2); and as fwkg W, v 2 W by (3). Accordingly, v 2 U W. Now fvig is a basis of U W, and so there exist scalars d1; . . . ; dr for which v ¼ d1v1 þ þ drvr. Thus, by (3), we have d1v1 þ þ drvr þ c1w1 þ þ cnrwnr ¼ 0 But fvi; wkg is a basis of W, and so is independent. Hence, the above equation forces c1 ¼ 0; . . . ; cnr ¼ 0. Substituting this into (1), we obtain a1v1 þ þ arvr þ b1u1 þ þ bmrumr ¼ 0 But fvi; ujg is a basis of U, and so is independent. Hence, the above equation forces a1 ¼ 0; . . . ; ar ¼ 0; b1 ¼ 0; . . . ; bmr ¼ 0. Because (1) implies that the ai, bj, ck are all 0, B ¼ fvi; uj; wkg is independent, and the theorem is proved. CHAPTER 4 Vector Spaces 151
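Problem 4.56's claim can also be verified numerically: the sum exhausts R^3 and, by Theorem 4.20, dim(U ∩ W) = dim U + dim W - dim(U + W) = 0, so the sum is direct. A minimal sketch:

```python
from sympy import Matrix

U = Matrix([[1, 1, 1]])          # spans U = {(a, a, a)}
W = Matrix([[0, 1, 0],
            [0, 0, 1]])          # spans W, the yz-plane

dim_U, dim_W = U.rank(), W.rank()
dim_sum = Matrix.vstack(U, W).rank()
print(dim_sum == 3)                       # U + W = R^3
print(dim_U + dim_W - dim_sum == 0)       # dim(U ∩ W) = 0, so R^3 = U (+) W
```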
  • 159. 4.59. Prove Theorem 4.21: V ¼ U W if and only if (i) V ¼ U þ W, (ii) U W ¼ f0g. Suppose V ¼ U W. Then any v 2 V can be uniquely written in the form v ¼ u þ w, where u 2 U and w 2 W. Thus, in particular, V ¼ U þ W. Now suppose v 2 U W. Then ð1Þ v ¼ v þ 0; where v 2 U; 0 2 W; ð2Þ v ¼ 0 þ v; where 0 2 U; v 2 W: Thus, v ¼ 0 þ 0 ¼ 0 and U W ¼ f0g. On the other hand, suppose V ¼ U þ W and U W ¼ f0g. Let v 2 V. Because V ¼ U þ W, there exist u 2 U and w 2 W such that v ¼ u þ w. We need to show that such a sum is unique. Suppose also that v ¼ u0 þ w0 , where u0 2 U and w0 2 W. Then u þ w ¼ u0 þ w0 ; and so u u0 ¼ w0 w But u u0 2 U and w0 w 2 W; hence, by U W ¼ f0g, u u0 ¼ 0; w0 w ¼ 0; and so u ¼ u0 ; w ¼ w0 Thus, such a sum for v 2 V is unique, and V ¼ U W. 4.60. Prove Theorem 4.22 (for two factors): Suppose V ¼ U W. Also, suppose S ¼ fu1; . . . ; umg and S0 ¼ fw1; . . . ; wng are linearly independent subsets of U and W, respectively. Then (a) The union S [ S0 is linearly independent in V. (b) If S and S0 are bases of U and W, respectively, then S [ S0 is a basis of V. (c) dim V ¼ dim U þ dim W. (a) Suppose a1u1 þ þ amum þ b1w1 þ þ bnwn ¼ 0, where ai, bj are scalars. Then ða1u1 þ þ amumÞ þ ðb1w1 þ þ bnwnÞ ¼ 0 ¼ 0 þ 0 where 0; a1u1 þ þ amum 2 U and 0; b1w1 þ þ bnwn 2 W. Because such a sum for 0 is unique, this leads to a1u1 þ þ amum ¼ 0 and b1w1 þ þ bnwn ¼ 0 Because S1 is linearly independent, each ai ¼ 0, and because S2 is linearly independent, each bj ¼ 0. Thus, S ¼ S1 [ S2 is linearly independent. (b) By part (a), S ¼ S1 [ S2 is linearly independent, and, by Problem 4.55, S ¼ S1 [ S2 spans V ¼ U þ W. Thus, S ¼ S1 [ S2 is a basis of V. (c) This follows directly from part (b). Coordinates 4.61. Relative to the basis S ¼ fu1; u2g ¼ fð1; 1Þ; ð2; 3Þg of R2 , find the coordinate vector of v, where (a) v ¼ ð4; 3Þ, (b) v ¼ ða; bÞ. In each case, set v ¼ xu1 þ yu2 ¼ xð1; 1Þ þ yð2; 3Þ ¼ ðx þ 2y; x þ 3yÞ and then solve for x and y. (a) We have ð4; 3Þ ¼ ðx þ 2y; x þ 3yÞ or x þ 2y ¼ 4 x þ 3y ¼ 3 The solution is x ¼ 18, y ¼ 7. Hence, ½v ¼ ½18; 7. (b) We have ða; bÞ ¼ ðx þ 2y; x þ 3yÞ or x þ 2y ¼ a x þ 3y ¼ b The solution is x ¼ 3a 2b, y ¼ a þ b. Hence, ½v ¼ ½3a 2b; a þ b. 152 CHAPTER 4 Vector Spaces
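Finding coordinates relative to a basis, as in Problem 4.61(a), amounts to solving the linear system whose coefficient columns are the basis vectors. In the sketch the second component of v is taken as -3, matching the worked answer [18, -7].

```python
from sympy import Matrix

P = Matrix([[1, 2],
            [1, 3]])         # columns are the basis vectors u1 = (1,1), u2 = (2,3)
v = Matrix([4, -3])

coords = P.LUsolve(v)        # solves P * [x, y]^T = v
print(coords.T)              # [18, -7]  ->  [v]_S = [18, -7]
```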
  • 160. 4.62. Find the coordinate vector of v ¼ ða; b; cÞ in R3 relative to (a) the usual basis E ¼ fð1; 0; 0Þ; ð0; 1; 0Þ; ð0; 0; 1Þg, (b) the basis S ¼ fu1; u2; u3g ¼ fð1; 1; 1Þ; ð1; 1; 0Þ; ð1; 0; 0Þg. (a) Relative to the usual basis E, the coordinates of ½vE are the same as v. That is, ½vE ¼ ½a; b; c. (b) Set v as a linear combination of u1, u2, u3 using unknown scalars x, y, z. This yields a b c 2 4 3 5 ¼ x 1 1 1 2 4 3 5 þ y 1 1 0 2 4 3 5 þ z 1 0 0 2 4 3 5 or x þ y þ z ¼ a x þ y ¼ b x ¼ c Solving the system yields x ¼ c, y ¼ b c, z ¼ a b. Thus, ½vS ¼ ½c; b c; a b. 4.63. Consider the vector space P3ðtÞ of polynomials of degree 3. (a) Show that S ¼ fðt 1Þ3 ; ðt 1Þ2 ; t 1; 1g is a basis of P3ðtÞ. (b) Find the coordinate vector ½v of v ¼ 3t3 4t2 þ 2t 5 relative to S. (a) The degree of ðt 1Þk is k; writing the polynomials of S in reverse order, we see that no polynomial is a linear combination of preceding polynomials. Thus, the polynomials are linearly independent, and, because dim P3ðtÞ ¼ 4, they form a basis of P3ðtÞ. (b) Set v as a linear combination of the basis vectors using unknown scalars x, y, z, s. We have v ¼ 3t3 þ 4t2 þ 2t 5 ¼ xðt 1Þ3 þ yðt 1Þ2 þ zðt 1Þ þ sð1Þ ¼ xðt3 3t2 þ 3t 1Þ þ yðt2 2t þ 1Þ þ zðt 1Þ þ sð1Þ ¼ xt3 3xt2 þ 3xt x þ yt2 2yt þ y þ zt z þ s ¼ xt3 þ ð3x þ yÞt2 þ ð3x 2y þ zÞt þ ðx þ y z þ sÞ Then set coefficients of the same powers of t equal to each other to obtain x ¼ 3; 3x þ y ¼ 4; 3x 2y þ z ¼ 2; x þ y z þ s ¼ 5 Solving the system yields x ¼ 3, y ¼ 13, z ¼ 19, s ¼ 4. Thus, ½v ¼ ½3; 13; 19; 4. 4.64. Find the coordinate vector of A ¼ 2 3 4 7 in the real vector space M ¼ M2;2 relative to (a) the basis S ¼ 1 1 1 1 ; 1 1 1 0 ; 1 1 0 0 ; 1 0 0 0 , (b) the usual basis E ¼ 1 0 0 0 ; 0 1 0 0 ; 0 0 1 0 ; 0 0 0 1 (a) Set A as a linear combination of the basis vectors using unknown scalars x, y, z, t as follows: A ¼ 2 3 4 7 ¼ x 1 1 1 1 þ y 1 1 1 0 þ z 1 1 0 0 þ t 1 0 0 0 ¼ x þ z þ t x y z x þ y x Set corresponding entries equal to each other to obtain the system x þ z þ t ¼ 2; x y z ¼ 3; x þ y ¼ 4; x ¼ 7 Solving the system yields x ¼ 7, y ¼ 11, z ¼ 21, t ¼ 30. Thus, ½AS ¼ ½7; 11; 21; 30. (Note that the coordinate vector of A is a vector in R4 , because dim M ¼ 4.) (b) Expressing A as a linear combination of the basis matrices yields 2 3 4 7 ¼ x 1 0 0 0 þ y 0 1 0 0 þ z 0 0 1 0 þ t 0 0 0 1 ¼ x y z t Thus, x ¼ 2, y ¼ 3, z ¼ 4, t ¼ 7. Hence, ½A ¼ ½2; 3; 4; 7, whose components are the elements of A written row by row. CHAPTER 4 Vector Spaces 153
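Problem 4.63(b) can be automated by expanding the linear combination and matching coefficients of like powers of t. The printed statement and the worked solution disagree on the sign of the t^2 term; the sketch follows the worked solution, v = 3t^3 + 4t^2 + 2t - 5, which is what yields [3, 13, 19, 4].

```python
from sympy import symbols, Poly, expand, solve

t, x, y, z, s = symbols('t x y z s')

v = 3*t**3 + 4*t**2 + 2*t - 5
combo = x*(t - 1)**3 + y*(t - 1)**2 + z*(t - 1) + s

# Coefficients of combo - v must all vanish.
eqs = Poly(expand(combo - v), t).all_coeffs()
print(solve(eqs, [x, y, z, s]))    # {x: 3, y: 13, z: 19, s: 4} -> [v]_S = [3, 13, 19, 4]
```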
  • 161. Remark: This result is true in general; that is, if A is any m n matrix in M ¼ Mm;n, then the coordinates of A relative to the usual basis of M are the elements of A written row by row. 4.65. In the space M ¼ M2;3, determine whether or not the following matrices are linearly dependent: A ¼ 1 2 3 4 0 5 ; B ¼ 2 4 7 10 1 13 ; C ¼ 1 2 5 8 2 11 If the matrices are linearly dependent, find the dimension and a basis of the subspace W of M spanned by the matrices. The coordinate vectors of the above matrices relative to the usual basis of M are as follows: ½A ¼ ½1; 2; 3; 4; 0; 5; ½B ¼ ½2; 4; 7; 10; 1; 13; ½C ¼ ½1; 2; 5; 8; 2; 11 Form the matrix M whose rows are the above coordinate vectors, and reduce M to echelon form: M ¼ 1 2 3 4 0 5 2 4 7 10 1 13 1 2 5 8 2 11 2 4 3 5 1 2 3 4 0 5 0 0 1 2 1 3 0 0 0 0 0 0 2 4 3 5 Because the echelon matrix has only two nonzero rows, the coordinate vectors ½A, ½B, ½C span a space of dimension two, and so they are linearly dependent. Thus, A, B, C are linearly dependent. Furthermore, dim W ¼ 2, and the matrices w1 ¼ 1 2 3 4 0 5 and w2 ¼ 0 0 1 2 1 3 corresponding to the nonzero rows of the echelon matrix form a basis of W. Miscellaneous Problems 4.66. Consider a finite sequence of vectors S ¼ fv1; v2; . . . ; vng. Let T be the sequence of vectors obtained from S by one of the following ‘‘elementary operations’’: (i) interchange two vectors, (ii) multiply a vector by a nonzero scalar, (iii) add a multiple of one vector to another. Show that S and T span the same space W. Also show that T is independent if and only if S is independent. Observe that, for each operation, the vectors in T are linear combinations of vectors in S. On the other hand, each operation has an inverse of the same type (Prove!); hence, the vectors in S are linear combinations of vectors in T. Thus S and T span the same space W. Also, T is independent if and only if dim W ¼ n, and this is true if and only if S is also independent. 4.67. Let A ¼ ½aij and B ¼ ½bij be row equivalent m n matrices over a field K, and let v1; . . . ; vn be any vectors in a vector space V over K. Let u1 ¼ a11v1 þ a12v2 þ þ a1nvn u2 ¼ a21v1 þ a22v2 þ þ a2nvn um ¼ am1v1 þ am2v2 þ þ amnvn w1 ¼ b11v1 þ b12v2 þ þ b1nvn w2 ¼ b21v1 þ b22v2 þ þ b2nvn ::::::::::::::::::::::::::::::::::::::::::::::::::::: ::::::::::::::::::::::::::::::::::::::::::::::::::::::: wm ¼ bm1v1 þ bm2v2 þ þ bmnvn Show that fuig and fwig span the same space. Applying an ‘‘elementary operation’’ of Problem 4.66 to fuig is equivalent to applying an elementary row operation to the matrix A. Because A and B are row equivalent, B can be obtained from A by a sequence of elementary row operations; hence, fwig can be obtained from fuig by the corresponding sequence of operations. Accordingly, fuig and fwig span the same space. 4.68. Let v1; . . . ; vn belong to a vector space V over K, and let P ¼ ½aij be an n-square matrix over K. Let w1 ¼ a11v1 þ a12v2 þ þ a1nvn; . . . ; wn ¼ an1v1 þ an2v2 þ þ annvn (a) Suppose P is invertible. Show that fwig and fvig span the same space; hence, fwig is independent if and only if fvig is independent. (b) Suppose P is not invertible. Show that fwig is dependent. (c) Suppose fwig is independent. Show that P is invertible. 154 CHAPTER 4 Vector Spaces
  • 162. (a) Because P is invertible, it is row equivalent to the identity matrix I. Hence, by Problem 4.67, fwig and fvig span the same space. Thus, one is independent if and only if the other is. (b) Because P is not invertible, it is row equivalent to a matrix with a zero row. This means that fwig spans a space that has a spanning set of less than n elements. Thus, fwig is dependent. (c) This is the contrapositive of the statement of (b), and so it follows from (b). 4.69. Suppose that A1; A2; . . . are linearly independent sets of vectors, and that A1 A2 . . .. Show that the union A ¼ A1 [ A2 [ . . . is also linearly independent. Suppose A is linearly dependent. Then there exist vectors v1; . . . ; vn 2 A and scalars a1; . . . ; an 2 K, not all of them 0, such that a1v1 þ a2v2 þ þ anvn ¼ 0 ð1Þ Because A ¼ [ Ai and the vi 2 A, there exist sets Ai1 ; . . . ; Ain such that v1 2 Ai1 ; v2 2 Ai2 ; . . . ; vn 2 Ain Let k be the maximum index of the sets Aij : k ¼ maxði1; . . . ; inÞ. It follows then, as A1 A2 . . . ; that each Aij is contained in Ak. Hence, v1; v2; . . . ; vn 2 Ak, and so, by (1), Ak is linearly dependent, which contradicts our hypothesis. Thus, A is linearly independent. 4.70. Let K be a subfield of a field L, and let L be a subfield of a field E. (Thus, K L E, and K is a subfield of E.) Suppose E is of dimension n over L, and L is of dimension m over K. Show that E is of dimension mn over K. Suppose fv1; . . . ; vng is a basis of E over L and fa1; . . . ; amg is a basis of L over K. We claim that faivj : i ¼ 1; . . . ; m; j ¼ 1; . . . ; ng is a basis of E over K. Note that faivjg contains mn elements. Let w be any arbitrary element in E. Because fv1; . . . ; vng spans E over L, w is a linear combination of the vi with coefficients in L: w ¼ b1v1 þ b2v2 þ þ bnvn; bi 2 L ð1Þ Because fa1; . . . ; amg spans L over K, each bi 2 L is a linear combination of the aj with coefficients in K: b1 ¼ k11a1 þ k12a2 þ þ k1mam b2 ¼ k21a1 þ k22a2 þ þ k2mam :::::::::::::::::::::::::::::::::::::::::::::::::: bn ¼ kn1a1 þ kn2a2 þ þ kmnam where kij 2 K. Substituting in (1), we obtain w ¼ ðk11a1 þ þ k1mamÞv1 þ ðk21a1 þ þ k2mamÞv2 þ þ ðkn1a1 þ þ knmamÞvn ¼ k11a1v1 þ þ k1mamv1 þ k21a1v2 þ þ k2mamv2 þ þ kn1a1vn þ þ knmamvn ¼ P i;j kjiðaivjÞ where kji 2 K. Thus, w is a linear combination of the aivj with coefficients in K; hence, faivjg spans E over K. The proof is complete if we show that faivjg is linearly independent over K. Suppose, for scalars xji 2 K; we have P i;j xjiðaivjÞ ¼ 0; that is, ðx11a1v1 þ x12a2v1 þ þ x1mamv1Þ þ þ ðxn1a1vn þ xn2a2vn þ þ xnmamvmÞ ¼ 0 or ðx11a1 þ x12a2 þ þ x1mamÞv1 þ þ ðxn1a1 þ xn2a2 þ þ xnmamÞvn ¼ 0 Because fv1; . . . ; vng is linearly independent over L and the above coefficients of the vi belong to L, each coefficient must be 0: x11a1 þ x12a2 þ þ x1mam ¼ 0; . . . ; xn1a1 þ xn2a2 þ þ xnmam ¼ 0 CHAPTER 4 Vector Spaces 155
  • 163. But fa1; . . . ; amg is linearly independent over K; hence, because the xji 2 K, x11 ¼ 0; x12 ¼ 0; . . . ; x1m ¼ 0; . . . ; xn1 ¼ 0; xn2 ¼ 0; . . . ; xnm ¼ 0 Accordingly, faivjg is linearly independent over K, and the theorem is proved. SUPPLEMENTARY PROBLEMS Vector Spaces 4.71. Suppose u and v belong to a vector space V. Simplify each of the following expressions: (a) E1 ¼ 4ð5u 6vÞ þ 2ð3u þ vÞ, (c) E3 ¼ 6ð3u þ 2vÞ þ 5u 7v, (b) E2 ¼ 5ð2u 3vÞ þ 4ð7v þ 8Þ, (d) E4 ¼ 3ð5u þ 2=vÞ: 4.72. Let V be the set of ordered pairs (a; b) of real numbers with addition in V and scalar multiplication on V defined by ða; bÞ þ ðc; dÞ ¼ ða þ c; b þ dÞ and kða; bÞ ¼ ðka; 0Þ Show that V satisfies all the axioms of a vector space except [M4]—that is, except 1u ¼ u. Hence, [M4] is not a consequence of the other axioms. 4.73. Show that Axiom [A4] of a vector space V (that u þ v ¼ v þ u) can be derived from the other axioms for V. 4.74. Let V be the set of ordered pairs (a; b) of real numbers. Show that V is not a vector space over R with addition and scalar multiplication defined by (i) ða; bÞ þ ðc; dÞ ¼ ða þ d; b þ cÞ and kða; bÞ ¼ ðka; kbÞ, (ii) ða; bÞ þ ðc; dÞ ¼ ða þ c; b þ dÞ and kða; bÞ ¼ ða; bÞ, (iii) ða; bÞ þ ðc; dÞ ¼ ð0; 0Þ and kða; bÞ ¼ ðka; kbÞ, (iv) ða; bÞ þ ðc; dÞ ¼ ðac; bdÞ and kða; bÞ ¼ ðka; kbÞ. 4.75. Let V be the set of infinite sequences (a1; a2; . . .) in a field K. Show that V is a vector space over K with addition and scalar multiplication defined by ða1; a2; . . .Þ þ ðb1; b2; . . .Þ ¼ ða1 þ b1; a2 þ b2; . . .Þ and kða1; a2; . . .Þ ¼ ðka1; ka2; . . .Þ 4.76. Let U and W be vector spaces over a field K. Let V be the set of ordered pairs (u; w) where u 2 U and w 2 W. Show that V is a vector space over K with addition in V and scalar multiplication on V defined by ðu; wÞ þ ðu0 ; w0 Þ ¼ ðu þ u0 ; w þ w0 Þ and kðu; wÞ ¼ ðku; kwÞ (This space V is called the external direct product of U and W.) Subspaces 4.77. Determine whether or not W is a subspace of R3 where W consists of all vectors (a; b; c) in R3 such that (a) a ¼ 3b, (b) a b c, (c) ab ¼ 0, (d) a þ b þ c ¼ 0, (e) b ¼ a2 , ( f ) a ¼ 2b ¼ 3c. 4.78. Let V be the vector space of n-square matrices over a field K. Show that W is a subspace of V if W consists of all matrices A ¼ ½aij that are (a) symmetric (AT ¼ A or aij ¼ aji), (b) (upper) triangular, (c) diagonal, (d) scalar. 4.79. Let AX ¼ B be a nonhomogeneous system of linear equations in n unknowns; that is, B 6¼ 0. Show that the solution set is not a subspace of Kn . 4.80. Suppose U and W are subspaces of V for which U [ W is a subspace. Show that U W or W U. 4.81. Let V be the vector space of all functions from the real field R into R. Show that W is a subspace of V where W consists of all: (a) bounded functions, (b) even functions. [Recall that f : R ! R is bounded if 9M 2 R such that 8x 2 R, we have j f ðxÞj M; and f ðxÞ is even if f ðxÞ ¼ f ðxÞ; 8x 2 R.] 156 CHAPTER 4 Vector Spaces
  • 164. 4.82. Let V be the vector space (Problem 4.75) of infinite sequences (a1; a2; . . .) in a field K. Show that W is a subspace of V if W consists of all sequences with (a) 0 as the first element, (b) only a finite number of nonzero elements. Linear Combinations, Linear Spans 4.83. Consider the vectors u ¼ ð1; 2; 3Þ and v ¼ ð2; 3; 1Þ in R3 . (a) Write w ¼ ð1; 3; 8Þ as a linear combination of u and v. (b) Write w ¼ ð2; 4; 5Þ as a linear combination of u and v. (c) Find k so that w ¼ ð1; k; 4Þ is a linear combination of u and v. (d) Find conditions on a, b, c so that w ¼ ða; b; cÞ is a linear combination of u and v. 4.84. Write the polynomial f ðtÞ ¼ at2 þ bt þ c as a linear combination of the polynomials p1 ¼ ðt 1Þ2 , p2 ¼ t 1, p3 ¼ 1. [Thus, p1, p2, p3 span the space P2ðtÞ of polynomials of degree 2.] 4.85. Find one vector in R3 that spans the intersection of U and W where U is the xy-plane—that is, U ¼ fða; b; 0Þg—and W is the space spanned by the vectors (1, 1, 1) and (1, 2, 3). 4.86. Prove that span(S) is the intersection of all subspaces of V containing S. 4.87. Show that spanðSÞ ¼ spanðS [ f0gÞ. That is, by joining or deleting the zero vector from a set, we do not change the space spanned by the set. 4.88. Show that (a) If S T, then spanðSÞ spanðTÞ. (b) span½spanðSÞ ¼ spanðSÞ. Linear Dependence and Linear Independence 4.89. Determine whether the following vectors in R4 are linearly dependent or independent: (a) ð1; 2; 3; 1Þ, ð3; 7; 1; 2Þ, ð1; 3; 7; 4Þ; (b) ð1; 3; 1; 2Þ, ð2; 5; 1; 3Þ, ð1; 3; 7; 2Þ. 4.90. Determine whether the following polynomials u, v, w in PðtÞ are linearly dependent or independent: (a) u ¼ t3 4t2 þ 3t þ 3, v ¼ t3 þ 2t2 þ 4t 1, w ¼ 2t3 t2 3t þ 5; (b) u ¼ t3 5t2 2t þ 3, v ¼ t3 4t2 3t þ 4, w ¼ 2t3 17t2 7t þ 9. 4.91. Show that the following functions f , g, h are linearly independent: (a) f ðtÞ ¼ et , gðtÞ ¼ sin t, hðtÞ ¼ t2 ; (b) f ðtÞ ¼ et , gðtÞ ¼ e2t , hðtÞ ¼ t. 4.92. Show that u ¼ ða; bÞ and v ¼ ðc; dÞ in K2 are linearly dependent if and only if ad bc ¼ 0. 4.93. Suppose u, v, w are linearly independent vectors. Prove that S is linearly independent where (a) S ¼ fu þ v 2w; u v w; u þ wg; (b) S ¼ fu þ v 3w; u þ 3v w; v þ wg. 4.94. Suppose fu1; . . . ; ur; w1; . . . ; wsg is a linearly independent subset of V. Show that spanðuiÞ spanðwjÞ ¼ f0g 4.95. Suppose v1; v2; . . . ; vn are linearly independent. Prove that S is linearly independent where (a) S ¼ fa1v1; a2v2; . . . ; anvng and each ai 6¼ 0. (b) S ¼ fv1; . . . ; vk1; w; vkþ1; . . . ; vng and w ¼ P i bivi and bk 6¼ 0. 4.96. Suppose ða11; . . . ; a1nÞ; ða21; . . . ; a2nÞ; . . . ; ðam1; . . . ; amnÞ are linearly independent vectors in Kn , and suppose v1; v2; . . . ; vn are linearly independent vectors in a vector space V over K. Show that the following CHAPTER 4 Vector Spaces 157
  • 165. vectors are also linearly independent: w1 ¼ a11v1 þ þ a1nvn; w2 ¼ a21v1 þ þ a2nvn; . . . ; wm ¼ am1v1 þ þ amnvn Basis and Dimension 4.97. Find a subset of u1, u2, u3, u4 that gives a basis for W ¼ spanðuiÞ of R5 , where (a) u1 ¼ ð1; 1; 1; 2; 3Þ, u2 ¼ ð1; 2; 1; 2; 1Þ, u3 ¼ ð3; 5; 1; 2; 5Þ, u4 ¼ ð1; 2; 1; 1; 4Þ (b) u1 ¼ ð1; 2; 1; 3; 1Þ, u2 ¼ ð2; 4; 2; 6; 2Þ, u3 ¼ ð1; 3; 1; 2; 1Þ, u4 ¼ ð3; 7; 3; 8; 1Þ (c) u1 ¼ ð1; 0; 1; 0; 1Þ, u2 ¼ ð1; 1; 2; 1; 0Þ, u3 ¼ ð2; 1; 3; 1; 1Þ, u4 ¼ ð1; 2; 1; 1; 1Þ (d) u1 ¼ ð1; 0; 1; 1; 1Þ, u2 ¼ ð2; 1; 2; 0; 1Þ, u3 ¼ ð1; 1; 2; 3; 4Þ, u4 ¼ ð4; 2; 5; 4; 6Þ 4.98. Consider the subspaces U ¼ fða; b; c; dÞ : b 2c þ d ¼ 0g and W ¼ fða; b; c; dÞ : a ¼ d; b ¼ 2cg of R4 . Find a basis and the dimension of (a) U, (b) W, (c) U W. 4.99. Find a basis and the dimension of the solution space W of each of the following homogeneous systems: ðaÞ x þ 2y 2z þ 2s t ¼ 0 x þ 2y z þ 3s 2t ¼ 0 2x þ 4y 7z þ s þ t ¼ 0 ðbÞ x þ 2y z þ 3s 4t ¼ 0 2x þ 4y 2z s þ 5t ¼ 0 2x þ 4y 2z þ 4s 2t ¼ 0 4.100. Find a homogeneous system whose solution space is spanned by the following sets of three vectors: (a) ð1; 2; 0; 3; 1Þ, ð2; 3; 2; 5; 3Þ, ð1; 2; 1; 2; 2Þ; (b) (1, 1, 2, 1, 1), (1, 2, 1, 4, 3), (3, 5, 4, 9, 7). 4.101. Determine whether each of the following is a basis of the vector space PnðtÞ: (a) f1; 1 þ t; 1 þ t þ t2 ; 1 þ t þ t2 þ t3 ; . . . ; 1 þ t þ t2 þ þ tn1 þ tn g; (b) f1 þ t; t þ t2 ; t2 þ t3 ; . . . ; tn2 þ tn1 ; tn1 þ tn g: 4.102. Find a basis and the dimension of the subspace W of PðtÞ spanned by (a) u ¼ t3 þ 2t2 2t þ 1, v ¼ t3 þ 3t2 3t þ 4, w ¼ 2t3 þ t2 7t 7, (b) u ¼ t3 þ t2 3t þ 2, v ¼ 2t3 þ t2 þ t 4, w ¼ 4t3 þ 3t2 5t þ 2. 4.103. Find a basis and the dimension of the subspace W of V ¼ M2;2 spanned by A ¼ 1 5 4 2 ; B ¼ 1 1 1 5 ; C ¼ 2 4 5 7 ; D ¼ 1 7 5 1 Rank of a Matrix, Row and Column Spaces 4.104. Find the rank of each of the following matrices: (a) 1 3 2 5 4 1 4 1 3 5 1 4 2 4 3 2 7 3 6 13 2 6 6 4 3 7 7 5, (b) 1 2 3 2 1 3 2 0 3 8 7 2 2 1 9 10 2 6 6 4 3 7 7 5, (c) 1 1 2 4 5 5 5 8 1 1 2 2 2 6 6 4 3 7 7 5 4.105. For k ¼ 1; 2; . . . ; 5, find the number nk of linearly independent subsets consisting of k columns for each of the following matrices: (a) A ¼ 1 1 0 2 3 1 2 0 2 5 1 3 0 2 7 2 4 3 5, (b) B ¼ 1 2 1 0 2 1 2 3 0 4 1 1 5 0 6 2 4 3 5 158 CHAPTER 4 Vector Spaces
  • 166. 4.106. Let (a) A ¼ 1 2 1 3 1 6 2 4 3 8 3 15 1 2 2 5 3 11 4 8 6 16 7 32 2 6 6 4 3 7 7 5, (b) B ¼ 1 2 2 1 2 1 2 4 5 4 5 5 1 2 3 4 4 6 3 6 7 7 9 10 2 6 6 4 3 7 7 5 For each matrix (where C1; . . . ; C6 denote its columns): (i) Find its row canonical form M. (ii) Find the columns that are linear combinations of preceding columns. (iii) Find columns (excluding C6) that form a basis for the column space. (iv) Express C6 as a linear combination of the basis vectors obtained in (iii). 4.107. Determine which of the following matrices have the same row space: A ¼ 1 2 1 3 4 5 ; B ¼ 1 1 2 2 3 1 ; C ¼ 1 1 3 2 1 10 3 5 1 2 4 3 5 4.108. Determine which of the following subspaces of R3 are identical: U1 ¼ span½ð1; 1; 1Þ; ð2; 3; 1Þ; ð3; 1; 5Þ; U2 ¼ span½ð1; 1; 3Þ; ð3; 2; 8Þ; ð2; 1; 3Þ U3 ¼ span½ð1; 1; 1Þ; ð1; 1; 3Þ; ð3; 1; 7Þ 4.109. Determine which of the following subspaces of R4 are identical: U1 ¼ span½ð1; 2; 1; 4Þ; ð2; 4; 1; 5Þ; ð3; 6; 2; 9Þ; U2 ¼ span½ð1; 2; 1; 2Þ; ð2; 4; 1; 3Þ; U3 ¼ span½ð1; 2; 3; 10Þ; ð2; 4; 3; 11Þ 4.110. Find a basis for (i) the row space and (ii) the column space of each matrix M: (a) M ¼ 0 0 3 1 4 1 3 1 2 1 3 9 4 5 2 4 12 8 8 7 2 6 6 4 3 7 7 5, (b) M ¼ 1 2 1 0 1 1 2 2 1 3 3 6 5 2 7 2 4 1 1 0 2 6 6 4 3 7 7 5. 4.111. Show that if any row is deleted from a matrix in echelon (respectively, row canonical) form, then the resulting matrix is still in echelon (respectively, row canonical) form. 4.112. Let A and B be arbitrary m n matrices. Show that rankðA þ BÞ rankðAÞ þ rankðBÞ. 4.113. Let r ¼ rankðA þ BÞ. Find 2 2 matrices A and B such that (a) r rankðAÞ, rank(B); (b) r ¼ rankðAÞ ¼ rankðBÞ; (c) r rankðAÞ, rank(B). Sums, Direct Sums, Intersections 4.114. Suppose U and W are two-dimensional subspaces of K3 . Show that U W 6¼ f0g. 4.115. Suppose U and W are subspaces of V such that dim U ¼ 4, dim W ¼ 5, and dim V ¼ 7. Find the possible dimensions of U W. 4.116. Let U and W be subspaces of R3 for which dim U ¼ 1, dim W ¼ 2, and U 6 W. Show that R3 ¼ U W. 4.117. Consider the following subspaces of R5 : U ¼ span½ð1; 1; 1; 2; 0Þ; ð1; 2; 2; 0; 3Þ; ð1; 1; 2; 2; 1Þ W ¼ span½ð1; 2; 3; 0; 2Þ; ð1; 1; 3; 2; 4Þ; ð1; 1; 2; 2; 5Þ CHAPTER 4 Vector Spaces 159
  • 167. (a) Find two homogeneous systems whose solution spaces are U and W, respectively. (b) Find a basis and the dimension of U W. 4.118. Let U1, U2, U3 be the following subspaces of R3 : U1 ¼ fða; b; cÞ : a ¼ cg; U2 ¼ fða; b; cÞ : a þ b þ c ¼ 0g; U3 ¼ fð0; 0; cÞg Show that (a) R3 ¼ U1 þ U2, (b) R3 ¼ U2 þ U3, (c) R3 ¼ U1 þ U3. When is the sum direct? 4.119. Suppose U, W1, W2 are subspaces of a vector space V. Show that ðU W1Þ þ ðU W2Þ U ðW1 þ W2Þ Find subspaces of R2 for which equality does not hold. 4.120. Suppose W1; W2; . . . ; Wr are subspaces of a vector space V. Show that (a) spanðW1; W2; . . . ; WrÞ ¼ W1 þ W2 þ þ Wr. (b) If Si spans Wi for i ¼ 1; . . . ; r, then S1 [ S2 [ [ Sr spans W1 þ W2 þ þ Wr. 4.121. Suppose V ¼ U W. Show that dim V ¼ dim U þ dim W. 4.122. Let S and T be arbitrary nonempty subsets (not necessarily subspaces) of a vector space V and let k be a scalar. The sum S þ T and the scalar product kS are defined by S þ T ¼ ðu þ v : u 2 S; v 2 Tg; kS ¼ fku : u 2 Sg [We also write w þ S for fwg þ S.] Let S ¼ fð1; 2Þ; ð2; 3Þg; T ¼ fð1; 4Þ; ð1; 5Þ; ð2; 5Þg; w ¼ ð1; 1Þ; k ¼ 3 Find: (a) S þ T, (b) w þ S, (c) kS, (d) kT, (e) kS þ kT, (f) kðS þ TÞ. 4.123. Show that the above operations of S þ T and kS satisfy (a) Commutative law: S þ T ¼ T þ S. (b) Associative law: ðS1 þ S2Þ þ S3 ¼ S1 þ ðS2 þ S3Þ. (c) Distributive law: kðS þ TÞ ¼ kS þ kT. (d) S þ f0g ¼ f0g þ S ¼ S and S þ V ¼ V þ S ¼ V. 4.124. Let V be the vector space of n-square matrices. Let U be the subspace of upper triangular matrices, and let W be the subspace of lower triangular matrices. Find (a) U W, (b) U þ W. 4.125. Let V be the external direct sum of vector spaces U and W over a field K. (See Problem 4.76.) Let ^ U ¼ fðu; 0Þ : u 2 Ug and ^ W ¼ fð0; wÞ : w 2 Wg Show that (a) ^ U and ^ W are subspaces of V, (b) V ¼ ^ U ^ W. 4.126. Suppose V ¼ U þ W. Let ^ V be the external direct sum of U and W. Show that V is isomorphic to ^ V under the correspondence v ¼ u þ w $ ðu; wÞ. 4.127. Use induction to prove (a) Theorem 4.22, (b) Theorem 4.23. Coordinates 4.128. The vectors u1 ¼ ð1; 2Þ and u2 ¼ ð4; 7Þ form a basis S of R2 . Find the coordinate vector ½v of v relative to S where (a) v ¼ ð5; 3Þ, (b) v ¼ ða; bÞ. 4.129. The vectors u1 ¼ ð1; 2; 0Þ, u2 ¼ ð1; 3; 2Þ, u3 ¼ ð0; 1; 3Þ form a basis S of R3 . Find the coordinate vector ½v of v relative to S where (a) v ¼ ð2; 7; 4Þ, (b) v ¼ ða; b; cÞ. 160 CHAPTER 4 Vector Spaces
  • 168. 4.130. S ¼ ft3 þ t2 ; t2 þ t; t þ 1; 1g is a basis of P3ðtÞ. Find the coordinate vector ½v of v relative to S where (a) v ¼ 2t3 þ t2 4t þ 2, (b) v ¼ at3 þ bt2 þ ct þ d. 4.131. Let V ¼ M2;2. Find the coordinate vector [A] of A relative to S where S ¼ 1 1 1 1 ; 1 1 1 0 ; 1 1 0 0 ; 1 0 0 0 and ðaÞ A ¼ 3 5 6 7 ; ðbÞ A ¼ a b c d 4.132. Find the dimension and a basis of the subspace W of P3ðtÞ spanned by u ¼ t3 þ 2t2 3t þ 4; v ¼ 2t3 þ 5t2 4t þ 7; w ¼ t3 þ 4t2 þ t þ 2 4.133. Find the dimension and a basis of the subspace W of M ¼ M2;3 spanned by A ¼ 1 2 1 3 1 2 ; B ¼ 2 4 3 7 5 6 ; C ¼ 1 2 3 5 7 6 Miscellaneous Problems 4.134. Answer true or false. If false, prove it with a counterexample. (a) If u1, u2, u3 span V, then dim V ¼ 3. (b) If A is a 4 8 matrix, then any six columns are linearly dependent. (c) If u1, u2, u3 are linearly independent, then u1, u2, u3, w are linearly dependent. (d) If u1, u2, u3, u4 are linearly independent, then dim V 4. (e) If u1, u2, u3 span V, then w, u1, u2, u3 span V. (f) If u1, u2, u3, u4 are linearly independent, then u1, u2, u3 are linearly independent. 4.135. Answer true or false. If false, prove it with a counterexample. (a) If any column is deleted from a matrix in echelon form, then the resulting matrix is still in echelon form. (b) If any column is deleted from a matrix in row canonical form, then the resulting matrix is still in row canonical form. (c) If any column without a pivot is deleted from a matrix in row canonical form, then the resulting matrix is in row canonical form. 4.136. Determine the dimension of the vector space W of the following n-square matrices: (a) symmetric matrices, (b) antisymmetric matrices, (d) diagonal matrices, (c) scalar matrices. 4.137. Let t1; t2; . . . ; tn be symbols, and let K be any field. Let V be the following set of expressions where ai 2 K: a1t1 þ a2t2 þ þ antn Define addition in V and scalar multiplication on V by ða1t1 þ þ antnÞ þ ðb1t1 þ þ bntnÞ ¼ ða1 þ b1Þt1 þ þ ðanbnmÞtn kða1t1 þ a2t2 þ þ antnÞ ¼ ka1t1 þ ka2t2 þ þ kantn Show that V is a vector space over K with the above operations. Also, show that ft1; . . . ; tng is a basis of V, where tj ¼ 0t1 þ þ 0tj1 þ 1tj þ 0tjþ1 þ þ 0tn CHAPTER 4 Vector Spaces 161
  • 169. ANSWERS TO SUPPLEMENTARY PROBLEMS [Some answers, such as bases, need not be unique.] 4.71. (a) E1 ¼ 26u 22v; (b) The sum 7v þ 8 is not defined, so E2 is not defined; (c) E3 ¼ 23u þ 5v; (d) Division by v is not defined, so E4 is not defined. 4.77. (a) Yes; (b) No; e.g., ð1; 2; 3Þ 2 W but 2ð1; 2; 3Þ 62 W; (c) No; e.g., ð1; 0; 0Þ; ð0; 1; 0Þ 2 W, but not their sum; (d) Yes; (e) No; e.g., ð1; 1; 1Þ 2 W, but 2ð1; 1; 1Þ 62 W; (f) Yes 4.79. The zero vector 0 is not a solution. 4.83. (a) w ¼ 3u1 u2, (b) Impossible, (c) k ¼ 11 5 , (d) 7a 5b þ c ¼ 0 4.84. Using f ¼ xp1 þ yp2 þ zp3, we get x ¼ a, y ¼ 2a þ b, z ¼ a þ b þ c 4.85. v ¼ ð2; 1; 0Þ 4.89. (a) Dependent, (b) Independent 4.90. (a) Independent, (b) Dependent 4.97. (a) u1, u2, u4; (b) u1, u2, u3; (c) u1, u2, u4; (d) u1, u2, u3 4.98. (a) dim U ¼ 3, (b) dim W ¼ 2, (c) dimðU WÞ ¼ 1 4.99. (a) Basis: fð2; 1; 0; 0; 0Þ; ð4; 0; 1; 1; 0Þ; ð3; 0; 1; 0; 1Þg; dim W ¼ 3; (b) Basis: fð2; 1; 0; 0; 0Þ; ð1; 0; 1; 0; 0Þg; dim W ¼ 2 4.100. (a) 5x þ y z s ¼ 0; x þ y z t ¼ 0; (b) 3x y z ¼ 0; 2x 3y þ s ¼ 0; x 2y þ t ¼ 0 4.101. (a) Yes, (b) No, because dim PnðtÞ ¼ n þ 1, but the set contains only n elements. 4.102. (a) dim W ¼ 2, (b) dim W ¼ 3 4.103. dim W ¼ 2 4.104. (a) 3, (b) 2, (c) 3 4.105. (a) n1 ¼ 4; n2 ¼ 5; n3 ¼ n4 ¼ n5 ¼ 0; (b) n1 ¼ 4; n2 ¼ 6; n3 ¼ 3; n4 ¼ n5 ¼ 0 4.106. (a) (i) M ¼ ½1; 2; 0; 1; 0; 3; 0; 0; 1; 2; 0; 1; 0; 0; 0; 0; 1; 2; 0; (ii) C2, C4, C6; (iii) C1, C3, C5; (iv) C6 ¼ 3C1 þ C3 þ 2C5. (b) (i) M ¼ ½1; 2; 0; 0; 3; 1; 0; 0; 1; 0; 1; 1; 0; 0; 0; 1; 1; 2; 0; (ii) C2, C5, C6; (iii) C1, C3, C4; (iv) C6 ¼ C1 C3 þ 2C4 4.107. A and C are row equivalent to 1 0 7 0 1 4 , but not B 4.108. U1 and U2 are row equivalent to 1 0 2 0 1 1 , but not U3 4.109. U1 and U3 are row equivalent to 1 2 0 1 0 0 1 3 ; but not U2 4.110. (a) (i) ð1; 3; 1; 2; 1Þ, ð0; 0; 1; 1; 1Þ, ð0; 0; 0; 4; 7Þ; (ii) C1, C3, C4; (b) (i) ð1; 2; 1; 0; 1Þ, ð0; 0; 1; 1; 2Þ; (ii) C1, C3 162 CHAPTER 4 Vector Spaces
  • 170. 4.113. (a) A ¼ 1 1 0 0 ; B ¼ 1 1 0 0 ; (b) A ¼ 1 0 0 0 ; B ¼ 0 2 0 0 ; (c) A ¼ 1 0 0 0 ; B ¼ 0 0 0 1 4.115. dimðU WÞ ¼ 2, 3, or 4 4.117. (a) (i) 3x þ 4y z t ¼ 0 4x þ 2y þ s ¼ 0 (ii) 4x þ 2y s ¼ 0 9x þ 2y þ z þ t ¼ 0 ; (b) Basis: fð1; 2; 5; 0; 0Þ; ð0; 0; 1; 0; 1Þg; dimðU WÞ ¼ 2 4.118. The sum is direct in (b) and (c). 4.119. In R2 , let U, V, W be, respectively, the line y ¼ x, the x-axis, the y-axis. 4.122. (a) fð2; 6Þ; ð2; 7Þ; ð3; 7Þ; ð3; 8Þ; ð4; 8Þg; (b) fð2; 3Þ; ð3; 4Þg; (c) fð3; 6Þ; ð6; 9Þg; (d) fð3; 12Þ; ð3; 15Þ; ð6; 15Þg; (e and f) fð6; 18Þ; ð6; 21Þ; ð9; 21Þ; ð9; 24Þ; ð12; 24Þg 4.124. (a) Diagonal matrices, (b) V 4.128. (a) [41; 11], (b) [7a 4b; 2a þ b] 4.129. (a) [11; 13; 10], (b) [c 3b þ 7a; c þ 3b 6a; c 2b þ 4a] 4.130. (a) [2; 1; 2; 2], (b) [a; b c; c b þ a; d c þ b a] 4.131. (a) [7; 1; 13; 10], (b) [d; c d; b þ c 2d; a b 2c þ 2d] 4.132. dim W ¼ 2; basis: ft3 þ 2t2 3t þ 4; t2 þ 2t 1g 4.133. dim W ¼ 2; basis: f½1; 2; 1; 3; 1; 2; ½0; 0; 1; 1; 3; 2g 4.134. (a) False; (1, 1), (1, 2), (2, 1) span R2 ; (b) True; (c) False; (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), w ¼ ð0; 0; 0; 1Þ; (d) True; (e) True; (f) True 4.135. (a) True; (b) False; e.g. delete C2 from 1 0 3 0 1 2 ; (c) True 4.136. (a) 1 2 nðn þ 1Þ, (b) 1 2 nðn 1Þ, (c) n, (d) 1 CHAPTER 4 Vector Spaces 163
  • 171. Linear Mappings 5.1 Introduction The main subject matter of linear algebra is the study of linear mappings and their representation by means of matrices. This chapter introduces us to these linear maps and Chapter 6 shows how they can be represented by matrices. First, however, we begin with a study of mappings in general. 5.2 Mappings, Functions Let A and B be arbitrary nonempty sets. Suppose to each element in a 2 A there is assigned a unique element of B; called the image of a. The collection f of such assignments is called a mapping (or map) from A into B, and it is denoted by f : A ! B The set A is called the domain of the mapping, and B is called the target set. We write f ðaÞ, read ‘‘f of a;’’ for the unique element of B that f assigns to a 2 A. One may also view a mapping f : A ! B as a computer that, for each input value a 2 A, produces a unique output f ðaÞ 2 B. Remark: The term function is used synonymously with the word mapping, although some texts reserve the word ‘‘function’’ for a real-valued or complex-valued mapping. Consider a mapping f : A ! B. If A0 is any subset of A, then f ðA0 Þ denotes the set of images of elements of A0 ; and if B0 is any subset of B, then f 1 ðB0 Þ denotes the set of elements of A; each of whose image lies in B. That is, f ðA0 Þ ¼ f f ðaÞ : a 2 A0 g and f 1 ðB0 Þ ¼ fa 2 A : f ðaÞ 2 B0 g We call f ðA0 ) the image of A0 and f 1 ðB0 Þ the inverse image or preimage of B0 . In particular, the set of all images (i.e., f ðAÞ) is called the image or range of f. To each mapping f : A ! B there corresponds the subset of A B given by fða; f ðaÞÞ : a 2 Ag. We call this set the graph of f . Two mappings f : A ! B and g : A ! B are defined to be equal, written f ¼ g, if f ðaÞ ¼ gðaÞ for every a 2 A—that is, if they have the same graph. Thus, we do not distinguish between a function and its graph. The negation of f ¼ g is written f 6¼ g and is the statement: There exists an a 2 A for which f ðaÞ 6¼ gðaÞ: Sometimes the ‘‘barred’’ arrow 7! is used to denote the image of an arbitrary element x 2 A under a mapping f : A ! B by writing x 7! f ðxÞ This is illustrated in the following example. 164 CHAPTER 5
  • 172. EXAMPLE 5.1 (a) Let f : R ! R be the function that assigns to each real number x its square x2 . We can denote this function by writing f ðxÞ ¼ x2 or x 7! x2 Here the image of 3 is 9, so we may write f ð3Þ ¼ 9. However, f 1 ð9Þ ¼ f3; 3g. Also, f ðRÞ ¼ ½0; 1Þ ¼ fx : x 0g is the image of f. (b) Let A ¼ fa; b; c; dg and B ¼ fx; y; z; tg. Then the following defines a mapping f : A ! B: f ðaÞ ¼ y; f ðbÞ ¼ x; f ðcÞ ¼ z; f ðdÞ ¼ y or f ¼ fða; yÞ; ðb; xÞ; ðc; zÞ; ðd; yÞg The first defines the mapping explicitly, and the second defines the mapping by its graph. Here, f ðfa; b; dgÞ ¼ f f ðaÞ; f ðbÞ; f ðdÞg ¼ fy; x; yg ¼ fx; yg Furthermore, f ðAÞ ¼ fx; y; zg is the image of f. EXAMPLE 5.2 Let V be the vector space of polynomials over R, and let pðtÞ ¼ 3t2 5t þ 2. (a) The derivative defines a mapping D : V ! V where, for any polynomials f ðtÞ, we have Dð f Þ ¼ df =dt. Thus, DðpÞ ¼ Dð3t2 5t þ 2Þ ¼ 6t 5 (b) The integral, say from 0 to 1, defines a mapping J : V ! R. That is, for any polynomial f ðtÞ, Jð f Þ ¼ ð1 0 f ðtÞ dt; and so JðpÞ ¼ ð1 0 ð3t2 5t þ 2Þ ¼ 1 2 Observe that the mapping in (b) is from the vector space V into the scalar field R, whereas the mapping in (a) is from the vector space V into itself. Matrix Mappings Let A be any m n matrix over K. Then A determines a mapping FA : Kn ! Km by FAðuÞ ¼ Au where the vectors in Kn and Km are written as columns. For example, suppose A ¼ 1 4 5 2 3 6 and u ¼ 1 3 5 2 4 3 5 then FAðuÞ ¼ Au ¼ 1 4 5 2 3 6 1 3 5 2 4 3 5 ¼ 36 41 Remark: For notational convenience, we will frequently denote the mapping FA by the letter A, the same symbol as used for the matrix. Composition of Mappings Consider two mappings f : A ! B and g : B ! C, illustrated below: A ! f B ! g C The composition of f and g, denoted by g f , is the mapping g f : A ! C defined by ðg f ÞðaÞ gð f ðaÞÞ CHAPTER 5 Linear Mappings 165
  • 173. That is, first we apply f to a 2 A, and then we apply g to f ðaÞ 2 B to get gð f ðaÞÞ 2 C. Viewing f and g as ‘‘computers,’’ the composition means we first input a 2 A to get the output f ðaÞ 2 B using f , and then we input f ðaÞ to get the output gð f ðaÞÞ 2 C using g. Our first theorem tells us that the composition of mappings satisfies the associative law. THEOREM 5.1: Let f : A ! B, g : B ! C, h : C ! D. Then h ðg f Þ ¼ ðh gÞ f We prove this theorem here. Let a 2 A. Then ðh ðg f ÞÞðaÞ ¼ hððg f ÞðaÞÞ ¼ hðgð f ðaÞÞÞ ððh gÞ f ÞðaÞ ¼ ðh gÞð f ðaÞÞ ¼ hðgð f ðaÞÞÞ Thus, ðh ðg f ÞÞðaÞ ¼ ððh gÞ f ÞðaÞ for every a 2 A, and so h ðg f Þ ¼ ðh gÞ f. One-to-One and Onto Mappings We formally introduce some special types of mappings. DEFINITION: A mapping f : A ! B is said to be one-to-one (or 1-1 or injective) if different elements of A have distinct images; that is, If f ðaÞ ¼ f ða0 Þ; then a ¼ a0 : DEFINITION: A mapping f : A ! B is said to be onto (or f maps A onto B or surjective) if every b 2 B is the image of at least one a 2 A. DEFINITION: A mapping f : A ! B is said to be a one-to-one correspondence between A and B (or bijective) if f is both one-to-one and onto. EXAMPLE 5.3 Let f : R ! R, g : R ! R, h : R ! R be defined by f ðxÞ ¼ 2x ; gðxÞ ¼ x3 x; hðxÞ ¼ x2 The graphs of these functions are shown in Fig. 5-1. The function f is one-to-one. Geometrically, this means that each horizontal line does not contain more than one point of f. The function g is onto. Geometrically, this means that each horizontal line contains at least one point of g. The function h is neither one-to-one nor onto. For example, both 2 and 2 have the same image 4, and 16 has no preimage. Identity and Inverse Mappings Let A be any nonempty set. The mapping f : A ! A defined by f ðaÞ ¼ a—that is, the function that assigns to each element in A itself—is called identity mapping. It is usually denoted by 1A or 1 or I. Thus, for any a 2 A, we have 1AðaÞ ¼ a. Figure 5-1 166 CHAPTER 5 Linear Mappings
  • 174. Now let f : A ! B. We call g : B ! A the inverse of f, written f 1 , if f g ¼ 1B and g f ¼ 1A We emphasize that f has an inverse if and only if f is a one-to-one correspondence between A and B; that is, f is one-to-one and onto (Problem 5.7). Also, if b 2 B, then f 1 ðbÞ ¼ a, where a is the unique element of A for which f ðaÞ ¼ b 5.3 Linear Mappings (Linear Transformations) We begin with a definition. DEFINITION: Let V and U be vector spaces over the same field K. A mapping F : V ! U is called a linear mapping or linear transformation if it satisfies the following two conditions: (1) For any vectors v; w 2 V, Fðv þ wÞ ¼ FðvÞ þ FðwÞ. (2) For any scalar k and vector v 2 V, FðkvÞ ¼ kFðvÞ. Namely, F : V ! U is linear if it ‘‘preserves’’ the two basic operations of a vector space, that of vector addition and that of scalar multiplication. Substituting k ¼ 0 into condition (2), we obtain Fð0Þ ¼ 0. Thus, every linear mapping takes the zero vector into the zero vector. Now for any scalars a; b 2 K and any vector v; w 2 V, we obtain Fðav þ bwÞ ¼ FðavÞ þ FðbwÞ ¼ aFðvÞ þ bFðwÞ More generally, for any scalars ai 2 K and any vectors vi 2 V, we obtain the following basic property of linear mappings: Fða1v1 þ a2v2 þ þ amvmÞ ¼ a1Fðv1Þ þ a2Fðv2Þ þ þ amFðvmÞ Remark 1: A linear mapping F : V ! U is completely characterized by the condition Fðav þ bwÞ ¼ aFðvÞ þ bFðwÞ ð*Þ and so this condition is sometimes used as its defintion. Remark 2: The term linear transformation rather than linear mapping is frequently used for linear mappings of the form F : Rn ! Rm . EXAMPLE 5.4 (a) Let F : R3 ! R3 be the ‘‘projection’’ mapping into the xy-plane; that is, F is the mapping defined by Fðx; y; zÞ ¼ ðx; y; 0Þ. We show that F is linear. Let v ¼ ða; b; cÞ and w ¼ ða0 ; b0 ; c0 Þ. Then Fðv þ wÞ ¼ Fða þ a0 ; b þ b0 ; c þ c0 Þ ¼ ða þ a0 ; b þ b0 ; 0Þ ¼ ða; b; 0Þ þ ða0 ; b0 ; 0Þ ¼ FðvÞ þ FðwÞ and, for any scalar k, FðkvÞ ¼ Fðka; kb; kcÞ ¼ ðka; kb; 0Þ ¼ kða; b; 0Þ ¼ kFðvÞ Thus, F is linear. (b) Let G : R2 ! R2 be the ‘‘translation’’ mapping defined by Gðx; yÞ ¼ ðx þ 1; y þ 2Þ. [That is, G adds the vector (1, 2) to any vector v ¼ ðx; yÞ in R2 .] Note that Gð0Þ ¼ Gð0; 0Þ ¼ ð1; 2Þ 6¼ 0 Thus, the zero vector is not mapped into the zero vector. Hence, G is not linear. CHAPTER 5 Linear Mappings 167
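The two defining conditions of a linear mapping are easy to spot-check numerically. The sketch below (plain Python, with made-up test vectors) checks additivity and homogeneity for the projection map F of Example 5.4(a), and shows that the translation map G of Example 5.4(b) already fails the necessary condition that 0 be mapped to 0. Such spot checks can only refute linearity, never prove it.

```python
# Spot-check of the two linearity conditions from Example 5.4 (illustrative only).

def F(v):                       # projection of R^3 onto the xy-plane
    x, y, z = v
    return (x, y, 0.0)

def G(v):                       # translation by (1, 2); not linear
    x, y = v
    return (x + 1, y + 2)

def add(u, v):   return tuple(a + b for a, b in zip(u, v))
def scale(k, v): return tuple(k * a for a in v)

v, w, k = (1.0, -2.0, 3.0), (4.0, 0.5, -1.0), -2.5    # hypothetical test data

# F satisfies both defining conditions on these samples
assert F(add(v, w)) == add(F(v), F(w))
assert F(scale(k, v)) == scale(k, F(v))

# G fails the necessary condition G(0) = 0, so it cannot be linear
print("G(0,0) =", G((0.0, 0.0)))     # (1, 2) != (0, 0)
```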
  • 175. EXAMPLE 5.5 (Derivative and Integral Mappings) Consider the vector space V ¼ PðtÞ of polynomials over the real field R. Let uðtÞ and vðtÞ be any polynomials in V and let k be any scalar. (a) Let D : V ! V be the derivative mapping. One proves in calculus that dðu þ vÞ dt ¼ du dt þ dv dt and dðkuÞ dt ¼ k du dt That is, Dðu þ vÞ ¼ DðuÞ þ DðvÞ and DðkuÞ ¼ kDðuÞ. Thus, the derivative mapping is linear. (b) Let J : V ! R be an integral mapping, say Jð f ðtÞÞ ¼ ð1 0 f ðtÞ dt One also proves in calculus that, ð1 0 ½uðtÞ þ vðtÞdt ¼ ð1 0 uðtÞ dt þ ð1 0 vðtÞ dt and ð1 0 kuðtÞ dt ¼ k ð1 0 uðtÞ dt That is, Jðu þ vÞ ¼ JðuÞ þ JðvÞ and JðkuÞ ¼ kJðuÞ. Thus, the integral mapping is linear. EXAMPLE 5.6 (Zero and Identity Mappings) (a) Let F : V ! U be the mapping that assigns the zero vector 0 2 U to every vector v 2 V. Then, for any vectors v; w 2 V and any scalar k 2 K, we have Fðv þ wÞ ¼ 0 ¼ 0 þ 0 ¼ FðvÞ þ FðwÞ and FðkvÞ ¼ 0 ¼ k0 ¼ kFðvÞ Thus, F is linear. We call F the zero mapping, and we usually denote it by 0. (b) Consider the identity mapping I : V ! V, which maps each v 2 V into itself. Then, for any vectors v; w 2 V and any scalars a; b 2 K, we have Iðav þ bwÞ ¼ av þ bw ¼ aIðvÞ þ bIðwÞ Thus, I is linear. Our next theorem (proved in Problem 5.13) gives us an abundance of examples of linear mappings. In particular, it tells us that a linear mapping is completely determined by its values on the elements of a basis. THEOREM 5.2: Let V and U be vector spaces over a field K. Let fv1; v2; . . . ; vng be a basis of V and let u1; u2; . . . ; un be any vectors in U. Then there exists a unique linear mapping F : V ! U such that Fðv1Þ ¼ u1; Fðv2Þ ¼ u2; . . . ; FðvnÞ ¼ un. We emphasize that the vectors u1; u2; . . . ; un in Theorem 5.2 are completely arbitrary; they may be linearly dependent or they may even be equal to each other. Matrices as Linear Mappings Let A be any real m n matrix. Recall that A determines a mapping FA : Kn ! Km by FAðuÞ ¼ Au (where the vectors in Kn and Km are written as columns). We show FA is linear. By matrix multiplication, FAðv þ wÞ ¼ Aðv þ wÞ ¼ Av þ Aw ¼ FAðvÞ þ FAðwÞ FAðkvÞ ¼ AðkvÞ ¼ kðAvÞ ¼ kFAðvÞ In other words, using A to represent the mapping, we have Aðv þ wÞ ¼ Av þ Aw and AðkvÞ ¼ kðAvÞ Thus, the matrix mapping A is linear. 168 CHAPTER 5 Linear Mappings
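The linearity of the derivative and integral mappings of Example 5.5 can likewise be checked on sample polynomials. A minimal sketch, assuming the sympy library is available and using two arbitrarily chosen polynomials:

```python
# Checking D(u+v) = D(u) + D(v), D(ku) = k D(u), J(u+v) = J(u) + J(v), J(ku) = k J(u)
# on sample polynomials (sympy assumed available).
import sympy as sp

t, k = sp.symbols('t k')
u = 3*t**2 - 5*t + 2            # sample polynomials (hypothetical choices)
v = t**3 + 4*t

D = lambda f: sp.diff(f, t)                  # derivative mapping D(f) = df/dt
J = lambda f: sp.integrate(f, (t, 0, 1))     # integral mapping J(f) = integral of f from 0 to 1

assert sp.simplify(D(u + v) - (D(u) + D(v))) == 0
assert sp.simplify(D(k*u) - k*D(u)) == 0
assert sp.simplify(J(u + v) - (J(u) + J(v))) == 0
assert sp.simplify(J(k*u) - k*J(u)) == 0

print(D(u), J(u))    # 6*t - 5 and 1/2, agreeing with Example 5.2
```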
  • 176. Vector Space Isomorphism The notion of two vector spaces being isomorphic was defined in Chapter 4 when we investigated the coordinates of a vector relative to a basis. We now redefine this concept. DEFINITION: Two vector spaces V and U over K are isomorphic, written V ffi U, if there exists a bijective (one-to-one and onto) linear mapping F : V ! U. The mapping F is then called an isomorphism between V and U. Consider any vector space V of dimension n and let S be any basis of V. Then the mapping v 7! ½vS which maps each vector v 2 V into its coordinate vector ½vS, is an isomorphism between V and Kn . 5.4 Kernel and Image of a Linear Mapping We begin by defining two concepts. DEFINITION: Let F : V ! U be a linear mapping. The kernel of F, written Ker F, is the set of elements in V that map into the zero vector 0 in U; that is, Ker F ¼ fv 2 V : FðvÞ ¼ 0g The image (or range) of F, written Im F, is the set of image points in U; that is, Im F ¼ fu 2 U : there exists v 2 V for which FðvÞ ¼ ug The following theorem is easily proved (Problem 5.22). THEOREM 5.3: Let F : V ! U be a linear mapping. Then the kernel of F is a subspace of V and the image of F is a subspace of U. Now suppose that v1; v2; . . . ; vm span a vector space V and that F : V ! U is linear. We show that Fðv1Þ; Fðv2Þ; . . . ; FðvmÞ span Im F. Let u 2 Im F. Then there exists v 2 V such that FðvÞ ¼ u. Because the vi’s span V and v 2 V, there exist scalars a1; a2; . . . ; am for which v ¼ a1v1 þ a2v2 þ þ amvm Therefore, u ¼ FðvÞ ¼ Fða1v1 þ a2v2 þ þ amvmÞ ¼ a1Fðv1Þ þ a2Fðv2Þ þ þ amFðvmÞ Thus, the vectors Fðv1Þ; Fðv2Þ; . . . ; FðvmÞ span Im F. We formally state the above result. PROPOSITION 5.4: Suppose v1; v2; . . . ; vm span a vector space V, and suppose F : V ! U is linear. Then Fðv1Þ; Fðv2Þ; . . . ; FðvmÞ span Im F. EXAMPLE 5.7 (a) Let F : R3 ! R3 be the projection of a vector v into the xy-plane [as pictured in Fig. 5-2(a)]; that is, Fðx; y; zÞ ¼ ðx; y; 0Þ Clearly the image of F is the entire xy-plane—that is, points of the form (x; y; 0). Moreover, the kernel of F is the z-axis—that is, points of the form (0; 0; c). That is, Im F ¼ fða; b; cÞ : c ¼ 0g ¼ xy-plane and Ker F ¼ fða; b; cÞ : a ¼ 0; b ¼ 0g ¼ z-axis (b) Let G : R3 ! R3 be the linear mapping that rotates a vector v about the z-axis through an angle y [as pictured in Fig. 5-2(b)]; that is, Gðx; y; zÞ ¼ ðx cos y y sin y; x sin y þ y cos y; zÞ CHAPTER 5 Linear Mappings 169
  • 177. Observe that the distance of a vector v from the origin O does not change under the rotation, and so only the zero vector 0 is mapped into the zero vector 0. Thus, Ker G ¼ f0g. On the other hand, every vector u in R3 is the image of a vector v in R3 that can be obtained by rotating u back by an angle of y. Thus, Im G ¼ R3 , the entire space. EXAMPLE 5.8 Consider the vector space V ¼ PðtÞ of polynomials over the real field R, and let H : V ! V be the third-derivative operator; that is, H½ f ðtÞ ¼ d3 f =dt3 . [Sometimes the notation D3 is used for H, where D is the derivative operator.] We claim that Ker H ¼ fpolynomials of degree 2g ¼ P2ðtÞ and Im H ¼ V The first comes from the fact that Hðat2 þ bt þ cÞ ¼ 0 but Hðtn Þ 6¼ 0 for n 3. The second comes from that fact that every polynomial gðtÞ in V is the third derivative of some polynomial f ðtÞ (which can be obtained by taking the antiderivative of gðtÞ three times). Kernel and Image of Matrix Mappings Consider, say, a 3 4 matrix A and the usual basis fe1; e2; e3; e4g of K4 (written as columns): A ¼ a1 a2 a3 a4 b1 b2 b3 b4 c1 c2 c3 c4 2 4 3 5; e1 ¼ 1 0 0 0 2 6 6 4 3 7 7 5; e2 ¼ 1 0 0 0 2 6 6 4 3 7 7 5; e3 ¼ 1 0 0 0 2 6 6 4 3 7 7 5; e4 ¼ 1 0 0 0 2 6 6 4 3 7 7 5 Recall that A may be viewed as a linear mapping A : K4 ! K3 , where the vectors in K4 and K3 are viewed as column vectors. Now the usual basis vectors span K4 , so their images Ae1, Ae2, Ae3, Ae4 span the image of A. But the vectors Ae1, Ae2, Ae3, Ae4 are precisely the columns of A: Ae1 ¼ ½a1; b1; c1T ; Ae2 ¼ ½a2; b2; c2T ; Ae3 ¼ ½a3; b3; c3T ; Ae4 ¼ ½a4; b4; c4T Thus, the image of A is precisely the column space of A. On the other hand, the kernel of A consists of all vectors v for which Av ¼ 0. This means that the kernel of A is the solution space of the homogeneous system AX ¼ 0, called the null space of A. We state the above results formally. PROPOSITION 5.5: Let A be any m n matrix over a field K viewed as a linear map A : Kn ! Km . Then Ker A ¼ nullspðAÞ and Im A ¼ colspðAÞ Here colsp(A) denotes the column space of A, and nullsp(A) denotes the null space of A. Figure 5-2 170 CHAPTER 5 Linear Mappings
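Proposition 5.5 can be seen concretely by letting software compute both subspaces for a small matrix. A sketch with a hypothetical 3 x 4 matrix, assuming sympy is available:

```python
# Illustration of Proposition 5.5: Im A = column space of A, Ker A = null space of A.
from sympy import Matrix

A = Matrix([[1, 2, 1, 1],
            [2, 4, 3, 1],
            [1, 2, 2, 0]])          # hypothetical 3x4 example

col_basis  = A.columnspace()        # basis of Im A  (as column vectors)
null_basis = A.nullspace()          # basis of Ker A (solutions of Ax = 0)

print("dim(Im A)  =", len(col_basis))     # 2
print("dim(Ker A) =", len(null_basis))    # 2

# every null-space basis vector really is mapped to 0
assert all(A * v == Matrix([0, 0, 0]) for v in null_basis)

# rank + nullity = number of columns (Theorem 5.6 for matrix mappings)
assert len(col_basis) + len(null_basis) == A.cols
```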
  • 178. Rank and Nullity of a Linear Mapping Let F : V ! U be a linear mapping. The rank of F is defined to be the dimension of its image, and the nullity of F is defined to be the dimension of its kernel; namely, rankðFÞ ¼ dimðIm FÞ and nullityðFÞ ¼ dimðKer FÞ The following important theorem (proved in Problem 5.23) holds. THEOREM 5.6 Let V be of finite dimension, and let F : V ! U be linear. Then dim V ¼ dimðKer FÞ þ dimðIm FÞ ¼ nullityðFÞ þ rankðFÞ Recall that the rank of a matrix A was also defined to be the dimension of its column space and row space. If we now view A as a linear mapping, then both definitions correspond, because the image of A is precisely its column space. EXAMPLE 5.9 Let F : R4 ! R3 be the linear mapping defined by Fðx; y; z; tÞ ¼ ðx y þ z þ t; 2x 2y þ 3z þ 4t; 3x 3y þ 4z þ 5tÞ (a) Find a basis and the dimension of the image of F. First find the image of the usual basis vectors of R4 , Fð1; 0; 0; 0Þ ¼ ð1; 2; 3Þ; Fð0; 0; 1; 0Þ ¼ ð1; 3; 4Þ Fð0; 1; 0; 0Þ ¼ ð1; 2; 3Þ; Fð0; 0; 0; 1Þ ¼ ð1; 4; 5Þ By Proposition 5.4, the image vectors span Im F. Hence, form the matrix M whose rows are these image vectors and row reduce to echelon form: M ¼ 1 2 3 1 2 3 1 3 4 1 4 5 2 6 6 4 3 7 7 5 1 2 3 0 0 0 0 1 1 0 2 2 2 6 6 4 3 7 7 5 1 2 3 0 1 1 0 0 0 0 0 0 2 6 6 4 3 7 7 5 Thus, (1, 2, 3) and (0, 1, 1) form a basis of Im F. Hence, dimðIm FÞ ¼ 2 and rankðFÞ ¼ 2. (b) Find a basis and the dimension of the kernel of the map F. Set FðvÞ ¼ 0, where v ¼ ðx; y; z; tÞ, Fðx; y; z; tÞ ¼ ðx y þ z þ t; 2x 2y þ 3z þ 4t; 3x 3y þ 4z þ 5tÞ ¼ ð0; 0; 0Þ Set corresponding components equal to each other to form the following homogeneous system whose solution space is Ker F: x y þ z þ t ¼ 0 2x 2y þ 3z þ 4t ¼ 0 3x 3y þ 4z þ 5t ¼ 0 or x y þ z þ t ¼ 0 z þ 2t ¼ 0 z þ 2t ¼ 0 or x y þ z þ t ¼ 0 z þ 2t ¼ 0 The free variables are y and t. Hence, dimðKer FÞ ¼ 2 or nullityðFÞ ¼ 2. (i) Set y ¼ 1, t ¼ 0 to obtain the solution (1; 1; 0; 0Þ, (ii) Set y ¼ 0, t ¼ 1 to obtain the solution (1; 0; 2; 1Þ. Thus, (1; 1; 0; 0) and (1; 0; 2; 1) form a basis for Ker F. As expected from Theorem 5.6, dimðIm FÞ þ dimðKer FÞ ¼ 4 ¼ dim R4 . Application to Systems of Linear Equations Let AX ¼ B denote the matrix form of a system of m linear equations in n unknowns. Now the matrix A may be viewed as a linear mapping A : Kn ! Km CHAPTER 5 Linear Mappings 171
  • 179. Thus, the solution of the equation AX ¼ B may be viewed as the preimage of the vector B 2 Km under the linear mapping A. Furthermore, the solution of the associated homogeneous system AX ¼ 0 may be viewed as the kernel of the linear mapping A. Applying Theorem 5.6 to this homogeneous system yields dimðKer AÞ ¼ dim Kn dimðIm AÞ ¼ n rank A But n is exactly the number of unknowns in the homogeneous system AX ¼ 0. Thus, we have proved the following theorem of Chapter 4. THEOREM 4.19: The dimension of the solution space W of a homogenous system AX ¼ 0 of linear equations is s ¼ n r, where n is the number of unknowns and r is the rank of the coefficient matrix A. Observe that r is also the number of pivot variables in an echelon form of AX ¼ 0, so s ¼ n r is also the number of free variables. Furthermore, the s solution vectors of AX ¼ 0 described in Theorem 3.14 are linearly independent (Problem 4.52). Accordingly, because dim W ¼ s, they form a basis for the solution space W. Thus, we have also proved Theorem 3.14. 5.5 Singular and Nonsingular Linear Mappings, Isomorphisms Let F : V ! U be a linear mapping. Recall that Fð0Þ ¼ 0. F is said to be singular if the image of some nonzero vector v is 0—that is, if there exists v 6¼ 0 such that FðvÞ ¼ 0. Thus, F : V ! U is nonsingular if the zero vector 0 is the only vector whose image under F is 0 or, in other words, if Ker F ¼ f0g. EXAMPLE 5.10 Consider the projection map F : R3 ! R3 and the rotation map G : R3 ! R3 appearing in Fig. 5-2. (See Example 5.7.) Because the kernel of F is the z-axis, F is singular. On the other hand, the kernel of G consists only of the zero vector 0. Thus, G is nonsingular. Nonsingular linear mappings may also be characterized as those mappings that carry independent sets into independent sets. Specifically, we prove (Problem 5.28) the following theorem. THEOREM 5.7: Let F : V ! U be a nonsingular linear mapping. Then the image of any linearly independent set is linearly independent. Isomorphisms Suppose a linear mapping F : V ! U is one-to-one. Then only 0 2 V can map into 0 2 U, and so F is nonsingular. The converse is also true. For suppose F is nonsingular and FðvÞ ¼ FðwÞ, then Fðv wÞ ¼ FðvÞ FðwÞ ¼ 0, and hence, v w ¼ 0 or v ¼ w. Thus, FðvÞ ¼ FðwÞ implies v ¼ w— that is, F is one-to-one. We have proved the following proposition. PROPOSITION 5.8: A linear mapping F : V ! U is one-to-one if and only if F is nonsingular. Recall that a mapping F : V ! U is called an isomorphism if F is linear and if F is bijective (i.e., if F is one-to-one and onto). Also, recall that a vector space V is said to be isomorphic to a vector space U, written V ffi U, if there is an isomorphism F : V ! U. The following theorem (proved in Problem 5.29) applies. THEOREM 5.9: Suppose V has finite dimension and dim V ¼ dim U. Suppose F : V ! U is linear. Then F is an isomorphism if and only if F is nonsingular. 172 CHAPTER 5 Linear Mappings
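The coset description of the solution set of AX = B, together with Theorem 4.19, can be illustrated on a small hypothetical system (sympy assumed available; the matrix and right-hand side below are invented for the illustration):

```python
# The solution set of AX = B as a coset: one particular solution plus Ker A.
from sympy import Matrix, linsolve, symbols

x, y, z, t = symbols('x y z t')
A = Matrix([[1, 1, 1, 1],
            [1, 2, 3, 4]])
B = Matrix([3, 5])

print(linsolve((A, B), [x, y, z, t]))    # parametric description of the preimage of B

r = A.rank()
null_basis = A.nullspace()
print("n - r =", A.cols - r, " dim Ker A =", len(null_basis))   # Theorem 4.19: both equal 2

# a particular solution plus any kernel vector is again a solution
x0 = Matrix([1, 2, 0, 0])
assert A * x0 == B
for v in null_basis:
    assert A * (x0 + v) == B
```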
  • 180. 5.6 Operations with Linear Mappings We are able to combine linear mappings in various ways to obtain new linear mappings. These operations are very important and will be used throughout the text. Let F : V ! U and G : V ! U be linear mappings over a field K. The sum F þ G and the scalar product kF, where k 2 K, are defined to be the following mappings from V into U: ðF þ GÞðvÞ FðvÞ þ GðvÞ and ðkFÞðvÞ kFðvÞ We now show that if F and G are linear, then F þ G and kF are also linear. Specifically, for any vectors v; w 2 V and any scalars a; b 2 K, ðF þ GÞðav þ bwÞ ¼ Fðav þ bwÞ þ Gðav þ bwÞ ¼ aFðvÞ þ bFðwÞ þ aGðvÞ þ bGðwÞ ¼ a½FðvÞ þ GðvÞ þ b½FðwÞ þ GðwÞ ¼ aðF þ GÞðvÞ þ bðF þ GÞðwÞ and ðkFÞðav þ bwÞ ¼ kFðav þ bwÞ ¼ k½aFðvÞ þ bFðwÞ ¼ akFðvÞ þ bkFðwÞ ¼ aðkFÞðvÞ þ bðkFÞðwÞ Thus, F þ G and kF are linear. The following theorem holds. THEOREM 5.10: Let V and U be vector spaces over a field K. Then the collection of all linear mappings from V into U with the above operations of addition and scalar multi- plication forms a vector space over K. The vector space of linear mappings in Theorem 5.10 is usually denoted by HomðV; UÞ Here Hom comes from the word ‘‘homomorphism.’’ We emphasize that the proof of Theorem 5.10 reduces to showing that HomðV; UÞ does satisfy the eight axioms of a vector space. The zero element of HomðV; UÞ is the zero mapping from V into U, denoted by 0 and defined by 0ðvÞ ¼ 0 for every vector v 2 V. Suppose V and U are of finite dimension. Then we have the following theorem. THEOREM 5.11: Suppose dim V ¼ m and dim U ¼ n. Then dim½HomðV; UÞ ¼ mn. Composition of Linear Mappings Now suppose V, U, and W are vector spaces over the same field K, and suppose F : V ! U and G : U ! W are linear mappings. We picture these mappings as follows: V ! F U ! G W Recall that the composition function G F is the mapping from V into W defined by ðG FÞðvÞ ¼ GðFðvÞÞ. We show that G F is linear whenever F and G are linear. Specifically, for any vectors v; w 2 V and any scalars a; b 2 K, we have ðG FÞðav þ bwÞ ¼ GðFðav þ bwÞÞ ¼ GðaFðvÞ þ bFðwÞÞ ¼ aGðFðvÞÞ þ bGðFðwÞÞ ¼ aðG FÞðvÞ þ bðG FÞðwÞ Thus, G F is linear. The composition of linear mappings and the operations of addition and scalar multiplication are related as follows. CHAPTER 5 Linear Mappings 173
  • 181. THEOREM 5.12: Let V, U, W be vector spaces over K. Suppose the following mappings are linear: F : V ! U; F0 : V ! U and G : U ! W; G0 : U ! W Then, for any scalar k 2 K: (i) G ðF þ F0 Þ ¼ G F þ G F0 . (ii) ðG þ G0 Þ F ¼ G F þ G0 F. (iii) kðG FÞ ¼ ðkGÞ F ¼ G ðkFÞ. 5.7 Algebra AðVÞ of Linear Operators Let V be a vector space over a field K. This section considers the special case of linear mappings from the vector space V into itself—that is, linear mappings of the form F : V ! V. They are also called linear operators or linear transformations on V. We will write AðVÞ, instead of HomðV; VÞ, for the space of all such mappings. Now AðVÞ is a vector space over K (Theorem 5.8), and, if dim V ¼ n, then dim AðVÞ ¼ n2 . Moreover, for any mappings F; G 2 AðVÞ, the composition G F exists and also belongs to AðVÞ. Thus, we have a ‘‘multiplication’’ defined in AðVÞ. [We sometimes write FG instead of G F in the space AðVÞ.] Remark: An algebra A over a field K is a vector space over K in which an operation of multiplication is defined satisfying, for every F; G; H 2 A and every k 2 K: (i) FðG þ HÞ ¼ FG þ FH, (ii) ðG þ HÞF ¼ GF þ HF, (iii) kðGFÞ ¼ ðkGÞF ¼ GðkFÞ. The algebra is said to be associative if, in addition, ðFGÞH ¼ FðGHÞ. The above definition of an algebra and previous theorems give us the following result. THEOREM 5.13: Let V be a vector space over K. Then AðVÞ is an associative algebra over K with respect to composition of mappings. If dim V ¼ n, then dim AðVÞ ¼ n2 . This is why AðVÞ is called the algebra of linear operators on V. Polynomials and Linear Operators Observe that the identity mapping I : V ! V belongs to AðVÞ. Also, for any linear operator F in AðVÞ, we have FI ¼ IF ¼ F. We can also form ‘‘powers’’ of F. Namely, we define F0 ¼ I; F2 ¼ F F; F3 ¼ F2 F ¼ F F F; F4 ¼ F3 F; . . . Furthermore, for any polynomial pðtÞ over K, say, pðtÞ ¼ a0 þ a1t þ a2t2 þ þ ast2 we can form the linear operator pðFÞ defined by pðFÞ ¼ a0I þ a1F þ a2F2 þ þ asFs (For any scalar k, the operator kI is sometimes denoted simply by k.) In particular, we say F is a zero of the polynomial pðtÞ if pðFÞ ¼ 0. EXAMPLE 5.11 Let F : K3 ! K3 be defined by Fðx; y; zÞ ¼ ð0; x; yÞ. For any ða; b; cÞ 2 K3 , ðF þ IÞða; b; cÞ ¼ ð0; a; bÞ þ ða; b; cÞ ¼ ða; a þ b; b þ cÞ F3 ða; b; cÞ ¼ F2 ð0; a; bÞ ¼ Fð0; 0; aÞ ¼ ð0; 0; 0Þ Thus, F3 ¼ 0, the zero mapping in AðVÞ. This means F is a zero of the polynomial pðtÞ ¼ t3 . 174 CHAPTER 5 Linear Mappings
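The operator of Example 5.11 can be realized as a matrix, which makes the computation F^3 = 0 a one-line check. A sketch assuming numpy is available; the test vector (a, b, c) is arbitrary:

```python
# The operator F(x, y, z) = (0, x, y) of Example 5.11 as a matrix N on column vectors.
import numpy as np

N = np.array([[0, 0, 0],
              [1, 0, 0],
              [0, 1, 0]])           # matrix of F in the usual basis
I = np.eye(3, dtype=int)

a, b, c = 2, -1, 5                  # hypothetical vector (a, b, c)
v = np.array([a, b, c])

print((N + I) @ v)                  # (a, a+b, b+c), as computed in the example
print(N @ N @ N)                    # the zero matrix: F^3 = 0

# so F is a zero of the polynomial p(t) = t^3, since p(F) = F^3 = 0
p_of_N = np.linalg.matrix_power(N, 3)
assert not p_of_N.any()
```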
  • 182. Square Matrices as Linear Operators Let M ¼ Mn;n be the vector space of all square n n matrices over K. Then any matrix A in M defines a linear mapping FA : Kn ! Kn by FAðuÞ ¼ Au (where the vectors in Kn are written as columns). Because the mapping is from Kn into itself, the square matrix A is a linear operator, not simply a linear mapping. Suppose A and B are matrices in M. Then the matrix product AB is defined. Furthermore, for any (column) vector u in Kn , FABðuÞ ¼ ðABÞu ¼ AðBuÞ ¼ AðFBðUÞÞ ¼ FAðFBðuÞÞ ¼ ðFA FBÞðuÞ In other words, the matrix product AB corresponds to the composition of A and B as linear mappings. Similarly, the matrix sum A þ B corresponds to the sum of A and B as linear mappings, and the scalar product kA corresponds to the scalar product of A as a linear mapping. Invertible Operators in AðVÞ Let F : V ! V be a linear operator. F is said to be invertible if it has an inverse—that is, if there exists F1 in AðVÞ such that FF1 ¼ F1 F ¼ I. On the other hand, F is invertible as a mapping if F is both one-to-one and onto. In such a case, F1 is also linear and F1 is the inverse of F as a linear operator (proved in Problem 5.15). Suppose F is invertible. Then only 0 2 V can map into itself, and so F is nonsingular. The converse is not true, as seen by the following example. EXAMPLE 5.12 Let V ¼ PðtÞ, the vector space of polynomials over K. Let F be the mapping on V that increases by 1 the exponent of t in each term of a polynomial; that is, Fða0 þ a1t þ a2t2 þ þ asts Þ ¼ a0t þ a1t2 þ a2t3 þ þ astsþ1 Then F is a linear mapping and F is nonsingular. However, F is not onto, and so F is not invertible. The vector space V ¼ PðtÞ in the above example has infinite dimension. The situation changes significantly when V has finite dimension. Namely, the following theorem applies. THEOREM 5.14: Let F be a linear operator on a finite-dimensional vector space V. Then the following four conditions are equivalent. (i) F is nonsingular: Ker F ¼ f0g. (iii) F is an onto mapping. (ii) F is one-to-one. (iv) F is invertible. The proof of the above theorem mainly follows from Theorem 5.6, which tells us that dim V ¼ dimðKer FÞ þ dimðIm FÞ By Proposition 5.8, (i) and (ii) are equivalent. Note that (iv) is equivalent to (ii) and (iii). Thus, to prove the theorem, we need only show that (i) and (iii) are equivalent. This we do below. (a) Suppose (i) holds. Then dimðKer FÞ ¼ 0, and so the above equation tells us that dim V ¼ dimðIm FÞ. This means V ¼ Im F or, in other words, F is an onto mapping. Thus, (i) implies (iii). (b) Suppose (iii) holds. Then V ¼ Im F, and so dim V ¼ dimðIm FÞ. Therefore, the above equation tells us that dimðKer FÞ ¼ 0, and so F is nonsingular. Therefore, (iii) implies (i). Accordingly, all four conditions are equivalent. Remark: Suppose A is a square n n matrix over K. Then A may be viewed as a linear operator on Kn . Because Kn has finite dimension, Theorem 5.14 holds for the square matrix A. This is why the terms ‘‘nonsingular’’ and ‘‘invertible’’ are used interchangeably when applied to square matrices. EXAMPLE 5.13 Let F be the linear operator on R2 defined by Fðx; yÞ ¼ ð2x þ y; 3x þ 2yÞ. (a) To show that F is invertible, we need only show that F is nonsingular. Set Fðx; yÞ ¼ ð0; 0Þ to obtain the homogeneous system 2x þ y ¼ 0 and 3x þ 2y ¼ 0 CHAPTER 5 Linear Mappings 175
Square Matrices as Linear Operators

Let M = M_{n,n} be the vector space of all square n x n matrices over K. Then any matrix A in M defines a linear mapping F_A : K^n → K^n by F_A(u) = Au (where the vectors in K^n are written as columns). Because the mapping is from K^n into itself, the square matrix A is a linear operator, not simply a linear mapping.

Suppose A and B are matrices in M. Then the matrix product AB is defined. Furthermore, for any (column) vector u in K^n,

F_{AB}(u) = (AB)u = A(Bu) = A(F_B(u)) = F_A(F_B(u)) = (F_A ∘ F_B)(u)

In other words, the matrix product AB corresponds to the composition of A and B as linear mappings. Similarly, the matrix sum A + B corresponds to the sum of A and B as linear mappings, and the scalar product kA corresponds to the scalar product of A as a linear mapping.

Invertible Operators in A(V)

Let F : V → V be a linear operator. F is said to be invertible if it has an inverse—that is, if there exists F^{-1} in A(V) such that FF^{-1} = F^{-1}F = I. On the other hand, F is invertible as a mapping if F is both one-to-one and onto. In such a case, F^{-1} is also linear and F^{-1} is the inverse of F as a linear operator (proved in Problem 5.15).

Suppose F is invertible. Then only 0 ∈ V can map into 0, and so F is nonsingular. The converse is not true, as seen by the following example.

EXAMPLE 5.12 Let V = P(t), the vector space of polynomials over K. Let F be the mapping on V that increases by 1 the exponent of t in each term of a polynomial; that is,

F(a_0 + a_1 t + a_2 t^2 + ··· + a_s t^s) = a_0 t + a_1 t^2 + a_2 t^3 + ··· + a_s t^{s+1}

Then F is a linear mapping and F is nonsingular. However, F is not onto, and so F is not invertible.

The vector space V = P(t) in the above example has infinite dimension. The situation changes significantly when V has finite dimension. Namely, the following theorem applies.

THEOREM 5.14: Let F be a linear operator on a finite-dimensional vector space V. Then the following four conditions are equivalent.
(i) F is nonsingular: Ker F = {0}. (ii) F is one-to-one. (iii) F is an onto mapping. (iv) F is invertible.

The proof of the above theorem mainly follows from Theorem 5.6, which tells us that

dim V = dim(Ker F) + dim(Im F)

By Proposition 5.8, (i) and (ii) are equivalent. Note that (iv) is equivalent to (ii) and (iii). Thus, to prove the theorem, we need only show that (i) and (iii) are equivalent. This we do below.

(a) Suppose (i) holds. Then dim(Ker F) = 0, and so the above equation tells us that dim V = dim(Im F). This means V = Im F or, in other words, F is an onto mapping. Thus, (i) implies (iii).

(b) Suppose (iii) holds. Then V = Im F, and so dim V = dim(Im F). Therefore, the above equation tells us that dim(Ker F) = 0, and so F is nonsingular. Therefore, (iii) implies (i).

Accordingly, all four conditions are equivalent.

Remark: Suppose A is a square n x n matrix over K. Then A may be viewed as a linear operator on K^n. Because K^n has finite dimension, Theorem 5.14 holds for the square matrix A. This is why the terms "nonsingular" and "invertible" are used interchangeably when applied to square matrices.

EXAMPLE 5.13 Let F be the linear operator on R^2 defined by F(x, y) = (2x + y, 3x + 2y).

(a) To show that F is invertible, we need only show that F is nonsingular. Set F(x, y) = (0, 0) to obtain the homogeneous system

2x + y = 0 and 3x + 2y = 0
  • 183. Solve for x and y to get x ¼ 0, y ¼ 0. Hence, F is nonsingular and so invertible. (b) To find a formula for F1 , we set Fðx; yÞ ¼ ðs; tÞ and so F1 ðs; tÞ ¼ ðx; yÞ. We have ð2x þ y; 3x þ 2yÞ ¼ ðs; tÞ or 2x þ y ¼ s 3x þ 2y ¼ t Solve for x and y in terms of s and t to obtain x ¼ 2s t, y ¼ 3s þ 2t. Thus, F1 ðs; tÞ ¼ ð2s t; 3s þ 2tÞ or F1 ðx; yÞ ¼ ð2x y; 3x þ 2yÞ where we rewrite the formula for F1 using x and y instead of s and t. SOLVED PROBLEMS Mappings 5.1. State whether each diagram in Fig. 5-3 defines a mapping from A ¼ fa; b; cg into B ¼ fx; y; zg. (a) No. There is nothing assigned to the element b 2 A. (b) No. Two elements, x and z, are assigned to c 2 A. (c) Yes. 5.2. Let f : A ! B and g : B ! C be defined by Fig. 5-4. (a) Find the composition mapping ðg f Þ : A ! C. (b) Find the images of the mappings f , g, g f . (a) Use the definition of the composition mapping to compute ðg f Þ ðaÞ ¼ gð f ðaÞÞ ¼ gðyÞ ¼ t; ðg f Þ ðbÞ ¼ gð f ðbÞÞ ¼ gðxÞ ¼ s ðg f Þ ðcÞ ¼ gð f ðcÞÞ ¼ gðyÞ ¼ t Observe that we arrive at the same answer if we ‘‘follow the arrows’’ in Fig. 5-4: a ! y ! t; b ! x ! s; c ! y ! t (b) By Fig. 5-4, the image values under the mapping f are x and y, and the image values under g are r, s, t. Figure 5-3 Figure 5-4 176 CHAPTER 5 Linear Mappings
  • 184. Hence, Im f ¼ fx; yg and Im g ¼ fr; s; tg Also, by part (a), the image values under the composition mapping g f are t and s; accordingly, Im g f ¼ fs; tg. Note that the images of g and g f are different. 5.3. Consider the mapping F : R3 ! R2 defined by Fðx; y; zÞ ¼ ðyz; x2 Þ. Find (a) Fð2; 3; 4Þ; (b) Fð5; 2; 7Þ; (c) F1 ð0; 0Þ, that is, all v 2 R3 such that FðvÞ ¼ 0. (a) Substitute in the formula for F to get Fð2; 3; 4Þ ¼ ð3 4; 22 Þ ¼ ð12; 4Þ. (b) Fð5; 2; 7Þ ¼ ð2 7; 52 Þ ¼ ð14; 25Þ. (c) Set FðvÞ ¼ 0, where v ¼ ðx; y; zÞ, and then solve for x, y, z: Fðx; y; zÞ ¼ ðyz; x2 Þ ¼ ð0; 0Þ or yz ¼ 0; x2 ¼ 0 Thus, x ¼ 0 and either y ¼ 0 or z ¼ 0. In other words, x ¼ 0, y ¼ 0 or x ¼ 0; z ¼ 0—that is, the z-axis and the y-axis. 5.4. Consider the mapping F : R2 ! R2 defined by Fðx; yÞ ¼ ð3y; 2xÞ. Let S be the unit circle in R2 , that is, the solution set of x2 þ y2 ¼ 1. (a) Describe FðSÞ. (b) Find F1 ðSÞ. (a) Let (a; b) be an element of FðSÞ. Then there exists ðx; yÞ 2 S such that Fðx; yÞ ¼ ða; bÞ. Hence, ð3y; 2xÞ ¼ ða; bÞ or 3y ¼ a; 2x ¼ b or y ¼ a 3 ; x ¼ b 2 Because ðx; yÞ 2 S—that is, x2 þ y2 ¼ 1—we have b 2 2 þ a 3 2 ¼ 1 or a2 9 þ b2 4 ¼ 1 Thus, FðSÞ is an ellipse. (b) Let Fðx; yÞ ¼ ða; bÞ, where ða; bÞ 2 S. Then ð3y; 2xÞ ¼ ða; bÞ or 3y ¼ a, 2x ¼ b. Because ða; bÞ 2 S, we have a2 þ b2 ¼ 1. Thus, ð3yÞ2 þ ð2xÞ2 ¼ 1. Accordingly, F1 ðSÞ is the ellipse 4x2 þ 9y2 ¼ 1. 5.5. Let the mappings f : A ! B, g : B ! C, h : C ! D be defined by Fig. 5-5. Determine whether or not each function is (a) one-to-one; (b) onto; (c) invertible (i.e., has an inverse). (a) The mapping f : A ! B is one-to-one, as each element of A has a different image. The mapping g : B ! C is not one-to one, because x and z both have the same image 4. The mapping h : C ! D is one-to-one. (b) The mapping f : A ! B is not onto, because z 2 B is not the image of any element of A. The mapping g : B ! C is onto, as each element of C is the image of some element of B. The mapping h : C ! D is also onto. (c) A mapping has an inverse if and only if it is one-to-one and onto. Hence, only h has an inverse. z y x w B g f C h 5 6 4 1 a D b c 2 3 A Figure 5-5 CHAPTER 5 Linear Mappings 177
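For finite sets like those in Problems 5.1, 5.2, and 5.5, the one-to-one and onto conditions can be tested mechanically. The sketch below (plain Python) uses made-up dictionaries standing in for the figures, which are not reproduced here:

```python
# Brute-force tests for one-to-one and onto, applied to finite mappings given as dicts.

def is_one_to_one(f):
    images = list(f.values())
    return len(images) == len(set(images))    # distinct elements have distinct images

def is_onto(f, target):
    return set(f.values()) == set(target)     # every target element is an image

A, B = ['a', 'b', 'c'], ['x', 'y', 'z']
f = {'a': 'y', 'b': 'x', 'c': 'y'}            # not one-to-one (a, c share y), not onto (z missed)
g = {'a': 'y', 'b': 'x', 'c': 'z'}            # one-to-one and onto, hence invertible

print(is_one_to_one(f), is_onto(f, B))        # False False
print(is_one_to_one(g), is_onto(g, B))        # True True

# a mapping has an inverse exactly when it is one-to-one and onto (Problem 5.7)
g_inverse = {image: a for a, image in g.items()}
assert all(g_inverse[g[a]] == a for a in A)
```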
  • 185. 5.6. Suppose f : A ! B and g : B ! C. Hence, ðg f Þ : A ! C exists. Prove (a) If f and g are one-to-one, then g f is one-to-one. (b) If f and g are onto mappings, then g f is an onto mapping. (c) If g f is one-to-one, then f is one-to-one. (d) If g f is an onto mapping, then g is an onto mapping. (a) Suppose ðg f ÞðxÞ ¼ ðg f ÞðyÞ. Then gð f ðxÞÞ ¼ gð f ðyÞÞ. Because g is one-to-one, f ðxÞ ¼ f ðyÞ. Because f is one-to-one, x ¼ y. We have proven that ðg f ÞðxÞ ¼ ðg f ÞðyÞ implies x ¼ y; hence g f is one-to-one. (b) Suppose c 2 C. Because g is onto, there exists b 2 B for which gðbÞ ¼ c. Because f is onto, there exists a 2 A for which f ðaÞ ¼ b. Thus, ðg f ÞðaÞ ¼ gð f ðaÞÞ ¼ gðbÞ ¼ c. Hence, g f is onto. (c) Suppose f is not one-to-one. Then there exist distinct elements x; y 2 A for which f ðxÞ ¼ f ðyÞ. Thus, ðg f ÞðxÞ ¼ gð f ðxÞÞ ¼ gð f ðyÞÞ ¼ ðg f ÞðyÞ. Hence, g f is not one-to-one. Therefore, if g f is one-to- one, then f must be one-to-one. (d) If a 2 A, then ðg f ÞðaÞ ¼ gð f ðaÞÞ 2 gðBÞ. Hence, ðg f ÞðAÞ gðBÞ. Suppose g is not onto. Then gðBÞ is properly contained in C and so ðg f ÞðAÞ is properly contained in C; thus, g f is not onto. Accordingly, if g f is onto, then g must be onto. 5.7. Prove that f : A ! B has an inverse if and only if f is one-to-one and onto. Suppose f has an inverse—that is, there exists a function f 1 : B ! A for which f 1 f ¼ 1A and f f 1 ¼ 1B. Because 1A is one-to-one, f is one-to-one by Problem 5.6(c), and because 1B is onto, f is onto by Problem 5.6(d); that is, f is both one-to-one and onto. Now suppose f is both one-to-one and onto. Then each b 2 B is the image of a unique element in A, say b*. Thus, if f ðaÞ ¼ b, then a ¼ b*; hence, f ðb*Þ ¼ b. Now let g denote the mapping from B to A defined by b 7! b*. We have (i) ðg f ÞðaÞ ¼ gð f ðaÞÞ ¼ gðbÞ ¼ b* ¼ a for every a 2 A; hence, g f ¼ 1A. (ii) ð f gÞðbÞ ¼ f ðgðbÞÞ ¼ f ðb*Þ ¼ b for every b 2 B; hence, f g ¼ 1B. Accordingly, f has an inverse. Its inverse is the mapping g. 5.8. Let f : R ! R be defined by f ðxÞ ¼ 2x 3. Now f is one-to-one and onto; hence, f has an inverse mapping f 1 . Find a formula for f 1 . Let y be the image of x under the mapping f ; that is, y ¼ f ðxÞ ¼ 2x 3. Hence, x will be the image of y under the inverse mapping f 1 . Thus, solve for x in terms of y in the above equation to obtain x ¼ 1 2 ðy þ 3Þ. Then the formula defining the inverse function is f 1 ðyÞ ¼ 1 2 ðy þ 3Þ, or, using x instead of y, f 1 ðxÞ ¼ 1 2 ðx þ 3Þ. Linear Mappings 5.9. Suppose the mapping F : R2 ! R2 is defined by Fðx; yÞ ¼ ðx þ y; xÞ. Show that F is linear. We need to show that Fðv þ wÞ ¼ FðvÞ þ FðwÞ and FðkvÞ ¼ kFðvÞ, where u and v are any elements of R2 and k is any scalar. Let v ¼ ða; bÞ and w ¼ ða0 ; b0 Þ. Then v þ w ¼ ða þ a0 ; b þ b0 Þ and kv ¼ ðka; kbÞ We have FðvÞ ¼ ða þ b; aÞ and FðwÞ ¼ ða0 þ b0 ; a0 Þ. Thus, Fðv þ wÞ ¼ Fða þ a0 ; b þ b0 Þ ¼ ða þ a0 þ b þ b0 ; a þ a0 Þ ¼ ða þ b; aÞ þ ða0 þ b0 ; a0 Þ ¼ FðvÞ þ FðwÞ and FðkvÞ ¼ Fðka; kbÞ ¼ ðka þ kb; kaÞ ¼ kða þ b; aÞ ¼ kFðvÞ Because v, w, k were arbitrary, F is linear. 178 CHAPTER 5 Linear Mappings
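The inverse found in Problem 5.8 can be spot-checked by composing the two formulas on a few sample points (plain Python; the sample values are arbitrary, and a finite check is of course not a proof):

```python
# Problem 5.8: f(x) = 2x - 3 with inverse f^{-1}(x) = (x + 3)/2.
f     = lambda x: 2*x - 3
f_inv = lambda x: (x + 3) / 2

samples = [-4.0, -1.5, 0.0, 2.0, 7.25]
assert all(f_inv(f(x)) == x for x in samples)    # f^{-1} o f acts as the identity here
assert all(f(f_inv(x)) == x for x in samples)    # f o f^{-1} acts as the identity here
```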
  • 186. 5.10. Suppose F : R3 ! R2 is defined by Fðx; y; zÞ ¼ ðx þ y þ z; 2x 3y þ 4zÞ. Show that F is linear. We argue via matrices. Writing vectors as columns, the mapping F may be written in the form FðvÞ ¼ Av, where v ¼ ½x; y; zT and A ¼ 1 1 1 2 3 4 Then, using properties of matrices, we have Fðv þ wÞ ¼ Aðv þ wÞ ¼ Av þ Aw ¼ FðvÞ þ FðwÞ FðkvÞ ¼ AðkvÞ ¼ kðAvÞ ¼ kFðvÞ and Thus, F is linear. 5.11. Show that the following mappings are not linear: (a) F : R2 ! R2 defined by Fðx; yÞ ¼ ðxy; xÞ (b) F : R2 ! R3 defined by Fðx; yÞ ¼ ðx þ 3; 2y; x þ yÞ (c) F : R3 ! R2 defined by Fðx; y; zÞ ¼ ðjxj; y þ zÞ (a) Let v ¼ ð1; 2Þ and w ¼ ð3; 4Þ; then v þ w ¼ ð4; 6Þ. Also, FðvÞ ¼ ð1ð2Þ; 1Þ ¼ ð2; 1Þ and FðwÞ ¼ ð3ð4Þ; 3Þ ¼ ð12; 3Þ Hence, Fðv þ wÞ ¼ ð4ð6Þ; 4Þ ¼ ð24; 6Þ 6¼ FðvÞ þ FðwÞ (b) Because Fð0; 0Þ ¼ ð3; 0; 0Þ 6¼ ð0; 0; 0Þ, F cannot be linear. (c) Let v ¼ ð1; 2; 3Þ and k ¼ 3. Then kv ¼ ð3; 6; 9Þ. We have FðvÞ ¼ ð1; 5Þ and kFðvÞ ¼ 3ð1; 5Þ ¼ ð3; 15Þ: Thus, FðkvÞ ¼ Fð3; 6; 9Þ ¼ ð3; 15Þ 6¼ kFðvÞ Accordingly, F is not linear. 5.12. Let V be the vector space of n-square real matrices. Let M be an arbitrary but fixed matrix in V. Let F : V ! V be defined by FðAÞ ¼ AM þ MA, where A is any matrix in V. Show that F is linear. For any matrices A and B in V and any scalar k, we have FðA þ BÞ ¼ ðA þ BÞM þ MðA þ BÞ ¼ AM þ BM þ MA þ MB ¼ ðAM þ MAÞ ¼ ðBM þ MBÞ ¼ FðAÞ þ FðBÞ and FðkAÞ ¼ ðkAÞM þ MðkAÞ ¼ kðAMÞ þ kðMAÞ ¼ kðAM þ MAÞ ¼ kFðAÞ Thus, F is linear. 5.13. Prove Theorem 5.2: Let V and U be vector spaces over a field K. Let fv1; v2; . . . ; vng be a basis of V and let u1; u2; . . . ; un be any vectors in U. Then there exists a unique linear mapping F : V ! U such that Fðv1Þ ¼ u1; Fðv2Þ ¼ u2; . . . ; FðvnÞ ¼ un. There are three steps to the proof of the theorem: (1) Define the mapping F : V ! U such that FðviÞ ¼ ui; i ¼ 1; . . . ; n. (2) Show that F is linear. (3) Show that F is unique. Step 1. Let v 2 V. Because fv1; . . . ; vng is a basis of V, there exist unique scalars a1; . . . ; an 2 K for which v ¼ a1v1 þ a2v2 þ þ anvn. We define F : V ! U by FðvÞ ¼ a1u1 þ a2u2 þ þ anun CHAPTER 5 Linear Mappings 179
5.10. Suppose F : R^3 → R^2 is defined by F(x, y, z) = (x + y + z, 2x − 3y + 4z). Show that F is linear.

We argue via matrices. Writing vectors as columns, the mapping F may be written in the form F(v) = Av, where v = [x, y, z]^T and

A = [ 1   1   1 ]
    [ 2  −3   4 ]

Then, using properties of matrices, we have

F(v + w) = A(v + w) = Av + Aw = F(v) + F(w)   and   F(kv) = A(kv) = k(Av) = kF(v)

Thus, F is linear.

5.11. Show that the following mappings are not linear:
(a) F : R^2 → R^2 defined by F(x, y) = (xy, x)
(b) F : R^2 → R^3 defined by F(x, y) = (x + 3, 2y, x + y)
(c) F : R^3 → R^2 defined by F(x, y, z) = (|x|, y + z)

(a) Let v = (1, 2) and w = (3, 4); then v + w = (4, 6). Also,
F(v) = (1(2), 1) = (2, 1) and F(w) = (3(4), 3) = (12, 3)
Hence, F(v + w) = (4(6), 4) = (24, 4) ≠ F(v) + F(w)

(b) Because F(0, 0) = (3, 0, 0) ≠ (0, 0, 0), F cannot be linear.

(c) Let v = (1, 2, 3) and k = −3. Then kv = (−3, −6, −9). We have
F(v) = (1, 5) and kF(v) = −3(1, 5) = (−3, −15)
Thus, F(kv) = F(−3, −6, −9) = (3, −15) ≠ kF(v)
Accordingly, F is not linear.

5.12. Let V be the vector space of n-square real matrices. Let M be an arbitrary but fixed matrix in V. Let F : V → V be defined by F(A) = AM + MA, where A is any matrix in V. Show that F is linear.

For any matrices A and B in V and any scalar k, we have

F(A + B) = (A + B)M + M(A + B) = AM + BM + MA + MB = (AM + MA) + (BM + MB) = F(A) + F(B)

and

F(kA) = (kA)M + M(kA) = k(AM) + k(MA) = k(AM + MA) = kF(A)

Thus, F is linear.

5.13. Prove Theorem 5.2: Let V and U be vector spaces over a field K. Let {v_1, v_2, ..., v_n} be a basis of V and let u_1, u_2, ..., u_n be any vectors in U. Then there exists a unique linear mapping F : V → U such that F(v_1) = u_1, F(v_2) = u_2, ..., F(v_n) = u_n.

There are three steps to the proof of the theorem: (1) Define the mapping F : V → U such that F(v_i) = u_i, i = 1, ..., n. (2) Show that F is linear. (3) Show that F is unique.

Step 1. Let v ∈ V. Because {v_1, ..., v_n} is a basis of V, there exist unique scalars a_1, ..., a_n ∈ K for which v = a_1 v_1 + a_2 v_2 + ··· + a_n v_n. We define F : V → U by

F(v) = a_1 u_1 + a_2 u_2 + ··· + a_n u_n
  • 187. (Because the ai are unique, the mapping F is well defined.) Now, for i ¼ 1; . . . ; n, vi ¼ 0v1 þ þ 1vi þ þ 0vn Hence, FðviÞ ¼ 0u1 þ þ 1ui þ þ 0un ¼ ui Thus, the first step of the proof is complete. Step 2. Suppose v ¼ a1v1 þ a2v2 þ þ anvn and w ¼ b1v1 þ b2v2 þ þ bnvn. Then v þ w ¼ ða1 þ b1Þv1 þ ða2 þ b2Þv2 þ þ ðan þ bnÞvn and, for any k 2 K, kv ¼ ka1v1 þ ka2v2 þ þ kanvn. By definition of the mapping F, FðvÞ ¼ a1u1 þ a2u2 þ þ anvn and FðwÞ ¼ b1u1 þ b2u2 þ þ bnun Hence, Fðv þ wÞ ¼ ða1 þ b1Þu1 þ ða2 þ b2Þu2 þ þ ðan þ bnÞun ¼ ða1u1 þ a2u2 þ þ anunÞ þ ðb1u1 þ b2u2 þ þ bnunÞ ¼ FðvÞ þ FðwÞ and FðkvÞ ¼ kða1u1 þ a2u2 þ þ anunÞ ¼ kFðvÞ Thus, F is linear. Step 3. Suppose G : V ! U is linear and Gðv1Þ ¼ ui; i ¼ 1; . . . ; n. Let v ¼ a1v1 þ a2v2 þ þ anvn Then GðvÞ ¼ Gða1v1 þ a2v2 þ þ anvnÞ ¼ a1Gðv1Þ þ a2Gðv2Þ þ þ anGðvnÞ ¼ a1u1 þ a2u2 þ þ anun ¼ FðvÞ Because GðvÞ ¼ FðvÞ for every v 2 V; G ¼ F. Thus, F is unique and the theorem is proved. 5.14. Let F : R2 ! R2 be the linear mapping for which Fð1; 2Þ ¼ ð2; 3Þ and Fð0; 1Þ ¼ ð1; 4Þ. [Note that fð1; 2Þ; ð0; 1Þg is a basis of R2 , so such a linear map F exists and is unique by Theorem 5.2.] Find a formula for F; that is, find Fða; bÞ. Write ða; bÞ as a linear combination of (1, 2) and (0, 1) using unknowns x and y, ða; bÞ ¼ xð1; 2Þ þ yð0; 1Þ ¼ ðx; 2x þ yÞ; so a ¼ x; b ¼ 2x þ y Solve for x and y in terms of a and b to get x ¼ a, y ¼ 2a þ b. Then Fða; bÞ ¼ xFð1; 2Þ þ yFð0; 1Þ ¼ að2; 3Þ þ ð2a þ bÞð1; 4Þ ¼ ðb; 5a þ 4bÞ 5.15. Suppose a linear mapping F : V ! U is one-to-one and onto. Show that the inverse mapping F1 : U ! V is also linear. Suppose u; u0 2 U. Because F is one-to-one and onto, there exist unique vectors v; v0 2 V for which FðvÞ ¼ u and Fðv0 Þ ¼ u0 . Because F is linear, we also have Fðv þ v0 Þ ¼ FðvÞ þ Fðv0 Þ ¼ u þ u0 and FðkvÞ ¼ kFðvÞ ¼ ku By definition of the inverse mapping, F1 ðuÞ ¼ v; F1 ðu0 Þ ¼ v0 ; F1 ðu þ u0 Þ ¼ v þ v0 ; F1 ðkuÞ ¼ kv: Then F1 ðu þ u0 Þ ¼ v þ v0 ¼ F1 ðuÞ þ F1 ðu0 Þ and F1 ðkuÞ ¼ kv ¼ kF1 ðuÞ Thus, F1 is linear. 180 CHAPTER 5 Linear Mappings
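Problem 5.14 can also be carried out numerically. By Theorem 5.2, the unique linear map sending the columns of B (the basis vectors) to the columns of U (their prescribed images) is v ↦ U B^{-1} v. The sketch below (sympy assumed) applies this to the data of Problem 5.14; it yields F(a, b) = (b, −5a + 4b), which suggests the formula printed above has lost minus signs in this copy.

```python
# Theorem 5.2 in matrix form, applied to the data of Problem 5.14:
# F(1,2) = (2,3) and F(0,1) = (1,4) determine a unique linear map.
from sympy import Matrix, symbols

a, b = symbols('a b')
B = Matrix([[1, 0],
            [2, 1]])                 # columns: the basis vectors (1,2) and (0,1)
U = Matrix([[2, 1],
            [3, 4]])                 # columns: their images (2,3) and (1,4)

M = U * B.inv()                      # matrix of F in the usual basis
assert M * Matrix([1, 2]) == Matrix([2, 3])
assert M * Matrix([0, 1]) == Matrix([1, 4])

print(M * Matrix([a, b]))            # column vector (b, -5*a + 4*b)
```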
  • 188. Kernel and Image of Linear Mappings 5.16. Let F : R4 ! R3 be the linear mapping defined by Fðx; y; z; tÞ ¼ ðx y þ z þ t; x þ 2z t; x þ y þ 3z 3tÞ Find a basis and the dimension of (a) the image of F; (b) the kernel of F. (a) Find the images of the usual basis of R4 : Fð1; 0; 0; 0Þ ¼ ð1; 1; 1Þ; Fð0; 0; 1; 0Þ ¼ ð1; 2; 3Þ Fð0; 1; 0; 0Þ ¼ ð1; 0; 1Þ; Fð0; 0; 0; 1Þ ¼ ð1; 1; 3Þ By Proposition 5.4, the image vectors span Im F. Hence, form the matrix whose rows are these image vectors, and row reduce to echelon form: 1 1 1 1 0 1 1 2 3 1 1 3 2 6 6 6 4 3 7 7 7 5 1 1 1 0 1 2 0 1 2 0 2 4 2 6 6 6 4 3 7 7 7 5 1 1 1 0 1 2 0 0 0 0 0 0 2 6 6 6 4 3 7 7 7 5 Thus, (1, 1, 1) and (0, 1, 2) form a basis for Im F; hence, dimðIm FÞ ¼ 2. (b) Set FðvÞ ¼ 0, where v ¼ ðx; y; z; tÞ; that is, set Fðx; y; z; tÞ ¼ ðx y þ z þ t; x þ 2z t; x þ y þ 3z 3tÞ ¼ ð0; 0; 0Þ Set corresponding entries equal to each other to form the following homogeneous system whose solution space is Ker F: x y þ z þ t ¼ 0 x þ 2z t ¼ 0 x þ y þ 3z 3t ¼ 0 or x y þ z þ t ¼ 0 y þ z 2t ¼ 0 2y þ 2z 4t ¼ 0 or x y þ z þ t ¼ 0 y þ z 2t ¼ 0 The free variables are z and t. Hence, dimðKer FÞ ¼ 2. (i) Set z ¼ 1, t ¼ 0 to obtain the solution (2; 1; 1; 0). (ii) Set z ¼ 0, t ¼ 1 to obtain the solution (1, 2, 0, 1). Thus, (2; 1; 1; 0) and (1, 2, 0, 1) form a basis of Ker F. [As expected, dimðIm FÞ þ dimðKer FÞ ¼ 2 þ 2 ¼ 4 ¼ dim R4 , the domain of F.] 5.17. Let G : R3 ! R3 be the linear mapping defined by Gðx; y; zÞ ¼ ðx þ 2y z; y þ z; x þ y 2zÞ Find a basis and the dimension of (a) the image of G, (b) the kernel of G. (a) Find the images of the usual basis of R3 : Gð1; 0; 0Þ ¼ ð1; 0; 1Þ; Gð0; 1; 0Þ ¼ ð2; 1; 1Þ; Gð0; 0; 1Þ ¼ ð1; 1; 2Þ By Proposition 5.4, the image vectors span Im G. Hence, form the matrix M whose rows are these image vectors, and row reduce to echelon form: M ¼ 1 0 1 2 1 1 1 1 2 2 4 3 5 1 0 1 0 1 1 0 1 1 2 4 3 5 1 0 1 0 1 1 0 0 0 2 4 3 5 Thus, (1, 0, 1) and (0; 1; 1) form a basis for Im G; hence, dimðIm GÞ ¼ 2. (b) Set GðvÞ ¼ 0, where v ¼ ðx; y; zÞ; that is, Gðx; y; zÞ ¼ ðx þ 2y z; y þ z; x þ y 2zÞ ¼ ð0; 0; 0Þ CHAPTER 5 Linear Mappings 181
  • 189. Set corresponding entries equal to each other to form the following homogeneous system whose solution space is Ker G: x þ 2y z ¼ 0 y þ z ¼ 0 x þ y 2z ¼ 0 or x þ 2y z ¼ 0 y þ z ¼ 0 y z ¼ 0 or x þ 2y z ¼ 0 y þ z ¼ 0 The only free variable is z; hence, dimðKer GÞ ¼ 1. Set z ¼ 1; then y ¼ 1 and x ¼ 3. Thus, (3; 1; 1) forms a basis of Ker G. [As expected, dimðIm GÞ þ dimðKer GÞ ¼ 2 þ 1 ¼ 3 ¼ dim R3 , the domain of G.] 5.18. Consider the matrix mapping A : R4 ! R3 , where A ¼ 1 2 3 1 1 3 5 2 3 8 13 3 2 4 3 5. Find a basis and the dimension of (a) the image of A, (b) the kernel of A. (a) The column space of A is equal to Im A. Now reduce AT to echelon form: AT ¼ 1 1 3 2 3 8 3 5 13 1 2 3 2 6 6 4 3 7 7 5 1 1 3 0 1 2 0 2 4 0 3 6 2 6 6 4 3 7 7 5 1 1 3 0 1 2 0 0 0 0 0 0 2 6 6 4 3 7 7 5 Thus, fð1; 1; 3Þ; ð0; 1; 2Þg is a basis of Im A, and dimðIm AÞ ¼ 2. (b) Here Ker A is the solution space of the homogeneous system AX ¼ 0, where X ¼ fx; y; z; tÞT . Thus, reduce the matrix A of coefficients to echelon form: 1 2 3 1 0 1 2 3 0 2 4 6 2 4 3 5 1 2 3 1 0 1 2 3 0 0 0 0 2 4 3 5 or x þ 2y þ 3z þ t ¼ 0 y þ 2z 3t ¼ 0 The free variables are z and t. Thus, dimðKer AÞ ¼ 2. (i) Set z ¼ 1, t ¼ 0 to get the solution (1; 2; 1; 0). (ii) Set z ¼ 0, t ¼ 1 to get the solution (7; 3; 0; 1). Thus, (1; 2; 1; 0) and (7; 3; 0; 1) form a basis for Ker A. 5.19. Find a linear map F : R3 ! R4 whose image is spanned by (1; 2; 0; 4) and (2; 0; 1; 3). Form a 4 3 matrix whose columns consist only of the given vectors, say A ¼ 1 2 2 2 0 0 0 1 1 4 3 3 2 6 6 4 3 7 7 5 Recall that A determines a linear map A : R3 ! R4 whose image is spanned by the columns of A. Thus, A satisfies the required condition. 5.20. Suppose f : V ! U is linear with kernel W, and that f ðvÞ ¼ u. Show that the ‘‘coset’’ v þ W ¼ fv þ w : w 2 Wg is the preimage of u; that is, f 1 ðuÞ ¼ v þ W. We must prove that (i) f 1 ðuÞ v þ W and (ii) v þ W f 1 ðuÞ. We first prove (i). Suppose v0 2 f 1 ðuÞ. Then f ðv0 Þ ¼ u, and so f ðv0 vÞ ¼ f ðv0 Þ f ðvÞ ¼ u u ¼ 0 that is, v0 v 2 W. Thus, v0 ¼ v þ ðv0 vÞ 2 v þ W, and hence f 1 ðuÞ v þ W. 182 CHAPTER 5 Linear Mappings
  • 190. Now we prove (ii). Suppose v0 2 v þ W. Then v0 ¼ v þ w, where w 2 W. Because W is the kernel of f ; we have f ðwÞ ¼ 0. Accordingly, f ðv0 Þ ¼ f ðv þ wÞ þ f ðvÞ þ f ðwÞ ¼ f ðvÞ þ 0 ¼ f ðvÞ ¼ u Thus, v0 2 f 1 ðuÞ, and so v þ W f 1 ðuÞ. Both inclusions imply f 1 ðuÞ ¼ v þ W. 5.21. Suppose F : V ! U and G : U ! W are linear. Prove (a) rankðG FÞ rankðGÞ, (b) rankðG FÞ rankðFÞ. (a) Because FðVÞ U, we also have GðFðVÞÞ GðUÞ, and so dim½GðFðVÞÞ dim½GðUÞ. Then rankðG FÞ ¼ dim½ðG FÞðVÞ ¼ dim½GðFðVÞÞ dim½GðUÞ ¼ rankðGÞ. (b) We have dim½GðFðVÞÞ dim½FðVÞ. Hence, rankðG FÞ ¼ dim½ðG FÞðVÞ ¼ dim½GðFðVÞÞ dim½FðVÞ ¼ rankðFÞ 5.22. Prove Theorem 5.3: Let F : V ! U be linear. Then, (a) Im F is a subspace of U, (b) Ker F is a subspace of V. (a) Because Fð0Þ ¼ 0; we have 0 2 Im F. Now suppose u; u0 2 Im F and a; b 2 K. Because u and u0 belong to the image of F, there exist vectors v; v0 2 V such that FðvÞ ¼ u and Fðv0 Þ ¼ u0 . Then Fðav þ bv0 Þ ¼ aFðvÞ þ bFðv0 Þ ¼ au þ bu0 2 Im F Thus, the image of F is a subspace of U. (b) Because Fð0Þ ¼ 0; we have 0 2 Ker F. Now suppose v; w 2 Ker F and a; b 2 K. Because v and w belong to the kernel of F, FðvÞ ¼ 0 and FðwÞ ¼ 0. Thus, Fðav þ bwÞ ¼ aFðvÞ þ bFðwÞ ¼ a0 þ b0 ¼ 0 þ 0 ¼ 0; and so av þ bw 2 Ker F Thus, the kernel of F is a subspace of V. 5.23. Prove Theorem 5.6: Suppose V has finite dimension and F : V ! U is linear. Then dim V ¼ dimðKer FÞ þ dimðIm FÞ ¼ nullityðFÞ þ rankðFÞ Suppose dimðKer FÞ ¼ r and fw1; . . . ; wrg is a basis of Ker F, and suppose dimðIm FÞ ¼ s and fu1; . . . ; usg is a basis of Im F. (By Proposition 5.4, Im F has finite dimension.) Because every uj 2 Im F, there exist vectors v1; . . . ; vs in V such that Fðv1Þ ¼ u1; . . . ; FðvsÞ ¼ us. We claim that the set B ¼ fw1; . . . ; wr; v1; . . . ; vsg is a basis of V; that is, (i) B spans V, and (ii) B is linearly independent. Once we prove (i) and (ii), then dim V ¼ r þ s ¼ dimðKer FÞ þ dimðIm FÞ. (i) B spans V. Let v 2 V. Then FðvÞ 2 Im F. Because the uj span Im F, there exist scalars a1; . . . ; as such that FðvÞ ¼ a1u1 þ þ asus. Set ^ v ¼ a1v1 þ þ asvs v. Then Fð^ vÞ ¼ Fða1v1 þ þ asvs vÞ ¼ a1Fðv1Þ þ þ asFðvsÞ FðvÞ ¼ a1u1 þ þ asus FðvÞ ¼ 0 Thus, ^ v 2 Ker F. Because the wi span Ker F, there exist scalars b1; . . . ; br, such that ^ v ¼ b1w1 þ þ brwr ¼ a1v1 þ þ asvs v Accordingly, v ¼ a1v1 þ þ asvs b1w1 brwr Thus, B spans V. CHAPTER 5 Linear Mappings 183
Now we prove (ii). Suppose v′ ∈ v + W. Then v′ = v + w, where w ∈ W. Because W is the kernel of f, we have f(w) = 0. Accordingly,

f(v′) = f(v + w) = f(v) + f(w) = f(v) + 0 = f(v) = u

Thus, v′ ∈ f^{-1}(u), and so v + W ⊆ f^{-1}(u). Both inclusions imply f^{-1}(u) = v + W.

5.21. Suppose F : V → U and G : U → W are linear. Prove (a) rank(G ∘ F) ≤ rank(G), (b) rank(G ∘ F) ≤ rank(F).

(a) Because F(V) ⊆ U, we also have G(F(V)) ⊆ G(U), and so dim[G(F(V))] ≤ dim[G(U)]. Then
rank(G ∘ F) = dim[(G ∘ F)(V)] = dim[G(F(V))] ≤ dim[G(U)] = rank(G).

(b) We have dim[G(F(V))] ≤ dim[F(V)]. Hence,
rank(G ∘ F) = dim[(G ∘ F)(V)] = dim[G(F(V))] ≤ dim[F(V)] = rank(F)

5.22. Prove Theorem 5.3: Let F : V → U be linear. Then (a) Im F is a subspace of U, (b) Ker F is a subspace of V.

(a) Because F(0) = 0, we have 0 ∈ Im F. Now suppose u, u′ ∈ Im F and a, b ∈ K. Because u and u′ belong to the image of F, there exist vectors v, v′ ∈ V such that F(v) = u and F(v′) = u′. Then

F(av + bv′) = aF(v) + bF(v′) = au + bu′ ∈ Im F

Thus, the image of F is a subspace of U.

(b) Because F(0) = 0, we have 0 ∈ Ker F. Now suppose v, w ∈ Ker F and a, b ∈ K. Because v and w belong to the kernel of F, F(v) = 0 and F(w) = 0. Thus,

F(av + bw) = aF(v) + bF(w) = a0 + b0 = 0 + 0 = 0, and so av + bw ∈ Ker F

Thus, the kernel of F is a subspace of V.

5.23. Prove Theorem 5.6: Suppose V has finite dimension and F : V → U is linear. Then

dim V = dim(Ker F) + dim(Im F) = nullity(F) + rank(F)

Suppose dim(Ker F) = r and {w_1, ..., w_r} is a basis of Ker F, and suppose dim(Im F) = s and {u_1, ..., u_s} is a basis of Im F. (By Proposition 5.4, Im F has finite dimension.) Because every u_j ∈ Im F, there exist vectors v_1, ..., v_s in V such that F(v_1) = u_1, ..., F(v_s) = u_s. We claim that the set

B = {w_1, ..., w_r, v_1, ..., v_s}

is a basis of V; that is, (i) B spans V, and (ii) B is linearly independent. Once we prove (i) and (ii), then dim V = r + s = dim(Ker F) + dim(Im F).

(i) B spans V. Let v ∈ V. Then F(v) ∈ Im F. Because the u_j span Im F, there exist scalars a_1, ..., a_s such that F(v) = a_1 u_1 + ··· + a_s u_s. Set v̂ = a_1 v_1 + ··· + a_s v_s − v. Then

F(v̂) = F(a_1 v_1 + ··· + a_s v_s − v) = a_1 F(v_1) + ··· + a_s F(v_s) − F(v) = a_1 u_1 + ··· + a_s u_s − F(v) = 0

Thus, v̂ ∈ Ker F. Because the w_i span Ker F, there exist scalars b_1, ..., b_r such that

v̂ = b_1 w_1 + ··· + b_r w_r = a_1 v_1 + ··· + a_s v_s − v

Accordingly,

v = a_1 v_1 + ··· + a_s v_s − b_1 w_1 − ··· − b_r w_r

Thus, B spans V.
  • 191. (ii) B is linearly independent. Suppose x1w1 þ þ xrwr þ y1v1 þ þ ysvs ¼ 0 ð1Þ where xi; yj 2 K. Then 0 ¼ Fð0Þ ¼ Fðx1w1 þ þ xrwr þ y1v1 þ þ ysvsÞ ¼ x1Fðw1Þ þ þ xrFðwrÞ þ y1Fðv1Þ þ þ ysFðvsÞ ð2Þ But FðwiÞ ¼ 0, since wi 2 Ker F, and FðvjÞ ¼ uj. Substituting into (2), we will obtain y1u1 þ þ ysus ¼ 0. Since the uj are linearly independent, each yj ¼ 0. Substitution into (1) gives x1w1 þ þ xrwr ¼ 0. Since the wi are linearly independent, each xi ¼ 0. Thus B is linearly independent. Singular and Nonsingular Linear Maps, Isomorphisms 5.24. Determine whether or not each of the following linear maps is nonsingular. If not, find a nonzero vector v whose image is 0. (a) F : R2 ! R2 defined by Fðx; yÞ ¼ ðx y; x 2yÞ. (b) G : R2 ! R2 defined by Gðx; yÞ ¼ ð2x 4y; 3x 6yÞ. (a) Find Ker F by setting FðvÞ ¼ 0, where v ¼ ðx; yÞ, ðx y; x 2yÞ ¼ ð0; 0Þ or x y ¼ 0 x 2y ¼ 0 or x y ¼ 0 y ¼ 0 The only solution is x ¼ 0, y ¼ 0. Hence, F is nonsingular. (b) Set Gðx; yÞ ¼ ð0; 0Þ to find Ker G: ð2x 4y; 3x 6yÞ ¼ ð0; 0Þ or 2x 4y ¼ 0 3x 6y ¼ 0 or x 2y ¼ 0 The system has nonzero solutions, because y is a free variable. Hence, G is singular. Let y ¼ 1 to obtain the solution v ¼ ð2; 1Þ, which is a nonzero vector, such that GðvÞ ¼ 0. 5.25. The linear map F : R2 ! R2 defined by Fðx; yÞ ¼ ðx y; x 2yÞ is nonsingular by the previous Problem 5.24. Find a formula for F1 . Set Fðx; yÞ ¼ ða; bÞ, so that F1 ða; bÞ ¼ ðx; yÞ. We have ðx y; x 2yÞ ¼ ða; bÞ or x y ¼ a x 2y ¼ b or x y ¼ a y ¼ a b Solve for x and y in terms of a and b to get x ¼ 2a b, y ¼ a b. Thus, F1 ða; bÞ ¼ ð2a b; a bÞ or F1 ðx; yÞ ¼ ð2x y; x yÞ (The second equation is obtained by replacing a and b by x and y, respectively.) 5.26. Let G : R2 ! R3 be defined by Gðx; yÞ ¼ ðx þ y; x 2y; 3x þ yÞ. (a) Show that G is nonsingular. (b) Find a formula for G1 . (a) Set Gðx; yÞ ¼ ð0; 0; 0Þ to find Ker G. We have ðx þ y; x 2y; 3x þ yÞ ¼ ð0; 0; 0Þ or x þ y ¼ 0; x 2y ¼ 0; 3x þ y ¼ 0 The only solution is x ¼ 0, y ¼ 0; hence, G is nonsingular. (b) Although G is nonsingular, it is not invertible, because R2 and R3 have different dimensions. (Thus, Theorem 5.9 does not apply.) Accordingly, G1 does not exist. 184 CHAPTER 5 Linear Mappings
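The nonsingularity tests of Problems 5.24–5.26 amount to null-space computations on the corresponding matrices, and the inverse formula of Problem 5.25 is a matrix inverse. A sketch assuming sympy is available; the coefficient signs are reconstructed from the worked answers, since they are garbled in this copy:

```python
# Problems 5.24-5.26: singular vs. nonsingular via null spaces, plus the inverse of Problem 5.25.
from sympy import Matrix

F = Matrix([[1, -1],
            [1, -2]])            # F(x,y) = (x - y, x - 2y)
G = Matrix([[2, -4],
            [3, -6]])            # G(x,y) = (2x - 4y, 3x - 6y)

print(F.nullspace())             # []                -> Ker F = {0}, F nonsingular
print(G.nullspace())             # [Matrix([2, 1])]  -> G singular, G(2,1) = 0

print(F.inv())                   # Matrix([[2, -1], [1, -1]]),
                                 # i.e. F^{-1}(x, y) = (2x - y, x - y), as in Problem 5.25

# Problem 5.26: a nonsingular map R^2 -> R^3 that is still not invertible (not onto)
H = Matrix([[1, 1],
            [1, -2],
            [3, 1]])             # H(x,y) = (x + y, x - 2y, 3x + y)
print(H.nullspace(), H.rank())   # [] and 2: kernel is {0}, but the image is only a plane in R^3
```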
  • 192. 5.27. Suppose that F : V ! U is linear and that V is of finite dimension. Show that V and the image of F have the same dimension if and only if F is nonsingular. Determine all nonsingular linear mappings T : R4 ! R3 . By Theorem 5.6, dim V ¼ dimðIm FÞ þ dimðKer FÞ. Hence, V and Im F have the same dimension if and only if dimðKer FÞ ¼ 0 or Ker F ¼ f0g (i.e., if and only if F is nonsingular). Because dim R3 is less than dim R4 , we have that dimðIm TÞ is less than the dimension of the domain R4 of T. Accordingly no linear mapping T : R4 ! R3 can be nonsingular. 5.28. Prove Theorem 5.7: Let F : V ! U be a nonsingular linear mapping. Then the image of any linearly independent set is linearly independent. Suppose v1; v2; . . . ; vn are linearly independent vectors in V. We claim that Fðv1Þ; Fðv2Þ; . . . ; FðvnÞ are also linearly independent. Suppose a1Fðv1Þ þ a2Fðv2Þ þ þ anFðvnÞ ¼ 0, where ai 2 K. Because F is linear, Fða1v1 þ a2v2 þ þ anvnÞ ¼ 0. Hence, a1v1 þ a2v2 þ þ anvn 2 Ker F But F is nonsingular—that is, Ker F ¼ f0g. Hence, a1v1 þ a2v2 þ þ anvn ¼ 0. Because the vi are linearly independent, all the ai are 0. Accordingly, the FðviÞ are linearly independent. Thus, the theorem is proved. 5.29. Prove Theorem 5.9: Suppose V has finite dimension and dim V ¼ dim U. Suppose F : V ! U is linear. Then F is an isomorphism if and only if F is nonsingular. If F is an isomorphism, then only 0 maps to 0; hence, F is nonsingular. Conversely, suppose F is nonsingular. Then dimðKer FÞ ¼ 0. By Theorem 5.6, dim V ¼ dimðKer FÞ þ dimðIm FÞ. Thus, dim U ¼ dim V ¼ dimðIm FÞ Because U has finite dimension, Im F ¼ U. This means F maps V onto U. Thus, F is one-to-one and onto; that is, F is an isomorphism. Operations with Linear Maps 5.30. Define F : R3 ! R2 and G : R3 ! R2 by Fðx; y; zÞ ¼ ð2x; y þ zÞ and Gðx; y; zÞ ¼ ðx z; yÞ. Find formulas defining the maps: (a) F þ G, (b) 3F, (c) 2F 5G. (a) ðF þ GÞðx; y; zÞ ¼ Fðx; y; zÞ þ Gðx; y; zÞ ¼ ð2x; y þ zÞ þ ðx z; yÞ ¼ ð3x z; 2y þ zÞ (b) ð3FÞðx; y; zÞ ¼ 3Fðx; y; zÞ ¼ 3ð2x; y þ zÞ ¼ ð6x; 3y þ 3zÞ (c) ð2F 5GÞðx; y; zÞ ¼ 2Fðx; y; zÞ 5Gðx; y; zÞ ¼ 2ð2x; y þ zÞ 5ðx z; yÞ ¼ ð4x; 2y þ 2zÞ þ ð5x þ 5z; 5yÞ ¼ ðx þ 5z; 3y þ 2zÞ 5.31. Let F : R3 ! R2 and G : R2 ! R2 be defined by Fðx; y; zÞ ¼ ð2x; y þ zÞ and Gðx; yÞ ¼ ðy; xÞ. Derive formulas defining the mappings: (a) G F, (b) F G. (a) ðG FÞðx; y; zÞ ¼ GðFðx; y; zÞÞ ¼ Gð2x; y þ zÞ ¼ ðy þ z; 2xÞ (b) The mapping F G is not defined, because the image of G is not contained in the domain of F. 5.32. Prove: (a) The zero mapping 0, defined by 0ðvÞ ¼ 0 2 U for every v 2 V, is the zero element of HomðV; UÞ. (b) The negative of F 2 HomðV; UÞ is the mapping ð1ÞF, that is, F ¼ ð1ÞF. Let F 2 HomðV; UÞ. Then, for every v 2 V: ðF þ 0ÞðvÞ ¼ FðvÞ þ 0ðvÞ ¼ FðvÞ þ 0 ¼ FðvÞ ðaÞ Because ðF þ 0ÞðvÞ ¼ FðvÞ for every v 2 V, we have F þ 0 ¼ F. Similarly, 0 þ F ¼ F: ðF þ ð1ÞFÞðvÞ ¼ FðvÞ þ ð1ÞFðvÞ ¼ FðvÞ FðvÞ ¼ 0 ¼ 0ðvÞ ðbÞ Thus, F þ ð1ÞF ¼ 0: Similarly ð1ÞF þ F ¼ 0: Hence, F ¼ ð1ÞF: CHAPTER 5 Linear Mappings 185
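The formulas of Problems 5.30 and 5.31 mirror the corresponding matrix operations: sums and scalar multiples of linear maps are sums and scalar multiples of their matrices, and composition is a matrix product. A numpy sketch (signs reconstructed where this copy drops them; the test vector is arbitrary):

```python
# Problems 5.30-5.31 replayed with matrices.
import numpy as np

A_F = np.array([[2, 0, 0],
                [0, 1, 1]])          # F(x,y,z) = (2x, y + z)
A_G = np.array([[1, 0, -1],
                [0, 1, 0]])          # G(x,y,z) = (x - z, y)

v = np.array([1, -2, 3])             # hypothetical test vector

print((A_F + A_G) @ v)               # (F + G)(v) = (3x - z, 2y + z) evaluated at v
print((2*A_F - 5*A_G) @ v)           # (2F - 5G)(v) = (-x + 5z, -3y + 2z) evaluated at v

# Problem 5.31(a): composing with G'(x,y) = (y, x) corresponds to a matrix product
A_Gp = np.array([[0, 1],
                 [1, 0]])
print((A_Gp @ A_F) @ v)              # (G' o F)(v) = (y + z, 2x) evaluated at v
```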
  • 193. 5.33. Suppose F1; F2; . . . ; Fn are linear maps from V into U. Show that, for any scalars a1; a2; . . . ; an, and for any v 2 V, ða1F1 þ a2F2 þ þ anFnÞðvÞ ¼ a1F1ðvÞ þ a2F2ðvÞ þ þ anFnðvÞ The mapping a1F1 is defined by ða1F1ÞðvÞ ¼ a1FðvÞ. Hence, the theorem holds for n ¼ 1. Accordingly, by induction, ða1F1 þ a2F2 þ þ anFnÞðvÞ ¼ ða1F1ÞðvÞ þ ða2F2 þ þ anFnÞðvÞ ¼ a1F1ðvÞ þ a2F2ðvÞ þ þ anFnðvÞ 5.34. Consider linear mappings F : R3 ! R2 , G : R3 ! R2 , H : R3 ! R2 defined by Fðx; y; zÞ ¼ ðx þ y þ z; x þ yÞ; Gðx; y; zÞ ¼ ð2x þ z; x þ yÞ; Hðx; y; zÞ ¼ ð2y; xÞ Show that F, G, H are linearly independent [as elements of HomðR3 ; R2 Þ]. Suppose, for scalars a; b; c 2 K, aF þ bG þ cH ¼ 0 ð1Þ (Here 0 is the zero mapping.) For e1 ¼ ð1; 0; 0Þ 2 R3 , we have 0ðe1Þ ¼ ð0; 0Þ and ðaF þ bG þ cHÞðe1Þ ¼ aFð1; 0; 0Þ þ bGð1; 0; 0Þ þ cHð1; 0; 0Þ ¼ að1; 1Þ þ bð2; 1Þ þ cð0; 1Þ ¼ ða þ 2b; a þ b þ cÞ Thus by (1), ða þ 2b; a þ b þ cÞ ¼ ð0; 0Þ and so a þ 2b ¼ 0 and a þ b þ c ¼ 0 ð2Þ Similarly for e2 ¼ ð0; 1; 0Þ 2 R3 , we have 0ðe2Þ ¼ ð0; 0Þ and ðaF þ bG þ cHÞðe2Þ ¼ aFð0; 1; 0Þ þ bGð0; 1; 0Þ þ cHð0; 1; 0Þ ¼ að1; 1Þ þ bð0; 1Þ þ cð2; 0Þ ¼ ða þ 2c; a þ bÞ Thus, a þ 2c ¼ 0 and a þ b ¼ 0 ð3Þ Using (2) and (3), we obtain a ¼ 0; b ¼ 0; c ¼ 0 ð4Þ Because (1) implies (4), the mappings F, G, H are linearly independent. 5.35. Let k be a nonzero scalar. Show that a linear map T is singular if and only if kT is singular. Hence, T is singular if and only if T is singular. Suppose T is singular. Then TðvÞ ¼ 0 for some vector v 6¼ 0. Hence, ðkTÞðvÞ ¼ kTðvÞ ¼ k0 ¼ 0 and so kT is singular. Now suppose kT is singular. Then ðkTÞðwÞ ¼ 0 for some vector w 6¼ 0. Hence, TðkwÞ ¼ kTðwÞ ¼ ðkTÞðwÞ ¼ 0 But k 6¼ 0 and w 6¼ 0 implies kw 6¼ 0. Thus, T is also singular. 5.36. Find the dimension d of: (a) HomðR3 ; R4 Þ, (b) HomðR5 ; R3 Þ, (c) HomðP3ðtÞ; R2 Þ, (d) HomðM2;3; R4 Þ. Use dim½HomðV; UÞ ¼ mn, where dim V ¼ m and dim U ¼ n. (a) d ¼ 3ð4Þ ¼ 12. (c) Because dim P3ðtÞ ¼ 4, d ¼ 4ð2Þ ¼ 8. (b) d ¼ 5ð3Þ ¼ 15. (d) Because dim M2;3 ¼ 6, d ¼ 6ð4Þ ¼ 24. 186 CHAPTER 5 Linear Mappings
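The independence argument of Problem 5.34 can be recast as a rank computation: identify each map in Hom(R^3, R^2) with its 2 x 3 matrix, flatten, and check that the three resulting vectors of R^6 (recall dim Hom(R^3, R^2) = 6 by Theorem 5.11) are independent. A numpy sketch:

```python
# Problem 5.34 revisited: F, G, H are linearly independent in Hom(R^3, R^2).
import numpy as np

F = np.array([[1, 1, 1],
              [1, 1, 0]])        # F(x,y,z) = (x + y + z, x + y)
G = np.array([[2, 0, 1],
              [1, 1, 0]])        # G(x,y,z) = (2x + z, x + y)
H = np.array([[0, 2, 0],
              [1, 0, 0]])        # H(x,y,z) = (2y, x)

stacked = np.vstack([F.flatten(), G.flatten(), H.flatten()])   # three vectors in R^6
print(np.linalg.matrix_rank(stacked))   # 3 -> aF + bG + cH = 0 forces a = b = c = 0
```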
  • 194. 5.37. Prove Theorem 5.11. Suppose dim V ¼ m and dim U ¼ n. Then dim½HomðV; UÞ ¼ mn. Suppose fv1; . . . ; vmg is a basis of V and fu1; . . . ; ung is a basis of U. By Theorem 5.2, a linear mapping in HomðV; UÞ is uniquely determined by arbitrarily assigning elements of U to the basis elements vi of V. We define Fij 2 HomðV; UÞ; i ¼ 1; . . . ; m; j ¼ 1; . . . ; n to be the linear mapping for which FijðviÞ ¼ uj, and FijðvkÞ ¼ 0 for k 6¼ i. That is, Fij maps vi into uj and the other v’s into 0. Observe that fFijg contains exactly mn elements; hence, the theorem is proved if we show that it is a basis of HomðV; UÞ. Proof that fFijg generates HomðV; UÞ. Consider an arbitrary function F 2 HomðV; UÞ. Suppose Fðv1Þ ¼ w1; Fðv2Þ ¼ w2; . . . ; FðvmÞ ¼ wm. Because wk 2 U, it is a linear combination of the u’s; say, wk ¼ ak1u1 þ ak2u2 þ þ aknun; k ¼ 1; . . . ; m; aij 2 K ð1Þ Consider the linear mapping G ¼ Pm i¼1 Pn j¼1 aijFij. Because G is a linear combination of the Fij, the proof that fFijg generates HomðV; UÞ is complete if we show that F ¼ G. We now compute GðvkÞ; k ¼ 1; . . . ; m. Because FijðvkÞ ¼ 0 for k 6¼ i and FkiðvkÞ ¼ ui; GðvkÞ ¼ P m i¼1 P n j¼1 aijFijðvkÞ ¼ P n j¼1 akjFkjðvkÞ ¼ P n j¼1 akjuj ¼ ak1u1 þ ak2u2 þ þ aknun Thus, by (1), GðvkÞ ¼ wk for each k. But FðvkÞ ¼ wk for each k. Accordingly, by Theorem 5.2, F ¼ G; hence, fFijg generates HomðV; UÞ. Proof that fFijg is linearly independent. Suppose, for scalars cij 2 K, P m i¼1 P n j¼1 cijFij ¼ 0 For vk; k ¼ 1; . . . ; m, 0 ¼ 0ðvkÞ ¼ P m i¼1 P n j¼1 cijFijðvkÞ ¼ P n j¼1 ckjFkjðvkÞ ¼ P n j¼1 ckjuj ¼ ck1u1 þ ck2u2 þ þ cknun But the ui are linearly independent; hence, for k ¼ 1; . . . ; m, we have ck1 ¼ 0; ck2 ¼ 0; . . . ; ckn ¼ 0. In other words, all the cij ¼ 0, and so fFijg is linearly independent. 5.38. Prove Theorem 5.12: (i) G ðF þ F0 Þ ¼ G F þ G F0 . (ii) ðG þ G0 Þ F ¼ G F þ G0 F. (iii) kðG FÞ ¼ ðkGÞ F ¼ G ðkFÞ. (i) For every v 2 V, ðG ðF þ F0 ÞÞðvÞ ¼ GððF þ F0 ÞðvÞÞ ¼ GðFðvÞ þ F0 ðvÞÞ ¼ GðFðvÞÞ þ GðF0 ðvÞÞ ¼ ðG FÞðvÞ þ ðG F0 ÞðvÞ ¼ ðG F þ G F0 ÞðvÞ Thus, G ðF þ F0 Þ ¼ G F þ G F0 . (ii) For every v 2 V, ððG þ G0 Þ FÞðvÞ ¼ ðG þ G0 ÞðFðvÞÞ ¼ GðFðvÞÞ þ G0 ðFðvÞÞ ¼ ðG FÞðvÞ þ ðG0 FÞðvÞ ¼ ðG F þ G0 FÞðvÞ Thus, ðG þ G0 Þ F ¼ G F þ G0 F. CHAPTER 5 Linear Mappings 187
  • 195. (iii) For every v 2 V, ðkðG FÞÞðvÞ ¼ kðG FÞðvÞ ¼ kðGðFðvÞÞÞ ¼ ðkGÞðFðvÞÞ ¼ ðkG FÞðvÞ and ðkðG FÞÞðvÞ ¼ kðG FÞðvÞ ¼ kðGðFðvÞÞÞ ¼ GðkFðvÞÞ ¼ GððkFÞðvÞÞ ¼ ðG kFÞðvÞ Accordingly, kðG FÞ ¼ ðkGÞ F ¼ G ðkFÞ. (We emphasize that two mappings are shown to be equal by showing that each of them assigns the same image to each point in the domain.) Algebra of Linear Maps 5.39. Let F and G be the linear operators on R2 defined by Fðx; yÞ ¼ ðy; xÞ and Gðx; yÞ ¼ ð0; xÞ. Find formulas defining the following operators: (a) F þ G, (b) 2F 3G, (c) FG, (d) GF, (e) F2 , (f) G2 . (a) ðF þ GÞðx; yÞ ¼ Fðx; yÞ þ Gðx; yÞ ¼ ðy; xÞ þ ð0; xÞ ¼ ðy; 2xÞ. (b) ð2F 3GÞðx; yÞ ¼ 2Fðx; yÞ 3Gðx; yÞ ¼ 2ðy; xÞ 3ð0; xÞ ¼ ð2y; xÞ. (c) ðFGÞðx; yÞ ¼ FðGðx; yÞÞ ¼ Fð0; xÞ ¼ ðx; 0Þ. (d) ðGFÞðx; yÞ ¼ GðFðx; yÞÞ ¼ Gðy; xÞ ¼ ð0; yÞ. (e) F2 ðx; yÞ ¼ FðFðx; yÞÞ ¼ Fðy; xÞ ¼ ðx; yÞ. (Note that F2 ¼ I, the identity mapping.) (f) G2 ðx; yÞ ¼ GðGðx; yÞÞ ¼ Gð0; xÞ ¼ ð0; 0Þ. (Note that G2 ¼ 0, the zero mapping.) 5.40. Consider the linear operator T on R3 defined by Tðx; y; zÞ ¼ ð2x; 4x y; 2x þ 3y zÞ. (a) Show that T is invertible. Find formulas for (b) T1 , (c) T2 , (d) T2 . (a) Let W ¼ Ker T. We need only show that T is nonsingular (i.e., that W ¼ f0g). Set Tðx; y; zÞ ¼ ð0; 0; 0Þ, which yields Tðx; y; zÞ ¼ ð2x; 4x y; 2x þ 3y zÞ ¼ ð0; 0; 0Þ Thus, W is the solution space of the homogeneous system 2x ¼ 0; 4x y ¼ 0; 2x þ 3y z ¼ 0 which has only the trivial solution (0, 0, 0). Thus, W ¼ f0g. Hence, T is nonsingular, and so T is invertible. (b) Set Tðx; y; zÞ ¼ ðr; s; tÞ [and so T1 ðr; s; tÞ ¼ ðx; y; zÞ]. We have ð2x; 4x y; 2x þ 3y zÞ ¼ ðr; s; tÞ or 2x ¼ r; 4x y ¼ s; 2x þ 3y z ¼ t Solve for x, y, z in terms of r, s, t to get x ¼ 1 2 r, y ¼ 2r s, z ¼ 7r 3s t. Thus, T1 ðr; s; tÞ ¼ ð1 2 r; 2r s; 7r 3s tÞ or T1 ðx; y; zÞ ¼ ð1 2 x; 2x y; 7x 3y zÞ (c) Apply T twice to get T2 ðx; y; zÞ ¼ Tð2x; 4x y; 2x þ 3y zÞ ¼ ½4x; 4ð2xÞ ð4x yÞ; 2ð2xÞ þ 3ð4x yÞ ð2x þ 3y zÞ ¼ ð4x; 4x þ y; 14x 6y þ zÞ (d) Apply T1 twice to get T2 ðx; y; zÞ ¼ T2 ð1 2 x; 2x y; 7x 3y zÞ ¼ ½1 4 x; 2ð1 2 xÞ ð2x yÞ; 7ð1 2 xÞ 3ð2x yÞ ð7x 3y zÞ ¼ ð1 4 x; x þ y; 19 2 x þ 6y þ zÞ 188 CHAPTER 5 Linear Mappings
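The operator T of Problem 5.40 is determined by a 3 x 3 matrix, so T^{-1} and T^2 can be checked mechanically. A numpy sketch (the minus signs in T are reconstructed from the worked solution):

```python
# Problem 5.40: T(x,y,z) = (2x, 4x - y, 2x + 3y - z) as a matrix.
import numpy as np

M = np.array([[2,  0,  0],
              [4, -1,  0],
              [2,  3, -1]], dtype=float)     # matrix of T in the usual basis

expected_inv = np.array([[0.5,  0,  0],      # T^{-1}(x,y,z) = (x/2, 2x - y, 7x - 3y - z)
                         [2.0, -1,  0],
                         [7.0, -3, -1]])
assert np.allclose(np.linalg.inv(M), expected_inv)

expected_sq = np.array([[ 4,  0, 0],         # T^2(x,y,z) = (4x, 4x + y, 14x - 6y + z)
                        [ 4,  1, 0],
                        [14, -6, 1]], dtype=float)
assert np.allclose(M @ M, expected_sq)
```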
  • 196. 5.41. Let V be of finite dimension and let T be a linear operator on V for which TR ¼ I, for some operator R on V. (We call R a right inverse of T.) (a) Show that T is invertible. (b) Show that R ¼ T1 . (c) Give an example showing that the above need not hold if V is of infinite dimension. (a) Let dim V ¼ n. By Theorem 5.14, T is invertible if and only if T is onto; hence, T is invertible if and only if rankðTÞ ¼ n. We have n ¼ rankðIÞ ¼ rankðTRÞ rankðTÞ n. Hence, rankðTÞ ¼ n and T is invertible. (b) TT1 ¼ T1 T ¼ I. Then R ¼ IR ¼ ðT1 TÞR ¼ T1 ðTRÞ ¼ T1 I ¼ T1 . (c) Let V be the space of polynomials in t over K; say, pðtÞ ¼ a0 þ a1t þ a2t2 þ þ asts . Let T and R be the operators on V defined by Tð pðtÞÞ ¼ 0 þ a1 þ a2t þ þ asts1 and Rð pðtÞÞ ¼ a0t þ a1t2 þ þ astsþ1 We have ðTRÞð pðtÞÞ ¼ TðRð pðtÞÞÞ ¼ Tða0t þ a1t2 þ þ astsþ1 Þ ¼ a0 þ a1t þ þ asts ¼ pðtÞ and so TR ¼ I, the identity mapping. On the other hand, if k 2 K and k 6¼ 0, then ðRTÞðkÞ ¼ RðTðkÞÞ ¼ Rð0Þ ¼ 0 6¼ k Accordingly, RT 6¼ I. 5.42. Let F and G be linear operators on R2 defined by Fðx; yÞ ¼ ð0; xÞ and Gðx; yÞ ¼ ðx; 0Þ. Show that (a) GF ¼ 0, the zero mapping, but FG 6¼ 0. (b) G2 ¼ G. (a) ðGFÞðx; yÞ ¼ GðFðx; yÞÞ ¼ Gð0; xÞ ¼ ð0; 0Þ. Because GF assigns 0 ¼ ð0; 0Þ to every vector (x; y) in R2 , it is the zero mapping; that is, GF ¼ 0. On the other hand, ðFGÞðx; yÞ ¼ FðGðx; yÞÞ ¼ Fðx; 0Þ ¼ ð0; xÞ. For example, ðFGÞð2; 3Þ ¼ ð0; 2Þ. Thus, FG 6¼ 0, as it does not assign 0 ¼ ð0; 0Þ to every vector in R2 . (b) For any vector (x; y) in R2 , we have G2 ðx; yÞ ¼ GðGðx; yÞÞ ¼ Gðx; 0Þ ¼ ðx; 0Þ ¼ Gðx; yÞ. Hence, G2 ¼ G. 5.43. Find the dimension of (a) AðR4 Þ, (b) AðP2ðtÞÞ, (c) AðM2;3). Use dim½AðVÞ ¼ n2 where dim V ¼ n. Hence, (a) dim½AðR4 Þ ¼ 42 ¼ 16, (b) dim½AðP2ðtÞÞ ¼ 32 ¼ 9, (c) dim½AðM2;3Þ ¼ 62 ¼ 36. 5.44. Let E be a linear operator on V for which E2 ¼ E. (Such an operator is called a projection.) Let U be the image of E, and let W be the kernel. Prove (a) If u 2 U, then EðuÞ ¼ u (i.e., E is the identity mapping on U ). (b) If E 6¼ I, then E is singular—that is, EðvÞ ¼ 0 for some v 6¼ 0. (c) V ¼ U W. (a) If u 2 U, the image of E, then EðvÞ ¼ u for some v 2 V. Hence, using E2 ¼ E, we have u ¼ EðvÞ ¼ E2 ðvÞ ¼ EðEðvÞÞ ¼ EðuÞ (b) If E 6¼ I, then for some v 2 V, EðvÞ ¼ u, where v 6¼ u. By (i), EðuÞ ¼ u. Thus, Eðv uÞ ¼ EðvÞ EðuÞ ¼ u u ¼ 0; where v u 6¼ 0 (c) We first show that V ¼ U þ W. Let v 2 V. Set u ¼ EðvÞ and w ¼ v EðvÞ. Then v ¼ EðvÞ þ v EðvÞ ¼ u þ w By deflnition, u ¼ EðvÞ 2 U, the image of E. We now show that w 2 W, the kernel of E, EðwÞ ¼ Eðv EðvÞÞ ¼ EðvÞ E2 ðvÞ ¼ EðvÞ EðvÞ ¼ 0 and thus w 2 W. Hence, V ¼ U þ W. We next show that U W ¼ f0g. Let v 2 U W. Because v 2 U, EðvÞ ¼ v by part (a). Because v 2 W, EðvÞ ¼ 0. Thus, v ¼ EðvÞ ¼ 0 and so U W ¼ f0g. The above two properties imply that V ¼ U W. CHAPTER 5 Linear Mappings 189
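Remark. A concrete projection makes the three parts of Problem 5.44 easy to visualize. The sketch below uses the hypothetical projection E(x, y) = (x - y, 0) on R^2, chosen only for illustration and not taken from the problem, and checks that E^2 = E, that E fixes its image, and that R^2 is the direct sum of Im E and Ker E.

import numpy as np

# A hypothetical projection on R^2: E(x, y) = (x - y, 0).
E = np.array([[1.0, -1.0],
              [0.0,  0.0]])

# (a) E^2 = E, and E acts as the identity on its image U = Im E.
print(np.allclose(E @ E, E))                 # True
u = E @ np.array([3.0, 5.0])                 # some vector of U
print(np.allclose(E @ u, u))                 # True

# (c) V = U + W with W = Ker E: v = E(v) + (v - E(v)), and the second term
#     lies in the kernel because E(v - E(v)) = E(v) - E^2(v) = 0.
v = np.array([4.0, -7.0])
print(np.allclose(E @ (v - E @ v), 0.0))     # True

# U and W meet only in 0: a basis of U next to a basis of W gives an invertible matrix.
U_basis = np.array([[1.0], [0.0]])           # Im E = span{(1, 0)}
_, s, Vt = np.linalg.svd(E)
W_basis = Vt[np.sum(s > 1e-12):].T           # Ker E = span{(1, 1)}, found from the SVD
M = np.hstack([U_basis, W_basis])
print(np.linalg.matrix_rank(M) == 2)         # True, so R^2 is the direct sum of U and W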
  • 197. SUPPLEMENTARY PROBLEMS Mappings 5.45. Determine the number of different mappings from ðaÞ f1; 2g into f1; 2; 3g; ðbÞ f1; 2; ... ; rg into f1; 2; ... ; sg: 5.46. Let f : R ! R and g : R ! R be defined by f ðxÞ ¼ x2 þ 3x þ 1 and gðxÞ ¼ 2x 3. Find formulas defining the composition mappings: (a) f g; (b) g f ; (c) g g; (d) f f. 5.47. For each mappings f : R ! R find a formula for its inverse: (a) f ðxÞ ¼ 3x 7, (b) f ðxÞ ¼ x3 þ 2. 5.48. For any mapping f : A ! B, show that 1B f ¼ f ¼ f 1A. Linear Mappings 5.49. Show that the following mappings are linear: (a) F : R3 ! R2 defined by Fðx; y; zÞ ¼ ðx þ 2y 3z; 4x 5y þ 6zÞ. (b) F : R2 ! R2 defined by Fðx; yÞ ¼ ðax þ by; cx þ dyÞ, where a, b, c, d belong to R. 5.50. Show that the following mappings are not linear: (a) F : R2 ! R2 defined by Fðx; yÞ ¼ ðx2 ; y2 Þ. (b) F : R3 ! R2 defined by Fðx; y; zÞ ¼ ðx þ 1; y þ zÞ. (c) F : R2 ! R2 defined by Fðx; yÞ ¼ ðxy; yÞ. (d) F : R3 ! R2 defined by Fðx; y; zÞ ¼ ðjxj; y þ zÞ. 5.51. Find Fða; bÞ, where the linear map F : R2 ! R2 is defined by Fð1; 2Þ ¼ ð3; 1Þ and Fð0; 1Þ ¼ ð2; 1Þ. 5.52. Find a 2 2 matrix A that maps (a) ð1; 3ÞT and ð1; 4ÞT into ð2; 5ÞT and ð3; 1ÞT , respectively. (b) ð2; 4ÞT and ð1; 2ÞT into ð1; 1ÞT and ð1; 3ÞT , respectively. 5.53. Find a 2 2 singular matrix B that maps ð1; 1ÞT into ð1; 3ÞT . 5.54. Let V be the vector space of real n-square matrices, and let M be a fixed nonzero matrix in V. Show that the first two of the following mappings T : V ! V are linear, but the third is not: (a) TðAÞ ¼ MA, (b) TðAÞ ¼ AM þ MA, (c) TðAÞ ¼ M þ A. 5.55. Give an example of a nonlinear map F : R2 ! R2 such that F1 ð0Þ ¼ f0g but F is not one-to-one. 5.56. Let F : R2 ! R2 be defined by Fðx; yÞ ¼ ð3x þ 5y; 2x þ 3yÞ, and let S be the unit circle in R2 . (S consists of all points satisfying x2 þ y2 ¼ 1.) Find (a) the image FðSÞ, (b) the preimage F1 ðSÞ. 5.57. Consider the linear map G : R3 ! R3 defined by Gðx; y; zÞ ¼ ðx þ y þ z; y 2z; y 3zÞ and the unit sphere S2 in R3 , which consists of the points satisfying x2 þ y2 þ z2 ¼ 1. Find (a) GðS2Þ, (b) G1 ðS2Þ. 5.58. Let H be the plane x þ 2y 3z ¼ 4 in R3 and let G be the linear map in Problem 5.57. Find (a) GðHÞ, (b) G1 ðHÞ. 5.59. Let W be a subspace of V. The inclusion map, denoted by i : W ,! V, is defined by iðwÞ ¼ w for every w 2 W. Show that the inclusion map is linear. 5.60. Suppose F : V ! U is linear. Show that FðvÞ ¼ FðvÞ. Kernel and Image of Linear Mappings 5.61. For each linear map F find a basis and the dimension of the kernel and the image of F: (a) F : R3 ! R3 defined by Fðx; y; zÞ ¼ ðx þ 2y 3z; 2x þ 5y 4z; x þ 4y þ zÞ, (b) F : R4 ! R3 defined by Fðx; y; z; tÞ ¼ ðx þ 2y þ 3z þ 2t; 2x þ 4y þ 7z þ 5t; x þ 2y þ 6z þ 5tÞ. 190 CHAPTER 5 Linear Mappings
  • 198. 5.62. For each linear map G, find a basis and the dimension of the kernel and the image of G: (a) G : R3 ! R2 defined by Gðx; y; zÞ ¼ ðx þ y þ z; 2x þ 2y þ 2zÞ, (b) G : R3 ! R2 defined by Gðx; y; zÞ ¼ ðx þ y; y þ zÞ, (c) G : R5 ! R3 defined by Gðx; y; z; s; tÞ ¼ ðx þ 2y þ 2z þ s þ t; x þ 2y þ 3z þ 2s t; 3x þ 6y þ 8z þ 5s tÞ: 5.63. Each of the following matrices determines a linear map from R4 into R3 : (a) A ¼ 1 2 0 1 2 1 2 1 1 3 2 2 2 4 3 5, (b) B ¼ 1 0 2 1 2 3 1 1 2 0 5 3 2 4 3 5. Find a basis as well as the dimension of the kernel and the image of each linear map. 5.64. Find a linear mapping F : R3 ! R3 whose image is spanned by (1, 2, 3) and (4, 5, 6). 5.65. Find a linear mapping G : R4 ! R3 whose kernel is spanned by (1, 2, 3, 4) and (0, 1, 1, 1). 5.66. Let V ¼ P10ðtÞ, the vector space of polynomials of degree 10. Consider the linear map D4 : V ! V, where D4 denotes the fourth derivative d4 ð f Þ=dt4 . Find a basis and the dimension of (a) the image of D4 ; (b) the kernel of D4 . 5.67. Suppose F : V ! U is linear. Show that (a) the image of any subspace of V is a subspace of U; (b) the preimage of any subspace of U is a subspace of V. 5.68. Show that if F : V ! U is onto, then dim U dim V. Determine all linear maps F : R3 ! R4 that are onto. 5.69. Consider the zero mapping 0 : V ! U defined by 0ðvÞ ¼ 0; 8 v 2 V. Find the kernel and the image of 0. Operations with linear Mappings 5.70. Let F : R3 ! R2 and G : R3 ! R2 be defined by Fðx; y; zÞ ¼ ðy; x þ zÞ and Gðx; y; zÞ ¼ ð2z; x yÞ. Find formulas defining the mappings F þ G and 3F 2G. 5.71. Let H : R2 ! R2 be defined by Hðx; yÞ ¼ ðy; 2xÞ. Using the maps F and G in Problem 5.70, find formulas defining the mappings: (a) H F and H G, (b) F H and G H, (c) H ðF þ GÞ and H F þ H G. 5.72. Show that the following mappings F, G, H are linearly independent: (a) F; G; H 2 HomðR2 ; R2 Þ defined by Fðx; yÞ ¼ ðx; 2yÞ, Gðx; yÞ ¼ ðy; x þ yÞ, Hðx; yÞ ¼ ð0; xÞ, (b) F; G; H 2 HomðR3 ; RÞ defined by Fðx; y; zÞ ¼ x þ y þ z, Gðx; y; zÞ ¼ y þ z, Hðx; y; zÞ ¼ x z. 5.73. For F; G 2 HomðV; UÞ, show that rankðF þ GÞ rankðFÞ þ rankðGÞ. (Here V has finite dimension.) 5.74. Let F : V ! U and G : U ! V be linear. Show that if F and G are nonsingular, then G F is nonsingular. Give an example where G F is nonsingular but G is not. [Hint: Let dim V dim U: 5.75. Find the dimension d of (a) HomðR2 ; R8 Þ, (b) HomðP4ðtÞ; R3 Þ, (c) HomðM2;4; P2ðtÞÞ. 5.76. Determine whether or not each of the following linear maps is nonsingular. If not, find a nonzero vector v whose image is 0; otherwise find a formula for the inverse map: (a) F : R3 ! R3 defined by Fðx; y; zÞ ¼ ðx þ y þ z; 2x þ 3y þ 5z; x þ 3y þ 7zÞ, (b) G : R3 ! P2ðtÞ defined by Gðx; y; zÞ ¼ ðx þ yÞt2 þ ðx þ 2y þ 2zÞt þ y þ z, (c) H : R2 ! P2ðtÞ defined by Hðx; yÞ ¼ ðx þ 2yÞt2 þ ðx yÞt þ x þ y. 5.77. When can dim ½HomðV; UÞ ¼ dim V? CHAPTER 5 Linear Mappings 191
  • 199. Algebra of Linear Operators 5.78. Let F and G be the linear operators on R2 defined by Fðx; yÞ ¼ ðx þ y; 0Þ and Gðx; yÞ ¼ ðy; xÞ. Find formulas defining the linear operators: (a) F þ G, (b) 5F 3G, (c) FG, (d) GF, (e) F2 , ( f ) G2 . 5.79. Show that each linear operator T on R2 is nonsingular and find a formula for T1 , where (a) Tðx; yÞ ¼ ðx þ 2y; 2x þ 3yÞ, (b) Tðx; yÞ ¼ ð2x 3y; 3x 4yÞ. 5.80. Show that each of the following linear operators T on R3 is nonsingular and find a formula for T1 , where (a) Tðx; y; zÞ ¼ ðx 3y 2z; y 4z; zÞ; (b) Tðx; y; zÞ ¼ ðx þ z; x y; yÞ. 5.81. Find the dimension of AðVÞ, where (a) V ¼ R7 , (b) V ¼ P5ðtÞ, (c) V ¼ M3;4. 5.82. Which of the following integers can be the dimension of an algebra AðVÞ of linear maps: 5, 9, 12, 25, 28, 36, 45, 64, 88, 100? 5.83. Let T be the linear operator on R2 defined by Tðx; yÞ ¼ ðx þ 2y; 3x þ 4yÞ. Find a formula for f ðTÞ, where (a) f ðtÞ ¼ t2 þ 2t 3, (b) f ðtÞ ¼ t2 5t 2. Miscellaneous Problems 5.84. Suppose F : V ! U is linear and k is a nonzero scalar. Prove that the maps F and kF have the same kernel and the same image. 5.85. Suppose F and G are linear operators on V and that F is nonsingular. Assume that V has finite dimension. Show that rankðFGÞ ¼ rankðGFÞ ¼ rankðGÞ. 5.86. Suppose V has finite dimension. Suppose T is a linear operator on V such that rankðT2 Þ ¼ rankðTÞ. Show that Ker T Im T ¼ f0g. 5.87. Suppose V ¼ U W. Let E1 and E2 be the linear operators on V defined by E1ðvÞ ¼ u, E2ðvÞ ¼ w, where v ¼ u þ w, u 2 U, w 2 W. Show that (a) E2 1 ¼ E1 and E2 2 ¼ E2 (i.e., that E1 and E2 are projections); (b) E1 þ E2 ¼ I, the identity mapping; (c) E1E2 ¼ 0 and E2E1 ¼ 0. 5.88. Let E1 and E2 be linear operators on V satisfying parts (a), (b), (c) of Problem 5.88. Prove V ¼ Im E1 Im E2 5.89. Let v and w be elements of a real vector space V. The line segment L from v to v þ w is defined to be the set of vectors v þ tw for 0 t 1. (See Fig. 5.6.) (a) Show that the line segment L between vectors v and u consists of the points: (i) ð1 tÞv þ tu for 0 t 1, (ii) t1v þ t2u for t1 þ t2 ¼ 1, t1 0, t2 0. (b) Let F : V ! U be linear. Show that the image FðLÞ of a line segment L in V is a line segment in U. Figure 5-6 192 CHAPTER 5 Linear Mappings
  • 200. 5.90. Let F : V ! U be linear and let W be a subspace of V. The restriction of F to W is the map FjW : W ! U defined by FjWðvÞ ¼ FðvÞ for every v in W. Prove the following: (a) FjW is linear; (b) KerðFjWÞ ¼ ðKer FÞ W; (c) ImðFjWÞ ¼ FðWÞ. 5.91. A subset X of a vector space V is said to be convex if the line segment L between any two points (vectors) P; Q 2 X is contained in X. (a) Show that the intersection of convex sets is convex; (b) suppose F : V ! U is linear and X is convex. Show that FðXÞ is convex. ANSWERS TO SUPPLEMENTARY PROBLEMS 5.45. ðaÞ 32 ¼ 9; ðbÞ sr 5.46. (a) ð f gÞðxÞ ¼ 4x2 þ 1, (b) ðg f ÞðxÞ ¼ 2x2 þ 6x 1, (c) ðg gÞðxÞ ¼ 4x 9, (d) ð f f ÞðxÞ ¼ x4 þ 6x3 þ 14x2 þ 15x þ 5 5.47. (a) f 1 ðxÞ ¼ 1 3 ðx þ 7Þ, (b) f 1 ðxÞ ¼ ffiffiffiffiffiffiffiffiffiffiffi x 2 3 p 5.49. Fðx; y; zÞ ¼ Aðx; y; zÞT , where (a) A ¼ 1 2 3 4 5 6 , (b) A ¼ a b c d 5.50. (a) u ¼ ð2; 2Þ, k ¼ 3; then FðkuÞ ¼ ð36; 36Þ but kFðuÞ ¼ ð12; 12Þ; (b) Fð0Þ 6¼ 0; (c) u ¼ ð1; 2Þ, v ¼ ð3; 4Þ; then Fðu þ vÞ ¼ ð24; 6Þ but FðuÞ þ FðvÞ ¼ ð14; 6Þ; (d) u ¼ ð1; 2; 3Þ, k ¼ 2; then FðkuÞ ¼ ð2; 10Þ but kFðuÞ ¼ ð2; 10Þ. 5.51. Fða; bÞ ¼ ða þ 2b; 3a þ bÞ 5.52. (a) A ¼ 17 5 23 6 ; (b) None. (2; 4) and (1; 2) are linearly dependent but not (1, 1) and (1, 3). 5.53. B ¼ 1 0 3 0 [Hint: Send ð0; 1ÞT into ð0; 0ÞT .] 5.55. Fðx; yÞ ¼ ðx2 ; y2 Þ 5.56. (a) 13x2 42xy þ 34y2 ¼ 1, (b) 13x2 þ 42xy þ 34y2 ¼ 1 5.57. (a) x2 8xy þ 26y2 þ 6xz 38yz þ 14z2 ¼ 1, (b) x2 þ 2xy þ 3y2 þ 2xz 8yz þ 14z2 ¼ 1 5.58. (a) x y þ 2z ¼ 4, (b) x þ 6z ¼ 4 5.61. (a) dimðKer FÞ ¼ 1, fð7; 2; 1Þg; dimðIm FÞ ¼ 2, fð1; 2; 1Þ; ð0; 1; 2Þg; (b) dimðKer FÞ ¼ 2, fð2; 1; 0; 0Þ; ð1; 0; 1; 1Þg; dimðIm FÞ ¼ 2, fð1; 2; 1Þ; ð0; 1; 3Þg 5.62. (a) dimðKer GÞ ¼ 2, fð1; 0; 1Þ; ð1; 1; 0Þg; dimðIm GÞ ¼ 1, fð1; 2Þg; (b) dimðKer GÞ ¼ 1, fð1; 1; 1Þg; Im G ¼ R2 , fð1; 0Þ; ð0; 1Þg; (c) dimðKer GÞ ¼ 3, fð2; 1; 0; 0; 0Þ; ð1; 0; 1; 1; 0Þ; ð5; 0; 2; 0; 1Þg; dimðIm GÞ ¼ 2, fð1; 1; 3Þ; ð0; 1; 2Þg 5.63. (a) dimðKer AÞ ¼ 2, fð4; 2; 5; 0Þ; ð1; 3; 0; 5Þg; dimðIm AÞ ¼ 2, fð1; 2; 1Þ; ð0; 1; 1Þg; (b) dimðKer BÞ ¼ 1, fð1; 2 3 ; 1; 1Þg; Im B ¼ R3 5.64. Fðx; y; zÞ ¼ ðx þ 4y; 2x þ 5y; 3x þ 6yÞ CHAPTER 5 Linear Mappings 193
  • 201. 5.65. Fðx; y; z; tÞ ¼ ðx þ y z; 2x þ y t; 0Þ 5.66. (a) f1; t; t2 ; . . . ; t6 g, (b) f1; t; t2 ; t3 g 5.68. None, because dim R4 dim R3 : 5.69. Ker 0 ¼ V, Im 0 ¼ f0g 5.70. ðF þ GÞðx; y; zÞ ¼ ðy þ 2z; 2x y þ zÞ, ð3F 2GÞðx; y; zÞ ¼ ð3y 4z; x þ 2y þ 3zÞ 5.71. (a) ðH FÞðx; y; zÞ ¼ ðx þ z; 2yÞ, ðH GÞðx; y; zÞ ¼ ðx y; 4zÞ; (b) not defined; (c) ðH ðF þ GÞÞðx; y; zÞ ¼ ðH F þ H GÞðx; y; zÞ ¼ ð2x y þ z; 2y þ 4zÞ 5.74. Fðx; yÞ ¼ ðx; y; yÞ; Gðx; y; zÞ ¼ ðx; yÞ 5.75. (a) 16, (b) 15, (c) 24 5.76. (a) v ¼ ð2; 3; 1Þ; (b) G1 ðat2 þ bt þ cÞ ¼ ðb 2c; a b þ 2c; a þ b cÞ; (c) H is nonsingular, but not invertible, because dim P2ðtÞ dim R2 . 5.77. dim U ¼ 1; that is, U ¼ K. 5.78. (a) ðF þ GÞðx; yÞ ¼ ðx; xÞ; (b) ð5F 3GÞðx; yÞ ¼ ð5x þ 8y; 3xÞ; (c) ðFGÞðx; yÞ ¼ ðx y; 0Þ; (d) ðGFÞðx; yÞ ¼ ð0; x þ yÞ; (e) F2 ðx; yÞ ¼ ðx þ y; 0Þ (note that F2 ¼ F); ( f ) G2 ðx; yÞ ¼ ðx; yÞ. [Note that G2 þ I ¼ 0; hence, G is a zero of f ðtÞ ¼ t2 þ 1.] 5.79. (a) T1 ðx; yÞ ¼ ð3x þ 2y; 2x yÞ, (b) T1 ðx; yÞ ¼ ð4x þ 3y; 3x þ 2yÞ 5.80. (a) T1 ðx; y; zÞ ¼ ðx þ 3y þ 14z; y 4z; zÞ, (b) T1 ðx; y; zÞ ¼ ðy þ z; y; x y zÞ 5.81. (a) 49, (b) 36, (c) 144 5.82. Squares: 9, 25, 36, 64, 100 5.83. (a) Tðx; yÞ ¼ ð6x þ 14y; 21x þ 27yÞ; (b) Tðx; yÞ ¼ ð0; 0Þ—that is, f ðTÞ ¼ 0 194 CHAPTER 5 Linear Mappings
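Remark. Several of the numerical answers above can be confirmed by machine. The sketch below, assuming the usual identification of these operators with matrices, checks the answer to Problem 5.83 and the kernel and image dimensions in the answer to Problem 5.61(a); the minus signs used for F are the ones consistent with the listed kernel vector (7, -2, 1).

import numpy as np

# Problem 5.83: T(x, y) = (x + 2y, 3x + 4y), so its matrix is
A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
I = np.eye(2)

f1 = A @ A + 2 * A - 3 * I        # f(t) = t^2 + 2t - 3
f2 = A @ A - 5 * A - 2 * I        # f(t) = t^2 - 5t - 2
print(f1)                          # [[ 6. 14.] [21. 27.]]  -> T(x, y) = (6x + 14y, 21x + 27y)
print(np.allclose(f2, 0))          # True: f(T) = 0 (t^2 - 5t - 2 is the characteristic polynomial of A)

# Problem 5.61(a): F(x, y, z) = (x + 2y - 3z, 2x + 5y - 4z, x + 4y + z)
F = np.array([[1.0, 2.0, -3.0],
              [2.0, 5.0, -4.0],
              [1.0, 4.0, 1.0]])
rank = np.linalg.matrix_rank(F)
print(rank, 3 - rank)              # 2 1, i.e., dim(Im F) = 2 and dim(Ker F) = 1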
  • 202. Linear Mappings and Matrices 6.1 Introduction Consider a basis S ¼ fu1; u2; . . . ; ung of a vector space V over a field K. For any vector v 2 V, suppose v ¼ a1u1 þ a2u2 þ þ anun Then the coordinate vector of v relative to the basis S, which we assume to be a column vector (unless otherwise stated or implied), is denoted and defined by ½vS ¼ ½a1; a2; . . . ; anT Recall (Section 4.11) that the mapping v7!½vS, determined by the basis S, is an isomorphism between V and Kn . This chapter shows that there is also an isomorphism, determined by the basis S, between the algebra AðVÞ of linear operators on V and the algebra M of n-square matrices over K. Thus, every linear mapping F: V ! V will correspond to an n-square matrix ½FS determined by the basis S. We will also show how our matrix representation changes when we choose another basis. 6.2 Matrix Representation of a Linear Operator Let T be a linear operator (transformation) from a vector space V into itself, and suppose S ¼ fu1; u2; . . . ; ung is a basis of V. Now Tðu1Þ, Tðu2Þ; . . . ; TðunÞ are vectors in V, and so each is a linear combination of the vectors in the basis S; say, Tðu1Þ ¼ a11u1 þ a12u2 þ þ a1nun Tðu2Þ ¼ a21u1 þ a22u2 þ þ a2nun :::::::::::::::::::::::::::::::::::::::::::::::::::::: TðunÞ ¼ an1u1 þ an2u2 þ þ annun The following definition applies. DEFINITION: The transpose of the above matrix of coefficients, denoted by mSðTÞ or ½TS, is called the matrix representation of T relative to the basis S, or simply the matrix of T in the basis S. (The subscript S may be omitted if the basis S is understood.) Using the coordinate (column) vector notation, the matrix representation of T may be written in the form mSðTÞ ¼ ½TS ¼ ½Tðu1ÞS; ½Tðu2ÞS; . . . ; ½Tðu1ÞS That is, the columns of mðTÞ are the coordinate vectors of Tðu1Þ, Tðu2Þ; . . . ; TðunÞ, respectively. CHAPTER 6 195
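Computationally, each column [T(u_k)]_S is the solution of a linear system whose coefficient matrix has the basis vectors of S as its columns. The following is a minimal NumPy sketch of that recipe for V = K^n, with the operator given by its matrix in the usual basis; it is offered only as an illustration of the definition, checked against Example 6.1(a), which follows.

import numpy as np

def matrix_rep(T_std, S):
    """Matrix of the operator with usual-basis matrix T_std relative to the basis S.

    S is a list of basis vectors of K^n; column k of the result holds the
    coordinates of T(u_k) in the basis S, found by solving U x = T(u_k),
    where U has the basis vectors as columns.
    """
    U = np.column_stack(S)
    images = T_std @ U                   # k-th column is T(u_k) in usual coordinates
    return np.linalg.solve(U, images)    # solve for all columns at once

# Example 6.1: F(x, y) = (2x + 3y, 4x - 5y) and S = {(1, 2), (2, 5)}.
F_std = np.array([[2.0, 3.0],
                  [4.0, -5.0]])
S = [np.array([1.0, 2.0]), np.array([2.0, 5.0])]
print(matrix_rep(F_std, S))    # [[ 52. 129.] [-22. -55.]], the matrix found in Example 6.1(a)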
  • 203. EXAMPLE 6.1 Let F: R2 ! R2 be the linear operator defined by Fðx; yÞ ¼ ð2x þ 3y; 4x 5yÞ. (a) Find the matrix representation of F relative to the basis S ¼ fu1; u2g ¼ fð1; 2Þ; ð2; 5Þg. (1) First find Fðu1Þ, and then write it as a linear combination of the basis vectors u1 and u2. (For notational convenience, we use column vectors.) We have Fðu1Þ ¼ F 1 2 ¼ 8 6 ¼ x 1 2 þ y 2 5 and x þ 2y ¼ 8 2x þ 5y ¼ 6 Solve the system to obtain x ¼ 52, y ¼ 22. Hence, Fðu1Þ ¼ 52u1 22u2. (2) Next find Fðu2Þ, and then write it as a linear combination of u1 and u2: Fðu2Þ ¼ F 2 5 ¼ 19 17 ¼ x 1 2 þ y 2 5 and x þ 2y ¼ 19 2x þ 5y ¼ 17 Solve the system to get x ¼ 129, y ¼ 55. Thus, Fðu2Þ ¼ 129u1 55u2. Now write the coordinates of Fðu1Þ and Fðu2Þ as columns to obtain the matrix ½FS ¼ 52 129 22 55 (b) Find the matrix representation of F relative to the (usual) basis E ¼ fe1; e2g ¼ fð1; 0Þ; ð0; 1Þg. Find Fðe1Þ and write it as a linear combination of the usual basis vectors e1 and e2, and then find Fðe2Þ and write it as a linear combination of e1 and e2. We have Fðe1Þ ¼ Fð1; 0Þ ¼ ð2; 2Þ ¼ 2e1 þ 4e2 Fðe2Þ ¼ Fð0; 1Þ ¼ ð3; 5Þ ¼ 3e1 5e2 and so ½FE ¼ 2 3 4 5 Note that the coordinates of Fðe1Þ and Fðe2Þ form the columns, not the rows, of ½FE. Also, note that the arithmetic is much simpler using the usual basis of R2 . EXAMPLE 6.2 Let V be the vector space of functions with basis S ¼ fsin t; cos t; e3t g, and let D: V ! V be the differential operator defined by Dð f ðtÞÞ ¼ dð f ðtÞÞ=dt. We compute the matrix representing D in the basis S: Dðsin tÞ ¼ cos t ¼ 0ðsin tÞ þ 1ðcos tÞ þ 0ðe3 tÞ Dðcos tÞ ¼ sin t ¼ 1ðsin tÞ þ 0ðcos tÞ þ 0ðe3t Þ Dðe3t Þ ¼ 3e3t ¼ 0ðsin tÞ þ 0ðcos tÞ þ 3ðe3t Þ and so ½D ¼ 0 1 0 1 0 0 0 0 3 2 6 4 3 7 5 Note that the coordinates of Dðsin tÞ, Dðcos tÞ, Dðe3t Þ form the columns, not the rows, of ½D. Matrix Mappings and Their Matrix Representation Consider the following matrix A, which may be viewed as a linear operator on R2 , and basis S of R2 : A ¼ 3 2 4 5 and S ¼ fu1; u2g ¼ 1 2 ; 2 5 (We write vectors as columns, because our map is a matrix.) We find the matrix representation of A relative to the basis S. 196 CHAPTER 6 Linear Mappings and Matrices
  • 204. (1) First we write Aðu1Þ as a linear combination of u1 and u2. We have Aðu1Þ ¼ 3 2 4 5 1 2 ¼ 1 6 ¼ x 1 2 þ y 2 5 and so x þ 2y ¼ 1 2x þ 5y ¼ 6 Solving the system yields x ¼ 7, y ¼ 4. Thus, Aðu1Þ ¼ 7u1 4u2. (2) Next we write Aðu2Þ as a linear combination of u1 and u2. We have Aðu2Þ ¼ 3 2 4 5 2 5 ¼ 4 7 ¼ x 1 2 þ y 2 5 and so x þ 2y ¼ 4 2x þ 5y ¼ 7 Solving the system yields x ¼ 6, y ¼ 1. Thus, Aðu2Þ ¼ 6u1 þ u2. Writing the coordinates of Aðu1Þ and Aðu2Þ as columns gives us the following matrix representation of A: ½AS ¼ 7 6 4 1 Remark: Suppose we want to find the matrix representation of A relative to the usual basis E ¼ fe1; e2g ¼ f½1; 0T ; ½0; 1T g of R2 : We have Aðe1Þ ¼ 3 2 4 5 1 0 ¼ 3 4 ¼ 3e1 þ 4e2 Aðe2Þ ¼ 3 2 4 5 0 1 ¼ 2 5 ¼ 2e1 5e2 and so ½AE ¼ 3 2 4 5 Note that ½AE is the original matrix A. This result is true in general: The matrix representation of any n n square matrix A over a field K relative to the usual basis E of Kn is the matrix A itself; that is; ½AE ¼ A Algorithm for Finding Matrix Representations Next follows an algorithm for finding matrix representations. The first Step 0 is optional. It may be useful to use it in Step 1(b), which is repeated for each basis vector. ALGORITHM 6.1: The input is a linear operator T on a vector space V and a basis S ¼ fu1; u2; . . . ; ung of V. The output is the matrix representation ½TS. Step 0. Find a formula for the coordinates of an arbitrary vector v relative to the basis S. Step 1. Repeat for each basis vector uk in S: (a) Find TðukÞ. (b) Write TðukÞ as a linear combination of the basis vectors u1; u2; . . . ; un. Step 2. Form the matrix ½TS whose columns are the coordinate vectors in Step 1(b). EXAMPLE 6.3 Let F: R2 ! R2 be defined by Fðx; yÞ ¼ ð2x þ 3y; 4x 5yÞ. Find the matrix representa- tion ½FS of F relative to the basis S ¼ fu1; u2g ¼ fð1; 2Þ; ð2; 5Þg. (Step 0) First find the coordinates of ða; bÞ 2 R2 relative to the basis S. We have a b ¼ x 1 2 þ y 2 5 or x þ 2y ¼ a 2x 5y ¼ b or x þ 2y ¼ a y ¼ 2a þ b CHAPTER 6 Linear Mappings and Matrices 197
  • 205. Solving for x and y in terms of a and b yields x ¼ 5a þ 2b, y ¼ 2a b. Thus, ða; bÞ ¼ ð5a þ 2bÞu1 þ ð2a bÞu2 (Step 1) Now we find Fðu1Þ and write it as a linear combination of u1 and u2 using the above formula for ða; bÞ, and then we repeat the process for Fðu2Þ. We have Fðu1Þ ¼ Fð1; 2Þ ¼ ð4; 14Þ ¼ 8u1 6u2 Fðu2Þ ¼ Fð2; 5Þ ¼ ð11; 33Þ ¼ 11u1 11u2 (Step 2) Finally, we write the coordinates of Fðu1Þ and Fðu2Þ as columns to obtain the required matrix: ½FS ¼ 8 11 6 11 Properties of Matrix Representations This subsection gives the main properties of the matrix representations of linear operators T on a vector space V. We emphasize that we are always given a particular basis S of V. Our first theorem, proved in Problem 6.9, tells us that the ‘‘action’’ of a linear operator T on a vector v is preserved by its matrix representation. THEOREM 6.1: Let T: V ! V be a linear operator, and let S be a (finite) basis of V. Then, for any vector v in V, ½TS½vS ¼ ½TðvÞS. EXAMPLE 6.4 Consider the linear operator F on R2 and the basis S of Example 6.3; that is, Fðx; yÞ ¼ ð2x þ 3y; 4x 5yÞ and S ¼ fu1; u2g ¼ fð1; 2Þ; ð2; 5Þg Let v ¼ ð5; 7Þ; and so FðvÞ ¼ ð11; 55Þ Using the formula from Example 6.3, we get ½v ¼ ½11; 3T and ½FðvÞ ¼ ½55; 33T We verify Theorem 6.1 for this vector v (where ½F is obtained from Example 6.3): ½F½v ¼ 8 11 6 11 11 3 ¼ 55 33 ¼ ½FðvÞ Given a basis S of a vector space V, we have associated a matrix ½T to each linear operator T in the algebra AðVÞ of linear operators on V. Theorem 6.1 tells us that the ‘‘action’’ of an individual linear operator T is preserved by this representation. The next two theorems (proved in Problems 6.10 and 6.11) tell us that the three basic operations in AðVÞ with these operators—namely (i) addition, (ii) scalar multiplication, and (iii) composition—are also preserved. THEOREM 6.2: Let V be an n-dimensional vector space over K, let S be a basis of V, and let M be the algebra of n n matrices over K. Then the mapping m: AðVÞ ! M defined by mðTÞ ¼ ½TS is a vector space isomorphism. That is, for any F; G 2 AðVÞ and any k 2 K, (i) mðF þ GÞ ¼ mðFÞ þ mðGÞ or ½F þ G ¼ ½F þ ½G (ii) mðkFÞ ¼ kmðFÞ or ½kF ¼ k½F (iii) m is bijective (one-to-one and onto). 198 CHAPTER 6 Linear Mappings and Matrices
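Algorithm 6.1 and the verification of Theorem 6.1 in Example 6.4 translate into a few lines of NumPy. In the sketch below the basis of Example 6.3 is taken as S = {(1, -2), (2, -5)} and the vector of Example 6.4 as v = (5, -7); these minus signs do not reproduce well in this copy, but they are the ones consistent with the computations shown in the examples.

import numpy as np

F_std = np.array([[2.0, 3.0],
                  [4.0, -5.0]])          # F(x, y) = (2x + 3y, 4x - 5y)
U = np.array([[1.0, 2.0],
              [-2.0, -5.0]])             # columns are u1 = (1, -2) and u2 = (2, -5)

# Algorithm 6.1: column k of [F]_S solves U x = F(u_k).
F_S = np.linalg.solve(U, F_std @ U)
print(F_S)                               # [[  8.  11.] [ -6. -11.]], as in Example 6.3

# Theorem 6.1 / Example 6.4: [F]_S [v]_S = [F(v)]_S for v = (5, -7).
v = np.array([5.0, -7.0])
v_S  = np.linalg.solve(U, v)             # [11, -3]
Fv_S = np.linalg.solve(U, F_std @ v)     # [55, -33]
print(np.allclose(F_S @ v_S, Fv_S))      # True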
  • 206. THEOREM 6.3: For any linear operators F; G 2 AðVÞ, mðG FÞ ¼ mðGÞmðFÞ or ½G F ¼ ½G½F (Here G F denotes the composition of the maps G and F.) 6.3 Change of Basis Let V be an n-dimensional vector space over a field K. We have shown that once we have selected a basis S of V, every vector v 2 V can be represented by means of an n-tuple ½vS in Kn , and every linear operator T in AðVÞ can be represented by an n n matrix over K. We ask the following natural question: How do our representations change if we select another basis? In order to answer this question, we first need a definition. DEFINITION: Let S ¼ fu1; u2; . . . ; ung be a basis of a vector space V; and let S0 ¼ fv1; v2; . . . ; vng be another basis. (For reference, we will call S the ‘‘old’’ basis and S0 the ‘‘new’’ basis.) Because S is a basis, each vector in the ‘‘new’’ basis S0 can be written uniquely as a linear combination of the vectors in S; say, v1 ¼ a11u1 þ a12u2 þ þ a1nun v2 ¼ a21u1 þ a22u2 þ þ a2nun ::::::::::::::::::::::::::::::::::::::::::::::::: vn ¼ an1u1 þ an2u2 þ þ annun Let P be the transpose of the above matrix of coefficients; that is, let P ¼ ½pij, where pij ¼ aji. Then P is called the change-of-basis matrix (or transition matrix) from the ‘‘old’’ basis S to the ‘‘new’’ basis S0 . The following remarks are in order. Remark 1: The above change-of-basis matrix P may also be viewed as the matrix whose columns are, respectively, the coordinate column vectors of the ‘‘new’’ basis vectors vi relative to the ‘‘old’’ basis S; namely, P ¼ ½v1S; ½v2S; . . . ; ½vnS Remark 2: Analogously, there is a change-of-basis matrix Q from the ‘‘new’’ basis S0 to the ‘‘old’’ basis S. Similarly, Q may be viewed as the matrix whose columns are, respectively, the coordinate column vectors of the ‘‘old’’ basis vectors ui relative to the ‘‘new’’ basis S0 ; namely, Q ¼ ½u1S0 ; ½u2S0 ; . . . ; ½unS0 Remark 3: Because the vectors v1; v2; . . . ; vn in the new basis S0 are linearly independent, the matrix P is invertible (Problem 6.18). Similarly, Q is invertible. In fact, we have the following proposition (proved in Problem 6.18). PROPOSITION 6.4: Let P and Q be the above change-of-basis matrices. Then Q ¼ P1 . Now suppose S ¼ fu1; u2; . . . ; ung is a basis of a vector space V, and suppose P ¼ ½pij is any nonsingular matrix. Then the n vectors vi ¼ p1iui þ p2iu2 þ þ pniun; i ¼ 1; 2; . . . ; n corresponding to the columns of P, are linearly independent [Problem 6.21(a)]. Thus, they form another basis S0 of V. Moreover, P will be the change-of-basis matrix from S to the new basis S0 . CHAPTER 6 Linear Mappings and Matrices 199
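The recipe for P, and the relation Q = P^{-1} of Proposition 6.4, can also be stated computationally: column j of P is the solution of a linear system expressing the j-th "new" basis vector in the "old" basis. The sketch below uses a hypothetical pair of bases of R^2, chosen only for illustration and not taken from the examples that follow.

import numpy as np

# Hypothetical bases of R^2: column j of P holds the old-basis coordinates
# of the j-th new basis vector, found by solving a linear system.
S_old = np.array([[1.0, 1.0],
                  [0.0, 1.0]])            # columns u1 = (1, 0), u2 = (1, 1)
S_new = np.array([[2.0, 3.0],
                  [1.0, 1.0]])            # columns v1 = (2, 1), v2 = (3, 1)

P = np.linalg.solve(S_old, S_new)         # change of basis from S_old to S_new
Q = np.linalg.solve(S_new, S_old)         # change of basis from S_new back to S_old

print(P)                                   # [[1. 2.] [1. 1.]]  (v1 = u1 + u2, v2 = 2u1 + u2)
print(np.allclose(Q, np.linalg.inv(P)))    # True: Q = P^{-1}, as in Proposition 6.4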
  • 207. EXAMPLE 6.5 Consider the following two bases of R2 : S ¼ fu1; u2g ¼ fð1; 2Þ; ð3; 5Þg and S0 ¼ fv1; v2g ¼ fð1; 1Þ; ð1; 2Þg (a) Find the change-of-basis matrix P from S to the ‘‘new’’ basis S0 . Write each of the new basis vectors of S0 as a linear combination of the original basis vectors u1 and u2 of S. We have 1 1 ¼ x 1 2 þ y 3 5 or x þ 3y ¼ 1 2x þ 5y ¼ 1 yielding x ¼ 8; y ¼ 3 1 1 ¼ x 1 2 þ y 3 5 or x þ 3y ¼ 1 2x þ 5y ¼ 1 yielding x ¼ 11; y ¼ 4 Thus, v1 ¼ 8u1 þ 3u2 v2 ¼ 11u1 þ 4u2 and hence; P ¼ 8 11 3 4 : Note that the coordinates of v1 and v2 are the columns, not rows, of the change-of-basis matrix P. (b) Find the change-of-basis matrix Q from the ‘‘new’’ basis S0 back to the ‘‘old’’ basis S. Here we write each of the ‘‘old’’ basis vectors u1 and u2 of S0 as a linear combination of the ‘‘new’’ basis vectors v1 and v2 of S0 . This yields u1 ¼ 4v1 3v2 u2 ¼ 11v1 8v2 and hence; Q ¼ 4 11 3 8 As expected from Proposition 6.4, Q ¼ P1 . (In fact, we could have obtained Q by simply finding P1 .) EXAMPLE 6.6 Consider the following two bases of R3 : E ¼ fe1; e2; e3g ¼ fð1; 0; 0Þ; ð0; 1; 0Þ; ð0; 0; 1Þg and S ¼ fu1; u2; u3g ¼ fð1; 0; 1Þ; ð2; 1; 2Þ; ð1; 2; 2Þg (a) Find the change-of-basis matrix P from the basis E to the basis S. Because E is the usual basis, we can immediately write each basis element of S as a linear combination of the basis elements of E. Specifically, u1 ¼ ð1; 0; 1Þ ¼ e1 þ e3 u2 ¼ ð2; 1; 2Þ ¼ 2e1 þ e2 þ 2e3 u3 ¼ ð1; 2; 2Þ ¼ e1 þ 2e2 þ 2e3 and hence; P ¼ 1 2 1 0 1 2 1 2 2 2 6 4 3 7 5 Again, the coordinates of u1; u2; u3 appear as the columns in P. Observe that P is simply the matrix whose columns are the basis vectors of S. This is true only because the original basis was the usual basis E. (b) Find the change-of-basis matrix Q from the basis S to the basis E. The definition of the change-of-basis matrix Q tells us to write each of the (usual) basis vectors in E as a linear combination of the basis elements of S. This yields e1 ¼ ð1; 0; 0Þ ¼ 2u1 þ 2u2 u3 e2 ¼ ð0; 1; 0Þ ¼ 2u1 þ u2 e3 ¼ ð0; 0; 1Þ ¼ 3u1 2u2 þ u3 and hence; Q ¼ 2 2 3 2 1 2 1 0 1 2 6 4 3 7 5 We emphasize that to find Q, we need to solve three 3 3 systems of linear equations—one 3 3 system for each of e1; e2; e3. 200 CHAPTER 6 Linear Mappings and Matrices
  • 208. Alternatively, we can find Q ¼ P1 by forming the matrix M ¼ ½P; I and row reducing M to row canonical form: M ¼ 1 2 1 1 0 0 0 1 2 0 1 0 1 2 2 0 0 1 2 6 4 3 7 5 1 0 0 2 2 3 0 1 0 2 1 2 0 0 1 1 0 1 2 6 4 3 7 5 ¼ ½I; P1 thus; Q ¼ P1 ¼ 2 2 3 2 1 2 1 0 1 2 6 4 3 7 5 (Here we have used the fact that Q is the inverse of P.) The result in Example 6.6(a) is true in general. We state this result formally, because it occurs often. PROPOSITION 6.5: The change-of-basis matrix from the usual basis E of Kn to any basis S of Kn is the matrix P whose columns are, respectively, the basis vectors of S. Applications of Change-of-Basis Matrix First we show how a change of basis affects the coordinates of a vector in a vector space V. The following theorem is proved in Problem 6.22. THEOREM 6.6: Let P be the change-of-basis matrix from a basis S to a basis S0 in a vector space V. Then, for any vector v 2 V, we have P½vS0 ¼ ½vS and hence; P1 ½vS ¼ ½vS0 Namely, if we multiply the coordinates of v in the original basis S by P1 , we get the coordinates of v in the new basis S0 . Remark 1: Although P is called the change-of-basis matrix from the old basis S to the new basis S0 , we emphasize that P1 transforms the coordinates of v in the original basis S into the coordinates of v in the new basis S0 . Remark 2: Because of the above theorem, many texts call Q ¼ P1 , not P, the transition matrix from the old basis S to the new basis S0 . Some texts also refer to Q as the change-of-coordinates matrix. We now give the proof of the above theorem for the special case that dim V ¼ 3. Suppose P is the change-of-basis matrix from the basis S ¼ fu1; u2; u3g to the basis S0 ¼ fv1; v2; v3g; say, v1 ¼ a1u1 þ a2u2 þ a3a3 v2 ¼ b1u1 þ b2u2 þ b3u3 v3 ¼ c1u1 þ c2u2 þ c3u3 and hence; P ¼ a1 b1 c1 a2 b2 c2 a3 b3 c3 2 4 3 5 Now suppose v 2 V and, say, v ¼ k1v1 þ k2v2 þ k3v3. Then, substituting for v1; v2; v3 from above, we obtain v ¼ k1ða1u1 þ a2u2 þ a3u3Þ þ k2ðb1u1 þ b2u2 þ b3u3Þ þ k3ðc1u1 þ c2u2 þ c3u3Þ ¼ ða1k1 þ b1k2 þ c1k3Þu1 þ ða2k1 þ b2k2 þ c2k3Þu2 þ ða3k1 þ b3k2 þ c3k3Þu3 CHAPTER 6 Linear Mappings and Matrices 201
  • 209. Thus, ½vS0 ¼ k1 k2 k3 2 4 3 5 and ½vS ¼ a1k1 þ b1k2 þ c1k3 a2k1 þ b2k2 þ c2k3 a3k1 þ b3k2 þ c3k3 2 4 3 5 Accordingly, P½vS0 ¼ a1 b1 c1 a2 b2 c2 a3 b3 c3 2 4 3 5 k1 k2 k3 2 4 3 5 ¼ a1k1 þ b1k2 þ c1k3 a2k1 þ b2k2 þ c2k3 a3k1 þ b3k2 þ c3k3 2 4 3 5 ¼ ½vS Finally, multiplying the equation ½vS ¼ P½vS, by P1 , we get P1 ½vS ¼ P1 P½vS0 ¼ I½vS0 ¼ ½vS0 The next theorem (proved in Problem 6.26) shows how a change of basis affects the matrix representation of a linear operator. THEOREM 6.7: Let P be the change-of-basis matrix from a basis S to a basis S0 in a vector space V. Then, for any linear operator T on V, ½TS0 ¼ P1 ½TSP That is, if A and B are the matrix representations of T relative, respectively, to S and S0 , then B ¼ P1 AP EXAMPLE 6.7 Consider the following two bases of R3 : E ¼ fe1; e2; e3g ¼ fð1; 0; 0Þ; ð0; 1; 0Þ; ð0; 0; 1Þg and S ¼ fu1; u2; u3g ¼ fð1; 0; 1Þ; ð2; 1; 2Þ; ð1; 2; 2Þg The change-of-basis matrix P from E to S and its inverse P1 were obtained in Example 6.6. (a) Write v ¼ ð1; 3; 5Þ as a linear combination of u1; u2; u3, or, equivalently, find ½vS. One way to do this is to directly solve the vector equation v ¼ xu1 þ yu2 þ zu3; that is, 1 3 5 2 4 3 5 ¼ x 1 0 1 2 4 3 5 þ y 2 1 2 2 4 3 5 þ z 1 2 2 2 4 3 5 or x þ 2y þ z ¼ 1 y þ 2z ¼ 3 x þ 2y þ 2z ¼ 5 The solution is x ¼ 7, y ¼ 5, z ¼ 4, so v ¼ 7u1 5u2 þ 4u3. On the other hand, we know that ½vE ¼ ½1; 3; 5T , because E is the usual basis, and we already know P1 . Therefore, by Theorem 6.6, ½vS ¼ P1 ½vE ¼ 2 2 3 2 1 2 1 0 1 2 4 3 5 1 3 5 2 4 3 5 ¼ 7 5 4 2 4 3 5 Thus, again, v ¼ 7u1 5u2 þ 4u3. (b) Let A ¼ 1 3 2 2 4 1 3 1 2 2 4 3 5, which may be viewed as a linear operator on R3 . Find the matrix B that represents A relative to the basis S. 202 CHAPTER 6 Linear Mappings and Matrices
  • 210. The definition of the matrix representation of A relative to the basis S tells us to write each of Aðu1Þ, Aðu2Þ, Aðu3Þ as a linear combination of the basis vectors u1; u2; u3 of S. This yields Aðu1Þ ¼ ð1; 3; 5Þ ¼ 11u1 5u2 þ 6u3 Aðu2Þ ¼ ð1; 2; 9Þ ¼ 21u1 14u2 þ 8u3 Aðu3Þ ¼ ð3; 4; 5Þ ¼ 17u1 8e2 þ 2u3 and hence; B ¼ 11 21 17 5 14 8 6 8 2 2 6 4 3 7 5 We emphasize that to find B, we need to solve three 3 3 systems of linear equations—one 3 3 system for each of Aðu1Þ, Aðu2Þ, Aðu3Þ. On the other hand, because we know P and P1 , we can use Theorem 6.7. That is, B ¼ P1 AP ¼ 2 2 3 2 1 2 1 0 1 2 4 3 5 1 3 2 2 4 1 3 1 2 2 4 3 5 1 2 1 0 1 2 1 2 2 2 4 3 5 ¼ 11 21 17 5 14 8 6 8 2 2 4 3 5 This, as expected, gives the same result. 6.4 Similarity Suppose A and B are square matrices for which there exists an invertible matrix P such that B ¼ P1 AP; then B is said to be similar to A, or B is said to be obtained from A by a similarity transformation. We show (Problem 6.29) that similarity of matrices is an equivalence relation. By Theorem 6.7 and the above remark, we have the following basic result. THEOREM 6.8: Two matrices represent the same linear operator if and only if the matrices are similar. That is, all the matrix representations of a linear operator T form an equivalence class of similar matrices. A linear operator T is said to be diagonalizable if there exists a basis S of V such that T is represented by a diagonal matrix; the basis S is then said to diagonalize T. The preceding theorem gives us the following result. THEOREM 6.9: Let A be the matrix representation of a linear operator T. Then T is diagonalizable if and only if there exists an invertible matrix P such that P1 AP is a diagonal matrix. That is, T is diagonalizable if and only if its matrix representation can be diagonalized by a similarity transformation. We emphasize that not every operator is diagonalizable. However, we will show (Chapter 10) that every linear operator can be represented by certain ‘‘standard’’ matrices called its normal or canonical forms. Such a discussion will require some theory of fields, polynomials, and determinants. Functions and Similar Matrices Suppose f is a function on square matrices that assigns the same value to similar matrices; that is, f ðAÞ ¼ f ðBÞ whenever A is similar to B. Then f induces a function, also denoted by f , on linear operators T in the following natural way. We define f ðTÞ ¼ f ð½TSÞ where S is any basis. By Theorem 6.8, the function is well defined. The determinant (Chapter 8) is perhaps the most important example of such a function. The trace (Section 2.7) is another important example of such a function. CHAPTER 6 Linear Mappings and Matrices 203
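Theorem 6.7 and the similarity invariants just described can be checked in one short run. The sketch below uses the operator and bases of Example 6.1, whose two representations reappear in Example 6.8: conjugation by the change-of-basis matrix reproduces the second representation, and the determinant and trace are unchanged.

import numpy as np

A = np.array([[2.0, 3.0],
              [4.0, -5.0]])               # [F]_E for F(x, y) = (2x + 3y, 4x - 5y)
P = np.array([[1.0, 2.0],
              [2.0, 5.0]])                # change of basis from E to S = {(1, 2), (2, 5)}

B = np.linalg.inv(P) @ A @ P              # Theorem 6.7: [F]_S = P^{-1} [F]_E P
print(B)                                  # [[ 52. 129.] [-22. -55.]], as in Example 6.1(a)

# Similar matrices share determinant and trace (Section 6.4 / Example 6.8).
print(np.linalg.det(A), np.linalg.det(B)) # both -22 (up to rounding)
print(np.trace(A), np.trace(B))           # both -3.0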
  • 211. EXAMPLE 6.8 Consider the following linear operator F and bases E and S of R2 : Fðx; yÞ ¼ ð2x þ 3y; 4x 5yÞ; E ¼ fð1; 0Þ; ð0; 1Þg; S ¼ fð1; 2Þ; ð2; 5Þg By Example 6.1, the matrix representations of F relative to the bases E and S are, respectively, A ¼ 2 3 4 5 and B ¼ 52 129 22 55 Using matrix A, we have (i) Determinant of F ¼ detðAÞ ¼ 10 12 ¼ 22; (ii) Trace of F ¼ trðAÞ ¼ 2 5 ¼ 3: On the other hand, using matrix B, we have (i) Determinant of F ¼ detðBÞ ¼ 2860 þ 2838 ¼ 22; (ii) Trace of F ¼ trðBÞ ¼ 52 55 ¼ 3. As expected, both matrices yield the same result. 6.5 Matrices and General Linear Mappings Last, we consider the general case of linear mappings from one vector space into another. Suppose V and U are vector spaces over the same field K and, say, dim V ¼ m and dim U ¼ n. Furthermore, suppose S ¼ fv1; v2; . . . ; vmg and S0 ¼ fu1; u2; . . . ; ung are arbitrary but fixed bases, respectively, of V and U. Suppose F: V ! U is a linear mapping. Then the vectors Fðv1Þ, Fðv2Þ; . . . ; FðvmÞ belong to U, and so each is a linear combination of the basis vectors in S0 ; say, Fðv1Þ ¼ a11u1 þ a12u2 þ þ a1nun Fðv2Þ ¼ a21u1 þ a22u2 þ þ a2nun ::::::::::::::::::::::::::::::::::::::::::::::::::::::: FðvmÞ ¼ am1u1 þ am2u2 þ þ amnun DEFINITION: The transpose of the above matrix of coefficients, denoted by mS;S0 ðFÞ or ½FS;S0 , is called the matrix representation of F relative to the bases S and S0 . [We will use the simple notation mðFÞ and ½F when the bases are understood.] The following theorem is analogous to Theorem 6.1 for linear operators (Problem 6.67). THEOREM 6.10: For any vector v 2 V, ½FS;S0 ½vS ¼ ½FðvÞS0 . That is, multiplying the coordinates of v in the basis S of V by ½F, we obtain the coordinates of FðvÞ in the basis S0 of U. Recall that for any vector spaces V and U, the collection of all linear mappings from V into U is a vector space and is denoted by HomðV; UÞ. The following theorem is analogous to Theorem 6.2 for linear operators, where now we let M ¼ Mm;n denote the vector space of all m n matrices (Problem 6.67). THEOREM 6.11: The mapping m: HomðV; UÞ ! M defined by mðFÞ ¼ ½F is a vector space isomorphism. That is, for any F; G 2 HomðV; UÞ and any scalar k, (i) mðF þ GÞ ¼ mðFÞ þ mðGÞ or ½F þ G ¼ ½F þ ½G (ii) mðkFÞ ¼ kmðFÞ or ½kF ¼ k½F (iii) m is bijective (one-to-one and onto). 204 CHAPTER 6 Linear Mappings and Matrices
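For a map between different spaces the same recipe applies, except that the two sides carry different coordinate systems: column j of [F]_{S,S'} is found by solving a system in the basis S' of the target space. The data in the sketch below (the map F and both bases) are hypothetical and serve only to illustrate the definition and Theorem 6.10.

import numpy as np

# A hypothetical linear map F: R^3 -> R^2 given by its matrix in the usual bases,
# together with hypothetical bases S of R^3 and S' of R^2 (illustration only).
F_std = np.array([[2.0, 1.0, -1.0],
                  [1.0, 0.0,  3.0]])
S  = np.array([[1.0, 0.0, 1.0],
               [1.0, 1.0, 0.0],
               [0.0, 1.0, 1.0]])          # columns v1, v2, v3 (a basis of R^3)
Sp = np.array([[1.0, 1.0],
               [1.0, 2.0]])               # columns u1, u2 (a basis of R^2)

# Column j of [F]_{S,S'} holds the S'-coordinates of F(v_j).
F_rep = np.linalg.solve(Sp, F_std @ S)    # a 2 x 3 matrix

# Theorem 6.10: [F]_{S,S'} [v]_S = [F(v)]_{S'} for every v.
v = np.array([3.0, -1.0, 2.0])
v_S   = np.linalg.solve(S, v)
Fv_Sp = np.linalg.solve(Sp, F_std @ v)
print(np.allclose(F_rep @ v_S, Fv_Sp))    # True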
  • 212. Our next theorem is analogous to Theorem 6.3 for linear operators (Problem 6.67). THEOREM 6.12: Let S; S0 ; S00 be bases of vector spaces V; U; W, respectively. Let F: V ! U and G U ! W be linear mappings. Then ½G FS;S00 ¼ ½GS0;S00 ½FS;S0 That is, relative to the appropriate bases, the matrix representation of the composition of two mappings is the matrix product of the matrix representations of the individual mappings. Next we show how the matrix representation of a linear mapping F: V ! U is affected when new bases are selected (Problem 6.67). THEOREM 6.13: Let P be the change-of-basis matrix from a basis e to a basis e0 in V, and let Q be the change-of-basis matrix from a basis f to a basis f 0 in U. Then, for any linear map F: V ! U, ½Fe0; f 0 ¼ Q1 ½Fe; f P In other words, if A is the matrix representation of a linear mapping F relative to the bases e and f , and B is the matrix representation of F relative to the bases e0 and f 0 , then B ¼ Q1 AP Our last theorem, proved in Problem 6.36, shows that any linear mapping from one vector space V into another vector space U can be represented by a very simple matrix. We note that this theorem is analogous to Theorem 3.18 for m n matrices. THEOREM 6.14: Let F: V ! U be linear and, say, rankðFÞ ¼ r. Then there exist bases of V and U such that the matrix representation of F has the form A ¼ Ir 0 0 0 where Ir is the r-square identity matrix. The above matrix A is called the normal or canonical form of the linear map F. SOLVED PROBLEMS Matrix Representation of Linear Operators 6.1. Consider the linear mapping F: R2 ! R2 defined by Fðx; yÞ ¼ ð3x þ 4y; 2x 5yÞ and the following bases of R2 : E ¼ fe1; e2g ¼ fð1; 0Þ; ð0; 1Þg and S ¼ fu1; u2g ¼ fð1; 2Þ; ð2; 3Þg (a) Find the matrix A representing F relative to the basis E. (b) Find the matrix B representing F relative to the basis S. (a) Because E is the usual basis, the rows of A are simply the coefficients in the components of Fðx; yÞ; that is, using ða; bÞ ¼ ae1 þ be2, we have Fðe1Þ ¼ Fð1; 0Þ ¼ ð3; 2Þ ¼ 3e1 þ 2e2 Fðe2Þ ¼ Fð0; 1Þ ¼ ð4; 5Þ ¼ 4e1 5e2 and so A ¼ 3 4 2 5 Note that the coefficients of the basis vectors are written as columns in the matrix representation. CHAPTER 6 Linear Mappings and Matrices 205
  • 213. (b) First find Fðu1Þ and write it as a linear combination of the basis vectors u1 and u2. We have Fðu1Þ ¼ Fð1; 2Þ ¼ ð11; 8Þ ¼ xð1; 2Þ þ yð2; 3Þ; and so x þ 2y ¼ 11 2x þ 3y ¼ 8 Solve the system to obtain x ¼ 49, y ¼ 30. Therefore, Fðu1Þ ¼ 49u1 þ 30u2 Next find Fðu2Þ and write it as a linear combination of the basis vectors u1 and u2. We have Fðu2Þ ¼ Fð2; 3Þ ¼ ð18; 11Þ ¼ xð1; 2Þ þ yð2; 3Þ; and so x þ 2y ¼ 18 2x þ 3y ¼ 11 Solve for x and y to obtain x ¼ 76, y ¼ 47. Hence, Fðu2Þ ¼ 76u1 þ 47u2 Write the coefficients of u1 and u2 as columns to obtain B ¼ 49 76 30 47 (b0 ) Alternatively, one can first find the coordinates of an arbitrary vector ða; bÞ in R2 relative to the basis S. We have ða; bÞ ¼ xð1; 2Þ þ yð2; 3Þ ¼ ðx þ 2y; 2x þ 3yÞ; and so x þ 2y ¼ a 2x þ 3y ¼ b Solve for x and y in terms of a and b to get x ¼ 3a þ 2b, y ¼ 2a b. Thus, ða; bÞ ¼ ð3a þ 2bÞu1 þ ð2a bÞu2 Then use the formula for ða; bÞ to find the coordinates of Fðu1Þ and Fðu2Þ relative to S: Fðu1Þ ¼ Fð1; 2Þ ¼ ð11; 8Þ ¼ 49u1 þ 30u2 Fðu2Þ ¼ Fð2; 3Þ ¼ ð18; 11Þ ¼ 76u1 þ 47u2 and so B ¼ 49 76 30 47 6.2. Consider the following linear operator G on R2 and basis S: Gðx; yÞ ¼ ð2x 7y; 4x þ 3yÞ and S ¼ fu1; u2g ¼ fð1; 3Þ; ð2; 5Þg (a) Find the matrix representation ½GS of G relative to S. (b) Verify ½GS½vS ¼ ½GðvÞS for the vector v ¼ ð4; 3Þ in R2 . First find the coordinates of an arbitrary vector v ¼ ða; bÞ in R2 relative to the basis S. We have a b ¼ x 1 3 þ y 2 5 ; and so x þ 2y ¼ a 3x þ 5y ¼ b Solve for x and y in terms of a and b to get x ¼ 5a þ 2b, y ¼ 3a b. Thus, ða; bÞ ¼ ð5a þ 2bÞu1 þ ð3a bÞu2; and so ½v ¼ ½5a þ 2b; 3a bT (a) Using the formula for ða; bÞ and Gðx; yÞ ¼ ð2x 7y; 4x þ 3yÞ, we have Gðu1Þ ¼ Gð1; 3Þ ¼ ð19; 13Þ ¼ 121u1 70u2 Gðu2Þ ¼ Gð2; 5Þ ¼ ð31; 23Þ ¼ 201u1 116u2 and so ½GS ¼ 121 201 70 116 (We emphasize that the coefficients of u1 and u2 are written as columns, not rows, in the matrix representation.) (b) Use the formula ða; bÞ ¼ ð5a þ 2bÞu1 þ ð3a bÞu2 to get v ¼ ð4; 3Þ ¼ 26u1 þ 15u2 GðvÞ ¼ Gð4; 3Þ ¼ ð20; 7Þ ¼ 131u1 þ 80u2 Then ½vS ¼ ½26; 15T and ½GðvÞS ¼ ½131; 80T 206 CHAPTER 6 Linear Mappings and Matrices
  • 214. Accordingly, ½GS½vS ¼ 121 201 70 116 26 15 ¼ 131 80 ¼ ½GðvÞS (This is expected from Theorem 6.1.) 6.3. Consider the following 2 2 matrix A and basis S of R2 : A ¼ 2 4 5 6 and S ¼ fu1; u2g ¼ 1 2 ; 3 7 The matrix A defines a linear operator on R2 . Find the matrix B that represents the mapping A relative to the basis S. First find the coordinates of an arbitrary vector ða; bÞT with respect to the basis S. We have a b ¼ x 1 2 þ y 3 7 or x þ 3y ¼ a 2x 7y ¼ b Solve for x and y in terms of a and b to obtain x ¼ 7a þ 3b, y ¼ 2a b. Thus, ða; bÞT ¼ ð7a þ 3bÞu1 þ ð2a bÞu2 Then use the formula for ða; bÞT to find the coordinates of Au1 and Au2 relative to the basis S: Au1 ¼ 2 4 5 6 1 2 ¼ 6 7 ¼ 63u1 þ 19u2 Au2 ¼ 2 4 5 6 3 7 ¼ 22 27 ¼ 235u1 þ 71u2 Writing the coordinates as columns yields B ¼ 63 235 19 71 6.4. Find the matrix representation of each of the following linear operators F on R3 relative to the usual basis E ¼ fe1; e2; e3g of R3 ; that is, find ½F ¼ ½FE: (a) F defined by Fðx; y; zÞ ¼ ðx þ 2y 3z; 4x 5y 6z; 7x þ 8y þ 9z). (b) F defined by the 3 3 matrix A ¼ 1 1 1 2 3 4 5 5 5 2 4 3 5. (c) F defined by Fðe1Þ ¼ ð1; 3; 5Þ; Fðe2Þ ¼ ð2; 4; 6Þ, Fðe3Þ ¼ ð7; 7; 7Þ. (Theorem 5.2 states that a linear map is completely defined by its action on the vectors in a basis.) (a) Because E is the usual basis, simply write the coefficients of the components of Fðx; y; zÞ as rows: ½F ¼ 1 2 3 4 5 6 7 8 9 2 4 3 5 (b) Because E is the usual basis, ½F ¼ A, the matrix A itself. (c) Here Fðe1Þ ¼ ð1; 3; 5Þ ¼ e1 þ 3e2 þ 5e3 Fðe2Þ ¼ ð2; 4; 6Þ ¼ 2e1 þ 4e2 þ 6e3 Fðe3Þ ¼ ð7; 7; 7Þ ¼ 7e1 þ 7e2 þ 7e3 and so ½F ¼ 1 2 7 3 4 7 5 6 7 2 4 3 5 That is, the columns of ½F are the images of the usual basis vectors. 6.5. Let G be the linear operator on R3 defined by Gðx; y; zÞ ¼ ð2y þ z; x 4y; 3xÞ. (a) Find the matrix representation of G relative to the basis S ¼ fw1; w2; w3g ¼ fð1; 1; 1Þ; ð1; 1; 0Þ; ð1; 0; 0Þg (b) Verify that ½G½v ¼ ½GðvÞ for any vector v in R3 . CHAPTER 6 Linear Mappings and Matrices 207
  • 215. First find the coordinates of an arbitrary vector ða; b; cÞ 2 R3 with respect to the basis S. Write ða; b; cÞ as a linear combination of w1; w2; w3 using unknown scalars x; y, and z: ða; b; cÞ ¼ xð1; 1; 1Þ þ yð1; 1; 0Þ þ zð1; 0; 0Þ ¼ ðx þ y þ z; x þ y; xÞ Set corresponding components equal to each other to obtain the system of equations x þ y þ z ¼ a; x þ y ¼ b; x ¼ c Solve the system for x; y, z in terms of a; b, c to find x ¼ c, y ¼ b c, z ¼ a b. Thus, ða; b; cÞ ¼ cw1 þ ðb cÞw2 þ ða bÞw3, or equivalently, ½ða; b; cÞ ¼ ½c; b c; a bT (a) Because Gðx; y; zÞ ¼ ð2y þ z; x 4y; 3xÞ, Gðw1Þ ¼ Gð1; 1; 1Þ ¼ ð3; 3; 3Þ ¼ 3w1 6x2 þ 6x3 Gðw2Þ ¼ Gð1; 1; 0Þ ¼ ð2; 3; 3Þ ¼ 3w1 6w2 þ 5w3 Gðw3Þ ¼ Gð1; 0; 0Þ ¼ ð0; 1; 3Þ ¼ 3w1 2w2 w3 Write the coordinates Gðw1Þ, Gðw2Þ, Gðw3Þ as columns to get ½G ¼ 3 3 3 6 6 2 6 5 1 2 4 3 5 (b) Write GðvÞ as a linear combination of w1; w2; w3, where v ¼ ða; b; cÞ is an arbitrary vector in R3 , GðvÞ ¼ Gða; b; cÞ ¼ ð2b þ c; a 4b; 3aÞ ¼ 3aw1 þ ð2a 4bÞw2 þ ða þ 6b þ cÞw3 or equivalently, ½GðvÞ ¼ ½3a; 2a 4b; a þ 6b þ cT Accordingly, ½G½v ¼ 3 3 3 6 6 2 6 5 1 2 4 3 5 c b c a b 2 4 3 5 ¼ 3a 2a 4b a þ 6b þ c 2 4 3 5 ¼ ½GðvÞ 6.6. Consider the following 3 3 matrix A and basis S of R3 : A ¼ 1 2 1 3 1 0 1 4 2 2 4 3 5 and S ¼ fu1; u2; u3g ¼ 1 1 1 2 4 3 5; 0 1 1 2 4 3 5; 1 2 3 2 4 3 5 8 : 9 = ; The matrix A defines a linear operator on R3 . Find the matrix B that represents the mapping A relative to the basis S. (Recall that A represents itself relative to the usual basis of R3 .) First find the coordinates of an arbitrary vector ða; b; cÞ in R3 with respect to the basis S. We have a b c 2 4 3 5 ¼ x 1 1 1 2 4 3 5 þ y 0 1 1 2 4 3 5 þ z 1 2 3 2 4 3 5 or x þ z ¼ a x þ y þ 2z ¼ b x þ y þ 3z ¼ c Solve for x; y; z in terms of a; b; c to get x ¼ a þ b c; y ¼ a þ 2b c; z ¼ c b thus; ða; b; cÞT ¼ ða þ b cÞu1 þ ða þ 2b cÞu2 þ ðc bÞu3 208 CHAPTER 6 Linear Mappings and Matrices
  • 216. Then use the formula for ða; b; cÞT to find the coordinates of Au1, Au2, Au3 relative to the basis S: Aðu1Þ ¼ Að1; 1; 1ÞT ¼ ð0; 2; 3ÞT ¼ u1 þ u2 þ u3 Aðu2Þ ¼ Að1; 1; 0ÞT ¼ ð1; 1; 2ÞT ¼ 4u1 3u2 þ 3u3 Aðu3Þ ¼ Að1; 2; 3ÞT ¼ ð0; 1; 3ÞT ¼ 2u1 u2 þ 2u3 so B ¼ 1 4 2 1 3 1 1 3 2 2 4 3 5 6.7. For each of the following linear transformations (operators) L on R2 , find the matrix A that represents L (relative to the usual basis of R2 ): (a) L is defined by Lð1; 0Þ ¼ ð2; 4Þ and Lð0; 1Þ ¼ ð5; 8Þ. (b) L is the rotation in R2 counterclockwise by 90 . (c) L is the reflection in R2 about the line y ¼ x. (a) Because fð1; 0Þ; ð0; 1Þg is the usual basis of R2 , write their images under L as columns to get A ¼ 2 5 4 8 (b) Under the rotation L, we have Lð1; 0Þ ¼ ð0; 1Þ and Lð0; 1Þ ¼ ð1; 0Þ. Thus, A ¼ 0 1 1 0 (c) Under the reflection L, we have Lð1; 0Þ ¼ ð0; 1Þ and Lð0; 1Þ ¼ ð1; 0Þ. Thus, A ¼ 0 1 1 0 6.8. The set S ¼ fe3t , te3t , t2 e3t g is a basis of a vector space V of functions f : R ! R. Let D be the differential operator on V; that is, Dð f Þ ¼ df =dt. Find the matrix representation of D relative to the basis S. Find the image of each basis function: Dðe3t Þ ¼ 3e3t Dðte3t Þ ¼ e3t þ 3te3t Dðt2 e3t Þ ¼ 2te3t þ 3t2 e3t ¼ 3ðe3t Þ þ 0ðte3t Þ þ 0ðt2 e3t Þ ¼ 1ðe3t Þ þ 3ðte3t Þ þ 0ðt2 e3t Þ ¼ 0ðe3t Þ þ 2ðte3t Þ þ 3ðt2 e3t Þ and thus; ½D ¼ 3 1 0 0 3 2 0 0 3 2 4 3 5 6.9. Prove Theorem 6.1: Let T: V ! V be a linear operator, and let S be a (finite) basis of V. Then, for any vector v in V, ½TS½vS ¼ ½TðvÞS. Suppose S ¼ fu1; u2; . . . ; ung, and suppose, for i ¼ 1; . . . ; n, TðuiÞ ¼ ai1u1 þ ai2u2 þ þ ainun ¼ P n j¼1 aijuj Then ½TS is the n-square matrix whose jth row is ða1j; a2j; . . . ; anjÞ ð1Þ Now suppose v ¼ k1u1 þ k2u2 þ þ knun ¼ P n i¼1 kiui Writing a column vector as the transpose of a row vector, we have ½vS ¼ ½k1; k2; . . . ; knT ð2Þ CHAPTER 6 Linear Mappings and Matrices 209
  • 217. Furthermore, using the linearity of T, TðvÞ ¼ T P n i¼1 kiui ¼ P n i¼1 kiTðuiÞ ¼ P n i¼1 ki P n j¼1 aijuj ¼ P n j¼1 P n i¼1 aijki uj ¼ P n j¼1 ða1jk1 þ a2jk2 þ þ anjknÞuj Thus, ½TðvÞS is the column vector whose jth entry is a1jk1 þ a2jk2 þ þ anjkn ð3Þ On the other hand, the jth entry of ½TS½vS is obtained by multiplying the jth row of ½TS by ½vS—that is (1) by (2). But the product of (1) and (2) is (3). Hence, ½TS½vS and ½TðvÞS have the same entries. Thus, ½TS½vS ¼ ½TðvÞS. 6.10. Prove Theorem 6.2: Let S ¼ fu1; u2; . . . ; ung be a basis for V over K, and let M be the algebra of n-square matrices over K. Then the mapping m: AðVÞ ! M defined by mðTÞ ¼ ½TS is a vector space isomorphism. That is, for any F; G 2 AðVÞ and any k 2 K, we have (i) ½F þ G ¼ ½F þ ½G, (ii) ½kF ¼ k½F, (iii) m is one-to-one and onto. (i) Suppose, for i ¼ 1; . . . ; n, FðuiÞ ¼ P n j¼1 aijuj and GðuiÞ ¼ P n j¼1 bijuj Consider the matrices A ¼ ½aij and B ¼ ½bij. Then ½F ¼ AT and ½G ¼ BT . We have, for i ¼ 1; . . . ; n, ðF þ GÞðuiÞ ¼ FðuiÞ þ GðuiÞ ¼ P n j¼1 ðaij þ bijÞuj Because A þ B is the matrix ðaij þ bijÞ, we have ½F þ G ¼ ðA þ BÞT ¼ AT þ BT ¼ ½F þ ½G (ii) Also, for i ¼ 1; . . . ; n; ðkFÞðuiÞ ¼ kFðuiÞ ¼ k P n j¼1 aijuj ¼ P n j¼1 ðkaijÞuj Because kA is the matrix ðkaijÞ, we have ½kF ¼ ðkAÞT ¼ kAT ¼ k½F (iii) Finally, m is one-to-one, because a linear mapping is completely determined by its values on a basis. Also, m is onto, because matrix A ¼ ½aij in M is the image of the linear operator, FðuiÞ ¼ P n j¼1 aijuj; i ¼ 1; . . . ; n Thus, the theorem is proved. 6.11. Prove Theorem 6.3: For any linear operators G; F 2 AðVÞ, ½G F ¼ ½G½F. Using the notation in Problem 6.10, we have ðG FÞðuiÞ ¼ GðFðuiÞÞ ¼ G P n j¼1 aijuj ¼ P n j¼1 aijGðujÞ ¼ P n j¼1 aij P n k¼1 bjkuk ¼ P n k¼1 P n j¼1 aijbjk uk Recall that AB is the matrix AB ¼ ½cik, where cik ¼ Pn j¼1 aijbjk. Accordingly, ½G F ¼ ðABÞT ¼ BT AT ¼ ½G½F The theorem is proved. 210 CHAPTER 6 Linear Mappings and Matrices
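Remark. The identities established in Problems 6.10 and 6.11 can be confirmed numerically for any particular operators and basis. The sketch below uses the operator G and basis S of Problem 6.2 together with an arbitrary second operator H, which is an illustration only and not taken from the text.

import numpy as np

def rep(T_std, U):
    """[T]_S for an operator on R^n given by its usual-basis matrix T_std;
    the columns of U are the vectors of the basis S."""
    return np.linalg.solve(U, T_std @ U)

U = np.array([[1.0, 2.0],
              [3.0, 5.0]])                 # basis S = {(1, 3), (2, 5)} of Problem 6.2
G = np.array([[2.0, -7.0],
              [4.0,  3.0]])                # G(x, y) = (2x - 7y, 4x + 3y)
H = np.array([[1.0, 1.0],
              [0.0, 2.0]])                 # an arbitrary second operator (illustration)

print(rep(G, U))                            # [[ 121.  201.] [ -70. -116.]], as in Problem 6.2
print(np.allclose(rep(G + H, U), rep(G, U) + rep(H, U)))   # Theorem 6.2(i): True
print(np.allclose(rep(3 * G, U), 3 * rep(G, U)))           # Theorem 6.2(ii): True
print(np.allclose(rep(H @ G, U), rep(H, U) @ rep(G, U)))   # Theorem 6.3: True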
  • 218. 6.12. Let A be the matrix representation of a linear operator T. Prove that, for any polynomial f ðtÞ, we have that f ðAÞ is the matrix representation of f ðTÞ. [Thus, f ðTÞ ¼ 0 if and only if f ðAÞ ¼ 0.] Let f be the mapping that sends an operator T into its matrix representation A. We need to prove that fð f ðTÞÞ ¼ f ðAÞ. Suppose f ðtÞ ¼ antn þ þ a1t þ a0. The proof is by induction on n, the degree of f ðtÞ. Suppose n ¼ 0. Recall that fðI0 Þ ¼ I, where I0 is the identity mapping and I is the identity matrix. Thus, fð f ðTÞÞ ¼ fða0I0 Þ ¼ a0fðI0 Þ ¼ a0I ¼ f ðAÞ and so the theorem holds for n ¼ 0. Now assume the theorem holds for polynomials of degree less than n. Then, because f is an algebra isomorphism, fð f ðTÞÞ ¼ fðanTn þ an1Tn1 þ þ a1T þ a0I0 Þ ¼ anfðTÞfðTn1 Þ þ fðan1Tn1 þ þ a1T þ a0I0 Þ ¼ anAAn1 þ ðan1An1 þ þ a1A þ a0IÞ ¼ f ðAÞ and the theorem is proved. Change of Basis The coordinate vector ½vS in this section will always denote a column vector; that is, ½vS ¼ ½a1; a2; . . . ; anT 6.13. Consider the following bases of R2 : E ¼ fe1; e2g ¼ fð1; 0Þ; ð0; 1Þg and S ¼ fu1; u2g ¼ fð1; 3Þ; ð1; 4Þg (a) Find the change-of-basis matrix P from the usual basis E to S. (b) Find the change-of-basis matrix Q from S back to E. (c) Find the coordinate vector ½v of v ¼ ð5; 3Þ relative to S. (a) Because E is the usual basis, simply write the basis vectors in S as columns: P ¼ 1 1 3 4 (b) Method 1. Use the definition of the change-of-basis matrix. That is, express each vector in E as a linear combination of the vectors in S. We do this by first finding the coordinates of an arbitrary vector v ¼ ða; bÞ relative to S. We have ða; bÞ ¼ xð1; 3Þ þ yð1; 4Þ ¼ ðx þ y; 3x þ 4yÞ or x þ y ¼ a 3x þ 4y ¼ b Solve for x and y to obtain x ¼ 4a b, y ¼ 3a þ b. Thus, v ¼ ð4a bÞu1 þ ð3a þ bÞu2 and ½vS ¼ ½ða; bÞS ¼ ½4a b; 3a þ bT Using the above formula for ½vS and writing the coordinates of the ei as columns yields e1 ¼ ð1; 0Þ ¼ 4u1 3u2 e2 ¼ ð0; 1Þ ¼ u1 þ u2 and Q ¼ 4 1 3 1 Method 2. Because Q ¼ P1 ; find P1 , say by using the formula for the inverse of a 2 2 matrix. Thus, P1 ¼ 4 1 3 1 (c) Method 1. Write v as a linear combination of the vectors in S, say by using the above formula for v ¼ ða; bÞ. We have v ¼ ð5; 3Þ ¼ 23u1 18u2; and so ½vS ¼ ½23; 18T . Method 2. Use, from Theorem 6.6, the fact that ½vS ¼ P1 ½vE and the fact that ½vE ¼ ½5; 3T : ½vS ¼ P1 ½vE ¼ 4 1 3 1 5 3 ¼ 23 18 CHAPTER 6 Linear Mappings and Matrices 211
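Remark. A quick machine check of Problem 6.13, taking v = (5, -3) (the minus sign drops out in this copy), with the change-of-basis matrix built by placing the vectors of S as columns:

import numpy as np

# Problem 6.13: E is the usual basis of R^2 and S = {(1, 3), (1, 4)}.
P = np.array([[1.0, 1.0],
              [3.0, 4.0]])                 # columns of P are the vectors of S
Q = np.linalg.inv(P)                       # change of basis from S back to E
print(Q)                                   # [[ 4. -1.] [-3.  1.]], as in part (b)

v = np.array([5.0, -3.0])                  # v = (5, -3)
print(Q @ v)                               # [ 23. -18.] = [v]_S, as found in part (c)
print(np.allclose(P @ (Q @ v), v))         # True: P[v]_S = [v]_E (Theorem 6.6)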
  • 219. 6.14. The vectors u1 ¼ ð1; 2; 0Þ, u2 ¼ ð1; 3; 2Þ, u3 ¼ ð0; 1; 3Þ form a basis S of R3 . Find (a) The change-of-basis matrix P from the usual basis E ¼ fe1; e2; e3g to S. (b) The change-of-basis matrix Q from S back to E. (a) Because E is the usual basis, simply write the basis vectors of S as columns: P ¼ 1 1 0 2 3 1 0 2 3 2 4 3 5 (b) Method 1. Express each basis vector of E as a linear combination of the basis vectors of S by first finding the coordinates of an arbitrary vector v ¼ ða; b; cÞ relative to the basis S. We have a b c 2 4 3 5 ¼ x 1 2 0 2 4 3 5 þ y 1 3 2 2 4 3 5 þ z 0 1 3 2 4 3 5 or x þ y ¼ a 2x þ 3y þ z ¼ b 2y þ 3z ¼ c Solve for x; y; z to get x ¼ 7a 3b þ c, y ¼ 6a þ 3b c, z ¼ 4a 2b þ c. Thus, v ¼ ða; b; cÞ ¼ ð7a 3b þ cÞu1 þ ð6a þ 3b cÞu2 þ ð4a 2b þ cÞu3 or ½vS ¼ ½ða; b; cÞS ¼ ½7a 3b þ c; 6a þ 3b c; 4a 2b þ cT Using the above formula for ½vS and then writing the coordinates of the ei as columns yields e1 ¼ ð1; 0; 0Þ ¼ 7u1 6u2 þ 4u3 e2 ¼ ð0; 1; 0Þ ¼ 3u1 þ 3u2 2u3 e3 ¼ ð0; 0; 1Þ ¼ u1 u2 þ u3 and Q ¼ 7 3 1 6 3 1 4 2 1 2 4 3 5 Method 2. Find P1 by row reducing M ¼ ½P; I to the form ½I; P1 : M ¼ 1 1 0 1 0 0 2 3 1 0 1 0 0 2 3 0 0 1 2 6 4 3 7 5 1 1 0 1 0 0 0 1 1 2 1 0 0 2 3 0 0 1 2 6 4 3 7 5 1 1 0 1 0 0 0 1 1 2 1 0 0 0 1 4 2 1 2 6 4 3 7 5 1 0 0 7 3 1 0 1 0 6 3 1 0 0 1 4 2 1 2 6 4 3 7 5 ¼ ½I; P1 Thus, Q ¼ P1 ¼ 7 3 1 6 3 1 4 2 1 2 4 3 5. 6.15. Suppose the x-axis and y-axis in the plane R2 are rotated counterclockwise 45 so that the new x 0 -axis and y 0 -axis are along the line y ¼ x and the line y ¼ x, respectively. (a) Find the change-of-basis matrix P. (b) Find the coordinates of the point Að5; 6Þ under the given rotation. (a) The unit vectors in the direction of the new x 0 - and y 0 -axes are u1 ¼ ð1 2 ffiffiffi 2 p ; 1 2 ffiffiffi 2 p Þ and u2 ¼ ð 1 2 ffiffiffi 2 p ; 1 2 ffiffiffi 2 p Þ (The unit vectors in the direction of the original x and y axes are the usual basis of R2 .) Thus, write the coordinates of u1 and u2 as columns to obtain P ¼ 1 2 ffiffiffi 2 p 1 2 ffiffiffi 2 p 1 2 ffiffiffi 2 p 1 2 ffiffiffi 2 p # (b) Multiply the coordinates of the point by P1 : 1 2 ffiffiffi 2 p 1 2 ffiffiffi 2 p 1 2 ffiffiffi 2 p 1 2 ffiffiffi 2 p # 5 6 ¼ 11 2 ffiffiffi 2 p 1 2 ffiffiffi 2 p # (Because P is orthogonal, P1 is simply the transpose of P.) 212 CHAPTER 6 Linear Mappings and Matrices
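Remark. Problem 6.15 illustrates a special case worth flagging: when the new basis consists of orthonormal vectors, as it does for rotated axes, the change-of-basis matrix is orthogonal, so its inverse is simply its transpose. A short numerical confirmation:

import numpy as np

# Problem 6.15: the new axes come from rotating the old ones by 45 degrees,
# so the new unit vectors are u1 = (sqrt(2)/2, sqrt(2)/2) and u2 = (-sqrt(2)/2, sqrt(2)/2).
r = np.sqrt(2) / 2
P = np.array([[r, -r],
              [r,  r]])                    # columns are u1, u2

# P is orthogonal, so its inverse is its transpose.
print(np.allclose(np.linalg.inv(P), P.T))  # True

# Coordinates of the point A(5, 6) relative to the rotated axes:
print(P.T @ np.array([5.0, 6.0]))          # [7.7781..., 0.7071...], i.e., (11*sqrt(2)/2, sqrt(2)/2)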
  • 220. 6.16. The vectors u1 ¼ ð1; 1; 0Þ, u2 ¼ ð0; 1; 1Þ, u3 ¼ ð1; 2; 2Þ form a basis S of R3 . Find the coordinates of an arbitrary vector v ¼ ða; b; cÞ relative to the basis S. Method 1. Express v as a linear combination of u1; u2; u3 using unknowns x; y; z. We have ða; b; cÞ ¼ xð1; 1; 0Þ þ yð0; 1; 1Þ þ zð1; 2; 2Þ ¼ ðx þ z; x þ y þ 2z; y þ 2zÞ this yields the system x þ z ¼ a x þ y þ 2z ¼ b y þ 2z ¼ c or x þ z ¼ a y þ z ¼ a þ b y þ 2z ¼ c or x þ z ¼ a y þ z ¼ a þ b z ¼ a b þ c Solving by back-substitution yields x ¼ b c, y ¼ 2a þ 2b c, z ¼ a b þ c. Thus, ½vS ¼ ½b c; 2a þ 2b c; a b þ cT Method 2. Find P1 by row reducing M ¼ ½P; I to the form ½I; P1 , where P is the change-of-basis matrix from the usual basis E to S or, in other words, the matrix whose columns are the basis vectors of S. We have M ¼ 1 0 1 1 0 0 1 1 2 0 1 0 0 1 2 0 0 1 2 6 4 3 7 5 1 0 1 1 0 0 0 1 1 1 1 0 0 1 2 0 0 1 2 6 4 3 7 5 1 0 1 1 0 0 0 1 1 1 1 0 0 0 1 1 1 1 2 6 4 3 7 5 1 0 0 0 1 1 0 1 0 2 2 1 0 0 1 1 1 1 2 6 4 3 7 5 ¼ ½I; P1 Thus; P1 ¼ 0 1 1 2 2 1 1 1 1 2 6 4 3 7 5 and ½vS ¼ P1 ½vE ¼ 0 1 1 2 2 1 1 1 1 2 6 4 3 7 5 a b c 2 6 4 3 7 5 ¼ b c 2a þ 2b c a b þ c 2 6 4 3 7 5 6.17. Consider the following bases of R2 : S ¼ fu1; u2g ¼ fð1; 2Þ; ð3; 4Þg and S0 ¼ fv1; v2g ¼ fð1; 3Þ; ð3; 8Þg (a) Find the coordinates of v ¼ ða; bÞ relative to the basis S. (b) Find the change-of-basis matrix P from S to S0 . (c) Find the coordinates of v ¼ ða; bÞ relative to the basis S0 . (d) Find the change-of-basis matrix Q from S0 back to S. (e) Verify Q ¼ P1 . (f ) Show that, for any vector v ¼ ða; bÞ in R2 , P1 ½vS ¼ ½vS0 . (See Theorem 6.6.) (a) Let v ¼ xu1 þ yu2 for unknowns x and y; that is, a b ¼ x 1 2 þ y 3 4 or x þ 3y ¼ a 2x 4y ¼ b or x þ 3y ¼ a 2y ¼ 2a þ b Solve for x and y in terms of a and b to get x ¼ 2a 3 2 b and y ¼ a þ 1 2 b. Thus, ða; bÞ ¼ ð2a 3 2Þu1 þ ða þ 1 2 bÞu2 or ½ða; bÞS ¼ ½2a 3 2 b; a þ 1 2 bT (b) Use part (a) to write each of the basis vectors v1 and v2 of S0 as a linear combination of the basis vectors u1 and u2 of S; that is, v1 ¼ ð1; 3Þ ¼ ð2 9 2Þu1 þ ð1 þ 3 2Þu2 ¼ 13 2 u1 þ 5 2 u2 v2 ¼ ð3; 8Þ ¼ ð6 12Þu1 þ ð3 þ 4Þu2 ¼ 18u1 þ 7u2 CHAPTER 6 Linear Mappings and Matrices 213
  • 221. Then P is the matrix whose columns are the coordinates of v1 and v2 relative to the basis S; that is, P ¼ 13 2 18 5 2 7 # (c) Let v ¼ xv1 þ yv2 for unknown scalars x and y: a b ¼ x 1 3 þ y 3 8 or x þ 3y ¼ a 3x þ 8y ¼ b or x þ 3y ¼ a y ¼ b 3a Solve for x and y to get x ¼ 8a þ 3b and y ¼ 3a b. Thus, ða; bÞ ¼ ð8a þ 3bÞv1 þ ð3a bÞv2 or ½ða; bÞS0 ¼ ½8a þ 3b; 3a bT (d) Use part (c) to express each of the basis vectors u1 and u2 of S as a linear combination of the basis vectors v1 and v2 of S0 : u1 ¼ ð1; 2Þ ¼ ð8 6Þv1 þ ð3 þ 2Þv2 ¼ 14v1 þ 5v2 u2 ¼ ð3; 4Þ ¼ ð24 12Þv1 þ ð9 þ 4Þv2 ¼ 36v1 þ 13v2 Write the coordinates of u1 and u2 relative to S0 as columns to obtain Q ¼ 14 36 5 13 . (e) QP ¼ 14 36 5 13 13 2 18 5 2 7 # ¼ 1 0 0 1 ¼ I (f ) Use parts (a), (c), and (d) to obtain P1 ½vS ¼ Q½vS ¼ 14 36 5 13 2a 3 2 b a þ 1 2 b # ¼ 8a þ 3b 3a b ¼ ½vS0 6.18. Suppose P is the change-of-basis matrix from a basis fuig to a basis fwig, and suppose Q is the change-of-basis matrix from the basis fwig back to fuig. Prove that P is invertible and that Q ¼ P1 . Suppose, for i ¼ 1; 2; . . . ; n, that wi ¼ ai1u1 þ ai2u2 þ . . . þ ainun ¼ P n j¼1 aijuj ð1Þ and, for j ¼ 1; 2; . . . ; n, uj ¼ bj1w1 þ bj2w2 þ þ bjnwn ¼ P n k¼1 bjkwk ð2Þ Let A ¼ ½aij and B ¼ ½bjk. Then P ¼ AT and Q ¼ BT . Substituting (2) into (1) yields wi ¼ P n j¼1 aij P n k¼1 bjkwk ¼ P n k¼1 P n j¼1 aijbjk wk Because fwig is a basis, P aijbjk ¼ dik, where dik is the Kronecker delta; that is, dik ¼ 1 if i ¼ k but dik ¼ 0 if i 6¼ k. Suppose AB ¼ ½cik. Then cik ¼ dik. Accordingly, AB ¼ I, and so QP ¼ BT AT ¼ ðABÞT ¼ IT ¼ I Thus, Q ¼ P1 . 6.19. Consider a finite sequence of vectors S ¼ fu1; u2; . . . ; ung. Let S0 be the sequence of vectors obtained from S by one of the following ‘‘elementary operations’’: (1) Interchange two vectors. (2) Multiply a vector by a nonzero scalar. (3) Add a multiple of one vector to another vector. Show that S and S0 span the same subspace W. Also, show that S0 is linearly independent if and only if S is linearly independent. 214 CHAPTER 6 Linear Mappings and Matrices
  • 222. Observe that, for each operation, the vectors S0 are linear combinations of vectors in S. Also, because each operation has an inverse of the same type, each vector in S is a linear combination of vectors in S0 . Thus, S and S0 span the same subspace W. Moreover, S0 is linearly independent if and only if dim W ¼ n, and this is true if and only if S is linearly independent. 6.20. Let A ¼ ½aij and B ¼ ½bij be row equivalent m n matrices over a field K, and let v1; v2; . . . ; vn be any vectors in a vector space V over K. For i ¼ 1; 2; . . . ; m, let ui and wi be defined by ui ¼ ai1v1 þ ai2v2 þ þ ainvn and wi ¼ bi1v1 þ bi2v2 þ þ binvn Show that fuig and fwig span the same subspace of V. Applying an ‘‘elementary operation’’ of Problem 6.19 to fuig is equivalent to applying an elementary row operation to the matrix A. Because A and B are row equivalent, B can be obtained from A by a sequence of elementary row operations. Hence, fwig can be obtained from fuig by the corresponding sequence of operations. Accordingly, fuig and fwig span the same space. 6.21. Suppose u1; u2; . . . ; un belong to a vector space V over a field K, and suppose P ¼ ½aij is an n-square matrix over K. For i ¼ 1; 2; . . . ; n, let vi ¼ ai1u1 þ ai2u2 þ þ ainun. (a) Suppose P is invertible. Show that fuig and fvig span the same subspace of V. Hence, fuig is linearly independent if and only if fvig is linearly independent. (b) Suppose P is singular (not invertible). Show that fvig is linearly dependent. (c) Suppose fvig is linearly independent. Show that P is invertible. (a) Because P is invertible, it is row equivalent to the identity matrix I. Hence, by Problem 6.19, fvig and fuig span the same subspace of V. Thus, one is linearly independent if and only if the other is linearly independent. (b) Because P is not invertible, it is row equivalent to a matrix with a zero row. This means fvig spans a substance that has a spanning set with less than n elements. Thus, fvig is linearly dependent. (c) This is the contrapositive of the statement of part (b), and so it follows from part (b). 6.22. Prove Theorem 6.6: Let P be the change-of-basis matrix from a basis S to a basis S0 in a vector space V. Then, for any vector v 2 V, we have P½vS0 ¼ ½vS, and hence, P1 ½vS ¼ ½vS0 . Suppose S ¼ fu1; . . . ; ung and S0 ¼ fw1; . . . ; wng, and suppose, for i ¼ 1; . . . ; n, wi ¼ ai1u1 þ ai2u2 þ þ ainun ¼ P n j¼1 aijuj Then P is the n-square matrix whose jth row is ða1j; a2j; . . . ; anjÞ ð1Þ Also suppose v ¼ k1w1 þ k2w2 þ þ knwn ¼ Pn i¼1 kiwi. Then ½vS0 ¼ ½k1; k2; . . . ; knT ð2Þ Substituting for wi in the equation for v, we obtain v ¼ P n i¼1 kiwi ¼ P n i¼1 ki P n j¼1 aijuj ¼ P n j¼1 P n i¼1 aijki uj ¼ P n j¼1 ða1jk1 þ a2jk2 þ þ anjknÞuj Accordingly, ½vS is the column vector whose jth entry is a1jk1 þ a2jk2 þ þ anjkn ð3Þ On the other hand, the jth entry of P½vS0 is obtained by multiplying the jth row of P by ½vS0 —that is, (1) by (2). However, the product of (1) and (2) is (3). Hence, P½vS0 and ½vS have the same entries. Thus, P½vS0 ¼ ½vS0 , as claimed. Furthermore, multiplying the above by P1 gives P1 ½vS ¼ P1 P½vS0 ¼ ½vS0 . CHAPTER 6 Linear Mappings and Matrices 215
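Problem 6.21 invites a small experiment: form the vectors v_i = Σ_j a_ij u_j from a square matrix P and compare spans by comparing ranks. The vectors and matrices below are chosen arbitrarily for illustration; this is a sketch in NumPy, not data from the text.

```python
import numpy as np

# Rows of U are u1, u2, u3: three linearly independent vectors in R^4 (arbitrary choice).
U = np.array([[1, 0, 2, 1],
              [0, 1, 1, 3],
              [2, 1, 0, 0]], dtype=float)

P_invertible = np.array([[1, 2, 0],
                         [0, 1, 1],
                         [1, 0, 1]], dtype=float)   # det = 3, so invertible
P_singular   = np.array([[1, 2, 3],
                         [2, 4, 6],
                         [0, 1, 1]], dtype=float)   # singular: row 2 = 2 * row 1

V_good, V_bad = P_invertible @ U, P_singular @ U    # rows are the vectors v_i

# (a) invertible P: same span, so stacking the two sets does not raise the rank
assert np.linalg.matrix_rank(np.vstack([U, V_good])) == np.linalg.matrix_rank(U) == 3
# (b) singular P: the v_i are linearly dependent
assert np.linalg.matrix_rank(V_bad) < 3
```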
  • 223. Linear Operators and Change of Basis 6.23. Consider the linear transformation F on R2 defined by Fðx; yÞ ¼ ð5x y; 2x þ yÞ and the following bases of R2 : E ¼ fe1; e2g ¼ fð1; 0Þ; ð0; 1Þg and S ¼ fu1; u2g ¼ fð1; 4Þ; ð2; 7Þg (a) Find the change-of-basis matrix P from E to S and the change-of-basis matrix Q from S back to E. (b) Find the matrix A that represents F in the basis E. (c) Find the matrix B that represents F in the basis S. (a) Because E is the usual basis, simply write the vectors in S as columns to obtain the change-of-basis matrix P. Recall, also, that Q ¼ P1 . Thus, P ¼ 1 2 4 7 and Q ¼ P1 ¼ 7 2 4 1 (b) Write the coefficients of x and y in Fðx; yÞ ¼ ð5x y; 2x þ yÞ as rows to get A ¼ 5 1 2 1 (c) Method 1. Find the coordinates of Fðu1Þ and Fðu2Þ relative to the basis S. This may be done by first finding the coordinates of an arbitrary vector ða; bÞ in R2 relative to the basis S. We have ða; bÞ ¼ xð1; 4Þ þ yð2; 7Þ ¼ ðx þ 2y; 4x þ 7yÞ; and so x þ 2y ¼ a 4x þ 7y ¼ b Solve for x and y in terms of a and b to get x ¼ 7a þ 2b, y ¼ 4a b. Then ða; bÞ ¼ ð7a þ 2bÞu1 þ ð4a bÞu2 Now use the formula for ða; bÞ to obtain Fðu1Þ ¼ Fð1; 4Þ ¼ ð1; 6Þ ¼ 5u1 2u2 Fðu2Þ ¼ Fð2; 7Þ ¼ ð3; 11Þ ¼ u1 þ u2 and so B ¼ 5 1 2 1 Method 2. By Theorem 6.7, B ¼ P1 AP. Thus, B ¼ P1 AP ¼ 7 2 4 1 5 1 2 1 1 2 4 7 ¼ 5 1 2 1 6.24. Let A ¼ 2 3 4 1 . Find the matrix B that represents the linear operator A relative to the basis S ¼ fu1; u2g ¼ f½1; 3T ; ½2; 5T g. [Recall A defines a linear operator A: R2 ! R2 relative to the usual basis E of R2 ]. Method 1. Find the coordinates of Aðu1Þ and Aðu2Þ relative to the basis S by first finding the coordinates of an arbitrary vector ½a; bT in R2 relative to the basis S. By Problem 6.2, ½a; bT ¼ ð5a þ 2bÞu1 þ ð3a bÞu2 Using the formula for ½a; bT , we obtain Aðu1Þ ¼ 2 3 4 1 1 3 ¼ 11 1 ¼ 53u1 þ 32u2 and Aðu2Þ ¼ 2 3 4 1 2 5 ¼ 19 3 ¼ 89u1 þ 54u2 Thus; B ¼ 53 89 32 54 Method 2. Use B ¼ P1 AP, where P is the change-of-basis matrix from the usual basis E to S. Thus, simply write the vectors in S (as columns) to obtain the change-of-basis matrix P and then use the formula 216 CHAPTER 6 Linear Mappings and Matrices
  • 224. for P1 . This gives P ¼ 1 2 3 5 and P1 ¼ 5 2 3 1 Then B ¼ P1 AP ¼ 1 2 3 5 2 3 4 1 5 2 3 1 ¼ 53 89 32 54 6.25. Let A ¼ 1 3 1 2 5 4 1 2 2 2 4 3 5: Find the matrix B that represents the linear operator A relative to the basis S ¼ fu1; u2; u3g ¼ f½1; 1; 0T ; ½0; 1; 1T ; ½1; 2; 2T g [Recall A that defines a linear operator A: R3 ! R3 relative to the usual basis E of R3 .] Method 1. Find the coordinates of Aðu1Þ, Aðu2Þ, Aðu3Þ relative to the basis S by first finding the coordinates of an arbitrary vector v ¼ ða; b; cÞ in R3 relative to the basis S. By Problem 6.16, ½vS ¼ ðb cÞu1 þ ð2a þ 2b cÞu2 þ ða b þ cÞu3 Using this formula for ½a; b; cT , we obtain Aðu1Þ ¼ ½4; 7; 1T ¼ 8u1 þ 7u2 5u3; Aðu2Þ ¼ ½4; 1; 0T ¼ u1 6u2 þ 3u3 Aðu3Þ ¼ ½9; 4; 1T ¼ 3u1 11u2 þ 6u3 Writing the coefficients of u1; u2; u3 as columns yields B ¼ 8 1 3 7 6 11 5 3 6 2 4 3 5 Method 2. Use B ¼ P1 AP, where P is the change-of-basis matrix from the usual basis E to S. The matrix P (whose columns are simply the vectors in S) and P1 appear in Problem 6.16. Thus, B ¼ P1 AP ¼ 0 1 1 2 2 1 1 1 1 2 4 3 5 1 3 1 2 5 4 1 2 2 2 4 3 5 1 0 1 1 1 2 0 1 2 2 4 3 5 ¼ 8 1 3 7 6 11 5 3 6 2 4 3 5 6.26. Prove Theorem 6.7: Let P be the change-of-basis matrix from a basis S to a basis S0 in a vector space V. Then, for any linear operator T on V, ½TS0 ¼ P1 ½TSP. Let v be a vector in V. Then, by Theorem 6.6, P½vS0 ¼ ½vS. Therefore, P1 ½TSP½vS0 ¼ P1 ½TS½vS ¼ P1 ½TðvÞS ¼ ½TðvÞS0 But ½TS0 ½vS0 ¼ ½TðvÞS0 . Hence, P1 ½TSP½vS0 ¼ ½TS0 ½vS0 Because the mapping v 7! ½vS0 is onto Kn , we have P1 ½TSPX ¼ ½TS0 X for every X 2 Kn . Thus, P1 ½TSP ¼ ½TS0 , as claimed. Similarity of Matrices 6.27. Let A ¼ 4 2 3 6 and P ¼ 1 2 3 4 . (a) Find B ¼ P1 AP. (b) Verify trðBÞ ¼ trðAÞ: (c) Verify detðBÞ ¼ detðAÞ: (a) First find P1 using the formula for the inverse of a 2 2 matrix. We have P1 ¼ 2 1 3 2 1 2 # CHAPTER 6 Linear Mappings and Matrices 217
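With the dropped signs restored, Problem 6.23 has A = [[5, −1], [2, 1]] and B = [[5, 1], [−2, 1]], and Problem 6.24 has A = [[2, 3], [4, −1]] and B = [[−53, −89], [32, 54]]. The sketch below recomputes both via B = P⁻¹AP; the sign reconstruction is inferred from the worked arithmetic above, and the helper function is ours, not the book's.

```python
import numpy as np

def rep_in_basis(A, basis_columns):
    """Matrix of the operator v -> Av relative to the basis given by the columns."""
    P = np.asarray(basis_columns, dtype=float)
    return np.linalg.inv(P) @ A @ P

# Problem 6.23: F(x, y) = (5x - y, 2x + y), basis S = {(1, 4), (2, 7)}
A23 = np.array([[5., -1.], [2., 1.]])
print(rep_in_basis(A23, [[1, 2], [4, 7]]))     # [[ 5.  1.]  [-2.  1.]]

# Problem 6.24: A = [[2, 3], [4, -1]], basis S = {(1, 3)^T, (2, 5)^T}
A24 = np.array([[2., 3.], [4., -1.]])
print(rep_in_basis(A24, [[1, 2], [3, 5]]))     # [[-53. -89.]  [ 32.  54.]]
```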
  • 225. Then B ¼ P1 AP ¼ 2 1 3 2 1 2 4 2 3 6 1 2 3 4 ¼ 25 30 27 2 15 (b) trðAÞ ¼ 4 þ 6 ¼ 10 and trðBÞ ¼ 25 15 ¼ 10. Hence, trðBÞ ¼ trðAÞ. (c) detðAÞ ¼ 24 þ 6 ¼ 30 and detðBÞ ¼ 375 þ 405 ¼ 30. Hence, detðBÞ ¼ detðAÞ. 6.28. Find the trace of each of the linear transformations F on R3 in Problem 6.4. Find the trace (sum of the diagonal elements) of any matrix representation of F such as the matrix representation ½F ¼ ½FE of F relative to the usual basis E given in Problem 6.4. (a) trðFÞ ¼ trð½FÞ ¼ 1 5 þ 9 ¼ 5. (b) trðFÞ ¼ trð½FÞ ¼ 1 þ 3 þ 5 ¼ 9. (c) trðFÞ ¼ trð½FÞ ¼ 1 þ 4 þ 7 ¼ 12. 6.29. Write A B if A is similar to B—that is, if there exists an invertible matrix P such that A ¼ P1 BP. Prove that is an equivalence relation (on square matrices); that is, (a) A A, for every A. (b) If A B, then B A. (c) If A B and B C, then A C. (a) The identity matrix I is invertible, and I1 ¼ I. Because A ¼ I1 AI, we have A A. (b) Because A B, there exists an invertible matrix P such that A ¼ P1 BP. Hence, B ¼ PAP1 ¼ ðP1 Þ1 AP and P1 is also invertible. Thus, B A. (c) Because A B, there exists an invertible matrix P such that A ¼ P1 BP, and as B C, there exists an invertible matrix Q such that B ¼ Q1 CQ. Thus, A ¼ P1 BP ¼ P1 ðQ1 CQÞP ¼ ðP1 Q1 ÞCðQPÞ ¼ ðQPÞ1 CðQPÞ and QP is also invertible. Thus, A C. 6.30. Suppose B is similar to A, say B ¼ P1 AP. Prove (a) Bn ¼ P1 An P, and so Bn is similar to An . (b) f ðBÞ ¼ P1 f ðAÞP, for any polynomial f ðxÞ, and so f ðBÞ is similar to f ðAÞ: (c) B is a root of a polynomial gðxÞ if and only if A is a root of gðxÞ. (a) The proof is by induction on n. The result holds for n ¼ 1 by hypothesis. Suppose n 1 and the result holds for n 1. Then Bn ¼ BBn1 ¼ ðP1 APÞðP1 An1 PÞ ¼ P1 An P (b) Suppose f ðxÞ ¼ anxn þ þ a1x þ a0. Using the left and right distributive laws and part (a), we have P1 f ðAÞP ¼ P1 ðanAn þ þ a1A þ a0IÞP ¼ P1 ðanAn ÞP þ þ P1 ða1AÞP þ P1 ða0IÞP ¼ anðP1 An PÞ þ þ a1ðP1 APÞ þ a0ðP1 IPÞ ¼ anBn þ þ a1B þ a0I ¼ f ðBÞ (c) By part (b), gðBÞ ¼ 0 if and only if P1 gðAÞP ¼ 0 if and only if gðAÞ ¼ P0P1 ¼ 0. Matrix Representations of General Linear Mappings 6.31. Let F: R3 ! R2 be the linear map defined by Fðx; y; zÞ ¼ ð3x þ 2y 4z; x 5y þ 3zÞ. (a) Find the matrix of F in the following bases of R3 and R2 : S ¼ fw1; w2; w3g ¼ fð1; 1; 1Þ; ð1; 1; 0Þ; ð1; 0; 0Þg and S0 ¼ fu1; u2g ¼ fð1; 3Þ; ð2; 5Þg 218 CHAPTER 6 Linear Mappings and Matrices
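Problems 6.27 and 6.30 above can be confirmed the same way. Reading the signs back in, Problem 6.27 has A = [[4, −2], [3, 6]] and P = [[1, 2], [3, 4]]; this reading reproduces the printed values tr = 10, det = 30, and B = [[25, 30], [−27/2, −15]], but it is a reconstruction. The polynomial f used for Problem 6.30(b) is an arbitrary choice.

```python
import numpy as np

A = np.array([[4., -2.], [3., 6.]])
P = np.array([[1., 2.], [3., 4.]])
B = np.linalg.inv(P) @ A @ P
print(B)                                                # [[ 25.   30. ]  [-13.5 -15. ]]

# Problem 6.27(b), (c): similarity preserves trace and determinant
assert np.isclose(np.trace(B), np.trace(A))             # both equal 10
assert np.isclose(np.linalg.det(B), np.linalg.det(A))   # both equal 30

# Problem 6.30(b): f(B) = P^{-1} f(A) P for any polynomial, e.g. f(x) = x^2 - 3x + 2
f = lambda M: M @ M - 3 * M + 2 * np.eye(2)
assert np.allclose(f(B), np.linalg.inv(P) @ f(A) @ P)
```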
  • 226. (b) Verify Theorem 6.10: The action of F is preserved by its matrix representation; that is, for any v in R3 , we have ½FS;S0 ½vS ¼ ½FðvÞS0 . (a) From Problem 6.2, ða; bÞ ¼ ð5a þ 2bÞu1 þ ð3a bÞu2. Thus, Fðw1Þ ¼ Fð1; 1; 1Þ ¼ ð1; 1Þ ¼ 7u1 þ 4u2 Fðw2Þ ¼ Fð1; 1; 0Þ ¼ ð5; 4Þ ¼ 33u1 þ 19u2 Fðw3Þ ¼ Fð1; 0; 0Þ ¼ ð3; 1Þ ¼ 13u1 þ 8u2 Write the coordinates of Fðw1Þ, Fðw2Þ; Fðw3Þ as columns to get ½FS;S0 ¼ 7 33 13 4 19 8 (b) If v ¼ ðx; y; zÞ, then, by Problem 6.5, v ¼ zw1 þ ðy zÞw2 þ ðx yÞw3. Also, FðvÞ ¼ ð3x þ 2y 4z; x 5y þ 3zÞ ¼ ð13x 20y þ 26zÞu1 þ ð8x þ 11y 15zÞu2 Hence; ½vS ¼ ðz; y z; x yÞT and ½FðvÞS0 ¼ 13x 20y þ 26z 8x þ 11y 15z Thus, ½FS;S0 ½vS ¼ 7 33 13 4 19 8 z y x x y 2 4 3 5 ¼ 13x 20y þ 26z 8x þ 11y 15z ¼ ½FðvÞS0 6.32. Let F: Rn ! Rm be the linear mapping defined as follows: Fðx1; x2; . . . ; xnÞ ¼ ða11x1 þ þ a1nxn, a21x1 þ þ a2nxn; . . . ; am1x1 þ þ amnxnÞ (a) Show that the rows of the matrix ½F representing F relative to the usual bases of Rn and Rm are the coefficients of the xi in the components of Fðx1; . . . ; xnÞ. (b) Find the matrix representation of each of the following linear mappings relative to the usual basis of Rn : (i) F: R2 ! R3 defined by Fðx; yÞ ¼ ð3x y; 2x þ 4y; 5x 6yÞ. (ii) F: R4 ! R2 defined by Fðx; y; s; tÞ ¼ ð3x 4y þ 2s 5t; 5x þ 7y s 2tÞ. (iii) F: R3 ! R4 defined by Fðx; y; zÞ ¼ ð2x þ 3y 8z; x þ y þ z; 4x 5z; 6yÞ. (a) We have Fð1; 0; . . . ; 0Þ ¼ ða11; a21; . . . ; am1Þ Fð0; 1; . . . ; 0Þ ¼ ða12; a22; . . . ; am2Þ ::::::::::::::::::::::::::::::::::::::::::::::::::::: Fð0; 0; . . . ; 1Þ ¼ ða1n; a2n; . . . ; amnÞ and thus; ½F ¼ a11 a12 . . . a1n a21 a22 . . . a2n ::::::::::::::::::::::::::::::::: am1 am2 . . . amn 2 6 6 4 3 7 7 5 (b) By part (a), we need only look at the coefficients of the unknown x; y; . . . in Fðx; y; . . .Þ. Thus, ðiÞ ½F ¼ 3 1 2 4 5 6 2 4 3 5; ðiiÞ ½F ¼ 3 4 2 5 5 7 1 2 ; ðiiiÞ ½F ¼ 2 3 8 1 1 1 4 0 5 0 6 0 2 6 6 4 3 7 7 5 6.33. Let A ¼ 2 5 3 1 4 7 . Recall that A determines a mapping F: R3 ! R2 defined by FðvÞ ¼ Av, where vectors are written as columns. Find the matrix ½F that represents the mapping relative to the following bases of R3 and R2 : (a) The usual bases of R3 and of R2 . (b) S ¼ fw1; w2; w3g ¼ fð1; 1; 1Þ; ð1; 1; 0Þ; ð1; 0; 0Þg and S0 ¼ fu1; u2g ¼ fð1; 3Þ; ð2; 5Þg. (a) Relative to the usual bases, ½F is the matrix A. CHAPTER 6 Linear Mappings and Matrices 219
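Problem 6.31 above is also easy to confirm: with the signs restored, [F]_{S,S′} = [[−7, −33, −13], [4, 19, 8]]. The sketch below rebuilds the matrix column by column and checks Theorem 6.10 on an arbitrary vector; the sign reading is inferred from the worked arithmetic, and NumPy is our tool of convenience.

```python
import numpy as np

F = lambda v: np.array([3*v[0] + 2*v[1] - 4*v[2],
                        v[0] - 5*v[1] + 3*v[2]])
S  = [np.array([1., 1., 1.]), np.array([1., 1., 0.]), np.array([1., 0., 0.])]
S2 = np.array([[1, 2],
               [3, 5]], dtype=float)          # columns are u1 = (1, 3), u2 = (2, 5)

# Column j of [F]_{S,S'} holds the S'-coordinates of F(w_j).
M = np.column_stack([np.linalg.solve(S2, F(w)) for w in S])
print(M)                                      # [[ -7. -33. -13.]  [  4.  19.   8.]]

# Theorem 6.10: [F]_{S,S'} [v]_S = [F(v)]_{S'}
v = np.array([2., -1., 3.])
x, y, z = v
v_S = np.array([z, y - z, x - y])             # coordinates of v relative to S
assert np.allclose(M @ v_S, np.linalg.solve(S2, F(v)))
```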
  • 227. (b) From Problem 9.2, ða; bÞ ¼ ð5a þ 2bÞu1 þ ð3a bÞu2. Thus, Fðw1Þ ¼ 2 5 3 1 4 7 1 1 1 2 6 4 3 7 5 ¼ 4 4 ¼ 12u1 þ 8u2 Fðw2Þ ¼ 2 5 3 1 4 7 1 1 0 2 6 4 3 7 5 ¼ 7 3 ¼ 41u1 þ 24u2 Fðw3Þ ¼ 2 5 3 1 4 7 1 0 0 2 6 4 3 7 5 ¼ 2 1 ¼ 8u1 þ 5u2 Writing the coefficients of Fðw1Þ, Fðw2Þ, Fðw3Þ as columns yields ½F ¼ 12 41 8 8 24 5 . 6.34. Consider the linear transformation T on R2 defined by Tðx; yÞ ¼ ð2x 3y; x þ 4yÞ and the following bases of R2 : E ¼ fe1; e2g ¼ fð1; 0Þ; ð0; 1Þg and S ¼ fu1; u2g ¼ fð1; 3Þ; ð2; 5Þg (a) Find the matrix A representing T relative to the bases E and S. (b) Find the matrix B representing T relative to the bases S and E. (We can view T as a linear mapping from one space into another, each having its own basis.) (a) From Problem 6.2, ða; bÞ ¼ ð5a þ 2bÞu1 þ ð3a bÞu2. Hence, Tðe1Þ ¼ Tð1; 0Þ ¼ ð2; 1Þ ¼ 8u1 þ 5u2 Tðe2Þ ¼ Tð0; 1Þ ¼ ð3; 4Þ ¼ 23u1 13u2 and so A ¼ 8 23 5 13 (b) We have Tðu1Þ ¼ Tð1; 3Þ ¼ ð7; 13Þ ¼ 7e1 þ 13e2 Tðu2Þ ¼ Tð2; 5Þ ¼ ð11; 22Þ ¼ 11e1 þ 22e2 and so B ¼ 7 11 13 22 6.35. How are the matrices A and B in Problem 6.34 related? By Theorem 6.12, the matrices A and B are equivalent to each other; that is, there exist nonsingular matrices P and Q such that B ¼ Q1 AP, where P is the change-of-basis matrix from S to E, and Q is the change-of-basis matrix from E to S. Thus, P ¼ 1 2 3 5 ; Q ¼ 5 2 3 1 ; Q1 ¼ 1 2 3 5 and Q1 AP ¼ 1 2 3 5 8 23 5 13 1 2 3 5 ¼ 7 11 13 22 ¼ B 6.36. Prove Theorem 6.14: Let F: V ! U be linear and, say, rankðFÞ ¼ r. Then there exist bases V and of U such that the matrix representation of F has the following form, where Ir is the r-square identity matrix: A ¼ Ir 0 0 0 Suppose dim V ¼ m and dim U ¼ n. Let W be the kernel of F and U0 the image of F. We are given that rank ðFÞ ¼ r. Hence, the dimension of the kernel of F is m r. Let fw1; . . . ; wmrg be a basis of the kernel of F and extend this to a basis of V: fv1; . . . ; vr; w1; . . . ; wmrg Set u1 ¼ Fðv1Þ; u2 ¼ Fðv2Þ; . . . ; ur ¼ FðvrÞ 220 CHAPTER 6 Linear Mappings and Matrices
  • 228. Then fu1; . . . ; urg is a basis of U0 , the image of F. Extend this to a basis of U, say fu1; . . . ; ur; urþ1; . . . ; ung Observe that Fðv1Þ ¼ u1 ¼ 1u1 þ 0u2 þ þ 0ur þ 0urþ1 þ þ 0un Fðv2Þ ¼ u2 ¼ 0u1 þ 1u2 þ þ 0ur þ 0urþ1 þ þ 0un :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: FðvrÞ ¼ ur ¼ 0u1 þ 0u2 þ þ 1ur þ 0urþ1 þ þ 0un Fðw1Þ ¼ 0 ¼ 0u1 þ 0u2 þ þ 0ur þ 0urþ1 þ þ 0un :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: FðwmrÞ ¼ 0 ¼ 0u1 þ 0u2 þ þ 0ur þ 0urþ1 þ þ 0un Thus, the matrix of F in the above bases has the required form. SUPPLEMENTARY PROBLEMS Matrices and Linear Operators 6.37. Let F: R2 ! R2 be defined by Fðx; yÞ ¼ ð4x þ 5y; 2x yÞ. (a) Find the matrix A representing F in the usual basis E. (b) Find the matrix B representing F in the basis S ¼ fu1; u2g ¼ fð1; 4Þ; ð2; 9Þg. (c) Find P such that B ¼ P1 AP. (d) For v ¼ ða; bÞ, find ½vS and ½FðvÞS. Verify that ½FS½vS ¼ ½FðvÞS. 6.38. Let A: R2 ! R2 be defined by the matrix A ¼ 5 1 2 4 . (a) Find the matrix B representing A relative to the basis S ¼ fu1; u2g ¼ fð1; 3Þ; ð2; 8Þg. (Recall that A represents the mapping A relative to the usual basis E.) (b) For v ¼ ða; bÞ, find ½vS and ½AðvÞS. 6.39. For each linear transformation L on R2 , find the matrix A representing L (relative to the usual basis of R2 ): (a) L is the rotation in R2 counterclockwise by 45 . (b) L is the reflection in R2 about the line y ¼ x. (c) L is defined by Lð1; 0Þ ¼ ð3; 5Þ and Lð0; 1Þ ¼ ð7; 2Þ. (d) L is defined by Lð1; 1Þ ¼ ð3; 7Þ and Lð1; 2Þ ¼ ð5; 4Þ. 6.40. Find the matrix representing each linear transformation T on R3 relative to the usual basis of R3 : (a) Tðx; y; zÞ ¼ ðx; y; 0Þ. (b) Tðx; y; zÞ ¼ ðz; y þ z; x þ y þ zÞ. (c) Tðx; y; zÞ ¼ ð2x 7y 4z; 3x þ y þ 4z; 6x 8y þ zÞ. 6.41. Repeat Problem 6.40 using the basis S ¼ fu1; u2; u3g ¼ fð1; 1; 0Þ; ð1; 2; 3Þ; ð1; 3; 5Þg. 6.42. Let L be the linear transformation on R3 defined by Lð1; 0; 0Þ ¼ ð1; 1; 1Þ; Lð0; 1; 0Þ ¼ ð1; 3; 5Þ; Lð0; 0; 1Þ ¼ ð2; 2; 2Þ (a) Find the matrix A representing L relative to the usual basis of R3 . (b) Find the matrix B representing L relative to the basis S in Problem 6.41. 6.43. Let D denote the differential operator; that is, Dð f ðtÞÞ ¼ df =dt. Each of the following sets is a basis of a vector space V of functions. Find the matrix representing D in each basis: (a) fet ; e2t ; te2t g. (b) f1; t; sin 3t; cos 3tg. (c) fe5t ; te5t ; t2 e5t g. CHAPTER 6 Linear Mappings and Matrices 221
  • 229. 6.44. Let D denote the differential operator on the vector space V of functions with basis S ¼ fsin y, cos yg. (a) Find the matrix A ¼ ½DS. (b) Use A to show that D is a zero of f ðtÞ ¼ t2 þ 1. 6.45. Let V be the vector space of 2 2 matrices. Consider the following matrix M and usual basis E of V: M ¼ a b c d and E ¼ 1 0 0 0 ; 0 1 0 0 ; 0 0 1 0 ; 0 0 0 1 Find the matrix representing each of the following linear operators T on V relative to E: (a) TðAÞ ¼ MA. (b) TðAÞ ¼ AM. (c) TðAÞ ¼ MA AM. 6.46. Let 1V and 0V denote the identity and zero operators, respectively, on a vector space V. Show that, for any basis S of V, (a) ½1V S ¼ I, the identity matrix. (b) ½0V S ¼ 0, the zero matrix. Change of Basis 6.47. Find the change-of-basis matrix P from the usual basis E of R2 to a basis S, the change-of-basis matrix Q from S back to E, and the coordinates of v ¼ ða; bÞ relative to S, for the following bases S: (a) S ¼ fð1; 2Þ; ð3; 5Þg. (c) S ¼ fð2; 5Þ; ð3; 7Þg. (b) S ¼ fð1; 3Þ; ð3; 8Þg. (d) S ¼ fð2; 3Þ; ð4; 5Þg. 6.48. Consider the bases S ¼ fð1; 2Þ; ð2; 3Þg and S0 ¼ fð1; 3Þ; ð1; 4Þg of R2 . Find the change-of-basis matrix: (a) P from S to S0 . (b) Q from S0 back to S. 6.49. Suppose that the x-axis and y-axis in the plane R2 are rotated counterclockwise 30 to yield new x 0 -axis and y 0 -axis for the plane. Find (a) The unit vectors in the direction of the new x 0 -axis and y 0 -axis. (b) The change-of-basis matrix P for the new coordinate system. (c) The new coordinates of the points Að1; 3Þ, Bð2; 5Þ, Cða; bÞ. 6.50. Find the change-of-basis matrix P from the usual basis E of R3 to a basis S, the change-of-basis matrix Q from S back to E, and the coordinates of v ¼ ða; b; cÞ relative to S, where S consists of the vectors: (a) u1 ¼ ð1; 1; 0Þ; u2 ¼ ð0; 1; 2Þ; u3 ¼ ð0; 1; 1Þ. (b) u1 ¼ ð1; 0; 1Þ; u2 ¼ ð1; 1; 2Þ; u3 ¼ ð1; 2; 4Þ. (c) u1 ¼ ð1; 2; 1Þ; u2 ¼ ð1; 3; 4Þ; u3 ¼ ð2; 5; 6Þ. 6.51. Suppose S1; S2; S3 are bases of V. Let P and Q be the change-of-basis matrices, respectively, from S1 to S2 and from S2 to S3. Prove that PQ is the change-of-basis matrix from S1 to S3. Linear Operators and Change of Basis 6.52. Consider the linear operator F on R2 defined by Fðx; yÞ ¼ ð5x þ y; 3x 2yÞ and the following bases of R2 : S ¼ fð1; 2Þ; ð2; 3Þg and S0 ¼ fð1; 3Þ; ð1; 4Þg (a) Find the matrix A representing F relative to the basis S. (b) Find the matrix B representing F relative to the basis S0 . (c) Find the change-of-basis matrix P from S to S0 . (d) How are A and B related? 6.53. Let A: R2 ! R2 be defined by the matrix A ¼ 1 1 3 2 . Find the matrix B that represents the linear operator A relative to each of the following bases: (a) S ¼ fð1; 3ÞT ; ð2; 5ÞT g. (b) S ¼ fð1; 3ÞT ; ð2; 4ÞT g. 222 CHAPTER 6 Linear Mappings and Matrices
  • 230. 6.54. Let F: R2 ! R2 be defined by Fðx; yÞ ¼ ðx 3y; 2x 4yÞ. Find the matrix A that represents F relative to each of the following bases: (a) S ¼ fð2; 5Þ; ð3; 7Þg. (b) S ¼ fð2; 3Þ; ð4; 5Þg. 6.55. Let A: R3 ! R3 be defined by the matrix A ¼ 1 3 1 2 7 4 1 4 3 2 4 3 5. Find the matrix B that represents the linear operator A relative to the basis S ¼ fð1; 1; 1ÞT ; ð0; 1; 1ÞT ; ð1; 2; 3ÞT g. Similarity of Matrices 6.56. Let A ¼ 1 1 2 3 and P ¼ 1 2 3 5 . (a) Find B ¼ P1 AP. (b) Verify that trðBÞ ¼ trðAÞ: (c) Verify that detðBÞ ¼ detðAÞ. 6.57. Find the trace and determinant of each of the following linear maps on R2 : (a) Fðx; yÞ ¼ ð2x 3y; 5x þ 4yÞ. (b) Gðx; yÞ ¼ ðax þ by; cx þ dyÞ. 6.58. Find the trace and determinant of each of the following linear maps on R3 : (a) Fðx; y; zÞ ¼ ðx þ 3y; 3x 2z; x 4y 3zÞ. (b) Gðx; y; zÞ ¼ ðy þ 3z; 2x 4z; 5x þ 7yÞ. 6.59. Suppose S ¼ fu1; u2g is a basis of V, and T: V ! V is defined by Tðu1Þ ¼ 3u1 2u2 and Tðu2Þ ¼ u1 þ 4u2. Suppose S0 ¼ fw1; w2g is a basis of V for which w1 ¼ u1 þ u2 and w2 ¼ 2u1 þ 3u2. (a) Find the matrices A and B representing T relative to the bases S and S0 , respectively. (b) Find the matrix P such that B ¼ P1 AP. 6.60. Let A be a 2 2 matrix such that only A is similar to itself. Show that A is a scalar matrix, that is, that A ¼ a 0 0 a . 6.61. Show that all matrices similar to an invertible matrix are invertible. More generally, show that similar matrices have the same rank. Matrix Representation of General Linear Mappings 6.62. Find the matrix representation of each of the following linear maps relative to the usual basis for Rn : (a) F: R3 ! R2 defined by Fðx; y; zÞ ¼ ð2x 4y þ 9z; 5x þ 3y 2zÞ. (b) F: R2 ! R4 defined by Fðx; yÞ ¼ ð3x þ 4y; 5x 2y; x þ 7y; 4xÞ: (c) F: R4 ! R defined by Fðx1; x2; x3; x4Þ ¼ 2x1 þ x2 7x3 x4. 6.63. Let G: R3 ! R2 be defined by Gðx; y; zÞ ¼ ð2x þ 3y z; 4x y þ 2zÞ. (a) Find the matrix A representing G relative to the bases S ¼ fð1; 1; 0Þ; ð1; 2; 3Þ; ð1; 3; 5Þg and S0 ¼ fð1; 2Þ; ð2; 3Þg (b) For any v ¼ ða; b; cÞ in R3 , find ½vS and ½GðvÞS0 . (c) Verify that A½vS ¼ ½GðvÞS0 . 6.64. Let H: R2 ! R2 be defined by Hðx; yÞ ¼ ð2x þ 7y; x 3yÞ and consider the following bases of R2 : S ¼ fð1; 1Þ; ð1; 2Þg and S0 ¼ fð1; 4Þ; ð1; 5Þg (a) Find the matrix A representing H relative to the bases S and S0 . (b) Find the matrix B representing H relative to the bases S0 and S. CHAPTER 6 Linear Mappings and Matrices 223
  • 231. 6.65. Let F: R3 ! R2 be defined by Fðx; y; zÞ ¼ ð2x þ y z; 3x 2y þ 4zÞ. (a) Find the matrix A representing F relative to the bases S ¼ fð1; 1; 1Þ; ð1; 1; 0Þ; ð1; 0; 0Þg and S0 ¼ ð1; 3Þ; ð1; 4Þg (b) Verify that, for any v ¼ ða; b; cÞ in R3 , A½vS ¼ ½FðvÞS0 . 6.66. Let S and S0 be bases of V, and let 1V be the identity mapping on V. Show that the matrix A representing 1V relative to the bases S and S0 is the inverse of the change-of-basis matrix P from S to S0 ; that is, A ¼ P1 . 6.67. Prove (a) Theorem 6.10, (b) Theorem 6.11, (c) Theorem 6.12, (d) Theorem 6.13. [Hint: See the proofs of the analogous Theorems 6.1 (Problem 6.9), 6.2 (Problem 6.10), 6.3 (Problem 6.11), and 6.7 (Problem 6.26).] Miscellaneous Problems 6.68. Suppose F: V ! V is linear. A subspace W of V is said to be invariant under F if FðWÞ W. Suppose W is invariant under F and dim W ¼ r. Show that F has a block triangular matrix representation M ¼ A B 0 C where A is an r r submatrix. 6.69. Suppose V ¼ U þ W, and suppose U and V are each invariant under a linear operator F: V ! V. Also, suppose dim U ¼ r and dim W ¼ S. Show that F has a block diagonal matrix representation M ¼ A 0 0 B where A and B are r r and s s submatrices. 6.70. Two linear operators F and G on V are said to be similar if there exists an invertible linear operator T on V such that G ¼ T1 F T. Prove (a) F and G are similar if and only if, for any basis S of V, ½FS and ½GS are similar matrices. (b) If F is diagonalizable (similar to a diagonal matrix), then any similar matrix G is also diagonalizable. ANSWERS TO SUPPLEMENTARY PROBLEMS Notation: M ¼ ½R1; R2; . . . represents a matrix M with rows R1; R2; . . . : 6.37. (a) A ¼ ½4; 5; 2; 1; (b) B ¼ ½220; 487; 98; 217; (c) P ¼ ½1; 2; 4; 9; (d) ½vS ¼ ½9a 2b; 4a þ bT and ½FðvÞS ¼ ½32a þ 47b; 14a 21bT 6.38. (a) B ¼ ½6; 28; 4; 15; (b) ½vS ¼ ½4a b; 3 2 a þ 1 2 bT and ½AðvÞS ¼ ½18a 8b; 1 2 ð13a þ 7bÞ 6.39. (a) ½ ffiffiffi 2 p ; ffiffiffi 2 p ; ffiffiffi 2 p ; ffiffiffi 2 p ; (b) ½0; 1; 1; 0; (c) ½3; 7; 5; 2; (d) ½1; 2; 18; 11 6.40. (a) ½1; 0; 0; 0; 1; 0; 0; 0; 0; (b) ½0; 0; 1; 0; 1; 1; 1; 1; 1; (c) ½2; 7; 4; 3; 1; 4; 6; 8; 1 6.41. (a) ½1; 3; 5; 0; 5; 10; 0; 3; 6; (b) ½0; 1; 2; 1; 2; 3; 1; 0; 0; (c) ½15; 65; 104; 49; 219; 351; 29; 130; 208 6.42. (a) ½1; 1; 2; 1; 3; 2; 1; 5; 2; (b) ½0; 2; 14; 22; 0; 5; 8 6.43. (a) ½1; 0; 0; 0; 2; 1; 0; 0; 2; (b) ½0; 1; 0; 0; 0; 0; 0; 0; 3; 0; 0; 3; 0; (c) ½5; 1; 0; 0; 5; 2; 0; 0; 5 224 CHAPTER 6 Linear Mappings and Matrices
  • 232. 6.44. (a) A ¼ ½0; 1; 1; 0; (b) A2 þ I ¼ 0 6.45. (a) ½a; 0; b; 0; 0; a; 0; b; c; 0; d; 0; 0; c; 0; d; (b) ½a; c; 0; 0; b; d; 0; 0; 0; 0; a; c; 0; 0; b; d; (c) ½0; c; b; 0; b; a d; 0; b; c; 0; d a; c; 0; c; b; 0 6.47. (a) ½1; 3; 2; 5; ½5; 3; 2; 1; ½v ¼ ½5a þ 3b; 2a bT ; (b) ½1; 3; 3; 8; ½8; 3; 3; 1; ½v ¼ ½8a 3b; 3a þ bT ; (c) ½2; 3; 5; 7; ½7; 3; 5; 2; ½v ¼ ½7a þ 3b; 5a 2bT ; (d) ½2; 4; 3; 5; ½ 5 2 ; 2; 3 2 ; 1; ½v ¼ ½ 5 2 a þ 2b; 3 2 a bT 6.48. (a) P ¼ ½3; 5; 1; 2; (b) Q ¼ ½2; 5; 1; 3 6.49. Here K ¼ ffiffiffi 3 p : (a) 1 2 ðK; 1Þ; 1 2 ð1; KÞ; ðbÞ P ¼ 1 2 ½K; 1; 1; K; ðcÞ 1 2 ½K þ 3; 3K 1T ; 1 2 ½2K 5; 5K 2T ; 1 2 ½aK þ b; bK aT 6.50. P is the matrix whose columns are u1; u2; u3; Q ¼ P1 ; ½v ¼ Q½a; b; cT : (a) Q ¼ ½1; 0; 0; 1; 1; 1; 2; 2; 1; ½v ¼ ½a; a b þ c; 2a þ 2b cT ; (b) Q ¼ ½0; 2; 1; 2; 3; 2; 1; 1; 1; ½v ¼ ½2b þ c; 2a þ 3b 2c; a b þ cT ; (c) Q ¼ ½2; 2; 1; 7; 4; 1; 5; 3; 1; ½v ¼ ½2a þ 2b c; 7a þ 4b c; 5a 3b þ cT 6.52. (a) ½23; 39; 15; 26; (b) ½35; 41; 27; 32; (c) ½3; 5; 1; 2; (d) B ¼ P1 AP 6.53. (a) ½28; 47; 15; 25; (b) ½13; 18; 15 2 ; 10 6.54. (a) ½43; 60; 33; 46; (b) 1 2 ½3; 7; 5; 9 6.55. ½10; 8; 20; 13; 11; 28; 5; 4; 10 6.56. (a) ½34; 57; 19; 32; (b) trðBÞ ¼ trðAÞ ¼ 2; (c) detðBÞ ¼ detðAÞ ¼ 5 6.57. (a) trðFÞ ¼ 6; detðFÞ ¼ 23; (b) trðGÞ ¼ a þ d; detðGÞ ¼ ad bc 6.58. (a) trðFÞ ¼ 2; detðFÞ ¼ 13; (b) trðGÞ ¼ 0; detðGÞ ¼ 22 6.59. (a) A ¼ ½3; 1; 2; 4; B ¼ ½8; 11; 2; 1; (b) P ¼ ½1; 2; 1; 3 6.62. (a) ½2; 4; 9; 5; 3; 2; (b) ½3; 5; 1; 4; 4; 2; 7; 0; (c) ½2; 1; 7; 1 6.63. (a) ½9; 1; 4; 7; 2; 1; (b) ½vS ¼ ½a þ 2b c; 5a 5b þ 2c; 3a þ 3b cT , and ½GðvÞS0 ¼ ½2a 11b þ 7c; 7b 4cT 6.64. (a) A ¼ ½47; 85; 38; 69; (b) B ¼ ½71; 88; 41; 51 6.65. A ¼ ½3; 11; 5; 1; 8; 3 CHAPTER 6 Linear Mappings and Matrices 225
  • 233. CHAPTER 7
Inner Product Spaces, Orthogonality

7.1 Introduction

The definition of a vector space V involves an arbitrary field K. Here we first restrict K to be the real field R, in which case V is called a real vector space; in the last sections of this chapter, we extend our results to the case where K is the complex field C, in which case V is called a complex vector space. Also, we adopt the previous notation that

    u, v, w are vectors in V
    a, b, c, k are scalars in K

Furthermore, the vector spaces V in this chapter have finite dimension unless otherwise stated or implied.

Recall that the concepts of "length" and "orthogonality" did not appear in the investigation of arbitrary vector spaces V (although they did appear in Section 1.4 on the spaces Rn and Cn). Here we place an additional structure on a vector space V to obtain an inner product space, and in this context these concepts are defined.

7.2 Inner Product Spaces

We begin with a definition.

DEFINITION: Let V be a real vector space. Suppose to each pair of vectors u, v ∈ V there is assigned a real number, denoted by ⟨u, v⟩. This function is called a (real) inner product on V if it satisfies the following axioms:

[I1] (Linear Property): ⟨au1 + bu2, v⟩ = a⟨u1, v⟩ + b⟨u2, v⟩.
[I2] (Symmetric Property): ⟨u, v⟩ = ⟨v, u⟩.
[I3] (Positive Definite Property): ⟨u, u⟩ ≥ 0; and ⟨u, u⟩ = 0 if and only if u = 0.

The vector space V with an inner product is called a (real) inner product space.

Axiom [I1] states that an inner product function is linear in the first position. Using [I1] and the symmetry axiom [I2], we obtain

    ⟨u, cv1 + dv2⟩ = ⟨cv1 + dv2, u⟩ = c⟨v1, u⟩ + d⟨v2, u⟩ = c⟨u, v1⟩ + d⟨u, v2⟩

That is, the inner product function is also linear in its second position. Combining these two properties and using induction yields the following general formula:

    ⟨Σ_i a_i u_i, Σ_j b_j v_j⟩ = Σ_i Σ_j a_i b_j ⟨u_i, v_j⟩

CHAPTER 7 226
  • 234. That is, an inner product of linear combinations of vectors is equal to a linear combination of the inner products of the vectors. EXAMPLE 7.1 Let V be a real inner product space. Then, by linearity, h3u1 4u2; 2v1 5v2 þ 6v3i ¼ 6hu1; v1i 15hu1; v2i þ 18hu1; v3i 8hu2; v1i þ 20hu2; v2i 24hu2; v3i h2u 5v; 4u þ 6vi ¼ 8hu; ui þ 12hu; vi 20hv; ui 30hv; vi ¼ 8hu; ui 8hv; ui 30hv; vi Observe that in the last equation we have used the symmetry property that hu; vi ¼ hv; ui. Remark: Axiom ½I1 by itself implies h0; 0i ¼ h0v; 0i ¼ 0hv; 0i ¼ 0: Thus, ½I1, ½I2, ½I3 are equivalent to ½I1, ½I2, and the following axiom: ½I0 3 If u 6¼ 0; then hu; ui is positive: That is, a function satisfying ½I1, ½I2, ½I0 3 is an inner product. Norm of a Vector By the third axiom ½I3 of an inner product, hu; ui is nonnegative for any vector u. Thus, its positive square root exists. We use the notation kuk ¼ ffiffiffiffiffiffiffiffiffiffiffi hu; ui p This nonnegative number is called the norm or length of u. The relation kuk2 ¼ hu; ui will be used frequently. Remark: If kuk ¼ 1 or, equivalently, if hu; ui ¼ 1, then u is called a unit vector and it is said to be normalized. Every nonzero vector v in V can be multiplied by the reciprocal of its length to obtain the unit vector ^ v ¼ 1 kvk v which is a positive multiple of v. This process is called normalizing v. 7.3 Examples of Inner Product Spaces This section lists the main examples of inner product spaces used in this text. Euclidean n-Space Rn Consider the vector space Rn . The dot product or scalar product in Rn is defined by u v ¼ a1b1 þ a2b2 þ þ anbn where u ¼ ðaiÞ and v ¼ ðbiÞ. This function defines an inner product on Rn . The norm kuk of the vector u ¼ ðaiÞ in this space is as follows: kuk ¼ ffiffiffiffiffiffiffiffiffi u u p ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi a2 1 þ a2 2 þ þ a2 n q On the other hand, by the Pythagorean theorem, the distance from the origin O in R3 to a point Pða; b; cÞ is given by ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi a2 þ b2 þ c2 p . This is precisely the same as the above-defined norm of the vector v ¼ ða; b; cÞ in R3 . Because the Pythagorean theorem is a consequence of the axioms of CHAPTER 7 Inner Product Spaces, Orthogonality 227
  • 235. Euclidean geometry, the vector space Rn with the above inner product and norm is called Euclidean n-space. Although there are many ways to define an inner product on Rn , we shall assume this inner product unless otherwise stated or implied. It is called the usual (or standard) inner product on Rn . Remark: Frequently the vectors in Rn will be represented by column vectors—that is, by n 1 column matrices. In such a case, the formula hu; vi ¼ uT v defines the usual inner product on Rn . EXAMPLE 7.2 Let u ¼ ð1; 3; 4; 2Þ, v ¼ ð4; 2; 2; 1Þ, w ¼ ð5; 1; 2; 6Þ in R4 . (a) Show h3u 2v; wi ¼ 3hu; wi 2hv; wi: By definition, hu; wi ¼ 5 3 þ 8 þ 12 ¼ 22 and hv; wi ¼ 20 þ 2 4 þ 6 ¼ 24 Note that 3u 2v ¼ ð5; 13; 16; 4Þ. Thus, h3u 2v; wi ¼ 25 13 þ 32 þ 24 ¼ 18 As expected, 3hu; wi 2hv; wi ¼ 3ð22Þ 2ð24Þ ¼ 18 ¼ h3u 2v; wi. (b) Normalize u and v: By definition, kuk ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 1 þ 9 þ 16 þ 4 p ¼ ffiffiffiffiffi 30 p and kvk ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 16 þ 4 þ 4 þ 1 p ¼ 5 We normalize u and v to obtain the following unit vectors in the directions of u and v, respectively: ^ u ¼ 1 kuk u ¼ 1 ffiffiffiffiffi 30 p ; 3 ffiffiffiffiffi 30 p ; 4 ffiffiffiffiffi 30 p ; 2 ffiffiffiffiffi 30 p and ^ v ¼ 1 kvk v ¼ 4 5 ; 2 5 ; 2 5 ; 1 5 Function Space C½a; b and Polynomial Space PðtÞ The notation C½a; b is used to denote the vector space of all continuous functions on the closed interval ½a; b—that is, where a t b. The following defines an inner product on C½a; b, where f ðtÞ and gðtÞ are functions in C½a; b: h f ; gi ¼ ðb a f ðtÞgðtÞ dt It is called the usual inner product on C½a; b. The vector space PðtÞ of all polynomials is a subspace of C½a; b for any interval ½a; b, and hence, the above is also an inner product on PðtÞ. EXAMPLE 7.3 Consider f ðtÞ ¼ 3t 5 and gðtÞ ¼ t2 in the polynomial space PðtÞ with inner product h f ; gi ¼ ð1 0 f ðtÞgðtÞ dt: (a) Find h f ; gi. We have f ðtÞgðtÞ ¼ 3t3 5t2 . Hence, h f ; gi ¼ ð1 0 ð3t3 5t2 Þ dt ¼ 3 4 t4 5 3 t3 1 0 ¼ 3 4 5 3 ¼ 11 12 228 CHAPTER 7 Inner Product Spaces, Orthogonality
  • 236. (b) Find k f k and kgk. We have ½ f ðtÞ2 ¼ f ðtÞ f ðtÞ ¼ 9t2 30t þ 25 and ½gðtÞ2 ¼ t4 . Then k f k2 ¼ h f ; f i ¼ ð1 0 ð9t2 30t þ 25Þ dt ¼ 3t3 15t2 þ 25t 1 0 ¼ 13 kgk2 ¼ hg; gi ¼ ð1 0 t4 dt ¼ 1 5 t5 1 0 ¼ 1 5 Therefore, k f k ¼ ffiffiffiffiffi 13 p and kgk ¼ ffiffi 1 5 q ¼ 1 5 ffiffiffi 5 p . Matrix Space M ¼ Mm;n Let M ¼ Mm;n, the vector space of all real m n matrices. An inner product is defined on M by hA; Bi ¼ trðBT AÞ where, as usual, trð Þ is the trace—the sum of the diagonal elements. If A ¼ ½aij and B ¼ ½bij, then hA; Bi ¼ trðBT AÞ ¼ P m i¼1 P n j¼1 aijbij and kAk2 ¼ hA; Ai ¼ P m i¼1 P n j¼1 a2 ij That is, hA; Bi is the sum of the products of the corresponding entries in A and B and, in particular, hA; Ai is the sum of the squares of the entries of A. Hilbert Space Let V be the vector space of all infinite sequences of real numbers ða1; a2; a3; . . .Þ satisfying P 1 i¼1 a2 i ¼ a2 1 þ a2 2 þ 1 that is, the sum converges. Addition and scalar multiplication are defined in V componentwise; that is, if u ¼ ða1; a2; . . .Þ and v ¼ ðb1; b2; . . .Þ then u þ v ¼ ða1 þ b1; a2 þ b2; . . .Þ and ku ¼ ðka1; ka2; . . .Þ An inner product is defined in v by hu; vi ¼ a1b1 þ a2b2 þ The above sum converges absolutely for any pair of points in V. Hence, the inner product is well defined. This inner product space is called l2-space or Hilbert space. 7.4 Cauchy–Schwarz Inequality, Applications The following formula (proved in Problem 7.8) is called the Cauchy–Schwarz inequality or Schwarz inequality. It is used in many branches of mathematics. THEOREM 7.1: (Cauchy–Schwarz) For any vectors u and v in an inner product space V, hu; vi2 hu; uihv; vi or jhu; vij kukkvk Next we examine this inequality in specific cases. EXAMPLE 7.4 (a) Consider any real numbers a1; . . . ; an, b1; . . . ; bn. Then, by the Cauchy–Schwarz inequality, ða1b1 þ a2b2 þ þ anbnÞ2 ða2 1 þ þ a2 nÞðb2 1 þ þ b2 nÞ That is, ðu vÞ2 kuk2 kvk2 , where u ¼ ðaiÞ and v ¼ ðbiÞ. CHAPTER 7 Inner Product Spaces, Orthogonality 229
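The numbers in Example 7.2 pin down the dropped signs: u = (1, 3, 4, 2), v = (4, −2, −2, 1), w = (5, −1, 2, 6). Treating that as a reconstruction, the sketch below reruns that example, checks the Cauchy–Schwarz inequality of Theorem 7.1 and Example 7.4(a), and exercises the matrix inner product ⟨A, B⟩ = tr(BᵀA) on arbitrary data (NumPy).

```python
import numpy as np

u = np.array([1., 3., 4., 2.])
v = np.array([4., -2., -2., 1.])
w = np.array([5., -1., 2., 6.])

# Example 7.2(a): linearity in the first argument; both sides equal 18
assert np.isclose(np.dot(3*u - 2*v, w), 3*np.dot(u, w) - 2*np.dot(v, w))

# Example 7.2(b): normalization produces unit vectors u/sqrt(30) and v/5
u_hat, v_hat = u / np.linalg.norm(u), v / np.linalg.norm(v)
assert np.isclose(np.linalg.norm(u_hat), 1.0) and np.isclose(np.linalg.norm(v_hat), 1.0)

# Theorem 7.1 / Example 7.4(a): |<u, v>| <= ||u|| ||v||
assert abs(np.dot(u, v)) <= np.linalg.norm(u) * np.linalg.norm(v)

# Matrix space M_{m,n}: <A, B> = tr(B^T A) is the sum of entrywise products
A, B = np.arange(6.).reshape(2, 3), np.ones((2, 3))
assert np.isclose(np.trace(B.T @ A), np.sum(A * B))
```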
  • 237. (b) Let f and g be continuous functions on the unit interval ½0; 1. Then, by the Cauchy–Schwarz inequality, ð1 0 f ðtÞgðtÞ dt 2 ð1 0 f 2 ðtÞ dt ð1 0 g2 ðtÞ dt That is, ðh f ; giÞ2 k f k2 kvk2 . Here V is the inner product space C½0; 1. The next theorem (proved in Problem 7.9) gives the basic properties of a norm. The proof of the third property requires the Cauchy–Schwarz inequality. THEOREM 7.2: Let V be an inner product space. Then the norm in V satisfies the following properties: ½N1 kvk 0; and kvk ¼ 0 if and only if v ¼ 0. ½N2 kkvk ¼ jkjkvk. ½N3 ku þ vk kuk þ kvk. The property ½N3 is called the triangle inequality, because if we view u þ v as the side of the triangle formed with sides u and v (as shown in Fig. 7-1), then ½N3 states that the length of one side of a triangle cannot be greater than the sum of the lengths of the other two sides. Angle Between Vectors For any nonzero vectors u and v in an inner product space V, the angle between u and v is defined to be the angle y such that 0 y p and cos y ¼ hu; vi kukkvk By the Cauchy–Schwartz inequality, 1 cos y 1, and so the angle exists and is unique. EXAMPLE 7.5 (a) Consider vectors u ¼ ð2; 3; 5Þ and v ¼ ð1; 4; 3Þ in R3 . Then hu; vi ¼ 2 12 þ 15 ¼ 5; kuk ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 4 þ 9 þ 25 p ¼ ffiffiffiffiffi 38 p ; kvk ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 1 þ 16 þ 9 p ¼ ffiffiffiffiffi 26 p Then the angle y between u and v is given by cos y ¼ 5 ffiffiffiffiffi 38 p ffiffiffiffiffi 26 p Note that y is an acute angle, because cos y is positive. (b) Let f ðtÞ ¼ 3t 5 and gðtÞ ¼ t2 in the polynomial space PðtÞ with inner product h f ; gi ¼ Ð1 0 f ðtÞgðtÞ dt. By Example 7.3, h f ; gi ¼ 11 12 ; k f k ¼ ffiffiffiffiffi 13 p ; kgk ¼ 1 5 ffiffiffi 5 p Then the ‘‘angle’’ y between f and g is given by cos y ¼ 11 12 ð ffiffiffiffiffi 13 p Þ 1 5 ffiffiffi 5 p ¼ 55 12 ffiffiffiffiffi 13 p ffiffiffi 5 p Note that y is an obtuse angle, because cos y is negative. Figure 7-1 230 CHAPTER 7 Inner Product Spaces, Orthogonality
  • 238. 7.5 Orthogonality Let V be an inner product space. The vectors u; v 2 V are said to be orthogonal and u is said to be orthogonal to v if hu; vi ¼ 0 The relation is clearly symmetric—if u is orthogonal to v, then hv; ui ¼ 0, and so v is orthogonal to u. We note that 0 2 V is orthogonal to every v 2 V, because h0; vi ¼ h0v; vi ¼ 0hv; vi ¼ 0 Conversely, if u is orthogonal to every v 2 V, then hu; ui ¼ 0 and hence u ¼ 0 by ½I3: Observe that u and v are orthogonal if and only if cos y ¼ 0, where y is the angle between u and v. Also, this is true if and only if u and v are ‘‘perpendicular’’—that is, y ¼ p=2 (or y ¼ 90 ). EXAMPLE 7.6 (a) Consider the vectors u ¼ ð1; 1; 1Þ, v ¼ ð1; 2; 3Þ, w ¼ ð1; 4; 3Þ in R3 . Then hu; vi ¼ 1 þ 2 3 ¼ 0; hu; wi ¼ 1 4 þ 3 ¼ 0; hv; wi ¼ 1 8 9 ¼ 16 Thus, u is orthogonal to v and w, but v and w are not orthogonal. (b) Consider the functions sin t and cos t in the vector space C½p; p of continuous functions on the closed interval ½p; p. Then hsin t; cos ti ¼ ðp p sin t cos t dt ¼ 1 2 sin2 tjp p ¼ 0 0 ¼ 0 Thus, sin t and cos t are orthogonal functions in the vector space C½p; p. Remark: A vector w ¼ ðx1; x2; . . . ; xnÞ is orthogonal to u ¼ ða1; a2; . . . ; anÞ in Rn if hu; wi ¼ a1x1 þ a2x2 þ þ anxn ¼ 0 That is, w is orthogonal to u if w satisfies a homogeneous equation whose coefficients are the elements of u. EXAMPLE 7.7 Find a nonzero vector w that is orthogonal to u1 ¼ ð1; 2; 1Þ and u2 ¼ ð2; 5; 4Þ in R3 . Let w ¼ ðx; y; zÞ. Then we want hu1; wi ¼ 0 and hu2; wi ¼ 0. This yields the homogeneous system x þ 2y þ z ¼ 0 2x þ 5y þ 4z ¼ 0 or x þ 2y þ z ¼ 0 y þ 2z ¼ 0 Here z is the only free variable in the echelon system. Set z ¼ 1 to obtain y ¼ 2 and x ¼ 3. Thus, w ¼ ð3; 2; 1Þ is a desired nonzero vector orthogonal to u1 and u2. Any multiple of w will also be orthogonal to u1 and u2. Normalizing w, we obtain the following unit vector orthogonal to u1 and u2: ^ w ¼ w kwk ¼ 3 ffiffiffiffiffi 14 p ; 2 ffiffiffiffiffi 14 p ; 1 ffiffiffiffiffi 14 p Orthogonal Complements Let S be a subset of an inner product space V. The orthogonal complement of S, denoted by S? (read ‘‘S perp’’) consists of those vectors in V that are orthogonal to every vector u 2 S; that is, S? ¼ fv 2 V : hv; ui ¼ 0 for every u 2 Sg CHAPTER 7 Inner Product Spaces, Orthogonality 231
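Example 7.7 can be checked mechanically. In R³ a vector orthogonal to two independent vectors is, up to a scalar multiple, their cross product; in higher dimensions one solves the homogeneous system as in the text. A brief NumPy check:

```python
import numpy as np

u1 = np.array([1., 2., 1.])
u2 = np.array([2., 5., 4.])

w = np.cross(u1, u2)                 # (3, -2, 1), as found in Example 7.7
assert np.isclose(np.dot(w, u1), 0) and np.isclose(np.dot(w, u2), 0)

w_hat = w / np.linalg.norm(w)        # (3, -2, 1)/sqrt(14), the normalized solution
print(w, w_hat)
```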
  • 239. In particular, for a given vector u in V, we have u? ¼ fv 2 V : hv; ui ¼ 0g that is, u? consists of all vectors in V that are orthogonal to the given vector u. We show that S? is a subspace of V. Clearly 0 2 S? , because 0 is orthogonal to every vector in V. Now suppose v, w 2 S? . Then, for any scalars a and b and any vector u 2 S, we have hav þ bw; ui ¼ ahv; ui þ bhw; ui ¼ a 0 þ b 0 ¼ 0 Thus, av þ bw 2 S? , and therefore S? is a subspace of V. We state this result formally. PROPOSITION 7.3: Let S be a subset of a vector space V. Then S? is a subspace of V. Remark 1: Suppose u is a nonzero vector in R3 . Then there is a geometrical description of u? . Specifically, u? is the plane in R3 through the origin O and perpendicular to the vector u. This is shown in Fig. 7-2. Remark 2: Let W be the solution space of an m n homogeneous system AX ¼ 0, where A ¼ ½aij and X ¼ ½xi. Recall that W may be viewed as the kernel of the linear mapping A: Rn ! Rm . Now we can give another interpretation of W using the notion of orthogonality. Specifically, each solution vector w ¼ ðx1; x2; . . . ; xnÞ is orthogonal to each row of A; hence, W is the orthogonal complement of the row space of A. EXAMPLE 7.8 Find a basis for the subspace u? of R3 , where u ¼ ð1; 3; 4Þ. Note that u? consists of all vectors w ¼ ðx; y; zÞ such that hu; wi ¼ 0, or x þ 3y 4z ¼ 0. The free variables are y and z. (1) Set y ¼ 1, z ¼ 0 to obtain the solution w1 ¼ ð3; 1; 0Þ. (2) Set y ¼ 0, z ¼ 1 to obtain the solution w1 ¼ ð4; 0; 1Þ. The vectors w1 and w2 form a basis for the solution space of the equation, and hence a basis for u? . Suppose W is a subspace of V. Then both W and W? are subspaces of V. The next theorem, whose proof (Problem 7.28) requires results of later sections, is a basic result in linear algebra. THEOREM 7.4: Let W be a subspace of V. Then V is the direct sum of W and W? ; that is, V ¼ W W? . Figure 7-2 232 CHAPTER 7 Inner Product Spaces, Orthogonality
  • 240. 7.6 Orthogonal Sets and Bases Consider a set S ¼ fu1; u2; . . . ; urg of nonzero vectors in an inner product space V. S is called orthogonal if each pair of vectors in S are orthogonal, and S is called orthonormal if S is orthogonal and each vector in S has unit length. That is, (i) Orthogonal: hui; uji ¼ 0 for i 6¼ j (ii) Orthonormal: hui; uji ¼ 0 for i 6¼ j 1 for i ¼ j Normalizing an orthogonal set S refers to the process of multiplying each vector in S by the reciprocal of its length in order to transform S into an orthonormal set of vectors. The following theorems apply. THEOREM 7.5: Suppose S is an orthogonal set of nonzero vectors. Then S is linearly independent. THEOREM 7.6: (Pythagoras) Suppose fu1; u2; . . . ; urg is an orthogonal set of vectors. Then ku1 þ u2 þ þ urk2 ¼ ku1k2 þ ku2k2 þ þ kurk2 These theorems are proved in Problems 7.15 and 7.16, respectively. Here we prove the Pythagorean theorem in the special and familiar case for two vectors. Specifically, suppose hu; vi ¼ 0. Then ku þ vk2 ¼ hu þ v; u þ vi ¼ hu; ui þ 2hu; vi þ hv; vi ¼ hu; ui þ hv; vi ¼ kuk2 þ kvk2 which gives our result. EXAMPLE 7.9 (a) Let E ¼ fe1; e2; e3g ¼ fð1; 0; 0Þ; ð0; 1; 0Þ; ð0; 0; 1Þg be the usual basis of Euclidean space R3 . It is clear that he1; e2i ¼ he1; e3i ¼ he2; e3i ¼ 0 and he1; e1i ¼ he2; e2i ¼ he3; e3i ¼ 1 Namely, E is an orthonormal basis of R3 . More generally, the usual basis of Rn is orthonormal for every n. (b) Let V ¼ C½p; p be the vector space of continuous functions on the interval p t p with inner product defined by h f ; gi ¼ Ðp p f ðtÞgðtÞ dt. Then the following is a classical example of an orthogonal set in V: f1; cos t; cos 2t; cos 3t; . . . ; sin t; sin 2t; sin 3t; . . .g This orthogonal set plays a fundamental role in the theory of Fourier series. Orthogonal Basis and Linear Combinations, Fourier Coefficients Let S consist of the following three vectors in R3 : u1 ¼ ð1; 2; 1Þ; u2 ¼ ð2; 1; 4Þ; u3 ¼ ð3; 2; 1Þ The reader can verify that the vectors are orthogonal; hence, they are linearly independent. Thus, S is an orthogonal basis of R3 . Suppose we want to write v ¼ ð7; 1; 9Þ as a linear combination of u1; u2; u3. First we set v as a linear combination of u1; u2; u3 using unknowns x1; x2; x3 as follows: v ¼ x1u1 þ x2u2 þ x3u3 or ð7; 1; 9Þ ¼ x1ð1; 2; 1Þ þ x2ð2; 1; 4Þ þ x3ð3; 2; 1Þ ð*Þ We can proceed in two ways. METHOD 1: Expand ð*Þ (as in Chapter 3) to obtain the system x1 þ 2x2 þ 3x3 ¼ 7; 2x1 þ x2 2x3 ¼ 1; x1 4x2 þ x3 ¼ 7 Solve the system by Gaussian elimination to obtain x1 ¼ 3, x2 ¼ 1, x3 ¼ 2. Thus, v ¼ 3u1 u2 þ 2u3. CHAPTER 7 Inner Product Spaces, Orthogonality 233
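The basis here is S = {(1, 2, 1), (2, 1, −4), (3, −2, 1)} once the lost minus signs are read back in. The sketch below confirms orthogonality and checks the answer v = 3u1 − u2 + 2u3 using the coefficient formula ⟨v, u_i⟩/⟨u_i, u_i⟩ that Method 2, which follows, develops (a NumPy check).

```python
import numpy as np

u1, u2, u3 = map(np.array, ([1., 2., 1.], [2., 1., -4.], [3., -2., 1.]))
v = np.array([7., 1., 9.])

# S is orthogonal: every pair of distinct basis vectors has zero inner product.
assert np.dot(u1, u2) == np.dot(u1, u3) == np.dot(u2, u3) == 0

# Fourier coefficients x_i = <v, u_i> / <u_i, u_i>
coeffs = [np.dot(v, u) / np.dot(u, u) for u in (u1, u2, u3)]
print(coeffs)                        # [3.0, -1.0, 2.0]
assert np.allclose(coeffs[0]*u1 + coeffs[1]*u2 + coeffs[2]*u3, v)
```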
  • 241. METHOD 2: (This method uses the fact that the basis vectors are orthogonal, and the arithmetic is much simpler.) If we take the inner product of each side of ð*Þ with respect to ui, we get hv; uii ¼ hx1u2 þ x2u2 þ x3u3; uii or hv; uii ¼ xihui; uii or xi ¼ hv; uii hui; uii Here two terms drop out, because u1; u2; u3 are orthogonal. Accordingly, x1 ¼ hv; u1i hu1; u1i ¼ 7 þ 2 þ 9 1 þ 4 þ 1 ¼ 18 6 ¼ 3; x2 ¼ hv; u2i hu2; u2i ¼ 14 þ 1 36 4 þ 1 þ 16 ¼ 21 21 ¼ 1 x3 ¼ hv; u3i hu3; u3i ¼ 21 2 þ 9 9 þ 4 þ 1 ¼ 28 14 ¼ 2 Thus, again, we get v ¼ 3u1 u2 þ 2u3. The procedure in Method 2 is true in general. Namely, we have the following theorem (proved in Problem 7.17). THEOREM 7.7: Let fu1; u2; . . . ; ung be an orthogonal basis of V. Then, for any v 2 V, v ¼ hv; u1i hu1; u1i u1 þ hv; u2i hu2; u2i u2 þ þ hv; uni hun; uni un Remark: The scalar ki hv; uii hui; uii is called the Fourier coefficient of v with respect to ui, because it is analogous to a coefficient in the Fourier series of a function. This scalar also has a geometric interpretation, which is discussed below. Projections Let V be an inner product space. Suppose w is a given nonzero vector in V, and suppose v is another vector. We seek the ‘‘projection of v along w,’’ which, as indicated in Fig. 7-3(a), will be the multiple cw of w such that v0 ¼ v cw is orthogonal to w. This means hv cw; wi ¼ 0 or hv; wi chw; wi ¼ 0 or c ¼ hv; wi hw; wi Accordingly, the projection of v along w is denoted and defined by projðv; wÞ ¼ cw ¼ hv; wi hw; wi w Such a scalar c is unique, and it is called the Fourier coefficient of v with respect to w or the component of v along w. The above notion is generalized as follows (see Problem 7.25). Figure 7-3 234 CHAPTER 7 Inner Product Spaces, Orthogonality
  • 242. THEOREM 7.8: Suppose w1; w2; . . . ; wr form an orthogonal set of nonzero vectors in V. Let v be any vector in V. Define v0 ¼ v ðc1w1 þ c2w2 þ þ crwrÞ where c1 ¼ hv; w1i hw1; w1i ; c2 ¼ hv; w2i hw2; w2i ; . . . ; cr ¼ hv; wri hwr; wri Then v0 is orthogonal to w1; w2; . . . ; wr. Note that each ci in the above theorem is the component (Fourier coefficient) of v along the given wi. Remark: The notion of the projection of a vector v 2 V along a subspace W of V is defined as follows. By Theorem 7.4, V ¼ W W? . Hence, v may be expressed uniquely in the form v ¼ w þ w0 ; where w 2 W and w0 2 W? We define w to be the projection of v along W, and denote it by projðv; WÞ, as pictured in Fig. 7-2(b). In particular, if W ¼ spanðw1; w2; . . . ; wrÞ, where the wi form an orthogonal set, then projðv; WÞ ¼ c1w1 þ c2w2 þ þ crwr Here ci is the component of v along wi, as above. 7.7 Gram–Schmidt Orthogonalization Process Suppose fv1; v2; . . . ; vng is a basis of an inner product space V. One can use this basis to construct an orthogonal basis fw1; w2; . . . ; wng of V as follows. Set w1 ¼ v1 w2 ¼ v2 hv2; w1i hw1; w1i w1 w3 ¼ v3 hv3; w1i hw1; w1i w1 hv3; w2i hw2; w2i w2 ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: wn ¼ vn hvn; w1i hw1; w1i w1 hvn; w2i hw2; w2i w2 hvn; wn1i hwn1; wn1i wn1 In other words, for k ¼ 2; 3; . . . ; n, we define wk ¼ vk ck1w1 ck2w2 ck;k1wk1 where cki ¼ hvk; wii=hwi; wii is the component of vk along wi. By Theorem 7.8, each wk is orthogonal to the preceeding w’s. Thus, w1; w2; . . . ; wn form an orthogonal basis for V as claimed. Normalizing each wi will then yield an orthonormal basis for V. The above construction is known as the Gram–Schmidt orthogonalization process. The following remarks are in order. Remark 1: Each vector wk is a linear combination of vk and the preceding w’s. Hence, one can easily show, by induction, that each wk is a linear combination of v1; v2; . . . ; vn. Remark 2: Because taking multiples of vectors does not affect orthogonality, it may be simpler in hand calculations to clear fractions in any new wk, by multiplying wk by an appropriate scalar, before obtaining the next wkþ1. CHAPTER 7 Inner Product Spaces, Orthogonality 235
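The process just described takes only a few lines to implement. The sketch below is one possible implementation, not the book's; run on the data of Example 7.10, which follows, it reproduces w1 = (1, 1, 1, 1), w2 = (−2, −1, 1, 2), and w3 proportional to (16, −17, −13, 14). The signs of v2 and v3 are reconstructed from that example's arithmetic.

```python
import numpy as np

def gram_schmidt(vectors):
    """Return an orthogonal basis for span(vectors) via the Gram-Schmidt process."""
    ws = []
    for v in vectors:
        w = v.astype(float).copy()
        for u in ws:
            w -= (np.dot(v, u) / np.dot(u, u)) * u   # subtract the component of v along u
        ws.append(w)
    return ws

v1 = np.array([1, 1, 1, 1])
v2 = np.array([1, 2, 4, 5])
v3 = np.array([1, -3, -4, -2])        # signs reconstructed from the worked example

w1, w2, w3 = gram_schmidt([v1, v2, v3])
print(w1, w2, 10 * w3)                # (1,1,1,1), (-2,-1,1,2), (16,-17,-13,14) after clearing fractions

for a, b in [(w1, w2), (w1, w3), (w2, w3)]:      # pairwise orthogonality
    assert np.isclose(np.dot(a, b), 0)
```

Normalizing each w_i (dividing by its length) then gives the orthonormal basis of Example 7.10.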
  • 243. Remark 3: Suppose u1; u2; . . . ; ur are linearly independent, and so they form a basis for U ¼ spanðuiÞ. Applying the Gram–Schmidt orthogonalization process to the u’s yields an orthogonal basis for U. The following theorems (proved in Problems 7.26 and 7.27) use the above algorithm and remarks. THEOREM 7.9: Let fv1; v2; . . . ; vng be any basis of an inner product space V. Then there exists an orthonormal basis fu1; u2; . . . ; ung of V such that the change-of-basis matrix from fvig to fuig is triangular; that is, for k ¼ 1; . . . ; n, uk ¼ ak1v1 þ ak2v2 þ þ akkvk THEOREM 7.10: Suppose S ¼ fw1; w2; . . . ; wrg is an orthogonal basis for a subspace W of a vector space V. Then one may extend S to an orthogonal basis for V; that is, one may find vectors wrþ1; . . . ; wn such that fw1; w2; . . . ; wng is an orthogonal basis for V. EXAMPLE 7.10 Apply the Gram–Schmidt orthogonalization process to find an orthogonal basis and then an orthonormal basis for the subspace U of R4 spanned by v1 ¼ ð1; 1; 1; 1Þ; v2 ¼ ð1; 2; 4; 5Þ; v3 ¼ ð1; 3; 4; 2Þ (1) First set w1 ¼ v1 ¼ ð1; 1; 1; 1Þ. (2) Compute v2 hv2; w1i hw1; w1i w1 ¼ v2 12 4 w1 ¼ ð2; 1; 1; 2Þ Set w2 ¼ ð2; 1; 1; 2Þ. (3) Compute v3 hv3; w1i hw1; w1i w1 hv3; w2i hw2; w2i w2 ¼ v3 ð8Þ 4 w1 ð7Þ 10 w2 ¼ 8 5 ; 17 10 ; 13 10 ; 7 5 Clear fractions to obtain w3 ¼ ð6; 17; 13; 14Þ. Thus, w1; w2; w3 form an orthogonal basis for U. Normalize these vectors to obtain an orthonormal basis fu1; u2; u3g of U. We have kw1k2 ¼ 4, kw2k2 ¼ 10, kw3k2 ¼ 910, so u1 ¼ 1 2 ð1; 1; 1; 1Þ; u2 ¼ 1 ffiffiffiffiffi 10 p ð2; 1; 1; 2Þ; u3 ¼ 1 ffiffiffiffiffiffiffiffi 910 p ð16; 17; 13; 14Þ EXAMPLE 7.11 Let V be the vector space of polynomials f ðtÞ with inner product h f ; gi ¼ Ð1 1 f ðtÞgðtÞ dt. Apply the Gram–Schmidt orthogonalization process to f1; t; t2 ; t3 g to find an orthogonal basis f f0; f1; f2; f3g with integer coefficients for P3ðtÞ. Here we use the fact that, for r þ s ¼ n, htr ; ts i ¼ ð1 1 tn dt ¼ tnþ1 n þ 1 1 1 ¼ 2=ðn þ 1Þ when n is even 0 when n is odd (1) First set f0 ¼ 1. (2) Compute t ¼ ht; 1i h1; 1i ð1Þ ¼ t 0 ¼ t. Set f1 ¼ t. (3) Compute t2 ht2 ; 1i h1; 1i ð1Þ ht2 ; ti ht; ti ðtÞ ¼ t2 2 3 2 ð1Þ þ 0ðtÞ ¼ t2 1 3 Multiply by 3 to obtain f2 ¼ 3t2 ¼ 1. 236 CHAPTER 7 Inner Product Spaces, Orthogonality
  • 244. (4) Compute t3 ht3 ; 1i h1; 1i ð1Þ ht3 ; ti ht; ti ðtÞ ht3 ; 3t2 1i h3t2 1; 3t2 1i ð3t2 1Þ ¼ t3 0ð1Þ 2 5 2 3 ðtÞ 0ð3t2 1Þ ¼ t3 3 5 t Multiply by 5 to obtain f3 ¼ 5t3 3t. Thus, f1; t; 3t2 1; 5t3 3tg is the required orthogonal basis. Remark: Normalizing the polynomials in Example 7.11 so that pð1Þ ¼ 1 yields the polynomials 1; t; 1 2 ð3t2 1Þ; 1 2 ð5t3 3tÞ These are the first four Legendre polynomials, which appear in the study of differential equations. 7.8 Orthogonal and Positive Definite Matrices This section discusses two types of matrices that are closely related to real inner product spaces V. Here vectors in Rn will be represented by column vectors. Thus, hu; vi ¼ uT v denotes the inner product in Euclidean space Rn . Orthogonal Matrices A real matrix P is orthogonal if P is nonsingular and P1 ¼ PT , or, in other words, if PPT ¼ PT P ¼ I. First we recall (Theorem 2.6) an important characterization of such matrices. THEOREM 7.11: Let P be a real matrix. Then the following are equivalent: (a) P is orthogonal; (b) the rows of P form an orthonormal set; (c) the columns of P form an orthonormal set. (This theorem is true only using the usual inner product on Rn . It is not true if Rn is given any other inner product.) EXAMPLE 7.12 (a) Let P ¼ 1= ffiffiffi 3 p 1= ffiffiffi 3 p 1= ffiffiffi 3 p 0 1= ffiffiffi 2 p 1= ffiffiffi 2 p 2= ffiffiffi 6 p 1= ffiffiffi 6 p 1= ffiffiffi 6 p 2 4 3 5: The rows of P are orthogonal to each other and are unit vectors. Thus P is an orthogonal matrix. (b) Let P be a 2 2 orthogonal matrix. Then, for some real number y, we have P ¼ cos y sin y sin y cos y or P ¼ cos y sin y sin y cos y The following two theorems (proved in Problems 7.37 and 7.38) show important relationships between orthogonal matrices and orthonormal bases of a real inner product space V. THEOREM 7.12: Suppose E ¼ feig and E0 ¼ fe0 ig are orthonormal bases of V. Let P be the change- of-basis matrix from the basis E to the basis E0 . Then P is orthogonal. THEOREM 7.13: Let fe1; . . . ; eng be an orthonormal basis of an inner product space V. Let P ¼ ½aij be an orthogonal matrix. Then the following n vectors form an orthonormal basis for V: e0 i ¼ a1ie1 þ a2ie2 þ þ anien; i ¼ 1; 2; . . . ; n CHAPTER 7 Inner Product Spaces, Orthogonality 237
  • 245. Positive Definite Matrices Let A be a real symmetric matrix; that is, AT ¼ A. Then A is said to be positive definite if, for every nonzero vector u in Rn , hu; Aui ¼ uT Au 0 Algorithms to decide whether or not a matrix A is positive definite will be given in Chapter 12. However, for 2 2 matrices, we have simple criteria that we state formally in the following theorem (proved in Problem 7.43). THEOREM 7.14: A 2 2 real symmetric matrix A ¼ a b c d ¼ a b b d is positive definite if and only if the diagonal entries a and d are positive and the determinant jAj ¼ ad bc ¼ ad b2 is positive. EXAMPLE 7.13 Consider the following symmetric matrices: A ¼ 1 3 3 4 ; B ¼ 1 2 2 3 ; C ¼ 1 2 2 5 A is not positive definite, because jAj ¼ 4 9 ¼ 5 is negative. B is not positive definite, because the diagonal entry 3 is negative. However, C is positive definite, because the diagonal entries 1 and 5 are positive, and the determinant jCj ¼ 5 4 ¼ 1 is also positive. The following theorem (proved in Problem 7.44) holds. THEOREM 7.15: Let A be a real positive definite matrix. Then the function hu; vi ¼ uT Av is an inner product on Rn . Matrix Representation of an Inner Product (Optional) Theorem 7.15 says that every positive definite matrix A determines an inner product on Rn . This subsection may be viewed as giving the converse of this result. Let V be a real inner product space with basis S ¼ fu1; u2; . . . ; ung. The matrix A ¼ ½aij; where aij ¼ hui; uji is called the matrix representation of the inner product on V relative to the basis S. Observe that A is symmetric, because the inner product is symmetric; that is, hui; uji ¼ huj; uii. Also, A depends on both the inner product on V and the basis S for V. Moreover, if S is an orthogonal basis, then A is diagonal, and if S is an orthonormal basis, then A is the identity matrix. EXAMPLE 7.14 The vectors u1 ¼ ð1; 1; 0Þ, u2 ¼ ð1; 2; 3Þ, u3 ¼ ð1; 3; 5Þ form a basis S for Euclidean space R3 . Find the matrix A that represents the inner product in R3 relative to this basis S. First compute each hui; uji to obtain hu1; u1i ¼ 1 þ 1 þ 0 ¼ 2; hu2; u2i ¼ 1 þ 4 þ 9 ¼ 14; hu1; u2i ¼ 1 þ 2 þ 0 ¼ 3; hu2; u3i ¼ 1 þ 6 þ 15 ¼ 22; hu1; u3i ¼ 1 þ 3 þ 0 ¼ 4 hu3; u3i ¼ 1 þ 9 þ 25 ¼ 35 Then A ¼ 2 3 4 3 14 22 4 22 35 2 4 3 5. As expected, A is symmetric. The following theorems (proved in Problems 7.45 and 7.46, respectively) hold. THEOREM 7.16: Let A be the matrix representation of an inner product relative to basis S for V. Then, for any vectors u; v 2 V, we have hu; vi ¼ ½uT A½v where ½u and ½v denote the (column) coordinate vectors relative to the basis S. 238 CHAPTER 7 Inner Product Spaces, Orthogonality
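Example 7.14 and Theorems 7.16–7.17 can be checked together: build A = [⟨u_i, u_j⟩] for the basis u1 = (1, 1, 0), u2 = (1, 2, 3), u3 = (1, 3, 5), confirm that A is symmetric positive definite, and confirm ⟨u, v⟩ = [u]ᵀA[v]. The test vectors below are arbitrary; this is a NumPy sketch.

```python
import numpy as np

basis = np.array([[1., 1., 0.],
                  [1., 2., 3.],
                  [1., 3., 5.]])          # rows are u1, u2, u3

# Matrix representing the usual inner product relative to this basis: A_ij = <u_i, u_j>
A = basis @ basis.T
print(A)                                  # [[ 2.  3.  4.]  [ 3. 14. 22.]  [ 4. 22. 35.]]

# A is symmetric positive definite (Theorem 7.17): all eigenvalues are positive.
assert np.allclose(A, A.T) and np.all(np.linalg.eigvalsh(A) > 0)

# Theorem 7.16: <u, v> = [u]^T A [v], with [.] the coordinate vectors relative to the basis.
u, v = np.array([2., -1., 3.]), np.array([0., 4., -2.])
M = basis.T                               # columns are the basis vectors
u_coord, v_coord = np.linalg.solve(M, u), np.linalg.solve(M, v)
assert np.isclose(np.dot(u, v), u_coord @ A @ v_coord)

# Theorem 7.14 (2x2 test): C = [[1, 2], [2, 5]] from Example 7.13 is positive definite.
C = np.array([[1., 2.], [2., 5.]])
assert C[0, 0] > 0 and C[1, 1] > 0 and np.linalg.det(C) > 0
```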
  • 246. THEOREM 7.17: Let A be the matrix representation of any inner product on V. Then A is a positive definite matrix. 7.9 Complex Inner Product Spaces This section considers vector spaces over the complex field C. First we recall some properties of the complex numbers (Section 1.7), especially the relations between a complex number z ¼ a þ bi; where a; b 2 R; and its complex conjugate z ¼ a bi: z z ¼ a2 þ b2 ; jzj ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi a2 þ b2 p ; z1 þ z2 ¼ z1 þ z2 z1z2 ¼ z1z2; z ¼ z Also, z is real if and only if z ¼ z. The following definition applies. DEFINITION: Let V be a vector space over C. Suppose to each pair of vectors, u; v 2 V there is assigned a complex number, denoted by hu; vi. This function is called a (complex) inner product on V if it satisfies the following axioms: ½I1 * (Linear Property) hau1 þ bu2; vi ¼ ahu1; vi þ bhu2; vi ½I2 * (Conjugate Symmetric Property) hu; vi ¼ hv; ui ½I3 * (Positive Definite Property) hu; ui 0; and hu; ui ¼ 0 if and only if u ¼ 0. The vector space V over C with an inner product is called a (complex) inner product space. Observe that a complex inner product differs from the real case only in the second axiom ½I2 *: Axiom ½I1 * (Linear Property) is equivalent to the two conditions: ðaÞ hu1 þ u2; vi ¼ hu1; vi þ hu2; vi; ðbÞ hku; vi ¼ khu; vi On the other hand, applying ½I1 * and ½I2 *, we obtain hu; kvi ¼ hkv; ui ¼ khv; ui ¼ khv; ui ¼ khu; vi That is, we must take the conjugate of a complex number when it is taken out of the second position of a complex inner product. In fact (Problem 7.47), the inner product is conjugate linear in the second position; that is, hu; av1 þ bv2i ¼ ahu; v1i þ bhu; v2i Combining linear in the first position and conjugate linear in the second position, we obtain, by induction, P i aiui; P j bjvj * + ¼ P i;j aibjhui; vji The following remarks are in order. Remark 1: Axiom ½I1 * by itself implies that h0; 0i ¼ h0v; 0i ¼ 0hv; 0i ¼ 0. Accordingly, ½I1 *, ½I2 *, and ½I3 * are equivalent to ½I1 *, ½I2 *, and the following axiom: ½I3 *0 If u 6¼ 0; then hu; ui 0: That is, a function satisfying ½I1, ½I2 *, and ½I3 *0 is a (complex) inner product on V. Remark 2: By ½I2 *; hu; ui ¼ hu; ui. Thus, hu; ui must be real. By ½I3 *; hu; ui must be nonnegative, and hence, its positive real square root exists. As with real inner product spaces, we define kuk ¼ ffiffiffiffiffiffiffiffiffiffiffi hu; ui p to be the norm or length of u. Remark 3: In addition to the norm, we define the notions of orthogonality, orthogonal comple- ment, and orthogonal and orthonormal sets as before. In fact, the definitions of distance and Fourier coefficient and projections are the same as in the real case. CHAPTER 7 Inner Product Spaces, Orthogonality 239
  • 247. EXAMPLE 7.15 (Complex Euclidean Space Cn ). Let V ¼ Cn , and let u ¼ ðziÞ and v ¼ ðwiÞ be vectors in Cn . Then hu; vi ¼ P k zkwk ¼ z1w1 þ z2w2 þ þ znwn is an inner product on V, called the usual or standard inner product on Cn . V with this inner product is called Complex Euclidean Space. We assume this inner product on Cn unless otherwise stated or implied. Assuming u and v are column vectors, the above inner product may be defined by hu; vi ¼ uT v where, as with matrices, v means the conjugate of each element of v. If u and v are real, we have wi ¼ wi. In this case, the inner product reduced to the analogous one on Rn . EXAMPLE 7.16 (a) Let V be the vector space of complex continuous functions on the (real) interval a t b. Then the following is the usual inner product on V: h f ; gi ¼ ðb a f ðtÞgðtÞ dt (b) Let U be the vector space of m n matrices over C. Suppose A ¼ ðzijÞ and B ¼ ðwijÞ are elements of U. Then the following is the usual inner product on U: hA; Bi ¼ trðBH AÞ ¼ P m i¼1 P n j¼1 wijzij As usual, BH ¼ BT ; that is, BH is the conjugate transpose of B. The following is a list of theorems for complex inner product spaces that are analogous to those for the real case. Here a Hermitian matrix A (i.e., one where AH ¼ AT ¼ AÞ plays the same role that a symmetric matrix A (i.e., one where AT ¼ A) plays in the real case. (Theorem 7.18 is proved in Problem 7.50.) THEOREM 7.18: (Cauchy–Schwarz) Let V be a complex inner product space. Then jhu; vij kukkvk THEOREM 7.19: Let W be a subspace of a complex inner product space V. Then V ¼ W W? . THEOREM 7.20: Suppose fu1; u2; . . . ; ung is a basis for a complex inner product space V. Then, for any v 2 V, v ¼ hv; u1i hu1; u1i u1 þ hv; u2i hu2; u2i u2 þ þ hv; uni hun; uni un THEOREM 7.21: Suppose fu1; u2; . . . ; ung is a basis for a complex inner product space V. Let A ¼ ½aij be the complex matrix defined by aij ¼ hui; uji. Then, for any u; v 2 V, hu; vi ¼ ½uT A½v where ½u and ½v are the coordinate column vectors in the given basis fuig. (Remark: This matrix A is said to represent the inner product on V.) THEOREM 7.22: Let A be a Hermitian matrix (i.e., AH ¼ AT ¼ AÞ such that XT A X is real and positive for every nonzero vector X 2 Cn . Then hu; vi ¼ uT A v is an inner product on Cn . THEOREM 7.23: Let A be the matrix that represents an inner product on V. Then A is Hermitian, and XT AX is real and positive for any nonzero vector in Cn . 240 CHAPTER 7 Inner Product Spaces, Orthogonality
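Before leaving these examples and theorems, note that the matrix inner product of Example 7.16(b) can be computed mechanically in two equivalent ways. The sketch below (Python with NumPy, arbitrary sample matrices; an informal check only) verifies that tr(B^H A) equals the sum of the entrywise products z_ij · conj(w_ij), and that ⟨A, A⟩ is real and nonnegative.

import numpy as np

A = np.array([[1 + 2j, 3j], [2 - 1j, 4]])
B = np.array([[2, 1 - 1j], [1j, 3 + 2j]])

# <A, B> = tr(B^H A), where B^H is the conjugate transpose of B
ip_trace = np.trace(B.conj().T @ A)

# Equivalently, the sum of z_ij * conj(w_ij) over all entries
ip_sum = np.sum(A * np.conj(B))

print(np.isclose(ip_trace, ip_sum))          # True
print(np.trace(A.conj().T @ A).real >= 0)    # <A, A> is real and nonnegative: True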
  • 248. 7.10 Normed Vector Spaces (Optional) We begin with a definition. DEFINITION: Let V be a real or complex vector space. Suppose to each v 2 V there is assigned a real number, denoted by kvk. This function k k is called a norm on V if it satisfies the following axioms: ½N1 kvk 0; and kvk ¼ 0 if and only if v ¼ 0. ½N2 kkvk ¼ jkjkvk. ½N3 ku þ vk kuk þ kvk. A vector space V with a norm is called a normed vector space. Suppose V is a normed vector space. The distance between two vectors u and v in V is denoted and defined by dðu; vÞ ¼ ku vk The following theorem (proved in Problem 7.56) is the main reason why dðu; vÞ is called the distance between u and v. THEOREM 7.24: Let V be a normed vector space. Then the function dðu; vÞ ¼ ku vk satisfies the following three axioms of a metric space: ½M1 dðu; vÞ 0; and dðu; vÞ ¼ 0 if and only if u ¼ v. ½M2 dðu; vÞ ¼ dðv; uÞ. ½M3 dðu; vÞ dðu; wÞ þ dðw; vÞ. Normed Vector Spaces and Inner Product Spaces Suppose V is an inner product space. Recall that the norm of a vector v in V is defined by kvk ¼ ffiffiffiffiffiffiffiffiffiffiffi hv; vi p One can prove (Theorem 7.2) that this norm satisfies ½N1, ½N2, and ½N3. Thus, every inner product space V is a normed vector space. On the other hand, there may be norms on a vector space V that do not come from an inner product on V, as shown below. Norms on Rn and Cn The following define three important norms on Rn and Cn : kða1; . . . ; anÞk1 ¼ maxðjaijÞ kða1; . . . ; anÞk1 ¼ ja1j þ ja2j þ þ janj kða1; . . . ; anÞk2 ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ja1j2 þ ja2j2 þ þ janj2 q (Note that subscripts are used to distinguish between the three norms.) The norms k k1, k k1, and k k2 are called the infinity-norm, one-norm, and two-norm, respectively. Observe that k k2 is the norm on Rn (respectively, Cn ) induced by the usual inner product on Rn (respectively, Cn ). We will let d1, d1, d2 denote the corresponding distance functions. EXAMPLE 7.17 Consider vectors u ¼ ð1; 5; 3Þ and v ¼ ð4; 2; 3Þ in R3 . (a) The infinity norm chooses the maximum of the absolute values of the components. Hence, kuk1 ¼ 5 and kvk1 ¼ 4 CHAPTER 7 Inner Product Spaces, Orthogonality 241
  • 249. (b) The one-norm adds the absolute values of the components. Thus, kuk1 ¼ 1 þ 5 þ 3 ¼ 9 and kvk1 ¼ 4 þ 2 þ 3 ¼ 9 (c) The two-norm is equal to the square root of the sum of the squares of the components (i.e., the norm induced by the usual inner product on R3 ). Thus, kuk2 ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 1 þ 25 þ 9 p ¼ ffiffiffiffiffi 35 p and kvk2 ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 16 þ 4 þ 9 p ¼ ffiffiffiffiffi 29 p (d) Because u v ¼ ð1 4; 5 2; 3 þ 3Þ ¼ ð3; 7; 6Þ, we have d1ðu; vÞ ¼ 7; d1ðu; vÞ ¼ 3 þ 7 þ 6 ¼ 16; d2ðu; vÞ ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 9 þ 49 þ 36 p ¼ ffiffiffiffiffi 94 p EXAMPLE 7.18 Consider the Cartesian plane R2 shown in Fig. 7-4. (a) Let D1 be the set of points u ¼ ðx; yÞ in R2 such that kuk2 ¼ 1. Then D1 consists of the points ðx; yÞ such that kuk2 2 ¼ x2 þ y2 ¼ 1. Thus, D1 is the unit circle, as shown in Fig. 7-4. (b) Let D2 be the set of points u ¼ ðx; yÞ in R2 such that kuk1 ¼ 1. Then D1 consists of the points ðx; yÞ such that kuk1 ¼ jxj þ jyj ¼ 1. Thus, D2 is the diamond inside the unit circle, as shown in Fig. 7-4. (c) Let D3 be the set of points u ¼ ðx; yÞ in R2 such that kuk1 ¼ 1. Then D3 consists of the points ðx; yÞ such that kuk1 ¼ maxðjxj, jyjÞ ¼ 1. Thus, D3 is the square circumscribing the unit circle, as shown in Fig. 7-4. Norms on C½a; b Consider the vector space V ¼ C½a; b of real continuous functions on the interval a t b. Recall that the following defines an inner product on V: h f ; gi ¼ ðb a f ðtÞgðtÞ dt Accordingly, the above inner product defines the following norm on V ¼ C½a; b (which is analogous to the k k2 norm on Rn ): k f k2 ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ðb a ½ f ðtÞ2 dt s Figure 7-4 242 CHAPTER 7 Inner Product Spaces, Orthogonality
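Before turning to the remaining norms on C[a, b], it may help to see the three vector norms of Example 7.17 computed numerically. The sketch below (Python with NumPy, an illustration only) reproduces the values of that example; NumPy's norm function accepts the order of the norm as its second argument.

import numpy as np

u = np.array([1.0, -5.0, 3.0])
v = np.array([4.0, 2.0, -3.0])

# The three norms of Example 7.17 (infinity-, one-, and two-norm)
print(np.linalg.norm(u, np.inf), np.linalg.norm(v, np.inf))   # 5.0  4.0
print(np.linalg.norm(u, 1),      np.linalg.norm(v, 1))        # 9.0  9.0
print(np.linalg.norm(u, 2),      np.linalg.norm(v, 2))        # sqrt(35)  sqrt(29)

# Corresponding distance functions d(u, v) = ||u - v||
w = u - v                                                     # (-3, -7, 6)
print(np.linalg.norm(w, np.inf), np.linalg.norm(w, 1), np.linalg.norm(w, 2))
# 7.0  16.0  sqrt(94)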
  • 250. The following define the other norms on V ¼ C½a; b: k f k1 ¼ ðb a j f ðtÞj dt and k f k1 ¼ maxðj f ðtÞjÞ There are geometrical descriptions of these two norms and their corresponding distance functions, which are described below. The first norm is pictured in Fig. 7-5. Here k f k1 ¼ area between the function j f j and the t-axis d1ð f ; gÞ ¼ area between the functions f and g This norm is analogous to the norm k k1 on Rn . The second norm is pictured in Fig. 7-6. Here k f k1 ¼ maximum distance between f and the t-axis d1ð f ; gÞ ¼ maximum distance between f and g This norm is analogous to the norms k k1 on Rn . SOLVED PROBLEMS Inner Products 7.1. Expand: (a) h5u1 þ 8u2; 6v1 7v2i, (b) h3u þ 5v; 4u 6vi, (c) k2u 3vk2 Use linearity in both positions and, when possible, symmetry, hu; vi ¼ hv; ui. Figure 7-5 Figure 7-6 CHAPTER 7 Inner Product Spaces, Orthogonality 243
  • 251. (a) Take the inner product of each term on the left with each term on the right: h5u1 þ 8u2; 6v1 7v2i ¼ h5u1; 6v1i þ h5u1; 7v2i þ h8u2; 6v1i þ h8u2; 7v2i ¼ 30hu1; v1i 35hu1; v2i þ 48hu2; v1i 56hu2; v2i [Remark: Observe the similarity between the above expansion and the expansion (5a–8b)(6c–7d ) in ordinary algebra.] (b) h3u þ 5v; 4u 6vi ¼ 12hu; ui 18hu; vi þ 20hv; ui 30hv; vi ¼ 12hu; ui þ 2hu; vi 30hv; vi (c) k2u 3vk2 ¼ h2u 3v; 2u 3vi ¼ 4hu; ui 6hu; vi 6hv; ui þ 9hv; vi ¼ 4kuk2 12ðu; vÞ þ 9kvk2 7.2. Consider vectors u ¼ ð1; 2; 4Þ; v ¼ ð2; 3; 5Þ; w ¼ ð4; 2; 3Þ in R3 . Find (a) u v, (b) u w; (c) v w, (d) ðu þ vÞ w, (e) kuk, (f ) kvk. (a) Multiply corresponding components and add to get u v ¼ 2 6 þ 20 ¼ 16: (b) u w ¼ 4 þ 4 12 ¼ 4. (c) v w ¼ 8 6 15 ¼ 13. (d) First find u þ v ¼ ð3; 1; 9Þ. Then ðu þ vÞ w ¼ 12 2 27 ¼ 17. Alternatively, using ½I1, ðu þ vÞ w ¼ u w þ v w ¼ 4 13 ¼ 17. (e) First find kuk2 by squaring the components of u and adding: kuk2 ¼ 12 þ 22 þ 42 ¼ 1 þ 4 þ 16 ¼ 21; and so kuk ¼ ffiffiffiffiffi 21 p (f ) kvk2 ¼ 4 þ 9 þ 25 ¼ 38, and so kvk ¼ ffiffiffiffiffi 38 p . 7.3. Verify that the following defines an inner product in R2 : hu; vi ¼ x1y1 x1y2 x2y1 þ 3x2y2; where u ¼ ðx1; x2Þ; v ¼ ðy1; y2Þ We argue via matrices. We can write hu; vi in matrix notation as follows: hu; vi ¼ uT Av ¼ ½x1; x2 1 1 1 3 y1 y2 Because A is real and symmetric, we need only show that A is positive definite. The diagonal elements 1 and 3 are positive, and the determinant kAk ¼ 3 1 ¼ 2 is positive. Thus, by Theorem 7.14, A is positive definite. Accordingly, by Theorem 7.15, hu; vi is an inner product. 7.4. Consider the vectors u ¼ ð1; 5Þ and v ¼ ð3; 4Þ in R2 . Find (a) hu; vi with respect to the usual inner product in R2 . (b) hu; vi with respect to the inner product in R2 in Problem 7.3. (c) kvk using the usual inner product in R2 . (d) kvk using the inner product in R2 in Problem 7.3. (a) hu; vi ¼ 3 þ 20 ¼ 23. (b) hu; vi ¼ 1 3 1 4 5 3 þ 3 5 4 ¼ 3 4 15 þ 60 ¼ 44. (c) kvk2 ¼ hv; vi ¼ hð3; 4Þ; ð3; 4Þi ¼ 9 þ 16 ¼ 25; hence, jvk ¼ 5. (d) kvk2 ¼ hv; vi ¼ hð3; 4Þ; ð3; 4Þi ¼ 9 12 12 þ 48 ¼ 33; hence, kvk ¼ ffiffiffiffiffi 33 p . 7.5. Consider the following polynomials in PðtÞ with the inner product h f ; gi ¼ Ð1 0 f ðtÞgðtÞ dt: f ðtÞ ¼ t þ 2; gðtÞ ¼ 3t 2; hðtÞ ¼ t2 2t 3 (a) Find h f ; gi and h f ; hi. (b) Find k f k and kgk. (c) Normalize f and g. 244 CHAPTER 7 Inner Product Spaces, Orthogonality
  • 252. (a) Integrate as follows: h f ; gi ¼ ð1 0 ðt þ 2Þð3t 2Þ dt ¼ ð1 0 ð3t2 þ 4t 4Þ dt ¼ t3 þ 2t2 4t 1 0 ¼ 1 h f ; hi ¼ ð1 0 ðt þ 2Þðt2 2t 3Þ dt ¼ t4 4 7t2 2 6t 1 0 ¼ 37 4 (b) h f ; f i ¼ Ð1 0 ðt þ 2Þðt þ 2Þ dt ¼ 19 3 ; hence, k f k ¼ ffiffiffiffi 19 3 q ¼ 1 3 ffiffiffiffiffi 57 p hg; gi ¼ ð1 0 ð3t 2Þð3t 2Þ ¼ 1; hence; kgk ¼ ffiffiffi 1 p ¼ 1 (c) Because k f k ¼ 1 3 ffiffiffiffiffi 57 p and g is already a unit vector, we have ^ f ¼ 1 k f k f ¼ 3 ffiffiffiffiffi 57 p ðt þ 2Þ and ^ g ¼ g ¼ 3t 2 7.6. Find cos y where y is the angle between: (a) u ¼ ð1; 3; 5; 4Þ and v ¼ ð2; 3; 4; 1Þ in R4 , (b) A ¼ 9 8 7 6 5 4 and B ¼ 1 2 3 4 5 6 , where hA; Bi ¼ trðBT AÞ: Use cos y ¼ hu; vi kukkvk (a) Compute: hu; vi ¼ 2 9 20 þ 4 ¼ 23; kuk2 ¼ 1 þ 9 þ 25 þ 16 ¼ 51; kvk2 ¼ 4 þ 9 þ 16 þ 1 ¼ 30 Thus; cos y ¼ 23 ffiffiffiffiffi 51 p ffiffiffiffiffi 30 p ¼ 23 3 ffiffiffiffiffiffiffiffi 170 p (b) Use hA; Bi ¼ trðBT AÞ ¼ Pm i¼1 Pn j¼1 aijbij, the sum of the products of corresponding entries. hA; Bi ¼ 9 þ 16 þ 21 þ 24 þ 25 þ 24 ¼ 119 Use kAk2 ¼ hA; Ai ¼ Pm i¼1 Pn j¼1 a2 ij ; the sum of the squares of all the elements of A. kAk2 ¼ hA; Ai ¼ 92 þ 82 þ 72 þ 62 þ 52 þ 42 ¼ 271; kBk2 ¼ hB; Bi ¼ 12 þ 22 þ 32 þ 42 þ 52 þ 62 ¼ 91; and so and so kAk ¼ ffiffiffiffiffiffiffiffi 271 p kBk ¼ ffiffiffiffiffi 91 p Thus; cos y ¼ 119 ffiffiffiffiffiffiffiffi 271 p ffiffiffiffiffi 91 p 7.7. Verify each of the following: (a) Parallelogram Law (Fig. 7-7): ku þ vk2 þ ku vk2 ¼ 2kuk2 þ 2kvk2 . (b) Polar form for hu; vi (which shows the inner product can be obtained from the norm function): hu; vi ¼ 1 4 ðku þ vk2 ku vk2 Þ: Expand as follows to obtain ku þ vk2 ¼ hu þ v; u þ vi ¼ kuk2 þ 2hu; vi þ kvk2 ð1Þ ku vk2 ¼ hu v; u vi ¼ kuk2 2hu; vi þ kvk2 ð2Þ Add (1) and (2) to get the Parallelogram Law (a). Subtract (2) from (1) to obtain ku þ vk2 ku vk2 ¼ 4hu; vi Divide by 4 to obtain the (real) polar form (b). CHAPTER 7 Inner Product Spaces, Orthogonality 245
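Computations such as Problem 7.6(b), the angle between two matrices under the trace inner product, are convenient to double-check numerically. The following Python sketch (NumPy assumed; an informal check rather than part of the solution) repeats that computation.

import numpy as np

A = np.array([[9., 8., 7.], [6., 5., 4.]])
B = np.array([[1., 2., 3.], [4., 5., 6.]])

def ip(X, Y):
    # <X, Y> = tr(Y^T X) = sum of products of corresponding entries
    return np.trace(Y.T @ X)

ipAB   = ip(A, B)                  # 119
normA  = np.sqrt(ip(A, A))         # sqrt(271)
normB  = np.sqrt(ip(B, B))         # sqrt(91)
cos_th = ipAB / (normA * normB)

print(ipAB, normA**2, normB**2, cos_th)
# 119.0  271.0  91.0  and cos(theta) = 119 / (sqrt(271) * sqrt(91))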
  • 253. 7.8. Prove Theorem 7.1 (Cauchy–Schwarz): For u and v in a real inner product space V; hu; ui2 hu; uihv; vi or jhu; vij kukkvk: For any real number t, htu þ v; tu þ vi ¼ t2 hu; ui þ 2thu; vi þ hv; vi ¼ t2 kuk2 þ 2thu; vi þ kvk2 Let a ¼ kuk2 , b ¼ 2hu; vÞ, c ¼ kvk2 . Because ktu þ vk2 0, we have at2 þ bt þ c 0 for every value of t. This means that the quadratic polynomial cannot have two real roots, which implies that b2 4ac 0 or b2 4ac. Thus, 4hu; vi2 4kuk2 kvk2 Dividing by 4 gives our result. 7.9. Prove Theorem 7.2: The norm in an inner product space V satisfies (a) ½N1 kvk 0; and kvk ¼ 0 if and only if v ¼ 0. (b) ½N2 kkvk ¼ jkjkvk. (c) ½N3 ku þ vk kuk þ kvk. (a) If v 6¼ 0, then hv; vi 0, and hence, kvk ¼ ffiffiffiffiffiffiffiffiffiffiffi hv; vi p 0. If v ¼ 0, then h0; 0i ¼ 0. Consequently, k0k ¼ ffiffiffi 0 p ¼ 0. Thus, ½N1 is true. (b) We have kkvk2 ¼ hkv; kvi ¼ k2 hv; vi ¼ k2 kvk2 . Taking the square root of both sides gives ½N2. (c) Using the Cauchy–Schwarz inequality, we obtain ku þ vk2 ¼ hu þ v; u þ vi ¼ hu; ui þ hu; vi þ hu; vi þ hv; vi kuk2 þ 2kukkvk þ kvk2 ¼ ðkuk þ kvkÞ2 Taking the square root of both sides yields ½N3. Orthogonality, Orthonormal Complements, Orthogonal Sets 7.10. Find k so that u ¼ ð1; 2; k; 3Þ and v ¼ ð3; k; 7; 5Þ in R4 are orthogonal. First find hu; vi ¼ ð1; 2; k; 3Þ ð3; k; 7; 5Þ ¼ 3 þ 2k þ 7k 15 ¼ 9k 12 Then set hu; vi ¼ 9k 12 ¼ 0 to obtain k ¼ 4 3. 7.11. Let W be the subspace of R5 spanned by u ¼ ð1; 2; 3; 1; 2Þ and v ¼ ð2; 4; 7; 2; 1Þ. Find a basis of the orthogonal complement W? of W. We seek all vectors w ¼ ðx; y; z; s; tÞ such that hw; ui ¼ x þ 2y þ 3z s þ 2t ¼ 0 hw; vi ¼ 2x þ 4y þ 7z þ 2s t ¼ 0 Eliminating x from the second equation, we find the equivalent system x þ 2y þ 3z s þ 2t ¼ 0 z þ 4s 5t ¼ 0 Figure 7-7 246 CHAPTER 7 Inner Product Spaces, Orthogonality
  • 254. The free variables are y; s, and t. Therefore, (1) Set y ¼ 1, s ¼ 0, t ¼ 0 to obtain the solution w1 ¼ ð2; 1; 0; 0; 0Þ. (2) Set y ¼ 0, s ¼ 1, t ¼ 0 to find the solution w2 ¼ ð13; 0; 4; 1; 0Þ. (3) Set y ¼ 0, s ¼ 0, t ¼ 1 to obtain the solution w3 ¼ ð17; 0; 5; 0; 1Þ. The set fw1; w2; w3g is a basis of W? . 7.12. Let w ¼ ð1; 2; 3; 1Þ be a vector in R4 . Find an orthogonal basis for w? . Find a nonzero solution of x þ 2y þ 3z þ t ¼ 0, say v1 ¼ ð0; 0; 1; 3Þ. Now find a nonzero solution of the system x þ 2y þ 3z þ t ¼ 0; z 3t ¼ 0 say v2 ¼ ð0; 5; 3; 1Þ. Last, find a nonzero solution of the system x þ 2y þ 3z þ t ¼ 0; 5y þ 3z þ t ¼ 0; z 3t ¼ 0 say v3 ¼ ð14; 2; 3; 1Þ. Thus, v1, v2, v3 form an orthogonal basis for w? . 7.13. Let S consist of the following vectors in R4 : u1 ¼ ð1; 1; 0; 1Þ; u2 ¼ ð1; 2; 1; 3Þ; u3 ¼ ð1; 1; 9; 2Þ; u4 ¼ ð16; 13; 1; 3Þ (a) Show that S is orthogonal and a basis of R4 . (b) Find the coordinates of an arbitrary vector v ¼ ða; b; c; dÞ in R4 relative to the basis S. (a) Compute u1 u2 ¼ 1 þ 2 þ 0 3 ¼ 0; u2 u3 ¼ 1 þ 2 9 þ 6 ¼ 0; u1 u3 ¼ 1 þ 1 þ 0 2 ¼ 0; u2 u4 ¼ 16 26 þ 1 þ 9 ¼ 0; u1 u4 ¼ 16 13 þ 0 3 ¼ 0 u3 u4 ¼ 16 13 9 þ 6 ¼ 0 Thus, S is orthogonal, and S is linearly independent. Accordingly, S is a basis for R4 because any four linearly independent vectors form a basis of R4 . (b) Because S is orthogonal, we need only find the Fourier coefficients of v with respect to the basis vectors, as in Theorem 7.7. Thus, k1 ¼ hv; u1i hu1; u1i ¼ a þ b d 3 ; k2 ¼ hv; u2i hu2; u2i ¼ a þ 2b þ c þ 3d 15 ; k3 ¼ hv; u3i hu3; u3i ¼ a þ b 9c þ 2d 87 k4 ¼ hv; u4i hu4; u4i ¼ 16a 13b þ c þ 3d 435 are the coordinates of v with respect to the basis S. 7.14. Suppose S, S1, S2 are the subsets of V. Prove the following: (a) S S?? . (b) If S1 S2, then S? 2 S? 1 . (c) S? ¼ span ðSÞ? . (a) Let w 2 S. Then hw; vi ¼ 0 for every v 2 S? ; hence, w 2 S?? . Accordingly, S S?? . (b) Let w 2 S? 2 . Then hw; vi ¼ 0 for every v 2 S2. Because S1 S2, hw; vi ¼ 0 for every v ¼ S1. Thus, w 2 S? 1 , and hence, S? 2 S? 1 . (c) Because S spanðSÞ, part (b) gives us spanðSÞ? S? . Suppose u 2 S? and v 2 spanðSÞ. Then there exist w1; w2; . . . ; wk in S such that v ¼ a1w1 þ a2w2 þ þ akwk. Then, using u 2 S? , we have hu; vi ¼ hu; a1w1 þ a2w2 þ þ akwki ¼ a1hu; w1i þ a2hu; w2i þ þ akhu; wki ¼ a1ð0Þ þ a2ð0Þ þ þ akð0Þ ¼ 0 Thus, u 2 spanðSÞ? . Accordingly, S? spanðSÞ? . Both inclusions give S? ¼ spanðSÞ? . 7.15. Prove Theorem 7.5: Suppose S is an orthogonal set of nonzero vectors. Then S is linearly independent. CHAPTER 7 Inner Product Spaces, Orthogonality 247
  • 255. Suppose S ¼ fu1; u2; . . . ; urg and suppose a1u1 þ a2u2 þ þ arur ¼ 0 ð1Þ Taking the inner product of (1) with u1, we get 0 ¼ h0; u1i ¼ ha1u1 þ a2u2 þ þ arur; u1i ¼ a1hu1; u1i þ a2hu2; u1i þ þ arhur; u1i ¼ a1hu1; u1i þ a2 0 þ þ ar 0 ¼ a1hu1; u1i Because u1 6¼ 0, we have hu1; u1i 6¼ 0. Thus, a1 ¼ 0. Similarly, for i ¼ 2; . . . ; r, taking the inner product of (1) with ui, 0 ¼ h0; uii ¼ ha1u1 þ þ arur; uii ¼ a1hu1; uii þ þ aihui; uii þ þ arhur; uii ¼ aihui; uii But hui; uii 6¼ 0, and hence, every ai ¼ 0. Thus, S is linearly independent. 7.16. Prove Theorem 7.6 (Pythagoras): Suppose fu1; u2; . . . ; urg is an orthogonal set of vectors. Then ku1 þ u2 þ þ urk2 ¼ ku1k2 þ ku2k2 þ þ kurk2 Expanding the inner product, we have ku1 þ u2 þ þ urk2 ¼ hu1 þ u2 þ þ ur; u1 þ u2 þ þ uri ¼ hu1; u1i þ hu2; u2i þ þ hur; uri þ P i6¼j hui; uji The theorem follows from the fact that hui; uii ¼ kuik2 and hui; uji ¼ 0 for i 6¼ j. 7.17. Prove Theorem 7.7: Let fu1; u2; . . . ; ung be an orthogonal basis of V. Then for any v 2 V, v ¼ hv; u1i hu1; u1i u1 þ hv; u2i hu2; u2i u2 þ þ hv; uni hun; uni un Suppose v ¼ k1u1 þ k2u2 þ þ knun. Taking the inner product of both sides with u1 yields hv; u1i ¼ hk1u2 þ k2u2 þ þ knun; u1i ¼ k1hu1; u1i þ k2hu2; u1i þ þ knhun; u1i ¼ k1hu1; u1i þ k2 0 þ þ kn 0 ¼ k1hu1; u1i Thus, k1 ¼ hv; u1i hu1; u1i . Similarly, for i ¼ 2; . . . ; n, hv; uii ¼ hk1ui þ k2u2 þ þ knun; uii ¼ k1hu1; uii þ k2hu2; uii þ þ knhun; uii ¼ k1 0 þ þ kihui; uii þ þ kn 0 ¼ kihui; uii Thus, ki ¼ hv; uii hu1; uii . Substituting for ki in the equation v ¼ k1u1 þ þ knun, we obtain the desired result. 7.18. Suppose E ¼ fe1; e2; . . . ; eng is an orthonormal basis of V. Prove (a) For any u 2 V, we have u ¼ hu; e1ie1 þ hu; e2ie2 þ þ hu; enien. (b) ha1e1 þ þ anen; b1e1 þ þ bneni ¼ a1b1 þ a2b2 þ þ anbn. (c) For any u; v 2 V, we have hu; vi ¼ hu; e1ihv; e1i þ þ hu; enihv; eni. (a) Suppose u ¼ k1e1 þ k2e2 þ þ knen. Taking the inner product of u with e1, hu; e1i ¼ hk1e1 þ k2e2 þ þ knen; e1i ¼ k1he1; e1i þ k2he2; e1i þ þ knhen; e1i ¼ k1ð1Þ þ k2ð0Þ þ þ knð0Þ ¼ k1 248 CHAPTER 7 Inner Product Spaces, Orthogonality
  • 256. Similarly, for i ¼ 2; . . . ; n, hu; eii ¼ hk1e1 þ þ kiei þ þ knen; eii ¼ k1he1; eii þ þ kihei; eii þ þ knhen; eii ¼ k1ð0Þ þ þ kið1Þ þ þ knð0Þ ¼ ki Substituting hu; eii for ki in the equation u ¼ k1e1 þ þ knen, we obtain the desired result. (b) We have P n i¼1 aiei; P n j¼1 bjej * + ¼ P n i;j¼1 aibjhei; eji ¼ P n i¼1 aibihei; eii þ P i6¼j aibjhei; eji But hei; eji ¼ 0 for i 6¼ j, and hei; eji ¼ 1 for i ¼ j. Hence, as required, P n i¼1 aiei; P n j¼1 bjej * + ¼ P n i¼1 aibi ¼ a1b1 þ a2b2 þ þ anbn (c) By part (a), we have u ¼ hu; e1ie1 þ þ hu; enien and v ¼ hv; e1ie1 þ þ hv; enien Thus, by part (b), hu; vi ¼ hu; e1ihv; e1i þ hu; e2ihv; e2i þ þ hu; enihv; eni Projections, Gram–Schmidt Algorithm, Applications 7.19. Suppose w 6¼ 0. Let v be any vector in V. Show that c ¼ hv; wi hw; wi ¼ hv; wi kwk2 is the unique scalar such that v0 ¼ v cw is orthogonal to w. In order for v0 to be orthogonal to w we must have hv cw; wi ¼ 0 or hv; wi chw; wi ¼ 0 or hv; wi ¼ chw; wi Thus, c hv; wi hw; wi . Conversely, suppose c ¼ hv; wi hw; wi . Then hv cw; wi ¼ hv; wi chw; wi ¼ hv; wi hv; wi hw; wi hw; wi ¼ 0 7.20. Find the Fourier coefficient c and the projection of v ¼ ð1; 2; 3; 4Þ along w ¼ ð1; 2; 1; 2Þ in R4 . Compute hv; wi ¼ 1 4 þ 3 8 ¼ 8 and kwk2 ¼ 1 þ 4 þ 1 þ 4 ¼ 10. Then c ¼ 8 10 ¼ 4 5 and projðv; wÞ ¼ cw ¼ ð 4 5 ; 8 5 ; 4 5 ; 8 5Þ 7.21. Consider the subspace U of R4 spanned by the vectors: v1 ¼ ð1; 1; 1; 1Þ; v2 ¼ ð1; 1; 2; 4Þ; v3 ¼ ð1; 2; 4; 3Þ Find (a) an orthogonal basis of U; (b) an orthonormal basis of U. (a) Use the Gram–Schmidt algorithm. Begin by setting w1 ¼ u ¼ ð1; 1; 1; 1Þ. Next find v2 hv2; w1i hw1; w1i w1 ¼ ð1; 1; 2; 4Þ 8 4 ð1; 1; 1; 1Þ ¼ ð1; 1; 0; 2Þ Set w2 ¼ ð1; 1; 0; 2Þ. Then find v3 hv3; w1i hw1; w1i w1 hv3; w2i hw2; w2i w2 ¼ ð1; 2; 4; 3Þ ð4Þ 4 ð1; 1; 1; 1Þ ð9Þ 6 ð1; 1; 0; 2Þ ¼ ð1 2 ; 3 2 ; 3; 1Þ Clear fractions to obtain w3 ¼ ð1; 3; 6; 2Þ. Then w1; w2; w3 form an orthogonal basis of U. CHAPTER 7 Inner Product Spaces, Orthogonality 249
  • 257. (b) Normalize the orthogonal basis consisting of w1; w2; w3. Because kw1k2 ¼ 4, kw2k2 ¼ 6, and kw3k2 ¼ 50, the following vectors form an orthonormal basis of U: u1 ¼ 1 2 ð1; 1; 1; 1Þ; u2 ¼ 1 ffiffiffi 6 p ð1; 1; 0; 2Þ; u3 ¼ 1 5 ffiffiffi 2 p ð1; 3; 6; 2Þ 7.22. Consider the vector space PðtÞ with inner product h f ; gi ¼ Ð1 0 f ðtÞgðtÞ dt. Apply the Gram– Schmidt algorithm to the set f1; t; t2 g to obtain an orthogonal set f f0; f1; f2g with integer coefficients. First set f0 ¼ 1. Then find t ht; 1i h1; 1i 1 ¼ t 1 2 1 1 ¼ t 1 2 Clear fractions to obtain f1 ¼ 2t 1. Then find t2 ht2 ; 1i h1; 1i ð1Þ ht2 ; 2t 1i h2t 1; 2t 1i ð2t 1Þ ¼ t2 1 3 1 ð1Þ 1 6 1 3 ð2t 1Þ ¼ t2 t þ 1 6 Clear fractions to obtain f2 ¼ 6t2 6t þ 1. Thus, f1; 2t 1; 6t2 6t þ 1g is the required orthogonal set. 7.23. Suppose v ¼ ð1; 3; 5; 7Þ. Find the projection of v onto W or, in other words, find w 2 W that minimizes kv wk, where W is the subspance of R4 spanned by (a) u1 ¼ ð1; 1; 1; 1Þ and u2 ¼ ð1; 3; 4; 2Þ, (b) v1 ¼ ð1; 1; 1; 1Þ and v2 ¼ ð1; 2; 3; 2Þ. (a) Because u1 and u2 are orthogonal, we need only compute the Fourier coefficients: c1 ¼ hv; u1i hu1; u1i ¼ 1 þ 3 þ 5 þ 7 1 þ 1 þ 1 þ 1 ¼ 16 4 ¼ 4 c2 ¼ hv; u2i hu2; u2i ¼ 1 9 þ 20 14 1 þ 9 þ 16 þ 4 ¼ 2 30 ¼ 1 15 Then w ¼ projðv; WÞ ¼ c1u1 þ c2u2 ¼ 4ð1; 1; 1; 1Þ 1 15 ð1; 3; 4; 2Þ ¼ ð59 15 ; 63 5 ; 56 15 ; 62 15Þ: (b) Because v1 and v2 are not orthogonal, first apply the Gram–Schmidt algorithm to find an orthogonal basis for W. Set w1 ¼ v1 ¼ ð1; 1; 1; 1Þ. Then find v2 hv2; w1i hw1; w1i w1 ¼ ð1; 2; 3; 2Þ 8 4 ð1; 1; 1; 1Þ ¼ ð1; 0; 1; 0Þ Set w2 ¼ ð1; 0; 1; 0Þ. Now compute c1 ¼ hv; w1i hw1; w1i ¼ 1 þ 3 þ 5 þ 7 1 þ 1 þ 1 þ 1 ¼ 16 4 ¼ 4 c2 ¼ hv; w2i hw2; w2i 1 þ 0 þ 5 þ 0 1 þ 0 þ 1 þ 0 ¼ 6 2 ¼ 3 Then w ¼ projðv; WÞ ¼ c1w1 þ c2w2 ¼ 4ð1; 1; 1; 1Þ 3ð1; 0; 1; 0Þ ¼ ð7; 4; 1; 4Þ. 7.24. Suppose w1 and w2 are nonzero orthogonal vectors. Let v be any vector in V. Find c1 and c2 so that v0 is orthogonal to w1 and w2, where v0 ¼ v c1w1 c2w2. If v0 is orthogonal to w1, then 0 ¼ hv c1w1 c2w2; w1i ¼ hv; w1i c1hw1; w1i c2hw2; w1i ¼ hv; w1i c1hw1; w1i c20 ¼ hv; w1i c1hw1; w1i Thus, c1 ¼ hv; w1i=hw1; w1i. (That is, c1 is the component of v along w1.) Similarly, if v0 is orthogonal to w2, then 0 ¼ hv c1w1 c2w2; w2i ¼ hv; w2i c2hw2; w2i Thus, c2 ¼ hv; w2i=hw2; w2i. (That is, c2 is the component of v along w2.) 250 CHAPTER 7 Inner Product Spaces, Orthogonality
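The Gram–Schmidt computations in Problems 7.21 through 7.23 follow a single pattern, which can be written out as a short routine. The Python sketch below (NumPy assumed; the function name gram_schmidt is our own and the code is only an illustration of the projection formula of Theorem 7.8) reproduces the orthogonal basis of Problem 7.21(a), up to the clearing of fractions done there.

import numpy as np

def gram_schmidt(vectors):
    # Build an orthogonal list w1, w2, ... by subtracting, from each v,
    # its projections onto the previously constructed w's (Theorem 7.8).
    ws = []
    for v in vectors:
        w = v.astype(float)
        for u in ws:
            w -= (np.dot(v, u) / np.dot(u, u)) * u
        ws.append(w)
    return ws

v1 = np.array([1, 1, 1, 1])
v2 = np.array([1, 1, 2, 4])
v3 = np.array([1, 2, -4, -3])

w1, w2, w3 = gram_schmidt([v1, v2, v3])
print(w1, w2, w3)   # (1,1,1,1), (-1,-1,0,2), (0.5,1.5,-3,1); scaling gives (1,3,-6,2)
print(np.dot(w1, w2), np.dot(w1, w3), np.dot(w2, w3))   # all (numerically) 0

# Normalizing each w_i gives the orthonormal basis of Problem 7.21(b).
u1, u2, u3 = (w / np.linalg.norm(w) for w in (w1, w2, w3))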
  • 258. 7.25. Prove Theorem 7.8: Suppose w1; w2; . . . ; wr form an orthogonal set of nonzero vectors in V. Let v 2 V. Define v0 ¼ v ðc1w1 þ c2w2 þ þ crwrÞ; where ci ¼ hv; wii hwi; wii Then v0 is orthogonal to w1; w2; . . . ; wr. For i ¼ 1; 2; . . . ; r and using hwi; wji ¼ 0 for i 6¼ j, we have hv c1w1 c2x2 crwr; wii ¼ hv; wii c1hw1; wii cihwi; wii crhwr; wii ¼ hv; wii c1 0 cihwi; wii cr 0 ¼ hv; wii cihwi; wii ¼ hv; wii hv; wii hwi; wii hwi; wii ¼ 0 The theorem is proved. 7.26. Prove Theorem 7.9: Let fv1; v2; . . . ; vng be any basis of an inner product space V. Then there exists an orthonormal basis fu1; u2; . . . ; ung of V such that the change-of-basis matrix from fvig to fuig is triangular; that is, for k ¼ 1; 2; . . . ; n, uk ¼ ak1v1 þ ak2v2 þ þ akkvk The proof uses the Gram–Schmidt algorithm and Remarks 1 and 3 of Section 7.7. That is, apply the algorithm to fvig to obtain an orthogonal basis fwi; . . . ; wng, and then normalize fwig to obtain an orthonormal basis fuig of V. The specific algorithm guarantees that each wk is a linear combination of v1; . . . ; vk, and hence, each uk is a linear combination of v1; . . . ; vk. 7.27. Prove Theorem 7.10: Suppose S ¼ fw1; w2; . . . ; wrg, is an orthogonal basis for a subspace W of V. Then one may extend S to an orthogonal basis for V; that is, one may find vectors wrþ1; . . . ; wr such that fw1; w2; . . . ; wng is an orthogonal basis for V. Extend S to a basis S0 ¼ fw1; . . . ; wr; vrþ1; . . . ; vng for V. Applying the Gram–Schmidt algorithm to S0 , we first obtain w1; w2; . . . ; wr because S is orthogonal, and then we obtain vectors wrþ1; . . . ; wn, where fw1; w2; . . . ; wng is an orthogonal basis for V. Thus, the theorem is proved. 7.28. Prove Theorem 7.4: Let W be a subspace of V. Then V ¼ W W? . By Theorem 7.9, there exists an orthogonal basis fu1; . . . ; urg of W, and by Theorem 7.10 we can extend it to an orthogonal basis fu1; u2; . . . ; ung of V. Hence, urþ1; . . . ; un 2 W? . If v 2 V, then v ¼ a1u1 þ þ anun; where a1u1 þ þ arur 2 W and arþ1urþ1 þ þ anun 2 W? Accordingly, V ¼ W þ W? . On the other hand, if w 2 W W? , then hw; wi ¼ 0. This yields w ¼ 0. Hence, W W? ¼ f0g. The two conditions V ¼ W þ W? and W W? ¼ f0g give the desired result V ¼ W W? . Remark: Note that we have proved the theorem for the case that V has finite dimension. We remark that the theorem also holds for spaces of arbitrary dimension. 7.29. Suppose W is a subspace of a finite-dimensional space V. Prove that W ¼ W?? . By Theorem 7.4, V ¼ W W? , and also V ¼ W? W?? . Hence, dim W ¼ dim V dim W? and dim W?? ¼ dim V dim W? This yields dim W ¼ dim W?? . But W W?? (see Problem 7.14). Hence, W ¼ W?? , as required. 7.30. Prove the following: Suppose w1; w2; . . . ; wr form an orthogonal set of nonzero vectors in V. Let v be any vector in V and let ci be the component of v along wi. Then, for any scalars a1; . . . ; ar, we have v P r k¼1 ckwk v P r k¼1 akwk That is, P ciwi is the closest approximation to v as a linear combination of w1; . . . ; wr. CHAPTER 7 Inner Product Spaces, Orthogonality 251
  • 259. By Theorem 7.8, v P ckwk is orthogonal to every wi and hence orthogonal to any linear combination of w1; w2; . . . ; wr. Therefore, using the Pythagorean theorem and summing from k ¼ 1 to r, v P akwk k k 2 ¼ v P ckwk þ P ðck akÞwk k k 2 ¼ v P ckwk k k 2 þ P ðck akÞwk k k 2 v P ckwk k k 2 The square root of both sides gives our theorem. 7.31. Suppose fe1; e2; . . . ; erg is an orthonormal set of vectors in V. Let v be any vector in V and let ci be the Fourier coefficient of v with respect to ui. Prove Bessel’s inequality: P r k¼1 c2 k kvk2 Note that ci ¼ hv; eii, because keik ¼ 1. Then, using hei; eji ¼ 0 for i 6¼ j and summing from k ¼ 1 to r, we get 0 v P ckek; v P ck; ek h i ¼ hv; vi 2 v; P ckeki þ P c2 k ¼ hv; vi P 2ckhv; eki þ P c2 k ¼ hv; vi P 2c2 k þ P c2 k ¼ hv; vi P c2 k This gives us our inequality. Orthogonal Matrices 7.32. Find an orthogonal matrix P whose first row is u1 ¼ ð1 3 ; 2 3 ; 2 3Þ. First find a nonzero vector w2 ¼ ðx; y; zÞ that is orthogonal to u1—that is, for which 0 ¼ hu1; w2i ¼ x 3 þ 2y 3 þ 2z 3 ¼ 0 or x þ 2y þ 2z ¼ 0 One such solution is w2 ¼ ð0; 1; 1Þ. Normalize w2 to obtain the second row of P: u2 ¼ ð0; 1= ffiffiffi 2 p ; 1= ffiffiffi 2 p Þ Next find a nonzero vector w3 ¼ ðx; y; zÞ that is orthogonal to both u1 and u2—that is, for which 0 ¼ hu1; w3i ¼ x 3 þ 2y 3 þ 2z 3 ¼ 0 or x þ 2y þ 2z ¼ 0 0 ¼ hu2; w3i ¼ y ffiffiffi 2 p y ffiffiffi 2 p ¼ 0 or y z ¼ 0 Set z ¼ 1 and find the solution w3 ¼ ð4; 1; 1Þ. Normalize w3 and obtain the third row of P; that is, u3 ¼ ð4= ffiffiffiffiffi 18 p ; 1= ffiffiffiffiffi 18 p ; 1= ffiffiffiffiffi 18 p Þ: P ¼ 1 3 2 3 2 3 0 1= ffiffiffi 2 p 1= ffiffiffi 2 p 4=3 ffiffiffi 2 p 1=3 ffiffiffi 2 p 1=3 ffiffiffi 2 p 2 4 3 5 Thus; We emphasize that the above matrix P is not unique. 7.33. Let A ¼ 1 1 1 1 3 4 7 5 2 2 4 3 5. Determine whether or not: (a) the rows of A are orthogonal; (b) A is an orthogonal matrix; (c) the columns of A are orthogonal. (a) Yes, because ð1; 1; 1Þ ð1; 3; 4Þ ¼ 1 þ 3 4 ¼ 0, ð1; 1 1Þ ð7; 5; 2Þ ¼ 7 5 2 ¼ 0, and ð1; 3; 4Þ ð7; 5; 2Þ ¼ 7 15 þ 8 ¼ 0. (b) No, because the rows of A are not unit vectors, for example, ð1; 1; 1Þ2 ¼ 1 þ 1 þ 1 ¼ 3. (c) No; for example, ð1; 1; 7Þ ð1; 3; 5Þ ¼ 1 þ 3 35 ¼ 31 6¼ 0. 7.34. Let B be the matrix obtained by normalizing each row of A in Problem 7.33. (a) Find B. (b) Is B an orthogonal matrix? (c) Are the columns of B orthogonal? 252 CHAPTER 7 Inner Product Spaces, Orthogonality
  • 260. (a) We have kð1; 1; 1Þk2 ¼ 1 þ 1 þ 1 ¼ 3; kð1; 3; 4Þk2 ¼ 1 þ 9 þ 16 ¼ 26 kð7; 5; 2Þk2 ¼ 49 þ 25 þ 4 ¼ 78 Thus; B ¼ 1= ffiffiffi 3 p 1= ffiffiffi 3 p 1= ffiffiffi 3 p 1= ffiffiffiffiffi 26 p 3= ffiffiffiffiffi 26 p 4= ffiffiffiffiffi 26 p 7= ffiffiffiffiffi 78 p 5= ffiffiffiffiffi 78 p 2= ffiffiffiffiffi 78 p 2 6 4 3 7 5 (b) Yes, because the rows of B are still orthogonal and are now unit vectors. (c) Yes, because the rows of B form an orthonormal set of vectors. Then, by Theorem 7.11, the columns of B must automatically form an orthonormal set. 7.35. Prove each of the following: (a) P is orthogonal if and only if PT is orthogonal. (b) If P is orthogonal, then P1 is orthogonal. (c) If P and Q are orthogonal, then PQ is orthogonal. (a) We have ðPT ÞT ¼ P. Thus, P is orthogonal if and only if PPT ¼ I if and only if PTT PT ¼ I if and only if PT is orthogonal. (b) We have PT ¼ P1 , because P is orthogonal. Thus, by part (a), P1 is orthogonal. (c) We have PT ¼ P1 and QT ¼ Q1 . Thus, ðPQÞðPQÞT ¼ PQQT PT ¼ PQQ1 P1 ¼ I. Therefore, ðPQÞT ¼ ðPQÞ1 , and so PQ is orthogonal. 7.36. Suppose P is an orthogonal matrix. Show that (a) hPu; Pvi ¼ hu; vi for any u; v 2 V; (b) kPuk ¼ kuk for every u 2 V. Use PT P ¼ I and hu; vi ¼ uT v. (a) hPu; Pvi ¼ ðPuÞT ðPvÞ ¼ uT PT Pv ¼ uT v ¼ hu; vi. (b) We have kPuk2 ¼ hPu; Pui ¼ uT PT Pu ¼ uT u ¼ hu; ui ¼ kuk2 Taking the square root of both sides gives our result. 7.37. Prove Theorem 7.12: Suppose E ¼ feig and E0 ¼ fe0 ig are orthonormal bases of V. Let P be the change-of-basis matrix from E to E0 . Then P is orthogonal. Suppose e0 i ¼ bi1e1 þ bi2e2 þ þ binen; i ¼ 1; . . . ; n ð1Þ Using Problem 7.18(b) and the fact that E0 is orthonormal, we get dij ¼ he0 i; e0 ji ¼ bi1bj1 þ bi2bj2 þ þ binbjn ð2Þ Let B ¼ ½bij be the matrix of the coefficients in (1). (Then P ¼ BT .) Suppose BBT ¼ ½cij. Then cij ¼ bi1bj1 þ bi2bj2 þ þ binbjn ð3Þ By (2) and (3), we have cij ¼ dij. Thus, BBT ¼ I. Accordingly, B is orthogonal, and hence, P ¼ BT is orthogonal. 7.38. Prove Theorem 7.13: Let fe1; . . . ; eng be an orthonormal basis of an inner product space V. Let P ¼ ½aij be an orthogonal matrix. Then the following n vectors form an orthonormal basis for V: e0 i ¼ a1ie1 þ a2ie2 þ þ anien; i ¼ 1; 2; . . . ; n CHAPTER 7 Inner Product Spaces, Orthogonality 253
  • 261. Because feig is orthonormal, we get, by Problem 7.18(b), he0 i; e0 ji ¼ a1ia1j þ a2ia2j þ þ anianj ¼ hCi; Cji where Ci denotes the ith column of the orthogonal matrix P ¼ ½aij: Because P is orthogonal, its columns form an orthonormal set. This implies he0 i; e0 ji ¼ hCi; Cji ¼ dij: Thus, fe0 ig is an orthonormal basis. Inner Products And Positive Definite Matrices 7.39. Which of the following symmetric matrices are positive definite? (a) A ¼ 3 4 4 5 , (b) B ¼ 8 3 3 2 , (c) C ¼ 2 1 1 3 , (d) D ¼ 3 5 5 9 Use Theorem 7.14 that a 2 2 real symmetric matrix is positive definite if and only if its diagonal entries are positive and if its determinant is positive. (a) No, because jAj ¼ 15 16 ¼ 1 is negative. (b) Yes. (c) No, because the diagonal entry 3 is negative. (d) Yes. 7.40. Find the values of k that make each of the following matrices positive definite: (a) A ¼ 2 4 4 k , (b) B ¼ 4 k k 9 , (c) C ¼ k 5 5 2 (a) First, k must be positive. Also, jAj ¼ 2k 16 must be positive; that is, 2k 16 0. Hence, k 8. (b) We need jBj ¼ 36 k2 positive; that is, 36 k2 0. Hence, k2 36 or 6 k 6. (c) C can never be positive definite, because C has a negative diagonal entry 2. 7.41. Find the matrix A that represents the usual inner product on R2 relative to each of the following bases of R2 : ðaÞ fv1 ¼ ð1; 3Þ; v2 ¼ ð2; 5Þg; ðbÞ fw1 ¼ ð1; 2Þ; w2 ¼ ð4; 2Þg: (a) Compute hv1; v1i ¼ 1 þ 9 ¼ 10, hv1; v2i ¼ 2 þ 15 ¼ 17, hv2; v2i ¼ 4 þ 25 ¼ 29. Thus, A ¼ 10 17 17 29 . (b) Compute hw1; w1i ¼ 1 þ 4 ¼ 5, hw1; w2i ¼ 4 4 ¼ 0, hw2; w2i ¼ 16 þ 4 ¼ 20. Thus, A ¼ 5 0 0 20 . (Because the basis vectors are orthogonal, the matrix A is diagonal.) 7.42. Consider the vector space P2ðtÞ with inner product h f ; gi ¼ Ð1 1 f ðtÞgðtÞ dt. (a) Find h f ; gi, where f ðtÞ ¼ t þ 2 and gðtÞ ¼ t2 3t þ 4. (b) Find the matrix A of the inner product with respect to the basis f1; t; t2 g of V. (c) Verify Theorem 7.16 by showing that h f ; gi ¼ ½ f T A½g with respect to the basis f1; t; t2 g. (a) h f ; gi ¼ ð1 1 ðt þ 2Þðt2 3t þ 4Þ dt ¼ ð1 1 ðt3 t2 2t þ 8Þ dt ¼ t4 4 t3 3 t2 þ 8t 1 1 ¼ 46 3 (b) Here we use the fact that if r þ s ¼ n, htr ; tr i ¼ ð1 1 tn dt ¼ tnþ1 n þ 1 1 1 ¼ 2=ðn þ 1Þ if n is even; 0 if n is odd: Then h1; 1i ¼ 2, h1; ti ¼ 0, h1; t2 i ¼ 2 3, ht; ti ¼ 2 3, ht; t2 i ¼ 0, ht2 ; t2 i ¼ 2 5. Thus, A ¼ 2 0 2 3 0 2 3 0 2 3 0 2 5 2 4 3 5 254 CHAPTER 7 Inner Product Spaces, Orthogonality
  • 262. (c) We have ½ f T ¼ ð2; 1; 0Þ and ½gT ¼ ð4; 3; 1Þ relative to the given basis. Then ½ f T A½g ¼ ð2; 1; 0Þ 2 0 2 3 0 2 3 0 2 3 0 2 5 2 4 3 5 4 3 1 2 4 3 5 ¼ ð4; 2 3 ; 4 3Þ 4 3 1 2 4 3 5 ¼ 46 3 ¼ h f ; gi 7.43. Prove Theorem 7.14: A ¼ a b b c is positive definite if and only if a and d are positive and jAj ¼ ad b2 is positive. Let u ¼ ½x; yT . Then f ðuÞ ¼ uT Au ¼ ½x; y a b b d x y ¼ ax2 þ 2bxy þ dy2 Suppose f ðuÞ 0 for every u 6¼ 0. Then f ð1; 0Þ ¼ a 0 and f ð0; 1Þ ¼ d 0. Also, we have f ðb; aÞ ¼ aðad b2 Þ 0. Because a 0, we get ad b2 0. Conversely, suppose a 0, b ¼ 0, ad b2 0. Completing the square gives us f ðuÞ ¼ a x2 þ 2b a xy þ b2 a2 y2 þ dy2 b2 a y2 ¼ a x þ by a 2 þ ad b2 a y2 Accordingly, f ðuÞ 0 for every u 6¼ 0. 7.44. Prove Theorem 7.15: Let A be a real positive definite matrix. Then the function hu; vi ¼ uT Av is an inner product on Rn . For any vectors u1; u2, and v, hu1 þ u2; vi ¼ ðu1 þ u2ÞT Av ¼ ðuT 1 þ uT 2 ÞAv ¼ uT 1 Av þ uT 2 Av ¼ hu1; vi þ hu2; vi and, for any scalar k and vectors u; v, hku; vi ¼ ðkuÞT Av ¼ kuT Av ¼ khu; vi Thus ½I1 is satisfied. Because uT Av is a scalar, ðuT AvÞT ¼ uT Av. Also, AT ¼ A because A is symmetric. Therefore, hu; vi ¼ uT Av ¼ ðuT AvÞT ¼ vT AT uTT ¼ vT Au ¼ hv; ui Thus, ½I2 is satisfied. Last, because A is positive definite, XT AX 0 for any nonzero X 2 Rn . Thus, for any nonzero vector v; hv; vi ¼ vT Av 0. Also, h0; 0i ¼ 0T A0 ¼ 0. Thus, ½I3 is satisfied. Accordingly, the function hu; vi ¼ Av is an inner product. 7.45. Prove Theorem 7.16: Let A be the matrix representation of an inner product relative to a basis S of V. Then, for any vectors u; v 2 V, we have hu; vi ¼ ½uT A½v Suppose S ¼ fw1; w2; . . . ; wng and A ¼ ½kij. Hence, kij ¼ hwi; wji. Suppose u ¼ a1w1 þ a2w2 þ þ anwn and v ¼ b1w1 þ b2w2 þ þ bnwn Then hu; vi ¼ P n i¼1 P n j¼1 aibjhwi; wji ð1Þ On the other hand, ½uT A½v ¼ ða1; a2; . . . ; anÞ k11 k12 . . . k1n k21 k22 . . . k2n :::::::::::::::::::::::::::::: kn1 kn2 . . . knn 2 6 6 6 4 3 7 7 7 5 b1 b2 . . . bn 2 6 6 6 6 4 3 7 7 7 7 5 ¼ P n i¼1 aiki1; P n i¼1 aiki2; . . . ; P n i¼1 aikin b1 b2 . . . bn 2 6 6 6 6 4 3 7 7 7 7 5 ¼ P n j¼1 P n i¼1 aibjkij ð2Þ Equations ð1Þ and (2) give us our result. CHAPTER 7 Inner Product Spaces, Orthogonality 255
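Theorem 7.16 can also be checked numerically: form the matrix A = [⟨u_i, u_j⟩] for a basis, take two coordinate vectors, and compare [u]^T A [v] with the inner product of the corresponding vectors computed directly. The Python sketch below (NumPy assumed; the coordinate vectors cu and cv are arbitrary sample choices, and the whole snippet is an informal check) uses the basis of Example 7.14.

import numpy as np

# Basis S of R^3 from Example 7.14 (one basis vector per row)
S = np.array([[1., 1., 0.],
              [1., 2., 3.],
              [1., 3., 5.]])

# Matrix representation of the usual inner product relative to S: A[i, j] = <u_i, u_j>
A = S @ S.T
print(A)   # [[2, 3, 4], [3, 14, 22], [4, 22, 35]], as in Example 7.14

# Pick coordinate vectors [u], [v] and recover the actual vectors u, v
cu = np.array([1., -2., 1.])
cv = np.array([0., 3., -1.])
u = S.T @ cu
v = S.T @ cv

# Theorem 7.16: <u, v> = [u]^T A [v]
print(np.isclose(np.dot(u, v), cu @ A @ cv))   # True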
  • 263. 7.46. Prove Theorem 7.17: Let A be the matrix representation of any inner product on V. Then A is a positive definite matrix. Because hwi; wji ¼ hwj; wii for any basis vectors wi and wj, the matrix A is symmetric. Let X be any nonzero vector in Rn . Then ½u ¼ X for some nonzero vector u 2 V. Theorem 7.16 tells us that XT AX ¼ ½uT A½u ¼ hu; ui 0. Thus, A is positive definite. Complex Inner Product Spaces 7.47. Let V be a complex inner product space. Verify the relation hu; av1 þ bv2i ¼ ahu; v1i þ bhu; v2i Using ½I2 *, ½I1 *, and then ½I2 *, we find hu; av1 þ bv2i ¼ hav1 þ bv2; ui ¼ ahv1; ui þ bhv2; ui ¼ ahv1; ui þ bhv2; ui ¼ ahu; v1i þ bhu; v2i 7.48. Suppose hu; vi ¼ 3 þ 2i in a complex inner product space V. Find (a) hð2 4iÞu; vi; (b) hu; ð4 þ 3iÞvi; (c) hð3 6iÞu; ð5 2iÞvi: (a) hð2 4iÞu; vi ¼ ð2 4iÞhu; vi ¼ ð2 4iÞð3 þ 2iÞ ¼ 14 8i (b) hu; ð4 þ 3iÞvi ¼ ð4 þ 3iÞhu; vi ¼ ð4 3iÞð3 þ 2iÞ ¼ 18 i (c) hð3 6iÞu; ð5 2iÞvi ¼ ð3 6iÞð5 2iÞhu; vi ¼ ð3 6iÞð5 þ 2iÞð3 þ 2iÞ ¼ 129 18i 7.49. Find the Fourier coefficient (component) c and the projection cw of v ¼ ð3 þ 4i; 2 3iÞ along w ¼ ð5 þ i; 2iÞ in C2 . Recall that c ¼ hv; wi=hw; wi. Compute hv; wi ¼ ð3 þ 4iÞð5 þ iÞ þ ð2 3iÞð2iÞ ¼ ð3 þ 4iÞð5 iÞ þ ð2 3iÞð2iÞ ¼ 19 þ 17i 6 4i ¼ 13 þ 13i hw; wi ¼ 25 þ 1 þ 4 ¼ 30 Thus, c ¼ ð13 þ 13iÞ=30 ¼ 13 30 þ 13 30 i: Accordingly, projðv; wÞ ¼ cw ¼ ð26 15 þ 39 15 i; 13 15 þ 1 15 iÞ 7.50. Prove Theorem 7.18 (Cauchy–Schwarz): Let V be a complex inner product space. Then jhu; vij kukkvk. If v ¼ 0, the inequality reduces to 0 0 and hence is valid. Now suppose v 6¼ 0. Using z z ¼ jzj2 (for any complex number z) and hv; ui ¼ hu; vi, we expand ku hu; vitvk2 0, where t is any real value: 0 ku hu; vitvk2 ¼ hu hu; vitv; u hu; vitvi ¼ hu; ui hu; vithu; vi hu; vÞthv; ui þ hu; vihu; vit2 hv; vi ¼ kuk2 2tjhu; vij2 þ jhu; vij2 t2 kvk2 Set t ¼ 1=kvk2 to find 0 kuk2 jhu; vij2 kvk2 , from which jhu; vij2 kvk2 kvk2 . Taking the square root of both sides, we obtain the required inequality. 7.51. Find an orthogonal basis for u? in C3 where u ¼ ð1; i; 1 þ iÞ. Here u? consists of all vectors s ¼ ðx; y; zÞ such that hw; ui ¼ x iy þ ð1 iÞz ¼ 0 Find one solution, say w1 ¼ ð0; 1 i; iÞ. Then find a solution of the system x iy þ ð1 iÞz ¼ 0; ð1 þ iÞy iz ¼ 0 Here z is a free variable. Set z ¼ 1 to obtain y ¼ i=ð1 þ iÞ ¼ ð1 þ iÞ=2 and x ¼ ð3i 3Þ2. Multiplying by 2 yields the solution w2 ¼ ð3i 3, 1 þ i, 2). The vectors w1 and w2 form an orthogonal basis for u? . 256 CHAPTER 7 Inner Product Spaces, Orthogonality
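The complex Fourier coefficient of Problem 7.49 follows the same pattern as in the real case, with the conjugate taken on the second argument. The sketch below (Python with NumPy; an illustrative check, not part of the solution) recomputes c = ⟨v, w⟩ / ⟨w, w⟩ for the vectors of that problem and confirms that v − cw is orthogonal to w.

import numpy as np

def inner(u, v):
    # <u, v> = sum_k u_k * conj(v_k)   (conjugate on the second vector)
    return np.sum(u * np.conj(v))

v = np.array([3 + 4j, 2 - 3j])
w = np.array([5 + 1j, 2j])

c = inner(v, w) / inner(w, w)
print(c)                                    # (13 + 13i)/30
print(c * w)                                # the projection proj(v, w) = c w
print(np.isclose(inner(v - c * w, w), 0))   # v - cw is orthogonal to w: True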
  • 264. 7.52. Find an orthonormal basis of the subspace W of C3 spanned by v1 ¼ ð1; i; 0Þ and v2 ¼ ð1; 2; 1 iÞ: Apply the Gram–Schmidt algorithm. Set w1 ¼ v1 ¼ ð1; i; 0Þ. Compute v2 hv2; w1i hw1; w1i w1 ¼ ð1; 2; 1 iÞ 1 2i 2 ð1; i; 0Þ ¼ ð1 2 þ i; 1 1 2 i; 1 iÞ Multiply by 2 to clear fractions, obtaining w2 ¼ ð1 þ 2i; 2 i; 2 2iÞ. Next find kw1k ¼ ffiffiffi 2 p and then kw2k ¼ ffiffiffiffiffi 18 p . Normalizing fw1; w2g, we obtain the following orthonormal basis of W: u1 ¼ 1 ffiffiffi 2 p ; i ffiffiffi 2 p ; 0 ; u2 ¼ 1 þ 2i ffiffiffiffiffi 18 p ; 2 i ffiffiffiffiffi 18 p ; 2 2i ffiffiffiffiffi 18 p 7.53. Find the matrix P that represents the usual inner product on C3 relative to the basis f1; i; 1 ig. Compute the following six inner products: h1; 1i ¼ 1; hi; ii ¼ i i ¼ 1; h1; ii ¼ i ¼ i; hi; 1 ii ¼ ið1 iÞ ¼ 1 þ i; h1; 1 ii ¼ 1 i ¼ 1 þ i h1 i; 1 ii ¼ 2 Then, using ðu; vÞ ¼ hv; ui, we obtain P ¼ 1 i 1 þ i i 1 1 þ i 1 i 1 i 2 2 4 3 5 (As expected, P is Hermitian; that is, PH ¼ P.) Normed Vector Spaces 7.54. Consider vectors u ¼ ð1; 3; 6; 4Þ and v ¼ ð3; 5; 1; 2Þ in R4 . Find (a) kuk1 and kvj1, (b) kuk1 and kvk1, (c) kuk2 and kvk2, (d) d1ðu; vÞ; d1ðu; vÞ, d2ðu; vÞ. (a) The infinity norm chooses the maximum of the absolute values of the components. Hence, kuk1 ¼ 6 and kvk1 ¼ 5 (b) The one-norm adds the absolute values of the components. Thus, kuk1 ¼ 1 þ 3 þ 6 þ 4 ¼ 14 and kvk1 ¼ 3 þ 5 þ 1 þ 2 ¼ 11 (c) The two-norm is equal to the square root of the sum of the squares of the components (i.e., the norm induced by the usual inner product on R3 ). Thus, kuk2 ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 1 þ 9 þ 36 þ 16 p ¼ ffiffiffiffiffi 62 p and kvk2 ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 9 þ 25 þ 1 þ 4 p ¼ ffiffiffiffiffi 39 p (d) First find u v ¼ ð2; 8; 7; 6Þ. Then d1ðu; vÞ ¼ ku vk1 ¼ 8 d1ðu; vÞ ¼ ku vk1 ¼ 2 þ 8 þ 7 þ 6 ¼ 23 d2ðu; vÞ ¼ ku vk2 ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi 4 þ 64 þ 49 þ 36 p ¼ ffiffiffiffiffiffiffiffi 153 p 7.55. Consider the function f ðtÞ ¼ t2 4t in C½0; 3. (a) Find k f k1, (b) Plot f ðtÞ in the plane R2 , (c) Find k f k1, (d) Find k f k2. (a) We seek k f k1 ¼ maxðj f ðtÞjÞ. Because f ðtÞ is differentiable on ½0; 3, j f ðtÞj has a maximum at a critical point of f ðtÞ (i.e., when the derivative f 0 ðtÞ ¼ 0), or at an endpoint of ½0; 3. Because f 0 ðtÞ ¼ 2t 4, we set 2t 4 ¼ 0 and obtain t ¼ 2 as a critical point. Compute f ð2Þ ¼ 4 8 ¼ 4; f ð0Þ ¼ 0 0 ¼ 0; f ð3Þ ¼ 9 12 ¼ 3 Thus, k f k1 ¼ j f ð2Þj ¼ j 4j ¼ 4. CHAPTER 7 Inner Product Spaces, Orthogonality 257
  • 265. (b) Compute f ðtÞ for various values of t in ½0; 3, for example, t 0 1 2 3 f ðtÞ 0 3 4 3 Plot the points in R2 and then draw a continuous curve through the points, as shown in Fig. 7-8. (c) We seek k f k1 ¼ Ð3 0 j f ðtÞj dt. As indicated in Fig. 7-3, f ðtÞ is negative in ½0; 3; hence, j f ðtÞj ¼ ðt2 4tÞ ¼ 4t t2 k f k1 ¼ ð3 0 ð4t t2 Þ dt ¼ 2t2 t3 3 3 0 ¼ 18 9 ¼ 9 Thus; (d) k f k2 2 ¼ ð3 0 f ðtÞ2 dt ¼ ð3 0 ðt4 8t3 þ 16t2 Þ dt ¼ t5 5 2t4 þ 16t3 3 3 0 ¼ 153 5 . Thus, k f k2 ¼ ffiffiffiffiffiffiffiffi 153 5 r . 7.56. Prove Theorem 7.24: Let V be a normed vector space. Then the function dðu; vÞ ¼ ku vk satisfies the following three axioms of a metric space: ½M1 dðu; vÞ 0; and dðu; vÞ ¼ 0 iff u ¼ v. ½M2 dðu; vÞ ¼ dðv; uÞ. ½M3 dðu; vÞ dðu; wÞ þ dðw; vÞ. If u 6¼ v, then u v 6¼ 0, and hence, dðu; vÞ ¼ ku vk 0. Also, dðu; uÞ ¼ ku uk ¼ k0k ¼ 0. Thus, ½M1 is satisfied. We also have dðu; vÞ ¼ ku vk ¼ k 1ðv uÞk ¼ j 1jkv uk ¼ kv uk ¼ dðv; uÞ and dðu; vÞ ¼ ku vk ¼ kðu wÞ þ ðw vÞk ku wk þ kw vk ¼ dðu; wÞ þ dðw; vÞ Thus, ½M2 and ½M3 are satisfied. SUPPLEMENTARY PROBLEMS Inner Products 7.57. Verify that the following is an inner product on R2 , where u ¼ ðx1; x2Þ and v ¼ ðy1; y2Þ: f ðu; vÞ ¼ x1y1 2x1 y2 2x2 y1 þ 5x2 y2 7.58. Find the values of k so that the following is an inner product on R2 , where u ¼ ðx1; x2Þ and v ¼ ðy1; y2Þ: f ðu; vÞ ¼ x1y1 3x1 y2 3x2 y1 þ kx2 y2 Figure 7-8 258 CHAPTER 7 Inner Product Spaces, Orthogonality
  • 266. 7.59. Consider the vectors u ¼ ð1; 3Þ and v ¼ ð2; 5Þ in R2 . Find (a) hu; vi with respect to the usual inner product in R2 . (b) hu; vi with respect to the inner product in R2 in Problem 7.57. (c) kvk using the usual inner product in R2 . (d) kvk using the inner product in R2 in Problem 7.57. 7.60. Show that each of the following is not an inner product on R3 , where u ¼ ðx1; x2; x3Þ and v ¼ ðy1; y2; y3Þ: (a) hu; vi ¼ x1y1 þ x2y2; (b) hu; vi ¼ x1y2x3 þ y1x2y3. 7.61. Let V be the vector space of m n matrices over R. Show that hA; Bi ¼ trðBT AÞ defines an inner product in V. 7.62. Suppose jhu; vij ¼ kukkvk. (That is, the Cauchy–Schwarz inequality reduces to an equality.) Show that u and v are linearly dependent. 7.63. Suppose f ðu; vÞ and gðu; vÞ are inner products on a vector space V over R. Prove (a) The sum f þ g is an inner product on V, where ð f þ gÞðu; vÞ ¼ f ðu; vÞ þ gðu; vÞ. (b) The scalar product kf , for k 0, is an inner product on V, where ðkf Þðu; vÞ ¼ kf ðu; vÞ. Orthogonality, Orthogonal Complements, Orthogonal Sets 7.64. Let V be the vector space of polynomials over R of degree 2 with inner product defined by h f ; gi ¼ Ð1 0 f ðtÞgðtÞ dt. Find a basis of the subspace W orthogonal to hðtÞ ¼ 2t þ 1. 7.65. Find a basis of the subspace W of R4 orthogonal to u1 ¼ ð1; 2; 3; 4Þ and u2 ¼ ð3; 5; 7; 8Þ. 7.66. Find a basis for the subspace W of R5 orthogonal to the vectors u1 ¼ ð1; 1; 3; 4; 1Þ and u2 ¼ ð1; 2; 1; 2; 1Þ. 7.67. Let w ¼ ð1; 2; 1; 3Þ be a vector in R4 . Find (a) an orthogonal basis for w? ; (b) an orthonormal basis for w? . 7.68. Let W be the subspace of R4 orthogonal to u1 ¼ ð1; 1; 2; 2Þ and u2 ¼ ð0; 1; 2; 1Þ. Find (a) an orthogonal basis for W; (b) an orthonormal basis for W. (Compare with Problem 7.65.) 7.69. Let S consist of the following vectors in R4 : u1 ¼ ð1; 1; 1; 1Þ; u2 ¼ ð1; 1; 1; 1Þ; u3 ¼ ð1; 1; 1; 1Þ; u4 ¼ ð1; 1; 1; 1Þ (a) Show that S is orthogonal and a basis of R4 . (b) Write v ¼ ð1; 3; 5; 6Þ as a linear combination of u1; u2; u3; u4. (c) Find the coordinates of an arbitrary vector v ¼ ða; b; c; dÞ in R4 relative to the basis S. (d) Normalize S to obtain an orthonormal basis of R4 . 7.70. Let M ¼ M2;2 with inner product hA; Bi ¼ trðBT AÞ. Show that the following is an orthonormal basis for M: 1 0 0 0 ; 0 1 0 0 ; 0 0 1 0 ; 0 0 0 1 7.71. Let M ¼ M2;2 with inner product hA; Bi ¼ trðBT AÞ. Find an orthogonal basis for the orthogonal complement of (a) diagonal matrices, (b) symmetric matrices. CHAPTER 7 Inner Product Spaces, Orthogonality 259
  • 267. 7.72. Suppose fu1; u2; . . . ; urg is an orthogonal set of vectors. Show that fk1u1; k2u2; . . . ; krurg is an orthogonal set for any scalars k1; k2; . . . ; kr. 7.73. Let U and W be subspaces of a finite-dimensional inner product space V. Show that (a) ðU þ WÞ? ¼ U? W? ; (b) ðU WÞ? ¼ U? þ W? . Projections, Gram–Schmidt Algorithm, Applications 7.74. Find the Fourier coefficient c and projection cw of v along w, where (a) v ¼ ð2; 3; 5Þ and w ¼ ð1; 5; 2Þ in R3 : (b) v ¼ ð1; 3; 1; 2Þ and w ¼ ð1; 2; 7; 4Þ in R4 : (c) v ¼ t2 and w ¼ t þ 3 in PðtÞ; with inner product h f ; gi ¼ Ð1 0 f ðtÞgðtÞ dt (d) v ¼ 1 2 3 4 and w ¼ 1 1 5 5 in M ¼ M2;2; with inner product hA; Bi ¼ trðBT AÞ: 7.75. Let U be the subspace of R4 spanned by v1 ¼ ð1; 1; 1; 1Þ; v2 ¼ ð1; 1; 2; 2Þ; v3 ¼ ð1; 2; 3; 4Þ (a) Apply the Gram–Schmidt algorithm to find an orthogonal and an orthonormal basis for U. (b) Find the projection of v ¼ ð1; 2; 3; 4Þ onto U. 7.76. Suppose v ¼ ð1; 2; 3; 4; 6Þ. Find the projection of v onto W, or, in other words, find w 2 W that minimizes kv wk, where W is the subspace of R5 spanned by (a) u1 ¼ ð1; 2; 1; 2; 1Þ and u2 ¼ ð1; 1; 2; 1; 1Þ, (b) v1 ¼ ð1; 2; 1; 2; 1Þ and v2 ¼ ð1; 0; 1; 5; 1Þ. 7.77. Consider the subspace W ¼ P2ðtÞ of PðtÞ with inner product h f ; gi ¼ Ð1 0 f ðtÞgðtÞ dt. Find the projection of f ðtÞ ¼ t3 onto W. (Hint: Use the orthogonal polynomials 1; 2t 1, 6t2 6t þ 1 obtained in Problem 7.22.) 7.78. Consider PðtÞ with inner product h f ; gi ¼ Ð1 1 f ðtÞgðtÞ dt and the subspace W ¼ P3ðtÞ: (a) Find an orthogonal basis for W by applying the Gram–Schmidt algorithm to f1; t; t2 ; t3 g. (b) Find the projection of f ðtÞ ¼ t5 onto W. Orthogonal Matrices 7.79. Find the number and exhibit all 2 2 orthogonal matrices of the form 1 3 x y z . 7.80. Find a 3 3 orthogonal matrix P whose first two rows are multiples of u ¼ ð1; 1; 1Þ and v ¼ ð1; 2; 3Þ, respectively. 7.81. Find a symmetric orthogonal matrix P whose first row is ð1 3 ; 2 3 ; 2 3Þ. (Compare with Problem 7.32.) 7.82. Real matrices A and B are said to be orthogonally equivalent if there exists an orthogonal matrix P such that B ¼ PT AP. Show that this relation is an equivalence relation. Positive Definite Matrices and Inner Products 7.83. Find the matrix A that represents the usual inner product on R2 relative to each of the following bases: (a) fv1 ¼ ð1; 4Þ; v2 ¼ ð2; 3Þg, (b) fw1 ¼ ð1; 3Þ; w2 ¼ ð6; 2Þg. 7.84. Consider the following inner product on R2 : f ðu; vÞ ¼ x1y1 2x1y2 2x2y1 þ 5x2y2; where u ¼ ðx1; x2Þ v ¼ ðy1; y2Þ Find the matrix B that represents this inner product on R2 relative to each basis in Problem 7.83. 260 CHAPTER 7 Inner Product Spaces, Orthogonality
  • 268. 7.85. Find the matrix C that represents the usual basis on R3 relative to the basis S of R3 consisting of the vectors u1 ¼ ð1; 1; 1Þ, u2 ¼ ð1; 2; 1Þ, u3 ¼ ð1; 1; 3Þ. 7.86. Let V ¼ P2ðtÞ with inner product h f ; gi ¼ Ð1 0 f ðtÞgðtÞ dt. (a) Find h f ; gi, where f ðtÞ ¼ t þ 2 and gðtÞ ¼ t2 3t þ 4. (b) Find the matrix A of the inner product with respect to the basis f1; t; t2 g of V. (c) Verify Theorem 7.16 that h f ; gi ¼ ½ f T A½g with respect to the basis f1; t; t2 g. 7.87. Determine which of the following matrices are positive definite: (a) 1 3 3 5 , (b) 3 4 4 7 , (c) 4 2 2 1 , (d) 6 7 7 9 . 7.88. Suppose A and B are positive definite matrices. Show that: (a) A þ B is positive definite and (b) kA is positive definite for k 0. 7.89. Suppose B is a real nonsingular matrix. Show that: (a) BT B is symmetric and (b) BT B is positive definite. Complex Inner Product Spaces 7.90. Verify that ha1u1 þ a2u2 b1v1 þ b2v2i ¼ a1 b1hu1; v1i þ a1 b2hu1; v2i þ a2 b1hu2; v1i þ a2 b2hu2; v2i More generally, prove that h Pm i¼1 aiui; Pn j¼1 bjvji ¼ P i;j ai bjhui; vii. 7.91. Consider u ¼ ð1 þ i; 3; 4 iÞ and v ¼ ð3 4i; 1 þ i; 2iÞ in C3 . Find (a) hu; vi, (b) hv; ui, (c) kuk, (d) kvk, (e) dðu; vÞ. 7.92. Find the Fourier coefficient c and the projection cw of (a) u ¼ ð3 þ i; 5 2iÞ along w ¼ ð5 þ i; 1 þ iÞ in C2 , (b) u ¼ ð1 i; 3i; 1 þ iÞ along w ¼ ð1; 2 i; 3 þ 2iÞ in C3 . 7.93. Let u ¼ ðz1; z2Þ and v ¼ ðw1; w2Þ belong to C2 . Verify that the following is an inner product of C2 : f ðu; vÞ ¼ z1 w1 þ ð1 þ iÞz1 w2 þ ð1 iÞz2 w1 þ 3z2 w2 7.94. Find an orthogonal basis and an orthonormal basis for the subspace W of C3 spanned by u1 ¼ ð1; i; 1Þ and u2 ¼ ð1 þ i; 0; 2Þ. 7.95. Let u ¼ ðz1; z2Þ and v ¼ ðw1; w2Þ belong to C2 . For what values of a; b; c; d 2 C is the following an inner product on C2 ? f ðu; vÞ ¼ az1 w1 þ bz1 w2 þ cz2 w1 þ dz2 w2 7.96. Prove the following form for an inner product in a complex space V: hu; vi ¼ 1 4 ku þ vk2 1 4 ku vk2 þ 1 4 ku þ ivk2 1 4 ku ivk2 [Compare with Problem 7.7(b).] 7.97. Let V be a real inner product space. Show that (i) kuk ¼ kvk if and only if hu þ v; u vi ¼ 0; (ii) ku þ vk2 ¼ kuk2 þ kvk2 if and only if hu; vi ¼ 0. Show by counterexamples that the above statements are not true for, say, C2 . 7.98. Find the matrix P that represents the usual inner product on C3 relative to the basis f1; 1 þ i; 1 2ig. CHAPTER 7 Inner Product Spaces, Orthogonality 261
  • 269. 7.99. A complex matrix A is unitary if it is invertible and A1 ¼ AH . Alternatively, A is unitary if its rows (columns) form an orthonormal set of vectors (relative to the usual inner product of Cn ). Find a unitary matrix whose first row is: (a) a multiple of ð1; 1 iÞ; (b) a multiple of ð1 2 ; 1 2 i; 1 2 1 2 iÞ. Normed Vector Spaces 7.100. Consider vectors u ¼ ð1; 3; 4; 1; 2Þ and v ¼ ð3; 1; 2; 3; 1Þ in R5 . Find (a) kuk1 and kvk1 , (b) kuk1 and kvk1, (c) kuk2 and kvk2, (d) d1ðu; vÞ; d1ðu; vÞ, d2ðu; vÞ 7.101. Repeat Problem 7.100 for u ¼ ð1 þ i; 2 4iÞ and v ¼ ð1 i; 2 þ 3iÞ in C2 . 7.102. Consider the functions f ðtÞ ¼ 5t t2 and gðtÞ ¼ 3t t2 in C½0; 4. Find (a) d1ð f ; gÞ, (b) d1ð f ; gÞ, (c) d2ð f ; gÞ 7.103. Prove (a) k k1 is a norm on Rn . (b) k k1 is a norm on Rn . 7.104. Prove (a) k k1 is a norm on C½a; b. (b) k k1 is a norm on C½a; b. ANSWERS TO SUPPLEMENTARY PROBLEMS Notation: M ¼ ½R1; R2; . . . denotes a matrix M with rows R1; R2; : . . . Also, basis need not be unique. 7.58. k 9 7.59. (a) 13, (b) 71, (c) ffiffiffiffiffi 29 p , (d) ffiffiffiffiffi 89 p 7.60. Let u ¼ ð0; 0; 1Þ; then hu; ui ¼ 0 in both cases 7.64. f7t2 5t; 12t2 5g 7.65. fð1; 2; 1; 0Þ; ð4; 4; 0; 1Þg 7.66. ð1; 0; 0; 0; 1Þ; ð6; 2; 0; 1; 0Þ; ð5; 2; 1; 0; 0Þ 7.67. (a) u1 ¼ ð0; 0; 3; 1Þ; u2 ¼ ð0; 5; 1; 3Þ; u3 ¼ ð14; 2; 1; 3Þ; (b) u1= ffiffiffiffiffi 10 p ; u2= ffiffiffiffiffi 35 p ; u3= ffiffiffiffiffiffiffiffi 210 p 7.68. (a) ð0; 2; 1; 0Þ; ð15; 1; 2; 5Þ, (b) ð0; 2; 1; 0Þ= ffiffiffi 5 p ; ð15; 1; 2; 5Þ= ffiffiffiffiffiffiffiffi 255 p 7.69. (b) v ¼ 1 4 ð5u1 þ 3u2 13u3 þ 9u4Þ, (c) ½v ¼ 1 4 ½a þ b þ c þ d; a þ b c d; a b þ c d; a b c þ d 7.71. (a) ½0; 1; 0; 0; ½0; 0; 1; 0, (b) ½0; 1; 1; 0 7.74. (a) c ¼ 23 30, (b) c ¼ 1 7, (c) c ¼ 15 148, (d) c ¼ 19 26 7.75. (a) w1 ¼ ð1; 1; 1; 1Þ; w2 ¼ ð0; 2; 1; 1Þ; w3 ¼ ð12; 4; 1; 7Þ, (b) projðv; UÞ ¼ 1 5 ð1; 12; 3; 6Þ 7.76. (a) projðv; WÞ ¼ 1 8 ð23; 25; 30; 25; 23Þ, (b) First find an orthogonal basis for W; say, w1 ¼ ð1; 2; 1; 2; 1Þ and w2 ¼ ð0; 2; 0; 3; 2Þ. Then projðv; WÞ ¼ 1 17 ð34; 76; 34; 56; 42Þ 7.77. projð f ; WÞ ¼ 3 2 t2 3 5 t þ 1 20 262 CHAPTER 7 Inner Product Spaces, Orthogonality
  • 270. 7.78. (a) f1; t; 3t2 1; 5t3 3tg, projð f ; WÞ ¼ 10 9 t3 5 21 t 7.79. Four: ½a; b; b; a, ½a; b; b; a, ½a; b; b; a, ½a; b; b; a, where a ¼ 1 3 and b ¼ 1 3 ffiffiffi 8 p 7.80. P ¼ ½1=a; 1=a; 1=a; 1=b; 2=b; 3=b; 5=c; 2=c; 3=c, where a ¼ ffiffiffi 3 p ; b ¼ ffiffiffiffiffi 14 p ; c ¼ ffiffiffiffiffi 38 p 7.81. 1 3 ½1; 2; 2; 2; 2; 1; 2; 1; 2 7.83. (a) ½17; 10; 10; 13, (b) ½10; 0; 0; 40 7.84. (a) ½65; 68; 68; 73, (b) ½58; 8; 8; 8 7.85. ½3; 4; 3; 4; 6; 2; 3; 2; 11 7.86. (a) 83 12, (b) ½1; a; b; a; b; c; b; c; d, where a ¼ 1 2, b ¼ 1 3, c ¼ 1 4, d ¼ 1 5 7.87. (a) No, (b) Yes, (c) No, (d) Yes 7.91. (a) 4i, (b) 4i, (c) ffiffiffiffiffi 28 p , (d) ffiffiffiffiffi 31 p , (e) ffiffiffiffiffi 59 p 7.92. (a) c ¼ 1 28 ð19 5iÞ, (b) c ¼ 1 19 ð3 þ 6iÞ 7.94. fv1 ¼ ð1; i; 1Þ= ffiffiffi 3 p ; v2 ¼ ð2i; 1 3i; 3 iÞ= ffiffiffiffiffi 24 p g 7.95. a and d real and positive, c ¼ b and ad bc positive. 7.97. u ¼ ð1; 2Þ; v ¼ ði; 2iÞ 7.98. P ¼ ½1; 1 i; 1 þ 2i; 1 þ i; 2; 1 þ 3i; 1 2i; 1 3i; 5 7.99. (a) ð1= ffiffiffi 3 p Þ½1; 1 i; 1 þ i; 1, (b) ½a; ai; a ai; bi; b; 0; a; ai; a ai, where a ¼ 1 2 and b ¼ 1= ffiffiffi 2 p . 7.100. (a) 4 and 3, (b) 11 and 10, (c) ffiffiffiffiffi 31 p and ffiffiffiffiffi 24 p , (d) 6; 19; 9 7.101. (a) ffiffiffiffiffi 20 p and ffiffiffiffiffi 13 p , (b) ffiffiffi 2 p þ ffiffiffiffiffi 20 p and ffiffiffi 2 p þ ffiffiffiffiffi 13 p , (c) ffiffiffiffiffi 22 p and ffiffiffiffiffi 15 p , (d) 7; 9; ffiffiffiffiffi 53 p 7.102. (a) 8, (b) 16, (c) 16= ffiffiffi 3 p CHAPTER 7 Inner Product Spaces, Orthogonality 263
CHAPTER 8  Determinants

8.1 Introduction

Each n-square matrix A = [a_ij] is assigned a special scalar called the determinant of A, denoted by det(A) or |A| or

    |a11, a12, ..., a1n; a21, a22, ..., a2n; ...; an1, an2, ..., ann|

We emphasize that an n × n array of scalars enclosed by straight lines, called a determinant of order n, is not a matrix but denotes the determinant of the enclosed array of scalars (i.e., the enclosed matrix).

The determinant function was first discovered during the investigation of systems of linear equations. We shall see that the determinant is an indispensable tool in investigating and obtaining properties of square matrices. The definition of the determinant and most of its properties also apply in the case where the entries of a matrix come from a commutative ring.

We begin with a special case of determinants of orders 1, 2, and 3. Then we define a determinant of arbitrary order. This general definition is preceded by a discussion of permutations, which is necessary for our general definition of the determinant.

8.2 Determinants of Orders 1 and 2

Determinants of orders 1 and 2 are defined as follows:

    |a11| = a11    and    |a11, a12; a21, a22| = a11 a22 − a12 a21

Thus, the determinant of a 1 × 1 matrix A = [a11] is the scalar a11; that is, det(A) = |a11| = a11. The determinant of order two may easily be remembered by using the following diagram:

    [Diagram: the 2 × 2 array with a plus-labeled arrow through a11 and a22 and a minus-labeled arrow through a12 and a21.]

That is, the determinant is equal to the product of the elements along the plus-labeled arrow minus the product of the elements along the minus-labeled arrow. (There is an analogous diagram for determinants of order 3, but not for higher-order determinants.)

EXAMPLE 8.1

(a) Because the determinant of order 1 is the scalar itself, we have

    det(27) = 27,    det(−7) = −7,    det(t − 3) = t − 3

(b) |5, 3; 4, 6| = 5(6) − 3(4) = 30 − 12 = 18,    |3, 2; −5, 7| = 21 + 10 = 31
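The order-2 formula is easy to verify against a general determinant routine. The brief Python sketch below (NumPy assumed; the helper det2 is our own name and the snippet is an illustration only) recomputes the determinants of Example 8.1(b).

import numpy as np

def det2(M):
    # Order-2 formula: a11*a22 - a12*a21
    return M[0, 0] * M[1, 1] - M[0, 1] * M[1, 0]

M1 = np.array([[5., 3.], [4., 6.]])
M2 = np.array([[3., 2.], [-5., 7.]])

print(det2(M1), np.linalg.det(M1))   # 18.0 and 18.0 (up to rounding)
print(det2(M2), np.linalg.det(M2))   # 31.0 and 31.0 (up to rounding)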
Application to Linear Equations

Consider two linear equations in two unknowns, say

    a1 x + b1 y = c1
    a2 x + b2 y = c2

Let D = a1 b2 − a2 b1, the determinant of the matrix of coefficients. Then the system has a unique solution if and only if D ≠ 0. In such a case, the unique solution may be expressed completely in terms of determinants as follows:

    x = Nx / D = (b2 c1 − b1 c2) / (a1 b2 − a2 b1) = |c1, b1; c2, b2| / |a1, b1; a2, b2|
    y = Ny / D = (a1 c2 − a2 c1) / (a1 b2 − a2 b1) = |a1, c1; a2, c2| / |a1, b1; a2, b2|

Here D appears in the denominator of both quotients. The numerators Nx and Ny of the quotients for x and y, respectively, can be obtained by substituting the column of constant terms in place of the column of coefficients of the given unknown in the matrix of coefficients. On the other hand, if D = 0, then the system may have no solution or more than one solution.

EXAMPLE 8.2  Solve by determinants the system

    4x − 3y = 15
    2x + 5y = 1

First find the determinant D of the matrix of coefficients:

    D = |4, −3; 2, 5| = 4(5) − (−3)(2) = 20 + 6 = 26

Because D ≠ 0, the system has a unique solution. To obtain the numerators Nx and Ny, simply replace, in the matrix of coefficients, the coefficients of x and y, respectively, by the constant terms, and then take their determinants:

    Nx = |15, −3; 1, 5| = 75 + 3 = 78,    Ny = |4, 15; 2, 1| = 4 − 30 = −26

Then the unique solution of the system is

    x = Nx / D = 78/26 = 3,    y = Ny / D = −26/26 = −1

8.3 Determinants of Order 3

Consider an arbitrary 3 × 3 matrix A = [a_ij]. The determinant of A is defined as follows:

    det(A) = |a11, a12, a13; a21, a22, a23; a31, a32, a33|
           = a11 a22 a33 + a12 a23 a31 + a13 a21 a32 − a13 a22 a31 − a12 a21 a33 − a11 a23 a32

Observe that there are six products, each product consisting of three elements of the original matrix. Three of the products are plus-labeled (keep their sign) and three of the products are minus-labeled (change their sign).
  • 273. Fig. 8-1 plus the sum of the negatives of the products of the elements along the three minus-labeled arrows. We emphasize that there are no such diagrammatic devices with which to remember determinants of higher order. EXAMPLE 8.3 Let A ¼ 2 1 1 0 5 2 1 3 4 2 4 3 5 and B ¼ 3 2 1 4 5 1 2 3 4 2 4 3 5. Find detðAÞ and detðBÞ. Use the diagrams in Fig. 8-1: detðAÞ ¼ 2ð5Þð4Þ þ 1ð2Þð1Þ þ 1ð3Þð0Þ 1ð5Þð1Þ ð3Þð2Þð2Þ 4ð1Þð0Þ ¼ 40 2 þ 0 5 12 0 ¼ 21 detðBÞ ¼ 60 4 þ 12 10 9 þ 32 ¼ 81 Alternative Form for a Determinant of Order 3 The determinant of the 3 3 matrix A ¼ ½aij may be rewritten as follows: detðAÞ ¼ a11ða22a23 a23a32Þ a12ða21a33 a23a31Þ þ a13ða21a32 a22a31Þ ¼ a11 a22 a23 a32 a33 a12 a21 a23 a31 a33 þ a13 a21 a22 a31 a32 which is a linear combination of three determinants of order 2 whose coefficients (with alternating signs) form the first row of the given matrix. This linear combination may be indicated in the form a11 a11 a12 a13 a21 a22 a23 a31 a32 a33 a12 a11 a12 a13 a21 a22 a23 a31 a32 a33 þ a13 a11 a12 a13 a21 a22 a23 a31 a32 a33 Note that each 2 2 matrix can be obtained by deleting, in the original matrix, the row and column containing its coefficient. EXAMPLE 8.4 1 2 3 4 2 3 0 5 1 ¼ 1 1 2 3 4 2 3 0 5 1 2 1 2 3 4 2 3 0 5 1 þ 3 1 2 3 4 2 3 0 5 1 ¼ 1 2 3 5 1 2 4 3 0 1 þ 3 4 2 0 5 ¼ 1ð2 15Þ 2ð4 þ 0Þ þ 3ð20 þ 0Þ ¼ 13 þ 8 þ 60 ¼ 55 Figure 8-1 266 CHAPTER 8 Determinants
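The linear-combination form above translates directly into a small routine. The Python sketch below (det3_first_row is our own illustrative name; the book prescribes no code) expands a 3 × 3 determinant along its first row exactly as in the displayed formula, and it reproduces the value 55 obtained in Example 8.4 for the matrix with rows (1, 2, 3), (4, -2, 3), (0, 5, -1).

```python
def det3_first_row(A):
    """3x3 determinant expanded along the first row:
    a11(a22*a33 - a23*a32) - a12(a21*a33 - a23*a31) + a13(a21*a32 - a22*a31)."""
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = A
    return (a11 * (a22 * a33 - a23 * a32)
            - a12 * (a21 * a33 - a23 * a31)
            + a13 * (a21 * a32 - a22 * a31))

# The matrix of Example 8.4
print(det3_first_row([[1, 2, 3], [4, -2, 3], [0, 5, -1]]))   # 55
```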
• 274. 8.4 Permutations

A permutation $\sigma$ of the set $\{1, 2, \ldots, n\}$ is a one-to-one mapping of the set onto itself or, equivalently, a rearrangement of the numbers $1, 2, \ldots, n$. Such a permutation $\sigma$ is denoted by

$$\sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ j_1 & j_2 & \cdots & j_n \end{pmatrix} \quad \text{or} \quad \sigma = j_1 j_2 \cdots j_n, \quad \text{where } j_i = \sigma(i)$$

The set of all such permutations is denoted by $S_n$, and the number of such permutations is $n!$. If $\sigma \in S_n$, then the inverse mapping $\sigma^{-1} \in S_n$; and if $\sigma, \tau \in S_n$, then the composition mapping $\sigma \circ \tau \in S_n$. Also, the identity mapping $\varepsilon = \sigma \circ \sigma^{-1} \in S_n$. (In fact, $\varepsilon = 123\cdots n$.)

EXAMPLE 8.5
(a) There are $2! = 2 \cdot 1 = 2$ permutations in $S_2$; they are 12 and 21.
(b) There are $3! = 3 \cdot 2 \cdot 1 = 6$ permutations in $S_3$; they are 123, 132, 213, 231, 312, 321.

Sign (Parity) of a Permutation

Consider an arbitrary permutation $\sigma$ in $S_n$, say $\sigma = j_1 j_2 \cdots j_n$. We say $\sigma$ is an even or odd permutation according to whether there is an even or odd number of inversions in $\sigma$. By an inversion in $\sigma$ we mean a pair of integers $(i, k)$ such that $i > k$ but $i$ precedes $k$ in $\sigma$. We then define the sign or parity of $\sigma$, written $\operatorname{sgn} \sigma$, by

$$\operatorname{sgn} \sigma = \begin{cases} 1 & \text{if } \sigma \text{ is even} \\ -1 & \text{if } \sigma \text{ is odd} \end{cases}$$

EXAMPLE 8.6
(a) Find the sign of $\sigma = 35142$ in $S_5$. For each element $k$, we count the number of elements $i$ such that $i > k$ and $i$ precedes $k$ in $\sigma$. There are:
  2 numbers (3 and 5) greater than and preceding 1,
  3 numbers (3, 5, and 4) greater than and preceding 2,
  1 number (5) greater than and preceding 4.
(There are no numbers greater than and preceding either 3 or 5.) Because there are, in all, six inversions, $\sigma$ is even and $\operatorname{sgn} \sigma = 1$.
(b) The identity permutation $\varepsilon = 123\cdots n$ is even because there are no inversions in $\varepsilon$.
(c) In $S_2$, the permutation 12 is even and 21 is odd. In $S_3$, the permutations 123, 231, 312 are even and the permutations 132, 213, 321 are odd.
(d) Let $\tau$ be the permutation that interchanges two numbers $i$ and $j$ and leaves the other numbers fixed. That is,
$$\tau(i) = j, \qquad \tau(j) = i, \qquad \tau(k) = k \ \text{ for } k \neq i, j$$
We call $\tau$ a transposition. If $i < j$, then there are $2(j - i) - 1$ inversions in $\tau$, and hence, the transposition $\tau$ is odd.

Remark: One can show that, for any $n$, half of the permutations in $S_n$ are even and half of them are odd. For example, 3 of the 6 permutations in $S_3$ are even, and 3 are odd.

8.5 Determinants of Arbitrary Order

Let $A = [a_{ij}]$ be a square matrix of order $n$ over a field $K$. Consider a product of $n$ elements of $A$ such that one and only one element comes from each row and one and only one element comes from each column. Such a product can be written in the form

$$a_{1j_1} a_{2j_2} \cdots a_{nj_n}$$
  • 275. that is, where the factors come from successive rows, and so the first subscripts are in the natural order 1; 2; . . . ; n. Now because the factors come from different columns, the sequence of second subscripts forms a permutation s ¼ j1 j2 jn in Sn. Conversely, each permutation in Sn determines a product of the above form. Thus, the matrix A contains n! such products. DEFINITION: The determinant of A ¼ ½aij, denoted by detðAÞ or jAj, is the sum of all the above n! products, where each such product is multiplied by sgn s. That is, jAj ¼ P s ðsgn sÞa1j1 a2j2 anjn or jAj ¼ P s2Sn ðsgn sÞa1sð1Þa2sð2Þ ansðnÞ The determinant of the n-square matrix A is said to be of order n. The next example shows that the above definition agrees with the previous definition of determinants of orders 1, 2, and 3. EXAMPLE 8.7 (a) Let A ¼ ½a11 be a 1 1 matrix. Because S1 has only one permutation, which is even, detðAÞ ¼ a11, the number itself. (b) Let A ¼ ½aij be a 2 2 matrix. In S2, the permutation 12 is even and the permutation 21 is odd. Hence, detðAÞ ¼ a11 a12 a21 a22 ¼ a11a22 a12a21 (c) Let A ¼ ½aij be a 3 3 matrix. In S3, the permutations 123, 231, 312 are even, and the permutations 321, 213, 132 are odd. Hence, detðAÞ ¼ a11 a12 a13 a21 a22 a23 a31 a32 a33 ¼ a11a22a33 þ a12a23a31 þ a13a21a32 a13a22a31 a12a21a33 a11a23a32 Remark: As n increases, the number of terms in the determinant becomes astronomical. Accordingly, we use indirect methods to evaluate determinants rather than the definition of the determinant. In fact, we prove a number of properties about determinants that will permit us to shorten the computation considerably. In particular, we show that a determinant of order n is equal to a linear combination of determinants of order n 1, as in the case n ¼ 3 above. 8.6 Properties of Determinants We now list basic properties of the determinant. THEOREM 8.1: The determinant of a matrix A and its transpose AT are equal; that is, jAj ¼ jAT j. By this theorem (proved in Problem 8.22), any theorem about the determinant of a matrix A that concerns the rows of A will have an analogous theorem concerning the columns of A. The next theorem (proved in Problem 8.24) gives certain cases for which the determinant can be obtained immediately. THEOREM 8.2: Let A be a square matrix. (i) If A has a row (column) of zeros, then jAj ¼ 0. (ii) If A has two identical rows (columns), then jAj ¼ 0. 268 CHAPTER 8 Determinants
  • 276. (iii) If A is triangular (i.e., A has zeros above or below the diagonal), then jAj ¼ product of diagonal elements. Thus, in particular, jIj ¼ 1, where I is the identity matrix. The next theorem (proved in Problems 8.23 and 8.25) shows how the determinant of a matrix is affected by the elementary row and column operations. THEOREM 8.3: Suppose B is obtained from A by an elementary row (column) operation. (i) If two rows (columns) of A were interchanged, then jBj ¼ jAj. (ii) If a row (column) of A were multiplied by a scalar k, then jBj ¼ kjAj. (iii) If a multiple of a row (column) of A were added to another row (column) of A, then jBj ¼ jAj. Major Properties of Determinants We now state two of the most important and useful theorems on determinants. THEOREM 8.4: The determinant of a product of two matrices A and B is the product of their determinants; that is, detðABÞ ¼ detðAÞ detðBÞ The above theorem says that the determinant is a multiplicative function. THEOREM 8.5: Let A be a square matrix. Then the following are equivalent: (i) A is invertible; that is, A has an inverse A1 . (ii) AX ¼ 0 has only the zero solution. (iii) The determinant of A is not zero; that is, detðAÞ 6¼ 0. Remark: Depending on the author and the text, a nonsingular matrix A is defined to be an invertible matrix A, or a matrix A for which jAj 6¼ 0, or a matrix A for which AX ¼ 0 has only the zero solution. The above theorem shows that all such definitions are equivalent. We will prove Theorems 8.4 and 8.5 (in Problems 8.29 and 8.28, respectively) using the theory of elementary matrices and the following lemma (proved in Problem 8.26), which is a special case of Theorem 8.4. LEMMA 8.6: Let E be an elementary matrix. Then, for any matrix A; jEAj ¼ jEjjAj. Recall that matrices A and B are similar if there exists a nonsingular matrix P such that B ¼ P1 AP. Using the multiplicative property of the determinant (Theorem 8.4), one can easily prove (Problem 8.31) the following theorem. THEOREM 8.7: Suppose A and B are similar matrices. Then jAj ¼ jBj. 8.7 Minors and Cofactors Consider an n-square matrix A ¼ ½aij. Let Mij denote the ðn 1Þ-square submatrix of A obtained by deleting its ith row and jth column. The determinant jMijj is called the minor of the element aij of A, and we define the cofactor of aij, denoted by Aij; to be the ‘‘signed’’ minor: Aij ¼ ð1Þiþj jMijj CHAPTER 8 Determinants 269
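Before the cofactor machinery just defined is put to work in the next subsection, it may help to see the permutation-sum definition of Sections 8.4 and 8.5 carried out literally. The Python sketch below is only an illustration (sgn and det_from_definition are our own names): it counts inversions to obtain the sign of a permutation and then forms the signed sum over all n! permutations. It reproduces the sign found in Example 8.6(a) and the determinant 21 of the matrix A of Example 8.3, whose rows are (2, 1, 1), (0, 5, -2), (1, -3, 4). The factorial number of terms is, of course, exactly why the cofactor expansions and row operations of this chapter are preferred in practice.

```python
from itertools import permutations

def sgn(perm):
    """Sign of a permutation in one-line form: +1 if the number of inversions
    (pairs appearing out of natural order) is even, -1 if it is odd."""
    inv = sum(1 for a in range(len(perm)) for b in range(a + 1, len(perm))
              if perm[a] > perm[b])
    return 1 if inv % 2 == 0 else -1

def det_from_definition(A):
    """|A| as the signed sum over all n! permutations (Section 8.5)."""
    n = len(A)
    total = 0
    for s in permutations(range(n)):
        term = sgn(s)
        for i in range(n):
            term *= A[i][s[i]]
        total += term
    return total

print(sgn((3, 5, 1, 4, 2)))                                      # 1  (Example 8.6(a))
print(det_from_definition([[2, 1, 1], [0, 5, -2], [1, -3, 4]]))  # 21 (Example 8.3)
```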
  • 277. Note that the ‘‘signs’’ ð1Þiþj accompanying the minors form a chessboard pattern with þ’s on the main diagonal: þ þ . . . þ þ . . . þ þ . . . ::::::::::::::::::::::::::::::: 2 6 6 4 3 7 7 5 We emphasize that Mij denotes a matrix, whereas Aij denotes a scalar. Remark: The sign ð1Þiþj of the cofactor Aij is frequently obtained using the checkerboard pattern. Specifically, beginning with þ and alternating signs: þ; ; þ; ; . . . ; count from the main diagonal to the appropriate square. EXAMPLE 8.8 Let A ¼ 1 2 3 4 5 6 7 8 9 2 4 3 5. Find the following minors and cofactors: (a) jM23j and A23, (b) jM31j and A31. (a) jM23j ¼ 1 2 3 4 5 6 7 8 9 ¼ 1 2 7 8 ¼ 8 14 ¼ 6, and so A23 ¼ ð1Þ2þ3 jM23j ¼ ð6Þ ¼ 6 (b) jM31j ¼ 1 2 3 4 5 6 7 8 9 ¼ 2 3 5 6 ¼ 12 15 ¼ 3, and so A31 ¼ ð1Þ1þ3 jM31j ¼ þð3Þ ¼ 3 Laplace Expansion The following theorem (proved in Problem 8.32) holds. THEOREM 8.8: (Laplace) The determinant of a square matrix A ¼ ½aij is equal to the sum of the products obtained by multiplying the elements of any row (column) by their respective cofactors: jAj ¼ ai1Ai1 þ ai2Ai2 þ þ ainAin ¼ P n j¼1 aijAij jAj ¼ a1jA1j þ a2jA2j þ þ anjAnj ¼ P n i¼1 aijAij The above formulas for jAj are called the Laplace expansions of the determinant of A by the ith row and the jth column. Together with the elementary row (column) operations, they offer a method of simplifying the computation of jAj, as described below. 8.8 Evaluation of Determinants The following algorithm reduces the evaluation of a determinant of order n to the evaluation of a determinant of order n 1. ALGORITHM 8.1: (Reduction of the order of a determinant) The input is a nonzero n-square matrix A ¼ ½aij with n 1. Step 1. Choose an element aij ¼ 1 or, if lacking, aij 6¼ 0. Step 2. Using aij as a pivot, apply elementary row (column) operations to put 0’s in all the other positions in the column (row) containing aij. Step 3. Expand the determinant by the column (row) containing aij. 270 CHAPTER 8 Determinants
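Theorem 8.8 and Algorithm 8.1 translate directly into a few lines of code. The sketch below is a minimal illustration (cofactor and det_by_laplace are our own names; NumPy handles the row/column deletion and supplies an independent check): it forms a cofactor by deleting one row and one column and then expands |A| along a chosen row. It reproduces the cofactor A_23 = 6 of Example 8.8(a).

```python
import numpy as np

def cofactor(A, i, j):
    """A_ij = (-1)^(i+j) |M_ij|, where M_ij deletes row i and column j.
    Indices here are 0-based; the book's indices are 1-based."""
    M = np.delete(np.delete(A, i, axis=0), j, axis=1)
    return (-1) ** (i + j) * np.linalg.det(M)

def det_by_laplace(A, row=0):
    """Laplace expansion of |A| along one row (Theorem 8.8)."""
    return sum(A[row, j] * cofactor(A, row, j) for j in range(A.shape[1]))

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=float)
print(round(cofactor(A, 1, 2)))   # 6, the cofactor A_23 of Example 8.8(a)
print(round(det_by_laplace(A)))   # 0; expansion along any row gives the same value
print(round(np.linalg.det(A)))    # 0 (independent check)
```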
  • 278. The following remarks are in order. Remark 1: Algorithm 8.1 is usually used for determinants of order 4 or more. With determinants of order less than 4, one uses the specific formulas for the determinant. Remark 2: Gaussian elimination or, equivalently, repeated use of Algorithm 8.1 together with row interchanges can be used to transform a matrix A into an upper triangular matrix whose determinant is the product of its diagonal entries. However, one must keep track of the number of row interchanges, because each row interchange changes the sign of the determinant. EXAMPLE 8.9 Use Algorithm 8.1 to find the determinant of A ¼ 5 4 2 1 2 3 1 2 5 7 3 9 1 2 1 4 2 6 6 4 3 7 7 5. Use a23 ¼ 1 as a pivot to put 0’s in the other positions of the third column; that is, apply the row operations ‘‘Replace R1 by 2R2 þ R1,’’ ‘‘Replace R3 by 3R2 þ R3,’’ and ‘‘Replace R4 by R2 þ R4.’’ By Theorem 8.3(iii), the value of the determinant does not change under these operations. Thus, jAj ¼ 5 4 2 1 2 3 1 2 5 7 3 9 1 2 1 4 ¼ 1 2 0 5 2 3 1 2 1 2 0 3 3 1 0 2 Now expand by the third column. Specifically, neglect all terms that contain 0 and use the fact that the sign of the minor M23 is ð1Þ2þ3 ¼ 1. Thus, jAj ¼ 1 2 0 5 2 3 1 2 1 2 0 3 3 1 0 2 ¼ 1 2 5 1 2 3 3 1 2 ¼ ð4 18 þ 5 30 3 þ 4Þ ¼ ð38Þ ¼ 38 8.9 Classical Adjoint Let A ¼ ½aij be an n n matrix over a field K and let Aij denote the cofactor of aij. The classical adjoint of A, denoted by adj A, is the transpose of the matrix of cofactors of A. Namely, adj A ¼ ½AijT We say ‘‘classical adjoint’’ instead of simply ‘‘adjoint’’ because the term ‘‘adjoint’’ is currently used for an entirely different concept. EXAMPLE 8.10 Let A ¼ 2 3 4 0 4 2 1 1 5 2 4 3 5. The cofactors of the nine elements of A follow: A11 ¼ þ 4 2 1 5 ¼ 18; A21 ¼ 3 4 1 5 ¼ 11; A31 ¼ þ 3 4 4 2 ¼ 10; A12 ¼ 0 2 1 5 ¼ 2; A22 ¼ þ 2 4 1 5 ¼ 14; A32 ¼ 2 4 0 2 ¼ 4; A13 ¼ þ 0 4 1 1 ¼ 4 A23 ¼ 2 3 1 1 ¼ 5 A33 ¼ þ 2 3 0 4 ¼ 8 CHAPTER 8 Determinants 271
• 279. The transpose of the above matrix of cofactors yields the classical adjoint of $A$; that is,

$$\operatorname{adj} A = \begin{bmatrix} -18 & -11 & -10 \\ 2 & 14 & -4 \\ 4 & 5 & -8 \end{bmatrix}$$

The following theorem (proved in Problem 8.34) holds.

THEOREM 8.9: Let $A$ be any square matrix. Then
$$A(\operatorname{adj} A) = (\operatorname{adj} A)A = |A|\,I$$
where $I$ is the identity matrix. Thus, if $|A| \neq 0$,
$$A^{-1} = \frac{1}{|A|}(\operatorname{adj} A)$$

EXAMPLE 8.11 Let $A$ be the matrix in Example 8.10. We have
$$\det(A) = -40 + 6 + 0 - 16 + 4 - 0 = -46$$
Thus, $A$ does have an inverse, and, by Theorem 8.9,
$$A^{-1} = \frac{1}{|A|}(\operatorname{adj} A) = -\frac{1}{46}\begin{bmatrix} -18 & -11 & -10 \\ 2 & 14 & -4 \\ 4 & 5 & -8 \end{bmatrix} = \begin{bmatrix} \tfrac{9}{23} & \tfrac{11}{46} & \tfrac{5}{23} \\ -\tfrac{1}{23} & -\tfrac{7}{23} & \tfrac{2}{23} \\ -\tfrac{2}{23} & -\tfrac{5}{46} & \tfrac{4}{23} \end{bmatrix}$$

8.10 Applications to Linear Equations, Cramer's Rule

Consider a system $AX = B$ of $n$ linear equations in $n$ unknowns. Here $A = [a_{ij}]$ is the (square) matrix of coefficients and $B = [b_i]$ is the column vector of constants. Let $A_i$ be the matrix obtained from $A$ by replacing the $i$th column of $A$ by the column vector $B$. Furthermore, let
$$D = \det(A), \quad N_1 = \det(A_1), \quad N_2 = \det(A_2), \quad \ldots, \quad N_n = \det(A_n)$$
The fundamental relationship between determinants and the solution of the system $AX = B$ follows.

THEOREM 8.10: The (square) system $AX = B$ has a unique solution if and only if $D \neq 0$. In this case, the unique solution is given by
$$x_1 = \frac{N_1}{D}, \quad x_2 = \frac{N_2}{D}, \quad \ldots, \quad x_n = \frac{N_n}{D}$$

The above theorem (proved in Problem 8.35) is known as Cramer's rule for solving systems of linear equations. We emphasize that the theorem only refers to a system with the same number of equations as unknowns, and that it only gives the solution when $D \neq 0$. In fact, if $D = 0$, the theorem does not tell us whether or not the system has a solution. However, in the case of a homogeneous system, we have the following useful result (to be proved in Problem 8.54).

THEOREM 8.11: A square homogeneous system $AX = 0$ has a nonzero solution if and only if $D = |A| = 0$.
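Cramer's rule is mechanical enough to state as code. The sketch below (cramer is our own name; NumPy supplies the determinants) replaces one column of A at a time by the constant vector, exactly as in Theorem 8.10, and as a check it reproduces the solution (4, -2, 3) that Example 8.12 below obtains by hand. For anything beyond small illustrative systems, Gaussian elimination is the practical method; Cramer's rule is mainly of theoretical interest.

```python
import numpy as np

def cramer(A, b):
    """Solve A x = b by Cramer's rule (Theorem 8.10). Requires det(A) != 0."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    D = np.linalg.det(A)
    if np.isclose(D, 0.0):
        raise ValueError("D = 0: Cramer's rule does not apply")
    x = np.empty(len(b))
    for i in range(len(b)):
        Ai = A.copy()
        Ai[:, i] = b                      # replace the i-th column by the constants
        x[i] = np.linalg.det(Ai) / D
    return x

# The system of Example 8.12:  x + y + z = 5,  x - 2y - 3z = -1,  2x + y - z = 3
A = [[1, 1, 1], [1, -2, -3], [2, 1, -1]]
b = [5, -1, 3]
print(np.round(cramer(A, b)))             # [ 4. -2.  3.]
```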
  • 280. EXAMPLE 8.12 Solve the system using determinants x þ y þ z ¼ 5 x 2y 3z ¼ 1 2x þ y z ¼ 3 8 : First compute the determinant D of the matrix of coefficients: D ¼ 1 1 1 1 2 3 2 1 1 ¼ 2 6 þ 1 þ 4 þ 3 þ 1 ¼ 5 Because D 6¼ 0, the system has a unique solution. To compute Nx, Ny, Nz, we replace, respectively, the coefficients of x; y; z in the matrix of coefficients by the constant terms. This yields Nx ¼ 5 1 1 1 2 3 3 1 1 ¼ 20; Ny ¼ 1 5 1 1 1 3 2 3 1 ¼ 10; Nz ¼ 1 1 5 1 2 1 2 1 3 ¼ 15 Thus, the unique solution of the system is x ¼ Nx=D ¼ 4, y ¼ Ny=D ¼ 2, z ¼ Nz=D ¼ 3; that is, the vector u ¼ ð4; 2; 3Þ. 8.11 Submatrices, Minors, Principal Minors Let A ¼ ½aij be a square matrix of order n. Consider any r rows and r columns of A. That is, consider any set I ¼ ði1; i2; . . . ; irÞ of r row indices and any set J ¼ ðj1; j2; . . . ; jrÞ of r column indices. Then I and J define an r r submatrix of A, denoted by AðI; JÞ, obtained by deleting the rows and columns of A whose subscripts do not belong to I or J, respectively. That is, AðI; JÞ ¼ ½ast : s 2 I; t 2 J The determinant jAðI; JÞj is called a minor of A of order r and ð1Þi1þi2þþirþj1þj2þþjr jAðI; JÞj is the corresponding signed minor. (Note that a minor of order n 1 is a minor in the sense of Section 8.7, and the corresponding signed minor is a cofactor.) Furthermore, if I0 and J0 denote, respectively, the remaining row and column indices, then jAðI0 ; J0 Þj denotes the complementary minor, and its sign (Problem 8.74) is the same sign as the minor. EXAMPLE 8.13 Let A ¼ ½aij be a 5-square matrix, and let I ¼ f1; 2; 4g and J ¼ f2; 3; 5g. Then I0 ¼ f3; 5g and J0 ¼ f1; 4g, and the corresponding minor jMj and complementary minor jM0 j are as follows: jMj ¼ jAðI; JÞj ¼ a12 a13 a15 a22 a23 a25 a42 a43 a45 and jM0 j ¼ jAðI0 ; J0 Þj ¼ a31 a34 a51 a54 Because 1 þ 2 þ 4 þ 2 þ 3 þ 5 ¼ 17 is odd, jMj is the signed minor, and jM0 j is the signed complementary minor. Principal Minors A minor is principal if the row and column indices are the same, or equivalently, if the diagonal elements of the minor come from the diagonal of the matrix. We note that the sign of a principal minor is always þ1, because the sum of the row and identical column subscripts must always be even. CHAPTER 8 Determinants 273
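Because a principal minor keeps the same row and column indices, the sums of principal minors of each order can be generated by running over all k-element subsets of {1, ..., n}. The Python sketch below (principal_minor_sum is our own name; NumPy computes each minor) does exactly that. Applied to the matrix of Solved Problem 8.7 later in the chapter, it reproduces the values S1 = 14, S2 = 3, S3 = 18 worked out there by hand; Example 8.14, which follows, carries out the same kind of computation for another matrix.

```python
import numpy as np
from itertools import combinations

def principal_minor_sum(A, k):
    """Sum of all k x k principal minors of A (same row and column index sets)."""
    n = A.shape[0]
    return sum(np.linalg.det(A[np.ix_(idx, idx)])
               for idx in combinations(range(n), k))

A = np.array([[1, 2, 3], [4, 5, 6], [0, 7, 8]], dtype=float)   # matrix of Problem 8.7
for k in (1, 2, 3):
    print(k, round(principal_minor_sum(A, k)))   # S1 = 14, S2 = 3, S3 = 18
```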
  • 281. EXAMPLE 8.14 Let A ¼ 1 2 1 3 5 4 3 1 2 2 4 3 5. Find the sums C1, C2, and C3 of the principal minors of A of orders 1, 2, and 3, respectively. (a) There are three principal minors of order 1. These are j1j ¼ 1; j5j ¼ 5; j 2j ¼ 2; and so C1 ¼ 1 þ 5 2 ¼ 4 Note that C1 is simply the trace of A. Namely, C1 ¼ trðAÞ: (b) There are three ways to choose two of the three diagonal elements, and each choice gives a minor of order 2. These are 1 2 3 5 ¼ 1; 1 1 3 2 ¼ 1; 5 4 1 2 ¼ 14 (Note that these minors of order 2 are the cofactors A33, A22, and A11 of A, respectively.) Thus, C2 ¼ 1 þ 1 14 ¼ 14 (c) There is only one way to choose three of the three diagonal elements. Thus, the only minor of order 3 is the determinant of A itself. Thus, C3 ¼ jAj ¼ 10 24 3 15 4 þ 12 ¼ 44 8.12 Block Matrices and Determinants The following theorem (proved in Problem 8.36) is the main result of this section. THEOREM 8.12: Suppose M is an upper (lower) triangular block matrix with the diagonal blocks A1; A2; . . . ; An. Then detðMÞ ¼ detðA1Þ detðA2Þ . . . detðAnÞ EXAMPLE 8.15 Find jMj where M ¼ 2 3 4 7 8 1 5 3 2 1 0 0 2 1 5 0 0 3 1 4 0 0 5 2 6 2 6 6 6 6 4 3 7 7 7 7 5 Note that M is an upper triangular block matrix. Evaluate the determinant of each diagonal block: 2 3 1 5 ¼ 10 þ 3 ¼ 13; 2 1 5 3 1 4 5 2 6 ¼ 12 þ 20 þ 30 þ 25 16 18 ¼ 29 Then jMj ¼ 13ð29Þ ¼ 377. Remark: Suppose M ¼ A B C D , where A; B; C; D are square matrices. Then it is not generally true that jMj ¼ jAjjDj jBjjCj. (See Problem 8.68.) 8.13 Determinants and Volume Determinants are related to the notions of area and volume as follows. Let u1; u2; . . . ; un be vectors in Rn . Let S be the (solid) parallelopiped determined by the vectors; that is, S ¼ fa1u1 þ a2u2 þ þ anun : 0 ai 1 for i ¼ 1; . . . ; ng (When n ¼ 2; S is a parallelogram.) Let VðSÞ denote the volume of S (or area of S when n ¼ 2Þ. Then VðSÞ ¼ absolute value of det ðAÞ 274 CHAPTER 8 Determinants
• 282. where $A$ is the matrix with rows $u_1, u_2, \ldots, u_n$. In general, $V(S) = 0$ if and only if the vectors $u_1, \ldots, u_n$ do not form a coordinate system for $\mathbf{R}^n$ (i.e., if and only if the vectors are linearly dependent).

EXAMPLE 8.16 Let $u_1 = (1, 1, 0)$, $u_2 = (1, 1, 1)$, $u_3 = (0, 2, 3)$. Find the volume $V(S)$ of the parallelepiped $S$ in $\mathbf{R}^3$ (Fig. 8-2) determined by the three vectors. Evaluate the determinant of the matrix whose rows are $u_1, u_2, u_3$:
$$\begin{vmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 2 & 3 \end{vmatrix} = 3 + 0 + 0 - 0 - 2 - 3 = -2$$
Hence, $V(S) = |-2| = 2$.

[Figure 8-2: the parallelepiped $S$ in $\mathbf{R}^3$ determined by the vectors $u_1$, $u_2$, $u_3$.]

8.14 Determinant of a Linear Operator

Let $F$ be a linear operator on a vector space $V$ with finite dimension. Let $A$ be the matrix representation of $F$ relative to some basis $S$ of $V$. Then we define the determinant of $F$, written $\det(F)$, by
$$\det(F) = |A|$$
If $B$ is another matrix representation of $F$ relative to another basis $S'$ of $V$, then $A$ and $B$ are similar matrices (Theorem 6.7) and $|B| = |A|$ (Theorem 8.7). In other words, the above definition of $\det(F)$ is independent of the particular basis $S$ of $V$. (We say that the definition is well defined.) The next theorem (to be proved in Problem 8.62) follows from analogous theorems on matrices.

THEOREM 8.13: Let $F$ and $G$ be linear operators on a vector space $V$. Then
(i) $\det(F \circ G) = \det(F)\det(G)$.
(ii) $F$ is invertible if and only if $\det(F) \neq 0$.

EXAMPLE 8.17 Let $F$ be the following linear operator on $\mathbf{R}^3$ and let $A$ be the matrix that represents $F$ relative to the usual basis of $\mathbf{R}^3$:
$$F(x, y, z) = (2x - 4y + z,\ x - 2y + 3z,\ 5x + y - z) \quad \text{and} \quad A = \begin{bmatrix} 2 & -4 & 1 \\ 1 & -2 & 3 \\ 5 & 1 & -1 \end{bmatrix}$$
Then
$$\det(F) = |A| = 4 - 60 + 1 + 10 - 6 - 4 = -55$$
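The volume interpretation of Section 8.13 is easy to experiment with. In the sketch below (volume is our own name; NumPy computes the determinant), the rows of a matrix are the given vectors, and the absolute value of its determinant is returned. The first call reproduces V(S) = 2 from Example 8.16; the second uses two proportional vectors, so the parallelepiped is flat and the volume is 0, illustrating the linear-dependence criterion stated above.

```python
import numpy as np

def volume(*vectors):
    """Volume of the parallelepiped spanned by the given vectors in R^n:
    the absolute value of the determinant of the matrix whose rows are the vectors."""
    return abs(np.linalg.det(np.array(vectors, dtype=float)))

# Example 8.16
print(volume((1, 1, 0), (1, 1, 1), (0, 2, 3)))   # ~2.0
# Two proportional rows: the vectors are linearly dependent, so the volume is 0
print(volume((1, 2, 3), (2, 4, 6), (0, 1, 1)))   # ~0.0
```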
  • 283. 8.15 Multilinearity and Determinants Let V be a vector space over a field K. Let a ¼ Vn ; that is, a consists of all the n-tuples A ¼ ðA1; A2; . . . ; AnÞ where the Ai are vectors in V. The following definitions apply. DEFINITION: A function D: a ! K is said to be multilinear if it is linear in each component: (i) If Ai ¼ B þ C, then DðAÞ ¼ Dð. . . ; B þ C; . . .Þ ¼ Dð. . . ; B; . . . ; Þ þ Dð. . . ; C; . . .Þ (ii) If Ai ¼ kB, where k 2 K, then DðAÞ ¼ Dð. . . ; kB; . . .Þ ¼ kDð. . . ; B; . . .Þ We also say n-linear for multilinear if there are n components. DEFINITION: A function D: a ! K is said to be alternating if DðAÞ ¼ 0 whenever A has two identical elements: DðA1; A2; . . . ; AnÞ ¼ 0 whenever Ai ¼ Aj; i 6¼ j Now let M denote the set of all n-square matrices A over a field K. We may view A as an n-tuple consisting of its row vectors A1; A2; . . . ; An; that is, we may view A in the form A ¼ ðA1; A2; . . . ; AnÞ. The following theorem (proved in Problem 8.37) characterizes the determinant function. THEOREM 8.14: There exists a unique function D: M ! K such that (i) D is multilinear, (ii) D is alternating, (iii) DðIÞ ¼ 1. This function D is the determinant function; that is, DðAÞ ¼ jAj; for any matrix A 2 M. SOLVED PROBLEMS Computation of Determinants 8.1. Evaluate the determinant of each of the following matrices: (a) A ¼ 6 5 2 3 , (b) B ¼ 2 3 4 7 ; (c) C ¼ 4 5 1 2 ; (d) D ¼ t 5 6 3 t þ 2 Use the formula a b c d ¼ ad bc: (a) jAj ¼ 6ð3Þ 5ð2Þ ¼ 18 10 ¼ 8 (b) jBj ¼ 14 þ 12 ¼ 26 (c) jCj ¼ 8 5 ¼ 13 (d) jDj ¼ ðt 5Þðt þ 2Þ 18 ¼ t2 3t 10 18 ¼ t2 10t 28 8.2. Evaluate the determinant of each of the following matrices: (a) A ¼ 2 3 4 5 4 3 1 2 1 2 4 3 5, (b) B ¼ 1 2 3 2 4 1 1 5 2 2 4 3 5, (c) C ¼ 1 3 5 3 1 2 1 2 1 2 4 3 5 276 CHAPTER 8 Determinants
  • 284. Use the diagram in Fig. 8-1 to obtain the six products: (a) jAj ¼ 2ð4Þð1Þ þ 3ð3Þð1Þ þ 4ð2Þð5Þ 1ð4Þð4Þ 2ð3Þð2Þ 1ð3Þð5Þ ¼ 8 þ 9 þ 40 16 12 15 ¼ 14 (b) jBj ¼ 8 þ 2 þ 30 12 þ 5 8 ¼ 9 (c) jCj ¼ 1 þ 6 þ 30 5 þ 4 9 ¼ 25 8.3. Compute the determinant of each of the following matrices: (a) A ¼ 2 3 4 5 6 7 8 9 1 2 4 3 5, (b) B ¼ 4 6 8 9 0 2 7 3 0 0 5 6 0 0 0 3 2 6 6 4 3 7 7 5, (c) C ¼ 1 2 1 1 3 3 4 1 2 1 1 4 1 2 6 4 3 7 5: (a) One can simplify the entries by first subtracting twice the first row from the second row—that is, by applying the row operation ‘‘Replace R2 by 21 þ R2.’’ Then jAj ¼ 2 3 4 5 6 7 8 9 1 ¼ 2 3 4 1 0 1 8 9 1 ¼ 0 24 þ 36 0 þ 18 3 ¼ 27 (b) B is triangular, so jBj ¼ product of the diagonal entries ¼ 120. (c) The arithmetic is simpler if fractions are first eliminated. Hence, multiply the first row R1 by 6 and the second row R2 by 4. Then j24Cj ¼ 3 6 2 3 2 4 1 4 1 ¼ 6 þ 24 þ 24 þ 4 48 þ 18 ¼ 28; so jCj ¼ 28 24 ¼ 7 6 8.4. Compute the determinant of each of the following matrices: (a) A ¼ 2 5 3 2 2 3 2 5 1 3 2 2 1 6 4 3 2 6 6 4 3 7 7 5, (b) B ¼ 6 2 1 0 5 2 1 1 2 1 1 1 2 2 3 3 0 2 3 1 1 1 3 4 2 2 6 6 6 6 4 3 7 7 7 7 5 (a) Use a31 ¼ 1 as a pivot to put 0’s in the first column, by applying the row operations ‘‘Replace R1 by 2R3 þ R1,’’ ‘‘Replace R2 by 2R3 þ R2,’’ and ‘‘Replace R4 by R3 þ R4.’’ Then jAj ¼ 2 5 3 2 2 3 2 5 1 3 2 2 1 6 4 3 ¼ 0 1 1 6 0 3 2 1 1 3 2 2 0 3 2 5 ¼ 1 1 6 3 2 1 3 2 5 ¼ 10 þ 3 36 þ 36 2 15 ¼ 4 (b) First reduce jBj to a determinant of order 4, and then to a determinant of order 3, for which we can use Fig. 8-1. First use c22 ¼ 1 as a pivot to put 0’s in the second column, by applying the row operations ‘‘Replace R1 by 2R2 þ R1,’’ ‘‘Replace R3 by R2 þ R3,’’ and ‘‘Replace R5 by R2 þ R5.’’ Then jBj ¼ 2 0 1 4 3 2 1 1 2 1 1 0 1 0 2 3 0 2 3 1 1 0 2 2 3 ¼ 2 1 4 3 1 1 0 2 3 2 3 1 1 2 2 3 ¼ 1 1 4 5 0 1 0 0 5 2 3 5 1 2 2 7 ¼ 1 4 5 5 3 5 1 2 7 ¼ 21 þ 20 þ 50 þ 15 þ 10 140 ¼ 34 CHAPTER 8 Determinants 277
  • 285. Cofactors, Classical Adjoints, Minors, Principal Minors 8.5. Let A ¼ 2 1 3 4 5 4 7 2 4 0 6 3 3 2 5 2 2 6 6 4 3 7 7 5: (a) Find A23, the cofactor (signed minor) of 7 in A. (b) Find the minor and the signed minor of the submatrix M ¼ Að2; 4; 2; 3Þ. (c) Find the principal minor determined by the first and third diagonal entries—that is, by M ¼ Að1; 3; 1; 3Þ. (a) Take the determinant of the submatrix of A obtained by deleting row 2 and column 3 (those which contain the 7), and multiply the determinant by ð1Þ2þ3 : A23 ¼ 2 1 4 4 0 3 3 2 2 ¼ ð61Þ ¼ 61 The exponent 2 þ 3 comes from the subscripts of A23—that is, from the fact that 7 appears in row 2 and column 3. (b) The row subscripts are 2 and 4 and the column subscripts are 2 and 3. Hence, the minor is the determinant jMj ¼ a22 a23 a42 a43 ¼ 4 7 2 5 ¼ 20 þ 14 ¼ 6 and the signed minor is ð1Þ2þ4þ2þ3 jMj ¼ jMj ¼ ð6Þ ¼ 6. (c) The principal minor is the determinant jMj ¼ a11 a13 a31 a33 ¼ 2 3 4 6 ¼ 12 þ 12 ¼ 24 Note that now the diagonal entries of the submatrix are diagonal entries of the original matrix. Also, the sign of the principal minor is positive. 8.6. Let B ¼ 1 1 1 2 3 4 5 8 9 2 4 3 5. Find: (a) jBj, (b) adj B, (c) B1 using adj B. (a) jBj ¼ 27 þ 20 þ 16 15 32 18 ¼ 2 (b) Take the transpose of the matrix of cofactors: adj B ¼ 3 4 8 9 2 4 5 9 2 3 5 8 1 1 8 9 1 1 5 9 1 1 5 8 1 1 3 4 1 1 2 4 1 1 2 3 2 6 6 6 6 6 6 6 6 4 3 7 7 7 7 7 7 7 7 5 T ¼ 5 2 1 1 4 3 1 2 1 2 6 4 3 7 5 T ¼ 5 1 1 2 4 2 1 3 1 2 6 4 3 7 5 (c) Because jBj 6¼ 0, B1 ¼ 1 jBj ðadj BÞ ¼ 1 2 5 1 1 2 4 2 1 3 1 2 4 3 5 ¼ 5 2 1 2 1 2 1 2 1 1 2 3 2 1 2 2 6 4 3 7 5 8.7. Let A ¼ 1 2 3 4 5 6 0 7 8 2 4 3 5, and let Sk denote the sum of its principal minors of order k. Find Sk for (a) k ¼ 1, (b) k ¼ 2, (c) k ¼ 3. 278 CHAPTER 8 Determinants
  • 286. (a) The principal minors of order 1 are the diagonal elements. Thus, S1 is the trace of A; that is, S1 ¼ trðAÞ ¼ 1 þ 5 þ 8 ¼ 14 (b) The principal minors of order 2 are the cofactors of the diagonal elements. Thus, S2 ¼ A11 þ A22 þ A33 ¼ 5 6 7 8 þ 1 3 0 8 þ 1 2 4 5 ¼ 2 þ 8 3 ¼ 3 (c) There is only one principal minor of order 3, the determinant of A. Then S3 ¼ jAj ¼ 40 þ 0 þ 84 0 42 64 ¼ 18 8.8. Let A ¼ 1 3 0 1 4 2 5 1 1 0 3 2 3 2 1 4 2 6 6 4 3 7 7 5. Find the number Nk and sum Sk of principal minors of order: (a) k ¼ 1, (b) k ¼ 2, (c) k ¼ 3, (d) k ¼ 4. Each (nonempty) subset of the diagonal (or equivalently, each nonempty subset of f1; 2; 3; 4gÞ determines a principal minor of A, and Nk ¼ n k ¼ n! k!ðn kÞ! of them are of order k. Thus; N1 ¼ 4 1 ¼ 4; N2 ¼ 4 2 ¼ 6; N3 ¼ 4 3 ¼ 4; N4 ¼ 4 4 ¼ 1 (a) S1 ¼ j1j þ j2j þ j3j þ j4j ¼ 1 þ 2 þ 3 þ 4 ¼ 10 (b) S2 ¼ 1 3 4 2 þ 1 0 1 3 þ 1 1 3 4 þ 2 5 0 3 þ 2 1 2 4 þ 3 2 1 4 ¼ 14 þ 3 þ 7 þ 6 þ 10 þ 14 ¼ 54 (c) S3 ¼ 1 3 0 4 2 5 1 0 3 þ 1 3 1 4 2 1 3 2 4 þ 1 0 1 1 3 2 3 1 4 þ 2 5 1 0 3 2 2 1 4 ¼ 57 þ 65 þ 22 þ 54 ¼ 198 (d) S4 ¼ detðAÞ ¼ 378 Determinants and Systems of Linear Equations 8.9. Use determinants to solve the system 3y þ 2x ¼ z þ 1 3x þ 2z ¼ 8 5y 3z 1 ¼ x 2y : 8 : First arrange the equation in standard form, then compute the determinant D of the matrix of coefficients: 2x þ 3y z ¼ 1 3x þ 5y þ 2z ¼ 8 x 2y 3z ¼ 1 and D ¼ 2 3 1 3 5 2 1 2 3 ¼ 30 þ 6 þ 6 þ 5 þ 8 þ 27 ¼ 22 Because D 6¼ 0, the system has a unique solution. To compute Nx; Ny; Nz, we replace, respectively, the coefficients of x; y; z in the matrix of coefficients by the constant terms. Then Nx ¼ 1 3 1 8 5 2 1 2 1 ¼ 66; Ny ¼ 2 1 1 3 8 2 1 1 3 ¼ 22; Nz ¼ 2 3 1 3 5 8 1 2 1 ¼ 44 CHAPTER 8 Determinants 279
  • 287. Thus, x ¼ Nx D ¼ 66 22 ¼ 3; y ¼ Ny D ¼ 22 22 ¼ 1; z ¼ Nz D ¼ 44 22 ¼ 2 8.10. Consider the system kx þ y þ z ¼ 1 x þ ky þ z ¼ 1 x þ y þ kz ¼ 1 8 : Use determinants to find those values of k for which the system has (a) a unique solution, (b) more than one solution, (c) no solution. (a) The system has a unique solution when D 6¼ 0, where D is the determinant of the matrix of coefficients. Compute D ¼ k 1 1 1 k 1 1 1 k ¼ k3 þ 1 þ 1 k k k ¼ k3 3k þ 2 ¼ ðk 1Þ2 ðk þ 2Þ Thus, the system has a unique solution when ðk 1Þ2 ðk þ 2Þ 6¼ 0; when k 6¼ 1 and k 6¼ 2 (b and c) Gaussian elimination shows that the system has more than one solution when k ¼ 1, and the system has no solution when k ¼ 2. Miscellaneous Problems 8.11. Find the volume VðSÞ of the parallelepiped S in R3 determined by the vectors: (a) u1 ¼ ð1; 1; 1Þ; u2 ¼ ð1; 3; 4Þ; u3 ¼ ð1; 2; 5Þ. (b) u1 ¼ ð1; 2; 4Þ; u2 ¼ ð2; 1; 3Þ; u3 ¼ ð5; 7; 9Þ. VðSÞ is the absolute value of the determinant of the matrix M whose rows are the given vectors. Thus, (a) jMj ¼ 1 1 1 1 3 4 1 2 5 ¼ 15 4 þ 2 3 þ 8 þ 5 ¼ 7. Hence, VðSÞ ¼ j 7j ¼ 7. (b) jMj ¼ 1 2 4 2 1 3 5 7 9 ¼ 9 30 þ 56 20 þ 21 36 ¼ 0. Thus, VðSÞ ¼ 0, or, in other words, u1; u2; u3 lie in a plane and are linearly dependent. 8.12. Find detðMÞ where M ¼ 3 4 0 0 0 2 5 0 0 0 0 9 2 0 0 0 5 0 6 7 0 0 4 3 4 2 6 6 6 6 4 3 7 7 7 7 5 ¼ 3 4 0 0 0 2 5 0 0 0 0 9 2 0 0 0 5 0 6 7 0 0 4 3 4 2 6 6 6 6 4 3 7 7 7 7 5 M is a (lower) triangular block matrix; hence, evaluate the determinant of each diagonal block: 3 4 2 5 ¼ 15 8 ¼ 7; j2j ¼ 2; 6 7 3 4 ¼ 24 21 ¼ 3 Thus, jMj ¼ 7ð2Þð3Þ ¼ 42. 8.13. Find the determinant of F: R3 ! R3 defined by Fðx; y; zÞ ¼ ðx þ 3y 4z; 2y þ 7z; x þ 5y 3zÞ 280 CHAPTER 8 Determinants
  • 288. The determinant of a linear operator F is equal to the determinant of any matrix that represents F. Thus first find the matrix A representing F in the usual basis (whose rows, respectively, consist of the coefficients of x; y; z). Then A ¼ 1 3 4 0 2 7 1 5 3 2 4 3 5; and so detðFÞ ¼ jAj ¼ 6 þ 21 þ 0 þ 8 35 0 ¼ 8 8.14. Write out g ¼ gðx1; x2; x3; x4Þ explicitly where gðx1; x2; . . . ; xnÞ ¼ Q ij ðxi xjÞ: The symbol Q is used for a product of terms in the same way that the symbol P is used for a sum of terms. That is, Q ij ðxi xjÞ means the product of all terms ðxi xjÞ for which i j. Hence, g ¼ gðx1; . . . ; x4Þ ¼ ðx1 x2Þðx1 x3Þðx1 x4Þðx2 x3Þðx2 x4Þðx3 x4Þ 8.15. Let D be a 2-linear, alternating function. Show that DðA; BÞ ¼ DðB; AÞ. Because D is alternating, DðA; AÞ ¼ 0, DðB; BÞ ¼ 0. Hence, DðA þ B; A þ BÞ ¼ DðA; AÞ þ DðA; BÞ þ DðB; AÞ þ DðB; BÞ ¼ DðA; BÞ þ DðB; AÞ However, DðA þ B; A þ BÞ ¼ 0. Hence, DðA; BÞ ¼ DðB; AÞ, as required. Permutations 8.16. Determine the parity (sign) of the permutation s ¼ 364152. Count the number of inversions. That is, for each element k, count the number of elements i in s such that i k and i precedes k in s. Namely, k ¼ 1: 3 numbers ð3; 6; 4Þ k ¼ 2: 4 numbers ð3; 6; 4; 5Þ k ¼ 3: 0 numbers k ¼ 4: 1 number ð6Þ k ¼ 5: 1 number ð6Þ k ¼ 6: 0 numbers Because 3 þ 4 þ 0 þ 1 þ 1 þ 0 ¼ 9 is odd, s is an odd permutation, and sgn s ¼ 1. 8.17. Let s ¼ 24513 and t ¼ 41352 be permutations in S5. Find (a) t s, (b) s1 . Recall that s ¼ 24513 and t ¼ 41352 are short ways of writing s ¼ 1 2 3 4 5 2 4 5 1 3 or sð1Þ ¼ 2; sð2Þ ¼ 4; sð3Þ ¼ 5; sð4Þ ¼ 1; sð5Þ ¼ 3 t ¼ 1 2 3 4 5 4 1 3 5 2 c or tð1Þ ¼ 4; tð2Þ ¼ 1; tð3Þ ¼ 3; tð4Þ ¼ 5; tð5Þ ¼ 2 (a) The effects of s and then t on 1; 2; 3; 4; 5 are as follows: 1 ! 2 ! 1; 2 ! 4 ! 5; 3 ! 5 ! 2; 4 ! 1 ! 4; 5 ! 3 ! 3 [That is, for example, ðt sÞð1Þ ¼ tðsð1ÞÞ ¼ tð2Þ ¼ 1: Thus, t s ¼ 15243. (b) By definition, s1 ð jÞ ¼ k if and only if sðkÞ ¼ j. Hence, s1 ¼ 2 4 5 1 3 1 2 3 4 5 ¼ 1 2 3 4 5 4 1 5 2 3 or s1 ¼ 41523 8.18. Let s ¼ j1 j2 . . . jn be any permutation in Sn. Show that, for each inversion ði; kÞ where i k but i precedes k in s, there is a pair ði*; j*Þ such that i* k* and sði*Þ sð j*Þ ð1Þ and vice versa. Thus, s is even or odd according to whether there is an even or an odd number of pairs satisfying (1). CHAPTER 8 Determinants 281
  • 289. Choose i* and k* so that sði*Þ ¼ i and sðk*Þ ¼ k. Then i k if and only if sði*Þ sðk*Þ, and i precedes k in s if and only if i* k*. 8.19. Consider the polynomials g ¼ gðx1; . . . ; xnÞ and sðgÞ, defined by g ¼ gðx1; . . . ; xnÞ ¼ Q ij ðxi xjÞ and sðgÞ ¼ Q ij ðxsðiÞ xsðjÞÞ (See Problem 8.14.) Show that sðgÞ ¼ g when s is an even permutation, and sðgÞ ¼ g when s is an odd permutation. That is, sðgÞ ¼ ðsgn sÞg. Because s is one-to-one and onto, sðgÞ ¼ Q ij ðxsðiÞ xsð jÞÞ ¼ Q ij or ij ðxi xjÞ Thus, sðgÞ or sðgÞ ¼ g according to whether there is an even or an odd number of terms of the form xi xj, where i j. Note that for each pair ði; jÞ for which i j and sðiÞ sð jÞ there is a term ðxsðiÞ xsð jÞÞ in sðgÞ for which sðiÞ sð jÞ. Because s is even if and only if there is an even number of pairs satisfying (1), we have sðgÞ ¼ g if and only if s is even. Hence, sðgÞ ¼ g if and only if s is odd. 8.20. Let s; t 2 Sn. Show that sgnðt sÞ ¼ ðsgn tÞðsgn sÞ. Thus, the product of two even or two odd permutations is even, and the product of an odd and an even permutation is odd. Using Problem 8.19, we have sgnðt sÞ g ¼ ðt sÞðgÞ ¼ tðsðgÞÞ ¼ tððsgn sÞgÞ ¼ ðsgn tÞðsgn sÞg Accordingly, sgn ðt sÞ ¼ ðsgn tÞðsgn sÞ. 8.21. Consider the permutation s ¼ j1 j2 jn. Show that sgn s1 ¼ sgn s and, for scalars aij, show that aj11aj22 ajnn ¼ a1k1 a2k2 ankn where s1 ¼ k1k2 kn. We have s1 s ¼ e, the identity permutation. Because e is even, s1 and s are both even or both odd. Hence sgn s1 ¼ sgn s. Because s ¼ j1j2 jn is a permutation, aj11aj22 ajnn ¼ a1k1 a2k2 ankn . Then k1; k2; . . . ; kn have the property that sðk1Þ ¼ 1; sðk2Þ ¼ 2; . . . ; sðknÞ ¼ n Let t ¼ k1k2 kn. Then, for i ¼ 1; . . . ; n, ðs tÞðiÞ ¼ sðtðiÞÞ ¼ sðkiÞ ¼ i Thus, s t ¼ e, the identity permutation. Hence, t ¼ s1 . Proofs of Theorems 8.22. Prove Theorem 8.1: jAT j ¼ jAj. If A ¼ ½aij, then AT ¼ ½bij, with bij ¼ aji. Hence, jAT j ¼ P s2Sn ðsgn sÞb1sð1Þb2sð2Þ bnsðnÞ ¼ P s2Sn ðsgn sÞasð1Þ;1asð2Þ;2 asðnÞ;n Let t ¼ s1 . By Problem 8.21 sgn t ¼ sgn s, and asð1Þ;1asð2Þ;2 asðnÞ;n ¼ a1tð1Þa2tð2Þ antðnÞ. Hence, jAT j ¼ P s2Sn ðsgn tÞa1tð1Þa2tð2Þ antðnÞ 282 CHAPTER 8 Determinants
  • 290. However, as s runs through all the elements of Sn; t ¼ s1 also runs through all the elements of Sn. Thus, jAT j ¼ jAj. 8.23. Prove Theorem 8.3(i): If two rows (columns) of A are interchanged, then jBj ¼ jAj. We prove the theorem for the case that two columns are interchanged. Let t be the transposition that interchanges the two numbers corresponding to the two columns of A that are interchanged. If A ¼ ½aij and B ¼ ½bij, then bij ¼ aitðjÞ. Hence, for any permutation s, b1sð1Þb2sð2Þ bnsðnÞ ¼ a1ðt sÞð1Þa2ðt sÞð2Þ anðt sÞðnÞ Thus, jBj ¼ P s2Sn ðsgn sÞb1sð1Þb2sð2Þ bnsðnÞ ¼ P s2Sn ðsgn sÞa1ðt sÞð1Þa2ðt sÞð2Þ anðt sÞðnÞ Because the transposition t is an odd permutation, sgnðt sÞ ¼ ðsgn tÞðsgn sÞ ¼ sgn s. Accordingly, sgn s ¼ sgn ðt sÞ; and so jBj ¼ P s2Sn ½sgnðt sÞa1ðt sÞð1Þa2ðt sÞð2Þ anðt sÞðnÞ But as s runs through all the elements of Sn; t s also runs through all the elements of Sn: Hence, jBj ¼ jAj. 8.24. Prove Theorem 8.2. (i) If A has a row (column) of zeros, then jAj ¼ 0. (ii) If A has two identical rows (columns), then jAj ¼ 0. (iii) If A is triangular, then jAj ¼ product of diagonal elements. Thus, jIj ¼ 1. (i) Each term in jAj contains a factor from every row, and so from the row of zeros. Thus, each term of jAj is zero, and so jAj ¼ 0. (ii) Suppose 1 þ 1 6¼ 0 in K. If we interchange the two identical rows of A, we still obtain the matrix A. Hence, by Problem 8.23, jAj ¼ jAj, and so jAj ¼ 0. Now suppose 1 þ 1 ¼ 0 in K. Then sgn s ¼ 1 for every s 2 Sn: Because A has two identical rows, we can arrange the terms of A into pairs of equal terms. Because each pair is 0, the determinant of A is zero. (iii) Suppose A ¼ ½aij is lower triangular; that is, the entries above the diagonal are all zero: aij ¼ 0 whenever i j. Consider a term t of the determinant of A: t ¼ ðsgn sÞa1i1 a2i2 anin ; where s ¼ i1i2 in Suppose i1 6¼ 1. Then 1 i1 and so a1i1 ¼ 0; hence, t ¼ 0: That is, each term for which i1 6¼ 1 is zero. Now suppose i1 ¼ 1 but i2 6¼ 2. Then 2 i2, and so a2i2 ¼ 0; hence, t ¼ 0. Thus, each term for which i1 6¼ 1 or i2 6¼ 2 is zero. Similarly, we obtain that each term for which i1 6¼ 1 or i2 6¼ 2 or . . . or in 6¼ n is zero. Accordingly, jAj ¼ a11a22 ann ¼ product of diagonal elements. 8.25. Prove Theorem 8.3: B is obtained from A by an elementary operation. (i) If two rows (columns) of A were interchanged, then jBj ¼ jAj. (ii) If a row (column) of A were multiplied by a scalar k, then jBj ¼ kjAj. (iii) If a multiple of a row (column) of A were added to another row (column) of A; then jBj ¼ jAj. (i) This result was proved in Problem 8.23. (ii) If the jth row of A is multiplied by k, then every term in jAj is multiplied by k, and so jBj ¼ kjAj. That is, jBj ¼ P s ðsgn sÞa1i1 a2i2 ðkajij Þ anin ¼ k P s ðsgn sÞa1i1 a2i2 anin ¼ kjAj CHAPTER 8 Determinants 283
  • 291. (iii) Suppose c times the kth row is added to the jth row of A. Using the symbol ^ to denote the jth position in a determinant term, we have jBj ¼ P s ðsgn sÞa1i1 a2i2 ðcakik þ ajij Þ . . . anin ¼ c P s ðsgn sÞa1i1 a2i2 c akik anin þ P s ðsgn sÞa1i1 a2i2 ajij anin The first sum is the determinant of a matrix whose kth and jth rows are identical. Accordingly, by Theorem 8.2(ii), the sum is zero. The second sum is the determinant of A. Thus, jBj ¼ c 0 þ jAj ¼ jAj. 8.26. Prove Lemma 8.6: Let E be an elementary matrix. Then jEAj ¼ jEjjAj. Consider the elementary row operations: (i) Multiply a row by a constant k 6¼ 0, (ii) Interchange two rows, (iii) Add a multiple of one row to another. Let E1; E2; E3 be the corresponding elementary matrices That is, E1; E2; E3 are obtained by applying the above operations to the identity matrix I. By Problem 8.25, jE1j ¼ kjIj ¼ k; jE2j ¼ jIj ¼ 1; jE3j ¼ jIj ¼ 1 Recall (Theorem 3.11) that EiA is identical to the matrix obtained by applying the corresponding operation to A. Thus, by Theorem 8.3, we obtain the following which proves our lemma: jE1Aj ¼ kjAj ¼ jE1jjAj; jE2Aj ¼ jAj ¼ jE2jjAj; jE3Aj ¼ jAj ¼ 1jAj ¼ jE3jjAj 8.27. Suppose B is row equivalent to a square matrix A. Prove that jBj ¼ 0 if and only if jAj ¼ 0. By Theorem 8.3, the effect of an elementary row operation is to change the sign of the determinant or to multiply the determinant by a nonzero scalar. Hence, jBj ¼ 0 if and only if jAj ¼ 0. 8.28. Prove Theorem 8.5: Let A be an n-square matrix. Then the following are equivalent: (i) A is invertible, (ii) AX ¼ 0 has only the zero solution, (iii) detðAÞ 6¼ 0. The proof is by the Gaussian algorithm. If A is invertible, it is row equivalent to I. But jIj 6¼ 0. Hence, by Problem 8.27, jAj 6¼ 0. If A is not invertible, it is row equivalent to a matrix with a zero row. Hence, detðAÞ ¼ 0. Thus, (i) and (iii) are equivalent. If AX ¼ 0 has only the solution X ¼ 0, then A is row equivalent to I and A is invertible. Conversely, if A is invertible with inverse A1 , then X ¼ IX ¼ ðA1 AÞX ¼ A1 ðAXÞ ¼ A1 0 ¼ 0 is the only solution of AX ¼ 0. Thus, (i) and (ii) are equivalent. 8.29. Prove Theorem 8.4: jABj ¼ jAjjBj. If A is singular, then AB is also singular, and so jABj ¼ 0 ¼ jAjjBj. On the other hand, if A is nonsingular, then A ¼ En E2E1, a product of elementary matrices. Then, Lemma 8.6 and induction yields jABj ¼ jEn E2E1Bj ¼ jEnj jE2jjE1jjBj ¼ jAjjBj 8.30. Suppose P is invertible. Prove that jP1 j ¼ jPj1 . P1 P ¼ I: Hence; 1 ¼ jIj ¼ jP1 Pj ¼ jP1 jjPj; and so jP1 j ¼ jPj1 : 8.31. Prove Theorem 8.7: Suppose A and B are similar matrices. Then jAj ¼ jBj. Because A and B are similar, there exists an invertible matrix P such that B ¼ P1 AP. Therefore, using Problem 8.30, we get jBj ¼ jP1 APj ¼ jP1 jjAjjPj ¼ jAjjP1 jjP ¼ jAj. We remark that although the matrices P1 and A may not commute, their determinants jP1 j and jAj do commute, because they are scalars in the field K. 8.32. Prove Theorem 8.8 (Laplace): Let A ¼ ½aij, and let Aij denote the cofactor of aij. Then, for any i or j jAj ¼ ai1Ai1 þ þ ainAin and jAj ¼ a1jA1j þ þ anjAnj d 284 CHAPTER 8 Determinants
  • 292. Because jAj ¼ jAT j, we need only prove one of the expansions, say, the first one in terms of rows of A. Each term in jAj contains one and only one entry of the ith row ðai1; ai2; . . . ; ainÞ of A. Hence, we can write jAj in the form jAj ¼ ai1A* i1 þ ai2A* i2 þ þ ainA* in (Note that A* ij is a sum of terms involving no entry of the ith row of A.) Thus, the theorem is proved if we can show that A* ij ¼ Aij ¼ ð1Þiþj jMijj where Mij is the matrix obtained by deleting the row and column containing the entry aij: (Historically, the expression A* ij was defined as the cofactor of aij, and so the theorem reduces to showing that the two definitions of the cofactor are equivalent.) First we consider the case that i ¼ n, j ¼ n. Then the sum of terms in jAj containing ann is annA* nn ¼ ann P s ðsgn sÞa1sð1Þa2sð2Þ an1;sðn1Þ where we sum over all permutations s 2 Sn for which sðnÞ ¼ n. However, this is equivalent (Prove!) to summing over all permutations of f1; . . . ; n 1g. Thus, A* nn ¼ jMnnj ¼ ð1Þnþn jMnnj. Now we consider any i and j. We interchange the ith row with each succeeding row until it is last, and we interchange the jth column with each succeeding column until it is last. Note that the determinant jMijj is not affected, because the relative positions of the other rows and columns are not affected by these interchanges. However, the ‘‘sign’’ of jAj and of A* ij is changed n 1 and then n j times. Accordingly, A* ij ¼ ð1Þniþnj jMijj ¼ ð1Þiþj jMijj 8.33. Let A ¼ ½aij and let B be the matrix obtained from A by replacing the ith row of A by the row vector ðbi1; . . . ; binÞ. Show that jBj ¼ bi1Ai1 þ bi2Ai2 þ þ binAin Furthermore, show that, for j 6¼ i, aj1Ai1 þ aj2Ai2 þ þ ajnAin ¼ 0 and a1jA1i þ a2jA2i þ þ anjAni ¼ 0 Let B ¼ ½bij. By Theorem 8.8, jBj ¼ bi1Bi1 þ bi2Bi2 þ þ binBin Because Bij does not depend on the ith row of B; we get Bij ¼ Aij for j ¼ 1; . . . ; n. Hence, jBj ¼ bi1Ai1 þ bi2Ai2 þ þ binAin Now let A0 be obtained from A by replacing the ith row of A by the jth row of A. Because A0 has two identical rows, jA0 j ¼ 0. Thus, by the above result, jA0 j ¼ aj1Ai1 þ aj2Ai2 þ þ ajnAin ¼ 0 Using jAT j ¼ jAj, we also obtain that a1jA1i þ a2jA2i þ þ anjAni ¼ 0. 8.34. Prove Theorem 8.9: Aðadj AÞ ¼ ðadj AÞA ¼ jAjI. Let A ¼ ½aij and let Aðadj AÞ ¼ ½bij. The ith row of A is ðai1; ai2; . . . ; ainÞ ð1Þ Because adj A is the transpose of the matrix of cofactors, the jth column of adj A is the tranpose of the cofactors of the jth row of A: ðAj; Aj2; . . . ; AjnÞT ð2Þ Now bij; the ij entry in Aðadj AÞ, is obtained by multiplying expressions (1) and (2): bij ¼ ai1Aj1 þ ai2Aj2 þ þ ainAjn CHAPTER 8 Determinants 285
  • 293. By Theorem 8.8 and Problem 8.33, bij ¼ jAj if i ¼ j 0 if i 6¼ j Accordingly, Aðadj AÞ is the diagonal matrix with each diagonal element jAj. In other words, Aðadj AÞ ¼ jAjI. Similarly, ðadj AÞA ¼ jAjI. 8.35. Prove Theorem 8.10 (Cramer’s rule): The (square) system AX ¼ B has a unique solution if and only if D 6¼ 0. In this case, xi ¼ Ni=D for each i. By previous results, AX ¼ B has a unique solution if and only if A is invertible, and A is invertible if and only if D ¼ jAj 6¼ 0. Now suppose D 6¼ 0. By Theorem 8.9, A1 ¼ ð1=DÞðadj AÞ. Multiplying AX ¼ B by A1 , we obtain X ¼ A1 AX ¼ ð1=DÞðadj AÞB ð1Þ Note that the ith row of ð1=DÞðadj AÞ is ð1=DÞðA1i; A2i; . . . ; AniÞ. If B ¼ ðb1; b2; . . . ; bnÞT , then, by (1), xi ¼ ð1=DÞðb1A1i þ b2A2i þ þ bnAniÞ However, as in Problem 8.33, b1A1i þ b2A2i þ þ bnAni ¼ Ni, the determinant of the matrix obtained by replacing the ith column of A by the column vector B. Thus, xi ¼ ð1=DÞNi, as required. 8.36. Prove Theorem 8.12: Suppose M is an upper (lower) triangular block matrix with diagonal blocks A1; A2; . . . ; An. Then detðMÞ ¼ detðA1Þ detðA2Þ detðAnÞ We need only prove the theorem for n ¼ 2—that is, when M is a square matrix of the form M ¼ A C 0 B . The proof of the general theorem follows easily by induction. Suppose A ¼ ½aij is r-square, B ¼ ½bij is s-square, and M ¼ ½mij is n-square, where n ¼ r þ s. By definition, detðMÞ ¼ P s2Sn ðsgn sÞm1sð1Þm2sð2Þ mnsðnÞ If i r and j r, then mij ¼ 0. Thus, we need only consider those permutations s such that sfr þ 1; r þ 2; . . . ; r þ sg ¼ fr þ 1; r þ 2; . . . ; r þ sg and sf1; 2; . . . ; rg ¼ f1; 2; . . . ; rg Let s1ðkÞ ¼ sðkÞ for k r, and let s2ðkÞ ¼ sðr þ kÞ r for k s. Then ðsgn sÞm1sð1Þm2sð2Þ mnsðnÞ ¼ ðsgn s1Þa1s1ð1Þa2s1ð2Þ ars1ðrÞðsgn s2Þb1s2ð1Þb2s2ð2Þ bss2ðsÞ which implies detðMÞ ¼ detðAÞ detðBÞ. 8.37. Prove Theorem 8.14: There exists a unique function D : M ! K such that (i) D is multilinear, (ii) D is alternating, (iii) DðIÞ ¼ 1. This function D is the determinant function; that is, DðAÞ ¼ jAj. Let D be the determinant function, DðAÞ ¼ jAj. We must show that D satisfies (i), (ii), and (iii), and that D is the only function satisfying (i), (ii), and (iii). By Theorem 8.2, D satisfies (ii) and (iii). Hence, we show that it is multilinear. Suppose the ith row of A ¼ ½aij has the form ðbi1 þ ci1; bi2 þ ci2; . . . ; bin þ cinÞ. Then DðAÞ ¼ DðA1; . . . ; Bi þ Ci; . . . ; AnÞ ¼ P Sn ðsgn sÞa1sð1Þ ai1;sði1ÞðbisðiÞ þ cisðiÞÞ ansðnÞ ¼ P Sn ðsgn sÞa1sð1Þ bisðiÞ ansðnÞ þ P Sn ðsgn sÞa1sð1Þ cisðiÞ ansðnÞ ¼ DðA1; . . . ; Bi; . . . ; AnÞ þ DðA1; . . . ; Ci; . . . ; AnÞ 286 CHAPTER 8 Determinants
  • 294. Also, by Theorem 8.3(ii), DðA1; . . . ; kAi; . . . ; AnÞ ¼ kDðA1; . . . ; Ai; . . . ; AnÞ Thus, D is multilinear—D satisfies (i). We next must prove the uniqueness of D. Suppose D satisfies (i), (ii), and (iii). If fe1; . . . ; eng is the usual basis of Kn , then, by (iii), Dðe1; e2; . . . ; enÞ ¼ DðIÞ ¼ 1. Using (ii), we also have that Dðei1 ; ei2 ; . . . ; ein Þ ¼ sgn s; where s ¼ i1i2 in ð1Þ Now suppose A ¼ ½aij. Observe that the kth row Ak of A is Ak ¼ ðak1; ak2; . . . ; aknÞ ¼ ak1e1 þ ak2e2 þ þ aknen Thus, DðAÞ ¼ Dða11e1 þ þ a1nen; a21e1 þ þ a2nen; . . . ; an1e1 þ þ annenÞ Using the multilinearity of D, we can write DðAÞ as a sum of terms of the form DðAÞ ¼ P Dða1i1 ei1 ; a2i2 ei2 ; . . . ; anin ein Þ ¼ P ða1i1 a2i2 anin ÞDðei1 ; ei2 ; . . . ; ein Þ ð2Þ where the sum is summed over all sequences i1i2 . . . in, where ik 2 f1; . . . ; ng. If two of the indices are equal, say ij ¼ ik but j 6¼ k, then, by (ii), Dðei1 ; ei2 ; . . . ; ein Þ ¼ 0 Accordingly, the sum in (2) need only be summed over all permutations s ¼ i1i2 in. Using (1), we finally have that DðAÞ ¼ P s ða1i1 a2i2 anin ÞDðei1 ; ei2 ; . . . ; ein Þ ¼ P s ðsgn sÞa1i1 a2i2 anin ; where s ¼ i1i2 in Hence, D is the determinant function, and so the theorem is proved. SUPPLEMENTARY PROBLEMS Computation of Determinants 8.38. Evaluate: (a) 2 6 4 1 , (b) 5 1 3 2 , (c) 2 8 5 3 , (d) 4 9 1 3 , (e) a þ b a b a þ b 8.39. Find all t such that (a) t 4 3 2 t 9 ¼ 0, (b) t 1 4 3 t 2 ¼ 0 8.40. Compute the determinant of each of the following matrices: (a) 2 1 1 0 5 2 1 3 4 2 4 3 5, (b) 3 2 4 2 5 1 0 6 1 2 4 3 5, (c) 2 1 4 6 3 2 4 1 2 2 4 3 5, (d) 7 6 5 1 2 1 3 2 1 2 4 3 5 CHAPTER 8 Determinants 287
  • 295. 8.41. Find the determinant of each of the following matrices: (a) 1 2 2 3 1 0 2 0 3 1 1 2 4 3 0 2 2 6 6 4 3 7 7 5, (b) 2 1 3 2 3 0 1 2 1 1 4 3 2 2 1 1 2 6 6 4 3 7 7 5 8.42. Evaluate: (a) 2 1 3 4 2 1 2 1 3 3 5 4 5 2 1 4 , (b) 2 1 4 3 1 1 0 2 3 2 3 1 1 2 2 3 , (c) 1 2 3 1 1 1 2 0 2 0 4 5 1 4 4 6 8.43. Evaluate each of the following determinants: (a) 1 2 1 3 1 2 1 1 2 3 3 1 0 2 1 5 1 2 3 4 2 3 1 1 2 , (b) 1 3 5 7 9 2 4 2 4 2 0 0 1 2 3 0 0 5 6 2 0 0 2 3 1 , (c) 1 2 3 4 5 5 4 3 2 1 0 0 6 5 1 0 0 0 7 4 0 0 0 2 3 Cofactors, Classical Adjoints, Inverses 8.44. Find detðAÞ, adj A, and A1 , where (a) A ¼ 1 1 0 1 1 1 0 2 1 2 4 3 5, (b) A ¼ 1 2 2 3 1 0 1 1 1 2 4 3 5 8.45. Find the classical adjoint of each matrix in Problem 8.41. 8.46. Let A ¼ a b c d . (a) Find adj A, (b) Show that adjðadj AÞ ¼ A, (c) When does A ¼ adj A? 8.47. Show that if A is diagonal (triangular) then adj A is diagonal (triangular). 8.48. Suppose A ¼ ½aij is triangular. Show that (a) A is invertible if and only if each diagonal element aii 6¼ 0. (b) The diagonal elements of A1 (if it exists) are a1 ii , the reciprocals of the diagonal elements of A. Minors, Principal Minors 8.49. Let A ¼ 1 2 3 2 1 0 2 3 3 1 2 5 4 3 0 1 2 6 6 4 3 7 7 5 and B ¼ 1 3 1 5 2 3 1 4 0 5 2 1 3 0 5 2 2 6 6 4 3 7 7 5. Find the minor and the signed minor corresponding to the following submatrices: (a) Að1; 4; 3; 4Þ, (b) Bð1; 4; 3; 4Þ, (c) Að2; 3; 2; 4Þ, (d) Bð2; 3; 2; 4Þ. 8.50. For k ¼ 1; 2; 3, find the sum Sk of all principal minors of order k for (a) A ¼ 1 3 2 2 4 3 5 2 1 2 4 3 5, (b) B ¼ 1 5 4 2 6 1 3 2 0 2 4 3 5, (c) C ¼ 1 4 3 2 1 5 4 7 11 2 4 3 5 288 CHAPTER 8 Determinants
  • 296. 8.51. For k ¼ 1; 2; 3; 4, find the sum Sk of all principal minors of order k for (a) A ¼ 1 2 3 1 1 2 0 5 0 1 2 2 4 0 1 3 2 6 6 4 3 7 7 5, (b) B ¼ 1 2 1 2 0 1 2 3 1 3 0 4 2 7 4 5 2 6 6 4 3 7 7 5 Determinants and Linear Equations 8.52. Solve the following systems by determinants: (a) 3x þ 5y ¼ 8 4x 2y ¼ 1 , (b) 2x 3y ¼ 1 4x þ 7y ¼ 1 , (c) ax 2by ¼ c 3ax 5by ¼ 2c ðab 6¼ 0Þ 8.53. Solve the following systems by determinants: (a) 2x 5y þ 2z ¼ 2 x þ 2y 4z ¼ 5 3x 4y 6z ¼ 1 8 : , (b) 2z þ 3 ¼ y þ 3x x 3z ¼ 2y þ 1 3y þ z ¼ 2 2x 8 : 8.54. Prove Theorem 8.11: The system AX ¼ 0 has a nonzero solution if and only if D ¼ jAj ¼ 0. Permutations 8.55. Find the parity of the permutations s ¼ 32154, t ¼ 13524, p ¼ 42531 in S5. 8.56. For the permutations in Problem 8.55, find (a) t s, (b) p s, (c) s1 , (d) t1 . 8.57. Let t 2 Sn: Show that t s runs through Sn as s runs through Sn; that is, Sn ¼ ft s : s 2 Sng: 8.58. Let s 2 Sn have the property that sðnÞ ¼ n. Let s* 2 Sn1 be defined by s*ðxÞ ¼ sðxÞ. (a) Show that sgn s* ¼ sgn s, (b) Show that as s runs through Sn, where sðnÞ ¼ n, s* runs through Sn1; that is, Sn1 ¼ fs* : s 2 Sn; sðnÞ ¼ ng: 8.59. Consider a permutation s ¼ j1 j2 . . . jn. Let feig be the usual basis of Kn , and let A be the matrix whose ith row is eji [i.e., A ¼ ðej1 , ej2 ; . . . ; ejn Þ]. Show that jAj ¼ sgn s. Determinant of Linear Operators 8.60. Find the determinant of each of the following linear transformations: (a) T:R2 ! R2 defined by Tðx; yÞ ¼ ð2x 9y; 3x 5yÞ, (b) T:R3 ! R3 defined by Tðx; y; zÞ ¼ ð3x 2z; 5y þ 7z; x þ y þ zÞ, (c) T:R3 ! R2 defined by Tðx; y; zÞ ¼ ð2x þ 7y 4z; 4x 6y þ 2zÞ. 8.61. Let D:V ! V be the differential operator; that is, Dð f ðtÞÞ ¼ df =dt. Find detðDÞ if V is the vector space of functions with the following bases: (a) f1; t; . . . ; t5 g, (b) fet ; e2t ; e3t g, (c) fsin t; cos tg. 8.62. Prove Theorem 8.13: Let F and G be linear operators on a vector space V. Then (i) detðF GÞ ¼ detðFÞ detðGÞ, (ii) F is invertible if and only if detðFÞ 6¼ 0. 8.63. Prove (a) detð1V Þ ¼ 1, where 1V is the identity operator, (b) -detðT1 Þ ¼ detðTÞ1 when T is invertible. CHAPTER 8 Determinants 289
  • 297. Miscellaneous Problems 8.64. Find the volume VðSÞ of the parallelopiped S in R3 determined by the following vectors: (a) u1 ¼ ð1; 2; 3Þ, u2 ¼ ð3; 4; 1Þ, u3 ¼ ð2; 1; 5Þ, (b) u1 ¼ ð1; 1; 3Þ, u2 ¼ ð1; 2; 4Þ, u3 ¼ ð4; 1; 5Þ. 8.65. Find the volume VðSÞ of the parallelepiped S in R4 determined by the following vectors: u1 ¼ ð1; 2; 5; 1Þ; u2 ¼ ð2; 1; 2; 1Þ; u3 ¼ ð3; 0; 1 2Þ; u4 ¼ ð1; 1; 4; 1Þ 8.66. Let V be the space of 2 2 matrices M ¼ a b c d over R. Determine whether D:V ! R is 2-linear (with respect to the rows), where ðaÞ DðMÞ ¼ a þ d; ðbÞ DðMÞ ¼ ad; ðcÞ DðMÞ ¼ ac bd; ðdÞ DðMÞ ¼ ab cd; ðeÞ DðMÞ ¼ 0 ðfÞ DðMÞ ¼ 1 8.67. Let A be an n-square matrix. Prove jkAj ¼ kn jAj. 8.68. Let A; B; C; D be commuting n-square matrices. Consider the 2n-square block matrix M ¼ A B C D . Prove that jMj ¼ jAjjDj jBjjCj. Show that the result may not be true if the matrices do not commute. 8.69. Suppose A is orthogonal; that is, AT A ¼ I. Show that detðAÞ ¼ 1. 8.70. Let V be the space of m-square matrices viewed as m-tuples of row vectors. Suppose D:V ! K is m-linear and alternating. Show that (a) Dð. . . ; A; . . . ; B; . . .Þ ¼ Dð. . . ; B; . . . ; A; . . .Þ; sign changed when two rows are interchanged. (b) If A1; A2; . . . ; Am are linearly dependent, then DðA1; A2; . . . ; AmÞ ¼ 0. 8.71. Let V be the space of m-square matrices (as above), and suppose D: V ! K. Show that the following weaker statement is equivalent to D being alternating: DðA1; A2; . . . ; AnÞ ¼ 0 whenever Ai ¼ Aiþ1 for some i Let V be the space of n-square matrices over K. Suppose B 2 V is invertible and so detðBÞ 6¼ 0. Define D: V ! K by DðAÞ ¼ detðABÞ=detðBÞ, where A 2 V. Hence, DðA1; A2; . . . ; AnÞ ¼ detðA1B; A2B; . . . ; AnBÞ=detðBÞ where Ai is the ith row of A, and so AiB is the ith row of AB. Show that D is multilinear and alternating, and that DðIÞ ¼ 1. (This method is used by some texts to prove that jABj ¼ jAjjBj.) 8.72. Show that g ¼ gðx1; . . . ; xnÞ ¼ ð1Þn Vn1ðxÞ where g ¼ gðxiÞ is the difference product in Problem 8.19, x ¼ xn, and Vn1 is the Vandermonde determinant defined by Vn1ðxÞ 1 1 . . . 1 1 x1 x2 . . . xn1 x x2 1 x2 2 . . . x2 n1 x2 :::::::::::::::::::::::::::::::::::::::::::: xn1 1 xn1 2 . . . xn1 n1 xn1 2 6 6 6 6 6 6 6 4 8.73. Let A be any matrix. Show that the signs of a minor A½I; J and its complementary minor A½I0 ; J0 are equal. 290 CHAPTER 8 Determinants
  • 298. 8.74. Let A be an n-square matrix. The determinantal rank of A is the order of the largest square submatrix of A (obtained by deleting rows and columns of A) whose determinant is not zero. Show that the determinantal rank of A is equal to its rank—the maximum number of linearly independent rows (or columns). ANSWERS TO SUPPLEMENTARY PROBLEMS Notation: M ¼ ½R1; R2; . . . denotes a matrix with rows R1; R2; : . . . 8.38. (a) 22, (b) 13, (c) 46, (d) 21, (e) a2 þ ab þ b2 8.39. (a) 3; 10; (b) 5; 2 8.40. (a) 21, (b) 11, (c) 100, (d) 0 8.41. (a) 131, (b) 55 8.42. (a) 33, (b) 0, (c) 45 8.43. (a) 32, (b) 14, (c) 468 8.44. (a) jAj ¼ 2; adj A ¼ ½1; 1; 1; 1; 1; 1; 2; 2; 0, (b) jAj ¼ 1; adj A ¼ ½1; 0; 2; 3; 1; 6; 2; 1; 5. Also, A1 ¼ ðadj AÞ=jAj 8.45. (a) ½16; 29; 26; 2; 30; 38; 16; 29; 8; 51; 13; 1; 13; 1; 28; 18, (b) ½21; 14; 17; 19; 44; 11; 33; 11; 29; 1; 13; 21; 17; 7; 19; 18 8.46. (a) adj A ¼ ½d; b; c; a, (c) A ¼ kI 8.49. (a) 3; 3, (b) 23; 23, (c) 3; 3, (d) 17; 17 8.50. (a) 2; 17; 73, (b) 7; 10; 105, (c) 13; 54; 0 8.51. (a) 6; 13; 62; 219; (b) 7; 37; 30; 20 8.52. (a) x ¼ 21 26 ; y ¼ 29 26; (b) x ¼ 5 13 ; y ¼ 1 13; (c) x ¼ c a ; y ¼ c b 8.53. (a) x ¼ 5; y ¼ 2; z ¼ 1, (b) Because D ¼ 0, the system cannot be solved by determinants. 8.55. (a) sgn s ¼ 1; sgn t ¼ 1; sgn p ¼ 1 8.56. (a) t s ¼ 53142, (b) p s ¼ 52413, (c) s1 ¼ 32154, (d) t1 ¼ 14253 8.60. (a) detðTÞ ¼ 17, (b) detðTÞ ¼ 4, (c) not defined 8.61. (a) 0, (b) 6, (c) 1 8.64. (a) 18, (b) 0 8.65. 17 8.66. (a) no, (b) yes, (c) yes, (d) no, (e) yes, (f ) no CHAPTER 8 Determinants 291
  • 299. CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors 9.1 Introduction The ideas in this chapter can be discussed from two points of view. Matrix Point of View Suppose an n-square matrix A is given. The matrix A is said to be diagonalizable if there exists a nonsingular matrix P such that B ¼ P1 AP is diagonal. This chapter discusses the diagonalization of a matrix A. In particular, an algorithm is given to find the matrix P when it exists. Linear Operator Point of View Suppose a linear operator T: V ! V is given. The linear operator T is said to be diagonalizable if there exists a basis S of V such that the matrix representation of T relative to the basis S is a diagonal matrix D. This chapter discusses conditions under which the linear operator T is diagonalizable. Equivalence of the Two Points of View The above two concepts are essentially the same. Specifically, a square matrix A may be viewed as a linear operator F defined by FðXÞ ¼ AX where X is a column vector, and B ¼ P1 AP represents F relative to a new coordinate system (basis) S whose elements are the columns of P. On the other hand, any linear operator T can be represented by a matrix A relative to one basis and, when a second basis is chosen, T is represented by the matrix B ¼ P1 AP where P is the change-of-basis matrix. Most theorems will be stated in two ways: one in terms of matrices A and again in terms of linear mappings T. Role of Underlying Field K The underlying number field K did not play any special role in our previous discussions on vector spaces and linear mappings. However, the diagonalization of a matrix A or a linear operator T will depend on the CHAPTER 9 292
  • 300. roots of a polynomial DðtÞ over K, and these roots do depend on K. For example, suppose DðtÞ ¼ t2 þ 1. Then DðtÞ has no roots if K ¼ R, the real field; but DðtÞ has roots i if K ¼ C, the complex field. Furthermore, finding the roots of a polynomial with degree greater than two is a subject unto itself (frequently discussed in numerical analysis courses). Accordingly, our examples will usually lead to those polynomials DðtÞ whose roots can be easily determined. 9.2 Polynomials of Matrices Consider a polynomial f ðtÞ ¼ antn þ þ a1t þ a0 over a field K. Recall (Section 2.8) that if A is any square matrix, then we define f ðAÞ ¼ anAn þ þ a1A þ a0I where I is the identity matrix. In particular, we say that A is a root of f ðtÞ if f ðAÞ ¼ 0, the zero matrix. EXAMPLE 9.1 Let A ¼ 1 2 3 4 . Then A2 ¼ 7 10 15 22 . Let f ðtÞ ¼ 2t2 3t þ 5 and gðtÞ ¼ t2 5t 2 Then f ðAÞ ¼ 2A2 3A þ 5I ¼ 14 20 30 44 þ 3 6 9 12 þ 5 0 0 5 ¼ 16 14 21 37 and gðAÞ ¼ A2 5A 2I ¼ 7 10 15 22 þ 5 10 15 20 þ 2 0 0 2 ¼ 0 0 0 0 Thus, A is a zero of gðtÞ. The following theorem (proved in Problem 9.7) applies. THEOREM 9.1: Let f and g be polynomials. For any square matrix A and scalar k, (i) ð f þ gÞðAÞ ¼ f ðAÞ þ gðAÞ (iii) ðkf ÞðAÞ ¼ kf ðAÞ (ii) ð fgÞðAÞ ¼ f ðAÞgðAÞ (iv) f ðAÞgðAÞ ¼ gðAÞ f ðAÞ: Observe that (iv) tells us that any two polynomials in A commute. Matrices and Linear Operators Now suppose that T: V ! V is a linear operator on a vector space V. Powers of T are defined by the composition operation: T2 ¼ T T; T3 ¼ T2 T; . . . Also, for any polynomial f ðtÞ ¼ antn þ þ a1t þ a0, we define f ðTÞ in the same way as we did for matrices: f ðTÞ ¼ anTn þ þ a1T þ a0I where I is now the identity mapping. We also say that T is a zero or root of f ðtÞ if f ðTÞ ¼ 0; the zero mapping. We note that the relations in Theorem 9.1 hold for linear operators as they do for matrices. Remark: Suppose A is a matrix representation of a linear operator T. Then f ðAÞ is the matrix representation of f ðTÞ, and, in particular, f ðTÞ ¼ 0 if and only if f ðAÞ ¼ 0. CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors 293
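Evaluating a polynomial at a matrix requires only matrix powers and scalar multiples of I, so Example 9.1 can be checked numerically. The sketch below (poly_of_matrix is our own name) uses Horner's scheme with NumPy; it reproduces f(A) and confirms that g(A) is the zero matrix, so A is a root of g(t).

```python
import numpy as np

def poly_of_matrix(coeffs, A):
    """Evaluate f(A) for f(t) = coeffs[0]*t^n + ... + coeffs[-1], by Horner's scheme."""
    result = np.zeros_like(A)
    for c in coeffs:
        result = result @ A + c * np.eye(A.shape[0], dtype=A.dtype)
    return result

A = np.array([[1, 2], [3, 4]])
print(poly_of_matrix([2, -3, 5], A))    # f(A) = [[16 14] [21 37]], as in Example 9.1
print(poly_of_matrix([1, -5, -2], A))   # g(A) = the 2x2 zero matrix
```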
  • 301. 9.3 Characteristic Polynomial, Cayley–Hamilton Theorem Let A ¼ ½aij be an n-square matrix. The matrix M ¼ A tIn, where In is the n-square identity matrix and t is an indeterminate, may be obtained by subtracting t down the diagonal of A. The negative of M is the matrix tIn A, and its determinant DðtÞ ¼ detðtIn AÞ ¼ ð1Þn detðA tInÞ which is a polynomial in t of degree n and is called the characteristic polynomial of A. We state an important theorem in linear algebra (proved in Problem 9.8). THEOREM 9.2: (Cayley–Hamilton) Every matrix A is a root of its characteristic polynomial. Remark: Suppose A ¼ ½aij is a triangular matrix. Then tI A is a triangular matrix with diagonal entries t aii; hence, DðtÞ ¼ detðtI AÞ ¼ ðt a11Þðt a22Þ ðt annÞ Observe that the roots of DðtÞ are the diagonal elements of A. EXAMPLE 9.2 Let A ¼ 1 3 4 5 . Its characteristic polynomial is DðtÞ ¼ jtI Aj ¼ t 1 3 4 t 5 ¼ ðt 1Þðt 5Þ 12 ¼ t2 6t 7 As expected from the Cayley–Hamilton theorem, A is a root of DðtÞ; that is, DðAÞ ¼ A2 6A 7I ¼ 13 18 24 37 þ 6 18 24 30 þ 7 0 0 7 ¼ 0 0 0 0 Now suppose A and B are similar matrices, say B ¼ P1 AP, where P is invertible. We show that A and B have the same characteristic polynomial. Using tI ¼ P1 tIP, we have DBðtÞ ¼ detðtI BÞ ¼ detðtI P1 APÞ ¼ detðP1 tIP P1 APÞ ¼ det½P1 ðtI AÞP ¼ detðP1 Þ detðtI AÞ detðPÞ Using the fact that determinants are scalars and commute and that detðP1 Þ detðPÞ ¼ 1, we finally obtain DBðtÞ ¼ detðtI AÞ ¼ DAðtÞ Thus, we have proved the following theorem. THEOREM 9.3: Similar matrices have the same characteristic polynomial. Characteristic Polynomials of Degrees 2 and 3 There are simple formulas for the characteristic polynomials of matrices of orders 2 and 3. (a) Suppose A ¼ a11 a12 a21 a22 . Then DðtÞ ¼ t2 ða11 þ a22Þt þ detðAÞ ¼ t2 trðAÞ t þ detðAÞ Here trðAÞ denotes the trace of A—that is, the sum of the diagonal elements of A. (b) Suppose A ¼ a11 a12 a13 a21 a22 a23 a31 a32 a33 2 4 3 5. Then DðtÞ ¼ t3 trðAÞ t2 þ ðA11 þ A22 þ A33Þt detðAÞ (Here A11, A22, A33 denote, respectively, the cofactors of a11, a22, a33.) 294 CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors
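Both the Cayley–Hamilton theorem and Theorem 9.3 can be confirmed numerically for the matrix of Example 9.2. The short Python/NumPy check below is an aside; the invertible matrix P used to build a similar matrix is an arbitrary choice made here for illustration.

```python
import numpy as np

A = np.array([[1., 3.], [4., 5.]])

print(np.round(np.poly(A), 6))          # characteristic coefficients [1, -6, -7], i.e. t^2 - 6t - 7

# Cayley-Hamilton: Delta(A) = A^2 - 6A - 7I is the zero matrix
print(A @ A - 6 * A - 7 * np.eye(2))    # [[0, 0], [0, 0]]

# Theorem 9.3: a similar matrix B = P^{-1} A P has the same characteristic polynomial
P = np.array([[2., 1.], [1., 1.]])      # any invertible P
B = np.linalg.inv(P) @ A @ P
print(np.round(np.poly(B), 6))          # again [1, -6, -7]
```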
EXAMPLE 9.3 Find the characteristic polynomial of each of the following matrices:

(a) A = [5 3; 2 10], (b) B = [7 −1; 6 2], (c) C = [5 −2; 4 −4]

(a) We have tr(A) = 5 + 10 = 15 and |A| = 50 − 6 = 44; hence, Δ(t) = t² − 15t + 44.
(b) We have tr(B) = 7 + 2 = 9 and |B| = 14 + 6 = 20; hence, Δ(t) = t² − 9t + 20.
(c) We have tr(C) = 5 − 4 = 1 and |C| = −20 + 8 = −12; hence, Δ(t) = t² − t − 12.

EXAMPLE 9.4 Find the characteristic polynomial of A = [1 1 2; 0 3 2; 1 3 9].

We have tr(A) = 1 + 3 + 9 = 13. The cofactors of the diagonal elements are as follows:

A₁₁ = |3 2; 3 9| = 21,  A₂₂ = |1 2; 1 9| = 7,  A₃₃ = |1 1; 0 3| = 3

Thus, A₁₁ + A₂₂ + A₃₃ = 31. Also, |A| = 27 + 2 + 0 − 6 − 6 − 0 = 17. Accordingly,

Δ(t) = t³ − 13t² + 31t − 17

Remark: The coefficients of the characteristic polynomial Δ(t) of the 3-square matrix A are, with alternating signs, as follows:

S₁ = tr(A),  S₂ = A₁₁ + A₂₂ + A₃₃,  S₃ = det(A)

We note that each Sₖ is the sum of all principal minors of A of order k. The next theorem, whose proof lies beyond the scope of this text, tells us that this result is true in general.

THEOREM 9.4: Let A be an n-square matrix. Then its characteristic polynomial is

Δ(t) = tⁿ − S₁tⁿ⁻¹ + S₂tⁿ⁻² − ⋯ + (−1)ⁿSₙ

where Sₖ is the sum of the principal minors of order k.

Characteristic Polynomial of a Linear Operator

Now suppose T: V → V is a linear operator on a vector space V of finite dimension. We define the characteristic polynomial Δ(t) of T to be the characteristic polynomial of any matrix representation of T. Recall that if A and B are matrix representations of T, then B = P⁻¹AP, where P is a change-of-basis matrix. Thus, A and B are similar, and by Theorem 9.3, A and B have the same characteristic polynomial. Accordingly, the characteristic polynomial of T is independent of the particular basis in which the matrix representation of T is computed.

Because f(T) = 0 if and only if f(A) = 0, where f(t) is any polynomial and A is any matrix representation of T, we have the following analogous theorem for linear operators.

THEOREM 9.2′: (Cayley–Hamilton) A linear operator T is a zero of its characteristic polynomial.
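Theorem 9.4 also gives a direct recipe for computing Δ(t): sum the principal minors of each order. The sketch below is a computational aside assuming NumPy; the helper char_poly_coeffs is an illustrative name. It reproduces the coefficients of Example 9.4 and checks them against NumPy's own routine.

```python
import numpy as np
from itertools import combinations

def char_poly_coeffs(A):
    """Coefficients of Delta(t) = t^n - S1 t^(n-1) + ... + (-1)^n Sn, highest degree first,
    where Sk is the sum of the principal minors of A of order k (Theorem 9.4)."""
    n = A.shape[0]
    coeffs = [1.0]
    for k in range(1, n + 1):
        Sk = sum(np.linalg.det(A[np.ix_(idx, idx)]) for idx in combinations(range(n), k))
        coeffs.append((-1) ** k * Sk)
    return np.array(coeffs)

A = np.array([[1., 1., 2.], [0., 3., 2.], [1., 3., 9.]])     # Example 9.4
print(np.round(char_poly_coeffs(A), 6))   # [1, -13, 31, -17] -> t^3 - 13t^2 + 31t - 17
print(np.round(np.poly(A), 6))            # NumPy's characteristic coefficients agree
```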
9.4 Diagonalization, Eigenvalues and Eigenvectors

Let A be any n-square matrix. Then A can be represented by (or is similar to) a diagonal matrix D = diag(k₁, k₂, ..., kₙ) if and only if there exists a basis S consisting of (column) vectors u₁, u₂, ..., uₙ such that

Au₁ = k₁u₁,  Au₂ = k₂u₂,  ...,  Auₙ = kₙuₙ

In such a case, A is said to be diagonalizable. Furthermore, D = P⁻¹AP, where P is the nonsingular matrix whose columns are, respectively, the basis vectors u₁, u₂, ..., uₙ.

The above observation leads us to the following definition.

DEFINITION: Let A be any square matrix. A scalar λ is called an eigenvalue of A if there exists a nonzero (column) vector v such that Av = λv. Any vector satisfying this relation is called an eigenvector of A belonging to the eigenvalue λ.

We note that each scalar multiple kv of an eigenvector v belonging to λ is also such an eigenvector, because A(kv) = k(Av) = k(λv) = λ(kv). The set Eλ of all such eigenvectors is a subspace of V (Problem 9.19), called the eigenspace of λ. (If dim Eλ = 1, then Eλ is called an eigenline and λ is called a scaling factor.)

The terms characteristic value and characteristic vector (or proper value and proper vector) are sometimes used instead of eigenvalue and eigenvector.

The above observation and definitions give us the following theorem.

THEOREM 9.5: An n-square matrix A is similar to a diagonal matrix D if and only if A has n linearly independent eigenvectors. In this case, the diagonal elements of D are the corresponding eigenvalues and D = P⁻¹AP, where P is the matrix whose columns are the eigenvectors.

Suppose a matrix A can be diagonalized as above, say P⁻¹AP = D, where D is diagonal. Then A has the extremely useful diagonal factorization

A = PDP⁻¹

Using this factorization, the algebra of A reduces to the algebra of the diagonal matrix D, which can be easily calculated. Specifically, suppose D = diag(k₁, k₂, ..., kₙ). Then

Aᵐ = (PDP⁻¹)ᵐ = PDᵐP⁻¹ = P diag(k₁ᵐ, ..., kₙᵐ) P⁻¹

More generally, for any polynomial f(t),

f(A) = f(PDP⁻¹) = P f(D) P⁻¹ = P diag(f(k₁), f(k₂), ..., f(kₙ)) P⁻¹

Furthermore, if the diagonal entries of D are nonnegative, let

B = P diag(√k₁, √k₂, ..., √kₙ) P⁻¹

Then B is a nonnegative square root of A; that is, B² = A and the eigenvalues of B are nonnegative.
  • 304. EXAMPLE 9.5 Let A ¼ 3 1 2 2 and let v1 ¼ 1 2 and v2 ¼ 1 1 . Then Av1 ¼ 3 1 2 2 1 2 ¼ 1 2 ¼ v1 and Av2 ¼ 3 1 2 2 1 1 ¼ 4 4 ¼ 4v2 Thus, v1 and v2 are eigenvectors of A belonging, respectively, to the eigenvalues l1 ¼ 1 and l2 ¼ 4. Observe that v1 and v2 are linearly independent and hence form a basis of R2 . Accordingly, A is diagonalizable. Furthermore, let P be the matrix whose columns are the eigenvectors v1 and v2. That is, let P ¼ 1 1 2 1 # ; and so P1 ¼ 1 3 1 3 2 3 1 3 # Then A is similar to the diagonal matrix D ¼ P1 AP ¼ 1 3 1 3 2 3 1 3 # 3 1 2 2 # 1 1 2 1 # ¼ 1 0 0 4 # As expected, the diagonal elements 1 and 4 in D are the eigenvalues corresponding, respectively, to the eigenvectors v1 and v2, which are the columns of P. In particular, A has the factorization A ¼ PDP1 ¼ 1 1 2 1 # 1 0 0 4 # 1 3 1 3 2 3 1 3 # Accordingly, A4 ¼ 1 1 2 1 # 1 0 0 256 # 1 3 1 3 2 3 1 3 # ¼ 171 85 170 86 # Moreover, suppose f ðtÞ ¼ t3 5t2 þ 3t þ 6; hence, f ð1Þ ¼ 5 and f ð4Þ ¼ 2. Then f ðAÞ ¼ Pf ðDÞP1 ¼ 1 1 2 1 5 0 0 2 1 3 1 3 2 3 1 3 # ¼ 3 1 2 4 Last, we obtain a ‘‘positive square root’’ of A. Specifically, using ffiffiffi 1 p ¼ 1 and ffiffiffi 4 p ¼ 2, we obtain the matrix B ¼ P ffiffiffiffi D p P1 ¼ 1 1 2 1 1 0 0 2 1 3 1 3 2 3 1 3 # ¼ 5 3 1 3 2 3 4 3 # where B2 ¼ A and where B has positive eigenvalues 1 and 2. Remark: Throughout this chapter, we use the following fact: If P ¼ a b c d ; then P1 ¼ d=jPj b=jPj c=jPj a=jPj : That is, P1 is obtained by interchanging the diagonal elements a and d of P, taking the negatives of the nondiagonal elements b and c, and dividing each element by the determinant jPj. Properties of Eigenvalues and Eigenvectors Example 9.5 indicates the advantages of a diagonal representation (factorization) of a square matrix. In the following theorem (proved in Problem 9.20), we list properties that help us to find such a representation. CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors 297
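The computations of Example 9.5 can be replayed numerically. The fragment below is an aside in Python/NumPy, with the signs of the eigenvector v1 = (1, −2) and of P taken as reconstructed here; it confirms the factorization A = PDP⁻¹, the power A⁴, the value f(A), and the square root B.

```python
import numpy as np

A  = np.array([[3., 1.], [2., 2.]])
P  = np.array([[1., 1.], [-2., 1.]])          # columns: eigenvectors v1 = (1, -2), v2 = (1, 1)
Pi = np.linalg.inv(P)
D  = np.diag([1., 4.])

print(np.allclose(P @ D @ Pi, A))             # True: A = P D P^{-1}
print(P @ np.diag([1.0, 256.0]) @ Pi)         # A^4 = [[171, 85], [170, 86]]

f = lambda t: t**3 - 5*t**2 + 3*t + 6
print(P @ np.diag([f(1.0), f(4.0)]) @ Pi)     # f(A) = [[3, -1], [-2, 4]]

B = P @ np.diag([1.0, 2.0]) @ Pi              # square roots of the eigenvalues 1 and 4
print(np.allclose(B @ B, A))                  # True: B^2 = A
```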
  • 305. THEOREM 9.6: Let A be a square matrix. Then the following are equivalent. (i) A scalar l is an eigenvalue of A. (ii) The matrix M ¼ A lI is singular. (iii) The scalar l is a root of the characteristic polynomial DðtÞ of A. The eigenspace El of an eigenvalue l is the solution space of the homogeneous system MX ¼ 0, where M ¼ A lI; that is, M is obtained by subtracting l down the diagonal of A. Some matrices have no eigenvalues and hence no eigenvectors. However, using Theorem 9.6 and the Fundamental Theorem of Algebra (every polynomial over the complex field C has a root), we obtain the following result. THEOREM 9.7: Let A be a square matrix over the complex field C. Then A has at least one eigenvalue. The following theorems will be used subsequently. (The theorem equivalent to Theorem 9.8 for linear operators is proved in Problem 9.21, and Theorem 9.9 is proved in Problem 9.22.) THEOREM 9.8: Suppose v1; v2; . . . ; vn are nonzero eigenvectors of a matrix A belonging to distinct eigenvalues l1; l2; . . . ; ln. Then v1; v2; . . . ; vn are linearly independent. THEOREM 9.9: Suppose the characteristic polynomial DðtÞ of an n-square matrix A is a product of n distinct factors, say, DðtÞ ¼ ðt a1Þðt a2Þ ðt anÞ. Then A is similar to the diagonal matrix D ¼ diagða1; a2; . . . ; anÞ. If l is an eigenvalue of a matrix A, then the algebraic multiplicity of l is defined to be the multiplicity of l as a root of the characteristic polynomial of A, and the geometric multiplicity of l is defined to be the dimension of its eigenspace, dim El. The following theorem (whose equivalent for linear operators is proved in Problem 9.23) holds. THEOREM 9.10: The geometric multiplicity of an eigenvalue l of a matrix A does not exceed its algebraic multiplicity. Diagonalization of Linear Operators Consider a linear operator T: V ! V. Then T is said to be diagonalizable if it can be represented by a diagonal matrix D. Thus, T is diagonalizable if and only if there exists a basis S ¼ fu1; u2; . . . ; ung of V for which Tðu1Þ ¼ k1u1 Tðu2Þ ¼ k2u2 ::::::::::::::::::::::::::::::::::::::: TðunÞ ¼ knun In such a case, T is represented by the diagonal matrix D ¼ diagðk1; k2; . . . ; knÞ relative to the basis S. The above observation leads us to the following definitions and theorems, which are analogous to the definitions and theorems for matrices discussed above. DEFINITION: Let T be a linear operator. A scalar l is called an eigenvalue of T if there exists a nonzero vector v such that TðvÞ ¼ lv. Every vector satisfying this relation is called an eigenvector of T belonging to the eigenvalue l. 298 CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors
  • 306. The set El of all eigenvectors belonging to an eigenvalue l is a subspace of V, called the eigenspace of l. (Alternatively, l is an eigenvalue of T if lI T is singular, and, in this case, El is the kernel of lI T.) The algebraic and geometric multiplicities of an eigenvalue l of a linear operator T are defined in the same way as those of an eigenvalue of a matrix A. The following theorems apply to a linear operator T on a vector space V of finite dimension. THEOREM 9.50 : T can be represented by a diagonal matrix D if and only if there exists a basis S of V consisting of eigenvectors of T. In this case, the diagonal elements of D are the corresponding eigenvalues. THEOREM 9.60 : Let T be a linear operator. Then the following are equivalent: (i) A scalar l is an eigenvalue of T. (ii) The linear operator lI T is singular. (iii) The scalar l is a root of the characteristic polynomial DðtÞ of T. THEOREM 9.70 : Suppose V is a complex vector space. Then T has at least one eigenvalue. THEOREM 9.80 : Suppose v1; v2; . . . ; vn are nonzero eigenvectors of a linear operator T belonging to distinct eigenvalues l1; l2; . . . ; ln. Then v1; v2; . . . ; vn are linearly independent. THEOREM 9.90 : Suppose the characteristic polynomial DðtÞ of T is a product of n distinct factors, say, DðtÞ ¼ ðt a1Þðt a2Þ ðt anÞ. Then T can be represented by the diagonal matrix D ¼ diagða1; a2; . . . ; anÞ. THEOREM 9.100 : The geometric multiplicity of an eigenvalue l of T does not exceed its algebraic multiplicity. Remark: The following theorem reduces the investigation of the diagonalization of a linear operator T to the diagonalization of a matrix A. THEOREM 9.11: Suppose A is a matrix representation of T. Then T is diagonalizable if and only if A is diagonalizable. 9.5 Computing Eigenvalues and Eigenvectors, Diagonalizing Matrices This section gives an algorithm for computing eigenvalues and eigenvectors for a given square matrix A and for determining whether or not a nonsingular matrix P exists such that P1 AP is diagonal. ALGORITHM 9.1: (Diagonalization Algorithm) The input is an n-square matrix A. Step 1. Find the characteristic polynomial DðtÞ of A. Step 2. Find the roots of DðtÞ to obtain the eigenvalues of A. Step 3. Repeat (a) and (b) for each eigenvalue l of A. (a) Form the matrix M ¼ A lI by subtracting l down the diagonal of A. (b) Find a basis for the solution space of the homogeneous system MX ¼ 0. (These basis vectors are linearly independent eigenvectors of A belonging to l.) CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors 299
  • 307. Step 4. Consider the collection S ¼ fv1; v2; . . . ; vmg of all eigenvectors obtained in Step 3. (a) If m 6¼ n, then A is not diagonalizable. (b) If m ¼ n, then A is diagonalizable. Specifically, let P be the matrix whose columns are the eigenvectors v1; v2; . . . ; vn. Then D ¼ P1 AP ¼ diagðl1; l2; . . . ; lnÞ where li is the eigenvalue corresponding to the eigenvector vi. EXAMPLE 9.6 The diagonalizable algorithm is applied to A ¼ 4 2 3 1 . (1) The characteristic polynomial DðtÞ of A is computed. We have trðAÞ ¼ 4 1 ¼ 3; jAj ¼ 4 6 ¼ 10; hence, DðtÞ ¼ t2 3t 10 ¼ ðt 5Þðt þ 2Þ (2) Set DðtÞ ¼ ðt 5Þðt þ 2Þ ¼ 0. The roots l1 ¼ 5 and l2 ¼ 2 are the eigenvalues of A. (3) (i) We find an eigenvector v1 of A belonging to the eigenvalue l1 ¼ 5. Subtract l1 ¼ 5 down the diagonal of A to obtain the matrix M ¼ 1 2 3 6 . The eigenvectors belonging to l1 ¼ 5 form the solution of the homogeneous system MX ¼ 0; that is, 1 2 3 6 x y ¼ 0 0 or x þ 2y ¼ 0 3x 6y ¼ 0 or x þ 2y ¼ 0 The system has only one free variable. Thus, a nonzero solution, for example, v1 ¼ ð2; 1Þ, is an eigenvector that spans the eigenspace of l1 ¼ 5. (ii) We find an eigenvector v2 of A belonging to the eigenvalue l2 ¼ 2. Subtract 2 (or add 2) down the diagonal of A to obtain the matrix M ¼ 6 2 3 1 and the homogenous system 6x þ 2y ¼ 0 3x þ y ¼ 0 or 3x þ y ¼ 0: The system has only one independent solution. Thus, a nonzero solution, say v2 ¼ ð1; 3Þ; is an eigenvector that spans the eigenspace of l2 ¼ 2: (4) Let P be the matrix whose columns are the eigenvectors v1 and v2. Then P ¼ 2 1 1 3 ; and so P1 ¼ 3 7 1 7 1 7 2 7 # Accordingly, D ¼ P1 AP is the diagonal matrix whose diagonal entries are the corresponding eigenvalues; that is, D ¼ P1 AP ¼ 3 7 1 7 1 7 2 7 # 4 2 3 1 2 1 1 3 ¼ 5 0 0 2 EXAMPLE 9.7 Consider the matrix B ¼ 5 1 1 3 . We have trðBÞ ¼ 5 þ 3 ¼ 8; jBj ¼ 15 þ 1 ¼ 16; so DðtÞ ¼ t2 8t þ 16 ¼ ðt 4Þ2 Accordingly, l ¼ 4 is the only eigenvalue of B. 300 CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors
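Algorithm 9.1 as a whole can also be sketched in code; Example 9.7 is completed below. The following Python/NumPy fragment is a rough illustration rather than the text's method: eigenvalues are found numerically instead of by factoring Δ(t), numerically equal roots are grouped with a tolerance, and each eigenspace is obtained as a null space via the SVD.

```python
import numpy as np

def null_space_basis(M, tol=1e-6):
    """Columns spanning the solution space of M X = 0 (via the SVD)."""
    _, s, Vt = np.linalg.svd(M)
    rank = int(np.sum(s > tol))
    return Vt[rank:].T

def diagonalize(A, tol=1e-6):
    """Sketch of Algorithm 9.1: return (P, D) with D = P^{-1} A P diagonal,
    or None when A has fewer than n independent eigenvectors (Step 4(a))."""
    n = A.shape[0]
    eigvals = np.linalg.eigvals(A)                     # Steps 1-2, done numerically
    distinct = []
    for lam in eigvals:                                # group numerically equal roots
        if not any(abs(lam - mu) < tol for mu in distinct):
            distinct.append(lam)
    vecs = [null_space_basis(A - lam * np.eye(n), tol) for lam in distinct]   # Step 3
    P = np.hstack(vecs)
    if P.shape[1] < n:
        return None
    return P, np.round(np.linalg.inv(P) @ A @ P, 10)   # Step 4(b)

A = np.array([[4., 2.], [3., -1.]])     # Example 9.6: diagonalizable, eigenvalues 5 and -2
print(np.diag(diagonalize(A)[1]))       # [5, -2] (order may differ)
B = np.array([[5., -1.], [1., 3.]])     # Example 9.7: only one independent eigenvector
print(diagonalize(B))                   # None
```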
  • 308. Subtract l ¼ 4 down the diagonal of B to obtain the matrix M ¼ 1 1 1 1 and the homogeneous system x y ¼ 0 x y ¼ 0 or x y ¼ 0 The system has only one independent solution; for example, x ¼ 1; y ¼ 1. Thus, v ¼ ð1; 1Þ and its multiples are the only eigenvectors of B. Accordingly, B is not diagonalizable, because there does not exist a basis consisting of eigenvectors of B. EXAMPLE 9.8 Consider the matrix A ¼ 3 5 2 3 . Here trðAÞ ¼ 3 3 ¼ 0 and jAj ¼ 9 þ 10 ¼ 1. Thus, DðtÞ ¼ t2 þ 1 is the characteristic polynomial of A. We consider two cases: (a) A is a matrix over the real field R. Then DðtÞ has no (real) roots. Thus, A has no eigenvalues and no eigenvectors, and so A is not diagonalizable. (b) A is a matrix over the complex field C. Then DðtÞ ¼ ðt iÞðt þ iÞ has two roots, i and i. Thus, A has two distinct eigenvalues i and i, and hence, A has two independent eigenvectors. Accordingly there exists a nonsingular matrix P over the complex field C for which P1 AP ¼ i 0 0 i Therefore, A is diagonalizable (over C). 9.6 Diagonalizing Real Symmetric Matrices and Quadratic Forms There are many real matrices A that are not diagonalizable. In fact, some real matrices may not have any (real) eigenvalues. However, if A is a real symmetric matrix, then these problems do not exist. Namely, we have the following theorems. THEOREM 9.12: Let A be a real symmetric matrix. Then each root l of its characteristic polynomial is real. THEOREM 9.13: Let A be a real symmetric matrix. Suppose u and v are eigenvectors of A belonging to distinct eigenvalues l1 and l2. Then u and v are orthogonal, that; is, hu; vi ¼ 0. The above two theorems give us the following fundamental result. THEOREM 9.14: Let A be a real symmetric matrix. Then there exists an orthogonal matrix P such that D ¼ P1 AP is diagonal. The orthogonal matrix P is obtained by normalizing a basis of orthogonal eigenvectors of A as illustrated below. In such a case, we say that A is ‘‘orthogonally diagonalizable.’’ EXAMPLE 9.9 Let A ¼ 2 2 2 5 , a real symmetric matrix. Find an orthogonal matrix P such that P1 AP is diagonal. First we find the characteristic polynomial DðtÞ of A. We have trðAÞ ¼ 2 þ 5 ¼ 7; jAj ¼ 10 4 ¼ 6; so DðtÞ ¼ t2 7t þ 6 ¼ ðt 6Þðt 1Þ Accordingly, l1 ¼ 6 and l2 ¼ 1 are the eigenvalues of A. (a) Subtracting l1 ¼ 6 down the diagonal of A yields the matrix M ¼ 4 2 2 1 and the homogeneous system 4x 2y ¼ 0 2x y ¼ 0 or 2x þ y ¼ 0 A nonzero solution is u1 ¼ ð1; 2Þ. CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors 301
  • 309. (b) Subtracting l2 ¼ 1 down the diagonal of A yields the matrix M ¼ 1 2 2 4 and the homogeneous system x 2y ¼ 0 (The second equation drops out, because it is a multiple of the first equation.) A nonzero solution is u2 ¼ ð2; 1Þ. As expected from Theorem 9.13, u1 and u2 are orthogonal. Normalizing u1 and u2 yields the orthonormal vectors ^ u1 ¼ ð1= ffiffiffi 5 p ; 2= ffiffiffi 5 p Þ and ^ u2 ¼ ð2= ffiffiffi 5 p ; 1= ffiffiffi 5 p Þ Finally, let P be the matrix whose columns are ^ u1 and ^ u2, respectively. Then P ¼ 1= ffiffiffi 5 p 2= ffiffiffi 5 p 2= ffiffiffi 5 p 1= ffiffiffi 5 p and P1 AP ¼ 6 0 0 1 As expected, the diagonal entries of P1 AP are the eigenvalues corresponding to the columns of P. The procedure in the above Example 9.9 is formalized in the following algorithm, which finds an orthogonal matrix P such that P1 AP is diagonal. ALGORITHM 9.2: (Orthogonal Diagonalization Algorithm) The input is a real symmetric matrix A. Step 1. Find the characteristic polynomial DðtÞ of A. Step 2. Find the eigenvalues of A, which are the roots of DðtÞ. Step 3. For each eigenvalue l of A in Step 2, find an orthogonal basis of its eigenspace. Step 4. Normalize all eigenvectors in Step 3, which then forms an orthonormal basis of Rn . Step 5. Let P be the matrix whose columns are the normalized eigenvectors in Step 4. Application to Quadratic Forms Let q be a real polynomial in variables x1; x2; . . . ; xn such that every term in q has degree two; that is, qðx1; x2; . . . ; xnÞ ¼ P i cix2 i þ P ij dijxixj; where ci; dij 2 R Then q is called a quadratic form. If there are no cross-product terms xixj (i.e., all dij ¼ 0), then q is said to be diagonal. The above quadratic form q determines a real symmetric matrix A ¼ ½aij, where aii ¼ ci and aij ¼ aji ¼ 1 2 dij. Namely, q can be written in the matrix form qðXÞ ¼ XT AX where X ¼ ½x1; x2; . . . ; xnT is the column vector of the variables. Furthermore, suppose X ¼ PY is a linear substitution of the variables. Then substitution in the quadratic form yields qðYÞ ¼ ðPYÞT AðPYÞ ¼ YT ðPT APÞY Thus, PT AP is the matrix representation of q in the new variables. We seek an orthogonal matrix P such that the orthogonal substitution X ¼ PY yields a diagonal quadratic form for which PT AP is diagonal. Because P is orthogonal, PT ¼ P1 , and hence, PT AP ¼ P1 AP. The above theory yields such an orthogonal matrix P. 302 CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors
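For a real symmetric matrix, NumPy's eigh routine performs Steps 1 through 5 of Algorithm 9.2 in one call, returning real eigenvalues and an orthonormal set of eigenvectors. The check below is an aside using the matrix A of Example 9.9; note that eigh lists the eigenvalues in ascending order, so they appear as 1, 6 rather than 6, 1, and the diagonal form of the associated quadratic form comes out as s² + 6t² accordingly.

```python
import numpy as np

A = np.array([[2., -2.], [-2., 5.]])      # Example 9.9; q(x, y) = 2x^2 - 4xy + 5y^2

eigvals, P = np.linalg.eigh(A)            # eigh is tailored to symmetric (Hermitian) matrices
print(eigvals)                            # [1., 6.]
print(np.allclose(P.T @ P, np.eye(2)))    # True: P is orthogonal, so P^T = P^{-1}
print(np.round(P.T @ A @ P, 10))          # diag(1, 6)

# The orthogonal substitution X = P Y turns q into a diagonal quadratic form
q = lambda x, y: 2*x**2 - 4*x*y + 5*y**2
Y = np.array([1.0, 2.0])                  # an arbitrary test point (s, t)
X = P @ Y
print(np.isclose(q(*X), eigvals[0]*Y[0]**2 + eigvals[1]*Y[1]**2))   # True
```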
  • 310. EXAMPLE 9.10 Consider the quadratic form qðx; yÞ ¼ 2x2 4xy þ 5y2 ¼ XT AX; where A ¼ 2 2 2 5 and X ¼ x y By Example 9.9, P1 AP ¼ 6 0 0 1 ¼ PT AP; where P ¼ 1= ffiffiffi 5 p 2= ffiffiffi 5 p 2= ffiffiffi 5 p 1= ffiffiffi 5 p # Let Y ¼ ½s; tT : Then matrix P corresponds to the following linear orthogonal substitution x ¼ PY of the variables x and y in terms of the variables s and t: x ¼ 1 ffiffiffi 5 p s þ 2 ffiffiffi 5 p t; y ¼ 2 ffiffiffi 5 p s þ 1 ffiffiffi 5 p t This substitution in qðx; yÞ yields the diagonal quadratic form qðs; tÞ ¼ 6s2 þ t2 . 9.7 Minimal Polynomial Let A be any square matrix. Let JðAÞ denote the collection of all polynomials f ðtÞ for which A is a root— that is, for which f ðAÞ ¼ 0. The set JðAÞ is not empty, because the Cayley–Hamilton Theorem 9.1 tells us that the characteristic polynomial DAðtÞ of A belongs to JðAÞ. Let mðtÞ denote the monic polynomial of lowest degree in JðAÞ. (Such a polynomial mðtÞ exists and is unique.) We call mðtÞ the minimal polynomial of the matrix A. Remark: A polynomial f ðtÞ 6¼ 0 is monic if its leading coefficient equals one. The following theorem (proved in Problem 9.33) holds. THEOREM 9.15: The minimal polynomial mðtÞ of a matrix (linear operator) A divides every polynomial that has A as a zero. In particular, mðtÞ divides the characteristic polynomial DðtÞ of A. There is an even stronger relationship between mðtÞ and DðtÞ. THEOREM 9.16: The characteristic polynomial DðtÞ and the minimal polynomial mðtÞ of a matrix A have the same irreducible factors. This theorem (proved in Problem 9.35) does not say that mðtÞ ¼ DðtÞ, only that any irreducible factor of one must divide the other. In particular, because a linear factor is irreducible, mðtÞ and DðtÞ have the same linear factors. Hence, they have the same roots. Thus, we have the following theorem. THEOREM 9.17: A scalar l is an eigenvalue of the matrix A if and only if l is a root of the minimal polynomial of A. EXAMPLE 9.11 Find the minimal polynomial mðtÞ of A ¼ 2 2 5 3 7 15 1 2 4 2 4 3 5. First find the characteristic polynomial DðtÞ of A. We have trðAÞ ¼ 5; A11 þ A22 þ A33 ¼ 2 3 þ 8 ¼ 7; and jAj ¼ 3 Hence, DðtÞ ¼ t3 5t2 þ 7t 3 ¼ ðt 1Þ2 ðt 3Þ CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors 303
  • 311. The minimal polynomial mðtÞ must divide DðtÞ. Also, each irreducible factor of DðtÞ (i.e., t 1 and t 3) must also be a factor of mðtÞ. Thus, mðtÞ is exactly one of the following: f ðtÞ ¼ ðt 3Þðt 1Þ or gðtÞ ¼ ðt 3Þðt 1Þ2 We know, by the Cayley–Hamilton theorem, that gðAÞ ¼ DðAÞ ¼ 0. Hence, we need only test f ðtÞ. We have f ðAÞ ¼ ðA IÞðA 3IÞ ¼ 1 2 5 3 6 15 1 2 5 2 4 3 5 1 2 5 3 4 15 1 2 7 2 4 3 5 ¼ 0 0 0 0 0 0 0 0 0 2 4 3 5 Thus, f ðtÞ ¼ mðtÞ ¼ ðt 1Þðt 3Þ ¼ t2 4t þ 3 is the minimal polynomial of A. EXAMPLE 9.12 (a) Consider the following two r-square matrices, where a 6¼ 0: Jðl; rÞ ¼ l 1 0 . . . 0 0 0 l 1 . . . 0 0 ::::::::::::::::::::::::::::::::: 0 0 0 . . . l 1 0 0 0 . . . 0 l 2 6 6 6 6 4 3 7 7 7 7 5 and A ¼ l a 0 . . . 0 0 0 l a . . . 0 0 ::::::::::::::::::::::::::::::::: 0 0 0 . . . l a 0 0 0 . . . 0 l 2 6 6 6 6 4 3 7 7 7 7 5 The first matrix, called a Jordan Block, has l’s on the diagonal, 1’s on the superdiagonal (consisting of the entries above the diagonal entries), and 0’s elsewhere. The second matrix A has l’s on the diagonal, a’s on the superdiagonal, and 0’s elsewhere. [Thus, A is a generalization of Jðl; rÞ.] One can show that f ðtÞ ¼ ðt lÞr is both the characteristic and minimal polynomial of both Jðl; rÞ and A. (b) Consider an arbitrary monic polynomial: f ðtÞ ¼ tn þ an1tn1 þ þ a1t þ a0 Let Cð f Þ be the n-square matrix with 1’s on the subdiagonal (consisting of the entries below the diagonal entries), the negatives of the coefficients in the last column, and 0’s elsewhere as follows: Cð f Þ ¼ 0 0 . . . 0 a0 1 0 . . . 0 a1 0 1 . . . 0 a2 :::::::::::::::::::::::::::::::::: 0 0 . . . 1 an1 2 6 6 6 6 4 3 7 7 7 7 5 Then Cð f Þ is called the companion matrix of the polynomial f ðtÞ. Moreover, the minimal polynomial mðtÞ and the characteristic polynomial DðtÞ of the companion matrix Cð f Þ are both equal to the original polynomial f ðtÞ. Minimal Polynomial of a Linear Operator The minimal polynomial mðtÞ of a linear operator T is defined to be the monic polynomial of lowest degree for which T is a root. However, for any polynomial f ðtÞ, we have f ðTÞ ¼ 0 if and only if f ðAÞ ¼ 0 where A is any matrix representation of T. Accordingly, T and A have the same minimal polynomials. Thus, the above theorems on the minimal polynomial of a matrix also hold for the minimal polynomial of a linear operator. That is, we have the following theorems. THEOREM 9.150 : The minimal polynomial mðtÞ of a linear operator T divides every polynomial that has T as a root. In particular, mðtÞ divides the characteristic polynomial DðtÞ of T. 304 CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors
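The divisor-testing strategy of Example 9.11 can be mechanized: once the roots of Δ(t) are known, one tries products (t − λᵢ)^eᵢ with the smallest exponents that annihilate the matrix. The brute-force sketch below is a Python/NumPy aside, with the signs of A restored to match the trace, cofactors, and determinant computed above; the function name is an illustrative one.

```python
import numpy as np
from itertools import product

def minimal_poly_exponents(A, roots, max_mult):
    """Smallest exponents (e1, e2, ...) with prod_i (A - roots[i] I)^(e_i) = 0.
    Brute force over exponents up to the algebraic multiplicities max_mult."""
    I = np.eye(A.shape[0])
    best = None
    for exps in product(*(range(1, m + 1) for m in max_mult)):
        M = I
        for r, e in zip(roots, exps):
            M = M @ np.linalg.matrix_power(A - r * I, e)
        if np.allclose(M, 0) and (best is None or sum(exps) < sum(best)):
            best = exps
    return best

A = np.array([[2., 2., -5.], [3., 7., -15.], [1., 2., -4.]])   # Example 9.11 (signs restored)
# Delta(t) = (t - 1)^2 (t - 3), so test exponents up to (2, 1)
print(minimal_poly_exponents(A, roots=[1., 3.], max_mult=[2, 1]))  # (1, 1): m(t) = (t-1)(t-3)
```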
  • 312. THEOREM 9.160 : The characteristic and minimal polynomials of a linear operator T have the same irreducible factors. THEOREM 9.170 : A scalar l is an eigenvalue of a linear operator T if and only if l is a root of the minimal polynomial mðtÞ of T. 9.8 Characteristic and Minimal Polynomials of Block Matrices This section discusses the relationship of the characteristic polynomial and the minimal polynomial to certain (square) block matrices. Characteristic Polynomial and Block Triangular Matrices Suppose M is a block triangular matrix, say M ¼ A1 B 0 A2 , where A1 and A2 are square matrices. Then tI M is also a block triangular matrix, with diagonal blocks tI A1 and tI A2. Thus, jtI Mj ¼ tI A1 B 0 tI A2 ¼ jtI A1jjtI A2j That is, the characteristic polynomial of M is the product of the characteristic polynomials of the diagonal blocks A1 and A2. By induction, we obtain the following useful result. THEOREM 9.18: Suppose M is a block triangular matrix with diagonal blocks A1; A2; . . . ; Ar. Then the characteristic polynomial of M is the product of the characteristic polynomials of the diagonal blocks Ai; that is, DM ðtÞ ¼ DA1 ðtÞDA2 ðtÞ . . . DAr ðtÞ EXAMPLE 9.13 Consider the matrix M ¼ 9 1 5 7 8 3 2 4 0 0 3 6 0 0 1 8 2 6 6 4 3 7 7 5. Then M is a block triangular matrix with diagonal blocks A ¼ 9 1 8 3 and B ¼ 3 6 1 8 . Here trðAÞ ¼ 9 þ 3 ¼ 12; trðBÞ ¼ 3 þ 8 ¼ 11; detðAÞ ¼ 27 þ 8 ¼ 35; detðBÞ ¼ 24 þ 6 ¼ 30; and so and so DAðtÞ ¼ t2 12t þ 35 ¼ ðt 5Þðt 7Þ DBðtÞ ¼ t2 11t þ 30 ¼ ðt 5Þðt 6Þ Accordingly, the characteristic polynomial of M is the product DM ðtÞ ¼ DAðtÞDBðtÞ ¼ ðt 5Þ2 ðt 6Þðt 7Þ Minimal Polynomial and Block Diagonal Matrices The following theorem (proved in Problem 9.36) holds. THEOREM 9.19: Suppose M is a block diagonal matrix with diagonal blocks A1; A2; . . . ; Ar. Then the minimal polynomial of M is equal to the least common multiple (LCM) of the minimal polynomials of the diagonal blocks Ai. Remark: We emphasize that this theorem applies to block diagonal matrices, whereas the analogous Theorem 9.18 on characteristic polynomials applies to block triangular matrices. CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors 305
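Theorem 9.18 is easy to confirm numerically for the block triangular matrix of Example 9.13. The Python/NumPy check below is an aside; the signs of the entries, including the off-diagonal block (which does not affect the result), are reconstructed as assumed here.

```python
import numpy as np

A = np.array([[9., -1.], [8., 3.]])                 # Delta_A(t) = (t - 5)(t - 7)
B = np.array([[3., 6.], [-1., 8.]])                 # Delta_B(t) = (t - 5)(t - 6)
M = np.block([[A, np.array([[5., 7.], [2., -4.]])],
              [np.zeros((2, 2)), B]])               # block triangular

lhs = np.poly(M)                                    # Delta_M(t)
rhs = np.polymul(np.poly(A), np.poly(B))            # Delta_A(t) * Delta_B(t)
print(np.allclose(lhs, rhs))                        # True (Theorem 9.18)
print(np.round(rhs, 6))                             # [1, -23, 197, -745, 1050], i.e. (t-5)^2 (t-6)(t-7)
```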
  • 313. EXAMPLE 9.14 Find the characteristic polynomal DðtÞ and the minimal polynomial mðtÞ of the block diagonal matrix: M ¼ 2 5 0 0 0 0 2 0 0 0 0 0 4 2 0 0 0 3 5 0 0 0 0 0 7 2 6 6 6 6 4 3 7 7 7 7 5 ¼ diagðA1; A2; A3Þ; where A1 ¼ 2 5 0 2 ; A2 ¼ 4 2 3 5 ; A3 ¼ ½7 Then DðtÞ is the product of the characterization polynomials D1ðtÞ, D2ðtÞ, D3ðtÞ of A1; A2; A3, respectively. One can show that D1ðtÞ ¼ ðt 2Þ2 ; D2ðtÞ ¼ ðt 2Þðt 7Þ; D3ðtÞ ¼ t 7 Thus, DðtÞ ¼ ðt 2Þ3 ðt 7Þ2 . [As expected, deg DðtÞ ¼ 5: The minimal polynomials m1ðtÞ, m2ðtÞ, m3ðtÞ of the diagonal blocks A1; A2; A3, respectively, are equal to the characteristic polynomials; that is, m1ðtÞ ¼ ðt 2Þ2 ; m2ðtÞ ¼ ðt 2Þðt 7Þ; m3ðtÞ ¼ t 7 But mðtÞ is equal to the least common multiple of m1ðtÞ; m2ðtÞ; m3ðtÞ. Thus, mðtÞ ¼ ðt 2Þ2 ðt 7Þ. SOLVED PROBLEMS Polynomials of Matrices, Characteristic Polynomials 9.1. Let A ¼ 1 2 4 5 . Find f ðAÞ, where ðaÞ f ðtÞ ¼ t2 3t þ 7; ðbÞ f ðtÞ ¼ t2 6t þ 13 First find A2 ¼ 1 2 4 5 1 2 4 5 ¼ 7 12 24 17 . Then (a) f ðAÞ ¼ A2 3A þ 7I ¼ 7 12 24 17 þ 3 6 12 15 þ 7 0 0 7 ¼ 3 6 12 9 (b) f ðAÞ ¼ A2 6A þ 13I ¼ 7 12 24 17 þ 6 12 24 30 þ 13 0 0 13 ¼ 0 0 0 0 [Thus, A is a root of f ðtÞ.] 9.2. Find the characteristic polynomial DðtÞ of each of the following matrices: (a) A ¼ 2 5 4 1 , (b) B ¼ 7 3 5 2 , (c) C ¼ 3 2 9 3 Use the formula ðtÞ ¼ t2 trðMÞ t þ jMj for a 2 2 matrix M: (a) trðAÞ ¼ 2 þ 1 ¼ 3, jAj ¼ 2 20 ¼ 18, so DðtÞ ¼ t2 3t 18 (b) trðBÞ ¼ 7 2 ¼ 5, jBj ¼ 14 þ 15 ¼ 1, so DðtÞ ¼ t2 5t þ 1 (c) trðCÞ ¼ 3 3 ¼ 0, jCj ¼ 9 þ 18 ¼ 9, so DðtÞ ¼ t2 þ 9 9.3. Find the characteristic polynomial DðtÞ of each of the following matrices: (a) A ¼ 1 2 3 3 0 4 6 4 5 2 4 3 5, (b) B ¼ 1 6 2 3 2 0 0 3 4 2 4 3 5 306 CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors
  • 314. Use the formula DðtÞ ¼ t3 trðAÞt2 þ ðA11 þ A22 þ A33Þt jAj, where Aii is the cofactor of aii in the 3 3 matrix A ¼ ½aij. (a) trðAÞ ¼ 1 þ 0 þ 5 ¼ 6, A11 ¼ 0 4 4 5 ¼ 16; A22 ¼ 1 3 6 5 ¼ 13; A33 ¼ 1 2 3 0 ¼ 6 A11 þ A22 þ A33 ¼ 35, and jAj ¼ 48 þ 36 16 30 ¼ 38 Thus; DðtÞ ¼ t3 6t2 35t 38 (b) trðBÞ ¼ 1 þ 2 4 ¼ 1 B11 ¼ 2 0 3 4 ¼ 8; B22 ¼ 1 2 0 4 ¼ 4; B33 ¼ 1 6 3 2 ¼ 20 B11 þ B22 þ B33 ¼ 8, and jBj ¼ 8 þ 18 72 ¼ 62 Thus; DðtÞ ¼ t3 þ t2 8t þ 62 9.4. Find the characteristic polynomial DðtÞ of each of the following matrices: (a) A ¼ 2 5 1 1 1 4 2 2 0 0 6 5 0 0 2 3 2 6 6 4 3 7 7 5, (b) B ¼ 1 1 2 2 0 3 3 4 0 0 5 5 0 0 0 6 2 6 6 4 3 7 7 5 (a) A is block triangular with diagonal blocks A1 ¼ 2 5 1 4 and A2 ¼ 6 5 2 3 Thus; DðtÞ ¼ DA1 ðtÞDA2 ðtÞ ¼ ðt2 6t þ 3Þðt2 9t þ 28Þ (b) Because B is triangular, DðtÞ ¼ ðt 1Þðt 3Þðt 5Þðt 6Þ. 9.5. Find the characteristic polynomial DðtÞ of each of the following linear operators: (a) F: R2 ! R2 defined by Fðx; yÞ ¼ ð3x þ 5y; 2x 7yÞ. (b) D: V ! V defined by Dð f Þ ¼ df =dt, where V is the space of functions with basis S ¼ fsin t; cos tg. The characteristic polynomial DðtÞ of a linear operator is equal to the characteristic polynomial of any matrix A that represents the linear operator. (a) Find the matrix A that represents T relative to the usual basis of R2 . We have A ¼ 3 5 2 7 ; so DðtÞ ¼ t2 trðAÞ t þ jAj ¼ t2 þ 4t 31 (b) Find the matrix A representing the differential operator D relative to the basis S. We have Dðsin tÞ ¼ cos t ¼ 0ðsin tÞ þ 1ðcos tÞ Dðcos tÞ ¼ sin t ¼ 1ðsin tÞ þ 0ðcos tÞ and so A ¼ 0 1 1 0 DðtÞ ¼ t2 trðAÞ t þ jAj ¼ t2 þ 1 Therefore; 9.6. Show that a matrix A and its transpose AT have the same characteristic polynomial. By the transpose operation, ðtI AÞT ¼ tIT AT ¼ tI AT . Because a matrix and its transpose have the same determinant, DAðtÞ ¼ jtI Aj ¼ jðtI AÞT j ¼ jtI AT j ¼ DAT ðtÞ CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors 307
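Problem 9.6 can be illustrated numerically using the matrix of Problem 9.3(a); the few lines of Python/NumPy below are an aside.

```python
import numpy as np

A = np.array([[1., 2., 3.], [3., 0., 4.], [6., 4., 5.]])   # Problem 9.3(a)
print(np.round(np.poly(A), 6))     # [1, -6, -35, -38], i.e. t^3 - 6t^2 - 35t - 38
print(np.round(np.poly(A.T), 6))   # the transpose has the same characteristic polynomial
```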
  • 315. 9.7. Prove Theorem 9.1: Let f and g be polynomials. For any square matrix A and scalar k, (i) ð f þ gÞðAÞ ¼ f ðAÞ þ gðAÞ, (iii) ðkf ÞðAÞ ¼ kf ðAÞ, (ii) ð fgÞðAÞ ¼ f ðAÞgðAÞ, (iv) f ðAÞgðAÞ ¼ gðAÞf ðAÞ. Suppose f ¼ antn þ þ a1t þ a0 and g ¼ bmtm þ þ b1t þ b0. Then, by definition, f ðAÞ ¼ anAn þ þ a1A þ a0I and gðAÞ ¼ bmAm þ þ b1A þ b0I (i) Suppose m n and let bi ¼ 0 if i m. Then f þ g ¼ ðan þ bnÞtn þ þ ða1 þ b1Þt þ ða0 þ b0Þ Hence, ð f þ gÞðAÞ ¼ ðan þ bnÞAn þ þ ða1 þ b1ÞA þ ða0 þ b0ÞI ¼ anAn þ bnAn þ þ a1A þ b1A þ a0I þ b0I ¼ f ðAÞ þ gðAÞ (ii) By definition, fg ¼ cnþmtnþm þ þ c1t þ c0 ¼ P nþm k¼0 cktk , where ck ¼ a0bk þ a1bk1 þ þ akb0 ¼ P k i¼0 aibki Hence, ð fgÞðAÞ ¼ P nþm k¼0 ckAk and f ðAÞgðAÞ ¼ P n i¼0 aiAi P m j¼0 bjAj ¼ P n i¼0 P m j¼0 aibjAiþj ¼ P nþm k¼0 ckAk ¼ ð fgÞðAÞ (iii) By definition, kf ¼ kantn þ þ ka1t þ ka0, and so ðkf ÞðAÞ ¼ kanAn þ þ ka1A þ ka0I ¼ kðanAn þ þ a1A þ a0IÞ ¼ kf ðAÞ (iv) By (ii), gðAÞf ðAÞ ¼ ðgf ÞðAÞ ¼ ð fgÞðAÞ ¼ f ðAÞgðAÞ. 9.8. Prove the Cayley–Hamilton Theorem 9.2: Every matrix A is a root of its characterstic polynomial DðtÞ. Let A be an arbitrary n-square matrix and let DðtÞ be its characteristic polynomial, say, DðtÞ ¼ jtI Aj ¼ tn þ an1tn1 þ þ a1t þ a0 Now let BðtÞ denote the classical adjoint of the matrix tI A. The elements of BðtÞ are cofactors of the matrix tI A and hence are polynomials in t of degree not exceeding n 1. Thus, BðtÞ ¼ Bn1tn1 þ þ B1t þ B0 where the Bi are n-square matrices over K which are independent of t. By the fundamental property of the classical adjoint (Theorem 8.9), ðtI AÞBðtÞ ¼ jtI AjI, or ðtI AÞðBn1tn1 þ þ B1t þ B0Þ ¼ ðtn þ an1tn1 þ þ a1t þ a0ÞI Removing the parentheses and equating corresponding powers of t yields Bn1 ¼ I; Bn2 ABn1 ¼ an1I; . . . ; B0 AB1 ¼ a1I; AB0 ¼ a0I Multiplying the above equations by An ; An1 ; . . . ; A; I, respectively, yields An Bn1 ¼ AnI; An1 Bn2 An Bn1 ¼ an1An1 ; . . . ; AB0 A2 B1 ¼ a1A; AB0 ¼ a0I Adding the above matrix equations yields 0 on the left-hand side and DðAÞ on the right-hand side; that is, 0 ¼ An þ an1An1 þ þ a1A þ a0I Therefore, DðAÞ ¼ 0, which is the Cayley–Hamilton theorem. 308 CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors
  • 316. Eigenvalues and Eigenvectors of 2 2 Matrices 9.9. Let A ¼ 3 4 2 6 . (a) Find all eigenvalues and corresponding eigenvectors. (b) Find matrices P and D such that P is nonsingular and D ¼ P1 AP is diagonal. (a) First find the characteristic polynomial DðtÞ of A: DðtÞ ¼ t2 trðAÞ t þ jAj ¼ t2 þ 3t 10 ¼ ðt 2Þðt þ 5Þ The roots l ¼ 2 and l ¼ 5 of DðtÞ are the eigenvalues of A. We find corresponding eigenvectors. (i) Subtract l ¼ 2 down the diagonal of A to obtain the matrix M ¼ A 2I, where the corresponding homogeneous system MX ¼ 0 yields the eigenvectors corresponding to l ¼ 2. We have M ¼ 1 4 2 8 ; corresponding to x 4y ¼ 0 2x 8y ¼ 0 or x 4y ¼ 0 The system has only one free variable, and v1 ¼ ð4; 1Þ is a nonzero solution. Thus, v1 ¼ ð4; 1Þ is an eigenvector belonging to (and spanning the eigenspace of) l ¼ 2. (ii) Subtract l ¼ 5 (or, equivalently, add 5) down the diagonal of A to obtain M ¼ 8 4 2 1 ; corresponding to 8x 4y ¼ 0 2x y ¼ 0 or 2x y ¼ 0 The system has only one free variable, and v2 ¼ ð1; 2Þ is a nonzero solution. Thus, v2 ¼ ð1; 2Þ is an eigenvector belonging to l ¼ 5. (b) Let P be the matrix whose columns are v1 and v2. Then P ¼ 4 1 1 2 and D ¼ P1 AP ¼ 2 0 0 5 Note that D is the diagonal matrix whose diagonal entries are the eigenvalues of A corresponding to the eigenvectors appearing in P. Remark: Here P is the change-of-basis matrix from the usual basis of R2 to the basis S ¼ fv1; v2g, and D is the matrix that represents (the matrix function) A relative to the new basis S. 9.10. Let A ¼ 2 2 1 3 . (a) Find all eigenvalues and corresponding eigenvectors. (b) Find a nonsingular matrix P such that D ¼ P1 AP is diagonal, and P1 . (c) Find A6 and f ðAÞ, where t4 3t3 6t2 þ 7t þ 3. (d) Find a ‘‘real cube root’’ of B—that is, a matrix B such that B3 ¼ A and B has real eigenvalues. (a) First find the characteristic polynomial DðtÞ of A: DðtÞ ¼ t2 trðAÞ t þ jAj ¼ t2 5t þ 4 ¼ ðt 1Þðt 4Þ The roots l ¼ 1 and l ¼ 4 of DðtÞ are the eigenvalues of A. We find corresponding eigenvectors. (i) Subtract l ¼ 1 down the diagonal of A to obtain the matrix M ¼ A lI, where the corresponding homogeneous system MX ¼ 0 yields the eigenvectors belonging to l ¼ 1. We have M ¼ 1 2 1 2 ; corresponding to x þ 2y ¼ 0 x þ 2y ¼ 0 or x þ 2y ¼ 0 CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors 309
  • 317. The system has only one independent solution; for example, x ¼ 2, y ¼ 1. Thus, v1 ¼ ð2; 1Þ is an eigenvector belonging to (and spanning the eigenspace of) l ¼ 1. (ii) Subtract l ¼ 4 down the diagonal of A to obtain M ¼ 2 2 1 1 ; corresponding to 2x þ 2y ¼ 0 x y ¼ 0 or x y ¼ 0 The system has only one independent solution; for example, x ¼ 1, y ¼ 1. Thus, v2 ¼ ð1; 1Þ is an eigenvector belonging to l ¼ 4. (b) Let P be the matrix whose columns are v1 and v2. Then P ¼ 2 1 1 1 and D ¼ P1 AP ¼ 1 0 0 4 ; where P1 ¼ 1 3 1 3 1 3 2 3 # (c) Using the diagonal factorization A ¼ PDP1 , and 16 ¼ 1 and 46 ¼ 4096, we get A6 ¼ PD6 P1 ¼ 2 1 1 1 # 1 0 0 4096 # 1 3 1 3 1 3 2 3 # ¼ 1366 2230 1365 2731 # Also, f ð1Þ ¼ 2 and f ð4Þ ¼ 1. Hence, f ðAÞ ¼ Pf ðDÞP1 ¼ 2 1 1 1 # 2 0 0 1 # 1 3 1 3 1 3 2 3 # ¼ 1 2 1 0 # (d) Here 1 0 0 ffiffiffi 4 3 p is the real cube root of D. Hence the real cube root of A is B ¼ P ffiffiffiffi D 3 p P1 ¼ 2 1 1 1 # 1 0 0 ffiffiffi 4 3 p # 1 3 1 3 1 3 2 3 # ¼ 1 3 2 þ ffiffiffi 4 3 p 2 þ 2 ffiffiffi 4 3 p 1 þ ffiffiffi 4 3 p 1 þ 2 ffiffiffi 4 3 p # 9.11. Each of the following real matrices defines a linear transformation on R2 : (a) A ¼ 5 6 3 2 , (b) B ¼ 1 1 2 1 , (c) C ¼ 5 1 1 3 Find, for each matrix, all eigenvalues and a maximum set S of linearly independent eigenvectors. Which of these linear operators are diagonalizable—that is, which can be represented by a diagonal matrix? (a) First find DðtÞ ¼ t2 3t 28 ¼ ðt 7Þðt þ 4Þ. The roots l ¼ 7 and l ¼ 4 are the eigenvalues of A. We find corresponding eigenvectors. (i) Subtract l ¼ 7 down the diagonal of A to obtain M ¼ 2 6 3 9 ; corresponding to 2x þ 6y ¼ 0 3x 9y ¼ 0 or x 3y ¼ 0 Here v1 ¼ ð3; 1Þ is a nonzero solution. (ii) Subtract l ¼ 4 (or add 4) down the diagonal of A to obtain M ¼ 9 6 3 2 ; corresponding to 9x þ 6y ¼ 0 3x þ 2y ¼ 0 or 3x þ 2y ¼ 0 Here v2 ¼ ð2; 3Þ is a nonzero solution. Then S ¼ fv1; v2g ¼ fð3; 1Þ; ð2; 3Þg is a maximal set of linearly independent eigenvectors. Because S is a basis of R2 , A is diagonalizable. Using the basis S, A is represented by the diagonal matrix D ¼ diagð7; 4Þ. (b) First find the characteristic polynomial DðtÞ ¼ t2 þ 1. There are no real roots. Thus B, a real matrix representing a linear transformation on R2 , has no eigenvalues and no eigenvectors. Hence, in particular, B is not diagonalizable. 310 CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors
  • 318. (c) First find DðtÞ ¼ t2 8t þ 16 ¼ ðt 4Þ2 . Thus, l ¼ 4 is the only eigenvalue of C. Subtract l ¼ 4 down the diagonal of C to obtain M ¼ 1 1 1 1 ; corresponding to x y ¼ 0 The homogeneous system has only one independent solution; for example, x ¼ 1, y ¼ 1. Thus, v ¼ ð1; 1Þ is an eigenvector of C. Furthermore, as there are no other eigenvalues, the singleton set S ¼ fvg ¼ fð1; 1Þg is a maximal set of linearly independent eigenvectors of C. Furthermore, because S is not a basis of R2 , C is not diagonalizable. 9.12. Suppose the matrix B in Problem 9.11 represents a linear operator on complex space C2 . Show that, in this case, B is diagonalizable by finding a basis S of C2 consisting of eigenvectors of B. The characteristic polynomial of B is still DðtÞ ¼ t2 þ 1. As a polynomial over C, DðtÞ does factor; specifically, DðtÞ ¼ ðt iÞðt þ iÞ. Thus, l ¼ i and l ¼ i are the eigenvalues of B. (i) Subtract l ¼ i down the diagonal of B to obtain the homogeneous system ð1 iÞx y ¼ 0 2x þ ð1 iÞy ¼ 0 or ð1 iÞx y ¼ 0 The system has only one independent solution; for example, x ¼ 1, y ¼ 1 i. Thus, v1 ¼ ð1; 1 iÞ is an eigenvector that spans the eigenspace of l ¼ i. (ii) Subtract l ¼ i (or add i) down the diagonal of B to obtain the homogeneous system ð1 þ iÞx y ¼ 0 2x þ ð1 þ iÞy ¼ 0 or ð1 þ iÞx y ¼ 0 The system has only one independent solution; for example, x ¼ 1, y ¼ 1 þ i. Thus, v2 ¼ ð1; 1 þ iÞ is an eigenvector that spans the eigenspace of l ¼ i. As a complex matrix, B is diagonalizable. Specifically, S ¼ fv1; v2g ¼ fð1; 1 iÞ; ð1; 1 þ iÞg is a basis of C2 consisting of eigenvectors of B. Using this basis S, B is represented by the diagonal matrix D ¼ diagði; iÞ. 9.13. Let L be the linear transformation on R2 that reflects each point P across the line y ¼ kx, where k 0. (See Fig. 9-1.) (a) Show that v1 ¼ ðk; 1Þ and v2 ¼ ð1; kÞ are eigenvectors of L. (b) Show that L is diagonalizable, and find a diagonal representation D. (a) The vector v1 ¼ ðk; 1Þ lies on the line y ¼ kx, and hence is left fixed by L; that is, Lðv1Þ ¼ v1. Thus, v1 is an eigenvector of L belonging to the eigenvalue l1 ¼ 1. The vector v2 ¼ ð1; kÞ is perpendicular to the line y ¼ kx, and hence, L reflects v2 into its negative; that is, Lðv2Þ ¼ v2. Thus, v2 is an eigenvector of L belonging to the eigenvalue l2 ¼ 1. y x 0 L P ( ) P L( ) v2 v2 y k = x Figure 9-1 CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors 311
  • 319. (b) Here S ¼ fv1; v2g is a basis of R2 consisting of eigenvectors of L. Thus, L is diagonalizable, with the diagonal representation D ¼ 1 0 0 1 (relative to the basis S). Eigenvalues and Eigenvectors 9.14. Let A ¼ 4 1 1 2 5 2 1 1 2 2 4 3 5: (a) Find all eigenvalues of A. (b) Find a maximum set S of linearly independent eigenvectors of A. (c) Is A diagonalizable? If yes, find P such that D ¼ P1 AP is diagonal. (a) First find the characteristic polynomial DðtÞ of A. We have trðAÞ ¼ 4 þ 5 þ 2 ¼ 11 and jAj ¼ 40 2 2 þ 5 þ 8 4 ¼ 45 Also, find each cofactor Aii of aii in A: A11 ¼ 5 2 1 2 ¼ 12; A22 ¼ 4 1 1 2 ¼ 9; A33 ¼ 4 1 2 5 ¼ 18 Hence; DðtÞ ¼ t3 trðAÞ t2 þ ðA11 þ A22 þ A33Þt jAj ¼ t3 11t2 þ 39t 45 Assuming Dt has a rational root, it must be among 1, 3, 5, 9, 15, 45. Testing, by synthetic division, we get 3 1 11 þ 39 45 3 24 þ 45 1 8 þ 15 þ 0 Thus, t ¼ 3 is a root of DðtÞ. Also, t 3 is a factor and t2 8t þ 15 is a factor. Hence, DðtÞ ¼ ðt 3Þðt2 8t þ 15Þ ¼ ðt 3Þðt 5Þðt 3Þ ¼ ðt 3Þ2 ðt 5Þ Accordingly, l ¼ 3 and l ¼ 5 are eigenvalues of A. (b) Find linearly independent eigenvectors for each eigenvalue of A. (i) Subtract l ¼ 3 down the diagonal of A to obtain the matrix M ¼ 1 1 1 2 2 2 1 1 1 2 4 3 5; corresponding to x þ y z ¼ 0 Here u ¼ ð1; 1; 0Þ and v ¼ ð1; 0; 1Þ are linearly independent solutions. (ii) Subtract l ¼ 5 down the diagonal of A to obtain the matrix M ¼ 1 1 1 2 0 2 1 1 3 2 4 3 5; corresponding to x þ y z ¼ 0 2x 2z ¼ 0 x þ y 3z ¼ 0 or x z ¼ 0 y 2z ¼ 0 Only z is a free variable. Here w ¼ ð1; 2; 1Þ is a solution. Thus, S ¼ fu; v; wg ¼ fð1; 1; 0Þ; ð1; 0; 1Þ; ð1; 2; 1Þg is a maximal set of linearly independent eigenvectors of A. Remark: The vectors u and v were chosen so that they were independent solutions of the system x þ y z ¼ 0. On the other hand, w is automatically independent of u and v because w belongs to a different eigenvalue of A. Thus, the three vectors are linearly independent. 312 CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors
  • 320. (c) A is diagonalizable, because it has three linearly independent eigenvectors. Let P be the matrix with columns u; v; w. Then P ¼ 1 1 1 1 0 2 0 1 1 2 4 3 5 and D ¼ P1 AP ¼ 3 3 5 2 4 3 5 9.15. Repeat Problem 9.14 for the matrix B ¼ 3 1 1 7 5 1 6 6 2 2 4 3 5. (a) First find the characteristic polynomial DðtÞ of B. We have trðBÞ ¼ 0; jBj ¼ 16; B11 ¼ 4; B22 ¼ 0; B33 ¼ 8; so P i Bii ¼ 12 Therefore, DðtÞ ¼ t3 12t þ 16 ¼ ðt 2Þ2 ðt þ 4Þ. Thus, l1 ¼ 2 and l2 ¼ 4 are the eigen- values of B. (b) Find a basis for the eigenspace of each eigenvalue of B. (i) Subtract l1 ¼ 2 down the diagonal of B to obtain M ¼ 1 1 1 7 7 1 6 6 0 2 4 3 5; corresponding to x y þ z ¼ 0 7x 7y þ z ¼ 0 6x 6y ¼ 0 or x y þ z ¼ 0 z ¼ 0 The system has only one independent solution; for example, x ¼ 1, y ¼ 1, z ¼ 0. Thus, u ¼ ð1; 1; 0Þ forms a basis for the eigenspace of l1 ¼ 2. (ii) Subtract l2 ¼ 4 (or add 4) down the diagonal of B to obtain M ¼ 7 1 1 7 1 1 6 6 6 2 4 3 5; corresponding to 7x y þ z ¼ 0 7x y þ z ¼ 0 6x 6y þ 6z ¼ 0 or x y þ z ¼ 0 6y 6z ¼ 0 The system has only one independent solution; for example, x ¼ 0, y ¼ 1, z ¼ 1. Thus, v ¼ ð0; 1; 1Þ forms a basis for the eigenspace of l2 ¼ 4. Thus S ¼ fu; vg is a maximal set of linearly independent eigenvectors of B. (c) Because B has at most two linearly independent eigenvectors, B is not similar to a diagonal matrix; that is, B is not diagonalizable. 9.16. Find the algebraic and geometric multiplicities of the eigenvalue l1 ¼ 2 of the matrix B in Problem 9.15. The algebraic multiplicity of l1 ¼ 2 is 2, because t 2 appears with exponent 2 in DðtÞ. However, the geometric multiplicity of l1 ¼ 2 is 1, because dim El1 ¼ 1 (where El1 is the eigenspace of l1). 9.17. Let T: R3 ! R3 be defined by Tðx; y; zÞ ¼ ð2x þ y 2z; 2x þ 3y 4z; x þ y zÞ. Find all eigenvalues of T, and find a basis of each eigenspace. Is T diagonalizable? If so, find the basis S of R3 that diagonalizes T; and find its diagonal representation D. First find the matrix A that represents T relative to the usual basis of R3 by writing down the coefficients of x; y; z as rows, and then find the characteristic polynomial of A (and T). We have A ¼ ½T ¼ 2 1 2 2 3 4 1 1 1 2 4 3 5 and trðAÞ ¼ 4; jAj ¼ 2 A11 ¼ 1; A22 ¼ 0; A33 ¼ 4 P i Aii ¼ 5 Therefore, DðtÞ ¼ t3 4t2 þ 5t 2 ¼ ðt 1Þ2 ðt 2Þ, and so l ¼ 1 and l ¼ 2 are the eigenvalues of A (and T). We next find linearly independent eigenvectors for each eigenvalue of A. CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors 313
  • 321. (i) Subtract l ¼ 1 down the diagonal of A to obtain the matrix M ¼ 1 1 2 2 2 4 1 1 2 2 4 3 5; corresponding to x þ y 2z ¼ 0 Here y and z are free variables, and so there are two linearly independent eigenvectors belonging to l ¼ 1. For example, u ¼ ð1; 1; 0Þ and v ¼ ð2; 0; 1Þ are two such eigenvectors. (ii) Subtract l ¼ 2 down the diagonal of A to obtain M ¼ 0 1 2 2 1 4 1 1 3 2 4 3 5; corresponding to y 2z ¼ 0 2x þ y 4z ¼ 0 x þ y 3z ¼ 0 or x þ y 3z ¼ 0 y 2z ¼ 0 Only z is a free variable. Here w ¼ ð1; 2; 1Þ is a solution. Thus, T is diagonalizable, because it has three independent eigenvectors. Specifically, choosing S ¼ fu; v; wg ¼ fð1; 1; 0Þ; ð2; 0; 1Þ; ð1; 2; 1Þg as a basis, T is represented by the diagonal matrix D ¼ diagð1; 1; 2Þ. 9.18. Prove the following for a linear operator (matrix) T: (a) The scalar 0 is an eigenvalue of T if and only if T is singular. (b) If l is an eigenvalue of T, where T is invertible, then l1 is an eigenvalue of T1 . (a) We have that 0 is an eigenvalue of T if and only if there is a vector v 6¼ 0 such that TðvÞ ¼ 0v—that is, if and only if T is singular. (b) Because T is invertible, it is nonsingular; hence, by (a), l 6¼ 0. By definition of an eigenvalue, there exists v 6¼ 0 such that TðvÞ ¼ lv. Applying T1 to both sides, we obtain v ¼ T1 ðlvÞ ¼ lT1 ðvÞ; and so T1 ðvÞ ¼ l1 v Therefore, l1 is an eigenvalue of T1 . 9.19. Let l be an eigenvalue of a linear operator T: V ! V, and let El consists of all the eigenvectors belonging to l (called the eigenspace of l). Prove that El is a subspace of V. That is, prove (a) If u 2 El, then ku 2 El for any scalar k. (b) If u; v; 2 El, then u þ v 2 El. (a) Because u 2 El, we have TðuÞ ¼ lu. Then TðkuÞ ¼ kTðuÞ ¼ kðluÞ ¼ lðkuÞ; and so ku 2 El: (We view the zero vector 0 2 V as an ‘‘eigenvector’’ of l in order for El to be a subspace of V.) (b) As u; v 2 El, we have TðuÞ ¼ lu and TðvÞ ¼ lv. Then Tðu þ vÞ ¼ TðuÞ þ TðvÞ ¼ lu þ lv ¼ lðu þ vÞ; and so u þ v 2 El 9.20. Prove Theorem 9.6: The following are equivalent: (i) The scalar l is an eigenvalue of A. (ii) The matrix lI A is singular. (iii) The scalar l is a root of the characteristic polynomial DðtÞ of A. The scalar l is an eigenvalue of A if and only if there exists a nonzero vector v such that Av ¼ lv or ðlIÞv Av ¼ 0 or ðlI AÞv ¼ 0 or lI A is singular. In such a case, l is a root of DðtÞ ¼ jtI Aj. Also, v is in the eigenspace El of l if and only if the above relations hold. Hence, v is a solution of ðlI AÞX ¼ 0. 9.21. Prove Theorem 9.80 : Suppose v1; v2; . . . ; vn are nonzero eigenvectors of T belonging to distinct eigenvalues l1; l2; . . . ; ln. Then v1; v2; . . . ; vn are linearly independent. 314 CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors
  • 322. Suppose the theorem is not true. Let v1; v2; . . . ; vs be a minimal set of vectors for which the theorem is not true. We have s 1, because v1 6¼ 0. Also, by the minimality condition, v2; . . . ; vs are linearly independent. Thus, v1 is a linear combination of v2; . . . ; vs, say, v1 ¼ a2v2 þ a3v3 þ þ asvs ð1Þ (where some ak 6¼ 0Þ. Applying T to (1) and using the linearity of T yields Tðv1Þ ¼ Tða2v2 þ a3v3 þ þ asvsÞ ¼ a2Tðv2Þ þ a3Tðv3Þ þ þ asTðvsÞ ð2Þ Because vj is an eigenvector of T belonging to lj, we have TðvjÞ ¼ ljvj. Substituting in (2) yields l1v1 ¼ a2l2v2 þ a3l3v3 þ þ aslsvs ð3Þ Multiplying (1) by l1 yields l1v1 ¼ a2l1v2 þ a3l1v3 þ þ asl1vs ð4Þ Setting the right-hand sides of (3) and (4) equal to each other, or subtracting (3) from (4) yields a2ðl1 l2Þv2 þ a3ðl1 l3Þv3 þ þ asðl1 lsÞvs ¼ 0 ð5Þ Because v2; v3; . . . ; vs are linearly independent, the coefficients in (5) must all be zero. That is, a2ðl1 l2Þ ¼ 0; a3ðl1 l3Þ ¼ 0; . . . ; asðl1 lsÞ ¼ 0 However, the li are distinct. Hence l1 lj 6¼ 0 for j 1. Hence, a2 ¼ 0, a3 ¼ 0; . . . ; as ¼ 0. This contradicts the fact that some ak 6¼ 0. The theorem is proved. 9.22. Prove Theorem 9.9. Suppose DðtÞ ¼ ðt a1Þðt a2Þ . . . ðt anÞ is the characteristic polynomial of an n-square matrix A, and suppose the n roots ai are distinct. Then A is similar to the diagonal matrix D ¼ diagða1; a2; . . . ; anÞ. Let v1; v2; . . . ; vn be (nonzero) eigenvectors corresponding to the eigenvalues ai. Then the n eigenvectors vi are linearly independent (Theorem 9.8), and hence form a basis of Kn . Accordingly, A is diagonalizable (i.e., A is similar to a diagonal matrix D), and the diagonal elements of D are the eigenvalues ai. 9.23. Prove Theorem 9.100 : The geometric multiplicity of an eigenvalue l of T does not exceed its algebraic multiplicity. Suppose the geometric multiplicity of l is r. Then its eigenspace El contains r linearly independent eigenvectors v1; . . . ; vr. Extend the set fvig to a basis of V, say, fvi; . . . ; vr; w1; . . . ; wsg. We have Tðv1Þ ¼ lv1; Tðv2Þ ¼ lv2; . . . ; TðvrÞ ¼ lvr; Tðw1Þ ¼ a11v1 þ þ a1rvr þ b11w1 þ þ b1sws Tðw2Þ ¼ a21v1 þ þ a2rvr þ b21w1 þ þ b2sws :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: TðwsÞ ¼ as1v1 þ þ asrvr þ bs1w1 þ þ bssws Then M ¼ lIr A 0 B is the matrix of T in the above basis, where A ¼ ½aijT and B ¼ ½bijT : Because M is block diagonal, the characteristic polynomial ðt lÞr of the block lIr must divide the characteristic polynomial of M and hence of T. Thus, the algebraic multiplicity of l for T is at least r, as required. Diagonalizing Real Symmetric Matrices and Quadratic Forms 9.24. Let A ¼ 7 3 3 1 . Find an orthogonal matrix P such that D ¼ P1 AP is diagonal. CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors 315
  • 323. First find the characteristic polynomial DðtÞ of A. We have DðtÞ ¼ t2 trðAÞ t þ jAj ¼ t2 6t 16 ¼ ðt 8Þðt þ 2Þ Thus, the eigenvalues of A are l ¼ 8 and l ¼ 2. We next find corresponding eigenvectors. Subtract l ¼ 8 down the diagonal of A to obtain the matrix M ¼ 1 3 3 9 ; corresponding to x þ 3y ¼ 0 3x 9y ¼ 0 or x 3y ¼ 0 A nonzero solution is u1 ¼ ð3; 1Þ. Subtract l ¼ 2 (or add 2) down the diagonal of A to obtain the matrix M ¼ 9 3 3 1 ; corresponding to 9x þ 3y ¼ 0 3x þ y ¼ 0 or 3x þ y ¼ 0 A nonzero solution is u2 ¼ ð1; 3Þ. As expected, because A is symmetric, the eigenvectors u1 and u2 are orthogonal. Normalize u1 and u2 to obtain, respectively, the unit vectors ^ u1 ¼ ð3= ffiffiffiffiffi 10 p ; 1= ffiffiffiffiffi 10 p Þ and ^ u2 ¼ ð1= ffiffiffiffiffi 10 p ; 3= ffiffiffiffiffi 10 p Þ: Finally, let P be the matrix whose columns are the unit vectors ^ u1 and ^ u2, respectively. Then P ¼ 3= ffiffiffiffiffi 10 p 1= ffiffiffiffiffi 10 p 1= ffiffiffiffiffi 10 p 3= ffiffiffiffiffi 10 p # and D ¼ P1 AP ¼ 8 0 0 2 As expected, the diagonal entries in D are the eigenvalues of A. 9.25. Let B ¼ 11 8 4 8 1 2 4 2 4 2 4 3 5. (a) Find all eigenvalues of B. (b) Find a maximal set S of nonzero orthogonal eigenvectors of B. (c) Find an orthogonal matrix P such that D ¼ P1 BP is diagonal. (a) First find the characteristic polynomial of B. We have trðBÞ ¼ 6; jBj ¼ 400; B11 ¼ 0; B22 ¼ 60; B33 ¼ 75; so P i Bii ¼ 135 Hence, DðtÞ ¼ t3 6t2 135t 400. If DðtÞ has an integer root it must divide 400. Testing t ¼ 5, by synthetic division, yields 5 1 6 135 400 5 þ 55 þ 400 1 11 80 þ 0 Thus, t þ 5 is a factor of DðtÞ, and t2 11t 80 is a factor. Thus, DðtÞ ¼ ðt þ 5Þðt2 11t 80Þ ¼ ðt þ 5Þ2 ðt 16Þ The eigenvalues of B are l ¼ 5 (multiplicity 2), and l ¼ 16 (multiplicity 1). (b) Find an orthogonal basis for each eigenspace. Subtract l ¼ 5 (or, add 5) down the diagonal of B to obtain the homogeneous system 16x 8y þ 4z ¼ 0; 8x þ 4y 2z ¼ 0; 4x 2y þ z ¼ 0 That is, 4x 2y þ z ¼ 0. The system has two independent solutions. One solution is v1 ¼ ð0; 1; 2Þ. We seek a second solution v2 ¼ ða; b; cÞ, which is orthogonal to v1, such that 4a 2b þ c ¼ 0; and also b 2c ¼ 0 316 CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors
  • 324. One such solution is v2 ¼ ð5; 8; 4Þ. Subtract l ¼ 16 down the diagonal of B to obtain the homogeneous system 5x 8y þ 4z ¼ 0; 8x 17y 2z ¼ 0; 4x 2y 20z ¼ 0 This system yields a nonzero solution v3 ¼ ð4; 2; 1Þ. (As expected from Theorem 9.13, the eigenvector v3 is orthogonal to v1 and v2.) Then v1; v2; v3 form a maximal set of nonzero orthogonal eigenvectors of B. (c) Normalize v1; v2; v3 to obtain the orthonormal basis: ^ v1 ¼ v1= ffiffiffi 5 p ; ^ v2 ¼ v2= ffiffiffiffiffiffiffiffi 105 p ; ^ v3 ¼ v3= ffiffiffiffiffi 21 p Then P is the matrix whose columns are ^ v1; ^ v2; ^ v3. Thus, P ¼ 0 5= ffiffiffiffiffiffiffiffi 105 p 4= ffiffiffiffiffi 21 p 1= ffiffiffi 5 p 8= ffiffiffiffiffiffiffiffi 105 p 2= ffiffiffiffiffi 21 p 2= ffiffiffi 5 p 4= ffiffiffiffiffiffiffiffi 105 p 1= ffiffiffiffiffi 21 p 2 6 4 3 7 5 and D ¼ P1 BP ¼ 5 5 16 2 6 4 3 7 5 9.26. Let qðx; yÞ ¼ x2 þ 6xy 7y2 . Find an orthogonal substitution that diagonalizes q. Find the symmetric matrix A that represents q and its characteristic polynomial DðtÞ. We have A ¼ 1 3 3 7 and DðtÞ ¼ t2 þ 6t 16 ¼ ðt 2Þðt þ 8Þ The eigenvalues of A are l ¼ 2 and l ¼ 8. Thus, using s and t as new variables, a diagonal form of q is qðs; tÞ ¼ 2s2 8t2 The corresponding orthogonal substitution is obtained by finding an orthogonal set of eigenvectors of A. (i) Subtract l ¼ 2 down the diagonal of A to obtain the matrix M ¼ 1 3 3 9 ; corresponding to x þ 3y ¼ 0 3x 9y ¼ 0 or x þ 3y ¼ 0 A nonzero solution is u1 ¼ ð3; 1Þ. (ii) Subtract l ¼ 8 (or add 8) down the diagonal of A to obtain the matrix M ¼ 9 3 3 1 ; corresponding to 9x þ 3y ¼ 0 3x þ y ¼ 0 or 3x þ y ¼ 0 A nonzero solution is u2 ¼ ð1; 3Þ. As expected, because A is symmetric, the eigenvectors u1 and u2 are orthogonal. Now normalize u1 and u2 to obtain, respectively, the unit vectors ^ u1 ¼ ð3= ffiffiffiffiffi 10 p ; 1= ffiffiffiffiffi 10 p Þ and ^ u2 ¼ ð1= ffiffiffiffiffi 10 p ; 3= ffiffiffiffiffi 10 p Þ: Finally, let P be the matrix whose columns are the unit vectors ^ u1 and ^ u2, respectively, and then ½x; yT ¼ P½s; tT is the required orthogonal change of coordinates. That is, P ¼ 3= ffiffiffiffiffi 10 p 1= ffiffiffiffiffi 10 p 1= ffiffiffiffiffi 10 p 3= ffiffiffiffiffi 10 p # and x ¼ 3s t ffiffiffiffiffi 10 p ; y ¼ s þ 3t ffiffiffiffiffi 10 p One can also express s and t in terms of x and y by using P1 ¼ PT . That is, s ¼ 3x þ y ffiffiffiffiffi 10 p ; t ¼ x þ 3t ffiffiffiffiffi 10 p CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors 317
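The orthogonal diagonalization found by hand in Problem 9.25 can be checked with NumPy's symmetric eigensolver. The fragment below is an aside, with the signs of B reconstructed to match the trace, cofactors, and determinant computed in part (a).

```python
import numpy as np

B = np.array([[11., -8., 4.], [-8., -1., -2.], [4., -2., -4.]])   # Problem 9.25
eigvals, P = np.linalg.eigh(B)
print(np.round(eigvals, 6))                 # [-5, -5, 16]
print(np.allclose(P.T @ P, np.eye(3)))      # True: the columns of P are orthonormal
print(np.round(P.T @ B @ P, 8))             # diag(-5, -5, 16) = P^{-1} B P
```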
  • 325. Minimal Polynomial 9.27. Let A ¼ 4 2 2 6 3 4 3 2 3 2 4 3 5 and B ¼ 3 2 2 4 4 6 2 3 5 2 4 3 5. The characteristic polynomial of both matrices is DðtÞ ¼ ðt 2Þðt 1Þ2 . Find the minimal polynomial mðtÞ of each matrix. The minimal polynomial mðtÞ must divide DðtÞ. Also, each factor of DðtÞ (i.e., t 2 and t 1) must also be a factor of mðtÞ. Thus, mðtÞ must be exactly one of the following: f ðtÞ ¼ ðt 2Þðt 1Þ or gðtÞ ¼ ðt 2Þðt 1Þ2 (a) By the Cayley–Hamilton theorem, gðAÞ ¼ DðAÞ ¼ 0, so we need only test f ðtÞ. We have f ðAÞ ¼ ðA 2IÞðA IÞ ¼ 2 2 2 6 5 4 3 2 1 2 4 3 5 3 2 2 6 4 4 3 2 2 2 4 3 5 ¼ 0 0 0 0 0 0 0 0 0 2 4 3 5 Thus, mðtÞ ¼ f ðtÞ ¼ ðt 2Þðt 1Þ ¼ t2 3t þ 2 is the minimal polynomial of A. (b) Again gðBÞ ¼ DðBÞ ¼ 0, so we need only test f ðtÞ. We get f ðBÞ ¼ ðB 2IÞðB IÞ ¼ 1 2 2 4 6 6 2 3 3 2 4 3 5 2 2 2 4 5 6 2 3 4 2 4 3 5 ¼ 2 2 2 4 4 4 2 2 2 2 4 3 5 6¼ 0 Thus, mðtÞ 6¼ f ðtÞ. Accordingly, mðtÞ ¼ gðtÞ ¼ ðt 2Þðt 1Þ2 is the minimal polynomial of B. [We emphasize that we do not need to compute gðBÞ; we know gðBÞ ¼ 0 from the Cayley–Hamilton theorem.] 9.28. Find the minimal polynomial mðtÞ of each of the following matrices: (a) A ¼ 5 1 3 7 , (b) B ¼ 1 2 3 0 2 3 0 0 3 2 4 3 5, (c) C ¼ 4 1 1 2 (a) The characteristic polynomial of A is DðtÞ ¼ t2 12t þ 32 ¼ ðt 4Þðt 8Þ. Because DðtÞ has distinct factors, the minimal polynomial mðtÞ ¼ DðtÞ ¼ t2 12t þ 32. (b) Because B is triangular, its eigenvalues are the diagonal elements 1; 2; 3; and so its characteristic polynomial is DðtÞ ¼ ðt 1Þðt 2Þðt 3Þ. Because DðtÞ has distinct factors, mðtÞ ¼ DðtÞ. (c) The characteristic polynomial of C is DðtÞ ¼ t2 6t þ 9 ¼ ðt 3Þ2 . Hence the minimal polynomial of C is f ðtÞ ¼ t 3 or gðtÞ ¼ ðt 3Þ2 . However, f ðCÞ 6¼ 0; that is, C 3I 6¼ 0. Hence, mðtÞ ¼ gðtÞ ¼ DðtÞ ¼ ðt 3Þ2 : 9.29. Suppose S ¼ fu1; u2; . . . ; ung is a basis of V, and suppose F and G are linear operators on V such that ½F has 0’s on and below the diagonal, and ½G has a 6¼ 0 on the superdiagonal and 0’s elsewhere. That is, ½F ¼ 0 a21 a31 . . . an1 0 0 a32 . . . an2 :::::::::::::::::::::::::::::::::::::::: 0 0 0 . . . an;n1 0 0 0 . . . 0 2 6 6 6 6 4 3 7 7 7 7 5 ; ½G ¼ 0 a 0 . . . 0 0 0 a . . . 0 ::::::::::::::::::::::::::: 0 0 0 . . . a 0 0 0 . . . 0 2 6 6 6 6 4 3 7 7 7 7 5 Show that (a) Fn ¼ 0, (b) Gn1 6¼ 0, but Gn ¼ 0. (These conditions also hold for ½F and ½G.) (a) We have Fðu1Þ ¼ 0 and, for r 1, FðurÞ is a linear combination of vectors preceding ur in S. That is, FðurÞ ¼ ar1u1 þ ar2u2 þ þ ar;r1ur1 318 CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors
  • 326. Hence, F2 ðurÞ ¼ FðFðurÞÞ is a linear combination of vectors preceding ur1, and so on. Hence, Fr ðurÞ ¼ 0 for each r. Thus, for each r, Fn ðurÞ ¼ Fnr ð0Þ ¼ 0, and so Fn ¼ 0, as claimed. (b) We have Gðu1Þ ¼ 0 and, for each k 1, GðukÞ ¼ auk1. Hence, Gr ðukÞ ¼ ar ukr for r k. Because a 6¼ 0, an1 6¼ 0. Therefore, Gn1 ðunÞ ¼ an1 u1 6¼ 0, and so Gn1 6¼ 0. On the other hand, by (a), Gn ¼ 0. 9.30. Let B be the matrix in Example 9.12(a) that has 1’s on the diagonal, a’s on the superdiagonal, where a 6¼ 0, and 0’s elsewhere. Show that f ðtÞ ¼ ðt lÞn is both the characteristic polynomial DðtÞ and the minimum polynomial mðtÞ of A. Because A is triangular with l’s on the diagonal, DðtÞ ¼ f ðtÞ ¼ ðt lÞn is its characteristic polynomial. Thus, mðtÞ is a power of t l. By Problem 9.29, ðA lIÞr1 6¼ 0. Hence, mðtÞ ¼ DðtÞ ¼ ðt lÞn . 9.31. Find the characteristic polynomial DðtÞ and minimal polynomial mðtÞ of each matrix: (a) M ¼ 4 1 0 0 0 0 4 1 0 0 0 0 4 0 0 0 0 0 4 1 0 0 0 0 4 2 6 6 6 6 4 3 7 7 7 7 5 , (b) M0 ¼ 2 7 0 0 0 2 0 0 0 0 1 1 0 0 2 4 2 6 6 4 3 7 7 5 (a) M is block diagonal with diagonal blocks A ¼ 4 1 0 0 4 1 0 0 4 2 4 3 5 and B ¼ 4 1 0 4 The characteristic and minimal polynomial of A is f ðtÞ ¼ ðt 4Þ3 and the characteristic and minimal polynomial of B is gðtÞ ¼ ðt 4Þ2 . Then DðtÞ ¼ f ðtÞgðtÞ ¼ ðt 4Þ5 but mðtÞ ¼ LCM½ f ðtÞ; gðtÞ ¼ ðt 4Þ3 (where LCM means least common multiple). We emphasize that the exponent in mðtÞ is the size of the largest block. (b) Here M0 is block diagonal with diagonal blocks A0 ¼ 2 7 0 2 and B0 ¼ 1 1 2 4 The char- acteristic and minimal polynomial of A0 is f ðtÞ ¼ ðt 2Þ2 . The characteristic polynomial of B0 is gðtÞ ¼ t2 5t þ 6 ¼ ðt 2Þðt 3Þ, which has distinct factors. Hence, gðtÞ is also the minimal polynomial of B. Accordingly, DðtÞ ¼ f ðtÞgðtÞ ¼ ðt 2Þ3 ðt 3Þ but mðtÞ ¼ LCM½ f ðtÞ; gðtÞ ¼ ðt 2Þ2 ðt 3Þ 9.32. Find a matrix A whose minimal polynomial is f ðtÞ ¼ t3 8t2 þ 5t þ 7. Simply let A ¼ 0 0 7 1 0 5 0 1 8 2 4 3 5, the companion matrix of f ðtÞ [defined in Example 9.12(b)]. 9.33. Prove Theorem 9.15: The minimal polynomial mðtÞ of a matrix (linear operator) A divides every polynomial that has A as a zero. In particular (by the Cayley–Hamilton theorem), mðtÞ divides the characteristic polynomial DðtÞ of A. Suppose f ðtÞ is a polynomial for which f ðAÞ ¼ 0. By the division algorithm, there exist polynomials qðtÞ and rðtÞ for which f ðtÞ ¼ mðtÞqðtÞ þ rðtÞ and rðtÞ ¼ 0 or deg rðtÞ deg mðtÞ. Substituting t ¼ A in this equation, and using that f ðAÞ ¼ 0 and mðAÞ ¼ 0, we obtain rðAÞ ¼ 0. If rðtÞ 6¼ 0, then rðtÞ is a polynomial of degree less than mðtÞ that has A as a zero. This contradicts the definition of the minimal polynomial. Thus, rðtÞ ¼ 0, and so f ðtÞ ¼ mðtÞqðtÞ; that is, mðtÞ divides f ðtÞ. CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors 319
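The rule used in Problem 9.31, that the exponent in m(t) is the size of the largest diagonal block, is easy to confirm numerically. The sketch below (NumPy, our own check rather than part of the solution) verifies that for the matrix M of part (a), (M − 4I)³ = 0 while (M − 4I)² ≠ 0, so m(t) = (t − 4)³ even though Δ(t) = (t − 4)⁵.

```python
import numpy as np

# Block diagonal matrix M of Problem 9.31(a): blocks of sizes 3 and 2 for the eigenvalue 4
M = np.array([[4, 1, 0, 0, 0],
              [0, 4, 1, 0, 0],
              [0, 0, 4, 0, 0],
              [0, 0, 0, 4, 1],
              [0, 0, 0, 0, 4]], dtype=float)

N = M - 4 * np.eye(5)
for k in (1, 2, 3):
    Nk = np.linalg.matrix_power(N, k)
    print(k, np.allclose(Nk, 0))      # 1 False, 2 False, 3 True
# The smallest k with (M - 4I)^k = 0 is 3, so m(t) = (t - 4)^3 while Delta(t) = (t - 4)^5.
```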
  • 327. 9.34. Let mðtÞ be the minimal polynomial of an n-square matrix A. Prove that the characteristic polynomial DðtÞ of A divides ½mðtÞn . Suppose mðtÞ ¼ tr þ c1tr1 þ þ cr1t þ cr. Define matrices Bj as follows: B0 ¼ I B1 ¼ A þ c1I B2 ¼ A2 þ c1A þ c2I Br1 ¼ Ar1 þ c1Ar2 þ þ cr1I so so so so I ¼ B0 c1I ¼ B1 A ¼ B1 AB0 c2I ¼ B2 AðA þ c1IÞ ¼ B2 AB1 ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: cr1I ¼ Br1 ABr2 Then ABr1 ¼ crI ðAr þ c1Ar1 þ þ cr1A þ crIÞ ¼ crI mðAÞ ¼ crI Set BðtÞ ¼ tr1 B0 þ tr2 B1 þ þ tBr2 þ Br1 Then ðtI AÞBðtÞ ¼ ðtr B0 þ tr1 B1 þ þ tBr1Þ ðtr1 AB0 þ tr2 AB1 þ þ ABr1Þ ¼ tr B0 þ tr1 ðB1 AB0Þ þ tr2 ðB2 AB1Þ þ þ tðBr1 ABr2Þ ABr1 ¼ tr I þ c1tr1 I þ c2tr2 I þ þ cr1tI þ crI ¼ mðtÞI Taking the determinant of both sides gives jtI AjjBðtÞj ¼ jmðtÞIj ¼ ½mðtÞn . Because jBðtÞj is a poly- nomial, jtI Aj divides ½mðtÞn ; that is, the characteristic polynomial of A divides ½mðtÞn . 9.35. Prove Theorem 9.16: The characteristic polynomial DðtÞ and the minimal polynomial mðtÞ of A have the same irreducible factors. Suppose f ðtÞ is an irreducible polynomial. If f ðtÞ divides mðtÞ, then f ðtÞ also divides DðtÞ [because mðtÞ divides DðtÞ. On the other hand, if f ðtÞ divides DðtÞ, then by Problem 9.34, f ðtÞ also divides ½mðtÞn . But f ðtÞ is irreducible; hence, f ðtÞ also divides mðtÞ. Thus, mðtÞ and DðtÞ have the same irreducible factors. 9.36. Prove Theorem 9.19: The minimal polynomial mðtÞ of a block diagonal matrix M with diagonal blocks Ai is equal to the least common multiple (LCM) of the minimal polynomials of the diagonal blocks Ai. We prove the theorem for the case r ¼ 2. The general theorem follows easily by induction. Suppose M ¼ A 0 0 B , where A and B are square matrices. We need to show that the minimal polynomial mðtÞ of M is the LCM of the minimal polynomials gðtÞ and hðtÞ of A and B, respectively. Because mðtÞ is the minimal polynomial of M; mðMÞ ¼ mðAÞ 0 0 mðBÞ ¼ 0, and mðAÞ ¼ 0 and mðBÞ ¼ 0. Because gðtÞ is the minimal polynomial of A, gðtÞ divides mðtÞ. Similarly, hðtÞ divides mðtÞ. Thus mðtÞ is a multiple of gðtÞ and hðtÞ. Now let f ðtÞ be another multiple of gðtÞ and hðtÞ. Then f ðMÞ ¼ f ðAÞ 0 0 f ðBÞ ¼ 0 0 0 0 ¼ 0. But mðtÞ is the minimal polynomial of M; hence, mðtÞ divides f ðtÞ. Thus, mðtÞ is the LCM of gðtÞ and hðtÞ. 9.37. Suppose mðtÞ ¼ tr þ ar1tr1 þ þ a1t þ a0 is the minimal polynomial of an n-square matrix A. Prove the following: (a) A is nonsingular if and only if the constant term a0 6¼ 0. (b) If A is nonsingular, then A1 is a polynomial in A of degree r 1 n. (a) The following are equivalent: (i) A is nonsingular, (ii) 0 is not a root of mðtÞ, (iii) a0 6¼ 0. Thus, the statement is true. 320 CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors
  • 328. (b) Because A is nonsingular, a0 6¼ 0 by (a). We have mðAÞ ¼ Ar þ ar1Ar1 þ þ a1A þ a0I ¼ 0 Thus; 1 a0 ðAr1 þ ar1Ar2 þ þ a1IÞA ¼ I Accordingly; A1 ¼ 1 a0 ðAr1 þ ar1Ar2 þ þ a1IÞ SUPPLEMENTARY PROBLEMS Polynomials of Matrices 9.38. Let A ¼ 2 3 5 1 and B ¼ 1 2 0 3 . Find f ðAÞ, gðAÞ, f ðBÞ, gðBÞ, where f ðtÞ ¼ 2t2 5t þ 6 and gðtÞ ¼ t3 2t2 þ t þ 3. 9.39. Let A ¼ 1 2 0 1 . Find A2 , A3 , An , where n 3, and A1 . 9.40. Let B ¼ 8 12 0 0 8 12 0 0 8 2 4 3 5. Find a real matrix A such that B ¼ A3 . 9.41. For each matrix, find a polynomial having the following matrix as a root: (a) A ¼ 2 5 1 3 , (b) B ¼ 2 3 7 4 , (c) C ¼ 1 1 2 1 2 3 2 1 4 2 4 3 5 9.42. Let A be any square matrix and let f ðtÞ be any polynomial. Prove (a) ðP1 APÞn ¼ P1 An P. (b) f ðP1 APÞ ¼ P1 f ðAÞP. (c) f ðAT Þ ¼ ½ f ðAÞT . (d) If A is symmetric, then f ðAÞ is symmetric. 9.43. Let M ¼ diag½A1; . . . ; Ar be a block diagonal matrix, and let f ðtÞ be any polynomial. Show that f ðMÞ is block diagonal and f ðMÞ ¼ diag½ f ðA1Þ; . . . ; f ðArÞ: 9.44. Let M be a block triangular matrix with diagonal blocks A1; . . . ; Ar, and let f ðtÞ be any polynomial. Show that f ðMÞ is also a block triangular matrix, with diagonal blocks f ðA1Þ; . . . ; f ðArÞ. Eigenvalues and Eigenvectors 9.45. For each of the following matrices, find all eigenvalues and corresponding linearly independent eigen- vectors: (a) A ¼ 2 3 2 5 , (b) B ¼ 2 4 1 6 , (c) C ¼ 1 4 3 7 When possible, find the nonsingular matrix P that diagonalizes the matrix. 9.46. Let A ¼ 2 1 2 3 . (a) Find eigenvalues and corresponding eigenvectors. (b) Find a nonsingular matrix P such that D ¼ P1 AP is diagonal. (c) Find A8 and f ðAÞ where f ðtÞ ¼ t4 5t3 þ 7t2 2t þ 5. (d) Find a matrix B such that B2 ¼ A. CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors 321
  • 329. 9.47. Repeat Problem 9.46 for A ¼ 5 6 2 2 . 9.48. For each of the following matrices, find all eigenvalues and a maximum set S of linearly independent eigenvectors: (a) A ¼ 1 3 3 3 5 3 6 6 4 2 4 3 5, (b) B ¼ 3 1 1 7 5 1 6 6 2 2 4 3 5, (c) C ¼ 1 2 2 1 2 1 1 1 4 2 4 3 5 Which matrices can be diagonalized, and why? 9.49. For each of the following linear operators T: R2 ! R2 , find all eigenvalues and a basis for each eigenspace: (a) Tðx; yÞ ¼ ð3x þ 3y; x þ 5yÞ, (b) Tðx; yÞ ¼ ð3x 13y; x 3yÞ. 9.50. Let A ¼ a b c d be a real matrix. Find necessary and sufficient conditions on a; b; c; d so that A is diagonalizable—that is, so that A has two (real) linearly independent eigenvectors. 9.51. Show that matrices A and AT have the same eigenvalues. Give an example of a 2 2 matrix A where A and AT have different eigenvectors. 9.52. Suppose v is an eigenvector of linear operators F and G. Show that v is also an eigenvector of the linear operator kF þ k0 G, where k and k0 are scalars. 9.53. Suppose v is an eigenvector of a linear operator T belonging to the eigenvalue l. Prove (a) For n 0; v is an eigenvector of Tn belonging to ln . (b) f ðlÞ is an eigenvalue of f ðTÞ for any polynomial f ðtÞ. 9.54. Suppose l 6¼ 0 is an eigenvalue of the composition F G of linear operators F and G. Show that l is also an eigenvalue of the composition G F. [Hint: Show that GðvÞ is an eigenvector of G F.] 9.55. Let E: V ! V be a projection mapping; that is, E2 ¼ E. Show that E is diagonalizable and, in fact, can be represented by the diagonal matrix M ¼ Ir 0 0 0 , where r is the rank of E. Diagonalizing Real Symmetric Matrices and Quadratic Forms 9.56. For each of the following symmetric matrices A, find an orthogonal matrix P and a diagonal matrix D such that D ¼ P1 AP: (a) A ¼ 5 4 4 1 , (b) A ¼ 4 1 1 4 , (c) A ¼ 7 3 3 1 9.57. For each of the following symmetric matrices B, find its eigenvalues, a maximal orthogonal set S of eigenvectors, and an orthogonal matrix P such that D ¼ P1 BP is diagonal: (a) B ¼ 0 1 1 1 0 1 1 1 0 2 4 3 5, (b) B ¼ 2 2 4 2 5 8 4 8 17 2 4 3 5 9.58. Using variables s and t, find an orthogonal substitution that diagonalizes each of the following quadratic forms: (a) qðx; yÞ ¼ 4x2 þ 8xy 11y2 , (b) qðx; yÞ ¼ 2x2 6xy þ 10y2 9.59. For each of the following quadratic forms qðx; y; zÞ, find an orthogonal substitution expressing x; y; z in terms of variables r; s; t, and find qðr; s; tÞ: (a) qðx; y; zÞ ¼ 5x2 þ 3y2 þ 12xz; (b) qðx; y; zÞ ¼ 3x2 4xy þ 6y2 þ 2xz 4yz þ 3z2 322 CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors
  • 330. 9.60. Find a real 2 2 symmetric matrix A with eigenvalues: (a) l ¼ 1 and l ¼ 4 and eigenvector u ¼ ð1; 1Þ belonging to l ¼ 1; (b) l ¼ 2 and l ¼ 3 and eigenvector u ¼ ð1; 2Þ belonging to l ¼ 2. In each case, find a matrix B for which B2 ¼ A. Characteristic and Minimal Polynomials 9.61. Find the characteristic and minimal polynomials of each of the following matrices: (a) A ¼ 3 1 1 2 4 2 1 1 3 2 4 3 5, (b) B ¼ 3 2 1 3 8 3 3 6 1 2 4 3 5 9.62. Find the characteristic and minimal polynomials of each of the following matrices: (a) A ¼ 2 5 0 0 0 0 2 0 0 0 0 0 4 2 0 0 0 3 5 0 0 0 0 0 7 2 6 6 6 6 4 3 7 7 7 7 5 , (b) B ¼ 4 1 0 0 0 1 2 0 0 0 0 0 3 1 0 0 0 0 3 1 0 0 0 0 3 2 6 6 6 6 4 3 7 7 7 7 5 , (c) C ¼ 3 2 0 0 0 1 4 0 0 0 0 0 3 1 0 0 0 1 3 0 0 0 0 0 4 2 6 6 6 6 4 3 7 7 7 7 5 9.63. Let A ¼ 1 1 0 0 2 0 0 0 1 2 4 3 5 and B ¼ 2 0 0 0 2 2 0 0 1 2 4 3 5. Show that A and B have different characteristic polynomials (and so are not similar) but have the same minimal polynomial. Thus, nonsimilar matrices may have the same minimal polynomial. 9.64. Let A be an n-square matrix for which Ak ¼ 0 for some k n. Show that An ¼ 0. 9.65. Show that a matrix A and its transpose AT have the same minimal polynomial. 9.66. Suppose f ðtÞ is an irreducible monic polynomial for which f ðAÞ ¼ 0 for a matrix A. Show that f ðtÞ is the minimal polynomial of A. 9.67. Show that A is a scalar matrix kI if and only if the minimal polynomial of A is mðtÞ ¼ t k. 9.68. Find a matrix A whose minimal polynomial is (a) t3 5t2 þ 6t þ 8, (b) t4 5t3 2t þ 7t þ 4. 9.69. Let f ðtÞ and gðtÞ be monic polynomials (leading coefficient one) of minimal degree for which A is a root. Show f ðtÞ ¼ gðtÞ: [Thus, the minimal polynomial of A is unique.] ANSWERS TO SUPPLEMENTARY PROBLEMS Notation: M ¼ ½R1; R2; . . . denotes a matrix M with rows R1; R2; . . . : 9.38. f ðAÞ ¼ ½26; 3; 5; 27, gðAÞ ¼ ½40; 39; 65; 27, f ðBÞ ¼ ½3; 6; 0; 9, gðBÞ ¼ ½3; 12; 0; 15 9.39. A2 ¼ ½1; 4; 0; 1, A3 ¼ ½1; 6; 0; 1, An ¼ ½1; 2n; 0; 1, A1 ¼ ½1; 2; 0; 1 9.40. Let A ¼ ½2; a; b; 0; 2; c; 0; 0; 2. Set B ¼ A3 and then a ¼ 1, b ¼ 1 2, c ¼ 1 CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors 323
  • 331. 9.41. Find DðtÞ: (a) t2 þ t 11, (b) t2 þ 2t þ 13, (c) t3 7t2 þ 6t 1 9.45. (a) l ¼ 1; u ¼ ð3; 1Þ; l ¼ 4; v ¼ ð1; 2Þ, (b) l ¼ 4; u ¼ ð2; 1Þ, (c) l ¼ 1; u ¼ ð2; 1Þ; l ¼ 5; v ¼ ð2; 3Þ. Only A and C can be diagonalized; use P ¼ ½u; v. 9.46. (a) l ¼ 1; u ¼ ð1; 1Þ; l ¼ 4; v ¼ ð1; 2Þ, (b) P ¼ ½u; v, (c) f ðAÞ ¼ ½3; 1; 2; 1; A8 ¼ ½21 846; 21 845; 43 690; 43 691, (d) B ¼ 4 3 ; 1 3 ; 2 3 ; 5 3 9.47. (a) l ¼ 1; u ¼ ð3; 2Þ; l ¼ 2; v ¼ ð2; 1Þ, (b) P ¼ ½u; v, (c) f ðAÞ ¼ ½2; 6; 2; 9; A8 ¼ ½1021; 1530; 510; 764, (d) B ¼ ½3 þ 4 ffiffiffi 2 p ; 6 þ 6 ffiffiffi 2 p ; 2 2 ffiffiffi 2 p ; 4 3 ffiffiffi 2 p 9.48. (a) l ¼ 2; u ¼ ð1; 1; 0Þ; v ¼ ð1; 0; 1Þ; l ¼ 4; w ¼ ð1; 1; 2Þ, (b) l ¼ 2; u ¼ ð1; 1; 0Þ; l ¼ 4; v ¼ ð0; 1; 1Þ, (c) l ¼ 3; u ¼ ð1; 1; 0Þ; v ¼ ð1; 0; 1Þ; l ¼ 1; w ¼ ð2; 1; 1Þ. Only A and C can be diagonalized; use P ¼ ½u; v; w: 9.49. (a) l ¼ 2; u ¼ ð3; 1Þ; l ¼ 6; v ¼ ð1; 1Þ, (b) No real eigenvalues 9.50. We need ½trðAÞ2 4½detðAÞ 0 or ða dÞ2 þ 4bc 0. 9.51. A ¼ ½1; 1; 0; 1 9.56. (a) P ¼ ½2; 1; 1; 2= ffiffiffi 5 p , D ¼ ½7; 0; 0; 3, (b) P ¼ ½1; 1; 1; 1= ffiffiffi 2 p , D ¼ ½3; 0; 0; 5, (c) P ¼ ½3; 1; 1; 3= ffiffiffiffiffi 10 p , D ¼ ½8; 0; 0; 2 9.57. (a) l ¼ 1; u ¼ ð1; 1; 0Þ; v ¼ ð1; 1; 2Þ; l ¼ 2; w ¼ ð1; 1; 1Þ, (b) l ¼ 1; u ¼ ð2; 1; 1Þ; v ¼ ð2; 3; 1Þ; l ¼ 22; w ¼ ð1; 2; 4Þ; Normalize u; v; w, obtaining ^ u; ^ v; ^ w, and set P ¼ ½^ u; ^ v; ^ w. (Remark: u and v are not unique.) 9.58. (a) x ¼ ð4s þ tÞ= ffiffiffiffiffi 17 p ; y ¼ ðs þ 4tÞ= ffiffiffiffiffi 17 p ; qðs; tÞ ¼ 5s2 12t2 , (b) x ¼ ð3s tÞ= ffiffiffiffiffi 10 p ; y ¼ ðs þ 3tÞ= ffiffiffiffiffi 10 p ; qðs; tÞ ¼ s2 þ 11t2 9.59. (a) x ¼ ð3s þ 2tÞ= ffiffiffiffiffi 13 p ; y ¼ r; z ¼ ð2s 3tÞ= ffiffiffiffiffi 13 p ; qðr; s; tÞ ¼ 3r2 þ 9s2 4t2 , (b) x ¼ 5Ks þ Lt; y ¼ Jr þ 2Ks 2Lt; z ¼ 2Jr Ks Lt, where J ¼ 1= ffiffiffi 5 p , K ¼ 1= ffiffiffiffiffi 30 p , L ¼ 1= ffiffiffi 6 p ; qðr; s; tÞ ¼ 2r2 þ 2s2 þ 8t2 9.60. (a) A ¼ 1 2 ½5; 3; 3; 5; B ¼ 1 2 ½3; 1; 1; 3, (b) A ¼ 1 5 ½14; 2; 2; 11, B ¼ 1 5 ½ ffiffiffi 2 p þ 4 ffiffiffi 3 p ; 2 ffiffiffi 2 p 2 ffiffiffi 3 p ; 2 ffiffiffi 2 p 2 ffiffiffi 3 p ; 4 ffiffiffi 2 p þ ffiffiffi 3 p 9.61. (a) DðtÞ ¼ mðtÞ ¼ ðt 2Þ2 ðt 6Þ, (b) DðtÞ ¼ ðt 2Þ2 ðt 6Þ; mðtÞ ¼ ðt 2Þðt 6Þ 9.62. (a) DðtÞ ¼ ðt 2Þ3 ðt 7Þ2 ; mðtÞ ¼ ðt 2Þ2 ðt 7Þ, (b) DðtÞ ¼ ðt 3Þ5 ; mðtÞ ¼ ðt 3Þ3 , (c) DðtÞ ¼ ðt 2Þ2 ðt 4Þ2 ðt 5Þ; mðtÞ ¼ ðt 2Þðt 4Þðt 5Þ 9.68. Let A be the companion matrix [Example 9.12(b)] with last column: (a) ½8; 6; 5T , (b) ½4; 7; 2; 5T 9.69. Hint: A is a root of hðtÞ ¼ f ðtÞ gðtÞ, where hðtÞ 0 or the degree of hðtÞ is less than the degree of f ðtÞ: 324 CHAPTER 9 Diagonalization: Eigenvalues and Eigenvectors
CHAPTER 10  Canonical Forms

10.1 Introduction

Let T be a linear operator on a vector space of finite dimension. As seen in Chapter 6, T may not have a diagonal matrix representation. However, it is still possible to "simplify" the matrix representation of T in a number of ways. This is the main topic of this chapter. In particular, we obtain the primary decomposition theorem, and the triangular, Jordan, and rational canonical forms.

We comment that the triangular and Jordan canonical forms exist for T if and only if the characteristic polynomial Δ(t) of T has all its roots in the base field K. This is always true if K is the complex field C but may not be true if K is the real field R.

We also introduce the idea of a quotient space. This is a very powerful tool, and it will be used in the proof of the existence of the triangular and rational canonical forms.

10.2 Triangular Form

Let T be a linear operator on an n-dimensional vector space V. Suppose T can be represented by the triangular matrix

$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ & a_{22} & \cdots & a_{2n} \\ & & \ddots & \vdots \\ & & & a_{nn} \end{bmatrix}$$

Then the characteristic polynomial Δ(t) of T is a product of linear factors; that is,

$$\Delta(t) = \det(tI - A) = (t - a_{11})(t - a_{22})\cdots(t - a_{nn})$$

The converse is also true and is an important theorem (proved in Problem 10.28).

THEOREM 10.1: Let T: V → V be a linear operator whose characteristic polynomial factors into linear polynomials. Then there exists a basis of V in which T is represented by a triangular matrix.

THEOREM 10.1: (Alternative Form) Let A be a square matrix whose characteristic polynomial factors into linear polynomials. Then A is similar to a triangular matrix; that is, there exists an invertible matrix P such that P⁻¹AP is triangular.

We say that an operator T can be brought into triangular form if it can be represented by a triangular matrix. Note that in this case, the eigenvalues of T are precisely those entries appearing on the main diagonal. We give an application of this remark.
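Before turning to that application, here is a computational illustration of Theorem 10.1 over C, assuming NumPy and SciPy are available. The Schur decomposition below is one concrete (indeed unitary) way of producing a triangular matrix similar to A; it is offered as a sketch, not as the construction used later in Problem 10.28.

```python
import numpy as np
from scipy.linalg import schur

# A non-symmetric matrix chosen for illustration; over C it is similar to a triangular matrix.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 1.0],
              [1.0, 0.0, 3.0]])

T, Z = schur(A, output='complex')          # A = Z T Z^H with T upper triangular, Z unitary

print(np.round(np.diag(T), 3))             # the eigenvalues of A, on the diagonal of T
print(np.allclose(Z @ T @ Z.conj().T, A))  # True: A is similar to T via P = Z
print(np.allclose(np.tril(T, -1), 0))      # True: T is upper triangular
```

The diagonal of T lists the eigenvalues of A, exactly as the remark after Theorem 10.1 states for any triangular representation.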
EXAMPLE 10.1 Let A be a square matrix over the complex field C. Suppose λ is an eigenvalue of A². Show that √λ or −√λ is an eigenvalue of A.

By Theorem 10.1, A and A² are similar, respectively, to triangular matrices of the form

$$B = \begin{bmatrix} \mu_1 & * & \cdots & * \\ & \mu_2 & \cdots & * \\ & & \ddots & \vdots \\ & & & \mu_n \end{bmatrix} \quad\text{and}\quad B^2 = \begin{bmatrix} \mu_1^2 & * & \cdots & * \\ & \mu_2^2 & \cdots & * \\ & & \ddots & \vdots \\ & & & \mu_n^2 \end{bmatrix}$$

Because similar matrices have the same eigenvalues, λ = μᵢ² for some i. Hence, μᵢ = √λ or μᵢ = −√λ is an eigenvalue of A.

10.3 Invariance

Let T: V → V be linear. A subspace W of V is said to be invariant under T or T-invariant if T maps W into itself; that is, if v ∈ W implies T(v) ∈ W. In this case, T restricted to W defines a linear operator on W; that is, T induces a linear operator T̂: W → W defined by T̂(w) = T(w) for every w ∈ W.

EXAMPLE 10.2

(a) Let T: R³ → R³ be the following linear operator, which rotates each vector v about the z-axis by an angle θ (shown in Fig. 10-1):

$$T(x, y, z) = (x\cos\theta - y\sin\theta,\ x\sin\theta + y\cos\theta,\ z)$$

Observe that each vector w = (a, b, 0) in the xy-plane W remains in W under the mapping T; hence, W is T-invariant. Observe also that the z-axis U is invariant under T. Furthermore, the restriction of T to W rotates each vector about the origin O, and the restriction of T to U is the identity mapping of U.

(b) Nonzero eigenvectors of a linear operator T: V → V may be characterized as generators of T-invariant one-dimensional subspaces. Suppose T(v) = λv, v ≠ 0. Then W = {kv, k ∈ K}, the one-dimensional subspace generated by v, is invariant under T because

$$T(kv) = kT(v) = k(\lambda v) = k\lambda v \in W$$

Conversely, suppose dim U = 1 and u ≠ 0 spans U, and U is invariant under T. Then T(u) ∈ U and so T(u) is a multiple of u; that is, T(u) = μu. Hence, u is an eigenvector of T.

The next theorem (proved in Problem 10.3) gives us an important class of invariant subspaces.

THEOREM 10.2: Let T: V → V be any linear operator, and let f(t) be any polynomial. Then the kernel of f(T) is invariant under T.

The notion of invariance is related to matrix representations (Problem 10.5) as follows.

THEOREM 10.3: Suppose W is an invariant subspace of T: V → V. Then T has a block matrix representation $\begin{bmatrix} A & B \\ 0 & C \end{bmatrix}$, where A is a matrix representation of the restriction T̂ of T to W.

[Figure 10-1: rotation about the z-axis by the angle θ, showing v and w, their images T(v) and T(w), the invariant plane W, and the invariant axis U.]
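A quick numerical check of Example 10.2(a) may be helpful: vectors in the xy-plane W stay in W, and vectors on the z-axis U are left fixed. The angle and test vectors below are arbitrary choices of ours, and only NumPy is assumed.

```python
import numpy as np

def rotation_z(theta):
    """Matrix of T(x, y, z) = (x cos t - y sin t, x sin t + y cos t, z), as in Example 10.2(a)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

T = rotation_z(0.7)                 # arbitrary angle
w = np.array([3.0, -2.0, 0.0])      # a vector in the xy-plane W
u = np.array([0.0, 0.0, 5.0])       # a vector on the z-axis U

print(np.isclose((T @ w)[2], 0.0))  # True: T(w) stays in W, so W is T-invariant
print(np.allclose(T @ u, u))        # True: T restricted to U is the identity
```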
  • 334. 10.4 Invariant Direct-Sum Decompositions A vector space V is termed the direct sum of subspaces W1; . . . ; Wr, written V ¼ W1 W2 . . . Wr if every vector v 2 V can be written uniquely in the form v ¼ w1 þ w2 þ . . . þ wr; with wi 2 Wi The following theorem (proved in Problem 10.7) holds. THEOREM 10.4: Suppose W1; W2; . . . ; Wr are subspaces of V, and suppose B1 ¼ fw11; w12; . . . ; w1n1 g; . . . ; Br ¼ fwr1; wr2; . . . ; wrnr g are bases of W1; W2; . . . ; Wr, respectively. Then V is the direct sum of the Wi if and only if the union B ¼ B1 [ . . . [ Br is a basis of V. Now suppose T:V ! V is linear and V is the direct sum of (nonzero) T-invariant subspaces W1; W2; . . . ; Wr; that is, V ¼ W1 . . . Wr and TðWiÞ Wi; i ¼ 1; . . . ; r Let Ti denote the restriction of T to Wi. Then T is said to be decomposable into the operators Ti or T is said to be the direct sum of the Ti; written T ¼ T1 . . . Tr: Also, the subspaces W1; . . . ; Wr are said to reduce T or to form a T-invariant direct-sum decomposition of V. Consider the special case where two subspaces U and W reduce an operator T:V ! V; say dim U ¼ 2 and dim W ¼ 3, and suppose fu1; u2g and fw1; w2; w3g are bases of U and W, respectively. If T1 and T2 denote the restrictions of T to U and W, respectively, then T1ðu1Þ ¼ a11u1 þ a12u2 T1ðu2Þ ¼ a21u1 þ a22u2 T2ðw1Þ ¼ b11w1 þ b12w2 þ b13w3 T2ðw2Þ ¼ b21w1 þ b22w2 þ b23w3 T2ðw3Þ ¼ b31w1 þ b32w2 þ b33w3 Accordingly, the following matrices A; B; M are the matrix representations of T1, T2, T, respectively, A ¼ a11 a21 a12 a22 ; B ¼ b11 b21 b31 b12 b22 b32 b13 b23 b33 2 4 3 5; M ¼ A 0 0 B The block diagonal matrix M results from the fact that fu1; u2; w1; w2; w3g is a basis of V (Theorem 10.4), and that TðuiÞ ¼ T1ðuiÞ and TðwjÞ ¼ T2ðwjÞ. A generalization of the above argument gives us the following theorem. THEOREM 10.5: Suppose T:V ! V is linear and suppose V is the direct sum of T-invariant subspaces, say, W1; . . . ; Wr. If Ai is a matrix representation of the restriction of T to Wi, then T can be represented by the block diagonal matrix: M ¼ diagðA1; A2; . . . ; ArÞ 10.5 Primary Decomposition The following theorem shows that any operator T:V ! V is decomposable into operators whose minimum polynomials are powers of irreducible polynomials. This is the first step in obtaining a canonical form for T. CHAPTER 10 Canonical Forms 327
  • 335. THEOREM 10.6: (Primary Decomposition Theorem) Let T:V ! V be a linear operator with minimal polynomial mðtÞ ¼ f1ðtÞn1 f2ðtÞn2 frðtÞnr where the fiðtÞ are distinct monic irreducible polynomials. Then V is the direct sum of T-invariant subspaces W1; . . . ; Wr, where Wi is the kernel of fiðTÞni . Moreover, fiðtÞni is the minimal polynomial of the restriction of T to Wi. The above polynomials fiðtÞni are relatively prime. Therefore, the above fundamental theorem follows (Problem 10.11) from the next two theorems (proved in Problems 10.9 and 10.10, respectively). THEOREM 10.7: Suppose T:V ! V is linear, and suppose f ðtÞ ¼ gðtÞhðtÞ are polynomials such that f ðTÞ ¼ 0 and gðtÞ and hðtÞ are relatively prime. Then V is the direct sum of the T-invariant subspace U and W, where U ¼ Ker gðTÞ and W ¼ Ker hðTÞ. THEOREM 10.8: In Theorem 10.7, if f ðtÞ is the minimal polynomial of T [and gðtÞ and hðtÞ are monic], then gðtÞ and hðtÞ are the minimal polynomials of the restrictions of T to U and W, respectively. We will also use the primary decomposition theorem to prove the following useful characterization of diagonalizable operators (see Problem 10.12 for the proof). THEOREM 10.9: A linear operator T:V ! V is diagonalizable if and only if its minimal polynomial mðtÞ is a product of distinct linear polynomials. THEOREM 10.9: (Alternative Form) A matrix A is similar to a diagonal matrix if and only if its minimal polynomial is a product of distinct linear polynomials. EXAMPLE 10.3 Suppose A 6¼ I is a square matrix for which A3 ¼ I. Determine whether or not A is similar to a diagonal matrix if A is a matrix over: (i) the real field R, (ii) the complex field C. Because A3 ¼ I, A is a zero of the polynomial f ðtÞ ¼ t3 1 ¼ ðt 1Þðt2 þ t þ 1Þ: The minimal polynomial mðtÞ of A cannot be t 1, because A 6¼ I. Hence, mðtÞ ¼ t2 þ t þ 1 or mðtÞ ¼ t3 1 Because neither polynomial is a product of linear polynomials over R, A is not diagonalizable over R. On the other hand, each of the polynomials is a product of distinct linear polynomials over C. Hence, A is diagonalizable over C. 10.6 Nilpotent Operators A linear operator T:V ! V is termed nilpotent if Tn ¼ 0 for some positive integer n; we call k the index of nilpotency of T if Tk ¼ 0 but Tk1 6¼ 0: Analogously, a square matrix A is termed nilpotent if An ¼ 0 for some positive integer n, and of index k if Ak ¼ 0 but Ak1 6¼ 0. Clearly the minimum polynomial of a nilpotent operator (matrix) of index k is mðtÞ ¼ tk ; hence, 0 is its only eigenvalue. EXAMPLE 10.4 The following two r-square matrices will be used throughout the chapter: N ¼ NðrÞ ¼ 0 1 0 . . . 0 0 0 0 1 . . . 0 0 :::::::::::::::::::::::::::::::: 0 0 0 . . . 0 1 0 0 0 . . . 0 0 2 6 6 6 6 4 3 7 7 7 7 5 and JðlÞ ¼ l 1 0 . . . 0 0 0 l 1 . . . 0 0 :::::::::::::::::::::::::::::::: 0 0 0 . . . l 1 0 0 0 . . . 0 l 2 6 6 6 6 4 3 7 7 7 7 5 328 CHAPTER 10 Canonical Forms
The first matrix N, called a Jordan nilpotent block, consists of 1's above the diagonal (called the superdiagonal), and 0's elsewhere. It is a nilpotent matrix of index r. (The matrix N of order 1 is just the 1 × 1 zero matrix [0].) The second matrix J(λ), called a Jordan block belonging to the eigenvalue λ, consists of λ's on the diagonal, 1's on the superdiagonal, and 0's elsewhere. Observe that

$$J(\lambda) = \lambda I + N$$

In fact, we will prove that any linear operator T can be decomposed into operators, each of which is the sum of a scalar operator and a nilpotent operator.

The following (proved in Problem 10.16) is a fundamental result on nilpotent operators.

THEOREM 10.10: Let T: V → V be a nilpotent operator of index k. Then T has a block diagonal matrix representation in which each diagonal entry is a Jordan nilpotent block N. There is at least one N of order k, and all other N are of orders ≤ k. The number of N of each possible order is uniquely determined by T. The total number of N of all orders is equal to the nullity of T.

The proof of Theorem 10.10 shows that the number of N of order i is equal to 2mᵢ − mᵢ₊₁ − mᵢ₋₁, where mᵢ is the nullity of Tⁱ.

10.7 Jordan Canonical Form

An operator T can be put into Jordan canonical form if its characteristic and minimal polynomials factor into linear polynomials. This is always true if K is the complex field C. In any case, we can always extend the base field K to a field in which the characteristic and minimal polynomials do factor into linear factors; thus, in a broad sense, every operator has a Jordan canonical form. Analogously, every matrix is similar to a matrix in Jordan canonical form.

The following theorem (proved in Problem 10.18) describes the Jordan canonical form J of a linear operator T.

THEOREM 10.11: Let T: V → V be a linear operator whose characteristic and minimal polynomials are, respectively,

$$\Delta(t) = (t - \lambda_1)^{n_1} \cdots (t - \lambda_r)^{n_r} \quad\text{and}\quad m(t) = (t - \lambda_1)^{m_1} \cdots (t - \lambda_r)^{m_r}$$

where the λᵢ are distinct scalars. Then T has a block diagonal matrix representation J in which each diagonal entry is a Jordan block Jᵢⱼ = J(λᵢ). For each λᵢ, the corresponding Jᵢⱼ have the following properties:

(i) There is at least one Jᵢⱼ of order mᵢ; all other Jᵢⱼ are of order ≤ mᵢ.
(ii) The sum of the orders of the Jᵢⱼ is nᵢ.
(iii) The number of Jᵢⱼ equals the geometric multiplicity of λᵢ.
(iv) The number of Jᵢⱼ of each possible order is uniquely determined by T.

EXAMPLE 10.5 Suppose the characteristic and minimal polynomials of an operator T are, respectively,

$$\Delta(t) = (t - 2)^4 (t - 5)^3 \quad\text{and}\quad m(t) = (t - 2)^2 (t - 5)^3$$
Then the Jordan canonical form of T is one of the following block diagonal matrices:

$$\operatorname{diag}\!\left(\begin{bmatrix}2&1\\0&2\end{bmatrix},\ \begin{bmatrix}2&1\\0&2\end{bmatrix},\ \begin{bmatrix}5&1&0\\0&5&1\\0&0&5\end{bmatrix}\right)\quad\text{or}\quad\operatorname{diag}\!\left(\begin{bmatrix}2&1\\0&2\end{bmatrix},\ [2],\ [2],\ \begin{bmatrix}5&1&0\\0&5&1\\0&0&5\end{bmatrix}\right)$$

The first matrix occurs if T has two independent eigenvectors belonging to the eigenvalue 2; and the second matrix occurs if T has three independent eigenvectors belonging to the eigenvalue 2.

10.8 Cyclic Subspaces

Let T be a linear operator on a vector space V of finite dimension over K. Suppose v ∈ V and v ≠ 0. The set of all vectors of the form f(T)(v), where f(t) ranges over all polynomials over K, is a T-invariant subspace of V called the T-cyclic subspace of V generated by v; we denote it by Z(v, T) and denote the restriction of T to Z(v, T) by T_v. By Problem 10.56, we could equivalently define Z(v, T) as the intersection of all T-invariant subspaces of V containing v.

Now consider the sequence

$$v,\ T(v),\ T^2(v),\ T^3(v),\ \ldots$$

of powers of T acting on v. Let k be the least integer such that Tᵏ(v) is a linear combination of those vectors that precede it in the sequence; say,

$$T^k(v) = -a_{k-1}T^{k-1}(v) - \cdots - a_1 T(v) - a_0 v$$

Then

$$m_v(t) = t^k + a_{k-1}t^{k-1} + \cdots + a_1 t + a_0$$

is the unique monic polynomial of lowest degree for which m_v(T)(v) = 0. We call m_v(t) the T-annihilator of v and Z(v, T).

The following theorem (proved in Problem 10.29) holds.

THEOREM 10.12: Let Z(v, T), T_v, m_v(t) be defined as above. Then

(i) The set {v, T(v), ..., Tᵏ⁻¹(v)} is a basis of Z(v, T); hence, dim Z(v, T) = k.
(ii) The minimal polynomial of T_v is m_v(t).
(iii) The matrix representation of T_v in the above basis is just the companion matrix C(m_v) of m_v(t); that is,

$$C(m_v) = \begin{bmatrix} 0 & 0 & 0 & \cdots & 0 & -a_0 \\ 1 & 0 & 0 & \cdots & 0 & -a_1 \\ 0 & 1 & 0 & \cdots & 0 & -a_2 \\ \vdots & & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & 0 & -a_{k-2} \\ 0 & 0 & 0 & \cdots & 1 & -a_{k-1} \end{bmatrix}$$

10.9 Rational Canonical Form

In this section, we present the rational canonical form for a linear operator T: V → V. We emphasize that this form exists even when the minimal polynomial cannot be factored into linear polynomials. (Recall that this is not the case for the Jordan canonical form.)
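The construction behind Theorem 10.12 can be carried out mechanically: apply T to v until the next power depends on the earlier ones, then read off the T-annihilator and its companion matrix. The helper below is a sketch of that procedure in SymPy (the name cyclic_data is ours); it is demonstrated on the companion matrix of Problem 9.32, for which the whole space is T-cyclic. The same companion matrices are the building blocks of the rational canonical form developed next.

```python
import sympy as sp

def cyclic_data(T, v):
    """Return the basis [v, T(v), ..., T^(k-1)(v)] of Z(v, T) and the
    companion matrix C(m_v) of the T-annihilator of v (Theorem 10.12)."""
    basis, w = [v], T * v
    # grow the chain while the next power is independent of the previous ones
    while sp.Matrix.hstack(*basis, w).rank() > len(basis):
        basis.append(w)
        w = T * w
    k = len(basis)
    # T^k(v) = c_0 v + ... + c_(k-1) T^(k-1)(v); the c_i are the entries -a_i
    coeffs, _ = sp.Matrix.hstack(*basis).gauss_jordan_solve(w)
    C = sp.zeros(k, k)
    for i in range(1, k):
        C[i, i - 1] = 1            # 1's just below the diagonal
    C[:, k - 1] = coeffs           # last column holds -a_0, ..., -a_(k-1)
    return basis, C

# Companion matrix of f(t) = t^3 - 8t^2 + 5t + 7 (Problem 9.32), with v = e_1:
T = sp.Matrix([[0, 0, -7], [1, 0, -5], [0, 1, 8]])
v = sp.Matrix([1, 0, 0])
basis, C = cyclic_data(T, v)
print(len(basis))   # 3, so Z(v, T) is all of K^3
print(C)            # reproduces T itself, i.e., the companion matrix of f(t)
```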
  • 338. LEMMA 10.13: Let T:V ! V be a linear operator whose minimal polynomial is f ðtÞn , where f ðtÞ is a monic irreducible polynomial. Then V is the direct sum V ¼ Zðv1; TÞ Zðvr; TÞ of T-cyclic subspaces Zðvi; TÞ with corresponding T-annihilators f ðtÞn1 ; f ðtÞn2 ; . . . ; f ðtÞnr ; n ¼ n1 n2 . . . nr Any other decomposition of V into T-cyclic subspaces has the same number of components and the same set of T-annihilators. We emphasize that the above lemma (proved in Problem 10.31) does not say that the vectors vi or other T-cyclic subspaces Zðvi; TÞ are uniquely determined by T, but it does say that the set of T-annihilators is uniquely determined by T. Thus, T has a unique block diagonal matrix representation: M ¼ diagðC1; C2; . . . ; CrÞ where the Ci are companion matrices. In fact, the Ci are the companion matrices of the polynomials f ðtÞni . Using the Primary Decomposition Theorem and Lemma 10.13, we obtain the following result. THEOREM 10.14: Let T:V ! V be a linear operator with minimal polynomial mðtÞ ¼ f1ðtÞm1 f2ðtÞm2 fsðtÞms where the fiðtÞ are distinct monic irreducible polynomials. Then T has a unique block diagonal matrix representation: M ¼ diagðC11; C12; . . . ; C1r1 ; . . . ; Cs1; Cs2; . . . ; Csrs Þ where the Cij are companion matrices. In particular, the Cij are the companion matrices of the polynomials fiðtÞnij , where m1 ¼ n11 n12 n1r1 ; . . . ; ms ¼ ns1 ns2 nsrs The above matrix representation of T is called its rational canonical form. The polynomials fiðtÞnij are called the elementary divisors of T. EXAMPLE 10.6 Let V be a vector space of dimension 8 over the rational field Q, and let T be a linear operator on V whose minimal polynomial is mðtÞ ¼ f1ðtÞf2ðtÞ2 ¼ ðt4 4t3 þ 6t2 4t 7Þðt 3Þ2 Thus, because dim V ¼ 8; the characteristic polynomial DðtÞ ¼ f1ðtÞ f2ðtÞ4 : Also, the rational canonical form M of T must have one block the companion matrix of f1ðtÞ and one block the companion matrix of f2ðtÞ2 . There are two possibilities: (a) diag½Cðt4 4t3 þ 6t2 4t 7Þ, Cððt 3Þ2 Þ, Cððt 3Þ2 Þ (b) diag½Cðt4 4t3 þ 6t2 4t 7Þ, Cððt 3Þ2 Þ, Cðt 3Þ; Cðt 3Þ That is, (a) diag 0 0 0 7 1 0 0 4 0 1 0 6 0 0 1 4 2 6 6 4 3 7 7 5; 0 9 1 6 ; 0 9 1 6 0 B B @ 1 C C A; (b) diag 0 0 0 7 1 0 0 4 0 1 0 6 0 0 1 4 2 6 6 4 3 7 7 5; 0 9 1 6 ; ½3; ½3 0 B B @ 1 C C A 10.10 Quotient Spaces Let V be a vector space over a field K and let W be a subspace of V. If v is any vector in V, we write v þ W for the set of sums v þ w with w 2 W; that is, v þ W ¼ fv þ w : w 2 Wg CHAPTER 10 Canonical Forms 331
  • 339. These sets are called the cosets of W in V. We show (Problem 10.22) that these cosets partition V into mutually disjoint subsets. EXAMPLE 10.7 Let W be the subspace of R2 defined by W ¼ fða; bÞ : a ¼ bg; that is, W is the line given by the equation x y ¼ 0. We can view v þ W as a translation of the line obtained by adding the vector v to each point in W. As shown in Fig. 10-2, the coset v þ W is also a line, and it is parallel to W. Thus, the cosets of W in R2 are precisely all the lines parallel to W. In the following theorem, we use the cosets of a subspace W of a vector space V to define a new vector space; it is called the quotient space of V by W and is denoted by V=W. THEOREM 10.15: Let W be a subspace of a vector space over a field K. Then the cosets of W in V form a vector space over K with the following operations of addition and scalar multiplication: ðiÞ ðu þ wÞ þ ðv þ WÞ ¼ ðu þ vÞ þ W; ðiiÞ kðu þ WÞ ¼ ku þ W; where k 2 K We note that, in the proof of Theorem 10.15 (Problem 10.24), it is first necessary to show that the operations are well defined; that is, whenever u þ W ¼ u0 þ W and v þ W ¼ v0 þ W, then ðiÞ ðu þ vÞ þ W ¼ ðu0 þ v0 Þ þ W and ðiiÞ ku þ W ¼ ku0 þ W for any k 2 K In the case of an invariant subspace, we have the following useful result (proved in Problem 10.27). THEOREM 10.16: Suppose W is a subspace invariant under a linear operator T:V ! V. Then T induces a linear operator T on V=W defined by Tðv þ WÞ ¼ TðvÞ þ W. Moreover, if T is a zero of any polynomial, then so is T. Thus, the minimal polynomial of T divides the minimal polynomial of T. SOLVED PROBLEMS Invariant Subspaces 10.1. Suppose T:V ! V is linear. Show that each of the following is invariant under T: (a) f0g, (b) V, (c) kernel of T, (d) image of T. (a) We have Tð0Þ ¼ 0 2 f0g; hence, f0g is invariant under T. (b) For every v 2 V , TðvÞ 2 V; hence, V is invariant under T. (c) Let u 2 Ker T. Then TðuÞ ¼ 0 2 Ker T because the kernel of T is a subspace of V. Thus, Ker T is invariant under T. (d) Because TðvÞ 2 Im T for every v 2 V, it is certainly true when v 2 Im T. Hence, the image of T is invariant under T. 10.2. Suppose fWig is a collection of T-invariant subspaces of a vector space V. Show that the intersection W ¼ T i Wi is also T-invariant. Suppose v 2 W; then v 2 Wi for every i. Because Wi is T-invariant, TðvÞ 2 Wi for every i. Thus, TðvÞ 2 W and so W is T-invariant. Figure 10-2 332 CHAPTER 10 Canonical Forms
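To make Example 10.7 and the well-definedness step in Theorem 10.15 concrete, here is a tiny Python illustration of our own (not part of the text): for W = {(a, a)} in R², the quantity x − y is constant on each coset, so it labels the cosets, and the coset operations give the same answer no matter which representatives are chosen.

```python
def coset_label(v):
    """x - y is constant on each coset v + W of W = {(a, a)}, so it labels the coset."""
    x, y = v
    return x - y

u, u2 = (3.0, 1.0), (5.0, 3.0)      # u2 = u + (2, 2), so u + W = u2 + W
v, v2 = (0.0, 4.0), (-1.0, 3.0)     # likewise v + W = v2 + W
assert coset_label(u) == coset_label(u2) and coset_label(v) == coset_label(v2)

# (u + W) + (v + W) = (u + v) + W is the same coset whichever representatives we add:
s1 = (u[0] + v[0], u[1] + v[1])
s2 = (u2[0] + v2[0], u2[1] + v2[1])
print(coset_label(s1) == coset_label(s2))   # True

# k(u + W) = ku + W is likewise independent of the representative:
k = 2.5
print(coset_label((k * u[0], k * u[1])) == coset_label((k * u2[0], k * u2[1])))  # True
```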
  • 340. 10.3. Prove Theorem 10.2: Let T:V ! V be linear. For any polynomial f ðtÞ, the kernel of f ðTÞ is invariant under T. Suppose v 2 Ker f ðTÞ—that is, f ðTÞðvÞ ¼ 0. We need to show that TðvÞ also belongs to the kernel of f ðTÞ—that is, f ðTÞðTðvÞÞ ¼ ð f ðTÞ TÞðvÞ ¼ 0. Because f ðtÞt ¼ tf ðtÞ, we have f ðTÞ T ¼ T f ðTÞ. Thus, as required, ð f ðTÞ TÞðvÞ ¼ ðT f ðTÞÞðvÞ ¼ Tð f ðTÞðvÞÞ ¼ Tð0Þ ¼ 0 10.4. Find all invariant subspaces of A ¼ 2 5 1 2 viewed as an operator on R2 . By Problem 10.1, R2 and f0g are invariant under A. Now if A has any other invariant subspace, it must be one-dimensional. However, the characteristic polynomial of A is DðtÞ ¼ t2 trðAÞ t þ jAj ¼ t2 þ 1 Hence, A has no eigenvalues (in R) and so A has no eigenvectors. But the one-dimensional invariant subspaces correspond to the eigenvectors; thus, R2 and f0g are the only subspaces invariant under A. 10.5. Prove Theorem 10.3: Suppose W is T-invariant. Then T has a triangular block representation A B 0 C , where A is the matrix representation of the restriction ^ T of T to W. We choose a basis fw1; . . . ; wrg of W and extend it to a basis fw1; . . . ; wr; v1; . . . ; vsg of V. We have ^ Tðw1Þ ¼ Tðw1Þ ¼ a11w1 þ þ a1rwr ^ Tðw2Þ ¼ Tðw2Þ ¼ a21w1 þ þ a2rwr :::::::::::::::::::::::::::::::::::::::::::::::::::::::::: ^ TðwrÞ ¼ TðwrÞ ¼ ar1w1 þ þ arrwr Tðv1Þ ¼ b11w1 þ þ b1rwr þ c11v1 þ þ c1svs Tðv2Þ ¼ b21w1 þ þ b2rwr þ c21v1 þ þ c2svs :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: TðvsÞ ¼ bs1w1 þ þ bsrwr þ cs1v1 þ þ cssvs But the matrix of T in this basis is the transpose of the matrix of coefficients in the above system of equations (Section 6.2). Therefore, it has the form A B 0 C , where A is the transpose of the matrix of coefficients for the obvious subsystem. By the same argument, A is the matrix of ^ T relative to the basis fwig of W. 10.6. Let ^ T denote the restriction of an operator T to an invariant subspace W. Prove (a) For any polynomial f ðtÞ, f ð ^ TÞðwÞ ¼ f ðTÞðwÞ. (b) The minimal polynomial of ^ T divides the minimal polynomial of T. (a) If f ðtÞ ¼ 0 or if f ðtÞ is a constant (i.e., of degree 1), then the result clearly holds. Assume deg f ¼ n 1 and that the result holds for polynomials of degree less than n. Suppose that f ðtÞ ¼ antn þ an1tn1 þ þ a1t þ a0 f ð ^ TÞðwÞ ¼ ðan ^ Tn þ an1 ^ Tn1 þ þ a0IÞðwÞ ¼ ðan ^ Tn1 Þð ^ TðwÞÞ þ ðan1 ^ Tn1 þ þ a0IÞðwÞ ¼ ðanTn1 ÞðTðwÞÞ þ ðan1Tn1 þ þ a0IÞðwÞ ¼ f ðTÞðwÞ Then (b) Let mðtÞ denote the minimal polynomial of T. Then by (a), mð ^ TÞðwÞ ¼ mðTÞðwÞ ¼ 0ðwÞ ¼ 0 for every w 2 W; that is, ^ T is a zero of the polynomial mðtÞ. Hence, the minimal polynomial of ^ T divides mðtÞ. CHAPTER 10 Canonical Forms 333
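The conclusion of Problem 10.4 rests on A having no real eigenvalues. A short NumPy check (an aside, not part of the solution) confirms that Δ(t) = t² + 1 and that the eigenvalues are ±i, so A has no eigenvectors in R² and hence no invariant subspaces other than R² and {0}.

```python
import numpy as np

A = np.array([[2.0, -5.0],
              [1.0, -2.0]])             # the matrix of Problem 10.4

print(np.round(np.poly(A), 10))         # [1. 0. 1.] -> characteristic polynomial t^2 + 1
print(np.linalg.eigvals(A))             # approximately [0.+1.j  0.-1.j]: the eigenvalues are +-i,
                                        # so there are no real eigenvectors and no one-dimensional
                                        # A-invariant subspaces of R^2
```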
  • 341. Invariant Direct-Sum Decompositions 10.7. Prove Theorem 10.4: Suppose W1; W2; . . . ; Wr are subspaces of V with respective bases B1 ¼ fw11; w12; . . . ; w1n1 g; . . . ; Br ¼ fwr1; wr2; . . . ; wrnr g Then V is the direct sum of the Wi if and only if the union B ¼ S i Bi is a basis of V. Suppose B is a basis of V. Then, for any v 2 V, v ¼ a11w11 þ þ a1n1 w1n1 þ þ ar1wr1 þ þ arnr wrnr ¼ w1 þ w2 þ þ wr where wi ¼ ai1wi1 þ þ aini wini 2 Wi. We next show that such a sum is unique. Suppose v ¼ w0 1 þ w0 2 þ þ w0 r; where w0 i 2 Wi Because fwi1; . . . ; wini g is a basis of Wi, w0 i ¼ bi1wi1 þ þ bini wini , and so v ¼ b11w11 þ þ b1n1 w1n1 þ þ br1wr1 þ þ brnr wrnr Because B is a basis of V; aij ¼ bij, for each i and each j. Hence, wi ¼ w0 i, and so the sum for v is unique. Accordingly, V is the direct sum of the Wi. Conversely, suppose V is the direct sum of the Wi. Then for any v 2 V, v ¼ w1 þ þ wr, where wi 2 Wi. Because fwiji g is a basis of Wi, each wi is a linear combination of the wiji , and so v is a linear combination of the elements of B. Thus, B spans V. We now show that B is linearly independent. Suppose a11w11 þ þ a1n1 w1n1 þ þ ar1wr1 þ þ arnr wrnr ¼ 0 Note that ai1wi1 þ þ aini wini 2 Wi. We also have that 0 ¼ 0 þ 0 0 2 Wi. Because such a sum for 0 is unique, ai1wi1 þ þ aini wini ¼ 0 for i ¼ 1; . . . ; r The independence of the bases fwiji g implies that all the a’s are 0. Thus, B is linearly independent and is a basis of V. 10.8. Suppose T:V ! V is linear and suppose T ¼ T1 T2 with respect to a T-invariant direct-sum decomposition V ¼ U W. Show that (a) mðtÞ is the least common multiple of m1ðtÞ and m2ðtÞ, where mðtÞ, m1ðtÞ, m2ðtÞ are the minimum polynomials of T; T1; T2, respectively. (b) DðtÞ ¼ D1ðtÞD2ðtÞ, where DðtÞ; D1ðtÞ, D2ðtÞ are the characteristic polynomials of T; T1; T2, respectively. (a) By Problem 10.6, each of m1ðtÞ and m2ðtÞ divides mðtÞ. Now suppose f ðtÞ is a multiple of both m1ðtÞ and m2ðtÞ, then f ðT1ÞðUÞ ¼ 0 and f ðT2ÞðWÞ ¼ 0. Let v 2 V, then v ¼ u þ w with u 2 U and w 2 W. Now f ðTÞv ¼ f ðTÞu þ f ðTÞw ¼ f ðT1Þu þ f ðT2Þw ¼ 0 þ 0 ¼ 0 That is, T is a zero of f ðtÞ. Hence, mðtÞ divides f ðtÞ, and so mðtÞ is the least common multiple of m1ðtÞ and m2ðtÞ. (b) By Theorem 10.5, T has a matrix representation M ¼ A 0 0 B , where A and B are matrix representations of T1 and T2, respectively. Then, as required, DðtÞ ¼ jtI Mj ¼ tI A 0 0 tI B ¼ jtI AjjtI Bj ¼ D1ðtÞD2ðtÞ 10.9. Prove Theorem 10.7: Suppose T:V ! V is linear, and suppose f ðtÞ ¼ gðtÞhðtÞ are polynomials such that f ðTÞ ¼ 0 and gðtÞ and hðtÞ are relatively prime. Then V is the direct sum of the T-invariant subspaces U and W where U ¼ Ker gðTÞ and W ¼ Ker hðTÞ. 334 CHAPTER 10 Canonical Forms
  • 342. Note first that U and W are T-invariant by Theorem 10.2. Now, because gðtÞ and hðtÞ are relatively prime, there exist polynomials rðtÞ and sðtÞ such that rðtÞgðtÞ þ sðtÞhðtÞ ¼ 1 Hence; for the operator T; rðTÞgðTÞ þ sðTÞhðTÞ ¼ I ð*Þ Let v 2 V; then; by ð*Þ; v ¼ rðTÞgðTÞv þ sðTÞhðTÞv But the first term in this sum belongs to W ¼ Ker hðTÞ, because hðTÞrðTÞgðTÞv ¼ rðTÞgðTÞhðTÞv ¼ rðTÞf ðTÞv ¼ rðTÞ0v ¼ 0 Similarly, the second term belongs to U. Hence, V is the sum of U and W. To prove that V ¼ U W, we must show that a sum v ¼ u þ w with u 2 U, w 2 W, is uniquely determined by v. Applying the operator rðTÞgðTÞ to v ¼ u þ w and using gðTÞu ¼ 0, we obtain rðTÞgðTÞv ¼ rðTÞgðTÞu þ rðTÞgðTÞw ¼ rðTÞgðTÞw Also, applying ð*Þ to w alone and using hðTÞw ¼ 0, we obtain w ¼ rðTÞgðTÞw þ sðTÞhðTÞw ¼ rðTÞgðTÞw Both of the above formulas give us w ¼ rðTÞgðTÞv, and so w is uniquely determined by v. Similarly u is uniquely determined by v. Hence, V ¼ U W, as required. 10.10. Prove Theorem 10.8: In Theorem 10.7 (Problem 10.9), if f ðtÞ is the minimal polynomial of T (and gðtÞ and hðtÞ are monic), then gðtÞ is the minimal polynomial of the restriction T1 of T to U and hðtÞ is the minimal polynomial of the restriction T2 of T to W. Let m1ðtÞ and m2ðtÞ be the minimal polynomials of T1 and T2, respectively. Note that gðT1Þ ¼ 0 and hðT2Þ ¼ 0 because U ¼ Ker gðTÞ and W ¼ Ker hðTÞ. Thus, m1ðtÞ divides gðtÞ and m2ðtÞ divides hðtÞ ð1Þ By Problem 10.9, f ðtÞ is the least common multiple of m1ðtÞ and m2ðtÞ. But m1ðtÞ and m2ðtÞ are relatively prime because gðtÞ and hðtÞ are relatively prime. Accordingly, f ðtÞ ¼ m1ðtÞm2ðtÞ. We also have that f ðtÞ ¼ gðtÞhðtÞ. These two equations together with (1) and the fact that all the polynomials are monic imply that gðtÞ ¼ m1ðtÞ and hðtÞ ¼ m2ðtÞ, as required. 10.11. Prove the Primary Decomposition Theorem 10.6: Let T:V ! V be a linear operator with minimal polynomial mðtÞ ¼ f1ðtÞn1 f2ðtÞn2 . . . frðtÞnr where the fiðtÞ are distinct monic irreducible polynomials. Then V is the direct sum of T- invariant subspaces W1; . . . ; Wr where Wi is the kernel of fiðTÞni . Moreover, fiðtÞni is the minimal polynomial of the restriction of T to Wi. The proof is by induction on r. The case r ¼ 1 is trivial. Suppose that the theorem has been proved for r 1. By Theorem 10.7, we can write V as the direct sum of T-invariant subspaces W1 and V1, where W1 is the kernel of f1ðTÞn1 and where V1 is the kernel of f2ðTÞn2 frðTÞnr . By Theorem 10.8, the minimal polynomials of the restrictions of T to W1 and V1 are f1ðtÞn1 and f2ðtÞn2 frðtÞnr , respectively. Denote the restriction of T to V1 by ^ T1. By the inductive hypothesis, V1 is the direct sum of subspaces W2; . . . ; Wr such that Wi is the kernel of fiðT1Þni and such that fiðtÞni is the minimal polynomial for the restriction of ^ T1 to Wi. But the kernel of fiðTÞni , for i ¼ 2; . . . ; r is necessarily contained in V1, because fiðtÞni divides f2ðtÞn2 frðtÞnr . Thus, the kernel of fiðTÞni is the same as the kernel of fiðT1Þni , which is Wi. Also, the restriction of T to Wi is the same as the restriction of ^ T1 to Wi (for i ¼ 2; . . . ; r); hence, fiðtÞni is also the minimal polynomial for the restriction of T to Wi. Thus, V ¼ W1 W2 Wr is the desired decomposition of T. 10.12. Prove Theorem 10.9: A linear operator T:V ! V has a diagonal matrix representation if and only if its minimal polynomal mðtÞ is a product of distinct linear polynomials. CHAPTER 10 Canonical Forms 335
  • 343. Suppose mðtÞ is a product of distinct linear polynomials, say, mðtÞ ¼ ðt l1Þðt l2Þ ðt lrÞ where the li are distinct scalars. By the Primary Decomposition Theorem, V is the direct sum of subspaces W1; . . . ; Wr, where Wi ¼ KerðT liIÞ. Thus, if v 2 Wi, then ðT liIÞðvÞ ¼ 0 or TðvÞ ¼ liv. In other words, every vector in Wi is an eigenvector belonging to the eigenvalue li. By Theorem 10.4, the union of bases for W1; . . . ; Wr is a basis of V. This basis consists of eigenvectors, and so T is diagonalizable. Conversely, suppose T is diagonalizable (i.e., V has a basis consisting of eigenvectors of T ). Let l1; . . . ; ls be the distinct eigenvalues of T. Then the operator f ðTÞ ¼ ðT l1IÞðT l2IÞ ðT lsIÞ maps each basis vector into 0. Thus, f ðTÞ ¼ 0, and hence, the minimal polynomial mðtÞ of T divides the polynomial f ðtÞ ¼ ðt l1Þðt l2Þ ðt lsIÞ Accordingly, mðtÞ is a product of distinct linear polynomials. Nilpotent Operators, Jordan Canonical Form 10.13. Let T:V be linear. Suppose, for v 2 V, Tk ðvÞ ¼ 0 but Tk1 ðvÞ 6¼ 0. Prove (a) The set S ¼ fv; TðvÞ; . . . ; Tk1 ðvÞg is linearly independent. (b) The subspace W generated by S is T-invariant. (c) The restriction ^ T of T to W is nilpotent of index k. (d) Relative to the basis fTk1 ðvÞ; . . . ; TðvÞ; vg of W, the matrix of T is the k-square Jordan nilpotent block Nk of index k (see Example 10.5). (a) Suppose av þ a1TðvÞ þ a2T2 ðvÞ þ þ ak1Tk1 ðvÞ ¼ 0 ð*Þ Applying Tk1 to ð*Þ and using Tk ðvÞ ¼ 0, we obtain aTk1 ðvÞ ¼ 0; because Tk1 ðvÞ 6¼ 0, a ¼ 0. Now applying Tk2 to ð*Þ and using Tk ðvÞ ¼ 0 and a ¼ 0, we fiind a1Tk1 ðvÞ ¼ 0; hence, a1 ¼ 0. Next applying Tk3 to ð*Þ and using Tk ðvÞ ¼ 0 and a ¼ a1 ¼ 0, we obtain a2Tk1 ðvÞ ¼ 0; hence, a2 ¼ 0. Continuing this process, we find that all the a’s are 0; hence, S is independent. (b) Let v 2 W. Then v ¼ bv þ b1TðvÞ þ b2T2 ðvÞ þ þ bk1Tk1 ðvÞ Using Tk ðvÞ ¼ 0, we have TðvÞ ¼ bTðvÞ þ b1T2 ðvÞ þ þ bk2Tk1 ðvÞ 2 W Thus, W is T-invariant. (c) By hypothesis, Tk ðvÞ ¼ 0. Hence, for i ¼ 0; . . . ; k 1, ^ Tk ðTi ðvÞÞ ¼ Tkþi ðvÞ ¼ 0 That is, applying ^ Tk to each generator of W, we obtain 0; hence, ^ Tk ¼ 0 and so ^ T is nilpotent of index at most k. On the other hand, ^ Tk1 ðvÞ ¼ Tk1 ðvÞ 6¼ 0; hence, T is nilpotent of index exactly k. (d) For the basis fTk1 ðvÞ, Tk2 ðvÞ; . . . ; TðvÞ; vg of W, ^ TðTk1 ðvÞÞ ¼ Tk ðvÞ ¼ 0 ^ TðTk2 ðvÞÞ ¼ Tk1 ðvÞ ^ TðTk3 ðvÞÞ ¼ Tk2 ðvÞ :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: ^ TðTðvÞÞ ¼ T2 ðvÞ ^ TðvÞ ¼ TðvÞ Hence, as required, the matrix of T in this basis is the k-square Jordan nilpotent block Nk. 336 CHAPTER 10 Canonical Forms
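The change of basis in Problem 10.13(d) can be checked on a concrete example: take a nilpotent T and a vector v with Tᵏ⁻¹(v) ≠ 0, place the chain Tᵏ⁻¹(v), ..., T(v), v as the columns of P, and P⁻¹TP comes out as the Jordan nilpotent block Nₖ. The matrix and vector below are a small example of our own, worked in SymPy so the check is exact.

```python
import sympy as sp

T = sp.Matrix([[0, 1, 1],
               [0, 0, 1],
               [0, 0, 0]])              # nilpotent of index 3: T**3 == 0 but T**2 != 0
v = sp.Matrix([0, 0, 1])                # T**2 * v != 0, so the chain has length 3

# basis of Problem 10.13(d): {T^2(v), T(v), v}, written as the columns of P
P = sp.Matrix.hstack(T**2 * v, T * v, v)

N3 = sp.Matrix([[0, 1, 0],
                [0, 0, 1],
                [0, 0, 0]])             # the 3x3 Jordan nilpotent block

print(T**3 == sp.zeros(3, 3), T**2 != sp.zeros(3, 3))   # True True
print(P.inv() * T * P == N3)                            # True
```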
  • 344. 10.14. Let T:V ! V be linear. Let U ¼ Ker Ti and W ¼ Ker Tiþ1 . Show that (a) U W, (b) TðWÞ U. (a) Suppose u 2 U ¼ Ker Ti . Then Ti ðuÞ ¼ 0 and so Tiþ1 ðuÞ ¼ TðTi ðuÞÞ ¼ Tð0Þ ¼ 0. Thus, u 2 Ker Tiþ1 ¼ W. But this is true for every u 2 U; hence, U W. (b) Similarly, if w 2 W ¼ Ker Tiþ1 , then Tiþ1 ðwÞ ¼ 0: Thus, Tiþ1 ðwÞ ¼ Ti ðTðwÞÞ ¼ Ti ð0Þ ¼ 0 and so TðWÞ U. 10.15. Let T:V be linear. Let X ¼ Ker Ti2 , Y ¼ Ker Ti1 , Z ¼ Ker Ti . Therefore (Problem 10.14), X Y Z. Suppose fu1; . . . ; urg; fu1; . . . ; ur; v1; . . . ; vsg; fu1; . . . ; ur; v1; . . . ; vs; w1; . . . ; wtg are bases of X; Y; Z, respectively. Show that S ¼ fu1; . . . ; ur; Tðw1Þ; . . . ; TðwtÞg is contained in Y and is linearly independent. By Problem 10.14, TðZÞ Y, and hence S Y. Now suppose S is linearly dependent. Then there exists a relation a1u1 þ þ arur þ b1Tðw1Þ þ þ btTðwtÞ ¼ 0 where at least one coefficient is not zero. Furthermore, because fuig is independent, at least one of the bk must be nonzero. Transposing, we find b1Tðw1Þ þ þ btTðwtÞ ¼ a1u1 arur 2 X ¼ Ker Ti2 Hence; Ti2 ðb1Tðw1Þ þ þ btTðwtÞÞ ¼ 0 Thus; Ti1 ðb1w1 þ þ btwtÞ ¼ 0; and so b1w1 þ þ btwt 2 Y ¼ Ker Ti1 Because fui; vjg generates Y, we obtain a relation among the ui, vj, wk where one of the coefficients (i.e., one of the bk) is not zero. This contradicts the fact that fui; vj; wkg is independent. Hence, S must also be independent. 10.16. Prove Theorem 10.10: Let T:V ! V be a nilpotent operator of index k. Then T has a unique block diagonal matrix representation consisting of Jordan nilpotent blocks N. There is at least one N of order k, and all other N are of orders k. The total number of N of all orders is equal to the nullity of T. Suppose dim V ¼ n. Let W1 ¼ Ker T, W2 ¼ Ker T2 ; . . . ; Wk ¼ Ker Tk . Let us set mi ¼ dim Wi, for i ¼ 1; . . . ; k. Because T is of index k, Wk ¼ V and Wk1 6¼ V and so mk1 mk ¼ n. By Problem 10.14, W1 W2 Wk ¼ V Thus, by induction, we can choose a basis fu1; . . . ; ung of V such that fu1; . . . ; umi g is a basis of Wi. We now choose a new basis for V with respect to which T has the desired form. It will be convenient to label the members of this new basis by pairs of indices. We begin by setting vð1; kÞ ¼ umk1þ1; vð2; kÞ ¼ umk1þ2; . . . ; vðmk mk1; kÞ ¼ umk and setting vð1; k 1Þ ¼ Tvð1; kÞ; vð2; k 1Þ ¼ Tvð2; kÞ; . . . ; vðmk mk1; k 1Þ ¼ Tvðmk mk1; kÞ By the preceding problem, S1 ¼ fu1 . . . ; umk2 ; vð1; k 1Þ; . . . ; vðmk mk1; k 1Þg is a linearly independent subset of Wk1. We extend S1 to a basis of Wk1 by adjoining new elements (if necessary), which we denote by vðmk mk1 þ 1; k 1Þ; vðmk mk1 þ 2; k 1Þ; . . . ; vðmk1 mk2; k 1Þ Next we set vð1; k 2Þ ¼ Tvð1; k 1Þ; vð2; k 2Þ ¼ Tvð2; k 1Þ; . . . ; vðmk1 mk2; k 2Þ ¼ Tvðmk1 mk2; k 1Þ CHAPTER 10 Canonical Forms 337
  • 345. Again by the preceding problem, S2 ¼ fu1; . . . ; umks ; vð1; k 2Þ; . . . ; vðmk1 mk2; k 2Þg is a linearly independent subset of Wk2, which we can extend to a basis of Wk2 by adjoining elements vðmk1 mk2 þ 1; k 2Þ; vðmk1 mk2 þ 2; k 2Þ; . . . ; vðmk2 mk3; k 2Þ Continuing in this manner, we get a new basis for V, which for convenient reference we arrange as follows: vð1; kÞ . . . ; vðmk mk1; kÞ vð1; k 1Þ; . . . ; vðmk mk1; k 1Þ . . . ; vðmk1 mk2; k 1Þ ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: vð1; 2Þ; . . . ; vðmk mk1; 2Þ; . . . ; vðmk1 mk2; 2Þ; . . . ; vðm2 m1; 2Þ vð1; 1Þ; . . . ; vðmk mk1; 1Þ; . . . ; vðmk1 mk2; 1Þ; . . . ; vðm2 m1; 1Þ; . . . ; vðm1; 1Þ The bottom row forms a basis of W1, the bottom two rows form a basis of W2, and so forth. But what is important for us is that T maps each vector into the vector immediately below it in the table or into 0 if the vector is in the bottom row. That is, Tvði; jÞ ¼ vði; j 1Þ for j 1 0 for j ¼ 1 Now it is clear [see Problem 10.13(d)] that T will have the desired form if the vði; jÞ are ordered lexicographically: beginning with vð1; 1Þ and moving up the first column to vð1; kÞ, then jumping to vð2; 1Þ and moving up the second column as far as possible. Moreover, there will be exactly mk mk1 diagonal entries of order k: Also, there will be ðmk1 mk2Þ ðmk mk1Þ ¼ 2mk1 mk mk2 diagonal entries of order k 1 ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: 2m2 m1 m3 diagonal entries of order 2 2m1 m2 diagonal entries of order 1 as can be read off directly from the table. In particular, because the numbers m1; . . . ; mk are uniquely determined by T, the number of diagonal entries of each order is uniquely determined by T. Finally, the identity m1 ¼ ðmk mk1Þ þ ð2mk1 mk mk2Þ þ þ ð2m2 m1 m3Þ þ ð2m1 m2Þ shows that the nullity m1 of T is the total number of diagonal entries of T. 10.17. Let A ¼ 0 1 1 0 1 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 6 6 6 6 4 3 7 7 7 7 5 and B ¼ 0 1 1 0 0 0 0 1 1 1 0 0 0 1 1 0 0 0 0 0 0 0 0 0 0 2 6 6 6 6 4 3 7 7 7 7 5 . The reader can verify that A and B are both nilpotent of index 3; that is, A3 ¼ 0 but A2 6¼ 0, and B3 ¼ 0 but B2 6¼ 0. Find the nilpotent matrices MA and MB in canonical form that are similar to A and B, respectively. Because A and B are nilpotent of index 3, MA and MB must each contain a Jordan nilpotent block of order 3, and none greater then 3. Note that rankðAÞ ¼ 2 and rankðBÞ ¼ 3, so nullityðAÞ ¼ 5 2 ¼ 3 and nullityðBÞ ¼ 5 3 ¼ 2. Thus, MA must contain three diagonal blocks, which must be one of order 3 and two of order 1; and MB must contain two diagonal blocks, which must be one of order 3 and one of order 2. Namely, MA ¼ 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 6 6 6 6 4 3 7 7 7 7 5 and MB ¼ 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 2 6 6 6 6 4 3 7 7 7 7 5 338 CHAPTER 10 Canonical Forms
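The block counts in Problem 10.17 follow from the nullities mᵢ of the powers, using the formula from Problem 10.16 that the number of blocks of order i is 2mᵢ − mᵢ₊₁ − mᵢ₋₁. The sketch below applies that bookkeeping to the matrix A above, with its entries transcribed as well as the garbled source allows; applying the same function to B would give the block sizes of M_B.

```python
import numpy as np

def nilpotent_block_sizes(T):
    """Sizes of the Jordan nilpotent blocks of a nilpotent matrix T,
    computed from the nullities m_i of T^i as in Problem 10.16."""
    n = T.shape[0]
    m, k = [0], 0                         # m_0 = nullity of T^0 = 0
    while m[-1] < n:                      # stop once T^k = 0 (nullity n)
        k += 1
        rank = np.linalg.matrix_rank(np.linalg.matrix_power(T, k))
        m.append(n - rank)
    m.append(n)                           # m_(k+1) = n as well
    counts = {i: 2 * m[i] - m[i + 1] - m[i - 1] for i in range(1, k + 1)}
    return sorted((size for size, c in counts.items() for _ in range(c)), reverse=True)

# Matrix A of Problem 10.17 (nilpotent of index 3, rank 2)
A = np.array([[0, 1, 1, 0, 1],
              [0, 0, 1, 1, 1],
              [0, 0, 0, 0, 0],
              [0, 0, 0, 0, 0],
              [0, 0, 0, 0, 0]], dtype=float)

print(nilpotent_block_sizes(A))   # [3, 1, 1]  ->  M_A = diag(N_3, N_1, N_1)
```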
  • 346. 10.18. Prove Theorem 10.11 on the Jordan canonical form for an operator T. By the primary decomposition theorem, T is decomposable into operators T1; . . . ; Tr; that is, T ¼ T1 Tr, where ðt liÞmi is the minimal polynomial of Ti. Thus, in particular, ðT1 l1IÞm1 ¼ 0; . . . ; ðTr lrIÞmr ¼ 0 Set Ni ¼ Ti liI. Then, for i ¼ 1; . . . ; r, Ti ¼ Ni þ liI; where Nmi i ¼ 0 That is, Ti is the sum of the scalar operator liI and a nilpotent operator Ni, which is of index mi because ðt liÞm i is the minimal polynomial of Ti. Now, by Theorem 10.10 on nilpotent operators, we can choose a basis so that Ni is in canonical form. In this basis, Ti ¼ Ni þ liI is represented by a block diagonal matrix Mi whose diagonal entries are the matrices Jij. The direct sum J of the matrices Mi is in Jordan canonical form and, by Theorem 10.5, is a matrix representation of T. Last, we must show that the blocks Jij satisfy the required properties. Property (i) follows from the fact that Ni is of index mi. Property (ii) is true because T and J have the same characteristic polynomial. Property (iii) is true because the nullity of Ni ¼ Ti liI is equal to the geometric multiplicity of the eigenvalue li. Property (iv) follows from the fact that the Ti and hence the Ni are uniquely determined by T. 10.19. Determine all possible Jordan canonical forms J for a linear operator T:V ! V whose characteristic polynomial DðtÞ ¼ ðt 2Þ5 and whose minimal polynomial mðtÞ ¼ ðt 2Þ2 . J must be a 5 5 matrix, because DðtÞ has degree 5, and all diagonal elements must be 2, because 2 is the only eigenvalue. Moreover, because the exponent of t 2 in mðtÞ is 2, J must have one Jordan block of order 2, and the others must be of order 2 or 1. Thus, there are only two possibilities: J ¼ diag 2 1 2 ; 2 1 2 ; ½2 or J ¼ diag 2 1 2 ; ½2; ½2; ½2 10.20. Determine all possible Jordan canonical forms for a linear operator T:V ! V whose character- istic polynomial DðtÞ ¼ ðt 2Þ3 ðt 5Þ2 . In each case, find the minimal polynomial mðtÞ. Because t 2 has exponent 3 in DðtÞ, 2 must appear three times on the diagonal. Similarly, 5 must appear twice. Thus, there are six possibilities: (a) diag 2 1 2 1 2 2 4 3 5; 5 1 5 0 @ 1 A, (b) diag 2 1 2 1 2 2 4 3 5; ½5; ½5 0 @ 1 A, (c) diag 2 1 2 ; ½2; 5 1 5 , (d) diag 2 1 2 ; ½2; ½5; ½5 , (e) diag ½2; ½2; ½2; 5 1 5 , (f ) diagð½2; ½2; ½2; ½5; ½5Þ The exponent in the minimal polynomial mðtÞ is equal to the size of the largest block. Thus, (a) mðtÞ ¼ ðt 2Þ3 ðt 5Þ2 , (b) mðtÞ ¼ ðt 2Þ3 ðt 5Þ, (c) mðtÞ ¼ ðt 2Þ2 ðt 5Þ2 , (d) mðtÞ ¼ ðt 2Þ2 ðt 5Þ, (e) mðtÞ ¼ ðt 2Þðt 5Þ2 , (f ) mðtÞ ¼ ðt 2Þðt 5Þ Quotient Space and Triangular Form 10.21. Let W be a subspace of a vector space V. Show that the following are equivalent: (i) u 2 v þ W, (ii) u v 2 W, (iii) v 2 u þ W. Suppose u 2 v þ W. Then there exists w0 2 W such that u ¼ v þ w0. Hence, u v ¼ w0 2 W. Conversely, suppose u v 2 W. Then u v ¼ w0 where w0 2 W. Hence, u ¼ v þ w0 2 v þ W. Thus, (i) and (ii) are equivalent. We also have u v 2 W iff ðu vÞ ¼ v u 2 W iffv 2 u þ W. Thus, (ii) and (iii) are also equivalent. CHAPTER 10 Canonical Forms 339
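For any of the candidate Jordan forms in Problems 10.19 and 10.20, the minimal polynomial can be confirmed by direct substitution, because the exponent of each factor equals the size of the largest block for that eigenvalue. The NumPy sketch below builds case (c) of Problem 10.20 (our own assembly of the block matrix) and checks that (J − 2I)²(J − 5I)² = 0 while dropping one factor of t − 2 fails.

```python
import numpy as np
from scipy.linalg import block_diag

# Case (c) of Problem 10.20: blocks J_2(2), [2], J_2(5); Delta(t) = (t-2)^3 (t-5)^2
J = block_diag(np.array([[2, 1], [0, 2]]),
               np.array([[2]]),
               np.array([[5, 1], [0, 5]])).astype(float)

I = np.eye(5)
m_of_J = np.linalg.matrix_power(J - 2 * I, 2) @ np.linalg.matrix_power(J - 5 * I, 2)
f_of_J = (J - 2 * I) @ np.linalg.matrix_power(J - 5 * I, 2)

print(np.allclose(m_of_J, 0))   # True:  m(t) = (t-2)^2 (t-5)^2 annihilates J
print(np.allclose(f_of_J, 0))   # False: (t-2)(t-5)^2 does not, so no smaller exponent works
```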
  • 347. 10.22. Prove the following: The cosets of W in V partition V into mutually disjoint sets. That is, (a) Any two cosets u þ W and v þ W are either identical or disjoint. (b) Each v 2 V belongs to a coset; in fact, v 2 v þ W. Furthermore, u þ W ¼ v þ W if and only if u v 2 W, and so ðv þ wÞ þ W ¼ v þ W for any w 2 W. Let v 2 V. Because 0 2 W, we have v ¼ v þ 0 2 v þ W, which proves (b). Now suppose the cosets u þ W and v þ W are not disjoint; say, the vector x belongs to both u þ W and v þ W. Then u x 2 W and x v 2 W. The proof of (a) is complete if we show that u þ W ¼ v þ W. Let u þ w0 be any element in the coset u þ W. Because u x, x v, w0 belongs to W, ðu þ w0Þ v ¼ ðu xÞ þ ðx vÞ þ w0 2 W Thus, u þ w0 2 v þ W, and hence the cost u þ W is contained in the coset v þ W. Similarly, v þ W is contained in u þ W, and so u þ W ¼ v þ W. The last statement follows from the fact that u þ W ¼ v þ W if and only if u 2 v þ W, and, by Problem 10.21, this is equivalent to u v 2 W. 10.23. Let W be the solution space of the homogeneous equation 2x þ 3y þ 4z ¼ 0. Describe the cosets of W in R3 . W is a plane through the origin O ¼ ð0; 0; 0Þ, and the cosets of W are the planes parallel to W. Equivalently, the cosets of W are the solution sets of the family of equations 2x þ 3y þ 4z ¼ k; k 2 R In fact, the coset v þ W, where v ¼ ða; b; cÞ, is the solution set of the linear equation 2x þ 3y þ 4z ¼ 2a þ 3b þ 4c or 2ðx aÞ þ 3ðy bÞ þ 4ðz cÞ ¼ 0 10.24. Suppose W is a subspace of a vector space V. Show that the operations in Theorem 10.15 are well defined; namely, show that if u þ W ¼ u0 þ W and v þ W ¼ v0 þ W, then ðaÞ ðu þ vÞ þ W ¼ ðu0 þ v0 Þ þ W and ðbÞ ku þ W ¼ ku0 þ W for any k 2 K (a) Because u þ W ¼ u0 þ W and v þ W ¼ v0 þ W, both u u0 and v v0 belong to W. But then ðu þ vÞ ðu0 þ v0 Þ ¼ ðu u0 Þ þ ðv v0 Þ 2 W. Hence, ðu þ vÞ þ W ¼ ðu0 þ v0 Þ þ W. (b) Also, because u u0 2 W implies kðu u0 Þ 2 W, then ku ku0 ¼ kðu u0 Þ 2 W; accordingly, ku þ W ¼ ku0 þ W. 10.25. Let V be a vector space and W a subspace of V. Show that the natural map Z: V ! V=W, defined by ZðvÞ ¼ v þ W, is linear. For any u; v 2 V and any k 2 K, we have nðu þ vÞ ¼ u þ v þ W ¼ u þ W þ v þ W ¼ ZðuÞ þ ZðvÞ and ZðkvÞ ¼ kv þ W ¼ kðv þ WÞ ¼ kZðvÞ Accordingly, Z is linear. 10.26. Let W be a subspace of a vector space V. Suppose fw1; . . . ; wrg is a basis of W and the set of cosets f v1; . . . ; vsg, where vj ¼ vj þ W, is a basis of the quotient space. Show that the set of vectors B ¼ fv1; . . . ; vs, w1; . . . ; wrg is a basis of V. Thus, dim V ¼ dim W þ dimðV=WÞ. Suppose u 2 V. Because f vjg is a basis of V=W, u ¼ u þ W ¼ a1 v1 þ a2 v2 þ þ as vs Hence, u ¼ a1v1 þ þ asvs þ w, where w 2 W. Since fwig is a basis of W, u ¼ a1v1 þ þ asvs þ b1w1 þ þ brwr 340 CHAPTER 10 Canonical Forms
  • 348. Accordingly, B spans V. We now show that B is linearly independent. Suppose c1v1 þ þ csvs þ d1w1 þ þ drwr ¼ 0 ð1Þ Then c1 v1 þ þ cs vs ¼ 0 ¼ W Because f vjg is independent, the c’s are all 0. Substituting into (1), we find d1w1 þ þ drwr ¼ 0. Because fwig is independent, the d’s are all 0. Thus, B is linearly independent and therefore a basis of V. 10.27. Prove Theorem 10.16: Suppose W is a subspace invariant under a linear operator T:V ! V. Then T induces a linear operator T on V=W defined by Tðv þ WÞ ¼ TðvÞ þ W. Moreover, if T is a zero of any polynomial, then so is T. Thus, the minimal polynomial of T divides the minimal polynomial of T. We first show that T is well defined; that is, if u þ W ¼ v þ W, then Tðu þ WÞ ¼ Tðv þ WÞ. If u þ W ¼ v þ W, then u v 2 W, and, as W is T-invariant, Tðu vÞ ¼ TðuÞ TðvÞ 2 W. Accordingly, Tðu þ WÞ ¼ TðuÞ þ W ¼ TðvÞ þ W ¼ Tðv þ WÞ as required. We next show that T is linear. We have Tððu þ WÞ þ ðv þ WÞÞ ¼ Tðu þ v þ WÞ ¼ Tðu þ vÞ þ W ¼ TðuÞ þ TðvÞ þ W ¼ TðuÞ þ W þ TðvÞ þ W ¼ Tðu þ WÞ þ Tðv þ WÞ Furthermore, Tðkðu þ WÞÞ ¼ Tðku þ WÞ ¼ TðkuÞ þ W ¼ kTðuÞ þ W ¼ kðTðuÞ þ WÞ ¼ k ^ Tðu þ WÞ Thus, T is linear. Now, for any coset u þ W in V=W, T2ðu þ WÞ ¼ T2 ðuÞ þ W ¼ TðTðuÞÞ þ W ¼ TðTðuÞ þ WÞ ¼ Tð Tðu þ WÞÞ ¼ T2 ðu þ WÞ Hence, T2 ¼ T2 . Similarly, Tn ¼ Tn for any n. Thus, for any polynomial f ðtÞ ¼ antn þ þ a0 ¼ P aiti f ðTÞðu þ WÞ ¼ f ðTÞðuÞ þ W ¼ P aiTi ðuÞ þ W ¼ P aiðTi ðuÞ þ WÞ ¼ P aiTiðu þ WÞ ¼ P ai Ti ðu þ WÞ ¼ ð P ai Ti Þðu þ WÞ ¼ f ð TÞðu þ WÞ and so f ðTÞ ¼ f ð TÞ. Accordingly, if T is a root of f ðtÞ then f ðTÞ ¼ 0 ¼ W ¼ f ð TÞ; that is, T is also a root of f ðtÞ. The theorem is proved. 10.28. Prove Theorem 10.1: Let T:V ! V be a linear operator whose characteristic polynomial factors into linear polynomials. Then V has a basis in which T is represented by a triangular matrix. The proof is by induction on the dimension of V. If dim V ¼ 1, then every matrix representation of T is a 1 1 matrix, which is triangular. Now suppose dim V ¼ n 1 and that the theorem holds for spaces of dimension less than n. Because the characteristic polynomial of T factors into linear polynomials, T has at least one eigenvalue and so at least one nonzero eigenvector v, say TðvÞ ¼ a11v. Let W be the one-dimensional subspace spanned by v. Set V ¼ V=W. Then (Problem 10.26) dim V ¼ dim V dim W ¼ n 1. Note also that W is invariant under T. By Theorem 10.16, T induces a linear operator T on V whose minimal polynomial divides the minimal polynomial of T. Because the characteristic polynomial of T is a product of linear polynomials, so is its minimal polynomial, and hence, so are the minimal and characteristic polynomials of T. Thus, V and T satisfy the hypothesis of the theorem. Hence, by induction, there exists a basis f v2; . . . ; vng of V such that Tð v2Þ ¼ a22 v2 Tð v3Þ ¼ a32 v2 þ a33 v3 ::::::::::::::::::::::::::::::::::::::::: Tð vnÞ ¼ an2 vn þ an3 v3 þ þ ann vn CHAPTER 10 Canonical Forms 341
  • 349. Now let v2; . . . ; vn be elements of V that belong to the cosets v2; . . . ; vn, respectively. Then fv; v2; . . . ; vng is a basis of V (Problem 10.26). Because Tðv2Þ ¼ a22 v2, we have Tð v2Þ a22 v22 ¼ 0; and so Tðv2Þ a22v2 2 W But W is spanned by v; hence, Tðv2Þ a22v2 is a multiple of v, say, Tðv2Þ a22v2 ¼ a21v; and so Tðv2Þ ¼ a21v þ a22v2 Similarly, for i ¼ 3; . . . ; n TðviÞ ai2v2 ai3v3 aiivi 2 W; and so TðviÞ ¼ ai1v þ ai2v2 þ þ aiivi Thus, TðvÞ ¼ a11v Tðv2Þ ¼ a21v þ a22v2 :::::::::::::::::::::::::::::::::::::::: TðvnÞ ¼ an1v þ an2v2 þ þ annvn and hence the matrix of T in this basis is triangular. Cyclic Subspaces, Rational Canonical Form 10.29. Prove Theorem 10.12: Let Zðv; TÞ be a T-cyclic subspace, Tv the restriction of T to Zðv; TÞ, and mvðtÞ ¼ tk þ ak1tk1 þ þ a0 the T-annihilator of v. Then, (i) The set fv; TðvÞ; . . . ; Tk1 ðvÞg is a basis of Zðv; TÞ; hence, dim Zðv; TÞ ¼ k. (ii) The minimal polynomial of Tv is mvðtÞ. (iii) The matrix of Tv in the above basis is the companion matrix C ¼ CðmvÞ of mvðtÞ [which has 1’s below the diagonal, the negative of the coefficients a0; a1; . . . ; ak1 of mvðtÞ in the last column, and 0’s elsewhere]. (i) By definition of mvðtÞ, Tk ðvÞ is the first vector in the sequence v, TðvÞ, T2 ðvÞ; ... that, is a linear combination of those vectors that precede it in the sequence; hence, the set B ¼ fv; TðvÞ; ... ; Tk1 ðvÞg is linearly independent. We now only have to show that Zðv; TÞ ¼ LðBÞ, the linear span of B. By the above, Tk ðvÞ 2 LðBÞ. We prove by induction that Tn ðvÞ 2 LðBÞ for every n. Suppose n k and Tn1 ðvÞ 2 LðBÞ—that is, Tn1 ðvÞ is a linear combination of v; ... ; Tk1 ðvÞ. Then Tn ðvÞ ¼ TðTn1 ðvÞÞ is a linear combination of TðvÞ; ... ; Tk ðvÞ. But Tk ðvÞ 2 LðBÞ; hence, Tn ðvÞ 2 LðBÞ for every n. Consequently, f ðTÞðvÞ 2 LðBÞ for any polynomial f ðtÞ. Thus, Zðv; TÞ ¼ LðBÞ, and so B is a basis, as claimed. (ii) Suppose mðtÞ ¼ ts þ bs1ts1 þ þ b0 is the minimal polynomial of Tv. Then, because v 2 Zðv; TÞ, 0 ¼ mðTvÞðvÞ ¼ mðTÞðvÞ ¼ Ts ðvÞ þ bs1Ts1 ðvÞ þ þ b0v Thus, Ts ðvÞ is a linear combination of v, TðvÞ; . . . ; Ts1 ðvÞ, and therefore k s. However, mvðTÞ ¼ 0 and so mvðTvÞ ¼ 0: Then mðtÞ divides mvðtÞ; and so s k: Accordingly, k ¼ s and hence mvðtÞ ¼ mðtÞ. (iii) TvðvÞ ¼ TðvÞ TvðTðvÞÞ ¼ T2 ðvÞ ::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: TvðTk2 ðvÞÞ ¼ Tk1 ðvÞ TvðTk1 ðvÞÞ ¼ Tk ðvÞ ¼ a0v a1 TðvÞ a2T2 ðvÞ ak1 Tk1 ðvÞ By definition, the matrix of Tv in this basis is the tranpose of the matrix of coefficients of the above system of equations; hence, it is C, as required. 10.30. Let T:V ! V be linear. Let W be a T-invariant subspace of V and T the induced operator on V=W. Prove (a) The T-annihilator of v 2 V divides the minimal polynomial of T. (b) The T-annihilator of v 2 V=W divides the minimal polynomial of T. 342 CHAPTER 10 Canonical Forms
  • 350. (a) The T-annihilator of v 2 V is the minimal polynomial of the restriction of T to Zðv; TÞ; therefore, by Problem 10.6, it divides the minimal polynomial of T. (b) The T-annihilator of v 2 V=W divides the minimal polynomial of T, which divides the minimal polynomial of T by Theorem 10.16. Remark: In the case where the minimum polynomial of T is f ðtÞn , where f ðtÞ is a monic irreducible polynomial, then the T-annihilator of v 2 V and the T-annihilator of v 2 V=W are of the form f ðtÞm , where m n. 10.31. Prove Lemma 10.13: Let T:V ! V be a linear operator whose minimal polynomial is f ðtÞn , where f ðtÞ is a monic irreducible polynomial. Then V is the direct sum of T-cyclic subspaces Zi ¼ Zðvi; TÞ, i ¼ 1; . . . ; r, with corresponding T-annihilators f ðtÞn1 ; f ðtÞn2 ; . . . ; f ðtÞnr ; n ¼ n1 n2 nr Any other decomposition of V into the direct sum of T-cyclic subspaces has the same number of components and the same set of T-annihilators. The proof is by induction on the dimension of V. If dim V ¼ 1, then V is T-cyclic and the lemma holds. Now suppose dim V 1 and that the lemma holds for those vector spaces of dimension less than that of V. Because the minimal polynomial of T is f ðtÞn , there exists v1 2 V such that f ðTÞn1 ðv1Þ 6¼ 0; hence, the T-annihilator of v1 is f ðtÞn . Let Z1 ¼ Zðv1; TÞ and recall that Z1 is T-invariant. Let V ¼ V=Z1 and let T be the linear operator on V induced by T. By Theorem 10.16, the minimal polynomial of T divides f ðtÞn ; hence, the hypothesis holds for V and T. Consequently, by induction, V is the direct sum of T-cyclic subspaces; say, V ¼ Zð v2; TÞ Zð vr; T Þ where the corresponding T-annihilators are f ðtÞn2 ; . . . ; f ðtÞnr , n n2 nr. We claim that there is a vector v2 in the coset v2 whose T-annihilator is f ðtÞn2 , the T-annihilator of v2. Let w be any vector in v2. Then f ðTÞn2 ðwÞ 2 Z1. Hence, there exists a polynomial gðtÞ for which f ðTÞn2 ðwÞ ¼ gðTÞðv1Þ ð1Þ Because f ðtÞn is the minimal polynomial of T, we have, by (1), 0 ¼ f ðTÞn ðwÞ ¼ f ðTÞnn2 gðTÞðv1Þ But f ðtÞn is the T-annihilator of v1; hence, f ðtÞn divides f ðtÞnn2 gðtÞ, and so gðtÞ ¼ f ðtÞn2 hðtÞ for some polynomial hðtÞ. We set v2 ¼ w hðTÞðv1Þ Because w v2 ¼ hðTÞðv1Þ 2 Z1, v2 also belongs to the coset v2. Thus, the T-annihilator of v2 is a multiple of the T-annihilator of v2. On the other hand, by (1), f ðTÞn2 ðv2Þ ¼ f ðTÞns ðw hðTÞðv1ÞÞ ¼ f ðTÞn2 ðwÞ gðTÞðv1Þ ¼ 0 Consequently, the T-annihilator of v2 is f ðtÞn2 , as claimed. Similarly, there exist vectors v3; . . . ; vr 2 V such that vi 2 vi and that the T-annihilator of vi is f ðtÞni , the T-annihilator of vi. We set Z2 ¼ Zðv2; TÞ; . . . ; Zr ¼ Zðvr; TÞ Let d denote the degree of f ðtÞ, so that f ðtÞni has degree dni. Then, because f ðtÞni is both the T-annihilator of vi and the T-annihilator of vi, we know that fvi; TðviÞ; . . . ; Tdni1 ðviÞg and f vi: TðviÞ; . . . ; Tdni1 ðviÞg are bases for Zðvi; TÞ and Zðvi; TÞ, respectively, for i ¼ 2; . . . ; r. But V ¼ Zðv2; TÞ Zðvr; TÞ; hence, f v2; . . . ; Tdn21 ð v2Þ; . . . ; vr; . . . ; Tdnr1 ð vrÞg CHAPTER 10 Canonical Forms 343
  • 351. is a basis for V. Therefore, by Problem 10.26 and the relation Ti ð vÞ ¼ TiðvÞ (see Problem 10.27), fv1; . . . ; Tdn11 ðv1Þ; v2; . . . ; Ten21 ðv2Þ; . . . ; vr; . . . ; Tdnr1 ðvrÞg is a basis for V. Thus, by Theorem 10.4, V ¼ Zðv1; TÞ Zðvr; TÞ, as required. It remains to show that the exponents n1; . . . ; nr are uniquely determined by T. Because d ¼ degree of f ðtÞ; dim V ¼ dðn1 þ þ nrÞ and dim Zi ¼ dni; i ¼ 1; . . . ; r Also, if s is any positive integer, then (Problem 10.59) f ðTÞs ðZiÞ is a cyclic subspace generated by f ðTÞs ðviÞ, and it has dimension dðni sÞ if ni s and dimension 0 if ni s. Now any vector v 2 V can be written uniquely in the form v ¼ w1 þ þ wr, where wi 2 Zi. Hence, any vector in f ðTÞs ðVÞ can be written uniquely in the form f ðTÞs ðvÞ ¼ f ðTÞs ðw1Þ þ þ f ðTÞs ðwrÞ where f ðTÞs ðwiÞ 2 f ðTÞs ðZiÞ. Let t be the integer, dependent on s, for which n1 s; . . . ; nt s; ntþ1 s Then f ðTÞs ðVÞ ¼ f ðTÞs ðZ1Þ f ðTÞs ðZtÞ and so dim½ f ðTÞs ðVÞ ¼ d½ðn1 sÞ þ þ ðnt sÞ ð2Þ The numbers on the left of (2) are uniquely determined by T. Set s ¼ n 1, and (2) determines the number of ni equal to n. Next set s ¼ n 2, and (2) determines the number of ni (if any) equal to n 1. We repeat the process until we set s ¼ 0 and determine the number of ni equal to 1. Thus, the ni are uniquely determined by T and V, and the lemma is proved. 10.32. Let V be a seven-dimensional vector space over R, and let T:V ! V be a linear operator with minimal polynomial mðtÞ ¼ ðt2 2t þ 5Þðt 3Þ3 . Find all possible rational canonical forms M of T. Because dim V ¼ 7; there are only two possible characteristic polynomials, D1ðtÞ ¼ ðt2 2t þ 5Þ2 ðt 3Þ3 or D1ðtÞ ¼ ðt2 2t þ 5Þðt 3Þ5 : Moreover, the sum of the orders of the companion matrices must add up to 7. Also, one companion matrix must be Cðt2 2t þ 5Þ and one must be Cððt 3Þ3 Þ ¼ Cðt3 9t2 þ 27t 27Þ. Thus, M must be one of the following block diagonal matrices: (a) diag 0 5 1 2 ; 0 5 1 2 ; 0 0 27 1 0 27 0 1 9 2 4 3 5 0 @ 1 A; (b) diag 0 5 1 2 ; 0 0 27 1 0 27 0 1 9 2 4 3 5; 0 9 1 6 0 @ 1 A; (c) diag 0 5 1 2 ; 0 0 27 1 0 27 0 1 9 2 4 3 5; ½3; ½3 0 @ 1 A Projections 10.33. Suppose V ¼ W1 Wr. The projection of V into its subspace Wk is the mapping E: V ! V defined by EðvÞ ¼ wk, where v ¼ w1 þ þ wr; wi 2 Wi. Show that (a) E is linear, (b) E2 ¼ E. (a) Because the sum v ¼ w1 þ þ wr, wi 2 W is uniquely determined by v, the mapping E is well defined. Suppose, for u 2 V, u ¼ w0 1 þ þ w0 r, w0 i 2 Wi. Then v þ u ¼ ðw1 þ w0 1Þ þ þ ðwr þ w0 rÞ and kv ¼ kw1 þ þ kwr; kwi; wi þ w0 i 2 Wi are the unique sums corresponding to v þ u and kv. Hence, Eðv þ uÞ ¼ wk þ w0 k ¼ EðvÞ þ EðuÞ and EðkvÞ ¼ kwk þ kEðvÞ and therefore E is linear. 344 CHAPTER 10 Canonical Forms
  • 352. (b) We have that wk ¼ 0 þ þ 0 þ wk þ 0 þ þ 0 is the unique sum corresponding to wk 2 Wk; hence, EðwkÞ ¼ wk. Then, for any v 2 V, E2 ðvÞ ¼ EðEðvÞÞ ¼ EðwkÞ ¼ wk ¼ EðvÞ Thus, E2 ¼ E, as required. 10.34. Suppose E:V ! V is linear and E2 ¼ E. Show that (a) EðuÞ ¼ u for any u 2 Im E (i.e., the restriction of E to its image is the identity mapping); (b) V is the direct sum of the image and kernel of E:V ¼ Im E Ker E; (c) E is the projection of V into Im E, its image. Thus, by the preceding problem, a linear mapping T:V ! V is a projection if and only if T2 ¼ T; this characterization of a projection is frequently used as its definition. (a) If u 2 Im E, then there exists v 2 V for which EðvÞ ¼ u; hence, as required, EðuÞ ¼ EðEðvÞÞ ¼ E2 ðvÞ ¼ EðvÞ ¼ u (b) Let v 2 V. We can write v in the form v ¼ EðvÞ þ v EðvÞ. Now EðvÞ 2 Im E and, because Eðv EðvÞÞ ¼ EðvÞ E2 ðvÞ ¼ EðvÞ EðvÞ ¼ 0 v EðvÞ 2 Ker E. Accordingly, V ¼ Im E þ Ker E. Now suppose w 2 Im E Ker E. By (i), EðwÞ ¼ w because w 2 Im E. On the other hand, EðwÞ ¼ 0 because w 2 Ker E. Thus, w ¼ 0, and so Im E Ker E ¼ f0g. These two conditions imply that V is the direct sum of the image and kernel of E. (c) Let v 2 V and suppose v ¼ u þ w, where u 2 Im E and w 2 Ker E. Note that EðuÞ ¼ u by (i), and EðwÞ ¼ 0 because w 2 Ker E. Hence, EðvÞ ¼ Eðu þ wÞ ¼ EðuÞ þ EðwÞ ¼ u þ 0 ¼ u That is, E is the projection of V into its image. 10.35. Suppose V ¼ U W and suppose T:V ! V is linear. Show that U and W are both T-invariant if and only if TE ¼ ET, where E is the projection of V into U. Observe that EðvÞ 2 U for every v 2 V, and that (i) EðvÞ ¼ v iff v 2 U, (ii) EðvÞ ¼ 0 iff v 2 W. Suppose ET ¼ TE. Let u 2 U. Because EðuÞ ¼ u, TðuÞ ¼ TðEðuÞÞ ¼ ðTEÞðuÞ ¼ ðETÞðuÞ ¼ EðTðuÞÞ 2 U Hence, U is T-invariant. Now let w 2 W. Because EðwÞ ¼ 0, EðTðwÞÞ ¼ ðETÞðwÞ ¼ ðTEÞðwÞ ¼ TðEðwÞÞ ¼ Tð0Þ ¼ 0; and so TðwÞ 2 W Hence, W is also T-invariant. Conversely, suppose U and W are both T-invariant. Let v 2 V and suppose v ¼ u þ w, where u 2 T and w 2 W. Then TðuÞ 2 U and TðwÞ 2 W; hence, EðTðuÞÞ ¼ TðuÞ and EðTðwÞÞ ¼ 0. Thus, ðETÞðvÞ ¼ ðETÞðu þ wÞ ¼ ðETÞðuÞ þ ðETÞðwÞ ¼ EðTðuÞÞ þ EðTðwÞÞ ¼ TðuÞ and ðTEÞðvÞ ¼ ðTEÞðu þ wÞ ¼ TðEðu þ wÞÞ ¼ TðuÞ That is, ðETÞðvÞ ¼ ðTEÞðvÞ for every v 2 V; therefore, ET ¼ TE, as required. SUPPLEMENTARY PROBLEMS Invariant Subspaces 10.36. Suppose W is invariant under T:V ! V. Show that W is invariant under f ðTÞ for any polynomial f ðtÞ. 10.37. Show that every subspace of V is invariant under I and 0, the identity and zero operators. CHAPTER 10 Canonical Forms 345
  • 353. 10.38. Let W be invariant under T1: V ! V and T2: V ! V. Prove W is also invariant under T1 þ T2 and T1T2. 10.39. Let T:V ! V be linear. Prove that any eigenspace, El is T-invariant. 10.40. Let V be a vector space of odd dimension (greater than 1) over the real field R. Show that any linear operator on V has an invariant subspace other than V or f0g. 10.41. Determine the invariant subspace of A ¼ 2 4 5 2 viewed as a linear operator on (a) R2 , (b) C2 . 10.42. Suppose dim V ¼ n. Show that T:V ! V has a triangular matrix representation if and only if there exist T-invariant subspaces W1 W2 Wn ¼ V for which dim Wk ¼ k, k ¼ 1; . . . ; n. Invariant Direct Sums 10.43. The subspaces W1; . . . ; Wr are said to be independent if w1 þ þ wr ¼ 0, wi 2 Wi, implies that each wi ¼ 0. Show that spanðWiÞ ¼ W1 Wr if and only if the Wi are independent. [Here spanðWiÞ denotes the linear span of the Wi.] 10.44. Show that V ¼ W1 Wr if and only if (i) V ¼ spanðWiÞ and (ii) for k ¼ 1; 2; . . . ; r, Wk spanðW1; . . . ; Wk1; Wkþ1; . . . ; WrÞ ¼ f0g. 10.45. Show that spanðWiÞ ¼ W1 Wr if and only if dim ½spanðWiÞ ¼ dim W1 þ þ dim Wr. 10.46. Suppose the characteristic polynomial of T:V ! V is DðtÞ ¼ f1ðtÞn1 f2ðtÞn2 frðtÞnr , where the fiðtÞ are distinct monic irreducible polynomials. Let V ¼ W1 Wr be the primary decomposition of V into T- invariant subspaces. Show that fiðtÞni is the characteristic polynomial of the restriction of T to Wi. Nilpotent Operators 10.47. Suppose T1 and T2 are nilpotent operators that commute (i.e., T1T2 ¼ T2T1). Show that T1 þ T2 and T1T2 are also nilpotent. 10.48. Suppose A is a supertriangular matrix (i.e., all entries on and below the main diagonal are 0). Show that A is nilpotent. 10.49. Let V be the vector space of polynomials of degree n. Show that the derivative operator on V is nilpotent of index n þ 1. 10.50. Show that any Jordan nilpotent block matrix N is similar to its transpose NT (the matrix with 1’s below the diagonal and 0’s elsewhere). 10.51. Show that two nilpotent matrices of order 3 are similar if and only if they have the same index of nilpotency. Show by example that the statement is not true for nilpotent matrices of order 4. Jordan Canonical Form 10.52. Find all possible Jordan canonical forms for those matrices whose characteristic polynomial DðtÞ and minimal polynomial mðtÞ are as follows: (a) DðtÞ ¼ ðt 2Þ4 ðt 3Þ2 ; mðtÞ ¼ ðt 2Þ2 ðt 3Þ2 , (b) DðtÞ ¼ ðt 7Þ5 ; mðtÞ ¼ ðt 7Þ2 , (c) DðtÞ ¼ ðt 2Þ7 ; mðtÞ ¼ ðt 2Þ3 10.53. Show that every complex matrix is similar to its transpose. (Hint: Use its Jordan canonical form.) 10.54. Show that all n n complex matrices A for which An ¼ I but Ak 6¼ I for k n are similar. 10.55. Suppose A is a complex matrix with only real eigenvalues. Show that A is similar to a matrix with only real entries. 346 CHAPTER 10 Canonical Forms
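Remark: The enumerations asked for in Problems 10.19, 10.20, and 10.52 can be spot-checked numerically by assembling candidate block matrices and reading off the exponent of the minimal polynomial (the order of the largest block) and the number of blocks (the nullity of J − λI). The following is a minimal sketch, assuming NumPy and SciPy are available; the helper `jordan_block` is my own, not from the text.

```python
import numpy as np
from scipy.linalg import block_diag

def jordan_block(lam, k):
    """k x k Jordan block: lam on the diagonal, 1's on the superdiagonal."""
    return lam * np.eye(k) + np.eye(k, k=1)

# Problem 10.19: D(t) = (t - 2)^5, m(t) = (t - 2)^2 -- the two possibilities
J1 = block_diag(jordan_block(2, 2), jordan_block(2, 2), jordan_block(2, 1))
J2 = block_diag(jordan_block(2, 2), jordan_block(2, 1),
                jordan_block(2, 1), jordan_block(2, 1))

for J in (J1, J2):
    N = J - 2 * np.eye(5)
    assert N.any() and np.allclose(N @ N, 0)      # (J - 2I)^2 = 0 but J - 2I != 0
    print(5 - np.linalg.matrix_rank(N))           # number of blocks: 3, then 4
```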
  • 354. Cyclic Subspaces 10.56. Suppose T:V ! V is linear. Prove that Zðv; TÞ is the intersection of all T-invariant subspaces containing v. 10.57. Let f ðtÞ and gðtÞ be the T-annihilators of u and v, respectively. Show that if f ðtÞ and gðtÞ are relatively prime, then f ðtÞgðtÞ is the T-annihilator of u þ v. 10.58. Prove that Zðu; TÞ ¼ Zðv; TÞ if and only if gðTÞðuÞ ¼ v where gðtÞ is relatively prime to the T-annihilator of u. 10.59. Let W ¼ Zðv; TÞ, and suppose the T-annihilator of v is f ðtÞn , where f ðtÞ is a monic irreducible polynomial of degree d. Show that f ðTÞs ðWÞ is a cyclic subspace generated by f ðTÞs ðvÞ and that it has dimension dðn sÞ if n s and dimension 0 if n s. Rational Canonical Form 10.60. Find all possible rational forms for a 6 6 matrix over R with minimal polynomial: (a) mðtÞ ¼ ðt2 2t þ 3Þðt þ 1Þ2 , (b) mðtÞ ¼ ðt 2Þ3 . 10.61. Let A be a 4 4 matrix with minimal polynomial mðtÞ ¼ ðt2 þ 1Þðt2 3Þ. Find the rational canonical form for A if A is a matrix over (a) the rational field Q, (b) the real field R, (c) the complex field C. 10.62. Find the rational canonical form for the four-square Jordan block with l’s on the diagonal. 10.63. Prove that the characteristic polynomial of an operator T:V ! V is a product of its elementary divisors. 10.64. Prove that two 3 3 matrices with the same minimal and characteristic polynomials are similar. 10.65. Let Cð f ðtÞÞ denote the companion matrix to an arbitrary polynomial f ðtÞ. Show that f ðtÞ is the characteristic polynomial of Cð f ðtÞÞ. Projections 10.66. Suppose V ¼ W1 Wr. Let Ei denote the projection of V into Wi. Prove (i) EiEj ¼ 0, i 6¼ j; (ii) I ¼ E1 þ þ Er. 10.67. Let E1; . . . ; Er be linear operators on V such that (i) E2 i ¼ Ei (i.e., the Ei are projections); (ii) EiEj ¼ 0, i 6¼ j; (iii) I ¼ E1 þ þ Er Prove that V ¼ Im E1 Im Er. 10.68. Suppose E: V ! V is a projection (i.e., E2 ¼ E). Prove that E has a matrix representation of the form Ir 0 0 0 , where r is the rank of E and Ir is the r-square identity matrix. 10.69. Prove that any two projections of the same rank are similar. (Hint: Use the result of Problem 10.68.) 10.70. Suppose E: V ! V is a projection. Prove (i) I E is a projection and V ¼ Im E Im ðI EÞ, (ii) I þ E is invertible (if 1 þ 1 6¼ 0). Quotient Spaces 10.71. Let W be a subspace of V. Suppose the set of cosets fv1 þ W; v2 þ W; . . . ; vn þ Wg in V=W is linearly independent. Show that the set of vectors fv1; v2; . . . ; vng in V is also linearly independent. 10.72. Let W be a substance of V. Suppose the set of vectors fu1; u2; . . . ; ung in V is linearly independent, and that LðuiÞ W ¼ f0g. Show that the set of cosets fu1 þ W; . . . ; un þ Wg in V=W is also linearly independent. CHAPTER 10 Canonical Forms 347
  • 355. 10.73. Suppose V ¼ U W and that fu1; . . . ; ung is a basis of U. Show that fu1 þ W; . . . ; un þ Wg is a basis of the quotient spaces V=W. (Observe that no condition is placed on the dimensionality of V or W.) 10.74. Let W be the solution space of the linear equation a1x1 þ a2x2 þ þ anxn ¼ 0; ai 2 K and let v ¼ ðb1; b2; . . . ; bnÞ 2 Kn . Prove that the coset v þ W of W in Kn is the solution set of the linear equation a1x1 þ a2x2 þ þ anxn ¼ b; where b ¼ a1b1 þ þ anbn 10.75. Let V be the vector space of polynomials over R and let W be the subspace of polynomials divisible by t4 (i.e., of the form a0t4 þ a1t5 þ þ an4tn ). Show that the quotient space V=W has dimension 4. 10.76. Let U and W be subspaces of V such that W U V. Note that any coset u þ W of W in U may also be viewed as a coset of W in V, because u 2 U implies u 2 V; hence, U=W is a subset of V=W. Prove that (i) U=W is a subspace of V=W, (ii) dimðV=WÞ dimðU=WÞ ¼ dimðV=UÞ. 10.77. Let U and W be subspaces of V. Show that the cosets of U W in V can be obtained by intersecting each of the cosets of U in V by each of the cosets of W in V: V=ðU WÞ ¼ fðv þ UÞ ðv0 þ WÞ : v; v0 2 Vg 10.78. Let T:V ! V0 be linear with kernel W and image U. Show that the quotient space V=W is isomorphic to U under the mapping y:V=W ! U defined by yðv þ WÞ ¼ TðvÞ. Furthermore, show that T ¼ i y Z, where Z:V ! V=W is the natural mapping of V into V=W (i.e., ZðvÞ ¼ v þ W), and i:U ,! V0 is the inclusion mapping (i.e., iðuÞ ¼ u). (See diagram.) ANSWERS TO SUPPLEMENTARY PROBLEMS 10.41. (a) R2 and f0g, (b) C2 ; f0g; W1 ¼ spanð2; 1 2iÞ; W2 ¼ spanð2; 1 þ 2iÞ 10.52. (a) diag 2 1 2 ; 2 1 2 ; 3 1 3 ; diag 2 1 2 ; ½2: ½2; 3 1 3 ; (b) diag 7 1 7 ; 7 1 7 ; ½7 ; diag 7 1 7 ; ½7; ½7; ½7 ; (c) Let Mk denote a Jordan block with l ¼ 2 and order k. Then diagðM3; M3; M1Þ, diagðM3; M2; M2Þ, diagðM3; M2; M1; M1Þ, diagðM3; M1; M1; M1; M1Þ 10.60. Let A ¼ 0 3 1 2 ; B ¼ 0 1 1 2 ; C ¼ 0 0 8 1 0 12 0 1 6 2 4 3 5; D ¼ 0 4 1 4 . (a) diagðA; A; BÞ; diagðA; B; BÞ; diagðA; B; 1; 1Þ; (b) diagðC; CÞ; diagðC; D; 2Þ; diagðC; 2; 2; 2Þ 10.61. Let A ¼ 0 1 1 0 ; B ¼ 0 3 1 0 . (a) diagðA; BÞ, (b) diagðA; ffiffiffi 3 p ; ffiffiffi 3 p Þ, (c) diagði; i; ffiffiffi 3 p ; ffiffiffi 3 p Þ 10.62. Companion matrix with the last column ½l4 ; 4l3 ; 6l2 ; 4lT 348 CHAPTER 10 Canonical Forms
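Remark: The companion matrices appearing in Problem 10.32 and in the answers to Problems 10.60 through 10.62 are easy to assemble and test by computer. The sketch below assumes SymPy; the helper `companion` is my own and follows the convention of Theorem 10.12 (1's below the diagonal, negated coefficients in the last column). It rebuilds form (c) of Problem 10.32, confirms its characteristic polynomial, and prints the companion matrix behind the answer to Problem 10.62.

```python
import sympy as sp

t, lam = sp.symbols('t lambda')

def companion(poly):
    """Companion matrix of a monic polynomial in t, in the convention of
    Theorem 10.12: 1's below the diagonal, negated coefficients in the last column."""
    a = sp.Poly(poly, t).all_coeffs()[::-1][:-1]      # [a0, a1, ..., a_{k-1}]
    k = len(a)
    C = sp.zeros(k, k)
    for i in range(1, k):
        C[i, i - 1] = 1
    for i in range(k):
        C[i, k - 1] = -a[i]
    return C

# Form (c) of Problem 10.32: diag( C(t^2 - 2t + 5), C((t - 3)^3), [3], [3] )
M = sp.diag(companion(t**2 - 2*t + 5), companion((t - 3)**3),
            sp.Matrix([[3]]), sp.Matrix([[3]]))
assert sp.expand(M.charpoly(t).as_expr() - (t**2 - 2*t + 5)*(t - 3)**5) == 0

# Answer 10.62: the rational form of the 4x4 Jordan block is C((t - lambda)^4);
# its last column works out to [-lambda^4, 4*lambda^3, -6*lambda^2, 4*lambda]^T
print(companion(sp.expand((t - lam)**4)))
```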
CHAPTER 11

Linear Functionals and the Dual Space

11.1 Introduction

In this chapter, we study linear mappings from a vector space V into its field K of scalars. (Unless otherwise stated or implied, we view K as a vector space over itself.) Naturally all the theorems and results for arbitrary mappings on V hold for this special case. However, we treat these mappings separately because of their fundamental importance and because the special relationship of V to K gives rise to new notions and results that do not apply in the general case.

11.2 Linear Functionals and the Dual Space

Let V be a vector space over a field K. A mapping φ: V → K is termed a linear functional (or linear form) if, for every u, v ∈ V and every a, b ∈ K,

    φ(au + bv) = aφ(u) + bφ(v)

In other words, a linear functional on V is a linear mapping from V into K.

EXAMPLE 11.1

(a) Let πi: K^n → K be the ith projection mapping; that is, πi(a1, a2, ..., an) = ai. Then πi is linear, and so it is a linear functional on K^n.

(b) Let V be the vector space of polynomials in t over R. Let J: V → R be the integral operator defined by J(p(t)) = ∫₀¹ p(t) dt. Recall that J is linear; hence, it is a linear functional on V.

(c) Let V be the vector space of n-square matrices over K. Let T: V → K be the trace mapping

    T(A) = a11 + a22 + ... + ann,   where A = [aij]

That is, T assigns to a matrix A the sum of its diagonal elements. This map is linear (Problem 11.24), and so it is a linear functional on V.

By Theorem 5.10, the set of linear functionals on a vector space V over a field K is also a vector space over K, with addition and scalar multiplication defined by

    (φ + σ)(v) = φ(v) + σ(v)   and   (kφ)(v) = kφ(v)

where φ and σ are linear functionals on V and k ∈ K. This space is called the dual space of V and is denoted by V*.

EXAMPLE 11.2 Let V = K^n, the vector space of n-tuples, which we write as column vectors. Then the dual space V* can be identified with the space of row vectors. In particular, any linear functional φ = (a1, ..., an) in V* has the representation

    φ(x1, x2, ..., xn) = [a1, a2, ..., an][x1, x2, ..., xn]^T = a1x1 + a2x2 + ... + anxn

Historically, the formal expression on the right was termed a linear form.
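In coordinates, Example 11.2 says that applying a functional is just multiplying a fixed row vector by a coordinate column, and the linearity of the trace in Example 11.1(c) can be checked the same way. A small numerical illustration, assuming NumPy; the particular numbers are my own.

```python
import numpy as np

# Example 11.2: the functional phi(x, y, z) = 2x - 3y + z as a row vector
phi = np.array([2.0, -3.0, 1.0])
v = np.array([1.0, 4.0, 5.0])
print(phi @ v)                                   # 2*1 - 3*4 + 1*5 = -5.0

# Example 11.1(c): the trace is a linear functional on n x n matrices
rng = np.random.default_rng(0)
A, B = rng.standard_normal((3, 3)), rng.standard_normal((3, 3))
a, b = 2.0, -7.0
assert np.isclose(np.trace(a * A + b * B), a * np.trace(A) + b * np.trace(B))
```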
  • 357. 11.3 Dual Basis Suppose V is a vector space of dimension n over K. By Theorem 5.11, the dimension of the dual space V* is also n (because K is of dimension 1 over itself). In fact, each basis of V determines a basis of V* as follows (see Problem 11.3 for the proof). THEOREM 11.1: Suppose fv1; . . . ; vng is a basis of V over K. Let f1; . . . ; fn 2 V* be the linear functionals as defined by fiðvjÞ ¼ dij ¼ 1 if i ¼ j 0 if i 6¼ j Then ff1; . . . ; fng is a basis of V*: The above basis ffig is termed the basis dual to fvig or the dual basis. The above formula, which uses the Kronecker delta dij, is a short way of writing f1ðv1Þ ¼ 1; f1ðv2Þ ¼ 0; f1ðv3Þ ¼ 0; . . . ; f1ðvnÞ ¼ 0 f2ðv1Þ ¼ 0; f2ðv2Þ ¼ 1; f2ðv3Þ ¼ 0; . . . ; f2ðvnÞ ¼ 0 :::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::: fnðv1Þ ¼ 0; fnðv2Þ ¼ 0; . . . ; fnðvn1Þ ¼ 0; fnðvnÞ ¼ 1 By Theorem 5.2, these linear mappings fi are unique and well defined. EXAMPLE 11.3 Consider the basis fv1 ¼ ð2; 1Þ; v2 ¼ ð3; 1Þg of R2 . Find the dual basis ff1; f2g. We seek linear functionals f1ðx; yÞ ¼ ax þ by and f2ðx; yÞ ¼ cx þ dy such that f1ðv1Þ ¼ 1; f1ðv2Þ ¼ 0; f2ðv2Þ ¼ 0; f2ðv2Þ ¼ 1 These four conditions lead to the following two systems of linear equations: f1ðv1Þ ¼ f1ð2; 1Þ ¼ 2a þ b ¼ 1 f1ðv2Þ ¼ f1ð3; 1Þ ¼ 3a þ b ¼ 0 and f2ðv1Þ ¼ f2ð2; 1Þ ¼ 2c þ d ¼ 0 f2ðv2Þ ¼ f2ð3; 1Þ ¼ 3c þ d ¼ 1 The solutions yield a ¼ 1, b ¼ 3 and c ¼ 1, d ¼ 2. Hence, f1ðx; yÞ ¼ x þ 3y and f2ðx; yÞ ¼ x 2y form the dual basis. The next two theorems (proved in Problems 11.4 and 11.5, respectively) give relationships between bases and their duals. THEOREM 11.2: Let fv1; . . . ; vng be a basis of V and let ff1; . . . ; fng be the dual basis in V*. Then (i) For any vector u 2 V, u ¼ f1ðuÞv1 þ f2ðuÞv2 þ þ fnðuÞvn. (ii) For any linear functional s 2 V*, s ¼ sðv1Þf1 þ sðv2Þf2 þ þ sðvnÞfn. THEOREM 11.3: Let fv1; . . . ; vng and fw1; . . . ; wng be bases of V and let ff1; . . . ; fng and fs1; . . . ; sng be the bases of V* dual to fvig and fwig, respectively. Suppose P is the change-of-basis matrix from fvig to fwig. Then ðP1 ÞT is the change-of-basis matrix from ffig to fsig. 11.4 Second Dual Space We repeat: Every vector space V has a dual space V*, which consists of all the linear functionals on V. Thus, V* has a dual space V**, called the second dual of V, which consists of all the linear functionals on V*. We now show that each v 2 V determines a specific element ^ v 2 V**. First, for any f 2 V*, we define ^ vðfÞ ¼ fðvÞ 350 CHAPTER 11 Linear Functionals and the Dual Space
  • 358. It remains to be shown that this map ^ v:V* ! K is linear. For any scalars a; b 2 K and any linear functionals f; s 2 V*, we have ^ vðaf þ bsÞ ¼ ðaf þ bsÞðvÞ ¼ afðvÞ þ bsðvÞ ¼ a^ vðfÞ þ b^ vðsÞ That is, ^ v is linear and so ^ v 2 V**. The following theorem (proved in Problem 12.7) holds. THEOREM 11.4: If V has finite dimensions, then the mapping v 7! ^ v is an isomorphism of V onto V**. The above mapping v 7! ^ v is called the natural mapping of V into V**. We emphasize that this mapping is never onto V** if V is not finite-dimensional. However, it is always linear, and moreover, it is always one-to-one. Now suppose V does have finite dimension. By Theorem 11.4, the natural mapping determines an isomorphism between V and V**. Unless otherwise stated, we will identify V with V** by this mapping. Accordingly, we will view V as the space of linear functionals on V* and write V ¼ V**. We remark that if ffig is the basis of V* dual to a basis fvig of V, then fvig is the basis of V** ¼ V that is dual to ffig. 11.5 Annihilators Let W be a subset (not necessarily a subspace) of a vector space V. A linear functional f 2 V* is called an annihilator of W if fðwÞ ¼ 0 for every w 2 W—that is, if fðWÞ ¼ f0g. We show that the set of all such mappings, denoted by W0 and called the annihilator of W, is a subspace of V*. Clearly, 0 2 W0 : Now suppose f; s 2 W0 . Then, for any scalars a; b; 2 K and for any w 2 W, ðaf þ bsÞðwÞ ¼ afðwÞ þ bsðwÞ ¼ a0 þ b0 ¼ 0 Thus, af þ bs 2 W0 , and so W0 is a subspace of V*. In the case that W is a subspace of V, we have the following relationship between W and its annihilator W0 (see Problem 11.11 for the proof). THEOREM 11.5: Suppose V has finite dimension and W is a subspace of V. Then ðiÞ dim W þ dim W0 ¼ dim V and ðiiÞ W00 ¼ W Here W00 ¼ fv 2 V :fðvÞ ¼ 0 for every f 2 W0 g or, equivalently, W00 ¼ ðW0 Þ0 , where W00 is viewed as a subspace of V under the identification of V and V**. 11.6 Transpose of a Linear Mapping Let T :V ! U be an arbitrary linear mapping from a vector space V into a vector space U. Now for any linear functional f 2 U*, the composition f T is a linear mapping from V into K: That is, f T 2 V*. Thus, the correspondence f 7! f T is a mapping from U* into V*; we denote it by Tt and call it the transpose of T. In other words, Tt :U* ! V* is defined by Tt ðfÞ ¼ f T Thus, ðTt ðfÞÞðvÞ ¼ fðTðvÞÞ for every v 2 V. CHAPTER 11 Linear Functionals and the Dual Space 351
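For a concrete picture of this definition in coordinates: if φ is written as a row vector and T as a matrix A acting on column vectors, then T^t(φ) = φ∘T is simply the row vector φA (equivalently, its coordinate column is A^T times the column of φ, which is the content of Theorem 11.7 below). A small sketch with numbers of my own choosing, assuming NumPy:

```python
import numpy as np

A = np.array([[1.0, 2.0,  0.0],       # T(x, y, z) = (x + 2y, 3y - z), say
              [0.0, 3.0, -1.0]])
phi = np.array([4.0, -1.0])           # phi(s, t) = 4s - t on the target space

v = np.array([1.0, 1.0, 2.0])
lhs = (phi @ A) @ v                   # (T^t(phi))(v), with phi A as its row vector
rhs = phi @ (A @ v)                   # phi(T(v))
assert np.isclose(lhs, rhs)
print(phi @ A)                        # [4. 5. 1.]  i.e. T^t(phi)(x, y, z) = 4x + 5y + z
```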
  • 359. THEOREM 11.6: The transpose mapping Tt defined above is linear. Proof. For any scalars a; b 2 K and any linear functionals f; s 2 U*, Tt ðaf þ bsÞ ¼ ðaf þ bsÞ T ¼ aðf TÞ þ bðs TÞ ¼ aTt ðfÞ þ bTt ðsÞ That is, Tt is linear, as claimed. We emphasize that if T is a linear mapping from V into U, then Tt is a linear mapping from U* into V*. The same ‘‘transpose’’ for the mapping Tt no doubt derives from the following theorem (proved in Problem 11.16). THEOREM 11.7: Let T :V ! U be linear, and let A be the matrix representation of T relative to bases fvig of V and fuig of U. Then the transpose matrix AT is the matrix representation of Tt :U* ! V* relative to the bases dual to fuig and fvig. SOLVED PROBLEMS Dual Spaces and Dual Bases 11.1. Find the basis ff1; f2; f3g that is dual to the following basis of R3 : fv1 ¼ ð1; 1; 3Þ; v2 ¼ ð0; 1; 1Þ; v3 ¼ ð0; 3; 2Þg The linear functionals may be expressed in the form f1ðx; y; zÞ ¼ a1x þ a2y þ a3z; f2ðx; y; zÞ ¼ b1x þ b2y þ b3z; f3ðx; y; zÞ ¼ c1x þ c2y þ c3z By definition of the dual basis, fiðvjÞ ¼ 0 for i 6¼ j, but fiðvjÞ ¼ 1 for i ¼ j. We find f1 by setting f1ðv1Þ ¼ 1; f1ðv2Þ ¼ 0; f1ðv3Þ ¼ 0: This yields f1ð1; 1; 3Þ ¼ a1 a2 þ 3a3 ¼ 1; f1ð0; 1; 1Þ ¼ a2 a3 ¼ 0; f1ð0; 3; 2Þ ¼ 3a2 2a3 ¼ 0 Solving the system of equations yields a1 ¼ 1, a2 ¼ 0, a3 ¼ 0. Thus, f1ðx; y; zÞ ¼ x. We find f2 by setting f2ðv1Þ ¼ 0, f2ðv2Þ ¼ 1, f2ðv3Þ ¼ 0. This yields f2ð1; 1; 3Þ ¼ b1 b2 þ 3b3 ¼ 0; f2ð0; 1; 1Þ ¼ b2 b3 ¼ 1; f2ð0; 3; 2Þ ¼ 3b2 2b3 ¼ 0 Solving the system of equations yields b1 ¼ 7, b2 ¼ 2, a3 ¼ 3. Thus, f2ðx; y; zÞ ¼ 7x 2y 3z. We find f3 by setting f3ðv1Þ ¼ 0, f3ðv2Þ ¼ 0, f3ðv3Þ ¼ 1. This yields f3ð1; 1; 3Þ ¼ c1 c2 þ 3c3 ¼ 0; f3ð0; 1; 1Þ ¼ c2 c3 ¼ 0; f3ð0; 3; 2Þ ¼ 3c2 2c3 ¼ 1 Solving the system of equations yields c1 ¼ 2, c2 ¼ 1, c3 ¼ 1. Thus, f3ðx; y; zÞ ¼ 2x þ y þ z. 11.2. Let V ¼ fa þ bt : a; b 2 Rg, the vector space of real polynomials of degree 1. Find the basis fv1; v2g of V that is dual to the basis ff1; f2g of V* defined by f1ð f ðtÞÞ ¼ ð1 0 f ðtÞ dt and f2ð f ðtÞÞ ¼ ð2 0 f ðtÞ dt Let v1 ¼ a þ bt and v2 ¼ c þ dt. By definition of the dual basis, f1ðv1Þ ¼ 1; f1ðv2Þ ¼ 0 and f2ðv1Þ ¼ 0; fiðvjÞ ¼ 1 Thus, f1ðv1Þ ¼ Ð1 0 ða þ btÞ dt ¼ a þ 1 2 b ¼ 1 f2ðv1Þ ¼ Ð2 0 ða þ btÞ dt ¼ 2a þ 2b ¼ 0 ) and f1ðv2Þ ¼ Ð1 0 ðc þ dtÞ dt ¼ c þ 1 2 d ¼ 0 f2ðv2Þ ¼ Ð2 0 ðc þ dtÞ dt ¼ 2c þ 2d ¼ 1 ) Solving each system yields a ¼ 2, b ¼ 2 and c ¼ 1 2, d ¼ 1. Thus, fv1 ¼ 2 2t; v2 ¼ 1 2 þ tg is the basis of V that is dual to ff1; f2g. 352 CHAPTER 11 Linear Functionals and the Dual Space
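Computations such as Problem 11.1 and Example 11.3 can be organized as a single matrix inversion: if the rows of a matrix V are the given basis vectors, then the rows of (V⁻¹)^T are the dual functionals, since the condition φi(vj) = δij reads D V^T = I for the matrix D whose rows are the φi. A sketch for Problem 11.1, assuming NumPy; the signs of the basis vectors are the ones implied by the equations in that solution.

```python
import numpy as np

# Problem 11.1: basis v1 = (1, -1, 3), v2 = (0, 1, -1), v3 = (0, 3, -2) of R^3
V = np.array([[1.0, -1.0,  3.0],
              [0.0,  1.0, -1.0],
              [0.0,  3.0, -2.0]])          # rows are the basis vectors
D = np.linalg.inv(V).T                     # rows of D are the dual functionals
print(D)
# [[ 1.  0.  0.]     phi1(x, y, z) = x
#  [ 7. -2. -3.]     phi2(x, y, z) = 7x - 2y - 3z
#  [-2.  1.  1.]]    phi3(x, y, z) = -2x + y + z
assert np.allclose(D @ V.T, np.eye(3))     # phi_i(v_j) = delta_ij
```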
  • 360. 11.3. Prove Theorem 11.1: Suppose fv1; . . . ; vng is a basis of V over K. Let f1; . . . ; fn 2 V* be defined by fiðvjÞ ¼ 0 for i 6¼ j, but fiðvjÞ ¼ 1 for i ¼ j. Then ff1; . . . ; fng is a basis of V*. We first show that ff1; . . . ; fng spans V*. Let f be an arbitrary element of V*, and suppose fðv1Þ ¼ k1; fðv2Þ ¼ k2; . . . ; fðvnÞ ¼ kn Set s ¼ k1f1 þ þ knfn. Then sðv1Þ ¼ ðk1f1 þ þ knfnÞðv1Þ ¼ k1f1ðv1Þ þ k2f2ðv1Þ þ þ knfnðv1Þ ¼ k1 1 þ k2 0 þ þ kn 0 ¼ k1 Similarly, for i ¼ 2; . . . ; n, sðviÞ ¼ ðk1f1 þ þ knfnÞðviÞ ¼ k1f1ðviÞ þ þ kifiðviÞ þ þ knfnðviÞ ¼ ki Thus, fðviÞ ¼ sðviÞ for i ¼ 1; ... ; n. Because f and s agree on the basis vectors, f ¼ s ¼ k1f1 þ þ knfn. Accordingly, ff1; ... ; fng spans V*. It remains to be shown that ff1; . . . ; fng is linearly independent. Suppose a1f1 þ a2f2 þ þ anfn ¼ 0 Applying both sides to v1, we obtain 0 ¼ 0ðv1Þ ¼ ða1f1 þ þ anfnÞðv1Þ ¼ a1f1ðv1Þ þ a2f2ðv1Þ þ þ anfnðv1Þ ¼ a1 1 þ a2 0 þ þ an 0 ¼ a1 Similarly, for i ¼ 2; . . . ; n, 0 ¼ 0ðviÞ ¼ ða1f1 þ þ anfnÞðviÞ ¼ a1f1ðviÞ þ þ aifiðviÞ þ þ anfnðviÞ ¼ ai That is, a1 ¼ 0; . . . ; an ¼ 0. Hence, ff1; . . . ; fng is linearly independent, and so it is a basis of V*. 11.4. Prove Theorem 11.2: Let fv1; . . . ; vng be a basis of V and let ff1; . . . ; fng be the dual basis in V*. For any u 2 V and any s 2 V*, (i) u ¼ P i fiðuÞvi. (ii) s ¼ P i fðviÞfi. Suppose u ¼ a1v1 þ a2v2 þ þ anvn ð1Þ Then f1ðuÞ ¼ a1f1ðv1Þ þ a2f1ðv2Þ þ þ anf1ðvnÞ ¼ a1 1 þ a2 0 þ þ an 0 ¼ a1 Similarly, for i ¼ 2; . . . ; n, fiðuÞ ¼ a1fiðv1Þ þ þ aifiðviÞ þ þ anfiðvnÞ ¼ ai That is, f1ðuÞ ¼ a1, f2ðuÞ ¼ a2; . . . ; fnðuÞ ¼ an. Substituting these results into (1), we obtain (i). Next we prove ðiiÞ. Applying the linear functional s to both sides of (i), sðuÞ ¼ f1ðuÞsðv1Þ þ f2ðuÞsðv2Þ þ þ fnðuÞsðvnÞ ¼ sðv1Þf1ðuÞ þ sðv2Þf2ðuÞ þ þ sðvnÞfnðuÞ ¼ ðsðv1Þf1 þ sðv2Þf2 þ þ sðvnÞfnÞðuÞ Because the above holds for every u 2 V, s ¼ sðv1Þf2 þ sðv2Þf2 þ þ sðvnÞfn, as claimed. 11.5. Prove Theorem 11.3. Let fvig and fwig be bases of V and let ffig and fsig be the respective dual bases in V*. Let P be the change-of-basis matrix from fvig to fwig: Then ðP1 ÞT is the change-of-basis matrix from ffig to fsig. Suppose, for i ¼ 1; . . . ; n, wi ¼ ai1v1 þ ai2v2 þ þ ainvn and si ¼ bi1f1 þ bi2f2 þ þ ainvn Then P ¼ ½aij and Q ¼ ½bij. We seek to prove that Q ¼ ðP1 ÞT . Let Ri denote the ith row of Q and let Cj denote the jth column of PT . Then Ri ¼ ðbi1; bi2; . . . ; binÞ and Cj ¼ ðaj1; aj2; . . . ; ajnÞT CHAPTER 11 Linear Functionals and the Dual Space 353
  • 361. By definition of the dual basis, siðwjÞ ¼ ðbi1f1 þ bi2f2 þ þ binfnÞðaj1v1 þ aj2v2 þ þ ajnvnÞ ¼ bi1aj1 þ bi2aj2 þ þ binajn ¼ RiCj ¼ dij where dij is the Kronecker delta. Thus, QPT ¼ ½RiCj ¼ ½dij ¼ I Therefore, Q ¼ ðPT Þ1 ¼ ðP1 ÞT , as claimed. 11.6. Suppose v 2 V, v 6¼ 0, and dim V ¼ n. Show that there exists f 2 V* such that fðvÞ 6¼ 0. We extend fvg to a basis fv; v2; . . . ; vng of V. By Theorem 5.2, there exists a unique linear mapping f:V ! K such that fðvÞ ¼ 1 and fðviÞ ¼ 0, i ¼ 2; . . . ; n. Hence, f has the desired property. 11.7. Prove Theorem 11.4: Suppose dim V ¼ n. Then the natural mapping v 7! ^ v is an isomorphism of V onto V**. We first prove that the map v 7! ^ v is linear—that is, for any vectors v; w 2 V and any scalars a; b 2 K, av þ bw ¼ a^ v þ b^ w. For any linear functional f 2 V*, av þ bwðfÞ ¼ fðav þ bwÞ ¼ afðvÞ þ bfðwÞ ¼ a^ vðfÞ þ b ^ wðfÞ ¼ ða^ v þ b^ wÞðfÞ Because av þ bwðfÞ ¼ ða^ v þ b^ wÞðfÞ for every f 2 V*, we have av þ bw ¼ a^ v þ b^ w. Thus, the map v 7! ^ v is linear. Now suppose v 2 V, v 6¼ 0. Then, by Problem 11.6, there exists f 2 V* for which fðvÞ 6¼ 0. Hence, ^ vðfÞ ¼ fðvÞ 6¼ 0, and thus ^ v 6¼ 0. Because v 6¼ 0 implies ^ v 6¼ 0, the map v 7! ^ v is nonsingular and hence an isomorphism (Theorem 5.64). Now dim V ¼ dim V* ¼ dim V**, because V has finite dimension. Accordingly, the mapping v 7! ^ v is an isomorphism of V onto V**. Annihilators 11.8. Show that if f 2 V* annihilates a subset S of V, then f annihilates the linear span LðSÞ of S. Hence, S0 ¼ ½spanðSÞ0 . Suppose v 2 spanðSÞ. Then there exists w1; . . . ; wr 2 S for which v ¼ a1w1 þ a2w2 þ þ arwr. fðvÞ ¼ a1fðw1Þ þ a2fðw2Þ þ þ arfðwrÞ ¼ a10 þ a20 þ þ ar0 ¼ 0 Because v was an arbitrary element of spanðSÞ; f annihilates spanðSÞ, as claimed. 11.9. Find a basis of the annihilator W0 of the subspace W of R4 spanned by v1 ¼ ð1; 2; 3; 4Þ and v2 ¼ ð0; 1; 4; 1Þ By Problem 11.8, it suffices to find a basis of the set of linear functionals f such that fðv1Þ ¼ 0 and fðv2Þ ¼ 0, where fðx1; x2; x3; x4Þ ¼ ax1 þ bx2 þ cx3 þ dx4. Thus, fð1; 2; 3; 4Þ ¼ a þ 2b 3c þ 4d ¼ 0 and fð0; 1; 4; 1Þ ¼ b þ 4c d ¼ 0 The system of two equations in the unknowns a; b; c; d is in echelon form with free variables c and d. (1) Set c ¼ 1, d ¼ 0 to obtain the solution a ¼ 11, b ¼ 4, c ¼ 1, d ¼ 0. (2) Set c ¼ 0, d ¼ 1 to obtain the solution a ¼ 6, b ¼ 1, c ¼ 0, d ¼ 1. The linear functions f1ðxiÞ ¼ 11x1 4x2 þ x3 and f2ðxiÞ ¼ 6x1 x2 þ x4 form a basis of W0 . 11.10. Show that (a) For any subset S of V; S S00 . (b) If S1 S2, then S0 2 S0 1. (a) Let v 2 S. Then for every linear functional f 2 S0 , ^ vðfÞ ¼ fðvÞ ¼ 0. Hence, ^ v 2 ðS0 Þ0 . Therefore, under the identification of V and V**, v 2 S00 . Accordingly, S S00 . (b) Let f 2 S0 2 . Then fðvÞ ¼ 0 for every v 2 S2. But S1 S2; hence, f annihilates every element of S1 (i.e., f 2 S0 1 ). Therefore, S0 2 S0 1. d d d d 354 CHAPTER 11 Linear Functionals and the Dual Space
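Finding an annihilator, as in Problem 11.9, is a null-space computation: a functional φ(x) = a·x kills W = span{v1, v2} exactly when Ma = 0, where the rows of M are v1 and v2. A sketch assuming SymPy; the coefficient vectors it returns here reproduce the functionals found in Problem 11.9.

```python
import sympy as sp

# Problem 11.9: W = span{(1, 2, -3, 4), (0, 1, 4, -1)} in R^4
M = sp.Matrix([[1, 2, -3, 4],
               [0, 1, 4, -1]])
for a in M.nullspace():                  # each a gives a functional phi(x) = a . x in W^0
    print(a.T)                           # [11, -4, 1, 0] and [-6, 1, 0, 1]
    assert M * a == sp.zeros(2, 1)
```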
  • 362. 11.11. Prove Theorem 11.5: Suppose V has finite dimension and W is a subspace of V. Then (i) dim W þ dim W0 ¼ dim V, (ii) W00 ¼ W. (i) Suppose dim V ¼ n and dim W ¼ r n. We want to show that dim W0 ¼ n r. We choose a basis fw1; . . . ; wrg of W and extend it to a basis of V, say fw1; . . . ; wr; v1; . . . ; vnrg. Consider the dual basis ff1; . . . ; fr; s1; . . . ; snrg By definition of the dual basis, each of the above s’s annihilates each wi; hence, s1; . . . ; snr 2 W0 . We claim that fsig is a basis of W0 . Now fsjg is part of a basis of V*, and so it is linearly independent. We next show that ffjg spans W0 . Let s 2 W0 . By Theorem 11.2, s ¼ sðw1Þf1 þ þ sðwrÞfr þ sðv1Þs1 þ þ sðvnrÞsnr ¼ 0f1 þ þ 0fr þ sðv1Þs1 þ þ sðvnrÞsnr ¼ sðv1Þs1 þ þ sðvnrÞsnr Consequently, fs1; . . . ; snrg spans W0 and so it is a basis of W0 . Accordingly, as required dim W0 ¼ n r ¼ dim V dim W: (ii) Suppose dim V ¼ n and dim W ¼ r. Then dim V* ¼ n and, by (i), dim W0 ¼ n r. Thus, by (i), dim W00 ¼ n ðn rÞ ¼ r; therefore, dim W ¼ dim W00 . By Problem 11.10, W W00 . Accord- ingly, W ¼ W00 . 11.12. Let U and W be subspaces of V. Prove that ðU þ WÞ0 ¼ U0 W0 . Let f 2 ðU þ WÞ0 . Then f annihilates U þ W; and so, in particular, f annihilates U and W: That is, f 2 U0 and f 2 W0 ; hence, f 2 U0 W0 : Thus, ðU þ WÞ0 U0 W0 : On the other hand, suppose s 2 U0 W0 : Then s annihilates U and also W. If v 2 U þ W, then v ¼ u þ w, where u 2 U and w 2 W. Hence, sðvÞ ¼ sðuÞ þ sðwÞ ¼ 0 þ 0 ¼ 0. Thus, s annihilates U þ W; that is, s 2 ðU þ WÞ0 . Accordingly, U0 þ W0 ðU þ WÞ0 . The two inclusion relations together give us the desired equality. Remark: Observe that no dimension argument is employed in the proof; hence, the result holds for spaces of finite or infinite dimension. Transpose of a Linear Mapping 11.13. Let f be the linear functional on R2 defined by fðx; yÞ ¼ x 2y. For each of the following linear operators T on R2 , find ðTt ðfÞÞðx; yÞ: (a) Tðx; yÞ ¼ ðx; 0Þ, (b) Tðx; yÞ ¼ ðy; x þ yÞ, (c) Tðx; yÞ ¼ ð2x 3y; 5x þ 2yÞ By definition, Tt ðfÞ ¼ f T; that is, ðTt ðfÞÞðvÞ ¼ fðTðvÞÞ for every v. Hence, (a) ðTt ðfÞÞðx; yÞ ¼ fðTðx; yÞÞ ¼ fðx; 0Þ ¼ x (b) ðTt ðfÞÞðx; yÞ ¼ fðTðx; yÞÞ ¼ fðy; x þ yÞ ¼ y 2ðx þ yÞ ¼ 2x y (c) ðTt ðfÞÞðx; yÞ ¼ fðTðx; yÞÞ ¼ fð2x 3y; 5x þ 2yÞ ¼ ð2x 3yÞ 2ð5x þ 2yÞ ¼ 8x 7y 11.14. Let T :V ! U be linear and let Tt :U* ! V* be its transpose. Show that the kernel of Tt is the annihilator of the image of T—that is, Ker Tt ¼ ðIm TÞ0 . Suppose f 2 Ker Tt ; that is, Tt ðfÞ ¼ f T ¼ 0. If u 2 Im T, then u ¼ TðvÞ for some v 2 V; hence, fðuÞ ¼ fðTðvÞÞ ¼ ðf TÞðvÞ ¼ 0ðvÞ ¼ 0 We have that fðuÞ ¼ 0 for every u 2 Im T; hence, f 2 ðIm TÞ0 . Thus, Ker Tt ðIm TÞ0 . On the other hand, suppose s 2 ðIm TÞ0 ; that is, sðIm TÞ ¼ f0g . Then, for every v 2 V, ðTt ðsÞÞðvÞ ¼ ðs TÞðvÞ ¼ sðTðvÞÞ ¼ 0 ¼ 0ðvÞ CHAPTER 11 Linear Functionals and the Dual Space 355
  • 363. We have ðTt ðsÞÞðvÞ ¼ 0ðvÞ for every v 2 V; hence, Tt ðsÞ ¼ 0. Thus, s 2 Ker Tt , and so ðIm TÞ0 Ker Tt . The two inclusion relations together give us the required equality. 11.15. Suppose V and U have finite dimension and T:V ! U is linear. Prove rankðTÞ ¼ rankðTt Þ. Suppose dim V ¼ n and dim U ¼ m, and suppose rankðTÞ ¼ r. By Theorem 11.5, dimðIm TÞ0 ¼ dim u dimðIm TÞ ¼ m rankðTÞ ¼ m r By Problem 11.14, Ker Tt ¼ ðIm TÞ0 . Hence, nullity ðTt Þ ¼ m r. It then follows that, as claimed, rankðTt Þ ¼ dim U* nullityðTt Þ ¼ m ðm rÞ ¼ r ¼ rankðTÞ 11.16. Prove Theorem 11.7: Let T :V ! U be linear and let A be the matrix representation of T in the bases fvjg of V and fuig of U. Then the transpose matrix AT is the matrix representation of Tt :U* ! V* in the bases dual to fuig and fvjg. Suppose, for j ¼ 1; . . . ; m, TðvjÞ ¼ aj1u1 þ aj2u2 þ þ ajnun ð1Þ We want to prove that, for i ¼ 1; . . . ; n, Tt ðsiÞ ¼ a1if1 þ a2if2 þ þ amifm ð2Þ where fsig and ffjg are the bases dual to fuig and fvjg, respectively. Let v 2 V and suppose v ¼ k1v1 þ k2v2 þ þ kmvm. Then, by (1), TðvÞ ¼ k1Tðv1Þ þ k2Tðv2Þ þ þ kmTðvmÞ ¼ k1ða11u1 þ þ a1nunÞ þ k2ða21u1 þ þ a2nunÞ þ þ kmðam1u1 þ þ amnunÞ ¼ ðk1a11 þ k2a21 þ þ kmam1Þu1 þ þ ðk1a1n þ k2a2n þ þ kmamnÞun ¼ P n i¼1 ðk1a1i þ k2a2i þ þ kmamiÞui Hence, for j ¼ 1; . . . ; n. ðTt ðsjÞðvÞÞ ¼ sjðTðvÞÞ ¼ sj P n i¼1 ðk1a1i þ k2a2i þ þ kmamiÞui ¼ k1a1j þ k2a2j þ þ kmamj ð3Þ On the other hand, for j ¼ 1; . . . ; n, ða1jf1 þ a2jf2 þ þ amjfmÞðvÞ ¼ ða1jf1 þ a2jf2 þ þ amjfmÞðk1v1 þ k2v2 þ þ kmvmÞ ¼ k1a1j þ k2a2j þ þ kmamj ð4Þ Because v 2 V was arbitrary, (3) and (4) imply that Tt ðsjÞ ¼ a1jf1 þ a2jf2 þ þ amjfm; j ¼ 1; . . . ; n which is (2). Thus, the theorem is proved. SUPPLEMENTARY PROBLEMS Dual Spaces and Dual Bases 11.17. Find (a) f þ s, (b) 3f, (c) 2f 5s, where f:R3 ! R and s:R3 ! R are defined by fðx; y; zÞ ¼ 2x 3y þ z and sðx; y; zÞ ¼ 4x 2y þ 3z 11.18. Find the dual basis of each of the following bases of R3 : (a) fð1; 0; 0Þ; ð0; 1; 0Þ; ð0; 0; 1Þg, (b) fð1; 2; 3Þ; ð1; 1; 1Þ; ð2; 4; 7Þg. 356 CHAPTER 11 Linear Functionals and the Dual Space
  • 364. 11.19. Let V be the vector space of polynomials over R of degree 2. Let f1; f2; f3 be the linear functionals on V defined by f1ð f ðtÞÞ ¼ ð1 0 f ðtÞ dt; f2ð f ðtÞÞ ¼ f 0 ð1Þ; f3ð f ðtÞÞ ¼ f ð0Þ Here f ðtÞ ¼ a þ bt þ ct2 2 V and f 0 ðtÞ denotes the derivative of f ðtÞ. Find the basis f f1ðtÞ; f2ðtÞ; f3ðtÞg of V that is dual to ff1; f2; f3g. 11.20. Suppose u; v 2 V and that fðuÞ ¼ 0 implies fðvÞ ¼ 0 for all f 2 V*. Show that v ¼ ku for some scalar k. 11.21. Suppose f; s 2 V* and that fðvÞ ¼ 0 implies sðvÞ ¼ 0 for all v 2 V. Show that s ¼ kf for some scalar k. 11.22. Let V be the vector space of polynomials over K. For a 2 K, define fa :V ! K by fað f ðtÞÞ ¼ f ðaÞ. Show that (a) fa is linear; (b) if a 6¼ b, then fa 6¼ fb. 11.23. Let V be the vector space of polynomials of degree 2. Let a; b; c 2 K be distinct scalars. Let fa; fb; fc be the linear functionals defined by fað f ðtÞÞ ¼ f ðaÞ, fbð f ðtÞÞ ¼ f ðbÞ, fcð f ðtÞÞ ¼ f ðcÞ. Show that ffa; fb; fcg is linearly independent, and find the basis f f1ðtÞ; f2ðtÞ; f3ðtÞg of V that is its dual. 11.24. Let V be the vector space of square matrices of order n. Let T :V ! K be the trace mapping; that is, TðAÞ ¼ a11 þ a22 þ þ ann, where A ¼ ðaijÞ. Show that T is linear. 11.25. Let W be a subspace of V. For any linear functional f on W, show that there is a linear functional s on V such that sðwÞ ¼ fðwÞ for any w 2 W; that is, f is the restriction of s to W. 11.26. Let fe1; . . . ; eng be the usual basis of Kn . Show that the dual basis is fp1; . . . ; png where pi is the ith projection mapping; that is, piða1; . . . ; anÞ ¼ ai. 11.27. Let V be a vector space over R. Let f1; f2 2 V* and suppose s:V ! R; defined by sðvÞ ¼ f1ðvÞf2ðvÞ; also belongs to V*. Show that either f1 ¼ 0 or f2 ¼ 0. Annihilators 11.28. Let W be the subspace of R4 spanned by ð1; 2; 3; 4Þ, ð1; 3; 2; 6Þ, ð1; 4; 1; 8Þ. Find a basis of the annihilator of W. 11.29. Let W be the subspace of R3 spanned by ð1; 1; 0Þ and ð0; 1; 1Þ. Find a basis of the annihilator of W. 11.30. Show that, for any subset S of V; spanðSÞ ¼ S00 , where spanðSÞ is the linear span of S. 11.31. Let U and W be subspaces of a vector space V of finite dimension. Prove that ðU WÞ0 ¼ U0 þ W0 . 11.32. Suppose V ¼ U W. Prove that V0 ¼ U0 W0 . Transpose of a Linear Mapping 11.33. Let f be the linear functional on R2 defined by fðx; yÞ ¼ 3x 2y. For each of the following linear mappings T :R3 ! R2 , find ðTt ðfÞÞðx; y; zÞ: (a) Tðx; y; zÞ ¼ ðx þ y; y þ zÞ, (b) Tðx; y; zÞ ¼ ðx þ y þ z; 2x yÞ 11.34. Suppose T1 :U ! V and T2 :V ! W are linear. Prove that ðT2 T1Þt ¼ Tt 1 Tt 2. 11.35. Suppose T :V ! U is linear and V has finite dimension. Prove that Im Tt ¼ ðKer TÞ0 . CHAPTER 11 Linear Functionals and the Dual Space 357
  • 365. 11.36. Suppose T :V ! U is linear and u 2 U. Prove that u 2 Im T or there exists f 2 V* such that Tt ðfÞ ¼ 0 and fðuÞ ¼ 1. 11.37. Let V be of finite dimension. Show that the mapping T 7! Tt is an isomorphism from HomðV; VÞ onto HomðV*; V*Þ. (Here T is any linear operator on V.) Miscellaneous Problems 11.38. Let V be a vector space over R. The line segment uv joining points u; v 2 V is defined by uv ¼ ftu þ ð1 tÞv:0 t 1g. A subset S of V is convex if u; v 2 S implies uv S. Let f 2 V*. Define Wþ ¼ fv 2 V : fðvÞ 0g; W ¼ fv 2 V : fðvÞ ¼ 0g; W ¼ fv 2 V : fðvÞ 0g Prove that Wþ ; W, and W are convex. 11.39. Let V be a vector space of finite dimension. A hyperplane H of V may be defined as the kernel of a nonzero linear functional f on V. Show that every subspace of V is the intersection of a finite number of hyperplanes. ANSWERS TO SUPPLEMENTARY PROBLEMS 11.17. (a) 6x 5y þ 4z, (b) 6x 9y þ 3z, (c) 16x þ 4y 13z 11.18. (a) f1 ¼ x; f2 ¼ y; f3 ¼ z; (b) f1 ¼ 3x 5y 2z; f2 ¼ 2x þ y; f3 ¼ x þ 2y þ z 11.19. f1ðtÞ ¼ 3t 3 2 t2 ; f2ðtÞ ¼ 1 2 t þ 3 4 t2 ; f3ðtÞ ¼ 1 3t þ 3 2 t2 11.22. (b) Let f ðtÞ ¼ t. Then fað f ðtÞÞ ¼ a 6¼ b ¼ fbð f ðtÞÞ; and therefore, fa 6¼ fb 11.23. f1ðtÞ ¼ t2 ðb þ cÞt þ bc ða bÞða cÞ ; f2ðtÞ ¼ t2 ða þ cÞt þ ac ðb aÞðb cÞ ; f3ðtÞ ¼ t2 ða þ bÞt þ ab ðc aÞðc bÞ 11.28. ff1ðx; y; z; tÞ ¼ 5x y þ z; f2ðx; y; z; tÞ ¼ 2y tg 11.29. ffðx; y; zÞ ¼ x y þ zg 11.33. (a) ðTt ðfÞÞðx; y; zÞ ¼ 3x þ y 2z, (b) ðTt ðfÞÞðx; y; zÞ ¼ x þ 5y þ 3z 358 CHAPTER 11 Linear Functionals and the Dual Space
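Remark: Dual-basis answers such as 11.19 are quick to verify symbolically: apply each of the three functionals (integrate over [0, 1], differentiate and evaluate at 1, evaluate at 0) to each candidate polynomial and check that the Kronecker-delta pattern appears. A minimal sketch assuming SymPy:

```python
import sympy as sp

t = sp.symbols('t')
# Answer 11.19: the basis dual to phi1(f) = int_0^1 f dt, phi2(f) = f'(1), phi3(f) = f(0)
f1 = 3*t - sp.Rational(3, 2)*t**2
f2 = -sp.Rational(1, 2)*t + sp.Rational(3, 4)*t**2
f3 = 1 - 3*t + sp.Rational(3, 2)*t**2

phis = [lambda f: sp.integrate(f, (t, 0, 1)),
        lambda f: sp.diff(f, t).subs(t, 1),
        lambda f: f.subs(t, 0)]

table = [[phi(f) for f in (f1, f2, f3)] for phi in phis]
assert sp.Matrix(table) == sp.eye(3)          # phi_i(f_j) = delta_ij
```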
CHAPTER 12

Bilinear, Quadratic, and Hermitian Forms

12.1 Introduction

This chapter generalizes the notions of linear mappings and linear functionals. Specifically, we introduce the notion of a bilinear form. These bilinear maps also give rise to quadratic and Hermitian forms. Although quadratic forms were discussed previously, this chapter is treated independently of the previous results.

Although the field K is arbitrary, we will later specialize to the cases K = R and K = C. Furthermore, we may sometimes need to divide by 2. In such cases, we must assume that 1 + 1 ≠ 0, which is true when K = R or K = C.

12.2 Bilinear Forms

Let V be a vector space of finite dimension over a field K. A bilinear form on V is a mapping f: V × V → K such that, for all a, b ∈ K and all ui, vi ∈ V:

(i) f(au1 + bu2, v) = af(u1, v) + bf(u2, v),
(ii) f(u, av1 + bv2) = af(u, v1) + bf(u, v2)

We express condition (i) by saying f is linear in the first variable, and condition (ii) by saying f is linear in the second variable.

EXAMPLE 12.1

(a) Let f be the dot product on R^n; that is, for u = (ai) and v = (bi),

    f(u, v) = u · v = a1b1 + a2b2 + ... + anbn

Then f is a bilinear form on R^n. (In fact, any inner product on a real vector space V is a bilinear form on V.)

(b) Let φ and σ be arbitrary linear functionals on V. Let f: V × V → K be defined by f(u, v) = φ(u)σ(v). Then f is a bilinear form, because φ and σ are each linear.

(c) Let A = [aij] be any n × n matrix over a field K. Then A may be identified with the following bilinear form f on K^n, where X = [xi] and Y = [yi] are column vectors of variables:

    f(X, Y) = X^T A Y = Σ_{i,j} aij xi yj = a11x1y1 + a12x1y2 + ... + ann xn yn

The above formal expression in the variables xi, yi is termed the bilinear polynomial corresponding to the matrix A. Equation (12.1) shows that, in a certain sense, every bilinear form is of this type.
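Bilinearity of f(X, Y) = X^T A Y in Example 12.1(c), proved formally in Problem 12.2, can also be spot-checked numerically. A minimal sketch with random data, assuming NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))

def f(X, Y):
    return X @ A @ Y                    # f(X, Y) = X^T A Y

X1, X2, Y1, Y2 = rng.standard_normal((4, 3))
a, b = 2.0, -3.0
assert np.isclose(f(a*X1 + b*X2, Y1), a*f(X1, Y1) + b*f(X2, Y1))   # linear in X
assert np.isclose(f(X1, a*Y1 + b*Y2), a*f(X1, Y1) + b*f(X1, Y2))   # linear in Y
```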
  • 367. Space of Bilinear Forms Let BðVÞ denote the set of all bilinear forms on V. A vector space structure is placed on BðVÞ, where for any f ; g 2 BðVÞ and any k 2 K, we define f þ g and kf as follows: ð f þ gÞðu; vÞ ¼ f ðu; vÞ þ gðu; vÞ and ðkf Þðu; vÞ ¼ kf ðu; vÞ The following theorem (proved in Problem 12.4) applies. THEOREM 12.1: Let V be a vector space of dimension n over K. Let ff1; . . . ; fng be any basis of the dual space V*. Then f fij : i; j ¼ 1; . . . ; ng is a basis of BðVÞ, where fij is defined by fijðu; vÞ ¼ fiðuÞfjðvÞ. Thus, in particular, dim BðVÞ ¼ n2 . 12.3 Bilinear Forms and Matrices Let f be a bilinear form on V and let S ¼ fu1; . . . ; ung be a basis of V. Suppose u; v 2 V and u ¼ a1u1 þ þ anun and v ¼ b1u1 þ þ bnun Then f ðu; vÞ ¼ f ða1u1 þ þ anun; b1u1 þ þ bnunÞ ¼ P i;j aibjf ðui; ujÞ Thus, f is completely determined by the n2 values f ðui; ujÞ. The matrix A ¼ ½aij where aij ¼ f ðui; ujÞ is called the matrix representation of f relative to the basis S or, simply, the ‘‘matrix of f in S.’’ It ‘‘represents’’ f in the sense that, for all u; v 2 V, f ðu; vÞ ¼ P i;j aibj f ðui; ujÞ ¼ ½uT S A½vS ð12:1Þ [As usual, ½uS denotes the coordinate (column) vector of u in the basis S.] Change of Basis, Congruent Matrices We now ask, how does a matrix representing a bilinear form transform when a new basis is selected? The answer is given in the following theorem (proved in Problem 12.5). THEOREM 12.2: Let P be a change-of-basis matrix from one basis S to another basis S0 . If A is the matrix representing a bilinear form f in the original basis S, then B ¼ PT AP is the matrix representing f in the new basis S0 . The above theorem motivates the following definition. DEFINITION: A matrix B is congruent to a matrix A, written B ’ A, if there exists a nonsingular matrix P such that B ¼ PT AP. Thus, by Theorem 12.2, matrices representing the same bilinear form are congruent. We remark that congruent matrices have the same rank, because P and PT are nonsingular; hence, the following definition is well defined. DEFINITION: The rank of a bilinear form f on V, written rankð f Þ, is the rank of any matrix representation of f . We say f is degenerate or nondegenerate according to whether rankð f Þ dim V or rankð f Þ ¼ dim V. 12.4 Alternating Bilinear Forms Let f be a bilinear form on V. Then f is called (i) alternating if f ðv; vÞ ¼ 0 for every v 2 V; (ii) skew-symmetric if f ðu; vÞ ¼ f ðv; uÞ for every u; v 2 V. 360 CHAPTER 12 Bilinear, Quadratic, and Hermitian Forms
  • 368. Now suppose (i) is true. Then (ii) is true, because, for any u; v; 2 V, 0 ¼ f ðu þ v; u þ vÞ ¼ f ðu; uÞ þ f ðu; vÞ þ f ðv; uÞ þ f ðv; vÞ ¼ f ðu; vÞ þ f ðv; uÞ On the other hand, suppose (ii) is true and also 1 þ 1 6¼ 0. Then (i) is true, because, for every v 2 V, we have f ðv; vÞ ¼ f ðv; vÞ. In other words, alternating and skew-symmetric are equivalent when 1 þ 1 6¼ 0. The main structure theorem of alternating bilinear forms (proved in Problem 12.23) is as follows. THEOREM 12.3: Let f be an alternating bilinear form on V. Then there exists a basis of V in which f is represented by a block diagonal matrix M of the form M ¼ diag 0 1 1 0 ; 0 1 1 0 ; . . . ; 0 1 1 0 ; ½0; ½0; . . . ½0 Moreover, the number of nonzero blocks is uniquely determined by f [because it is equal to 1 2 rankð f Þ. In particular, the above theorem shows that any alternating bilinear form must have even rank. 12.5 Symmetric Bilinear Forms, Quadratic Forms This section investigates the important notions of symmetric bilinear forms and quadratic forms and their representation by means of symmetric matrices. The only restriction on the field K is that 1 þ 1 6¼ 0. In Section 12.6, we will restrict K to be the real field R, which yields important special results. Symmetric Bilinear Forms Let f be a bilinear form on V. Then f is said to be symmetric if, for every u; v 2 V, f ðu; vÞ ¼ f ðv; uÞ One can easily show that f is symmetric if and only if any matrix representation A of f is a symmetric matrix. The main result for symmetric bilinear forms (proved in Problem 12.10) is as follows. (We emphasize that we are assuming that 1 þ 1 6¼ 0.) THEOREM 12.4: Let f be a symmetric bilinear form on V. Then V has a basis fv1; . . . ; vng in which f is represented by a diagonal matrix—that is, where f ðvi; vjÞ ¼ 0 for i 6¼ j. THEOREM 12.4: (Alternative Form) Let A be a symmetric matrix over K. Then A is congruent to a diagonal matrix; that is, there exists a nonsingular matrix P such that PT AP is diagonal. Diagonalization Algorithm Recall that a nonsingular matrix P is a product of elementary matrices. Accordingly, one way of obtaining the diagonal form D ¼ PT AP is by a sequence of elementary row operations and the same sequence of elementary column operations. This same sequence of elementary row operations on the identity matrix I will yield PT . This algorithm is formalized below. ALGORITHM 12.1: (Congruence Diagonalization of a Symmetric Matrix) The input is a symmetric matrix A ¼ ½aij of order n. Step 1. Form the n 2n (block) matrix M ¼ ½A1; I, where A1 ¼ A is the left half of M and the identity matrix I is the right half of M. Step 2. Examine the entry a11. There are three cases. CHAPTER 12 Bilinear, Quadratic, and Hermitian Forms 361
  • 369. Case I: a11 6¼ 0. (Use a11 as a pivot to put 0’s below a11 in M and to the right of a11 in A1:Þ For i ¼ 2; . . . ; n: (a) Apply the row operation ‘‘Replace Ri by ai1R1 þ a11Ri.’’ (b) Apply the corresponding column operation ‘‘Replace Ci by ai1C1 þ a11Ci.’’ These operations reduce the matrix M to the form M a11 0 * * 0 A1 * * ð*Þ Case II: a11 ¼ 0 but akk 6¼ 0, for some k 1. (a) Apply the row operation ‘‘Interchange R1 and Rk.’’ (b) Apply the corresponding column operation ‘‘Interchange C1 and Ck.’’ (These operations bring akk into the first diagonal position, which reduces the matrix to Case I.) Case III: All diagonal entries aii ¼ 0 but some aij 6¼ 0. (a) Apply the row operation ‘‘Replace Ri by Rj þ Ri.’’ (b) Apply the corresponding column operation ‘‘Replace Ci by Cj þ Ci.’’ (These operations bring 2aij into the ith diagonal position, which reduces the matrix to Case II.) Thus, M is finally reduced to the form ð*Þ, where A2 is a symmetric matrix of order less than A. Step 3. Repeat Step 2 with each new matrix Ak (by neglecting the first row and column of the preceding matrix) until A is diagonalized. Then M is transformed into the form M0 ¼ ½D; Q, where D is diagonal. Step 4. Set P ¼ QT . Then D ¼ PT AP. Remark 1: We emphasize that in Step 2, the row operations will change both sides of M, but the column operations will only change the left half of M. Remark 2: The condition 1 þ 1 6¼ 0 is used in Case III, where we assume that 2aij 6¼ 0 when aij 6¼ 0. The justification for the above algorithm appears in Problem 12.9. EXAMPLE 12.2 Let A ¼ 1 2 3 2 5 4 3 4 8 2 4 3 5. Apply Algorithm 9.1 to find a nonsingular matrix P such that D ¼ PT AP is diagonal. First form the block matrix M ¼ ½A; I; that is, let M ¼ ½A; I ¼ 1 2 3 1 0 0 2 5 4 0 1 0 3 4 8 0 0 1 2 4 3 5 Apply the row operations ‘‘Replace R2 by 2R1 þ R2’’ and ‘‘Replace R3 by 3R1 þ R3’’ to M, and then apply the corresponding column operations ‘‘Replace C2 by 2C1 þ C2’’ and ‘‘Replace C3 by 3C1 þ C3’’ to obtain 1 2 3 1 0 0 0 1 2 2 1 0 0 2 1 3 0 1 2 4 3 5 and then 1 0 0 1 0 0 0 1 2 2 1 0 0 2 1 3 0 1 2 4 3 5 362 CHAPTER 12 Bilinear, Quadratic, and Hermitian Forms
  • 370. Next apply the row operation ‘‘Replace R3 by 2R2 þ R3’’ and then the corresponding column operation ‘‘Replace C3 by 2C2 þ C3’’ to obtain 1 0 0 1 0 0 0 1 2 2 1 0 0 0 5 7 2 1 2 4 3 5 and then 1 0 0 1 0 0 0 1 0 2 1 0 0 0 5 7 2 1 2 4 3 5 Now A has been diagonalized. Set P ¼ 1 2 7 0 1 2 0 0 1 2 4 3 5 and then D ¼ P1 AP ¼ 1 0 0 0 1 0 0 0 5 2 4 3 5 We emphasize that P is the transpose of the right half of the final matrix. Quadratic Forms We begin with a definition. DEFINITION A: A mapping q:V ! K is a quadratic form if qðvÞ ¼ f ðv; vÞ for some symmetric bilinear form f on V. If 1 þ 1 6¼ 0 in K, then the bilinear form f can be obtained from the quadratic form q by the following polar form of f : f ðu; vÞ ¼ 1 2 ½qðu þ vÞ qðuÞ qðvÞ Now suppose f is represented by a symmetric matrix A ¼ ½aij, and 1 þ 1 6¼ 0. Letting X ¼ ½xi denote a column vector of variables, q can be represented in the form qðXÞ ¼ f ðX; XÞ ¼ XT AX ¼ P i;j aijxixj ¼ P i aiix2 i þ 2 P ij aijxixj The above formal expression in the variables xi is also called a quadratic form. Namely, we have the following second definition. DEFINITION B: A quadratic form q in variables x1; x2; . . . ; xn is a polynomial such that every term has degree two; that is, qðx1; x2; . . . ; xnÞ ¼ P i cix2 i þ P ij dijxixj Using 1 þ 1 6¼ 0, the quadratic form q in Definition B determines a symmetric matrix A ¼ ½aij where aii ¼ ci and aij ¼ aji ¼ 1 2 dij. Thus, Definitions A and B are essentially the same. If the matrix representation A of q is diagonal, then q has the diagonal representation qðXÞ ¼ XT AX ¼ a11x2 1 þ a22x2 2 þ þ annx2 n That is, the quadratic polynomial representing q will contain no ‘‘cross product’’ terms. Moreover, by Theorem 12.4, every quadratic form has such a representation (when 1 þ 1 6¼ 0Þ. 12.6 Real Symmetric Bilinear Forms, Law of Inertia This section treats symmetric bilinear forms and quadratic forms on vector spaces V over the real field R. The special nature of R permits an independent theory. The main result (proved in Problem 12.14) is as follows. THEOREM 12.5: Let f be a symmetric form on V over R. Then there exists a basis of V in which f is represented by a diagonal matrix. Every other diagonal matrix representation of f has the same number p of positive entries and the same number n of negative entries. CHAPTER 12 Bilinear, Quadratic, and Hermitian Forms 363
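Algorithm 12.1 is easy to automate when every pivot turns out nonzero, which is what happens in Example 12.2. The sketch below, assuming NumPy, is my own rendering of Case I only (it divides by the pivot instead of scaling by it, and does not handle Cases II and III); with A as in Example 12.2 (signs as implied by the row operations shown there) it reproduces the matrices P and D obtained above.

```python
import numpy as np

def congruence_diagonalize(A):
    """Return (D, P) with D = P^T A P diagonal, for a symmetric A whose
    pivots are all nonzero (Case I of Algorithm 12.1 only)."""
    n = A.shape[0]
    D = A.astype(float)                  # working copy
    Q = np.eye(n)                        # accumulates the row operations, Q = P^T
    for k in range(n - 1):
        for i in range(k + 1, n):
            m = -D[i, k] / D[k, k]
            D[i, :] += m * D[k, :]       # row operation on D
            Q[i, :] += m * Q[k, :]       # same row operation on I
            D[:, i] += m * D[:, k]       # corresponding column operation on D
    return D, Q.T

A = np.array([[ 1.0,  2.0, -3.0],        # the matrix of Example 12.2
              [ 2.0,  5.0, -4.0],
              [-3.0, -4.0,  8.0]])
D, P = congruence_diagonalize(A)
print(np.round(P))                       # [[1, -2, 7], [0, 1, -2], [0, 0, 1]]
print(np.round(D))                       # diag(1, 1, -5)
assert np.allclose(P.T @ A @ P, D)
```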
  • 371. The above result is sometimes called the Law of Inertia or Sylvester’s Theorem. The rank and signature of the symmetric bilinear form f are denoted and defined by rankð f Þ ¼ p þ n and sigð f Þ ¼ p n These are uniquely defined by Theorem 12.5. A real symmetric bilinear form f is said to be (i) positive definite if qðvÞ ¼ f ðv; vÞ 0 for every v 6¼ 0, (ii) nonnegative semidefinite if qðvÞ ¼ f ðv; vÞ 0 for every v. EXAMPLE 12.3 Let f be the dot product on Rn . Recall that f is a symmetric bilinear form on Rn . We note that f is also positive definite. That is, for any u ¼ ðaiÞ 6¼ 0 in Rn , f ðu; uÞ ¼ a2 1 þ a2 2 þ þ a2 n 0 Section 12.5 and Chapter 13 tell us how to diagonalize a real quadratic form q or, equivalently, a real symmetric matrix A by means of an orthogonal transition matrix P. If P is merely nonsingular, then q can be represented in diagonal form with only 1’s and 1’s as nonzero coefficients. Namely, we have the following corollary. COROLLARY 12.6: Any real quadratic form q has a unique representation in the form qðx1; x2; . . . ; xnÞ ¼ x2 1 þ þ x2 p x2 pþ1 x2 r where r ¼ p þ n is the rank of the form. COROLLARY 12.6: (Alternative Form) Any real symmetric matrix A is congruent to the unique diagonal matrix D ¼ diagðIp; In; 0Þ where r ¼ p þ n is the rank of A. 12.7 Hermitian Forms Let V be a vector space of finite dimension over the complex field C. A Hermitian form on V is a mapping f :V V ! C such that, for all a; b 2 C and all ui; v 2 V, (i) f ðau1 þ bu2; vÞ ¼ af ðu1; vÞ þ bf ðu2; vÞ, (ii) f ðu; vÞ ¼ f ðv; uÞ. (As usual, k denotes the complex conjugate of k 2 C.) Using (i) and (ii), we get f ðu; av1 þ bv2Þ ¼ f ðav1 þ bv2; uÞ ¼ af ðv1; uÞ þ bf ðv2; uÞ ¼ ^ af ðv1; uÞ þ bf ðv2; uÞ ¼ af ðu; v1Þ þ bf ðu; v2Þ That is, ðiiiÞ f ðu; av1 þ bv2Þ ¼ af ðu; v1Þ þ bf ðu; v2Þ: As before, we express condition (i) by saying f is linear in the first variable. On the other hand, we express condition (iii) by saying f is ‘‘conjugate linear’’ in the second variable. Moreover, condition (ii) tells us that f ðv; vÞ ¼ f ðv; vÞ, and hence, f ðv; vÞ is real for every v 2 V. The results of Sections 12.5 and 12.6 for symmetric forms have their analogues for Hermitian forms. Thus, the mapping q:V ! R, defined by qðvÞ ¼ f ðv; vÞ, is called the Hermitian quadratic form or complex quadratic form associated with the Hermitian form f . We can obtain f from q by the polar form f ðu; vÞ ¼ 1 4 ½qðu þ vÞ qðu vÞ þ 1 4 ½qðu þ ivÞ qðu ivÞ 364 CHAPTER 12 Bilinear, Quadratic, and Hermitian Forms
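The complex polar form is easy to check numerically. Below is a minimal NumPy sketch; the 2x2 Hermitian matrix H is chosen only for illustration, and the form is taken as f(u, v) = u^T H v-bar, the matrix representation used later in Problem 12.20.

```python
import numpy as np

# A Hermitian matrix chosen for illustration (H* = H).
H = np.array([[2, 1 - 1j],
              [1 + 1j, 3]])

def f(u, v):
    # Hermitian form f(u, v) = u^T H conj(v): linear in u, conjugate-linear in v
    return u @ H @ v.conj()

def q(v):
    # Associated Hermitian quadratic form q(v) = f(v, v); its value is real
    return f(v, v)

rng = np.random.default_rng(0)
u = rng.standard_normal(2) + 1j * rng.standard_normal(2)
v = rng.standard_normal(2) + 1j * rng.standard_normal(2)

# Polar form: f(u, v) = 1/4 [q(u+v) - q(u-v)] + (i/4) [q(u+iv) - q(u-iv)]
polar = 0.25 * (q(u + v) - q(u - v)) + 0.25j * (q(u + 1j * v) - q(u - 1j * v))
print(np.isclose(f(u, v), polar))   # True (up to rounding)
```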
  • 372. Now suppose S ¼ fu1; . . . ; ung is a basis of V. The matrix H ¼ ½hij where hij ¼ f ðui; ujÞ is called the matrix representation of f in the basis S. By (ii), f ðui; ujÞ ¼ f ðuj; uiÞ; hence, H is Hermitian and, in particular, the diagonal entries of H are real. Thus, any diagonal representation of f contains only real entries. The next theorem (to be proved in Problem 12.47) is the complex analog of Theorem 12.5 on real symmetric bilinear forms. THEOREM 12.7: Let f be a Hermitian form on V over C. Then there exists a basis of V in which f is represented by a diagonal matrix. Every other diagonal matrix representation of f has the same number p of positive entries and the same number n of negative entries. Again the rank and signature of the Hermitian form f are denoted and defined by rankð f Þ ¼ p þ n and sigð f Þ ¼ p n These are uniquely defined by Theorem 12.7. Analogously, a Hermitian form f is said to be (i) positive definite if qðvÞ ¼ f ðv; vÞ 0 for every v 6¼ 0, (ii) nonnegative semidefinite if qðvÞ ¼ f ðv; vÞ 0 for every v. EXAMPLE 12.4 Let f be the dot product on Cn ; that is, for any u ¼ ðziÞ and v ¼ ðwiÞ in Cn , f ðu; vÞ ¼ u v ¼ z1 w1 þ z2 w2 þ þ zn wn Then f is a Hermitian form on Cn . Moreover, f is also positive definite, because, for any u ¼ ðziÞ 6¼ 0 in Cn , f ðu; uÞ ¼ z1 z1 þ z2 z2 þ þ zn zn ¼ jz1j2 þ jz2j2 þ þ jznj2 0 SOLVED PROBLEMS Bilinear Forms 12.1. Let u ¼ ðx1; x2; x3Þ and v ¼ ðy1; y2; y3Þ. Express f in matrix notation, where f ðu; vÞ ¼ 3x1y1 2x1y3 þ 5x2y1 þ 7x2y2 8x2y3 þ 4x3y2 6x3y3 Let A ¼ ½aij, where aij is the coefficient of xiyj. Then f ðu; vÞ ¼ XT AY ¼ ½x1; x2; x3 3 0 2 5 7 8 0 4 6 2 4 3 5 y1 y2 y3 2 4 3 5 12.2. Let A be an n n matrix over K. Show that the mapping f defined by f ðX; YÞ ¼ XT AY is a bilinear form on Kn . For any a; b 2 K and any Xi; Yi 2 Kn , f ðaX1 þ bX2; YÞ ¼ ðaX1 þ bX2ÞT AY ¼ ðaXT 1 þ bXT 2 ÞAY ¼ aXT 1 AY þ bXT 2 AY ¼ af ðX1; YÞ þ bf ðX2; YÞ Hence, f is linear in the first variable. Also, f ðX; aY1 þ bY2Þ ¼ XT AðaY1 þ bY2Þ ¼ aXT AY1 þ bXT AY2 ¼ af ðX; Y1Þ þ bf ðX; Y2Þ Hence, f is linear in the second variable, and so f is a bilinear form on Kn . CHAPTER 12 Bilinear, Quadratic, and Hermitian Forms 365
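As a quick numeric companion to Problems 12.1 and 12.2, the sketch below evaluates f(u, v) = X^T A Y with the coefficient matrix read off from Problem 12.1 and spot-checks linearity in the first variable; the random seed and scalars are arbitrary choices.

```python
import numpy as np

# Coefficient matrix of f(u, v) = 3x1y1 - 2x1y3 + 5x2y1 + 7x2y2 - 8x2y3 + 4x3y2 - 6x3y3,
# with A[i, j] equal to the coefficient of x_{i+1} y_{j+1} (Problem 12.1).
A = np.array([[3, 0, -2],
              [5, 7, -8],
              [0, 4, -6]])

def f(u, v):
    return u @ A @ v            # f(u, v) = u^T A v

rng = np.random.default_rng(1)
u1, u2, v = rng.integers(-5, 6, size=(3, 3))
a, b = 2, -3

# Linearity in the first variable (Problem 12.2); the second variable is checked the same way.
print(np.isclose(f(a * u1 + b * u2, v), a * f(u1, v) + b * f(u2, v)))   # True
```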
  • 373. 12.3. Let f be the bilinear form on R2 defined by f ½ðx1; x2Þ; ðy1; y2Þ ¼ 2x1y1 3x1y2 þ 4x2y2 (a) Find the matrix A of f in the basis fu1 ¼ ð1; 0Þ; u2 ¼ ð1; 1Þg. (b) Find the matrix B of f in the basis fv1 ¼ ð2; 1Þ; v2 ¼ ð1; 1Þg. (c) Find the change-of-basis matrix P from the basis fuig to the basis fvig, and verify that B ¼ PT AP. (a) Set A ¼ ½aij, where aij ¼ f ðui; ujÞ. This yields a11 ¼ f ½ð1; 0Þ; ð1; 0Þ ¼ 2 0 0 ¼ 2; a21 ¼ f ½ð1; 1Þ; ð1; 0Þ ¼ 2 0 þ 0 ¼ 2 a12 ¼ f ½ð1; 0Þ; ð1; 1Þ ¼ 2 3 0 ¼ 1; a22 ¼ f ½ð1; 1Þ; ð1; 1Þ ¼ 2 3 þ 4 ¼ 3 Thus, A ¼ 2 1 2 3 is the matrix of f in the basis fu1; u2g. (b) Set B ¼ ½bij, where bij ¼ f ðvi; vjÞ. This yields b11 ¼ f ½ð2; 1Þ; ð2; 1Þ ¼ 8 6 þ 4 ¼ 6; b21 ¼ f ½ð1; 1Þ; ð2; 1Þ ¼ 4 3 4 ¼ 3 b12 ¼ f ½ð2; 1Þ; ð1; 1Þ ¼ 4 þ 6 4 ¼ 6; b22 ¼ f ½ð1; 1Þ; ð1; 1Þ ¼ 2 þ 3 þ 4 ¼ 9 Thus, B ¼ 6 6 3 9 is the matrix of f in the basis fv1; v2g. (c) Writing v1 and v2 in terms of the ui yields v1 ¼ u1 þ u2 and v2 ¼ 2u1 u2. Then P ¼ 1 2 1 1 ; PT ¼ 1 1 2 1 PT AP ¼ 1 1 2 1 2 1 2 3 1 2 1 1 ¼ 6 6 3 9 ¼ B and 12.4. Prove Theorem 12.1: Let V be an n-dimensional vector space over K. Let ff1; . . . ; fng be any basis of the dual space V*. Then f fij : i; j ¼ 1; . . . ; ng is a basis of BðVÞ, where fij is defined by fijðu; vÞ ¼ fiðuÞfjðvÞ. Thus, dim BðVÞ ¼ n2 . Let fu1; . . . ; ung be the basis of V dual to ffig. We first show that f fijg spans BðVÞ. Let f 2 BðVÞ and suppose f ðui; ujÞ ¼ aij: We claim that f ¼ P i;j aij fij. It suffices to show that f ðus; utÞ ¼ P aij fij ðus; utÞ for s; t ¼ 1; . . . ; n We have P aij fij ðus; utÞ ¼ P aij fijðus; utÞ ¼ P aijfiðusÞfjðutÞ ¼ P aijdisdjt ¼ ast ¼ f ðus; utÞ as required. Hence, ffijg spans BðVÞ. Next, suppose P aijfij ¼ 0. Then for s; t ¼ 1; . . . ; n, 0 ¼ 0ðus; utÞ ¼ ð P aij fijÞðus; utÞ ¼ ars The last step follows as above. Thus, f fijg is independent, and hence is a basis of BðVÞ. 12.5. Prove Theorem 12.2. Let P be the change-of-basis matrix from a basis S to a basis S0 . Let A be the matrix representing a bilinear form in the basis S. Then B ¼ PT AP is the matrix representing f in the basis S0 . Let u; v 2 V. Because P is the change-of-basis matrix from S to S0 , we have P½uS0 ¼ ½uS and also P½vS0 ¼ ½vS; hence, ½uT S ¼ ½uT S0 PT . Thus, f ðu; vÞ ¼ ½uT S A½vS ¼ ½uT S0 PT AP½vS0 Because u and v are arbitrary elements of V, PT AP is the matrix of f in the basis S0 . 366 CHAPTER 12 Bilinear, Quadratic, and Hermitian Forms
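The change-of-basis computation in Problem 12.3 can be replayed numerically. A minimal sketch that rebuilds A, B, and P from the definitions and confirms B = P^T A P:

```python
import numpy as np

def f(u, v):
    # Bilinear form of Problem 12.3: f[(x1, x2), (y1, y2)] = 2x1y1 - 3x1y2 + 4x2y2
    return 2 * u[0] * v[0] - 3 * u[0] * v[1] + 4 * u[1] * v[1]

u_basis = [np.array([1, 0]), np.array([1, 1])]     # {u1, u2}
v_basis = [np.array([2, 1]), np.array([1, -1])]    # {v1, v2}

A = np.array([[f(ui, uj) for uj in u_basis] for ui in u_basis])   # matrix of f in {ui}
B = np.array([[f(vi, vj) for vj in v_basis] for vi in v_basis])   # matrix of f in {vi}

# Change-of-basis matrix: its columns are the coordinates of v1, v2 in the basis {u1, u2}
# (v1 = u1 + u2 and v2 = 2u1 - u2, as in the solution).
P = np.array([[1, 2],
              [1, -1]])

print(A)                                  # [[ 2 -1] [ 2  3]]
print(B)                                  # [[ 6  6] [-3  9]]
print(np.array_equal(P.T @ A @ P, B))     # True
```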
  • 374. Symmetric Bilinear Forms, Quadratic Forms 12.6. Find the symmetric matrix that corresponds to each of the following quadratic forms: (a) qðx; y; zÞ ¼ 3x2 þ 4xy y2 þ 8xz 6yz þ z2 , (b) q0 ðx; y; zÞ ¼ 3x2 þ xz 2yz, (c) q00 ðx; y; zÞ ¼ 2x2 5y2 7z2 The symmetric matrix A ¼ ½aij that represents qðx1; . . . ; xnÞ has the diagonal entry aii equal to the coefficient of the square term x2 i and the nondiagonal entries aij and aji each equal to half of the coefficient of the cross-product term xixj. Thus, (a) A ¼ 3 2 4 2 1 3 4 3 1 2 4 3 5, (b) A0 ¼ 3 0 1 2 0 0 1 1 2 1 0 2 4 3 5, (c) A00 ¼ 2 0 0 0 5 0 0 0 7 2 4 3 5 The third matrix A00 is diagonal, because the quadratic form q00 is diagonal; that is, q00 has no cross-product terms. 12.7. Find the quadratic form qðXÞ that corresponds to each of the following symmetric matrices: (a) A ¼ 5 3 3 8 ; (b) B ¼ 4 5 7 5 6 8 7 8 9 2 4 3 5, (c) C ¼ 2 4 1 5 4 7 6 8 1 6 3 9 5 8 9 1 2 6 6 4 3 7 7 5 The quadratic form qðXÞ that corresponds to a symmetric matrix M is defined by qðXÞ ¼ XT MX, where X ¼ ½xi is the column vector of unknowns. (a) Compute as follows: qðx; yÞ ¼ XT AX ¼ ½x; y 5 3 3 8 x y ¼ ½5x 3y; 3x þ 8y x y ¼ 5x2 3xy 3xy þ 8y2 ¼ 5x2 6xy þ 8y2 As expected, the coefficient 5 of the square term x2 and the coefficient 8 of the square term y2 are the diagonal elements of A, and the coefficient 6 of the cross-product term xy is the sum of the nondiagonal elements 3 and 3 of A (or twice the nondiagonal element 3, because A is symmetric). (b) Because B is a three-square matrix, there are three unknowns, say x; y; z or x1; x2; x3. Then qðx; y; zÞ ¼ 4x2 10xy 6y2 þ 14xz þ 16yz 9z2 qðx1; x2; x3Þ ¼ 4x2 1 10x1x2 6x2 2 þ 14x1x3 þ 16x2x3 9x2 3 or Here we use the fact that the coefficients of the square terms x2 1; x2 2; x2 3 (or x2 ; y2 ; z2 ) are the respective diagonal elements 4; 6; 9 of B, and the coefficient of the cross-product term xixj is the sum of the nondiagonal elements bij and bji (or twice bij, because bij ¼ bji). (c) Because C is a four-square matrix, there are four unknowns. Hence, qðx1; x2; x3; x4Þ ¼ 2x2 1 7x2 2 þ 3x2 3 þ x2 4 þ 8x1x2 2x1x3 þ 10x1x4 12x2x3 þ 16x2x4 þ 18x3x4 12.8. Let A ¼ 1 3 2 3 7 5 2 5 8 2 4 3 5. Apply Algorithm 12.1 to find a nonsingular matrix P such that D ¼ PT AP is diagonal, and find sigðAÞ, the signature of A. CHAPTER 12 Bilinear, Quadratic, and Hermitian Forms 367
  • 375. First form the block matrix M ¼ ½A; I: M ¼ ½A; I ¼ 1 3 2 1 0 0 3 7 5 0 1 0 2 5 8 0 0 1 2 4 3 5 Using a11 ¼ 1 as a pivot, apply the row operations ‘‘Replace R2 by 3R1 þ R2’’ and ‘‘Replace R3 by 2R1 þ R3’’ to M and then apply the corresponding column operations ‘‘Replace C2 by 3C1 þ C2’’ and ‘‘Replace C3 by 2C1 þ C3’’ to A to obtain 1 3 2 1 0 0 0 2 1 3 1 0 0 1 4 2 0 1 2 4 3 5 and then 1 0 0 1 0 0 0 2 1 3 1 0 0 1 4 2 0 1 2 4 3 5: Next apply the row operation ‘‘Replace R3 by R2 þ 2R3’’ and then the corresponding column operation ‘‘Replace C3 by C2 þ 2C3’’ to obtain 1 0 0 1 0 0 0 2 1 3 1 0 0 0 9 1 1 2 2 4 3 5 and then 1 0 0 1 0 0 0 2 0 3 1 0 0 0 18 1 1 2 2 4 3 5 Now A has been diagonalized and the transpose of P is in the right half of M. Thus, set P ¼ 1 3 1 0 1 1 0 0 2 2 4 3 5 and then D ¼ PT AP ¼ 1 0 0 0 2 0 0 0 18 2 4 3 5 Note D has p ¼ 2 positive and n ¼ 1 negative diagonal elements. Thus, the signature of A is sigðAÞ ¼ p n ¼ 2 1 ¼ 1. 12.9. Justify Algorithm 12.1, which diagonalizes (under congruence) a symmetric matrix A. Consider the block matrix M ¼ ½A; I. The algorithm applies a sequence of elementary row operations and the corresponding column operations to the left side of M, which is the matrix A. This is equivalent to premultiplying A by a sequence of elementary matrices, say, E1; E2; . . . ; Er, and postmultiplying A by the transposes of the Ei. Thus, when the algorithm ends, the diagonal matrix D on the left side of M is equal to D ¼ Er E2E1AET 1 ET 2 ET r ¼ QAQT ; where Q ¼ Er E2E1 On the other hand, the algorithm only applies the elementary row operations to the identity matrix I on the right side of M. Thus, when the algorithm ends, the matrix on the right side of M is equal to Er E2E1I ¼ Er E2E1 ¼ Q Setting P ¼ QT , we get D ¼ PT AP, which is a diagonalization of A under congruence. 12.10. Prove Theorem 12.4: Let f be a symmetric bilinear form on V over K (where 1 þ 1 6¼ 0). Then V has a basis in which f is represented by a diagonal matrix. Algorithm 12.1 shows that every symmetric matrix over K is congruent to a diagonal matrix. This is equivalent to the statement that f has a diagonal representation. 12.11. Let q be the quadratic form associated with the symmetric bilinear form f . Verify the polar identity f ðu; vÞ ¼ 1 2 ½qðu þ vÞ qðuÞ qðvÞ. (Assume that 1 þ 1 6¼ 0.) We have qðu þ vÞ qðuÞ qðvÞ ¼ f ðu þ v; u þ vÞ f ðu; uÞ f ðv; vÞ ¼ f ðu; uÞ þ f ðu; vÞ þ f ðv; uÞ þ f ðv; vÞ f ðu; uÞ f ðv; vÞ ¼ 2f ðu; vÞ If 1 þ 1 6¼ 0, we can divide by 2 to obtain the required identity. 368 CHAPTER 12 Bilinear, Quadratic, and Hermitian Forms
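Algorithm 12.1 translates almost line for line into code. The sketch below handles Cases I and II only (Case III, where every remaining diagonal entry is zero, is left out) and is tried on the matrix of Problem 12.8; the matrix P it produces may differ from the one in the solution by sign choices, but P^T A P is diagonal with the same signature.

```python
import numpy as np

def congruence_diagonalize(A):
    """Diagonalize a symmetric matrix under congruence: return (P, D) with D = P^T A P.
    Works on the block matrix M = [A, I]: each row operation is applied to all of M,
    the corresponding column operation only to the left half (Algorithm 12.1)."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    M = np.hstack([A.copy(), np.eye(n)])
    for k in range(n):
        if abs(M[k, k]) < 1e-12:                       # Case II: bring a nonzero diagonal
            for j in range(k + 1, n):                  # entry into the pivot position
                if abs(M[j, j]) > 1e-12:
                    M[[k, j], :] = M[[j, k], :]        # interchange rows (both halves)
                    M[:, [k, j]] = M[:, [j, k]]        # interchange columns (left half)
                    break
        pivot = M[k, k]
        if abs(pivot) < 1e-12:
            continue                                   # Case III is not handled in this sketch
        for i in range(k + 1, n):
            factor = M[i, k]
            M[i, :] = -factor * M[k, :] + pivot * M[i, :]   # "Replace R_i by -a_ik R_k + a_kk R_i"
            M[:, i] = -factor * M[:, k] + pivot * M[:, i]   # corresponding column operation
    D = M[:, :n]                  # left half: the diagonalized form of A
    P = M[:, n:].T                # P is the transpose of the right half
    return P, D

A = np.array([[1, 3, -2], [3, 7, -5], [-2, -5, 8]])    # matrix of Problem 12.8
P, D = congruence_diagonalize(A)
print(np.round(D).astype(int))            # diag(1, -2, 18): p = 2, n = 1, so sig(A) = 1
print(np.allclose(P.T @ A @ P, D))        # True
```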
  • 376. 12.12. Consider the quadratic form qðx; yÞ ¼ 3x2 þ 2xy y2 and the linear substitution x ¼ s 3t; y ¼ 2s þ t (a) Rewrite qðx; yÞ in matrix notation, and find the matrix A representing qðx; yÞ. (b) Rewrite the linear substitution using matrix notation, and find the matrix P corresponding to the substitution. (c) Find qðs; tÞ using direct substitution. (d) Find qðs; tÞ using matrix notation. (a) Here qðx; yÞ ¼ ½x; y 3 1 1 1 x y . Thus, A ¼ 3 1 1 1 ; and qðXÞ ¼ XT AX, where X ¼ ½x; yT . (b) Here x y ¼ 1 3 2 1 s t . Thus, P ¼ 1 3 2 1 ; and X ¼ x y ; Y ¼ s t and X ¼ PY. (c) Substitute for x and y in q to obtain qðs; tÞ ¼ 3ðs 3tÞ2 þ 2ðs 3tÞð2s þ tÞ ð2s þ tÞ2 ¼ 3ðs2 6st þ 9t2 Þ þ 2ð2s2 5st 3t2 Þ ð4s2 þ 4st þ t2 Þ ¼ 3s2 32st þ 20t2 (d) Here qðXÞ ¼ XT AX and X ¼ PY. Thus, XT ¼ YT PT . Therefore, qðs; tÞ ¼ qðYÞ ¼ YT PT APY ¼ ½s; t 1 2 3 1 3 1 1 1 1 3 2 1 s t ¼ ½s; t 3 16 16 20 s t ¼ 3s2 32st þ 20t2 [As expected, the results in parts (c) and (d) are equal.] 12.13. Consider any diagonal matrix A ¼ diagða1; . . . ; anÞ over K. Show that for any nonzero scalars k1; . . . ; kn 2 K; A is congruent to a diagonal matrix D with diagonal entries a1k2 1; . . . ; ank2 n. Furthermore, show that (a) If K ¼ C, then we can choose D so that its diagonal entries are only 1’s and 0’s. (b) If K ¼ R, then we can choose D so that its diagonal entries are only 1’s, 1’s, and 0’s. Let P ¼ diagðk1; . . . ; knÞ. Then, as required, D ¼ PT AP ¼ diagðkiÞ diagðaiÞ diagðkiÞ ¼ diagða1k2 1 ; . . . ; ank2 n Þ (a) Let P ¼ diagðbiÞ, where bi ¼ 1= ffiffiffiffi ai p if ai 6¼ 0 1 if ai ¼ 0 Then PT AP has the required form. (b) Let P ¼ diagðbiÞ, where bi ¼ 1= ffiffiffiffiffiffiffi jaij p if ai 6¼ 0 1 if ai ¼ 0 Then PT AP has the required form. Remark: We emphasize that (b) is no longer true if ‘‘congruence’’ is replaced by ‘‘Hermitian congruence.’’ 12.14. Prove Theorem 12.5: Let f be a symmetric bilinear form on V over R. Then there exists a basis of V in which f is represented by a diagonal matrix. Every other diagonal matrix representation of f has the same number p of positive entries and the same number n of negative entries. By Theorem 12.4, there is a basis fu1; . . . ; ung of V in which f is represented by a diagonal matrix with, say, p positive and n negative entries. Now suppose fw1; . . . ; wng is another basis of V, in which f is represented by a diagonal matrix with p0 positive and n0 negative entries. We can assume without loss of generality that the positive entries in each matrix appear first. Because rankð f Þ ¼ p þ n ¼ p0 þ n0 , it suffices to prove that p ¼ p0 . CHAPTER 12 Bilinear, Quadratic, and Hermitian Forms 369
  • 377. Let U be the linear span of u1; . . . ; up and let W be the linear span of wp0þ1; . . . ; wn. Then f ðv; vÞ 0 for every nonzero v 2 U, and f ðv; vÞ 0 for every nonzero v 2 W. Hence, U W ¼ f0g. Note that dim U ¼ p and dim W ¼ n p0 . Thus, dimðU þ WÞ ¼ dim U þ dimW dimðU WÞ ¼ p þ ðn p0 Þ 0 ¼ p p0 þ n But dimðU þ WÞ dim V ¼ n; hence, p p0 þ n n or p p0 . Similarly, p0 p and therefore p ¼ p0 , as required. Remark: The above theorem and proof depend only on the concept of positivity. Thus, the theorem is true for any subfield K of the real field R such as the rational field Q. Positive Definite Real Quadratic Forms 12.15. Prove that the following definitions of a positive definite quadratic form q are equivalent: (a) The diagonal entries are all positive in any diagonal representation of q. (b) qðYÞ 0, for any nonzero vector Y in Rn . Suppose qðYÞ ¼ a1y2 1 þ a2y2 2 þ þ any2 n. If all the coefficients are positive, then clearly qðYÞ 0 whenever Y 6¼ 0. Thus, (a) implies (b). Conversely, suppose (a) is not true; that is, suppose some diagonal entry ak 0. Let ek ¼ ð0; . . . ; 1; . . . 0Þ be the vector whose entries are all 0 except 1 in the kth position. Then qðekÞ ¼ ak is not positive, and so (b) is not true. That is, (b) implies (a). Accordingly, (a) and (b) are equivalent. 12.16. Determine whether each of the following quadratic forms q is positive definite: (a) qðx; y; zÞ ¼ x2 þ 2y2 4xz 4yz þ 7z2 (b) qðx; y; zÞ ¼ x2 þ y2 þ 2xz þ 4yz þ 3z2 Diagonalize (under congruence) the symmetric matrix A corresponding to q. (a) Apply the operations ‘‘Replace R3 by 2R1 þ R3’’ and ‘‘Replace C3 by 2C1 þ C3,’’ and then ‘‘Replace R3 by R2 þ R3’’ and ‘‘Replace C3 by C2 þ C3.’’ These yield A ¼ 1 0 2 0 2 2 2 2 7 2 4 3 5 ’ 1 0 0 0 2 2 0 2 3 2 4 3 5 ’ 1 0 0 0 2 0 0 0 1 2 4 3 5 The diagonal representation of q only contains positive entries, 1; 2; 1, on the diagonal. Thus, q is positive definite. (b) We have A ¼ 1 0 1 0 1 2 1 2 3 2 4 3 5 ’ 1 0 0 0 1 2 0 2 2 2 4 3 5 ’ 1 0 0 0 1 0 0 0 2 2 4 3 5 There is a negative entry 2 on the diagonal representation of q. Thus, q is not positive definite. 12.17. Show that qðx; yÞ ¼ ax2 þ bxy þ cy2 is positive definite if and only if a 0 and the discriminant D ¼ b2 4ac 0. Suppose v ¼ ðx; yÞ 6¼ 0. Then either x 6¼ 0 or y 6¼ 0; say, y 6¼ 0. Let t ¼ x=y. Then qðvÞ ¼ y2 ½aðx=yÞ2 þ bðx=yÞ þ c ¼ y2 ðat2 þ bt þ cÞ However, the following are equivalent: (i) s ¼ at2 þ bt þ c is positive for every value of t. (ii) s ¼ at2 þ bt þ c lies above the t-axis. (iii) a 0 and D ¼ b2 4ac 0. 370 CHAPTER 12 Bilinear, Quadratic, and Hermitian Forms
  • 378. Thus, q is positive definite if and only if a 0 and D 0. [Remark: D 0 is the same as detðAÞ 0, where A is the symmetric matrix corresponding to q.] 12.18. Determine whether or not each of the following quadratic forms q is positive definite: (a) qðx; yÞ ¼ x2 4xy þ 7y2 , (b) qðx; yÞ ¼ x2 þ 8xy þ 5y2 , (c) qðx; yÞ ¼ 3x2 þ 2xy þ y2 Compute the discriminant D ¼ b2 4ac, and then use Problem 12.17. (a) D ¼ 16 28 ¼ 12. Because a ¼ 1 0 and D 0; q is positive definite. (b) D ¼ 64 20 ¼ 44. Because D 0; q is not positive definite. (c) D ¼ 4 12 ¼ 8. Because a ¼ 3 0 and D 0; q is positive definite. Hermitian Forms 12.19. Determine whether the following matrices are Hermitian: (a) 2 2 þ 3i 4 5i 2 3i 5 6 þ 2i 4 þ 5i 6 2i 7 2 4 3 5, (b) 3 2 i 4 þ i 2 i 6 i 4 þ i i 7 2 4 3 5, (c) 4 3 5 3 2 1 5 1 6 2 4 3 5 A complex matrix A ¼ ½aij is Hermitian if A* ¼ A—that is, if aij ¼ aji: (a) Yes, because it is equal to its conjugate transpose. (b) No, even though it is symmetric. (c) Yes. In fact, a real matrix is Hermitian if and only if it is symmetric. 12.20. Let A be a Hermitian matrix. Show that f is a Hermitian form on Cn where f is defined by f ðX; YÞ ¼ XT A Y. For all a; b 2 C and all X1; X2; Y 2 Cn , f ðaX1 þ bX2; YÞ ¼ ðaX1 þ bX2ÞT A Y ¼ ðaXT 1 þ bXT 2 ÞA Y ¼ aXT 1 A Y þ bXT 2 A Y ¼ af ðX1; YÞ þ bf ðX2; YÞ Hence, f is linear in the first variable. Also, f ðX; YÞ ¼ XTA Y ¼ ðXTA YÞT ¼ YT AT X ¼ YT A* X ¼ YT A X ¼ f ðY; XÞ Hence, f is a Hermitian form on Cn . Remark: We use the fact that XT A Y is a scalar and so it is equal to its transpose. 12.21. Let f be a Hermitian form on V. Let H be the matrix of f in a basis S ¼ fuig of V. Prove the following: (a) f ðu; vÞ ¼ ½uT S H½vS for all u; v 2 V. (b) If P is the change-of-basis matrix from S to a new basis S0 of V, then B ¼ PT H P (or B ¼ Q*HQ, where Q ¼ PÞ is the matrix of f in the new basis S0 . Note that (b) is the complex analog of Theorem 12.2. (a) Let u; v 2 V and suppose u ¼ a1u1 þ þ anun and v ¼ b1u1 þ þ bnun. Then, as required, f ðu; vÞ ¼ f ða1u1 þ þ anun; b1u1 þ þ bnunÞ ¼ P i;j ai bjf ðui; vjÞ ¼ ½a1; . . . ; anH½ b1; . . . ; bnT ¼ ½uT S H½vS CHAPTER 12 Bilinear, Quadratic, and Hermitian Forms 371
  • 379. (b) Because P is the change-of-basis matrix from S to S0 , we have P½uS0 ¼ ½uS and P½vS0 ¼ ½vS; hence, ½uT S ¼ ½uT S0 PT and ½vS ¼ P½vS0 : Thus, by (a), f ðu; vÞ ¼ ½uT S H½vS ¼ ½uT S0 PT H P½vS0 But u and v are arbitrary elements of V; hence, PT H P is the matrix of f in the basis S0 : 12.22. Let H ¼ 1 1 þ i 2i 1 i 4 2 3i 2i 2 þ 3i 7 2 4 3 5, a Hermitian matrix. Find a nonsingular matrix P such that D ¼ PT H P is diagonal. Also, find the signature of H. Use the modified Algorithm 12.1 that applies the same row operations but the corresponding conjugate column operations. Thus, first form the block matrix M ¼ ½H; I: M ¼ 1 1 þ i 2i 1 0 0 1 i 4 2 3i 0 1 0 2i 2 þ 3i 7 0 0 1 2 4 3 5 Apply the row operations ‘‘Replace R2 by ð1 þ iÞR1 þ R2’’ and ‘‘Replace R3 by 2iR1 þ R3’’ and then the corresponding conjugate column operations ‘‘Replace C2 by ð1 iÞC1 þ C2’’ and ‘‘Replace C3 by 2iC1 þ C3’’ to obtain 1 1 þ i 2i 1 0 0 0 2 5i 1 þ i 1 0 0 5i 3 2i 0 1 2 4 3 5 and then 1 0 0 1 0 0 0 2 5i 1 þ i 1 0 0 5i 3 2i 0 1 2 4 3 5 Next apply the row operation ‘‘Replace R3 by 5iR2 þ 2R3’’ and the corresponding conjugate column operation ‘‘Replace C3 by 5iC2 þ 2C3’’ to obtain 1 0 0 1 0 0 0 2 5i 1 þ i 1 0 0 0 19 5 þ 9i 5i 2 2 4 3 5 and then 1 0 0 1 0 0 0 2 0 1 þ i 1 0 0 0 38 5 þ 9i 5i 2 2 4 3 5 Now H has been diagonalized, and the transpose of the right half of M is P. Thus, set P ¼ 1 1 þ i 5 þ 9i 0 1 5i 0 0 2 2 4 3 5; and then D ¼ PT H P ¼ 1 0 0 0 2 0 0 0 38 2 4 3 5: Note D has p ¼ 2 positive elements and n ¼ 1 negative elements. Thus, the signature of H is sigðHÞ ¼ 2 1 ¼ 1. Miscellaneous Problems 12.23. Prove Theorem 12.3: Let f be an alternating form on V. Then there exists a basis of V in which f is represented by a block diagonal matrix M with blocks of the form 0 1 1 0 or 0. The number of nonzero blocks is uniquely determined by f [because it is equal to 1 2 rankð f Þ. If f ¼ 0, then the theorem is obviously true. Also, if dim V ¼ 1, then f ðk1u; k2uÞ ¼ k1k2f ðu; uÞ ¼ 0 and so f ¼ 0. Accordingly, we can assume that dim V 1 and f 6¼ 0. Because f 6¼ 0, there exist (nonzero) u1; u2 2 V such that f ðu1; u2Þ 6¼ 0. In fact, multiplying u1 by an appropriate factor, we can assume that f ðu1; u2Þ ¼ 1 and so f ðu2; u1Þ ¼ 1. Now u1 and u2 are linearly independent; because if, say, u2 ¼ ku1, then f ðu1; u2Þ ¼ f ðu1; ku1Þ ¼ kf ðu1; u1Þ ¼ 0. Let U ¼ spanðu1; u2Þ; then, 372 CHAPTER 12 Bilinear, Quadratic, and Hermitian Forms
  • 380. (i) The matrix representation of the restriction of f to U in the basis fu1; u2g is 0 1 1 0 , (ii) If u 2 U, say u ¼ au1 þ bu2, then f ðu; u1Þ ¼ f ðau1 þ bu2; u1Þ ¼ b and f ðu; u2Þ ¼ f ðau1 þ bu2; u2Þ ¼ a Let W consists of those vectors w 2 V such that f ðw; u1Þ ¼ 0 and f ðw; u2Þ ¼ 0: Equivalently, W ¼ fw 2 V : f ðw; uÞ ¼ 0 for every u 2 Ug We claim that V ¼ U W. It is clear that U W ¼ f0g, and so it remains to show that V ¼ U þ W. Let v 2 V. Set u ¼ f ðv; u2Þu1 f ðv; u1Þu2 and w ¼ v u ð1Þ Because u is a linear combination of u1 and u2; u 2 U. We show next that w 2 W. By (1) and (ii), f ðu; u1Þ ¼ f ðv; u1Þ; hence, f ðw; u1Þ ¼ f ðv u; u1Þ ¼ f ðv; u1Þ f ðu; u1Þ ¼ 0 Similarly, f ðu; u2Þ ¼ f ðv; u2Þ and so f ðw; u2Þ þ f ðv u; u2Þ ¼ f ðv; u2Þ f ðu; u2Þ ¼ 0 Then w 2 W and so, by (1), v ¼ u þ w, where u 2 W. This shows that V ¼ U þ W; therefore, V ¼ U W. Now the restriction of f to W is an alternating bilinear form on W. By induction, there exists a basis u3; . . . ; un of W in which the matrix representing f restricted to W has the desired form. Accordingly, u1; u2; u3; . . . ; un is a basis of V in which the matrix representing f has the desired form. SUPPLEMENTARY PROBLEMS Bilinear Forms 12.24. Let u ¼ ðx1; x2Þ and v ¼ ðy1; y2Þ. Determine which of the following are bilinear forms on R2 : (a) f ðu; vÞ ¼ 2x1y2 3x2y1, (c) f ðu; vÞ ¼ 3x2y2, (e) f ðu; vÞ ¼ 1, (b) f ðu; vÞ ¼ x1 þ y2, (d) f ðu; vÞ ¼ x1x2 þ y1y2, (f ) f ðu; vÞ ¼ 0 12.25. Let f be the bilinear form on R2 defined by f ½ðx1; x2Þ; ðy1; y2Þ ¼ 3x1y1 2x1y2 þ 4x2y1 x2y2 (a) Find the matrix A of f in the basis fu1 ¼ ð1; 1Þ; u2 ¼ ð1; 2Þg. (b) Find the matrix B of f in the basis fv1 ¼ ð1; 1Þ; v2 ¼ ð3; 1Þg. (c) Find the change-of-basis matrix P from fuig to fvig, and verify that B ¼ PT AP. 12.26. Let V be the vector space of two-square matrices over R. Let M ¼ 1 2 3 5 , and let f ðA; BÞ ¼ trðAT MBÞ, where A; B 2 V and ‘‘tr’’ denotes trace. (a) Show that f is a bilinear form on V. (b) Find the matrix of f in the basis 1 0 0 0 ; 0 1 0 0 ; 0 0 1 0 ; 0 0 0 1 12.27. Let BðVÞ be the set of bilinear forms on V over K. Prove the following: (a) If f ; g 2 BðVÞ, then f þ g, kg 2 BðVÞ for any k 2 K. (b) If f and s are linear functions on V, then f ðu; vÞ ¼ fðuÞsðvÞ belongs to BðVÞ. 12.28. Let ½ f denote the matrix representation of a bilinear form f on V relative to a basis fuig. Show that the mapping f 7! ½ f is an isomorphism of BðVÞ onto the vector space V of n-square matrices. CHAPTER 12 Bilinear, Quadratic, and Hermitian Forms 373
  • 381. 12.29. Let f be a bilinear form on V. For any subset S of V, let S? ¼ fv 2 V : f ðu; vÞ ¼ 0 for every u 2 Sg and S ¼ fv 2 V : f ðv; uÞ ¼ 0 for every u 2 Sg Show that: (a) S and S are subspaces of V; (b) S1 S2 implies S? 2 S? 1 and S 2 S 1 ; (c) f0g? ¼ f0g ¼ V. 12.30. Suppose f is a bilinear form on V. Prove that: rankð f Þ ¼ dim V dim V? ¼ dim V dim V , and hence, dim V? ¼ dim V . 12.31. Let f be a bilinear form on V. For each u 2 V, let ^ u:V ! K and ~ u:V ! K be defined by ^ uðxÞ ¼ f ðx; uÞ and ~ uðxÞ ¼ f ðu; xÞ. Prove the following: (a) ^ u and ~ u are each linear; i.e., ^ u; ~ u 2 V*, (b) u 7! ^ u and u 7! ~ u are each linear mappings from V into V*, (c) rankð f Þ ¼ rankðu 7! ^ uÞ ¼ rankðu 7! ~ uÞ. 12.32. Show that congruence of matrices (denoted by ’) is an equivalence relation; that is, (i) A ’ A; (ii) If A ’ B, then B ’ A; (iii) If A ’ B and B ’ C, then A ’ C. Symmetric Bilinear Forms, Quadratic Forms 12.33. Find the symmetric matrix A belonging to each of the following quadratic forms: (a) qðx; y; zÞ 2x2 8xy þ y2 16xz þ 14yz þ 5z2 , (c) qðx; y; zÞ ¼ xy þ y2 þ 4xz þ z2 (b) qðx; y; zÞ ¼ x2 xz þ y2 , (d) qðx; y; zÞ ¼ xy þ yz 12.34. For each of the following symmetric matrices A, find a nonsingular matrix P such that D ¼ PT AP is diagonal: (a) A ¼ 1 0 2 0 3 6 2 6 7 2 4 3 5, (b) A ¼ 1 2 1 2 5 3 1 3 2 2 4 3 5, (c) A ¼ 1 1 0 2 1 2 1 0 0 1 1 2 2 0 2 1 2 6 6 4 3 7 7 5 12.35. Let qðx; yÞ ¼ 2x2 6xy 3y2 and x ¼ s þ 2t, y ¼ 3s t. (a) Rewrite qðx; yÞ in matrix notation, and find the matrix A representing the quadratic form. (b) Rewrite the linear substitution using matrix notation, and find the matrix P corresponding to the substitution. (c) Find qðs; tÞ using (i) direct substitution, (ii) matrix notation. 12.36. For each of the following quadratic forms qðx; y; zÞ, find a nonsingular linear substitution expressing the variables x; y; z in terms of variables r; s; t such that qðr; s; tÞ is diagonal: (a) qðx; y; zÞ ¼ x2 þ 6xy þ 8y2 4xz þ 2yz 9z2 , (b) qðx; y; zÞ ¼ 2x2 3y2 þ 8xz þ 12yz þ 25z2 , (c) qðx; y; zÞ ¼ x2 þ 2xy þ 3y2 þ 4xz þ 8yz þ 6z2 . In each case, find the rank and signature. 12.37. Give an example of a quadratic form qðx; yÞ such that qðuÞ ¼ 0 and qðvÞ ¼ 0 but qðu þ vÞ 6¼ 0. 12.38. Let SðVÞ denote all symmetric bilinear forms on V. Show that (a) SðVÞ is a subspace of BðVÞ; (b) If dim V ¼ n, then dim SðVÞ ¼ 1 2 nðn þ 1Þ. 12.39. Consider a real quadratic polynomial qðx1; . . . ; xnÞ ¼ Pn i;j¼1 aijxixj; where aij ¼ aji. 374 CHAPTER 12 Bilinear, Quadratic, and Hermitian Forms
  • 382. (a) If a11 6¼ 0, show that the substitution x1 ¼ y1 1 a11 ða12y2 þ þ a1nynÞ; x2 ¼ y2; . . . ; xn ¼ yn yields the equation qðx1; . . . ; xnÞ ¼ a11 y2 1 þ q0 ðy2; . . . ; ynÞ, where q0 is also a quadratic polynomial. (b) If a11 ¼ 0 but, say, a12 6¼ 0, show that the substitution x1 ¼ y1 þ y2; x2 ¼ y1 y2; x3 ¼ y3; . . . ; xn ¼ yn yields the equation qðx1; . . . ; xnÞ ¼ P bij yi yj, where b11 6¼ 0, which reduces this case to case (a). Remark: This method of diagonalizing q is known as completing the square. Positive Definite Quadratic Forms 12.40. Determine whether or not each of the following quadratic forms is positive definite: (a) qðx; yÞ ¼ 4x2 þ 5xy þ 7y2 , (c) qðx; y; zÞ ¼ x2 þ 4xy þ 5y2 þ 6xz þ 2yz þ 4z2 (b) qðx; yÞ ¼ 2x2 3xy y2 ; (d) qðx; y; zÞ ¼ x2 þ 2xy þ 2y2 þ 4xz þ 6yz þ 7z2 12.41. Find those values of k such that the given quadratic form is positive definite: (a) qðx; yÞ ¼ 2x2 5xy þ ky2 , (b) qðx; yÞ ¼ 3x2 kxy þ 12y2 (c) qðx; y; zÞ ¼ x2 þ 2xy þ 2y2 þ 2xz þ 6yz þ kz2 12.42. Suppose A is a real symmetric positive definite matrix. Show that A ¼ PT P for some nonsingular matrix P. Hermitian Forms 12.43. Modify Algorithm 12.1 so that, for a given Hermitian matrix H, it finds a nonsingular matrix P for which D ¼ PT A P is diagonal. 12.44. For each Hermitian matrix H, find a nonsingular matrix P such that D ¼ PT H P is diagonal: (a) H ¼ 1 i i 2 , (b) H ¼ 1 2 þ 3i 2 3i 1 , (c) H ¼ 1 i 2 þ i i 2 1 i 2 i 1 þ i 2 2 4 3 5 Find the rank and signature in each case. 12.45. Let A be a complex nonsingular matrix. Show that H ¼ A*A is Hermitian and positive definite. 12.46. We say that B is Hermitian congruent to A if there exists a nonsingular matrix P such that B ¼ PT A P or, equivalently, if there exists a nonsingular matrix Q such that B ¼ Q*AQ. Show that Hermitian congruence is an equivalence relation. (Note: If P ¼ Q, then PT A P ¼ Q*AQ.) 12.47. Prove Theorem 12.7: Let f be a Hermitian form on V. Then there is a basis S of V in which f is represented by a diagonal matrix, and every such diagonal representation has the same number p of positive entries and the same number n of negative entries. Miscellaneous Problems 12.48. Let e denote an elementary row operation, and let f * denote the corresponding conjugate column operation (where each scalar k in e is replaced by k in f *). Show that the elementary matrix corresponding to f * is the conjugate transpose of the elementary matrix corresponding to e. 12.49. Let V and W be vector spaces over K. A mapping f :V W ! K is called a bilinear form on V and W if (i) f ðav1 þ bv2; wÞ ¼ af ðv1; wÞ þ bf ðv2; wÞ, (ii) f ðv; aw1 þ bw2Þ ¼ af ðv; w1Þ þ bf ðv; w2Þ for every a; b 2 K; vi 2 V; wj 2 W. Prove the following: CHAPTER 12 Bilinear, Quadratic, and Hermitian Forms 375
  • 383. (a) The set BðV; WÞ of bilinear forms on V and W is a subspace of the vector space of functions from V W into K. (b) If ff1; ... ; fmg is a basis of V* and fs1; ... ; sng is a basis of W*, then f fij : i ¼ 1; ... ; m; j ¼ 1; ... ; ng is a basis of BðV; WÞ, where fij is defined by fijðv; wÞ ¼ fiðvÞsjðwÞ. Thus, dim BðV; WÞ ¼ dim V dim W. [Note that if V ¼ W, then we obtain the space BðVÞ investigated in this chapter.] 12.50. Let V be a vector space over K. A mapping f :V V . . . V zfflfflfflfflfflfflfflfflfflfflfflfflffl}|fflfflfflfflfflfflfflfflfflfflfflfflffl{ m times ! K is called a multilinear (or m-linear) form on V if f is linear in each variable; that is, for i ¼ 1; . . . ; m, f ð. . . ; au þ bv; . . .Þ ¼ af ð. . . ; ^ u; . . .Þ þ bf ð. . . ; ^ v; . . .Þ where c . . . denotes the ith element, and other elements are held fixed. An m-linear form f is said to be alternating if f ðv1; . . . vmÞ ¼ 0 whenever vi ¼ vj for i 6¼ j. Prove the following: (a) The set BmðVÞ of m-linear forms on V is a subspace of the vector space of functions from V V V into K. (b) The set AmðVÞ of alternating m-linear forms on V is a subspace of BmðVÞ. Remark 1: If m ¼ 2, then we obtain the space BðVÞ investigated in this chapter. Remark 2: If V ¼ Km , then the determinant function is an alternating m-linear form on V. ANSWERS TO SUPPLEMENTARY PROBLEMS Notation: M ¼ ½R1; R2; . . . denotes a matrix M with rows R1; R2; . . .. 12.24. (a) yes, (b) no, (c) yes, (d) no, (e) no, (f ) yes 12.25. (a) A ¼ ½4; 1; 7; 3, (b) B ¼ ½0; 4; 20; 32, (c) P ¼ ½3; 5; 2; 2 12.26. (b) ½1; 0; 2; 0; 0; 1; 0; 2; 3; 0; 5; 0; 0; 3; 0; 5 12.33. (a) ½2; 4; 8; 4; 1; 7; 8; 7; 5, (b) ½1; 0; 1 2 ; 0; 1; 0; 1 2 ; 0; 0, (c) ½0; 1 2 ; 2; 1 2 ; 1; 0; 2; 0; 1, (d) ½0; 1 2 ; 0; 1 2 ; 0; 1; 1 2 ; 0; 1 2 ; 0; 1 2 ; 0 12.34. (a) P ¼ ½1; 0; 2; 0; 1; 2; 0; 0; 1; D ¼ diagð1; 3; 9Þ; (b) P ¼ ½1; 2; 11; 0; 1; 5; 0; 0; 1; D ¼ diagð1; 1; 28Þ; (c) P ¼ ½1; 1; 1; 4; 0; 1; 1; 2; 0; 0; 1; 0; 0; 0; 0; 1; D ¼ diagð1; 1; 0; 9Þ 12.35. A ¼ ½2; 3; 3; 3, P ¼ ½1; 2; 3; 1, qðs; tÞ ¼ 43s2 4st þ 17t2 12.36. (a) x ¼ r 3s 19t, y ¼ s þ 7t, z ¼ t; qðr; s; tÞ ¼ r2 s2 þ 36t2 ; (b) x ¼ r 2t; y ¼ s þ 2t; z ¼ t; qðr; s; tÞ ¼ 2r2 3s2 þ 29t2 ; (c) x ¼ r s t; y ¼ s t; z ¼ t; qðr; s; tÞ ¼ r2 2s2 12.37. qðx; yÞ ¼ x2 y2 , u ¼ ð1; 1Þ, v ¼ ð1; 1Þ 12.40. (a) yes, (b) no, (c) no, (d) yes 12.41. (a) k 25 8 , (b) 12 k 12, (c) k 5 12.44. (a) P ¼ ½1; i; 0; 1, D ¼ I; s ¼ 2; (b) P ¼ ½1; 2 þ 3i; 0; 1, D ¼ diagð1; 14Þ, s ¼ 0; (c) P ¼ ½1; i; 3 þ i; 0; 1; i; 0; 0; 1, D ¼ diagð1; 1; 4Þ; s ¼ 1 d 376 CHAPTER 12 Bilinear, Quadratic, and Hermitian Forms
CHAPTER 13  Linear Operators on Inner Product Spaces

13.1 Introduction

This chapter investigates the space A(V) of linear operators T on an inner product space V. (See Chapter 7.) Thus, the base field K is either the real numbers R or the complex numbers C. In fact, different terminologies will be used for the real case and the complex case. We also use the fact that the inner products on real Euclidean space R^n and complex Euclidean space C^n may be defined, respectively, by

    ⟨u, v⟩ = u^T v    and    ⟨u, v⟩ = u^T \bar{v}

where u and v are column vectors.

The reader should review the material in Chapter 7 and be very familiar with the notions of norm (length), orthogonality, and orthonormal bases. We also note that Chapter 7 mainly dealt with real inner product spaces, whereas here we assume that V is a complex inner product space unless otherwise stated or implied.

Lastly, we note that in Chapter 2, we used A^H to denote the conjugate transpose of a complex matrix A; that is, A^H = \bar{A}^T. This notation is not standard. Many texts, especially advanced texts, use A* to denote such a matrix; we will use that notation in this chapter. That is, now A* = \bar{A}^T.

13.2 Adjoint Operators

We begin with the following basic definition.

DEFINITION: A linear operator T on an inner product space V is said to have an adjoint operator T* on V if ⟨T(u), v⟩ = ⟨u, T*(v)⟩ for every u, v ∈ V.

The following example shows that the adjoint operator has a simple description within the context of matrix mappings.

EXAMPLE 13.1

(a) Let A be a real n-square matrix viewed as a linear operator on R^n. Then, for every u, v ∈ R^n,

    ⟨Au, v⟩ = (Au)^T v = u^T A^T v = ⟨u, A^T v⟩

Thus, the transpose A^T of A is the adjoint of A.

(b) Let B be a complex n-square matrix viewed as a linear operator on C^n. Then, for every u, v ∈ C^n,

    ⟨Bu, v⟩ = (Bu)^T \bar{v} = u^T B^T \bar{v} = u^T \overline{B*v} = ⟨u, B*v⟩

Thus, the conjugate transpose B* of B is the adjoint of B.
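A short numeric check of Example 13.1, with matrices and vectors chosen at random; here ⟨u, v⟩ = u^T v in the real case and ⟨u, v⟩ = u^T \bar{v} in the complex case, as above.

```python
import numpy as np

rng = np.random.default_rng(0)

# (a) Real case: <Au, v> = <u, A^T v>
A = rng.standard_normal((3, 3))
u = rng.standard_normal(3)
v = rng.standard_normal(3)
print(np.isclose((A @ u) @ v, u @ (A.T @ v)))                     # True

# (b) Complex case: <Bu, v> = <u, B* v>, with <x, y> = x^T conj(y)
B = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
u = rng.standard_normal(3) + 1j * rng.standard_normal(3)
v = rng.standard_normal(3) + 1j * rng.standard_normal(3)
Bstar = B.conj().T                                                # conjugate transpose
print(np.isclose((B @ u) @ v.conj(), u @ (Bstar @ v).conj()))     # True
```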
Remark: B* may mean either the adjoint of B as a linear operator or the conjugate transpose of B as a matrix. By Example 13.1(b), the ambiguity makes no difference, because they denote the same object.

The following theorem (proved in Problem 13.4) is the main result in this section.

THEOREM 13.1: Let T be a linear operator on a finite-dimensional inner product space V over K. Then
(i) There exists a unique linear operator T* on V such that ⟨T(u), v⟩ = ⟨u, T*(v)⟩ for every u, v ∈ V. (That is, T has an adjoint T*.)
(ii) If A is the matrix representation of T with respect to any orthonormal basis S = {u_i} of V, then the matrix representation of T* in the basis S is the conjugate transpose A* of A (or the transpose A^T of A when K is real).

We emphasize that no such simple relationship exists between the matrices representing T and T* if the basis is not orthonormal. Thus, we see one useful property of orthonormal bases. We also emphasize that this theorem is not valid if V has infinite dimension (Problem 13.31).

The following theorem (proved in Problem 13.5) summarizes some of the properties of the adjoint.

THEOREM 13.2: Let T, T1, T2 be linear operators on V and let k ∈ K. Then
(i) (T1 + T2)* = T1* + T2*,
(ii) (kT)* = \bar{k} T*,
(iii) (T1 T2)* = T2* T1*,
(iv) (T*)* = T.

Observe the similarity between the above theorem and Theorem 2.3 on properties of the transpose operation on matrices.

Linear Functionals and Inner Product Spaces

Recall (Chapter 11) that a linear functional f on a vector space V is a linear mapping f: V → K. This subsection contains an important result (Theorem 13.3) that is used in the proof of the basic Theorem 13.1 above.

Let V be an inner product space. Each u ∈ V determines a mapping û: V → K defined by

    û(v) = ⟨v, u⟩

Now, for any a, b ∈ K and any v1, v2 ∈ V,

    û(av1 + bv2) = ⟨av1 + bv2, u⟩ = a⟨v1, u⟩ + b⟨v2, u⟩ = a û(v1) + b û(v2)

That is, û is a linear functional on V. The converse is also true for spaces of finite dimension, and it is contained in the following important theorem (proved in Problem 13.3).

THEOREM 13.3: Let f be a linear functional on a finite-dimensional inner product space V. Then there exists a unique vector u ∈ V such that f(v) = ⟨v, u⟩ for every v ∈ V.

We remark that the above theorem is not valid for spaces of infinite dimension (Problem 13.24).

13.3 Analogy Between A(V) and C, Special Linear Operators

Let A(V) denote the algebra of all linear operators on a finite-dimensional inner product space V. The adjoint mapping T ↦ T* on A(V) is quite analogous to the conjugation mapping z ↦ \bar{z} on the complex field C. To illustrate this analogy, we identify in Table 13-1 certain classes of operators T ∈ A(V) whose behavior under the adjoint map imitates the behavior under conjugation of familiar classes of complex numbers.

The analogy between these operators T and complex numbers z is reflected in the next theorem.
Table 13-1. The analogy between classes of complex numbers under conjugation and classes of operators in A(V) under the adjoint map:

- Unit circle (|z| = 1), where \bar{z} = 1/z, corresponds to orthogonal operators (real case) and unitary operators (complex case), where T* = T^{-1}.
- Real axis, where \bar{z} = z, corresponds to self-adjoint operators, also called symmetric (real case) or Hermitian (complex case), where T* = T.
- Imaginary axis, where \bar{z} = -z, corresponds to skew-adjoint operators, also called skew-symmetric (real case) or skew-Hermitian (complex case), where T* = -T.
- Positive real axis (0, ∞), where z = \bar{w}w with w ≠ 0, corresponds to positive definite operators, where T = S*S with S nonsingular.

THEOREM 13.4: Let λ be an eigenvalue of a linear operator T on V.
(i) If T* = T^{-1} (i.e., T is orthogonal or unitary), then |λ| = 1.
(ii) If T* = T (i.e., T is self-adjoint), then λ is real.
(iii) If T* = -T (i.e., T is skew-adjoint), then λ is pure imaginary.
(iv) If T = S*S with S nonsingular (i.e., T is positive definite), then λ is real and positive.

Proof. In each case let v be a nonzero eigenvector of T belonging to λ; that is, T(v) = λv with v ≠ 0. Hence, ⟨v, v⟩ is positive.

Proof of (i). We show that λ\bar{λ}⟨v, v⟩ = ⟨v, v⟩:

    λ\bar{λ}⟨v, v⟩ = ⟨λv, λv⟩ = ⟨T(v), T(v)⟩ = ⟨v, T*T(v)⟩ = ⟨v, I(v)⟩ = ⟨v, v⟩

But ⟨v, v⟩ ≠ 0; hence, λ\bar{λ} = 1 and so |λ| = 1.

Proof of (ii). We show that λ⟨v, v⟩ = \bar{λ}⟨v, v⟩:

    λ⟨v, v⟩ = ⟨λv, v⟩ = ⟨T(v), v⟩ = ⟨v, T*(v)⟩ = ⟨v, T(v)⟩ = ⟨v, λv⟩ = \bar{λ}⟨v, v⟩

But ⟨v, v⟩ ≠ 0; hence, λ = \bar{λ} and so λ is real.

Proof of (iii). We show that λ⟨v, v⟩ = -\bar{λ}⟨v, v⟩:

    λ⟨v, v⟩ = ⟨λv, v⟩ = ⟨T(v), v⟩ = ⟨v, T*(v)⟩ = ⟨v, -T(v)⟩ = ⟨v, -λv⟩ = -\bar{λ}⟨v, v⟩

But ⟨v, v⟩ ≠ 0; hence, λ = -\bar{λ} and so λ is pure imaginary.

Proof of (iv). Note first that S(v) ≠ 0 because S is nonsingular; hence, ⟨S(v), S(v)⟩ is positive. We show that λ⟨v, v⟩ = ⟨S(v), S(v)⟩:

    λ⟨v, v⟩ = ⟨λv, v⟩ = ⟨T(v), v⟩ = ⟨S*S(v), v⟩ = ⟨S(v), S(v)⟩

But ⟨v, v⟩ and ⟨S(v), S(v)⟩ are positive; hence, λ is positive.
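Theorem 13.4 is easy to observe numerically. In the sketch below, the four classes of operators are built from a single random complex matrix M: a unitary factor from its QR decomposition, the self-adjoint part M + M*, the skew-adjoint part M - M*, and the product M*M (positive definite provided M is nonsingular, which a randomly drawn M almost surely is).

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
Mstar = M.conj().T

U, _ = np.linalg.qr(M)          # unitary:        U* = U^{-1}
H = M + Mstar                   # self-adjoint:   H* = H
K = M - Mstar                   # skew-adjoint:   K* = -K
Pos = Mstar @ M                 # positive definite (M nonsingular)

print(np.allclose(np.abs(np.linalg.eigvals(U)), 1))          # |lambda| = 1
print(np.allclose(np.linalg.eigvals(H).imag, 0))             # lambda real
print(np.allclose(np.linalg.eigvals(K).real, 0))             # lambda pure imaginary
print(np.all(np.linalg.eigvals(Pos).real > 0))               # lambda real and positive

# All four matrices commute with their adjoints, i.e., they are normal.
print(all(np.allclose(X @ X.conj().T, X.conj().T @ X) for X in (U, H, K, Pos)))
```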
  • 387. Remark: Each of the above operators T commutes with its adjoint; that is, TT* ¼ T*T. Such operators are called normal operators. 13.4 Self-Adjoint Operators Let T be a self-adjoint operator on an inner product space V; that is, suppose T* ¼ T (If T is defined by a matrix A, then A is symmetric or Hermitian according as A is real or complex.) By Theorem 13.4, the eigenvalues of T are real. The following is another important property of T. THEOREM 13.5: Let T be a self-adjoint operator on V. Suppose u and v are eigenvectors of T belonging to distinct eigenvalues. Then u and v are orthogonal; that is, hu; vi ¼ 0. Proof. Suppose TðuÞ ¼ l1u and TðvÞ ¼ l2v, where l1 6¼ l2. We show that l1hu; vi ¼ l2hu; vi: l1hu; vi ¼ hl1u; vi ¼ hTðuÞ; vi ¼ hu; T*ðvÞi ¼ hu; TðvÞi ¼ hu; l2vi ¼ l2hu; vi ¼ l2hu; vi (The fourth equality uses the fact that T* ¼ T, and the last equality uses the fact that the eigenvalue l2 is real.) Because l1 6¼ l2, we get hu; vi ¼ 0. Thus, the theorem is proved. 13.5 Orthogonal and Unitary Operators Let U be a linear operator on a finite-dimensional inner product space V. Suppose U* ¼ U1 or equivalently UU* ¼ U*U ¼ I Recall that U is said to be orthogonal or unitary according as the underlying field is real or complex. The next theorem (proved in Problem 13.10) gives alternative characterizations of these operators. THEOREM 13.6: The following conditions on an operator U are equivalent: (i) U* ¼ U1 ; that is, UU* ¼ U*U ¼ I. [U is unitary (orthogonal).] (ii) U preserves inner products; that is, for every v; w 2 V, hUðvÞ, UðwÞi ¼ hv; wi. (iii) U preserves lengths; that is, for every v 2 V, kUðvÞk ¼ kvk. EXAMPLE 13.2 (a) Let T :R3 ! R3 be the linear operator that rotates each vector v about the z-axis by a fixed angle y as shown in Fig. 10-1 (Section 10.3). That is, T is defined by Tðx; y; zÞ ¼ ðx cos y y sin y; x sin y þ y cos y; zÞ We note that lengths (distances from the origin) are preserved under T. Thus, T is an orthogonal operator. (b) Let V be l2-space (Hilbert space), defined in Section 7.3. Let T :V ! V be the linear operator defined by Tða1; a2; a3; . . .Þ ¼ ð0; a1; a2; a3; . . .Þ Clearly, T preserves inner products and lengths. However, T is not surjective, because, for example, ð1; 0; 0; . . .Þ does not belong to the image of T; hence, T is not invertible. Thus, we see that Theorem 13.6 is not valid for spaces of infinite dimension. An isomorphism from one inner product space into another is a bijective mapping that preserves the three basic operations of an inner product space: vector addition, scalar multiplication, and inner 380 CHAPTER 13 Linear Operators on Inner Product Spaces
  • 388. products. Thus, the above mappings (orthogonal and unitary) may also be characterized as the isomorphisms of V into itself. Note that such a mapping U also preserves distances, because kUðvÞ UðwÞk ¼ kUðv wÞk ¼ kv wk Hence, U is called an isometry. 13.6 Orthogonal and Unitary Matrices Let U be a linear operator on an inner product space V. By Theorem 13.1, we obtain the following results. THEOREM 13.7A: A complex matrix A represents a unitary operator U (relative to an orthonormal basis) if and only if A* ¼ A1 . THEOREM 13.7B: A real matrix A represents an orthogonal operator U (relative to an orthonormal basis) if and only if AT ¼ A1 . The above theorems motivate the following definitions (which appeared in Sections 2.10 and 2.11). DEFINITION: A complex matrix A for which A* ¼ A1 is called a unitary matrix. DEFINITION: A real matrix A for which AT ¼ A1 is called an orthogonal matrix. We repeat Theorem 2.6, which characterizes the above matrices. THEOREM 13.8: The following conditions on a matrix A are equivalent: (i) A is unitary (orthogonal). (ii) The rows of A form an orthonormal set. (iii) The columns of A form an orthonormal set. 13.7 Change of Orthonormal Basis Orthonormal bases play a special role in the theory of inner product spaces V. Thus, we are naturally interested in the properties of the change-of-basis matrix from one such basis to another. The following theorem (proved in Problem 13.12) holds. THEOREM 13.9: Let fu1; . . . ; ung be an orthonormal basis of an inner product space V. Then the change-of-basis matrix from fuig into another orthonormal basis is unitary (orthogonal). Conversely, if P ¼ ½aij is a unitary (orthogonal) matrix, then the following is an orthonormal basis: fu0 i ¼ a1iu1 þ a2iu2 þ þ aniun : i ¼ 1; . . . ; ng Recall that matrices A and B representing the same linear operator T are similar; that is, B ¼ P1 AP, where P is the (nonsingular) change-of-basis matrix. On the other hand, if V is an inner product space, we are usually interested in the case when P is unitary (or orthogonal) as suggested by Theorem 13.9. (Recall that P is unitary if the conjugate tranpose P* ¼ P1 , and P is orthogonal if the transpose PT ¼ P1 .) This leads to the following definition. DEFINITION: Complex matrices A and B are unitarily equivalent if there exists a unitary matrix P for which B ¼ P*AP. Analogously, real matrices A and B are orthogonally equivalent if there exists an orthogonal matrix P for which B ¼ PT AP. Note that orthogonally equivalent matrices are necessarily congruent. CHAPTER 13 Linear Operators on Inner Product Spaces 381
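Theorem 13.9 can be illustrated with two orthonormal bases of R^3 obtained from QR factorizations of random matrices: the change-of-basis matrix between them is Q1^T Q2, and it comes out orthogonal. A minimal sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two orthonormal bases of R^3: the columns of Q1 and the columns of Q2.
Q1, _ = np.linalg.qr(rng.standard_normal((3, 3)))
Q2, _ = np.linalg.qr(rng.standard_normal((3, 3)))

# Change-of-basis matrix from the basis {columns of Q1} to {columns of Q2}:
# its j-th column holds the coordinates of the j-th new basis vector, Q1^T Q2[:, j].
P = Q1.T @ Q2

print(np.allclose(P.T @ P, np.eye(3)))    # True: P is orthogonal, as Theorem 13.9 asserts
```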
  • 389. 13.8 Positive Definite and Positive Operators Let P be a linear operator on an inner product space V. Then (i) P is said to be positive definite if P ¼ S*S for some nonsingular operators S: (ii) P is said to be positive (or nonnegative or semidefinite) if P ¼ S*S for some operator S: The following theorems give alternative characterizations of these operators. THEOREM 13.10A: The following conditions on an operator P are equivalent: (i) P ¼ T2 for some nonsingular self-adjoint operator T. (ii) P is positive definite. (iii) P is self-adjoint and hPðuÞ; ui 0 for every u 6¼ 0 in V. The corresponding theorem for positive operators (proved in Problem 13.21) follows. THEOREM 13.10B: The following conditions on an operator P are equivalent: (i) P ¼ T2 for some self-adjoint operator T. (ii) P is positive; that is, P ¼ S S: (iii) P is self-adjoint and hPðuÞ; ui 0 for every u 2 V. 13.9 Diagonalization and Canonical Forms in Inner Product Spaces Let T be a linear operator on a finite-dimensional inner product space V over K. Representing T by a diagonal matrix depends upon the eigenvectors and eigenvalues of T, and hence, upon the roots of the characteristic polynomial DðtÞ of T. Now DðtÞ always factors into linear polynomials over the complex field C but may not have any linear polynomials over the real field R. Thus, the situation for real inner product spaces (sometimes called Euclidean spaces) is inherently different than the situation for complex inner product spaces (sometimes called unitary spaces). Thus, we treat them separately. Real Inner Product Spaces, Symmetric and Orthogonal Operators The following theorem (proved in Problem 13.14) holds. THEOREM 13.11: Let T be a symmetric (self-adjoint) operator on a real finite-dimensional product space V. Then there exists an orthonormal basis of V consisting of eigenvectors of T; that is, T can be represented by a diagonal matrix relative to an orthonormal basis. We give the corresponding statement for matrices. THEOREM 13.11: (Alternative Form) Let A be a real symmetric matrix. Then there exists an orthogonal matrix P such that B ¼ P1 AP ¼ PT AP is diagonal. We can choose the columns of the above matrix P to be normalized orthogonal eigenvectors of A; then the diagonal entries of B are the corresponding eigenvalues. On the other hand, an orthogonal operator T need not be symmetric, and so it may not be represented by a diagonal matrix relative to an orthonormal matrix. However, such a matrix T does have a simple canonical representation, as described in the following theorem (proved in Problem 13.16). 382 CHAPTER 13 Linear Operators on Inner Product Spaces
  • 390. THEOREM 13.12: Let T be an orthogonal operator on a real inner product space V. Then there exists an orthonormal basis of V in which T is represented by a block diagonal matrix M of the form M ¼ diag Is; It; cos y1 sin y1 sin y1 cos y1 ; . . . ; cos yr sin yr sin yr cos yr The reader may recognize that each of the 2 2 diagonal blocks represents a rotation in the corresponding two-dimensional subspace, and each diagonal entry 1 represents a reflection in the corresponding one-dimensional subspace. Complex Inner Product Spaces, Normal and Triangular Operators A linear operator T is said to be normal if it commutes with its adjoint—that is, if TT* ¼ T*T. We note that normal operators include both self-adjoint and unitary operators. Analogously, a complex matrix A is said to be normal if it commutes with its conjugate transpose— that is, if AA* ¼ A*A. EXAMPLE 13.3 Let A ¼ 1 1 i 3 þ 2i . Then A* ¼ 1 i 1 3 2i . Also AA* ¼ 2 3 3i 3 þ 3i 14 ¼ A*A. Thus, A is normal. The following theorem (proved in Problem 13.19) holds. THEOREM 13.13: Let T be a normal operator on a complex finite-dimensional inner product space V. Then there exists an orthonormal basis of V consisting of eigenvectors of T; that is, T can be represented by a diagonal matrix relative to an orthonormal basis. We give the corresponding statement for matrices. THEOREM 13.13: (Alternative Form) Let A be a normal matrix. Then there exists a unitary matrix P such that B ¼ P1 AP ¼ P*AP is diagonal. The following theorem (proved in Problem 13.20) shows that even nonnormal operators on unitary spaces have a relatively simple form. THEOREM 13.14: Let T be an arbitrary operator on a complex finite-dimensional inner product space V. Then T can be represented by a triangular matrix relative to an orthonormal basis of V. THEOREM 13.14: (Alternative Form) Let A be an arbitrary complex matrix. Then there exists a unitary matrix P such that B ¼ P1 AP ¼ P*AP is triangular. 13.10 Spectral Theorem The Spectral Theorem is a reformulation of the diagonalization Theorems 13.11 and 13.13. THEOREM 13.15: (Spectral Theorem) Let T be a normal (symmetric) operator on a complex (real) finite-dimensional inner product space V. Then there exists linear operators E1; . . . ; Er on V and scalars l1; . . . ; lr such that (i) T ¼ l1E1 þ l2E2 þ þ lrEr, (iii) E2 1 ¼ E1; E2 2 ¼ E2; . . . ; E2 r ¼ Er, (ii) E1 þ E2 þ þ Er ¼ I, (iv) EiEj ¼ 0 for i 6¼ j. CHAPTER 13 Linear Operators on Inner Product Spaces 383
  • 391. The above linear operators E1; . . . ; Er are projections in the sense that E2 i ¼ Ei. Moreover, they are said to be orthogonal projections because they have the additional property that EiEj ¼ 0 for i 6¼ j. The following example shows the relationship between a diagonal matrix representation and the corresponding orthogonal projections. EXAMPLE 13.4 Consider the following diagonal matrices A; E1; E2; E3: A ¼ 2 3 3 5 2 6 6 4 3 7 7 5; E1 ¼ 1 0 0 0 2 6 6 4 3 7 7 5; E2 ¼ 0 1 1 0 2 6 6 4 3 7 7 5; E3 ¼ 0 0 0 1 2 6 6 4 3 7 7 5 The reader can verify that (i) A ¼ 2E1 þ 3E2 þ 5E3, (ii) E1 þ E2 þ E3 ¼ I, (iii) E2 i ¼ Ei, (iv) EiEj ¼ 0 for i 6¼ j. SOLVED PROBLEMS Adjoints 13.1. Find the adjoint of F:R3 ! R3 defined by Fðx; y; zÞ ¼ ð3x þ 4y 5z; 2x 6y þ 7z; 5x 9y þ zÞ First find the matrix A that represents F in the usual basis of R3 —that is, the matrix A whose rows are the coefficients of x; y; z—and then form the transpose AT of A. This yields A ¼ 3 4 5 2 6 7 5 9 1 2 4 3 5 and then AT ¼ 3 2 5 4 6 9 5 7 1 2 4 3 5 The adjoint F* is represented by the transpose of A; hence, F*ðx; y; zÞ ¼ ð3x þ 2y þ 5z; 4x 6y 9z; 5x þ 7y þ zÞ 13.2. Find the adjoint of G:C3 ! C3 defined by Gðx; y; zÞ ¼ ½2x þ ð1 iÞy; ð3 þ 2iÞx 4iz; 2ix þ ð4 3iÞy 3z First find the matrix B that represents G in the usual basis of C3 , and then form the conjugate transpose B* of B. This yields B ¼ 2 1 i 0 3 þ 2i 0 4i 2i 4 3i 3 2 4 3 5 and then B* ¼ 2 3 2i 2i 1 þ i 0 4 þ 3i 0 4i 3 2 4 3 5 Then G*ðx; y; zÞ ¼ ½2x þ ð3 2iÞy 2iz; ð1 þ iÞx þ ð4 þ 3iÞz; 4iy 3z: 13.3. Prove Theorem 13.3: Let f be a linear functional on an n-dimensional inner product space V. Then there exists a unique vector u 2 V such that fðvÞ ¼ hv; ui for every v 2 V. Let fw1; . . . ; wng be an orthonormal basis of V. Set u ¼ fðw1Þw1 þ fðw2Þw2 þ þ fðwnÞwn Let ^ u be the linear functional on V defined by ^ uðvÞ ¼ hv; ui for every v 2 V. Then, for i ¼ 1; . . . ; n, ^ uðwiÞ ¼ hwi; ui ¼ hwi; fðw1Þw1 þ þ fðwnÞwni ¼ fðwiÞ 384 CHAPTER 13 Linear Operators on Inner Product Spaces
  • 392. Because ^ u and f agree on each basis vector, ^ u ¼ f. Now suppose u0 is another vector in V for which fðvÞ ¼ hv; u0 i for every v 2 V. Then hv; ui ¼ hv; u0 i or hv; u u0 i ¼ 0. In particular, this is true for v ¼ u u0 , and so hu u0 ; u u0 i ¼ 0. This yields u u0 ¼ 0 and u ¼ u0 . Thus, such a vector u is unique, as claimed. 13.4. Prove Theorem 13.1: Let T be a linear operator on an n-dimensional inner product space V. Then (a) There exists a unique linear operator T* on V such that hTðuÞ; vi ¼ hu; T*ðvÞi for all u; v 2 V: (b) Let A be the matrix that represents T relative to an orthonormal basis S ¼ fuig. Then the conjugate transpose A* of A represents T* in the basis S. (a) We first define the mapping T*. Let v be an arbitrary but fixed element of V. The map u 7! hTðuÞ; vi is a linear functional on V. Hence, by Theorem 13.3, there exists a unique element v0 2 V such that hTðuÞ; vi ¼ hu; v0 i for every u 2 V. We define T* : V ! V by T*ðvÞ ¼ v0 . Then hTðuÞ; vi ¼ hu; T*ðvÞi for every u; v 2 V. We next show that T* is linear. For any u; vi 2 V, and any a; b 2 K, hu; T*ðav1 þ bv2Þi ¼ hTðuÞ; av1 þ bv2i ¼ ahTðuÞ; v1i þ bhTðuÞ; v2i ¼ ahu; T*ðv1Þi þ bhu; T*ðv2Þi ¼ hu; aT*ðv1Þ þ bT*ðv2Þi But this is true for every u 2 V; hence, T*ðav1 þ bv2Þ ¼ aT*ðv1Þ þ bT*ðv2Þ. Thus, T* is linear. (b) The matrices A ¼ ½aij and B ¼ ½bij that represent T and T*, respectively, relative to the orthonormal basis S are given by aij ¼ hTðujÞ; uii and bij ¼ hT*ðujÞ; uii (Problem 13.67). Hence, bij ¼ hT*ðujÞ; uii ¼ hui; T*ðujÞi ¼ hTðuiÞ; uji ¼ aji Thus, B ¼ A*, as claimed. 13.5. Prove Theorem 13.2: (i) ðT1 þ T2Þ* ¼ T1 * þ T2 *, (iii) ðT1T2Þ* ¼ T2 *T1 *, (ii) ðkTÞ* ¼ kT*, (iv) ðT*Þ* ¼ T. (i) For any u; v 2 V, hðT1 þ T2ÞðuÞ; vi ¼ hT1ðuÞ þ T2ðuÞ; vi ¼ hT1ðuÞ; vi þ hT2ðuÞ; vi ¼ hu; T1 *ðvÞi þ hu; T2 *ðvÞi ¼ hu; T1 *ðvÞ þ T2 *ðvÞi ¼ hu; ðT1 * þ T2 *ÞðvÞi The uniqueness of the adjoint implies ðT1 þ T2Þ* ¼ T1 * þ T2 *. (ii) For any u; v 2 V, hðkTÞðuÞ; vi ¼ hkTðuÞ; vi ¼ khTðuÞ; vi ¼ khu; T*ðvÞi ¼ hu; kT*ðvÞi ¼ hu; ð kT*ÞðvÞi The uniqueness of the adjoint implies ðkTÞ* ¼ kT*. (iii) For any u; v 2 V, hðT1T2ÞðuÞ; vi ¼ hT1ðT2ðuÞÞ; vi ¼ hT2ðuÞ; T1 *ðvÞi ¼ hu; T2 *ðT1 *ðvÞÞi ¼ hu; ðT2 *T1 *ÞðvÞi The uniqueness of the adjoint implies ðT1T2Þ* ¼ T2 *T1 *. (iv) For any u; v 2 V, hT*ðuÞ; vi ¼ hv; T*ðuÞi ¼ hTðvÞ; ui ¼ hu; TðvÞi The uniqueness of the adjoint implies ðT*Þ* ¼ T. CHAPTER 13 Linear Operators on Inner Product Spaces 385
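Because, by Theorem 13.1, the adjoint corresponds to the conjugate transpose in any orthonormal basis, the identities of Problem 13.5 can be spot-checked on matrices. A sketch with randomly chosen complex matrices:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))   # represents T1
B = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))   # represents T2
k = 2 - 3j

adj = lambda M: M.conj().T    # adjoint = conjugate transpose (orthonormal basis)

print(np.allclose(adj(A + B), adj(A) + adj(B)))        # (T1 + T2)* = T1* + T2*
print(np.allclose(adj(k * A), np.conj(k) * adj(A)))    # (kT)* = conj(k) T*
print(np.allclose(adj(A @ B), adj(B) @ adj(A)))        # (T1 T2)* = T2* T1*
print(np.allclose(adj(adj(A)), A))                     # (T*)* = T
```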
  • 393. 13.6. Show that ðaÞ I* ¼ I, and ðbÞ 0* ¼ 0. (a) For every u; v 2 V, hIðuÞ; vi ¼ hu; vi ¼ hu; IðvÞi; hence, I* ¼ I. (b) For every u; v 2 V, h0ðuÞ; vi ¼ h0; vi ¼ 0 ¼ hu; 0i ¼ hu; 0ðvÞi; hence, 0* ¼ 0. 13.7. Suppose T is invertible. Show that ðT1 Þ* ¼ ðT*Þ1 . I ¼ I* ¼ ðTT1 Þ* ¼ ðT1 Þ*T*; hence; ðT1 Þ* ¼ ðT*Þ1 : 13.8. Let T be a linear operator on V, and let W be a T-invariant subspace of V. Show that W? is invariant under T*. Let u 2 W? . If w 2 W, then TðwÞ 2 W and so hw; T*ðuÞi ¼ hTðwÞ; ui ¼ 0. Thus, T*ðuÞ 2 W? because it is orthogonal to every w 2 W. Hence, W? is invariant under T*. 13.9. Let T be a linear operator on V. Show that each of the following conditions implies T ¼ 0: (i) hTðuÞ; vi ¼ 0 for every u; v 2 V. (ii) V is a complex space, and hTðuÞ; ui ¼ 0 for every u 2 V. (iii) T is self-adjoint and hTðuÞ; ui ¼ 0 for every u 2 V. Give an example of an operator T on a real space V for which hTðuÞ; ui ¼ 0 for every u 2 V but T 6¼ 0. [Thus, (ii) need not hold for a real space V.] (i) Set v ¼ TðuÞ. Then hTðuÞ; TðuÞi ¼ 0, and hence, TðuÞ ¼ 0, for every u 2 V. Accordingly, T ¼ 0. (ii) By hypothesis, hTðv þ wÞ; v þ wi ¼ 0 for any v; w 2 V. Expanding and setting hTðvÞ; vi ¼ 0 and hTðwÞ; wi ¼ 0, we find hTðvÞ; wi þ hTðwÞ; vi ¼ 0 ð1Þ Note w is arbitrary in (1). Substituting iw for w, and using hTðvÞ; iwi ¼ ihTðvÞ; wi ¼ ihTðvÞ; wi and hTðiwÞ; vi ¼ hiTðwÞ; vi ¼ ihTðwÞ; vi, we find ihTðvÞ; wi þ ihTðwÞ; vi ¼ 0 Dividing through by i and adding to (1), we obtain hTðwÞ; vi ¼ 0 for any v; w; 2 V. By (i), T ¼ 0. (iii) By (ii), the result holds for the complex case; hence we need only consider the real case. Expanding hTðv þ wÞ; v þ wi ¼ 0, we again obtain (1). Because T is self-adjoint and as it is a real space, we have hTðwÞ; vi ¼ hw; TðvÞi ¼ hTðvÞ; wi. Substituting this into (1), we obtain hTðvÞ; wi ¼ 0 for any v; w 2 V. By (i), T ¼ 0. For an example, consider the linear operator T on R2 defined by Tðx; yÞ ¼ ðy; xÞ. Then hTðuÞ; ui ¼ 0 for every u 2 V, but T 6¼ 0. Orthogonal and Unitary Operators and Matrices 13.10. Prove Theorem 13.6: The following conditions on an operator U are equivalent: (i) U* ¼ U1 ; that is, U is unitary. (ii) hUðvÞ; UðwÞi ¼ hu; wi. (iii) kUðvÞk ¼ kvk. Suppose (i) holds. Then, for every v; w; 2 V, hUðvÞ; UðwÞi ¼ hv; U*UðwÞi ¼ hv; IðwÞi ¼ hv; wi Thus, (i) implies (ii). Now if (ii) holds, then kUðvÞk ¼ ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi hUðvÞ; UðvÞi p ¼ ffiffiffiffiffiffiffiffiffiffiffi hv; vi p ¼ kvk Hence, (ii) implies (iii). It remains to show that (iii) implies (i). Suppose (iii) holds. Then for every v 2 V, hU*UðvÞi ¼ hUðvÞ; UðvÞi ¼ hv; vi ¼ hIðvÞ; vi Hence, hðU*U IÞðvÞ; vi ¼ 0 for every v 2 V. But U*U I is self-adjoint (Prove!); then, by Problem 13.9, we have U*U I ¼ 0 and so U*U ¼ I. Thus, U* ¼ U1 , as claimed. 386 CHAPTER 13 Linear Operators on Inner Product Spaces
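The equivalences of Problem 13.10 can be watched in action. Below, a unitary U is obtained from the QR factorization of a random complex matrix, and conditions (i), (ii), and (iii) of Theorem 13.6 are verified numerically:

```python
import numpy as np

rng = np.random.default_rng(0)
U, _ = np.linalg.qr(rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3)))

v = rng.standard_normal(3) + 1j * rng.standard_normal(3)
w = rng.standard_normal(3) + 1j * rng.standard_normal(3)

inner = lambda x, y: x @ y.conj()                   # <x, y> = x^T conj(y)

print(np.allclose(U.conj().T @ U, np.eye(3)))       # (i)   U*U = I
print(np.isclose(inner(U @ v, U @ w), inner(v, w))) # (ii)  inner products preserved
print(np.isclose(np.linalg.norm(U @ v), np.linalg.norm(v)))   # (iii) lengths preserved
```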
  • 394. 13.11. Let U be a unitary (orthogonal) operator on V, and let W be a subspace invariant under U. Show that W? is also invariant under U. Because U is nonsingular, UðWÞ ¼ W; that is, for any w 2 W, there exists w0 2 W such that Uðw0 Þ ¼ w. Now let v 2 W? . Then, for any w 2 W, hUðvÞ; wi ¼ hUðvÞ; Uðw0 Þi ¼ hv; w0 i ¼ 0 Thus, UðvÞ belongs to W? . Therefore, W? is invariant under U. 13.12. Prove Theorem 13.9: The change-of-basis matrix from an orthonormal basis fu1; . . . ; ung into another orthonormal basis is unitary (orthogonal). Conversely, if P ¼ ½aij is a unitary (ortho- gonal) matrix, then the vectors ui0 ¼ P j ajiuj form an orthonormal basis. Suppose fvig is another orthonormal basis and suppose vi ¼ bi1u1 þ bi2u2 þ þ binun; i ¼ 1; . . . ; n ð1Þ Because fvig is orthonormal, dij ¼ hvi; vji ¼ bi1bj1 þ bi2bj2 þ þ binbjn ð2Þ Let B ¼ ½bij be the matrix of coefficients in (1). (Then BT is the change-of-basis matrix from fuig to fvig.) Then BB* ¼ ½cij, where cij ¼ bi1bj1 þ bi2bj2 þ þ binbjn. By (2), cij ¼ dij, and therefore BB* ¼ I. Accordingly, B, and hence, BT , is unitary. It remains to prove that fu0 ig is orthonormal. By Problem 13.67, hu0 i; u0 ji ¼ a1ia1j þ a2ia2j þ þ anianj ¼ hCi; Cji where Ci denotes the ith column of the unitary (orthogonal) matrix P ¼ ½aij: Because P is unitary (orthogonal), its columns are orthonormal; hence, hu0 i; u0 ji ¼ hCi; Cji ¼ dij. Thus, fu0 ig is an orthonormal basis. Symmetric Operators and Canonical Forms in Euclidean Spaces 13.13. Let T be a symmetric operator. Show that (a) The characteristic polynomial DðtÞ of T is a product of linear polynomials (over R); (b) T has a nonzero eigenvector. (a) Let A be a matrix representing T relative to an orthonormal basis of V; then A ¼ AT . Let DðtÞ be the characteristic polynomial of A. Viewing A as a complex self-adjoint operator, A has only real eigenvalues by Theorem 13.4. Thus, DðtÞ ¼ ðt l1Þðt l2Þ ðt lnÞ where the li are all real. In other words, DðtÞ is a product of linear polynomials over R. (b) By (a), T has at least one (real) eigenvalue. Hence, T has a nonzero eigenvector. 13.14. Prove Theorem 13.11: Let T be a symmetric operator on a real n-dimensional inner product space V. Then there exists an orthonormal basis of V consisting of eigenvectors of T. (Hence, T can be represented by a diagonal matrix relative to an orthonormal basis.) The proof is by induction on the dimension of V. If dim V ¼ 1, the theorem trivially holds. Now suppose dim V ¼ n 1. By Problem 13.13, there exists a nonzero eigenvector v1 of T. Let W be the space spanned by v1, and let u1 be a unit vector in W, e.g., let u1 ¼ v1=kv1k. Because v1 is an eigenvector of T, the subspace W of V is invariant under T. By Problem 13.8, W? is invariant under T* ¼ T. Thus, the restriction ^ T of T to W? is a symmetric operator. By Theorem 7.4, V ¼ W W? . Hence, dim W? ¼ n 1, because dim W ¼ 1. By induction, there exists an orthonormal basis fu2; . . . ; ung of W? consisting of eigenvectors of ^ T and hence of T. But hu1; uii ¼ 0 for i ¼ 2; . . . ; n because ui 2 W? . Accordingly fu1; u2; . . . ; ung is an orthonormal set and consists of eigenvectors of T. Thus, the theorem is proved. CHAPTER 13 Linear Operators on Inner Product Spaces 387
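Theorem 13.11 in its matrix form is exactly what numpy.linalg.eigh computes: it returns an orthogonal matrix P whose columns are orthonormal eigenvectors. A minimal sketch with a symmetric matrix chosen only for illustration; Problem 13.15 below carries out the same kind of diagonalization by hand for a 2x2 quadratic form.

```python
import numpy as np

A = np.array([[4.0, 1.0, -2.0],
              [1.0, 3.0, 0.0],
              [-2.0, 0.0, 5.0]])          # a real symmetric matrix (chosen for illustration)

eigvals, P = np.linalg.eigh(A)            # columns of P: orthonormal eigenvectors of A

print(np.allclose(P.T @ P, np.eye(3)))               # P is orthogonal
print(np.allclose(P.T @ A @ P, np.diag(eigvals)))    # P^T A P = P^{-1} A P is diagonal
```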
13.15. Let q(x, y) = 3x² − 6xy + 11y². Find an orthonormal change of coordinates (linear substitution) that diagonalizes the quadratic form q.
Find the symmetric matrix A representing q and its characteristic polynomial Δ(t). We have
A = [3, −3; −3, 11]  and  Δ(t) = t² − tr(A)t + |A| = t² − 14t + 24 = (t − 2)(t − 12)
The eigenvalues are λ = 2 and λ = 12. Hence, a diagonal form of q is
q(s, t) = 2s² + 12t²
(where we use s and t as new variables). The corresponding orthogonal change of coordinates is obtained by finding an orthogonal set of eigenvectors of A.
Subtract λ = 2 down the diagonal of A to obtain the matrix
M = [1, −3; −3, 9]  corresponding to  x − 3y = 0, −3x + 9y = 0  or  x − 3y = 0
A nonzero solution is u_1 = (3, 1). Next subtract λ = 12 down the diagonal of A to obtain the matrix
M = [−9, −3; −3, −1]  corresponding to  −9x − 3y = 0, −3x − y = 0  or  3x + y = 0
A nonzero solution is u_2 = (−1, 3). Normalize u_1 and u_2 to obtain the orthonormal basis
û_1 = (3/√10, 1/√10),  û_2 = (−1/√10, 3/√10)
Now let P be the matrix whose columns are û_1 and û_2. Then
P = [3/√10, −1/√10; 1/√10, 3/√10]  and  D = P⁻¹AP = PᵀAP = [2, 0; 0, 12]
Thus, the required orthogonal change of coordinates is
[x; y] = P[s; t]  or  x = (3s − t)/√10,  y = (s + 3t)/√10
One can also express s and t in terms of x and y by using P⁻¹ = Pᵀ; that is,
s = (3x + y)/√10,  t = (−x + 3y)/√10

13.16. Prove Theorem 13.12: Let T be an orthogonal operator on a real inner product space V. Then there exists an orthonormal basis of V in which T is represented by a block diagonal matrix M of the form
M = diag(1, ..., 1, −1, ..., −1, [cos θ_1, −sin θ_1; sin θ_1, cos θ_1], ..., [cos θ_r, −sin θ_r; sin θ_r, cos θ_r])
Let S = T + T⁻¹ = T + T*. Then S* = (T + T*)* = T* + T = S. Thus, S is a symmetric operator on V. By Theorem 13.11, there exists an orthonormal basis of V consisting of eigenvectors of S. If λ_1, ..., λ_m denote the distinct eigenvalues of S, then V can be decomposed into the direct sum V = V_1 ⊕ V_2 ⊕ ··· ⊕ V_m where V_i consists of the eigenvectors of S belonging to λ_i. We claim that each V_i is invariant under T. For suppose v ∈ V_i; then S(v) = λ_i v and
S(T(v)) = (T + T⁻¹)T(v) = T(T + T⁻¹)(v) = TS(v) = T(λ_i v) = λ_i T(v)
That is, T(v) ∈ V_i. Hence, V_i is invariant under T. Because the V_i are orthogonal to each other, we can restrict our investigation to the way that T acts on each individual V_i. On a given V_i, we have (T + T⁻¹)v = S(v) = λ_i v. Multiplying by T, we get
(T² − λ_i T + I)(v) = 0     (1)
We consider the cases λ_i = ±2 and λ_i ≠ ±2 separately.
If λ_i = ±2, then (T ∓ I)²(v) = 0, which leads to (T ∓ I)(v) = 0 or T(v) = ±v. Thus, T restricted to this V_i is either I or −I.
If λ_i ≠ ±2, then T has no eigenvectors in V_i, because, by Theorem 13.4, the only eigenvalues of T are 1 or −1. Accordingly, for v ≠ 0, the vectors v and T(v) are linearly independent. Let W be the subspace spanned by v and T(v). Then W is invariant under T, because using (1) we get
T(T(v)) = T²(v) = λ_i T(v) − v ∈ W
By Theorem 7.4, V_i = W ⊕ W⊥. Furthermore, by Problem 13.8, W⊥ is also invariant under T. Thus, we can decompose V_i into the direct sum of two-dimensional subspaces W_j where the W_j are orthogonal to each other and each W_j is invariant under T. Thus, we can restrict our investigation to the way in which T acts on each individual W_j.
Because T² − λ_i T + I = 0, the characteristic polynomial Δ(t) of T acting on W_j is Δ(t) = t² − λ_i t + 1. Thus, the determinant of T is 1, the constant term in Δ(t). By Theorem 2.7, the matrix A representing T acting on W_j relative to any orthogonal basis of W_j must be of the form
[cos θ, −sin θ; sin θ, cos θ]
The union of the bases of the W_j gives an orthonormal basis of V_i, and the union of the bases of the V_i gives an orthonormal basis of V in which the matrix representing T is of the desired form.

Normal Operators and Canonical Forms in Unitary Spaces

13.17. Determine which of the following matrices is normal:
(a) A = [1, i; 0, 1],  (b) B = [1, i; 1, 2 + i]
(a) AA* = [1, i; 0, 1][1, 0; −i, 1] = [2, i; −i, 1]  and  A*A = [1, 0; −i, 1][1, i; 0, 1] = [1, i; −i, 2]
Because AA* ≠ A*A, the matrix A is not normal.
(b) BB* = [1, i; 1, 2 + i][1, 1; −i, 2 − i] = [2, 2 + 2i; 2 − 2i, 6] = [1, 1; −i, 2 − i][1, i; 1, 2 + i] = B*B
Because BB* = B*B, the matrix B is normal.

13.18. Let T be a normal operator. Prove the following:
(a) T(v) = 0 if and only if T*(v) = 0.
(b) T − λI is normal.
(c) If T(v) = λv, then T*(v) = λ̄v; hence, any eigenvector of T is also an eigenvector of T*.
(d) If T(v) = λ_1 v and T(w) = λ_2 w where λ_1 ≠ λ_2, then ⟨v, w⟩ = 0; that is, eigenvectors of T belonging to distinct eigenvalues are orthogonal.
(a) We show that ⟨T(v), T(v)⟩ = ⟨T*(v), T*(v)⟩:
⟨T(v), T(v)⟩ = ⟨v, T*T(v)⟩ = ⟨v, TT*(v)⟩ = ⟨T*(v), T*(v)⟩
Hence, by [I_3] in the definition of the inner product in Section 7.2, T(v) = 0 if and only if T*(v) = 0.
(b) We show that T − λI commutes with its adjoint:
(T − λI)(T − λI)* = (T − λI)(T* − λ̄I) = TT* − λ̄T − λT* + λλ̄I
= T*T − λ̄T − λT* + λ̄λI = (T* − λ̄I)(T − λI) = (T − λI)*(T − λI)
Thus, T − λI is normal.
  • 397. (c) If TðvÞ ¼ lv, then ðT lIÞðvÞ ¼ 0. Now T lI is normal by (b); therefore, by (a), ðT lIÞ*ðvÞ ¼ 0. That is, ðT* lIÞðvÞ ¼ 0; hence, T*ðvÞ ¼ lv. (d) We show that l1hv; wi ¼ l2hv; wi: l1hv; wi ¼ hl1v; wi ¼ hTðvÞ; wi ¼ hv; T*ðwÞi ¼ hv; l2wi ¼ l2hv; wi But l1 6¼ l2; hence, hv; wi ¼ 0. 13.19. Prove Theorem 13.13: Let T be a normal operator on a complex finite-dimensional inner product space V. Then there exists an orthonormal basis of V consisting of eigenvectors of T. (Thus, T can be represented by a diagonal matrix relative to an orthonormal basis.) The proof is by induction on the dimension of V. If dim V ¼ 1, then the theorem trivially holds. Now suppose dim V ¼ n 1. Because V is a complex vector space, T has at least one eigenvalue and hence a nonzero eigenvector v. Let W be the subspace of V spanned by v, and let u1 be a unit vector in W. Because v is an eigenvector of T, the subspace W is invariant under T. However, v is also an eigenvector of T* by Problem 13.18; hence, W is also invariant under T*. By Problem 13.8, W? is invariant under T** ¼ T. The remainder of the proof is identical with the latter part of the proof of Theorem 13.11 (Problem 13.14). 13.20. Prove Theorem 13.14: Let T be any operator on a complex finite-dimensional inner product space V. Then T can be represented by a triangular matrix relative to an orthonormal basis of V. The proof is by induction on the dimension of V. If dim V ¼ 1, then the theorem trivially holds. Now suppose dim V ¼ n 1. Because V is a complex vector space, T has at least one eigenvalue and hence at least one nonzero eigenvector v. Let W be the subspace of V spanned by v, and let u1 be a unit vector in W. Then u1 is an eigenvector of T and, say, Tðu1Þ ¼ a11u1. By Theorem 7.4, V ¼ W W? . Let E denote the orthogonal projection V into W? . Clearly W? is invariant under the operator ET. By induction, there exists an orthonormal basis fu2; . . . ; ung of W? such that, for i ¼ 2; . . . ; n, ETðuiÞ ¼ ai2u2 þi3 u3 þ þ aiiui (Note that fu1; u2; . . . ; ung is an orthonormal basis of V.) But E is the orthogonal projection of V onto W? ; hence, we must have TðuiÞ ¼ ai1u1 þ ai2u2 þ þ aiiui for i ¼ 2; . . . ; n. This with Tðu1Þ ¼ a11u1 gives us the desired result. Miscellaneous Problems 13.21. Prove Theorem 13.10B: The following are equivalent: (i) P ¼ T2 for some self-adjoint operator T. (ii) P ¼ S*S for some operator S; that is, P is positive. (iii) P is self-adjoint and hPðuÞ; ui 0 for every u 2 V. Suppose (i) holds; that is, P ¼ T2 where T ¼ T*. Then P ¼ TT ¼ T*T, and so (i) implies (ii). Now suppose (ii) holds. Then P* ¼ ðS*SÞ* ¼ S*S** ¼ S*S ¼ P, and so P is self-adjoint. Furthermore, hPðuÞ; ui ¼ hS*SðuÞ; ui ¼ hSðuÞ; SðuÞi 0 Thus, (ii) implies (iii), and so it remains to prove that (iii) implies (i). Now suppose (iii) holds. Because P is self-adjoint, there exists an orthonormal basis fu1; . . . ; ung of V consisting of eigenvectors of P; say, PðuiÞ ¼ liui. By Theorem 13.4, the li are real. Using (iii), we show that the li are nonnegative. We have, for each i, 0 hPðuiÞ; uii ¼ hliui; uii ¼ lihui; uii Thus, hui; uii 0 forces li 0; as claimed. Accordingly, ffiffiffiffi li p is a real number. Let T be the linear operator defined by TðuiÞ ¼ ffiffiffiffi li p ui for i ¼ 1; . . . ; n 390 CHAPTER 13 Linear Operators on Inner Product Spaces
  • 398. Because T is represented by a real diagonal matrix relative to the orthonormal basis fuig, T is self-adjoint. Moreover, for each i, T2 ðuiÞ ¼ Tð ffiffiffiffi li p uiÞ ¼ ffiffiffiffi li p TðiiÞ ¼ ffiffiffiffi li p ffiffiffiffi li p ui ¼ liui ¼ PðuiÞ Because T2 and P agree on a basis of V; P ¼ T2 . Thus, the theorem is proved. Remark: The above operator T is the unique positive operator such that P ¼ T2 ; it is called the positive square root of P. 13.22. Show that any operator T is the sum of a self-adjoint operator and a skew-adjoint operator. Set S ¼ 1 2 ðT þ T*Þ and U ¼ 1 2 ðT T*Þ: Then T ¼ S þ U; where S* ¼ ½1 2 ðT þ T*Þ* ¼ 1 2 ðT* þ T**Þ ¼ 1 2 ðT* þ TÞ ¼ S U* ¼ ½1 2 ðT T*Þ* ¼ 1 2 ðT* TÞ ¼ 1 2 ðT T*Þ ¼ U and that is, S is self-adjoint and U is skew-adjoint. 13.23. Prove: Let T be an arbitrary linear operator on a finite-dimensional inner product space V. Then T is a product of a unitary (orthogonal) operator U and a unique positive operator P; that is, T ¼ UP. Furthermore, if T is invertible, then U is also uniquely determined. By Theorem 13.10, T*T is a positive operator; hence, there exists a (unique) positive operator P such that P2 ¼ T*T (Problem 13.43). Observe that kPðvÞk2 ¼ hPðvÞ; PðvÞi ¼ hP2 ðvÞ; vi ¼ hT*TðvÞ; vi ¼ hTðvÞ; TðvÞi ¼ kTðvÞk2 ð1Þ We now consider separately the cases when T is invertible and noninvertible. If T is invertible, then we set ^ U ¼ PT1 . We show that ^ U is unitary: ^ U* ¼ ðPT1 Þ* ¼ T1 P* ¼ ðT*Þ1 P and ^ U ^ U ¼ ðT*Þ1 PPT1 ¼ ðT*Þ1 T*TT1 ¼ I Thus, ^ U is unitary. We next set U ¼ ^ U1 . Then U is also unitary, and T ¼ UP as required. To prove uniqueness, we assume T ¼ U0P0, where U0 is unitary and P0 is positive. Then T*T ¼ P0 *U0 *U0P0 ¼ P0IP0 ¼ P2 0 But the positive square root of T*T is unique (Problem 13.43); hence, P0 ¼ P. (Note that the invertibility of T is not used to prove the uniqueness of P.) Now if T is invertible, then P is also invertible by (1). Multiplying U0P ¼ UP on the right by P1 yields U0 ¼ U. Thus, U is also unique when T is invertible. Now suppose T is not invertible. Let W be the image of P; that is, W ¼ Im P. We define U1 :W ! V by U1ðwÞ ¼ TðvÞ; where PðvÞ ¼ w ð2Þ We must show that U1 is well defined; that is, that PðvÞ ¼ Pðv0 Þ implies TðvÞ ¼ Tðv0 Þ. This follows from the fact that Pðv v0 Þ ¼ 0 is equivalent to kPðv v0 Þk ¼ 0, which forces kTðv v0 Þk ¼ 0 by (1). Thus, U1 is well defined. We next define U2 :W ! V. Note that, by (1), P and T have the same kernels. Hence, the images of P and T have the same dimension; that is, dimðIm PÞ ¼ dim W ¼ dimðIm TÞ. Consequently, W? and ðIm TÞ? also have the same dimension. We let U2 be any isomorphism between W? and ðIm TÞ? . We next set U ¼ U1 U2. [Here U is defined as follows: If v 2 V and v ¼ w þ w0 , where w 2 W, w0 2 W? , then UðvÞ ¼ U1ðwÞ þ U2ðw0 Þ.] Now U is linear (Problem 13.69), and, if v 2 V and PðvÞ ¼ w, then, by (2), TðvÞ ¼ U1ðwÞ ¼ UðwÞ ¼ UPðvÞ Thus, T ¼ UP, as required. It remains to show that U is unitary. Now every vector x 2 V can be written in the form x ¼ PðvÞ þ w0 , where w0 2 W? . Then UðxÞ ¼ UPðvÞ þ U2ðw0 Þ ¼ TðvÞ þ U2ðw0 Þ, where hTðvÞ; U2ðw0 Þi ¼ 0 by definition CHAPTER 13 Linear Operators on Inner Product Spaces 391
  • 399. of U2. Also, hTðvÞ; TðvÞi ¼ hPðvÞ; PðvÞi by (1). Thus, hUðxÞ; UðxÞi ¼ hTðvÞ þ U2ðw0 Þ; TðvÞ þ U2ðw0 Þi ¼ hTðvÞ; TðvÞi þ hU2ðw0 Þ; U2ðw0 Þi ¼ hPðvÞ; PðvÞi þ hw0 ; w0 i ¼ hPðvÞ þ w0 ; PðvÞ þ w0 Þ ¼ hx; xi [We also used the fact that hPðvÞ; w0 i ¼ 0: Thus, U is unitary, and the theorem is proved. 13.24. Let V be the vector space of polynomials over R with inner product defined by h f ; gi ¼ ð1 0 f ðtÞgðtÞ dt Give an example of a linear functional f on V for which Theorem 13.3 does not hold—that is, for which there is no polynomial hðtÞ such that fð f Þ ¼ h f ; hi for every f 2 V. Let f:V ! R be defined by fð f Þ ¼ f ð0Þ; that is, f evaluates f ðtÞ at 0, and hence maps f ðtÞ into its constant term. Suppose a polynomial hðtÞ exists for which fðf Þ ¼ f ð0Þ ¼ ð1 0 f ðtÞhðtÞ dt ð1Þ for every polynomial f ðtÞ. Observe that f maps the polynomial tf ðtÞ into 0; hence, by (1), ð1 0 tf ðtÞhðtÞ dt ¼ 0 ð2Þ for every polynomial f ðtÞ. In particular (2) must hold for f ðtÞ ¼ thðtÞ; that is, ð1 0 t2 h2 ðtÞ dt ¼ 0 This integral forces hðtÞ to be the zero polynomial; hence, fð f Þ ¼ h f ; hi ¼ h f ; 0i ¼ 0 for every polynomial f ðtÞ. This contradicts the fact that f is not the zero functional; hence, the polynomial hðtÞ does not exist. SUPPLEMENTARY PROBLEMS Adjoint Operators 13.25. Find the adjoint of: (a) A ¼ 5 2i 3 þ 7i 4 6i 8 þ 3i ; (b) B ¼ 3 5i i 2i ; (c) C ¼ 1 1 2 3 13.26. Let T :R3 ! R3 be defined by Tðx; y; zÞ ¼ ðx þ 2y; 3x 4z; yÞ: Find T*ðx; y; zÞ: 13.27. Let T :C3 ! C3 be defined by Tðx; y; zÞ ¼ ½ix þ ð2 þ 3iÞy; 3x þ ð3 iÞz; ð2 5iÞy þ iz: Find T*ðx; y; zÞ: 13.28. For each linear function f on V; find u 2 V such that fðvÞ ¼ hv; ui for every v 2 V: (a) f:R3 ! R defined by fðx; y; zÞ ¼ x þ 2y 3z: (b) f:C3 ! C defined by fðx; y; zÞ ¼ ix þ ð2 þ 3iÞy þ ð1 2iÞz: 13.29. Suppose V has finite dimension. Prove that the image of T* is the orthogonal complement of the kernel of T; that is, Im T* ¼ ðKer TÞ? : Hence, rankðTÞ ¼ rankðT*Þ: 13.30. Show that T*T ¼ 0 implies T ¼ 0: 392 CHAPTER 13 Linear Operators on Inner Product Spaces
  • 400. 13.31. Let V be the vector space of polynomials over R with inner product defined by h f ; gi ¼ Ð1 0 f ðtÞgðtÞ dt: Let D be the derivative operator on V; that is, Dð f Þ ¼ df =dt: Show that there is no operator D* on V such that hDð f Þ; gi ¼ h f ; D*ðgÞi for every f ; g 2 V: That is, D has no adjoint. Unitary and Orthogonal Operators and Matrices 13.32. Find a unitary (orthogonal) matrix whose first row is (a) ð2= ffiffiffiffiffi 13 p ; 3= ffiffiffiffiffi 13 p Þ, (b) a multiple of ð1; 1 iÞ, (c) a multiple of ð1; i; 1 iÞ: 13.33. Prove that the products and inverses of orthogonal matrices are orthogonal. (Thus, the orthogonal matrices form a group under multiplication, called the orthogonal group.) 13.34. Prove that the products and inverses of unitary matrices are unitary. (Thus, the unitary matrices form a group under multiplication, called the unitary group.) 13.35. Show that if an orthogonal (unitary) matrix is triangular, then it is diagonal. 13.36. Recall that the complex matrices A and B are unitarily equivalent if there exists a unitary matrix P such that B ¼ P*AP. Show that this relation is an equivalence relation. 13.37. Recall that the real matrices A and B are orthogonally equivalent if there exists an orthogonal matrix P such that B ¼ PT AP. Show that this relation is an equivalence relation. 13.38. Let W be a subspace of V. For any v 2 V, let v ¼ w þ w0 , where w 2 W, w0 2 W? . (Such a sum is unique because V ¼ W W? .) Let T :V ! V be defined by TðvÞ ¼ w w0 . Show that T is self-adjoint unitary operator on V. 13.39. Let V be an inner product space, and suppose U :V ! V (not assumed linear) is surjective (onto) and preserves inner products; that is, hUðvÞ; UðwÞi ¼ hu; wi for every v; w 2 V. Prove that U is linear and hence unitary. Positive and Positive Definite Operators 13.40. Show that the sum of two positive (positive definite) operators is positive (positive definite). 13.41. Let T be a linear operator on V and let f :V V ! K be defined by f ðu; vÞ ¼ hTðuÞ; vi. Show that f is an inner product on V if and only if T is positive definite. 13.42. Suppose E is an orthogonal projection onto some subspace W of V. Prove that kI þ E is positive (positive definite) if k 0 ðk 0Þ. 13.43. Consider the operator T defined by TðuiÞ ¼ ffiffiffiffi li p ui; i ¼ 1; . . . ; n, in the proof of Theorem 13.10A. Show that T is positive and that it is the only positive operator for which T2 ¼ P. 13.44. Suppose P is both positive and unitary. Prove that P ¼ I. 13.45. Determine which of the following matrices are positive (positive definite): ðiÞ 1 1 1 1 ; ðiiÞ 0 i i 0 ; ðiiiÞ 0 1 1 0 ; ðivÞ 1 1 0 1 ; ðvÞ 2 1 1 2 ; ðviÞ 1 2 2 1 13.46. Prove that a 2 2 complex matrix A ¼ a b c d is positive if and only if (i) A ¼ A*, and (ii) a; d and jAj ¼ ad bc are nonnegative real numbers. CHAPTER 13 Linear Operators on Inner Product Spaces 393
  • 401. 13.47. Prove that a diagonal matrix A is positive (positive definite) if and only if every diagonal entry is a nonnegative (positive) real number. Self-adjoint and Symmetric Matrices 13.48. For any operator T, show that T þ T* is self-adjoint and T T* is skew-adjoint. 13.49. Suppose T is self-adjoint. Show that T2 ðvÞ ¼ 0 implies TðvÞ ¼ 0. Using this to prove that Tn ðvÞ ¼ 0 also implies that TðvÞ ¼ 0 for n 0. 13.50. Let V be a complex inner product space. Suppose hTðvÞ; vi is real for every v 2 V. Show that T is self- adjoint. 13.51. Suppose T1 and T2 are self-adjoint. Show that T1T2 is self-adjoint if and only if T1 and T1 commute; that is, T1T2 ¼ T2T1. 13.52. For each of the following symmetric matrices A, find an orthogonal matrix P and a diagonal matrix D such that PT AP is diagonal: (a) A ¼ 1 2 2 2 ; (b) A ¼ 5 4 4 1 , (c) A ¼ 7 3 3 1 13.53. Find an orthogonal change of coordinates X ¼ PX0 that diagonalizes each of the following quadratic forms and find the corresponding diagonal quadratic form qðx0 Þ: (a) qðx; yÞ ¼ 2x2 6xy þ 10y2 , (b) qðx; yÞ ¼ x2 þ 8xy 5y2 (c) qðx; y; zÞ ¼ 2x2 4xy þ 5y2 þ 2xz 4yz þ 2z2 Normal Operators and Matrices 13.54. Let A ¼ 2 i i 2 . Verify that A is normal. Find a unitary matrix P such that P*AP is diagonal. Find P*AP. 13.55. Show that a triangular matrix is normal if and only if it is diagonal. 13.56. Prove that if T is normal on V, then kTðvÞk ¼ kT*ðvÞk for every v 2 V. Prove that the converse holds in complex inner product spaces. 13.57. Show that self-adjoint, skew-adjoint, and unitary (orthogonal) operators are normal. 13.58. Suppose T is normal. Prove that (a) T is self-adjoint if and only if its eigenvalues are real. (b) T is unitary if and only if its eigenvalues have absolute value 1. (c) T is positive if and only if its eigenvalues are nonnegative real numbers. 13.59. Show that if T is normal, then T and T* have the same kernel and the same image. 13.60. Suppose T1 and T2 are normal and commute. Show that T1 þ T2 and T1T2 are also normal. 13.61. Suppose T1 is normal and commutes with T2. Show that T1 also commutes with T2 *. 13.62. Prove the following: Let T1 and T2 be normal operators on a complex finite-dimensional vector space V. Then there exists an orthonormal basis of V consisting of eigenvectors of both T1 and T2. (That is, T1 and T2 can be simultaneously diagonalized.) Isomorphism Problems for Inner Product Spaces 13.63. Let S ¼ fu1; . . . ; ung be an orthonormal basis of an inner product space V over K. Show that the mapping v 7! ½vs is an (inner product space) isomorphism between V and Kn . (Here ½vS denotes the coordinate vector of v in the basis S.) 394 CHAPTER 13 Linear Operators on Inner Product Spaces
  • 402. 13.64. Show that inner product spaces V and W over K are isomorphic if and only if V and W have the same dimension. 13.65. Suppose fu1; . . . ; ung and fu0 1; . . . ; u0 ng are orthonormal bases of V and W, respectively. Let T :V ! W be the linear map defined by TðuiÞ ¼ u0 i for each i. Show that T is an isomorphism. 13.66. Let V be an inner product space. Recall that each u 2 V determines a linear functional ^ u in the dual space V* by the definition ^ uðvÞ ¼ hv; ui for every v 2 V. (See the text immediately preceding Theorem 13.3.) Show that the map u 7! ^ u is linear and nonsingular, and hence an isomorphism from V onto V*. Miscellaneous Problems 13.67. Suppose fu1; . . . ; ung is an orthonormal basis of V: Prove (a) ha1u1 þ a2u2 þ þ anun; b1u1 þ b2u2 þ þ bnuni ¼ a1 b1 þ a2 b2 þ . . . an bn (b) Let A ¼ ½aij be the matrix representing T: V ! V in the basis fuig: Then aij ¼ hTðuiÞ; uji: 13.68. Show that there exists an orthonormal basis fu1; . . . ; ung of V consisting of eigenvectors of T if and only if there exist orthogonal projections E1; . . . ; Er and scalars l1; . . . ; lr such that (i) T ¼ l1E1 þ þ lrEr, (ii) E1 þ þ Er ¼ I, (iii) EiEj ¼ 0 for i 6¼ j 13.69. Suppose V ¼ U W and suppose T1:U ! V and T2 :W ! V are linear. Show that T ¼ T1 T2 is also linear. Here T is defined as follows: If v 2 V and v ¼ u þ w where u 2 U, w 2 W, then TðvÞ ¼ T1ðuÞ þ T2ðwÞ ANSWERS TO SUPPLEMENTARY PROBLEMS Notation: ½R1; R2; . . . ; Rn denotes a matrix with rows R1; R2; . . . ; Rn. 13.25. (a) ½5 þ 2i; 4 þ 6i; 3 7i; 8 3i, (b) ½3; i; 5i; 2i, (c) ½1; 2; 1; 3 13.26. T*ðx; y; zÞ ¼ ðx þ 3y; 2x þ z; 4yÞ 13.27. T*ðx; y; zÞ ¼ ½ix þ 3y; ð2 3iÞx þ ð2 þ 5iÞz; ð3 þ iÞy iz 13.28. (a) u ¼ ð1; 2; 3Þ, (b) u ¼ ði; 2 3i; 1 þ 2iÞ 13.32. (a) ð1= ffiffiffiffiffi 13 p Þ½2; 3; 3; 2, (b) ð1= ffiffiffi 3 p Þ½1; 1 i; 1 þ i; 1, (c) 1 2 ½1; i; 1 i; ffiffiffi 2 p i; ffiffiffi 2 p ; 0; 1; i; 1 þ i 13.45. Only (i) and (v) are positive. Only (v) is positive definite. 13.52. (a and b) P ¼ ð1= ffiffiffi 5 p Þ½2; 1; 1; 2, (c) P ¼ ð1= ffiffiffiffiffi 10 p Þ½3; 1; 1; 3 (a) D ¼ ½2; 0; 0; 3; (b) D ¼ ½7; 0; 0; 3; (c) D ¼ ½8; 0; 0; 2 13.53. (a) x ¼ ð3x0 y0 Þ= ffiffiffiffiffi 10 p ; y ¼ ðx0 þ 3y0 Þ= ffiffiffiffiffi 10 p ; (b) x ¼ ð2x0 y0 Þ= ffiffiffi 5 p ; y ¼ ðx0 þ 2y0 Þ= ffiffiffi 5 p ; (c) x ¼ x0 = ffiffiffi 3 p þ y0 = ffiffiffi 2 p þ z0 = ffiffiffi 6 p ; y ¼ x0 = ffiffiffi 3 p 2z0 = ffiffiffi 6 p ; z ¼ x0 = ffiffiffi 3 p y0 = ffiffiffi 2 p þ z0 = ffiffiffi 6 p ; (a) qðx0 Þ ¼ diagð1; 11Þ; (b) qðx0 Þ ¼ diagð3; 7Þ; (c) qðx0 Þ ¼ diagð1; 17Þ 13.54. (a) P ¼ ð1= ffiffiffi 2 p Þ½1; 1; 1; 1; P*AP ¼ diagð2 þ i; 2 iÞ CHAPTER 13 Linear Operators on Inner Product Spaces 395
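As a numerical companion to Problem 13.23 (the polar decomposition T = UP), the following sketch, which is not part of the original text and assumes NumPy, computes P as the positive square root of T*T and then sets U = TP⁻¹ for an invertible T; SciPy users can obtain the same factorization from scipy.linalg.polar.

import numpy as np

rng = np.random.default_rng(1)
T = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))   # invertible with probability 1

# P = positive square root of T*T, built from an orthonormal eigenbasis (cf. Problem 13.21).
w, V = np.linalg.eigh(T.conj().T @ T)          # eigenvalues w are real and nonnegative
P = V @ np.diag(np.sqrt(w)) @ V.conj().T

U = T @ np.linalg.inv(P)                        # U = T P^(-1)

print(np.allclose(U.conj().T @ U, np.eye(3)))   # U is unitary
print(np.allclose(U @ P, T))                    # T = UP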
  • 403. Multilinear Products A.1 Introduction The material in this appendix is much more abstract than that which has previously appeared. Accordingly, many of the proofs will be omitted. Also, we motivate the material with the following observation. Let S be a basis of a vector space V. Theorem 5.2 may be restated as follows. THEOREM 5.2: Let g:S ! V be the inclusion map of the basis S into V. Then, for any vector space U and any mapping f :S ! U; there exists a unique linear mapping f :V ! U such that f ¼ f g: Another way to state the fact that f ¼ f g is that the diagram in Fig. A-1(a) commutes. A.2 Bilinear Mapping and Tensor Products Let U, V, W be vector spaces over a field K. Consider a map f : V W ! U Then f is said to be bilinear if, for each v 2 V; the map fv :W ! U defined by fv w ð Þ ¼ f v; w ð Þ is linear; and, for each w 2 W; the map fw :V ! U defined by fw v ð Þ ¼ f v; w ð Þ is linear. That is, f is linear in each of its two variables. Note that f is similar to a bilinear form except that the values of the map f are in a vector space U rather than the field K. DEFINITION A.1: Let V and W be vector spaces over the same field K. The tensor product of V and W is a vector space T over K together with a bilinear map g : V W ! T; denoted by g v; w ð Þ ¼ v w; with the following property: (*) For any vector space U over K and any bilinear map f :V W ! U there exists a unique linear map f :T ! U such that f g ¼ f : The tensor product (T, g) [or simply T when g is understood] of V and W is denoted by V W; and the element v w is called the tensor of v and w. Another way to state condition (*) is that the diagram in Fig. A-1(b) commutes. The fact that such a unique linear map f* exists is called the ‘‘Universal Mapping Principle’’ (UMP). As illustrated in Fig. A-1(b), condition (*) also says that any bilinear map f :V W ! U ‘‘factors through’’ the tensor product T ¼ V W: The uniqueness in (*) implies that the image of g spans T; that is, span v w f g ð Þ ¼ T: APPENDIX A Figure A-1 396
  • 404. THEOREM A.1: (Uniqueness of Tensor Products) Let (T, g) and T0 ; g0 ð Þ be tensor products of V and W. Then there exists a unique isomorphism h:T ! T0 such that hg ¼ g0 : Proof. Because T is a tensor product, and g0 :V W ! T0 is bilinear, there exists a unique linear map h:T ! T0 such that hg ¼ g0 : Similarly, because T0 is a tensor product, and g:V W ! T0 is bilinear, there exists a unique linear map h0 :T0 ! T such that h0 g0 ¼ g: Using hg ¼ g0 , we get h0 hg ¼ g: Also, because T is a tensor product, and g:V W ! T is bilinear, there exists a unique linear map h :T ! T such that h g ¼ g: But 1Tg ¼ g: Thus, h0 h ¼ h ¼ 1T. Similarly, hh0 ¼ 1T0 : Therefore, h is an isomorphism from T to T0 : THEOREM A.2: (Existence of Tensor Product) The tensor product T ¼ V W of vector spaces V and W over K exists. Let v1; . . . ; vm f g be a basis of V and let w1; . . . ; wn f g be a basis of W. Then the mn vectors vi wi i ¼ 1; . . . ; m; j ¼ 1; . . . ; n ð Þ form a basis of T. Thus, dim T ¼ mn ¼ dim V ð Þ dim W ð Þ: Outline of Proof. Suppose v1; . . . ; vm is a basis of V, and suppose w1; . . . ; wn f g is a basis of W. Consider the mn symbols tijji ¼ i; . . . ; m; j ¼ 1; . . . ; n . Let T be the vector space generated by the tij. That is, T consists of all linear combinations of the tij with coefficients in K. [See Problem 4.137.] Let v 2 V and w 2 W. Say v ¼ a1v1 þ a2v2 þ þ amvm and w ¼ b1w1 þ b2w2 þ þ bmwm Let g:V W ! T be defined by g v; w ð Þ ¼ X i X j aibjtij Then g is bilinear. [Proof left to reader.] Now let f :V W ! U be bilinear. Because the tij form a basis of T, Theorem 5.2 (stated above) tells us that there exists a unique linear map f :T ! U such that f tij ¼ f vi; wj . Then, for v ¼ P i aivi and w ¼ P j bjwj, we have f ðv; wÞ ¼ f X i aivi; X j bjwj ! ¼ X i X j aibj f vi; wj ¼ X i X j aibj tij ¼ f g v; w ð Þ ð Þ: Therefore, f ¼ f g where f * is the required map in Definition A.1. Thus, T is a tensor product. Let fv0 1; . . . ; v0 mg be any basis of V and fw0 1; . . . ; w0 mg be any basis of W. Let v 2 V and w 2 W and say v ¼ a0 1v0 1 þ þ a0 mv0 m and w ¼ b0 1w0 1 þ þ b0 mw0 m Then v w ¼ g v; w ð Þ ¼ X i X j a0 ib0 i g v0 i; w0 i ð Þ ¼ X i X j a0 ib0 j v0 i w0 j Thus, the elements v0 i w0 j span T. There are mn such elements. They cannot be linearly dependent because tij is a basis of T, and hence, dim T ¼ mn. Thus, the v0 i w0 j form a basis of T. Next we give two concrete examples of tensor products. EXAMPLE A.1 Let V be the vector space of polynomials Pr1 x ð Þ and let W be the vector space of polynomials Ps1 y ð Þ. Thus, the following from bases of V and W, respectively, 1; x; x2 ; . . . ; xr1 and 1; y; y2 ; . . . ; ys1 In particular, dim V ¼ r and dim W ¼ s: Let T be the vector space of polynomials in variables x and y with basis xi y j where i ¼ 0; 1; . . . ; r 1; j ¼ 0; 1; . . . ; s 1 Appendix A Multilinear Products 397
  • 405. Then T is the tensor product V W under the mapping xi y j ¼ xi yi For example, suppose v ¼ 2 5x þ 3x3 and w ¼ 7y þ 4y2 . Then v w ¼ 14y þ 8y2 35xy 20xy2 þ 21x3 y þ 12x3 y2 Note, dim T ¼ rs ¼ dim V ð Þ dim W ð Þ: EXAMPLE A.2 Let V be the vector space of m n matrices over a field K and let W be the vector space of p q matrices over K. Suppose A ¼ ½a11 belongs to V, and B belongs to W. Let T be the vector space of mp nq matrices over K. Then T is the tensor product of V and W where A B is the block matrix A B ¼ aijB ¼ a11B a12B a1nB a21B a22B a2nB am1B am2B amnB 2 6 6 4 3 7 7 5 For example, suppose A ¼ 1 2 3 4 and B ¼ 1 2 3 4 5 6 : Then A B ¼ 1 2 3 2 4 6 4 5 6 8 10 12 3 6 9 4 8 12 12 15 18 16 20 24 2 6 6 6 4 3 7 7 7 5 Isomorphisms of Tensor Products First we note that tensoring is associative in a cannonical way. Namely, THEOREM A.3: Let U, V, W be vector spaces over a field K. Then there exists a unique isomorphism U V ð Þ W ! U V W ð Þ such that, for every u 2 U; v 2 V; w 2 W; u v ð Þ w 7! u v w ð Þ Accordingly, we may omit parenthesis when tensoring any number of factors. Specifically, given vectors spaces V1; V2; . . . ; Vm over a field K, we may unambiguously form their tensor product V1 V2 . . . Vm and, for vectors vj in Vj, we may unambiguously form the tensor product v1 v2 . . . vm Moreover, given a vector space V over K, we may unambiguously define the following tensor product: r V ¼ V V . . . V r factors ð Þ Also, there is a canonical isomorphism r V ð Þ s V ð Þ ! rþs V Furthermore, viewing K as a vector space over itself, we have the canonical isomorphism K V ! V where we define a v ¼ av: 398 Appendix A Multilinear Products
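The block matrix A ⊗ B of Example A.2 is exactly what NumPy's kron function produces. The following check is added for illustration only (it is not part of the original text) and reproduces the 4 × 6 matrix computed above.

import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[1, 2, 3],
              [4, 5, 6]])

# np.kron builds the block matrix [a_ij * B], i.e., the tensor product of Example A.2.
print(np.kron(A, B))
# [[ 1  2  3  2  4  6]
#  [ 4  5  6  8 10 12]
#  [ 3  6  9  4  8 12]
#  [12 15 18 16 20 24]]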
  • 406. A.3 Alternating Multilinear Maps Let f :Vr ! U where V and U are vector spaces over K. [Recall Vr ¼ V V . . . V, r factors.] (1) The mapping f is said to be multilinear or r-linear if f v1; . . . ; vr ð Þ is linear as a function of each vj when the other vi’s are held fixed. That is, f ð. . . ; vj þ v0 j; . . .Þ ¼ f ð. . . ; vj; . . .Þ þ f ð. . . ; v0 j; . . .Þ f ð. . . ; kvj; . . .Þ ¼ kf ð. . . ; vj; . . .Þ where only the jth position changes. (2) The mapping f is said to be alternating if f v1; . . . ; vr ð Þ ¼ 0 whenever vi ¼ vj with i 6¼ j One can easily show (Prove!) that if f is an alternating multilinear mapping on Vr , then f . . . ; vi; . . . ; vj; . . . ¼ f . . . ; vj; . . . ; vi; . . . That is, if two of the vectors are interchanged, then the associated value changes sign. EXAMPLE A.3 (Determinants) The determinant function D:M ! K on the space M of n n matrices may be viewed as an n-variable function D A ð Þ ¼ D R1; R2; . . . ; Rn ð Þ defined on the rows R1; R2; . . . ; Rn of A. Recall (Chapter 8) that, in this context, D is both n-linear and alternating. We now need some additional notation. Let K ¼ k1; k2; . . . ; kr ½ denote an r-list (r-tuple) of elements from In ¼ 1; 2; . . . ; n ð Þ. We will then use the following notation where the vk’s denote vectors and the aik’s denote scalars: vK ¼ ðvk1 ; vk2 ; . . . ; vkr Þ and aK ¼ a1k1 a2k2 . . . arkr Note vK is a list of r vectors, and aK is a product of r scalars. Now suppose the elements in K ¼ k1; k2; . . . ; kr ½ are distinct. Then K is a permutation sK of an r-list J ¼ i1; i2; . . . ; ir ½ in standard form, that is, where i1 i2 . . . ir. The number of such standard-form r-lists J from In is the binomial coefficient: n r ¼ n! r! n r ð Þ! [Recall sign sK ð Þ ¼ 1 ð ÞmK where mK is the number of interchanges that transforms K into J.] Now suppose A ¼ aij is an r n matrix. For a given ordered r-list J, we define DJ A ð Þ ¼ a1i1 a1i2 . . . a1ir a2i1 a2i2 . . . a2ir ari1 ari2 . . . arir That is, DJ (A) is the determinant of the r r submatrix of A whose column subscripts belong to J. Our main theorem below uses the following ‘‘shuffling’’ lemma. LEMMA A.4 Let V and U be vector spaces over K, and let f :Vr ! U be an alternating r-linear mapping. Let v1; v2; . . . ; vn be vectors in V and let A ¼ aij be an r n matrix over K where r n. For i ¼ 1; 2; . . . ; r, let ui ¼ ai1vi þ ai2v2 þ þ ainvn ...................... Appendix A Multilinear Products 399
  • 407. Then f u1; . . . ; ur ð Þ ¼ X f DJ A ð Þf ðvi1 ; vi2 ; . . . ; vir Þ where the sum is over all standard-form r-lists J ¼ i1; i2; . . . ; ir f g. The proof is technical but straightforward. The linearity of f gives us the sum f u1; . . . ; ur ð Þ ¼ X K aKf vK ð Þ where the sum is over all r-lists K from 1; . . . ; n f g. The alternating property of f tells us that f vK ð Þ ¼ 0 when K does not contain distinct integers. The proof now mainly uses the fact that as we interchange the vj’s to transform f vK ð Þ ¼ f ðvk1 ; vk2 ; . . . ; vkr Þ to f vj ¼ f ðvi1 ; vi2 ; . . . ; vir Þ so that i1 ir, the associated sign of aK, will change in the same way as the sign of the corresponding permutation sK changes when it is transformed to the identity permutation using transpositions. We illustrate the lemma below for r ¼ 2 and n ¼ 3. EXAMPLE A.4 Suppose f :V2 ! U is an alternating multilinear function. Let v1; v2; v3 2 V and let u; w 2 V. Suppose u ¼ a1v1 þ a2v2 þ a3v3 and w ¼ b1v1 þ b2v2 þ b3v3 Consider f u; w ð Þ ¼ f a1v1 þ a2v2 þ a3v3; b1v1 þ b2v2 þ b3v3 ð Þ Using multilinearity, we get nine terms: f u; w ð Þ ¼ a1b1 f v1; vr ð Þ þ a1b2 f v1; v2 ð Þ þ a1b3 f v1; v3 ð Þ þ a2b1 f v2; v1 ð Þ þ a2b2 f v2; v2 ð Þ þ a2b3 f v2; v3 ð Þ þ a3b1 f v3; v1 ð Þ þ a3b2 f v3; v2 ð Þ þ a3b3 f v3; v3 ð Þ (Note that J ¼ 1; 2 ½ ; J0 ¼ 1; 3 ½ and J00 ¼ 2; 3 ½ are the three standard-form 2-lists of I ¼ 1; 2; 3 ½ .) The alternating property of f tells us that each f vi; vi ð Þ ¼ 0; hence, three of the above nine terms are equal to 0. The alternating property also tells us that f vi; vf ¼ f vf ; vr . Thus, three of the terms can be transformed so their subscripts form a standard-form 2-list by a single interchange. Finally we obtain f u; w ð Þ ¼ a1b2 a2b1 ð Þ f v1; v2 ð Þ þ a1b3 a3b1 ð Þ f v1; v3 ð Þ þ a2b3 a3b2 ð Þ f v2; v3 ð Þ ¼ a1 a2 b1 b2 f v1; v2 ð Þ þ a1 a3 b1 b3 f v1; v3 ð Þ þ a2 a3 b2 b3 f v2; v3 ð Þ which is the content of Lemma A.4. A.4 Exterior Products The following definition applies. DEFINITION A.2: Let V be an n-dimensionmal vector space over a field K, and let r be an integer such that 1 r n. The r-fold exterior product (or simply exterior product when r is understood) is a vector space E over K together with an alternating r-linear mapping g:Vr ! E, denoted by g v1; . . . ; vr ð Þ ¼ v1 ^ .. . ^ vr, with the following property: (*) For any vector space U over K and any alternating r-linear map f :Vr ! U there exists a unique linear map f :E ! U such that f g ¼ f . 400 Appendix A Multilinear Products
  • 408. The r-fold tensor product (E, g) (or simply E when g is understood) of V is denoted by ^r V, and the element v1 ^ ^ vr is called the exterior product or wedge product of the vi’s. Another way to state condition (*) is that the diagram in Fig. A-1(c) commutes. Again, the fact that such a unique linear map f * exists is called the ‘‘Universal Mapping Principle (UMP)’’. As illustrated in Fig. A-1(c), condition (*) also says that any alternating r-linear map f :Vr ! U ‘‘factors through’’ the exterior product E ¼ ^r V. Again, the uniqueness in (*) implies that the image of g spans E; that is, span v1 ^ ^ vr ð Þ ¼ E. THEOREM A.5: (Uniqueness of Exterior Products) Let (E, g) and E0 ; g0 ð Þ be r-fold exterior products of V. Then there exists a unique isomorphism h:E ! E0 such that hg ¼ g0 . The proof is the same as the proof of Theorem A.1, which uses the UMP. THEOREM A.6: (Existence of Exterior Products) Let V be an n-dimensional vector space over K. Then the exterior product E ¼ ^r V exists. If r n, then E ¼ 0 f g. If r n, then dim E ¼ n r . Moreover, if v1; . . . ; vn ½ is a basis of V, then the vectors vi1 ^ vi2 ^ ^ vir ; where 1 i1 i2 ir n, form a basis of E. We give a concrete example of an exterior product. EXAMPLE A.5 (Cross Product) Consider V ¼ R3 with the usual basis (i, j, k). Let E ¼ ^2 V. Note dim V ¼ 3: Thus, dim E ¼ 3 with basis i ^ j; i ^ k; j ^ k: We identify E with R3 under the correspondence i ¼ j ^ k; j ¼ k ^ i ¼ i ^ k; k ¼ i ^ j Let u and w be arbitrary vectors in V ¼ R3 , say u ¼ a1; a2; a3 ð Þ ¼ a1i þ a2j þ a3k and w ¼ b1; b2; b3 ð Þ ¼ b1i þ b2j þ b3k Then, as in Example A.3, u ^ w ¼ a1b2 a2b1 ð Þði ^ jÞ þ a1b3 a3b1 ð Þði ^ kÞ þ a2b3 a3b2 ð Þðj ^ kÞ Using the above identification, we get u ^ w ¼ a2b3 a3b2 ð Þi a1b3 a3b1 ð Þj þ a1b2 a2b1 ð Þk ¼ a2 a3 b2 b3 i a1 a3 b1 b3 j þ a1 a2 b1 b2 k The reader may recognize that the above exterior product is precisely the well-known cross product in R3 . Our last theorem tells us that we are actually able to ‘‘multiply’’ exterior products, which allows us to form an ‘‘exterior algebra’’ that is illustrated below. THEOREM A.7: Let V be a vector space over K. Let r and s be positive integers. Then there is a unique bilinear mapping ^r V ^s V ! ^rþs V such that, for any vectors ui; wj in V, u1 ^ ^ ur ð Þ w1 ^ ^ ws ð Þ 7! u1 ^ ^ ur ^ w1 ^ ^ ws Appendix A Multilinear Products 401
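Example A.5 identifies the wedge product u ∧ w on R³ with the familiar cross product. The quick numerical check below (illustrative only, assuming NumPy) compares np.cross with the three 2 × 2 determinants that appear as the coefficients of i ∧ j, i ∧ k, and j ∧ k.

import numpy as np

u = np.array([1.0, 2.0, 3.0])
w = np.array([4.0, 5.0, 6.0])

# Coefficients of u ^ w in the basis (i^j, i^k, j^k): the 2x2 minors of the rows u, w.
c_ij = np.linalg.det(np.array([[u[0], u[1]], [w[0], w[1]]]))   # a1*b2 - a2*b1
c_ik = np.linalg.det(np.array([[u[0], u[2]], [w[0], w[2]]]))   # a1*b3 - a3*b1
c_jk = np.linalg.det(np.array([[u[1], u[2]], [w[1], w[2]]]))   # a2*b3 - a3*b2

# Under the identification i = j^k, j = -(i^k), k = i^j this is the cross product.
print(np.cross(u, w))                  # [-3.  6. -3.]
print(np.array([c_jk, -c_ik, c_ij]))   # [-3.  6. -3.]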
  • 409. EXAMPLE A.6 We form an exterior algebra A over a field K using noncommuting variables x, y, z. Because it is an exterior algebra, our variables satisfy: x ^ x ¼ 0; y ^ y ¼ 0; z ^ z ¼ 0; and y ^ x ¼ x ^ y; z ^ x ¼ x ^ z; z ^ y ¼ y ^ z Every element of A is a linear combination of the eight elements 1; x; y; z; x ^ y; x ^ z; y ^ z; x ^ y ^ z We multiply two ‘‘polynomials’’ in A using the usual distributive law, but now we also use the above conditions. For example, 3 þ 4y 5x ^ y þ 6x ^ z ½ ^ 5x 2y ½ ¼ 15x 6y 20x ^ y þ 12x ^ y ^ z Observe we use the fact that 4y ½ ^ 5x ½ ¼ 20y ^ x ¼ 20x ^ y and 6x ^ z ½ ^ 2y ½ ¼ 12x ^ z ^ y ¼ 12x ^ y ^ z 402 Appendix A Multilinear Products
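The bookkeeping in Example A.6 (a monomial with a repeated variable is 0, and each transposition of adjacent factors changes the sign) is easy to mechanize. The sketch below is only an illustration of that rule, not part of the original text; it encodes x, y, z as the indices 1, 2, 3 and an element of the exterior algebra as a dict mapping sorted index tuples to coefficients.

from collections import defaultdict

def sort_sign(indices):
    """Bubble-sort the indices, returning (sign of the permutation, sorted tuple)."""
    idx, sign = list(indices), 1
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    return sign, tuple(idx)

def wedge(p, q):
    """Wedge product of two elements, each a dict {sorted index tuple: coefficient}."""
    out = defaultdict(int)
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            m = m1 + m2
            if len(set(m)) < len(m):       # repeated factor, e.g., y ^ y = 0
                continue
            sign, key = sort_sign(m)
            out[key] += sign * c1 * c2
    return {k: c for k, c in out.items() if c != 0}

# x, y, z are the indices 1, 2, 3; the empty tuple () stands for the scalar 1.
p = {(): 3, (2,): 4, (1, 2): -5, (1, 3): 6}   # 3 + 4y - 5 x^y + 6 x^z
q = {(1,): 5, (2,): -2}                       # 5x - 2y
print(wedge(p, q))
# {(1,): 15, (2,): -6, (1, 2): -20, (1, 2, 3): 12}, i.e., 15x - 6y - 20 x^y + 12 x^y^z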
  • 410. Algebraic Structures B.1 Introduction We define here algebraic structures that occur in almost all branches of mathematics. In particular, we will define a field that appears in the definition of a vector space. We begin with the definition of a group, which is a relatively simple algebraic structure with only one operation and is used as a building block for many other algebraic systems. B.2 Groups Let G be a nonempty set with a binary operation; that is, to each pair of elements a; b 2 G there is assigned an element ab 2 G. Then G is called a group if the following axioms hold: G1 ½ For any a; b; c 2 G, we have ab ð Þc ¼ a bc ð Þ (the associative law). G2 ½ There exists an element e 2 G, called the identity element, such that ae ¼ ea ¼ a for every a 2 G. G3 ½ For each a 2 G there exists an element a1 2 G, called the inverse of a, such that aa1 ¼ a1 a ¼ e. A group G is said to be abelian (or: commutative) if the commutative law holds—that is, if ab ¼ ba for every a; b 2 G. When the binary operation is denoted by juxtaposition as above, the group G is said to be written multiplicatively. Sometimes, when G is abelian, the binary operation is denoted by + and G is said to be written additively. In such a case, the identity element is denoted by 0 and is called the zero element; the inverse is denoted by a and it is called the negative of a. If A and B are subsets of a group G, then we write AB ¼ abja 2 A; b 2 B f g or A þ B ¼ a þ bja 2 A; b 2 B f g We also write a for {a}. A subset H of a group G is called a subgroup of G if H forms a group under the operation of G. If H is a subgroup of G and a 2 G, then the set Ha is called a right coset of H and the set aH is called a left coset of H. DEFINITION: A subgroup H of G is called a normal subgroup if a1 Ha H for every a 2 G. Equivalently, H is normal if aH ¼ Ha for every a 2 G—that is, if the right and left cosets of H coincide. Note that every subgroup of an abelian group is normal. THEOREM B.1: Let H be a normal subgroup of G. Then the cosets of H in G form a group under coset multiplication. This group is called the quotient group and is denoted by G/H. APPENDIX B 403
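As a concrete finite illustration of cosets and quotient groups (not taken from the text), let G be the additive group Z_12 of integers modulo 12 and H = {0, 4, 8} the subgroup of multiples of 4. The sketch below lists the cosets a + H and confirms that they partition G; since G is abelian, H is normal and the four cosets form the quotient group G/H.

# Cosets of H = {0, 4, 8} in the additive group Z_12 (hypothetical example).
G = set(range(12))
H = {0, 4, 8}

cosets = {frozenset((a + h) % 12 for h in H) for a in G}
print(sorted(sorted(c) for c in cosets))
# [[0, 4, 8], [1, 5, 9], [2, 6, 10], [3, 7, 11]]

# The cosets are pairwise disjoint and their union is all of G, so they partition G.
print(set().union(*cosets) == G, len(cosets) * len(H) == len(G))   # True True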
  • 411. EXAMPLE B.1 The set Z of integers forms an abelian group under addition. (We remark that the even integers form a subgroup of Z but the odd integers do not.) Let H denote the set of multiples of 5; that is, H ¼ f. . . ; 10; 5; 0; 5; 10; . . .g. Then H is a subgroup (necessarily normal) of Z. The cosets of H in Z follow: 0 ¼ 0 þ H ¼ H ¼ . . . ; 10; 5; 0; 5; 10; . . . f g 1 ¼ 1 þ H ¼ f. . . ; 9; 4; 1; 6; 11; . . .g 2 ¼ 2 þ H ¼ . . . ; 8; 3; 2; 7; 12; . . . f g 3 ¼ 3 þ H ¼ . . . ; 7; 2; 3; 8; 13; . . . f g 4 ¼ 4 þ H ¼ . . . ; 6; 1; 4; 9; 14; . . . f g For any other integer n 2 Z, n ¼ n þ H coincides with one of the above cosets. Thus, by the above theorem, Z=H ¼ 0; 1; 2; 3; 4 f g forms a group under coset addition; its addition table follows: þ 0 1 2 3 4 0 0 1 2 3 4 1 1 2 3 4 0 2 2 3 4 0 1 3 3 4 0 1 2 4 4 0 1 2 3 This quotient group Z/H is referred to as the integers modulo 5 and is frequently denoted by Z5. Analogeusly, for any positive integer n, there exists the quotient group Zn called the integers modulo n. EXAMPLE B.2 The permutations of n symbols (see page 267) form a group under composition of mappings; it is called the symmetric group of degree n and is denoted by Sn. We investigate S3 here; its elements are E ¼ 1 2 3 1 2 3 s2 ¼ 1 2 3 3 2 1 f1 ¼ 1 2 3 2 3 1 s1 ¼ 1 2 3 1 3 2 s3 ¼ 1 2 3 2 1 3 f2 ¼ 1 2 3 3 1 2 Here 1 2 3 i j k is the permutation that maps 1 7! i; 2 7! j; 3 7! k. The multiplication table of S3 is E s1 s2 s3 f1 f2 E E s1 s2 s3 f1 f2 s1 s1 E f1 f2 s2 s3 s2 s2 f2 E f1 f3 s1 s3 s3 f1 f2 E s1 s2 f1 f1 s3 s1 s2 f2 E f2 f2 s2 s3 s1 E f1 (The element in the ath row and bth column is ab.) The set H ¼ E; s1 f g is a subgroup of S3; its right and left cosets are Right Cosets Left Cosets H ¼ E; s1 f g H ¼ E; s1 f g Hf1 ¼ f1; s2 f g f2H ¼ f1; s3 f g Hf2 ¼ f2; s3 f g f2H ¼ f2; s2 f g Observe that the right cosets and the left cosets are distinct; hence, H is not a normal subgroup of S3. A mapping f from a group G into a group G0 is called a homomorphism if f ab ð Þ ¼ f a ð Þf b ð Þ. For every a; b 2 G. (If f is also bijective, i.e., one-to-one and onto, then f is called an isomorphism and G and G0 are 404 Appendix B Algebraic Structures
  • 412. said to be isomorphic.) If f : G ! G0 is a homomorphism, then the kernel of f is the set of elements of G that map into the identity element e0 2 G0 : kernel of f ¼ a 2 G j f a ð Þ ¼ e0 f g (As usual, f(G) is called the image of the mapping f : G ! G0 .) The following theorem applies. THEOREM B.2: Let f: G ! G be a homomorphism with kernel K. Then K is a normal subgroup of G, and the quotient group G/K is isomorphic to the image of f. EXAMPLE B.3 Let G be the group of real numbers under addition, and let G0 be the group of positive real numbers under multiplication. The mapping f : G ! G0 defined by f a ð Þ ¼ 2a is a homomorphism because f a þ b ð Þ ¼ 2aþb ¼ 2a 2b ¼ f a ð Þf b ð Þ In particular, f is bijective, hence, G and G0 are isomorphic. EXAMPLE B.4 Let G be the group of nonzero complex numbers under multiplication, and let G0 be the group of nonzero real numbers under multiplication. The mapping f : G ! G0 defined by f z ð Þ ¼ jzj is a homomorphism because f z1z2 ð Þ ¼ jz1z2j ¼ jz1jjz2j ¼ f z1 ð Þ f z2 ð Þ The kernel K of f consists of those complex numbers z on the unit circle—that is, for which jzj ¼ 1. Thus, G=K is isomorphic to the image of f—that is, to the group of positive real numbers under multiplication. B.3 Rings, Integral Domains, and Fields Let R be a nonempty set with two binary operations, an operation of addition (denoted by +) and an operation of multiplication (denoted by juxtaposition). Then R is called a ring if the following axioms are satisfied: R1 ½ For any a; b; c 2 R, we have a þ b ð Þ þ c ¼ a þ b þ c ð Þ. R2 ½ There exists an element 0 2 R; called the zero element, such that a þ 0 ¼ 0 þ a ¼ a for every a 2 R: R3 ½ For each a 2 R there exists an element a 2 R, called the negative of a, such that a þ a ð Þ ¼ a ð Þ þ a ¼ 0. R4 ½ For any a; b 2 R; we have a þ b ¼ b þ a: R5 ½ For any a; b; c 2 R; we have ab ð Þc ¼ a bc ð Þ: R6 ½ For any a; b; c 2 R; we have (i) a b þ c ð Þ ¼ ab þ ac; and (ii) b þ c ð Þa ¼ ba þ ca: Observe that the axioms R1 ½ through R4 ½ may be summarized by saying that R is an abelian group under addition. Subtraction is defined in R by a b a þ b ð Þ. It can be shown (see Problem B.25) that a 0 ¼ 0 a ¼ 0 for every a 2 R: R is called a commutative ring if ab ¼ ba for every a; b 2 R: We also say that R is a ring with a unit element if there exists a nonzero element 1 2 R such that a 1 ¼ 1 a ¼ a for every a 2 R: A nonempty subset S of R is called a subring of R if S forms a ring under the operations of R. We note that S is a subring of R if and only if a; b 2 S implies a b 2 S and ab 2 S. A nonempty subset I of R is called a left ideal in R if (i) a b 2 I whenever a; b 2 I; and (ii) ra 2 I whenever r 2 R; a 2 I: Note that a left ideal I in R is also a subring of R. Similarly, we can define a right ideal and a two-sided ideal. Clearly all ideals in commutative rings are two sided. The term ideal shall mean two-sided ideal uniess otherwise specified. Appendix B Algebraic Structures 405
  • 413. THEOREM B.3: Let I be a (two-sided) ideal in a ring R. Then the cosets a þ I j a 2 R f g form a ring under coset addition and coset multiplication. This ring is denoted by R=I and is called the quotient ring. Now let R be a commutative ring with a unit element. For any a 2 R, the set a ð Þ ¼ ra j r 2 R f g is an ideal; it is called the principal ideal generated by a. If every ideal in R is a principal ideal, then R is called a principal ideal ring. DEFINITION: A commutative ring R with a unit element is called an integral domain if R has no zero divisors—that is, if ab ¼ 0 implies a ¼ 0 or b ¼ 0. DEFINITION: A commutative ring R with a unit element is called a field if every nonzero a 2 R has a multiplicative inverse; that is, there exists an element a1 2 R such that aa1 ¼ a1 a ¼ 1: A field is necessarily an integral domain; for if ab ¼ 0 and a 6¼ 0; then b ¼ 1 b ¼ a1 ab ¼ a1 0 ¼ 0 We remark that a field may also be viewed as a commutative ring in which the nonzero elements form a group under multiplication. EXAMPLE B.5 The set Z of integers with the usual operations of addition and multiplication is the classical example of an integral domain with a unit element. Every ideal I in Z is a principal ideal; that is, I ¼ n ð Þ for some integer n. The quotient ring Zn ¼ Z= n ð Þ is called the ring of integers module n. If n is prime, then Zn is a field. On the other hand, if n is not prime then Zn has zero divisors. For example, in the ring Z6; 2 3 ¼ 0 and 2 6¼ 0 and 3 6¼ 0: EXAMPLE B.6 The rational numbers Q and the real numbers R each form a field with respect to the usual operations of addition and multiplication. EXAMPLE B.7 Let C denote the set of ordered pairs of real numbers with addition and multiplication defined by a; b ð Þ þ c; d ð Þ ¼ a þ c; b þ d ð Þ a; b ð Þ c; d ð Þ ¼ ac bd; ad þ bc ð Þ Then C satisfies all the required properties of a field. In fact, C is just the field of complex numbers (see page 4). EXAMPLE B.8 The set M of all 2 6 2 matrices with real entries forms a noncommutative ring with zero divisors under the operations of matrix addition and matrix multiplication. EXAMPLE B.9 Let R be any ring. Then the set R x ½ of all polynomials over R forms a ring with respect to the usual operations of addition and multiplication of polynomials. Moreover, if R is an integral domain then R x ½ is also an integral domain. Now let D be an integral domain. We say that b divides a in D if a ¼ bc for some c 2 D. An element u 2 D is called a unit if u divides 1—that is, if u has a multiplicative inverse. An element b 2 D is called an associate of a 2 D if b ¼ ua for some unit u 2 D. A nonunit p 2 D is said to be irreducible if p ¼ ab implies a or b is a unit. An integral domain D is called a unique factorization domain if every nonunit a 2 D can be written uniquely (up to associates and order) as a product of irreducible elements. EXAMPLE B.10 The ring Z of integers is the classical example of a unique factorization domain. The units of Z are 1 and 1. The only associates of n 2 Z are n and n. The irreducible elements of Z are the prime numbers. EXAMPLE B.11 The set D ¼ a þ b ffiffiffiffiffi 13 p j a; b integers is an integral domain. The units of D are 1; 18 5 ffiffiffiffiffi 13 p and 18 5 ffiffiffiffiffi 13 p . The elements 2; 3 ffiffiffiffiffi 13 p and 3 ffiffiffiffiffi 13 p are irreducible in D. Observe that 4 ¼ 2 2 ¼ 3 ffiffiffiffiffi 13 p 3 ffiffiffiffiffi 13 p : Thus, D is not a unique factorization domain. (See Problem B.40.) 406 Appendix B Algebraic Structures
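Example B.5 asserts that Z_n is a field exactly when n is prime and otherwise has zero divisors. The brute-force check below is illustrative only (not part of the text): it lists the zero divisors of Z_6 and verifies that every nonzero element of Z_5 has a multiplicative inverse.

def zero_divisors(n):
    """Nonzero a in Z_n for which some nonzero b gives a*b = 0 (mod n)."""
    return [a for a in range(1, n)
            if any(a * b % n == 0 for b in range(1, n))]

def units(n):
    """Elements of Z_n that have a multiplicative inverse mod n."""
    return [a for a in range(1, n)
            if any(a * b % n == 1 for b in range(1, n))]

print(zero_divisors(6))   # [2, 3, 4]      e.g., 2 * 3 = 0 in Z_6
print(zero_divisors(5))   # []             no zero divisors
print(units(5))           # [1, 2, 3, 4]   every nonzero element is a unit, so Z_5 is a field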
  • 414. B.4 Modules Let M be an additive abelian group and let R be a ring with a unit element. Then M is said to be a (left) R- module if there exists a mapping R M ! M that satisfies the following axioms: M1 ½ r m1 þ m2 ð Þ ¼ rm1 þ rm2 M2 ½ r þ s ð Þm ¼ rm þ sm M3 ½ rs ð Þm ¼ r sm ð Þ M4 ½ 1 m ¼ m for any r; s 2 R and any mi 2 M. We emphasize that an R-module is a generalization of a vector space where we allow the scalars to come from a ring rather than a field. EXAMPLE B.12 Let G be any additive abelian group. We make G into a module over the ring Z of integers by defining ng ¼ g þ g þ þ g; zfflfflfflfflfflfflfflfflfflfflfflffl}|fflfflfflfflfflfflfflfflfflfflfflffl{ n times 0g ¼ 0; n ð Þg ¼ ng where n is any positive integer. EXAMPLE B.13 Let R be a ring and let I be an ideal in R. Then I may be viewed as a module over R. EXAMPLE B.14 Let V be a vector space over a field K and let T :V ! V be a linear mapping. We make V into a module over the ring K x ½ of polynomials over K by defining f x ð Þv ¼ f T ð Þ v ð Þ: The reader should check that a scalar multiplication has been defined. Let M be a module over R. An additive subgroup N of M is called a submodule of M if u 2 N and k 2 R imply ku 2 N: (Note that N is then a module over R.) Let M and M0 be R-modules. A mapping T :M ! M0 is called a homomorphism (or: R-homomorphism or R-linear) if (i) T u þ v ð Þ ¼ T u ð Þ þ T v ð Þ and (ii) T ku ð Þ ¼ kT u ð Þ for every u; v 2 M and every k 2 R. PROBLEMS Groups B.1. Determine whether each of the following systems forms a group G: (i) G ¼ set of integers; operation subtraction; (ii) G ¼ f1; 1g, operation multiplication; (iii) G ¼ set of nonzero rational numbers, operation division; (iv) G ¼ set of nonsingular n n matrices, operation matrix multiplication; (v) G ¼ fa þ bi : a; b 2 Zg, operation addition. B.2. Show that in a group G: (i) the identity element of G is unique; (ii) each a 2 G has a unique inverse a1 2 G; (iii) a1 ð Þ 1 ¼ a; and ab ð Þ1 ¼ b1 a1 ; (iv) ab ¼ ac implies b ¼ c, and ba ¼ ca implies b ¼ c. Appendix B Algebraic Structures 407
  • 415. B.3. In a group G, the powers of a 2 G are defined by a0 ¼ e; an ¼ aan1 ; an ¼ an ð Þ1 ; where n 2 N Show that the following formulas hold for any integers r; s; t 2 Z: (i) ar as ¼ arþs ; (ii) ar ð Þs ¼ ars ; (iii) arþs ð Þ t ¼ arsþst . B.4. Show that if G is an abelian group, then ab ð Þn ¼ an bn for any a; b 2 G and any integer n 2 Z: B.5. Suppose G is a group such that ab ð Þ2 ¼ a2 b2 for every a; b 2 G. Show that G is abelian. B.6. Suppose H is a subset of a group G. Show that H is a subgroup of G if and only if (i) H is nonempty, and (ii) a; b 2 H implies ab1 2 H: B.7. Prove that the intersection of any number of subgroups of G is also a subgroup of G. B.8. Show that the set of all powers of a 2 G is a subgroup of G; it is called the cyclic group generated by a. B.9. A group G is said to be cyclic if G is generated by some a 2 G; that is, G ¼ an : n 2 Z ð Þ. Show that every subgroup of a cyclic group is cyclic. B.10. Suppose G is a cyclic subgroup. Show that G is isomorphic to the set Z of integers under addition or to the set Zn (of the integers module n) under addition. B.11. Let H be a subgroup of G. Show that the right (left) cosets of H partition G into mutually disjoint subsets. B.12. The order of a group G, denoted by jGj; is the number of elements of G. Prove Lagrange’s theorem: If H is a subgroup of a finite group G, then jHj divides jGj. B.13. Suppose jGj ¼ p where p is prime. Show that G is cyclic. B.14. Suppose H and N are subgroups of G with N normal. Show that (i) HN is a subgroup of G and (ii) H N is a normal subgroup of G. B.15. Let H be a subgroup of G with only two right (left) cosets. Show that H is a normal subgroup of G. B.16. Prove Theorem B.1: Let H be a normal subgroup of G. Then the cosets of H in G form a group G=H under coset multiplication. B.17. Suppose G is an abelian group. Show that any factor group G=H is also abelian. B.18. Let f : G ! G0 be a group homomorphism. Show that (i) f e ð Þ ¼ e0 where e and e0 are the identity elements of G and G0 , respectively; (ii) f a1 ð Þ ¼ f a ð Þ1 for any a 2 G. B.19. Prove Theorem B.2: Let f : G ! G0 be a group homomorphism with kernel K. Then K is a normal subgroup of G, and the quotient group G=K is isomorphic to the image of f. B.20. Let G be the multiplicative group of complex numbers z such that jzj ¼ 1; and let R be the additive group of real numbers. Prove that G is isomorphic to R=Z: 408 Appendix B Algebraic Structures
  • 416. B.21. For a fixed g 2 G, let ^ g : G ! G be defined by ^ g a ð Þ ¼ g1 ag: Show that G is an isomorphism of G onto G. B.22. Let G be the multiplicative group of n n nonsingular matrices over R. Show that the mapping A 7! jAj is a homomorphism of G into the multiplicative group of nonzero real numbers. B.23. Let G be an abelian group. For a fixed n 2 Z; show that the map a 7! an is a homomorphism of G into G. B.24. Suppose H and N are subgroups of G with N normal. Prove that H N is normal in H and H= H N ð Þ is isomorphic to HN=N. Rings B.25. Show that in a ring R: (i) a 0 ¼ 0 a ¼ 0; (ii) a b ð Þ ¼ a ð Þb ¼ ab, (iii) a ð Þ b ð Þ ¼ ab: B.26. Show that in a ring R with a unit element: (i) 1 ð Þa ¼ a; (ii) 1 ð Þ 1 ð Þ ¼ 1. B.27. Let R be a ring. Suppose a2 ¼ a for every a 2 R: Prove that R is a commutative ring. (Such a ring is called a Boolean ring.) B.28. Let R be a ring with a unit element. We make R into another ring ^ R by defining a b ¼ a þ b þ 1 and a b ¼ ab þ a þ b. (i) Verify that ^ R is a ring. (ii) Determine the 0-element and 1-element of ^ R. B.29. Let G be any (additive) abelian group. Define a multiplication in G by a b ¼ 0. Show that this makes G into a ring. B.30. Prove Theorem B.3: Let I be a (two-sided) ideal in a ring R. Then the cosets a þ I j a 2 R ð Þ form a ring under coset addition and coset multiplication. B.31. Let I1 and I2 be ideals in R. Prove that I1 þ I2 and I1 I2 are also ideals in R. B.32. Let R and R0 be rings. A mapping f : R ! R0 is called a homomorphism (or: ring homomorphism) if (i) f a þ b ð Þ ¼ f a ð Þ þ f b ð Þ and (ii) f ab ð Þ ¼ f a ð Þ f b ð Þ, for every a; b 2 R. Prove that if f : R ! R0 is a homomorphism, then the set K ¼ r 2 R j f r ð Þ ¼ 0 f g is an ideal in R. (The set K is called the kernel of f.) Integral Domains and Fields B.33. Prove that in an integral domain D, if ab ¼ ac; a 6¼ 0; then b ¼ c. B.34. Prove that F ¼ a þ b ffiffiffi 2 p j a; b rational is a field. B.35. Prove that D ¼ a þ b ffiffiffi 2 p j a; b integers is an integral domain but not a field. B.36. Prove that a finite integral domain D is a field. B.37. Show that the only ideals in a field K are 0 f g and K. B.38. A complex number a þ bi where a, b are integers is called a Gaussian integer. Show that the set G of Gaussian integers is an integral domain. Also show that the units in G are 1 and i. Appendix B Algebraic Structures 409
  • 417. B.39. Let D be an integral domain and let I be an ideal in D. Prove that the factor ring D=I is an integral domain if and only if I is a prime ideal. (An ideal I is prime if ab 2 I implies a 2 I or b 2 I:) B.40. Consider the integral domain D ¼ a þ b ffiffiffiffiffi 13 p j a; b integers (see Example B.11). If a ¼ a þ b ffiffiffiffiffi 13 p , we define N a ð Þ ¼ a2 13b2 . Prove: (i) N ab ð Þ ¼ N a ð ÞN b ð Þ; (ii) a is a unit if and only if N a ð Þ ¼ 1; (iii) the units of D are 1; 18 5 ffiffiffiffiffi 13 p and 18 5 ffiffiffiffiffi 13 p ; (iv) the numbers 2; 3 ffiffiffiffiffi 13 p and 3 ffiffiffiffiffi 13 p are irreducible. Modules B.41. Let M be an R-module and let A and B be submodules of M. Show that A þ B and A B are also submodules of M. B.42. Let M be an R-module with submodule N. Show that the cosets u þ N : u 2 M f g form an R-module under coset addition and scalar multiplication defined by r u þ N ð Þ ¼ ru þ N. (This module is denoted by M=N and is called the quotient module.) B.43. Let M and M0 be R-modules and let f : M ! M0 be an R-homomorphism. Show that the set K ¼ u 2 M : f u ð Þ ¼ 0 f g is a submodule of f. (The set K is called the kernel of f.) B.44. Let M be an R-module and let E M ð Þ denote the set of all R-homomorphism of M into itself. Define the appropriate operations of addition and multiplication in E M ð Þ so that E M ð Þ becomes a ring. 410 Appendix B Algebraic Structures
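Before moving on to polynomials, here is a small numerical illustration (not from the text; the matrix and the polynomial are hypothetical) of the K[x]-module structure of Example B.14, in which a polynomial f(x) acts on a vector v as f(T)(v).

import numpy as np

T = np.array([[ 0.0, 1.0],
              [-1.0, 0.0]])       # a linear operator on V = R^2
v = np.array([1.0, 2.0])

# f(x) = x^2 + 2x + 3 acts on v as f(T)(v): the module action of Example B.14.
f_of_T = T @ T + 2 * T + 3 * np.eye(2)
print(f_of_T @ v)                 # [6. 2.]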
  • 418. Polynomials over a Field C.1 Introduction We will investigate polynomials over a field K and show that they have many properties that are analogous to properties of the integers. These results play an important role in obtaining canonical forms for a linear operator T on a vector space V over K. C.2 Ring of Polynomials Let K be a field. Formally, a polynomial of f over K is an infinite sequence of elements from K in which all except a finite number of them are 0: f ¼ . . . ; 0; an; . . . ; a1; a0 ð Þ (We write the sequence so that it extends to the left instead of to the right.) The entry ak is called the kth coefficient of f. If n is the largest integer for which an 6¼ 0, then we say that the degree of f is n, written deg f ¼ n We also call an the leading coefficient of f, and if an ¼ 1 we call f a monic polynomial. On the other hand, if every coefficient of f is 0 then f is called the zero polynomial, written f ¼ 0. The degree of the zero polynomial is not defined. Now if g is another polynomial over K, say g ¼ . . . ; 0; bm; . . . ; b1; b0 ð Þ then the sum f þ g is the polynomial obtained by adding corresponding coefficients. That is, if m n, then f þ g ¼ . . . ; 0; an; . . . ; am þ bm; . . . ; a1 þ b1; a0 þ b0 ð Þ Furthermore, the product fg is the polynomial fg ¼ . . . ; 0; anbm; . . . ; a1b0 þ a0b1; a0b0 ð Þ that is, the kth coefficient ck of fg is ck ¼ X k t¼0 a1bk1 ¼ a0bk þ a1bk1 þ þ akb0 The following theorem applies. THEOREM C.1: The set P of polynomials over a field K under the above operations of addition and multiplication forms a commutative ring with a unit element and with no zero divisors—an integral domain. If f and g are nonzero polynomials in P, then deg fg ð Þ ¼ deg f ð Þ deg g ð Þ. APPENDIX C 411
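The coefficient formula for the product fg is a discrete convolution, and Theorem C.1's degree formula deg(fg) = deg f + deg g can be read off the result. The sketch below is illustrative only (not part of the text, assuming NumPy); coefficients are listed from the constant term upward.

import numpy as np

# f = 1 + 2t + 3t^2 and g = 4 - t + t^3 (hypothetical polynomials over Q).
f = np.array([1, 2, 3])
g = np.array([4, -1, 0, 1])

# The k-th coefficient of fg is sum_i a_i * b_(k-i), i.e., a convolution.
fg = np.convolve(f, g)
print(fg)                                           # [ 4  7 10 -2  2  3]
print(len(fg) - 1, (len(f) - 1) + (len(g) - 1))     # deg(fg) = 5 = deg f + deg g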
Notation

We identify the scalar a_0 ∈ K with the polynomial

    a_0 = (..., 0, a_0)

We also choose a symbol, say t, to denote the polynomial

    t = (..., 0, 1, 0)

We call the symbol t an indeterminate. Multiplying t by itself, we obtain

    t^2 = (..., 0, 1, 0, 0),   t^3 = (..., 0, 1, 0, 0, 0),   ...

Thus, the above polynomial f can be written uniquely in the usual form

    f = a_n t^n + ... + a_1 t + a_0

When the symbol t is selected as the indeterminate, the ring of polynomials over K is denoted by K[t], and a polynomial f is frequently denoted by f(t). We also view the field K as a subset of K[t] under the above identification. This is possible because the operations of addition and multiplication of elements of K are preserved under this identification:

    (..., 0, a_0) + (..., 0, b_0) = (..., 0, a_0 + b_0)
    (..., 0, a_0) · (..., 0, b_0) = (..., 0, a_0 b_0)

We remark that the nonzero elements of K are the units of the ring K[t]. We also remark that every nonzero polynomial is an associate of a unique monic polynomial. Hence, if d and d′ are monic polynomials for which d divides d′ and d′ divides d, then d = d′. (A polynomial g divides a polynomial f if there is a polynomial h such that f = hg.)

C.3 Divisibility

The following theorem formalizes the process known as "long division."

THEOREM C.2 (Division Algorithm): Let f and g be polynomials over a field K with g ≠ 0. Then there exist polynomials q and r such that

    f = qg + r

where either r = 0 or deg r < deg g.

Proof: If f = 0 or if deg f < deg g, then we have the required representation

    f = 0·g + f

Now suppose deg f ≥ deg g, say

    f = a_n t^n + ... + a_1 t + a_0   and   g = b_m t^m + ... + b_1 t + b_0

where a_n, b_m ≠ 0 and n ≥ m. We form the polynomial

    f_1 = f - (a_n / b_m) t^(n-m) g     (1)

Then deg f_1 < deg f. By induction, there exist polynomials q_1 and r such that

    f_1 = q_1 g + r
where either r = 0 or deg r < deg g. Substituting this into (1) and solving for f,

    f = (q_1 + (a_n / b_m) t^(n-m)) g + r

which is the desired representation.

THEOREM C.3: The ring K[t] of polynomials over a field K is a principal ideal ring. If I is an ideal in K[t], then there exists a unique monic polynomial d that generates I, such that d divides every polynomial f ∈ I.

Proof: Let d be a polynomial of lowest degree in I. Because we can multiply d by a nonzero scalar and still remain in I, we can assume without loss of generality that d is a monic polynomial. Now suppose f ∈ I. By Theorem C.2 there exist polynomials q and r such that

    f = qd + r,   where either r = 0 or deg r < deg d

Now f, d ∈ I implies qd ∈ I, and hence r = f - qd ∈ I. But d is a polynomial of lowest degree in I. Accordingly, r = 0 and f = qd; that is, d divides f. It remains to show that d is unique. If d′ is another monic polynomial that generates I, then d divides d′ and d′ divides d. This implies that d = d′, because d and d′ are monic. Thus, the theorem is proved.

THEOREM C.4: Let f and g be nonzero polynomials in K[t]. Then there exists a unique monic polynomial d such that (i) d divides f and g; and (ii) if d′ divides f and g, then d′ divides d.

DEFINITION: The above polynomial d is called the greatest common divisor of f and g. If d = 1, then f and g are said to be relatively prime.

Proof of Theorem C.4: The set I = {mf + ng | m, n ∈ K[t]} is an ideal. Let d be the monic polynomial that generates I. Note f, g ∈ I; hence, d divides f and g. Now suppose d′ divides f and g. Let J be the ideal generated by d′. Then f, g ∈ J, and hence I ⊆ J. Accordingly, d ∈ J, and so d′ divides d, as claimed. It remains to show that d is unique. If d_1 is another (monic) greatest common divisor of f and g, then d divides d_1 and d_1 divides d. This implies that d = d_1 because d and d_1 are monic. Thus, the theorem is proved.

COROLLARY C.5: Let d be the greatest common divisor of the polynomials f and g. Then there exist polynomials m and n such that d = mf + ng. In particular, if f and g are relatively prime, then there exist polynomials m and n such that mf + ng = 1.

The corollary follows directly from the fact that d generates the ideal I = {mf + ng | m, n ∈ K[t]}.
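The division algorithm of Theorem C.2 and the identity of Corollary C.5 are both effectively computable. The following Python sketch (added for illustration; it stores a polynomial as a list of Fractions with the constant term first, an assumption of this example rather than the text's notation) carries out the long division f = qg + r and then runs the extended Euclidean algorithm to return the monic gcd d together with m, n satisfying d = mf + ng.

```python
from fractions import Fraction

# A polynomial over Q is a list of Fractions [a0, a1, ..., an], constant term first.

def trim(p):
    while p and p[-1] == 0:
        p.pop()
    return p

def add(f, g):
    n = max(len(f), len(g))
    return trim([(f[i] if i < len(f) else 0) + (g[i] if i < len(g) else 0)
                 for i in range(n)])

def scale(f, c, shift=0):
    # c * t^shift * f
    return [Fraction(0)] * shift + [c * a for a in f]

def mul(f, g):
    c = [Fraction(0)] * (len(f) + len(g) - 1) if f and g else []
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            c[i + j] += a * b
    return trim(c)

def divmod_poly(f, g):
    """Division algorithm (Theorem C.2): return (q, r) with f = qg + r,
    where r = [] (the zero polynomial) or deg r < deg g."""
    f, q = list(f), [Fraction(0)] * max(len(f) - len(g) + 1, 1)
    while f and len(f) >= len(g):
        c = f[-1] / g[-1]              # a_n / b_m
        k = len(f) - len(g)            # exponent n - m
        q[k] = c
        f = add(f, scale(g, -c, k))    # f <- f - (a_n / b_m) t^(n-m) g
    return trim(q), f

def gcd_bezout(f, g):
    """Extended Euclidean algorithm: return (d, m, n) with d monic and
    d = mf + ng (Corollary C.5)."""
    r0, r1 = list(f), list(g)
    m0, m1 = [Fraction(1)], []
    n0, n1 = [], [Fraction(1)]
    while r1:
        q, r = divmod_poly(r0, r1)
        r0, r1 = r1, r
        m0, m1 = m1, add(m0, scale(mul(q, m1), Fraction(-1)))
        n0, n1 = n1, add(n0, scale(mul(q, n1), Fraction(-1)))
    lead = r0[-1]
    monic = lambda p: [a / lead for a in p]
    return monic(r0), monic(m0), monic(n0)

# f = t^2 - 1 = (t - 1)(t + 1), g = t^2 + 2t + 1 = (t + 1)^2, so gcd = t + 1.
f = [Fraction(-1), Fraction(0), Fraction(1)]
g = [Fraction(1), Fraction(2), Fraction(1)]
d, m, n = gcd_bezout(f, g)
print(d)                               # [Fraction(1, 1), Fraction(1, 1)], i.e. t + 1
print(add(mul(m, f), mul(n, g)) == d)  # True: d = mf + ng
```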
C.4 Factorization

A polynomial p ∈ K[t] of positive degree is said to be irreducible if p = fg implies f or g is a scalar.

LEMMA C.6: Suppose p ∈ K[t] is irreducible. If p divides the product fg of polynomials f, g ∈ K[t], then p divides f or p divides g. More generally, if p divides the product of n polynomials f_1 f_2 ... f_n, then p divides one of them.

Proof: Suppose p divides fg but not f. Because p is irreducible, the polynomials f and p must then be relatively prime. Thus, there exist polynomials m, n ∈ K[t] such that mf + np = 1. Multiplying this equation by g, we obtain mfg + npg = g. But p divides fg and so mfg, and p divides npg; hence, p divides the sum g = mfg + npg.

Now suppose p divides f_1 f_2 ... f_n. If p divides f_1, then we are through. If not, then by the above result p divides the product f_2 ... f_n. By induction on n, p divides one of the polynomials f_2, ..., f_n. Thus, the lemma is proved.

THEOREM C.7 (Unique Factorization Theorem): Let f be a nonzero polynomial in K[t]. Then f can be written uniquely (except for order) as a product

    f = k p_1 p_2 ... p_n

where k ∈ K and the p_i are monic irreducible polynomials in K[t].

Proof: We prove the existence of such a product first. If f is irreducible or if f ∈ K, then such a product clearly exists. On the other hand, suppose f = gh where g and h are nonscalars. Then g and h have degrees less than that of f. By induction, we can assume

    g = k_1 g_1 g_2 ... g_r   and   h = k_2 h_1 h_2 ... h_s

where k_1, k_2 ∈ K and the g_i and h_j are monic irreducible polynomials. Accordingly,

    f = (k_1 k_2) g_1 g_2 ... g_r h_1 h_2 ... h_s

is our desired representation.

We next prove uniqueness (except for order) of such a product for f. Suppose

    f = k p_1 p_2 ... p_n = k′ q_1 q_2 ... q_m

where k, k′ ∈ K and the p_1, ..., p_n, q_1, ..., q_m are monic irreducible polynomials. Now p_1 divides k′ q_1 ... q_m. Because p_1 is irreducible, it must divide one of the q_i by the above lemma. Say p_1 divides q_1. Because p_1 and q_1 are both irreducible and monic, p_1 = q_1. Accordingly,

    k p_2 ... p_n = k′ q_2 ... q_m

By induction, we have that n = m and p_2 = q_2, ..., p_n = q_m for some rearrangement of the q_i. We also have that k = k′. Thus, the theorem is proved.

If the field K is the complex field C, then we have the following result, known as the fundamental theorem of algebra; its proof lies beyond the scope of this text.

THEOREM C.8 (Fundamental Theorem of Algebra): Let f(t) be a nonzero polynomial over the complex field C. Then f(t) can be written uniquely (except for order) as a product

    f(t) = k (t - r_1)(t - r_2) ... (t - r_n)

where k, r_i ∈ C; that is, as a product of linear polynomials.

In the case of the real field R we have the following result.

THEOREM C.9: Let f(t) be a nonzero polynomial over the real field R. Then f(t) can be written uniquely (except for order) as a product

    f(t) = k p_1(t) p_2(t) ... p_m(t)

where k ∈ R and the p_i(t) are monic irreducible polynomials of degree one or two.
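As a quick numerical illustration of Theorems C.8 and C.9 (a sketch using NumPy's root finder, added here and not part of the text), the complex roots of a real polynomial give its linear factors over C, while conjugate pairs of roots combine into the real irreducible quadratic factors.

```python
import numpy as np

# f(t) = t^3 - t^2 + t - 1 = (t - 1)(t^2 + 1); NumPy lists the
# highest-degree coefficient first.
coeffs = [1, -1, 1, -1]
roots = np.roots(coeffs)
print(roots)   # approximately 1, i, -i (order may vary)

# Over C: f(t) = (t - 1)(t - i)(t + i), a product of linear factors (Theorem C.8).
# Over R: the conjugate pair i, -i combines into the irreducible quadratic
# t^2 + 1, giving f(t) = (t - 1)(t^2 + 1) as in Theorem C.9.
```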
APPENDIX D

Odds and Ends

D.1 Introduction

This appendix discusses various topics, such as equivalence relations, determinants and block matrices, and the generalized MP (Moore–Penrose) inverse.

D.2 Relations and Equivalence Relations

A binary relation, or simply relation, R from a set A to a set B assigns to each ordered pair (a, b) ∈ A × B exactly one of the following statements: (i) "a is related to b," written a R b; (ii) "a is not related to b," written a R̸ b. A relation from a set A to the same set A is called a relation on A.

Observe that any relation R from A to B uniquely defines a subset R̂ of A × B as follows:

    R̂ = {(a, b) | a R b}

Conversely, any subset R̂ of A × B defines a relation from A to B as follows:

    a R b if and only if (a, b) ∈ R̂

In view of the above correspondence between relations from A to B and subsets of A × B, we redefine a relation from A to B as follows:

DEFINITION D.1: A relation R from A to B is a subset of A × B.

Equivalence Relations

Consider a nonempty set S. A relation R on S is called an equivalence relation if R is reflexive, symmetric, and transitive; that is, if R satisfies the following three axioms:

[E1] (Reflexivity) Every a ∈ S is related to itself. That is, for every a ∈ S, a R a.
[E2] (Symmetry) If a is related to b, then b is related to a. That is, if a R b, then b R a.
[E3] (Transitivity) If a is related to b and b is related to c, then a is related to c. That is, if a R b and b R c, then a R c.

The general idea behind an equivalence relation is that it is a classification of objects that are in some way "alike." Clearly, the relation of equality is an equivalence relation. For this reason, one frequently uses ~ or ≡ to denote an equivalence relation.

EXAMPLE D.1

(a) In Euclidean geometry, similarity of triangles is an equivalence relation. Specifically, suppose α, β, γ are triangles. Then: (i) α is similar to itself. (ii) If α is similar to β, then β is similar to α. (iii) If α is similar to β and β is similar to γ, then α is similar to γ.
(b) The relation ⊆ of set inclusion is not an equivalence relation. It is reflexive and transitive, but it is not symmetric, because A ⊆ B does not imply B ⊆ A.

Equivalence Relations and Partitions

Let S be a nonempty set. Recall first that a partition P of S is a subdivision of S into nonempty, nonoverlapping subsets; that is, a collection P = {A_j} of nonempty subsets of S such that: (i) each a ∈ S belongs to one of the A_j; (ii) the sets A_j are mutually disjoint.

The subsets in a partition P are called cells. Thus, each a ∈ S belongs to exactly one of the cells. Also, any element b ∈ A_j is called a representative of the cell A_j, and a subset B of S is called a system of representatives if B contains exactly one element from each of the cells in {A_j}.

Now suppose R is an equivalence relation on the nonempty set S. For each a ∈ S, the equivalence class of a, denoted by [a], is the set of elements of S to which a is related:

    [a] = {x | a R x}

The collection of equivalence classes, denoted by S/R, is called the quotient of S by R:

    S/R = {[a] | a ∈ S}

The fundamental property of an equivalence relation and its quotient set is contained in the following theorem:

THEOREM D.1: Let R be an equivalence relation on a nonempty set S. Then the quotient set S/R is a partition of S.

EXAMPLE D.2 Let ≡ be the relation on the set Z of integers defined by

    x ≡ y (mod 5)

which reads "x is congruent to y modulo 5" and which means that the difference x - y is divisible by 5. Then ≡ is an equivalence relation on Z. There are exactly five equivalence classes in the quotient set Z/≡, as follows:

    A_0 = {..., -10, -5, 0, 5, 10, ...}
    A_1 = {..., -9, -4, 1, 6, 11, ...}
    A_2 = {..., -8, -3, 2, 7, 12, ...}
    A_3 = {..., -7, -2, 3, 8, 13, ...}
    A_4 = {..., -6, -1, 4, 9, 14, ...}

Note that any integer x, which can be expressed uniquely in the form x = 5q + r where 0 ≤ r < 5, is a member of the equivalence class A_r, where r is the remainder. As expected, the equivalence classes are disjoint and their union is Z:

    Z = A_0 ∪ A_1 ∪ A_2 ∪ A_3 ∪ A_4

This quotient set Z/≡, called the integers modulo 5, is denoted by Z/5Z or simply Z_5. Usually one chooses {0, 1, 2, 3, 4} or {-2, -1, 0, 1, 2} as a system of representatives of the equivalence classes.

Analogously, for any positive integer m, there is the congruence relation defined by x ≡ y (mod m), and the quotient set Z/≡ is called the integers modulo m.
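A few lines of Python (an illustration added here, not from the text) reproduce the five classes of Example D.2 and check that they partition the chosen set of integers.

```python
# Group the integers -10..14 into equivalence classes modulo 5.
S = range(-10, 15)
classes = {r: [x for x in S if (x - r) % 5 == 0] for r in range(5)}

for r, cell in classes.items():
    print(f"A{r} =", cell)
# A0 = [-10, -5, 0, 5, 10], A1 = [-9, -4, 1, 6, 11], and so on.

# The cells are disjoint and their union is S; that is, they form a partition.
assert sorted(x for cell in classes.values() for x in cell) == list(S)
```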
D.3 Determinants and Block Matrices

Recall first:

THEOREM 8.12: Suppose M is an upper (lower) triangular block matrix with diagonal blocks A_1, A_2, ..., A_n. Then

    det(M) = det(A_1) det(A_2) ... det(A_n)

Accordingly, if M = [A B; 0 D], where A is r × r and D is s × s, then det(M) = det(A) det(D).

THEOREM D.2: Consider the block matrix M = [A B; C D], where A is a nonsingular r × r block and D is s × s. Then

    det(M) = det(A) det(D - CA⁻¹B)

Proof: This follows from the factorization

    M = [I 0; CA⁻¹ I] [A B; 0 D - CA⁻¹B]

and the above result.

D.4 Full Rank Factorization

A matrix has full row rank r if it has exactly r rows and they are linearly independent, and a matrix has full column rank r if it has exactly r columns and they are linearly independent.

DEFINITION D.2: Let A be an m × n matrix of rank r. Then A is said to have the full rank factorization A = BC where B has full column rank r and C has full row rank r.

THEOREM D.3: Every matrix A with rank r > 0 has a full rank factorization.

There are many full rank factorizations of a matrix A. Fig. D-1 gives an algorithm for finding one such factorization.

EXAMPLE D.3 Let

    A = [1  1  1  2]
        [2  2  1  3]
        [1  1  2  3]

The row canonical form of A is

    M = [1  1  0  1]
        [0  0  1  1]
        [0  0  0  0]

We set

    B = [1  1]
        [2  1]
        [1  2]

and

    C = [1  1  0  1]
        [0  0  1  1]

Then A = BC is a full rank factorization of A.

Algorithm D-1: The input is a matrix A of rank r > 0. The output is a full rank factorization of A.

Step 1. Find the row canonical form M of A.
Step 2. Let B be the matrix whose columns are the columns of A corresponding to the columns of M with pivots.
Step 3. Let C be the matrix whose rows are the nonzero rows of M.

Then A = BC is a full rank factorization of A.

Figure D-1
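Algorithm D-1 is easy to carry out with a computer algebra system. The following SymPy sketch (added here for illustration; SymPy's rref method is used as a stand-in for Step 1) reproduces the factorization of Example D.3.

```python
from sympy import Matrix

A = Matrix([[1, 1, 1, 2],
            [2, 2, 1, 3],
            [1, 1, 2, 3]])

# Step 1: row canonical (reduced row echelon) form and the pivot columns.
M, pivots = A.rref()                             # pivots == (0, 2)

# Step 2: B = the pivot columns of A.  Step 3: C = the nonzero rows of M.
B = A.extract(list(range(A.rows)), list(pivots))
C = M[:len(pivots), :]

print(B)           # Matrix([[1, 1], [2, 1], [1, 2]])
print(C)           # Matrix([[1, 1, 0, 1], [0, 0, 1, 1]])
print(B * C == A)  # True, so A = BC is a full rank factorization
```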
D.5 Generalized (Moore–Penrose) Inverse

Here we assume that the field of scalars is the complex field C, and A^H denotes the conjugate transpose of a matrix A. [If A is a real matrix, then A^H = A^T.]

DEFINITION D.3: Let A be an m × n matrix over C. A matrix X, denoted by A⁺, is called the pseudoinverse or Moore–Penrose inverse or MP-inverse of A if X satisfies the following four equations:

    [MP1] AXA = A;        [MP2] XAX = X;
    [MP3] (AX)^H = AX;    [MP4] (XA)^H = XA.

Clearly, A⁺ is an n × m matrix. Also, A⁺ = A⁻¹ if A is nonsingular.

LEMMA D.4: A⁺ is unique (when it exists).

Proof: Suppose X and Y satisfy the four MP equations. Then

    AY = (AY)^H = (AXAY)^H = (AY)^H (AX)^H = AYAX = (AYA)X = AX

The first and fourth equations use [MP3], and the second and last equations use [MP1]. Similarly, YA = XA (which uses [MP4] and [MP1]). Then

    Y = YAY = (YA)Y = (XA)Y = X(AY) = X(AX) = X

where the first equation uses [MP2].

LEMMA D.5: A⁺ exists for any matrix A.

Fig. D-2 gives an algorithm that finds an MP-inverse for any matrix A. Combining the above two lemmas, we obtain:

THEOREM D.6: Every matrix A over C has a unique Moore–Penrose inverse A⁺.

There are special cases when A has full row rank or full column rank.

THEOREM D.7: Let A be a matrix over C.
(a) If A has full column rank (its columns are linearly independent), then A⁺ = (A^H A)⁻¹ A^H.
(b) If A has full row rank (its rows are linearly independent), then A⁺ = A^H (A A^H)⁻¹.

THEOREM D.8: Let A be a matrix over C, and suppose A = BC is a full rank factorization of A. Then

    A⁺ = C⁺B⁺ = C^H (C C^H)⁻¹ (B^H B)⁻¹ B^H

Moreover, AA⁺ = BB⁺ and A⁺A = C⁺C.

Algorithm D-2: The input is an m × n matrix A over C of rank r. The output is A⁺.

Step 1. Interchange rows and columns of A so that

    PAQ = [A_11 A_12; A_21 A_22]

where A_11 is a nonsingular r × r block. [Here P and Q are the products of the elementary matrices corresponding to the interchanges of the rows and columns.]

Step 2. Set B = [A_11; A_21] (the two blocks stacked) and C = [I_r  A_11⁻¹A_12], where I_r is the r × r identity matrix.

Step 3. Set A⁺ = Q [C^H (C C^H)⁻¹ (B^H B)⁻¹ B^H] P.

Figure D-2
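Theorem D.8 can be checked numerically. The NumPy sketch below (added here; it reuses the factorization A = BC of Example D.3) computes A⁺ = C^H(CC^H)⁻¹(B^H B)⁻¹B^H, verifies the four MP equations, and compares the result with NumPy's built-in pseudoinverse.

```python
import numpy as np

A = np.array([[1., 1., 1., 2.],
              [2., 2., 1., 3.],
              [1., 1., 2., 3.]])
B = np.array([[1., 1.],
              [2., 1.],
              [1., 2.]])               # full column rank
C = np.array([[1., 1., 0., 1.],
              [0., 0., 1., 1.]])       # full row rank, and A = B @ C

H = lambda X: X.conj().T               # conjugate transpose (plain transpose here)

B_plus = np.linalg.inv(H(B) @ B) @ H(B)    # Theorem D.7(a)
C_plus = H(C) @ np.linalg.inv(C @ H(C))    # Theorem D.7(b)
A_plus = C_plus @ B_plus                   # Theorem D.8

# The four Moore-Penrose equations [MP1]-[MP4] hold for X = A_plus ...
X = A_plus
print(np.allclose(A @ X @ A, A), np.allclose(X @ A @ X, X))
print(np.allclose(A @ X, H(A @ X)), np.allclose(X @ A, H(X @ A)))

# ... and A_plus agrees with NumPy's built-in pseudoinverse.
print(np.allclose(A_plus, np.linalg.pinv(A)))   # True
```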
EXAMPLE D.4 Consider the full rank factorization A = BC in Example D.3; that is,

    A = [1 1 1 2; 2 2 1 3; 1 1 2 3] = [1 1; 2 1; 1 2] [1 1 0 1; 0 0 1 1] = BC

Then

    (C C^H)⁻¹ = (1/5) [ 2  -1]
                      [-1   3]

    C^H (C C^H)⁻¹ = (1/5) [ 2  -1]
                          [ 2  -1]
                          [-1   3]
                          [ 1   2]

    (B^H B)⁻¹ = (1/11) [ 6  -5]
                       [-5   6]

    (B^H B)⁻¹ B^H = (1/11) [1   7  -4]
                           [1  -4   7]

Accordingly, the following is the Moore–Penrose inverse of A:

    A⁺ = (1/55) [1   18  -15]
                [1   18  -15]
                [2  -19   25]
                [3   -1   10]

D.6 Least-Square Solution

Consider a system AX = B