Contents

1 Introduction
  1.1 Course Goals
  1.2 About the Subject
      1.2.1 Connection to Geometry
      1.2.2 Linear Systems
      1.2.3 Eigen-analysis
  1.3 About These Notes
  1.4 About The Computer Labs

2 Vectors and Geometry
  2.1 Chapter Introduction
  2.2 Vectors
      2.2.1 Multiplication by a number and vector addition
      2.2.2 Co-ordinates
      2.2.3 Properties of vector addition and scalar multiplication
      2.2.4 MATLAB: basic scalar and vector operations
      2.2.5 Problems
  2.3 Geometrical Aspects of Vectors
      2.3.1 Length of a vector
      2.3.2 The dot product
      2.3.3 Projections
      2.3.4 MATLAB: norm and dot commands
      2.3.5 Problems
  2.4 Determinants and the Cross Product
      2.4.1 The determinant in two and three dimensions
      2.4.2 The cross product
      2.4.3 The triple product and the determinant in three dimensions
      2.4.4 MATLAB: assigning matrices and det and cross commands
      2.4.5 MATLAB: generating scripts with the MATLAB editor
      2.4.6 MATLAB: floating point representation of real numbers
      2.4.7 Problems
  2.5 Lines and Planes
      2.5.1 Describing linear sets
      2.5.2 Lines in two dimensions: Parametric form
      2.5.3 Lines in two dimensions: Equation form
      2.5.4 Lines in three dimensions: Parametric form
      2.5.5 Lines in three dimensions: Equation form
      2.5.6 Planes in three dimensions: Parametric form
      2.5.7 Planes in three dimensions: Equation form
      2.5.8 Problems
  2.6 Introduction to Linear Systems
      2.6.1 Description of points and the geometry of solutions to
            systems of equations
      2.6.2 Describing the whole plane in two dimensions and all of
            space in three dimensions
      2.6.3 Linear dependence and independence
      2.6.4 Problems
  2.7 Additional Topics
      2.7.1 Application: rotational motion
      2.7.2 Application: 3-D graphics
  2.8 Solutions to Chapter Problems

3 Solving Linear Systems
  3.1 Linear Systems
      3.1.1 General Form of Linear Systems
      3.1.2 Solving Linear Systems by Substitution
      3.1.3 Elementary row (equation) operations
      3.1.4 Augmented Matrices
      3.1.5 Problems
  3.2 Gaussian Elimination
      3.2.1 Using MATLAB for row reductions
      3.2.2 Problems
  3.3 Homogeneous Equations
      3.3.1 Properties of solutions of homogeneous systems
      3.3.2 Connection of solutions to homogeneous and inhomogeneous
            systems
      3.3.3 Problems
  3.4 Geometric Applications
      3.4.1 Problems
  3.5 Resistor Networks
      3.5.1 Elements of Basic Circuits
      3.5.2 Two Simple Examples Made Complicated
      3.5.3 Loop Currents
      3.5.4 Alternate Presentation of Resistor Networks
      3.5.5 Problems
  3.6 Additional Topics
      3.6.1 Quadratic Functions
      3.6.2 Least squares fit
      3.6.3 Equilibrium configuration of hanging weights and springs
      3.6.4 Problems
  3.7 Solutions to Chapter Problems

4 Matrices and Determinants
  4.1 Matrix operations
      4.1.1 MATLAB
      4.1.2 Problems
  4.2 Linear Transformations and Matrices
      4.2.1 Linear Transformations
      4.2.2 Rotations in two dimensions
      4.2.3 Projections in two dimensions
      4.2.4 Reflections in two dimensions
      4.2.5 Every linear transformation is multiplication by a matrix
      4.2.6 Composition of linear transformations and matrix product
      4.2.7 Problems
  4.3 Application: random walks
      4.3.1 Problems
  4.4 The Transpose
      4.4.1 MATLAB
      4.4.2 Problems
  4.5 Matrix Inverses
      4.5.1 Computing the inverse
      4.5.2 Inverses of Products
      4.5.3 MATLAB
      4.5.4 Problems
  4.6 Return to Resistor Networks
  4.7 Determinants
      4.7.1 Definition of Determinants
      4.7.2 Determinants of Triangular matrices
      4.7.3 Summary of determinant calculation rules
      4.7.4 Calculation of determinant using row operations
      4.7.5 More expansion formulae
      4.7.6 MATLAB
      4.7.7 Problems
  4.8 Additional Topics
      4.8.1 Application: General Least Squares
      4.8.2 Least squares solutions
      4.8.3 Problems
      4.8.4 Elementary matrices
      4.8.5 Problems
      4.8.6 Exchanging two rows changes the sign of the determinant
      4.8.7 The determinant is linear in each row separately
      4.8.8 Adding a multiple of one row to another doesn’t change
            the determinant
      4.8.9 The determinant of QA
      4.8.10 The determinant of A is zero exactly when A is not invertible
      4.8.11 The product formula: det(AB) = det(A) det(B)
      4.8.12 The determinant of the transpose
      4.8.13 An impractical formula for the inverse
      4.8.14 Cramer’s rule, an impractical way to solve systems
      4.8.15 Problems
  4.9 Solutions to Chapter Problems

5 Complex numbers
  5.1 Complex arithmetic
  5.2 Complex exponential
  5.3 Polar representation of a complex number
  5.4 MATLAB
  5.5 Problems
  5.6 Solutions to Chapter Problems

6 Eigen-analysis
  6.1 Eigenvalues and eigenvectors
      6.1.1 Computing the eigenvalues and eigenvectors
      6.1.2 Complex eigenvalues and eigenvectors
      6.1.3 MATLAB
      6.1.4 Problems
  6.2 Eigenanalysis simplifies matrix powers
      6.2.1 Problems
  6.3 Systems of linear differential equations
      6.3.1 Problems
  6.4 LCR circuits
      6.4.1 Capacitors and inductors
      6.4.2 Differential equations for LCR circuits
      6.4.3 Alternate description of LCR circuits
      6.4.4 Problems
  6.5 Additional Topics
      6.5.1 Diagonalization
      6.5.2 Computing high powers of a matrix
      6.5.3 Another formula for the determinant
      6.5.4 The matrix exponential and differential equations
      6.5.5 Converting higher order equations into first order systems
      6.5.6 Springs and weights
      6.5.7 Problems
  6.6 Solutions to Chapter Problems




Chapter 1

Introduction

Linear Algebra is a branch of Mathematics. Many applied problems from Sci-
ence, Engineering and Finance can be written in terms of Linear Algebra ques-
tions. This is also true of Calculus, which is why these two fields are stressed
in undergraduate Mathematics education at UBC and other universities. By
putting a class of commonly occurring problems into a unified, abstract frame-
work, the problems can be studied in detail and well understood. This under-
standing can then be taken back to problems in many different fields. It is
especially important to realize that very large problems (with potentially mil-
lions of unknowns) can be understood in the same framework as the model
problems we do by hand in these courses. This course has computer labs that
involve the mathematical software, MATLAB, that will allow you to solve larger
problems using numerical computations.
    Unlike Calculus, Linear Algebra does not require a lot of background knowl-
edge. Basic operations in Linear Algebra are just arithmetic. However, there
is a powerful connection between these simple arithmetic operations and geo-
metric quantities. Simple ideas in this course start to become abstract when
combined together. Linear Algebra is a subject you can study with very limited
mathematical background, but you are advised to keep up with course lectures,
readings, assignments and computer labs so you won’t be left behind at the
transition from concrete ideas to abstract ones.
    Details of the course taught in the Spring of 2011 can be found on the common
web page for all sections of the course, which is linked from

http://www.math.ubc.ca/~wetton/

On this page, you will find details about the grade breakdown for the course,
test dates and links to the page on computer labs. Weekly assignments for the
course will be posted here. Other useful material such as old tests for practice
can also be found on this page.







1.1     Course Goals
The goal of the course is to enable students to

  1. recognize linear algebra questions (for which there are straightforward
     analytic and numerical solution techniques) as parts of applied problems

  2. make the connection between geometric properties and analytic quantities
     (determinants, dot and cross products, eigenvalues, etc.)

  3. recognize that linear systems of equations can have a unique solution,
     infinitely many solutions, or no solution, and know how to determine all
     solutions or that none exist

  4. recognize matrix multiplication as a linear transformation and that such
     transformations (to the same dimensional space) can be simplified using
     eigen-analysis

  5. use complex numbers, which arise naturally in the eigen-analysis of ma-
     trices


1.2     About the Subject
The subject of the course is Linear Algebra, focussing on three main topics:
vectors and matrices and connections to geometry, linear systems, and eigen-
analysis of matrices. Several applications are considered including resistor net-
works and random walks.

1.2.1    Connection to Geometry
The first topic considered in the course is vectors, which are quantities with both
magnitude and direction. A typical quantity represented as a vector is a force
F on an object as shown in Figure 1.1. The force F in that figure acts in the
x − y plane as shown. The vector force F can be represented by its components
(Fx, Fy). Some interesting questions you will be able to answer after completing
this course are: What directions are perpendicular to this force? What are the
coordinates of the force in the rotated coordinate system x′ − y′? What are the
coordinates of the force if its direction is rotated? These last two questions are
related.

1.2.2    Linear Systems
Consider the following simple example. You probably saw something like this
in high school.

               Figure 1.1: Force vector and coordinate systems.

Example 1.1 Bob and Sue together have 12 dollars. Sue has 2 dollars more
than Bob. How much money does each have? You can probably guess the solution
by trial and error, but let us proceed a bit more formally. Let x be the amount of
money Sue has and y the amount Bob has. The two statements in the example
can be written mathematically as

                                  x+y           =   12
                                  x−y           =   2.

The equations above are a linear system for the unknowns x and y. A technique
that can be used to solve the system (that is, determine the values of x and
y that simultaneously solve both equations above) is substitution. The second
equation can be written as
                                   x=y+2
This can be substituted into the first equation above, eliminating x from the
problem
            (y + 2) + y = 12 or 2y + 2 = 12 so 2y = 10 so y = 5
The value of y = 5 determines x = 7 from either of the original relationships.
Thus it is determined that Bob has 5 dollars and Sue has 7.
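
The substitution steps of Example 1.1 can be sketched in a few lines of code. This is only an illustration, written in Python rather than the MATLAB used in the labs; the variable names total and difference are our own labels for the right-hand sides of the two equations.

```python
# Example 1.1 by substitution: x + y = 12 and x - y = 2,
# where x is Sue's amount and y is Bob's amount.
total = 12       # right-hand side of x + y = 12
difference = 2   # right-hand side of x - y = 2

# The second equation gives x = y + difference. Substituting into the
# first equation gives (y + difference) + y = total, so 2y = total - difference.
y = (total - difference) / 2
x = y + difference

print(x, y)  # 7.0 5.0
```

Changing total and difference solves any other problem of the same shape, which is the point of writing the steps down formally instead of guessing.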

    Often (but not always) a linear system of n equations for n unknowns has
a unique solution. The example above was for the case n = 2. However, the
substitution technique used above becomes impractical when n is larger than 3.
In this course you will learn the Gaussian Elimination technique to solve linear
systems. This systematic method can find all solutions when there are any and
also determine if the system has no solutions. This method can be implemented
in numerical software and used to solve very large systems.

1.2.3    Eigen-analysis
The final subject of the course is eigen-analysis of matrices and its applications.
A simple, motivational example comes from the study of discrete dynamical
systems. Consider a sequence of values

                                 x0 , x1 , · · · xn · · ·





where the index n is a time level. Suppose that xn is determined by the previous
value xn−1 in the same way for every n, that is

                          xn = f (xn−1 ) for every n ≥ 1                       (1.1)

for a given function f . This could describe the population number xn of a
species at year n. The simple model assumes that the population the next year
only depends on the population this year through the function f . If the initial
value x0 were given, then the values x1 , x2 · · · xn · · · could be determined using
(1.1) repeatedly. A linear problem arises when we take the specific example
f (x) = ax where a is a given constant. In this case, it is easy to compute the
entries of the sequence:

                          x1 = f (x0) = a x0
                          x2 = f (x1) = f (a x0) = a^2 x0
                           ...
                          xn = f (xn−1) = a^n x0

For this example, we can determine how the sequence behaves very well because
we have an expression for xn above that is easy to understand. There are several
cases:
   1. If x0 = 0 then xn = 0 for all n.
   2. If |a| < 1 then xn → 0 as n → ∞.
   3. If a = 1 then xn = x0 for all n.
   4. If a = −1 then the values alternate in sign: xn has the value x0 if n is
      even, −x0 if n is odd.
   5. If |a| > 1 and x0 ≠ 0 then the values of the sequence grow in absolute
      value as n → ∞.
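
The cases above can be checked numerically by applying (1.1) repeatedly with f(x) = ax. The following is a minimal sketch in Python (not MATLAB); the function name iterate and the sample values of a and x0 are chosen only for illustration.

```python
def iterate(a, x0, n):
    """Return the list [x0, x1, ..., xn] where each entry is a times the previous one."""
    xs = [x0]
    for _ in range(n):
        xs.append(a * xs[-1])
    return xs

# One sample value of a for each of the cases |a| < 1, a = 1, a = -1, |a| > 1.
for a in (0.5, 1, -1, 2):
    print(a, iterate(a, 3, 5))
```

The last entry of each list agrees with the closed form a^n x0, so the printed sequences decay, stay constant, alternate in sign, or grow according to the case that a falls into.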
   Linear discrete dynamical systems for vectors are also of interest. In these
cases, multiplication by the number a in the example above is replaced by
multiplication by a matrix A. Eigen-analysis of the matrix A allows one to
understand how the system behaves as n → ∞ in a similar way to the simple
example above.


1.3      About These Notes
The first version of these notes was written by Richard Froese for Math 152
taught in the Spring of 2004. There are many text books on elementary linear
algebra material, but none have the material in the order we want for Math
152 for Applied Science students. These notes stress geometric concepts in
two and three dimensional space. They also treat applications and numerical
approximation using MATLAB in more detail than most texts. For this reason,
the authors have felt it was worthwhile to maintain and improve these notes for
Math 152. Additionally, we believe it is a social benefit for students to have
access to this material without having to purchase an expensive, commercial
text.
    An update to the notes was made by Richard Froese for the course in 2007,
including solutions to the exercises. The version written for the 2009 course
had some updates by Brian Wetton: this introductory chapter, some additional
comments on MATLAB commands, a reworked section on linear systems arising
from electrical networks, and additional problems and solutions. In addition, the
notes were converted to standard LaTeX format to make them easier to maintain.

    Substantial revisions for the notes for 2010 were done by Ignacio Rozada,
who was supported by a UBC Skylight grant over the Summer of 2009 to add the
problems and solutions used in weekly assignments the previous year. He also
added additional MATLAB material. Brian Wetton also added additional notes
to Chapter 4 on the use of matrix multiplication and inverses in the derivation
of solutions to the “fundamental problem” of resistor networks.
    These notes for 2011 had minor revisions done by Brian Wetton.
Further revisions are planned. There will be additional MATLAB material
added to the notes and additional problems and solutions. We have considered
having the notes printed and available at cost at the UBC bookstore but are not
yet sure of the demand from students. If you have any suggestions for additional
material or ways to improve the presentation, please send your ideas to Brian
Wetton (wetton@math.ubc.ca).


1.4     About The Computer Labs
The course includes six one-hour computer labs. These are given to small groups
of students every other week starting in the second week of the term. Locations
and times for the lab sections can be found following a link from the course web
page. The labs are being updated slightly for the 2010 course and will be posted
during the term as they are finalized. The labs are designed to be done during
the lab hour, and must be handed in at the end of the lab period. It is a good
idea to read through the lab notes before going to the lab so you are ready to
begin in the lab. After your first lab, you will be able to go to the lab rooms in
open hours to improve your MATLAB skills and to prepare for upcoming labs
if you find you are not able to complete the labs in the lab hour. Computer
lab material including your knowledge of MATLAB commands will be tested
on midterms and the final exam.
    There are two main goals for the labs. The first is to gain familiarity with
the computational tool, MATLAB, that is commonly used in later courses and
Engineering careers. The second is to be able to solve larger, more interesting
applied problems that would otherwise be inaccessible using analytic methods.
Seeing the algorithms of MATLAB in action may also help you understand the
underlying mathematical concepts you see in the lectures.


Chapter 2

Vectors and Geometry

2.1     Chapter Introduction
This chapter contains an introduction to vectors, which correspond to points in
two, three and higher dimensional spaces. In this chapter, you will become famil-
iar with basic vector operations such as addition, scalar multiplication, length,
the dot product, and the cross product (for three dimensional vectors). Vector
representation of lines in 2D and 3D and planes in 3D is presented. Criteria for
when such objects intersect at unique points are given in terms of determinants.
This geometric presentation motivates our study of these kinds of problems in
higher dimensional settings in later chapters. Throughout this chapter, MAT-
LAB commands are introduced that perform the operations described in the
text. For 2D and 3D problems, using MATLAB is only a convenience. For
higher dimensions, doing the computations by hand (even with a calculator) is
impractical, and a computational framework like MATLAB is essential to be
able to solve linear problems.


2.2     Vectors
Vectors are used to describe quantities that have both a magnitude and a di-
rection. You are probably familiar with vector quantities in two and three
dimensions, such as forces and velocities.
    Later in this course we will see that vectors can also describe the configura-
tion of a mechanical system of weights and springs, or the collections of voltages
and currents in an electrical circuit. These more abstract vector quantities are
not so easily visualized since they take values in higher dimensional spaces.
    We begin this course by discussing the geometry of vectors in two and three
dimensions. In two and three dimensions, vectors can be visualized as arrows.
Before we can draw a vector, we have to decide where to place the tail of the
vector. If we are drawing forces, we usually put the tail of the vector at the
place where the force is applied. For example, in Figure 2.1 (left) the forces
acting on a pendulum bob are gravity and the restraining force along the shaft.

Figure 2.1: Forces acting on a pendulum (left) and position and velocity of a
particle (right)

                       Figure 2.2: Scalar multiplication.

    If we are drawing the velocity of a particle at a given time, we would place
the tail of the velocity vector v(t) at the position of the particle at that time
as shown in Figure 2.1 (right). Once we have chosen a starting point for the
tails of our vectors (i.e., an origin for space), every point in space corresponds
to exactly one vector, namely the vector whose tail is at the origin and whose
head is at the given point. For example, in Figure 2.1 (right) we have chosen an
arbitrary point as the origin (marked with a circle) and identified the position
of the particle with the vector r(t).

2.2.1    Multiplication by a number and vector addition
There are two basic operations defined for vectors. One is multiplication of a
vector by a number (also called scalar multiplication). The other is addition of
two vectors.
   A vector a can be multiplied by a number (or scalar) s to produce a new
vector sa. If s is positive then sa points in the same direction as a and has
length s times the length of a. This is shown in Figure 2.2. If s is negative then
sa points in the direction opposite to a and has length |s| times the length of a.
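
The effect of a scalar on length and direction can be checked numerically. The sketch below is in Python rather than MATLAB, and it anticipates two things introduced later in this chapter: co-ordinates (Section 2.2.2) and the length formula (Section 2.3.1). The helper names scale and length are illustrative only.

```python
import math

def scale(s, a):
    """Multiply every co-ordinate of the vector a by the number s."""
    return [s * c for c in a]

def length(a):
    """Length of a vector given by its co-ordinates."""
    return math.sqrt(sum(c * c for c in a))

a = [3.0, 4.0]
for s in (2, -0.5):
    sa = scale(s, a)
    # The length of sa should equal |s| times the length of a.
    print(s, sa, length(sa), abs(s) * length(a))
```

For the negative value of s, both co-ordinates change sign, which is the reversal of direction described above.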






                          Figure 2.3: Vector Addition.


   To add two vectors a and b, we draw the parallelogram that has a and
b as two of its sides as shown in Figure 2.3. The vector a + b has its tail at
the origin and its head at the vertex of the parallelogram opposite the origin.
Alternatively we can imagine sliding (or translating) one of the vectors, without
changing its direction, so that its tail sits on the head of the other vector. (In
the diagram we translated the vector a.) The sum a + b is then the vector
whose tail is at the origin and whose head coincides with the head of the vector we moved.
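
In co-ordinates (introduced in the next section), the parallelogram construction amounts to adding the vectors component by component. A small Python sketch, with the helper name add and the sample vectors chosen only for illustration:

```python
def add(a, b):
    """Add two vectors given by their co-ordinates, component by component."""
    return [x + y for x, y in zip(a, b)]

a = [2, 1]
b = [1, 3]
origin = [0, 0]

# The four vertices of the parallelogram with sides a and b, listed in order
# around it. The vertex opposite the origin is the head of a + b.
vertices = [origin, a, add(a, b), b]
print(vertices)

# Sliding a onto the head of b, or b onto the head of a, reaches the same point.
assert add(a, b) == add(b, a)
```

The final check reflects the fact that it does not matter which of the two vectors is translated.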

Example 2.1 Describe and sketch the following set of points {sa : s ∈ R}
(that is, the set of all scalar multiples of a) where a is a non-zero vector in R2 .
The set is a straight line going through the origin with direction a as shown in
Figure 2.4.


2.2.2     Co-ordinates
In order to do calculations with vectors, we have to introduce co-ordinate axes.
Once we have chosen in what directions the co-ordinate axes will lie, we can
specify a vector by giving its components in the co-ordinate directions.
    In Figure 2.5 we see two choices of x and y axes. For the first choice of
axes, the vector a has co-ordinates [5, 3] and for the second choice of axes the
co-ordinates are $[\sqrt{34}, 0]$. In a given problem, it makes sense to choose the axes
so that at least some of the vectors have a simple representation. For example,
in analyzing the forces acting on a pendulum, we would choose the y axis
either to be vertical, or to lie along the shaft of the pendulum.
    We can choose to write the co-ordinates of a vector in a row, as above, or in
a column, like
\[ \begin{bmatrix} 5 \\ 3 \end{bmatrix} \]
Later on, we will almost always write vectors as columns. But in this chapter we
will write vectors as rows. Writing vectors as rows saves space on the page but
we will learn later how to write vectors in row form even when we want them to
be column vectors for other reasons.
    Note: When writing vector co-ordinates by hand or in this text, either square
or round brackets can be used. However, when using MATLAB, vectors must be
created with square brackets (round brackets are used for other purposes).

                     Figure 2.4: Figure for Example 2.1.

   A convenient way to choose the co-ordinate axes is to specify unit vectors
(that is, vectors of length one) that lie along each of the axes. These vectors
are called standard basis vectors and are denoted i and j (in two dimensions)
and i, j and k (in three dimensions). These vectors are shown in Figure 2.6.
(Sometimes they are also denoted e1 and e2 (in two dimensions) and e1 , e2 and
e3 (in three dimensions).)
   The unit vectors have co-ordinates

                               i   = e1 = [1, 0]
                               j   = e2 = [0, 1]

in two dimensions, and

                               i   = e1 = [1, 0, 0]
                              j    = e2 = [0, 1, 0]
                              k = e3 = [0, 0, 1]

in three dimensions.
    Often, we make no distinction between a vector and its co-ordinate repre-
sentation. In other words, we regard the co-ordinate axes as being fixed once
and for all. Then a vector in two dimensions is simply a list of two numbers (the
components) [a1 , a2 ], and a vector in three dimensions is a list of three numbers
[a1 , a2 , a3 ]. Vectors in higher dimensions are now easy to define. A vector in n
dimensions is a list of n numbers [a1 , a2 , . . . , an ].

                Figure 2.5: Two choices of co-ordinate axes.

            Figure 2.6: Unit vectors in 2D (left) and 3D (right).

                    Figure 2.7: Adding vector co-ordinates.

     When a vector is multiplied by a number, each component is scaled by the
same amount. Thus if a = [a1 , a2 ], then

                                 sa       = s[a1 , a2 ]
                                          =      [sa1 , sa2 ]

   Similarly, when two vectors are added, their co-ordinates are added component-
wise. So if a = [a1 , a2 ] and b = [b1 , b2 ], then

                           a+b =              [a1 , a2 ] + [b1 , b2 ]
                                      =       [a1 + b1 , a2 + b2 ]

This is shown in Figure 2.7.
     The analogous formulae hold in three (and higher) dimensions. If a =
[a1 , a2 , . . . , an ] and b = [b1 , b2 , . . . , bn ], then

                      sa   = s[a1 , a2 , . . . , an ]
                           =    [sa1 , sa2 , . . . , san ]
                   a+b =        [a1 , a2 , . . . , an ] + [b1 , b2 , . . . , bn ]
                           =    [a1 + b1 , a2 + b2 , . . . , an + bn ]

Example 2.2 Sketch axes x1 -x2 . Add the vectors (1,1) and (2,-1) to your
sketch. Draw these vectors with base point at the origin. Now add the vector
(1,-2) to your sketch, starting at the base point (1,1). That is, draw the vector
with components 1 to the right and 2 down starting at (1,1). Note: your sketch
should show graphically that (1,1)+(1,-2)=(2,-1). See Figure 2.8.
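
The component-wise rules can also be checked numerically. The following is a minimal sketch in MATLAB (the notation is introduced in Section 2.2.4); the variable names u, v, w and t are chosen here for illustration only and do not appear elsewhere in these notes.

```matlab
% Vectors from Example 2.2, written as row vectors.
u = [1 1];
v = [1 -2];

% Vector addition acts on each component separately.
w = u + v;      % [1+1, 1+(-2)] = [2 -1]

% Scalar multiplication scales every component by the same number.
t = 3 * u;      % [3*1, 3*1] = [3 3]

disp(w)
disp(t)
```

The sum w agrees with the vector (2,-1) drawn in the sketch for Example 2.2.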







                     Figure 2.8: Figure for example 2.2


2.2.3    Properties of vector addition and scalar multiplication
Let 0 denote the zero vector. This is the vector all of whose components are
zero. The following properties are intuitive and easy to verify.

  1. a + b = b + a

  2. a + (b + c) = (a + b) + c

  3. a + 0 = a

  4. a + (−a) = 0

  5. s(a + b) = sa + sb

  6. (s + t)a = sa + ta

  7. (st)a = s(ta)

  8. 1a = a

    They follow from similar properties which hold for numbers. For example,
for numbers a1 and b1 we know that a1 + b1 = b1 + a1 . Thus

                 a+b =      [a1 , a2 ] + [b1 , b2 ]
                       =    [a1 + b1 , a2 + b2 ] = [b1 + a1 , b2 + a2 ]
                       =    [b1 , b2 ] + [a1 , a2 ] = b + a,


so property 1 holds. Convince yourself that the rest of these properties are
true. (What is the vector −a?) It might seem like a waste of time fussing over
obvious properties such as these. However, we will see when we come to the
cross product and matrix product, that sometimes such “obvious” properties
turn out to be false!
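
A numerical spot check is no substitute for the verification asked for above, but it can catch a mis-stated property quickly. The sketch below, in the MATLAB notation of Section 2.2.4, tests the eight properties for one arbitrary choice of vectors and scalars. All names and values here (a, b, c, s, t, z, checks) are illustrative assumptions.

```matlab
% Arbitrary test values (integers, so the comparisons below are exact).
a = [1 2];  b = [3 -1];  c = [-2 5];
s = 2;  t = -3;
z = [0 0];                      % the zero vector

% Each entry is 1 (true) when the corresponding property holds for these values.
checks = [ isequal(a + b, b + a), ...             % property 1
           isequal(a + (b + c), (a + b) + c), ... % property 2
           isequal(a + z, a), ...                 % property 3
           isequal(a + (-a), z), ...              % property 4
           isequal(s*(a + b), s*a + s*b), ...     % property 5
           isequal((s + t)*a, s*a + t*a), ...     % property 6
           isequal((s*t)*a, s*(t*a)), ...         % property 7
           isequal(1*a, a) ];                     % property 8

disp(checks)
```

Integer test values are used so that the comparisons are not affected by rounding error. Agreement for one choice of values does not prove a property in general.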






2.2.4    MATLAB: basic scalar and vector operations
On the course web page there is a link to the Math 152 computer lab page,
where information on the location of computer labs can be found, as well as
how to start up the MATLAB program on the computers in these labs. In
the command window at the prompt >>, you can type MATLAB commands
directly. Some basic commands are given below.

assignment: Scalar and vector variables can be assigned using the “=” oper-
     ator. For example

     a = 2

     followed by <enter> assigns the scalar value of 2 to the variable a. The
     result of the command is printed out, although this can be suppressed by
     using a semicolon at the end of the command:

     a = 2;

     Here a is still assigned the value of 2 but no output is generated. Vector
     variables are assigned with the following notation:

     b = [1 2];

     Note that b = [1, 2] has the same meaning in MATLAB, i.e., numbers
     separated by a comma or a space imply row vectors. For column vectors,
     the entries have to be separated by semicolons:

     b1 = [2; 3];

     Note also that there are no special distinctions between the names of scalar
     and vector variables.

addition: Both scalar and vector addition can be done with the “+” operator.
     Keeping the values of scalar a and vector b above, we enter the commands

     a2 = 5;
     b2 = [2 9];
     a+a2
     c = b+b2;

     The first two lines above assign a new scalar and vector. The third line
     prints out the answer 7 (2+5). The last line assigns the resulting vector
     [3 11] ([1 2] + [2 9]) to the new vector c but prints nothing.

scalar multiplication: Scalar multiplication (of vectors and other scalars) is
     implemented using the “*” operator. Using the variables defined above,

     a*a2
     a*b

     would result in 10 (2 times 5) and [2 4] (2 times [1 2]). The “*” operator
     also implements matrix-vector and matrix-matrix multiplication, discussed
     later in the course. Vector-vector multiplications (dot products and cross
     products) are implemented using different commands, as discussed in the
     next section.

other commands: There are many useful functions built in to MATLAB such
     as sqrt (square root), cos (cosine, taking an argument in radians), acos
     (inverse cosine, giving a result in radians) and many more. They are
     called as follows

     sqrt(2)

     which will return $\sqrt{2}$ to 4 decimal places. Typing help followed by a
     command name gives you a description of that command. Try typing help
     atan2 since atan2 is a pretty useful function. These MATLAB functions
     can take vector arguments, in which case they act on each entry of the
     vector. For example

     sqrt([1 4])

     will produce the vector [1 2].

2.2.5    Problems
Problem 2.1 Sketch axes x1 -x2 . Add the vectors (2,2) and (1,-1) to your
sketch. Draw these vectors with base point at the origin. Now add the vector
(1,-1) to your sketch, starting at the base point (2,2). That is, draw the vector
with components 1 to the right and 1 down starting at (2,2). Note: your sketch
should show graphically that (2,2)+(1,-1)=(3,1).
Problem 2.2 Let a, b and c be fixed non-zero vectors. Describe and sketch the
following sets of points in two and three dimensions:
  (i) {sa : s ∈ R} (i.e., the set of all scalar multiples of a)
 (ii) {sa : s > 0} (i.e., the set of all positive scalar multiples of a)
(iii) {b + sa : s ∈ R}
(iv) {sa + tb : s, t ∈ R}
 (v) {c + sa + tb : s, t ∈ R}
Problem 2.3 Describe the vectors a − b and b − a.
Problem 2.4 Find an expression for the midpoint between a and b. Find an
expression for a point one third of the way between a and b.
Problem 2.5 Find an expression for the line segment joining a and b.


Chapter 2. Vectors and Geometry                   2.3. Geometrical Aspects of Vectors




                         Figure 2.9: Pythagorean Formula.


2.3      Geometrical Aspects of Vectors
2.3.1     Length of a vector
It follows from the Pythagorean formula that the length $\|\mathbf{a}\|$ of $\mathbf{a} = [a_1, a_2]$
satisfies $\|\mathbf{a}\|^2 = a_1^2 + a_2^2$. This is shown in Figure 2.9.
    Thus
\[ \|\mathbf{a}\| = \sqrt{a_1^2 + a_2^2}. \]

Similarly, for a vector $\mathbf{a} = [a_1, a_2, a_3]$ in three dimensions,

\[ \|\mathbf{a}\| = \sqrt{a_1^2 + a_2^2 + a_3^2}. \]

The distance between two vectors a and b is the length of the difference b − a.
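
As a small illustration, the length and distance formulas can be evaluated with the MATLAB command norm (described in Section 2.3.4). The vectors and the names len_a and dist_ab below are made up for this sketch.

```matlab
a = [1 2 2];
b = [4 6 2];

len_a = norm(a);        % sqrt(1 + 4 + 4) = 3
dist_ab = norm(b - a);  % b - a = [3 4 0], so the distance is 5

disp(len_a)
disp(dist_ab)
```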

2.3.2     The dot product
The dot product of two vectors is defined in both two and three dimensions
(actually in any dimension). The result is a number. Two main uses of the dot
product are testing for orthogonality and computing projections.
   The dot product of a = [a1 , a2 ] and b = [b1 , b2 ] is given by

                                  a · b = a1 b1 + a2 b2 .

Similarly, the dot product of a = [a1 , a2 , a3 ] and b = [b1 , b2 , b3 ] is given by

                              a · b = a1 b1 + a2 b2 + a3 b3 .

The properties of the dot product are as follows:

0. If a and b are vectors, then a · b is a number.

1. $\mathbf{a} \cdot \mathbf{a} = \|\mathbf{a}\|^2$.

2. a · b = b · a.

3. a · (b + c) = a · b + a · c.

4. s(a · b) = (sa) · b.

5. 0 · a = 0.

6. $\mathbf{a} \cdot \mathbf{b} = \|\mathbf{a}\| \|\mathbf{b}\| \cos(\theta)$, where θ is the angle between a and b.

7. a · b = 0 ⇐⇒ a = 0 or b = 0 or a and b are orthogonal (i.e., perpendicular).

 Figure 2.10: The vectors a, b and a − b (left). Lengths of segments (right).

Properties 0 to 5 are easy consequences of the definitions. For example, to verify
property 5 we write

                0 · a = [0, 0, 0] · [a1 , a2 , a3 ] = 0a1 + 0a2 + 0a3 = 0.

    Property 6 is the most important property and is often taken as the definition
of the angle θ between vectors a and b. Notice that our definition is given in
terms of the components of the vectors, which depend on how we chose the
co-ordinate axes. It is not at all clear that, if we change the co-ordinate axes
and hence the co-ordinates of the vectors, we will get the same answer for
the dot product. However, property 6 says that the dot product only depends
on the lengths of the vectors and the angle between them. These quantities
are independent of how co-ordinate axes are chosen, and hence so is the dot
product.
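
The independence from the choice of axes can be illustrated numerically. The sketch below takes the vector a = [5, 3] from Figure 2.5 together with an arbitrary second vector b, computes the co-ordinates of both with respect to axes rotated so that the first axis lies along a, and compares the two dot products. The change-of-axes formula and the names phi, a_rot and b_rot are assumptions of this sketch (rotations are only treated later in the course); dot is the MATLAB command described in Section 2.3.4.

```matlab
% Vector a from Figure 2.5 and a second, arbitrary vector b,
% both given in the first choice of axes.
a = [5 3];
b = [1 4];

% Angle by which the second set of axes is rotated (first axis along a).
phi = atan2(a(2), a(1));

% Co-ordinates of the same vectors with respect to the rotated axes.
% (This change-of-axes formula is an assumption of the sketch.)
a_rot = [a(1)*cos(phi) + a(2)*sin(phi), -a(1)*sin(phi) + a(2)*cos(phi)];
b_rot = [b(1)*cos(phi) + b(2)*sin(phi), -b(1)*sin(phi) + b(2)*cos(phi)];

disp(a_rot)              % approximately [sqrt(34) 0]
disp(dot(a, b))          % 17
disp(dot(a_rot, b_rot))  % 17 up to rounding error
```

The co-ordinates change, but the dot product does not, as property 6 predicts.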
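    To show that property 6 holds we compute $\|\mathbf{a} - \mathbf{b}\|^2$ in two different ways.
First of all, using properties 1 to 5, we have
\[
\begin{aligned}
\|\mathbf{a} - \mathbf{b}\|^2 &= (\mathbf{a} - \mathbf{b}) \cdot (\mathbf{a} - \mathbf{b}) \\
&= \mathbf{a} \cdot \mathbf{a} - \mathbf{a} \cdot \mathbf{b} - \mathbf{b} \cdot \mathbf{a} + \mathbf{b} \cdot \mathbf{b} \\
&= \|\mathbf{a}\|^2 + \|\mathbf{b}\|^2 - 2\,\mathbf{a} \cdot \mathbf{b}
\end{aligned}
\tag{2.1}
\]

(Which properties were used in each step?) Next we compute $\|\mathbf{a} - \mathbf{b}\|$ as depicted
in Figure 2.10 (left).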





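   We mark the lengths of each of the line segments in Figure 2.10 (right).
Using Pythagoras’ theorem for the right angled triangle on the right of this
diagram, we see that

\[ \|\mathbf{a} - \mathbf{b}\|^2 = \left( \|\mathbf{a}\| - \|\mathbf{b}\| \cos(\theta) \right)^2 + \|\mathbf{b}\|^2 \sin^2(\theta). \]

Thus, using $\cos^2(\theta) + \sin^2(\theta) = 1$,

\[
\begin{aligned}
\|\mathbf{a} - \mathbf{b}\|^2 &= \|\mathbf{a}\|^2 + \|\mathbf{b}\|^2 \cos^2(\theta) - 2 \|\mathbf{a}\| \|\mathbf{b}\| \cos(\theta) + \|\mathbf{b}\|^2 \sin^2(\theta) \\
&= \|\mathbf{a}\|^2 + \|\mathbf{b}\|^2 - 2 \|\mathbf{a}\| \|\mathbf{b}\| \cos(\theta)
\end{aligned}
\tag{2.2}
\]

Actually, this is just the cosine law applied to the triangle in Figure 2.10 and you
may have been able to write (2.2) directly. Now we equate the two expressions
(2.1) and (2.2) for $\|\mathbf{a} - \mathbf{b}\|^2$. This gives

\[ \|\mathbf{a}\|^2 + \|\mathbf{b}\|^2 - 2\,\mathbf{a} \cdot \mathbf{b} = \|\mathbf{a}\|^2 + \|\mathbf{b}\|^2 - 2 \|\mathbf{a}\| \|\mathbf{b}\| \cos(\theta). \]

Subtracting $\|\mathbf{a}\|^2 + \|\mathbf{b}\|^2$ from both sides and dividing by −2 now yields

\[ \mathbf{a} \cdot \mathbf{b} = \|\mathbf{a}\| \|\mathbf{b}\| \cos(\theta). \]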

This proves property 6.
    Property 7 now follows directly from 6. If a · b = 0 then $\|\mathbf{a}\| \|\mathbf{b}\| \cos(\theta) = 0$,
so either $\|\mathbf{a}\| = 0$, in which case a = 0, or $\|\mathbf{b}\| = 0$, in which case b = 0, or
cos(θ) = 0, which implies that θ = π/2 (since θ lies between 0 and π). This
implies a and b are orthogonal.
    Property 6 can be used to compute the angle between two vectors as shown
in the example below.
Example 2.3 What is the angle between the vectors whose tails lie at the centre
of a cube and whose heads lie on adjacent vertices? To compute this take a
cube of side length 2 and centre it at the origin, so that the vertices lie at the
points [±1, ±1, ±1]. Then we must find the angle between a = [1, 1, 1] and
b = [−1, 1, 1]. Since
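\[ \mathbf{a} \cdot \mathbf{b} = -1 + 1 + 1 = 1 = \|\mathbf{a}\| \|\mathbf{b}\| \cos(\theta) = \sqrt{3}\,\sqrt{3}\,\cos(\theta) \]

we obtain
\[ \theta = \arccos(1/3) \approx 1.231 \quad (\approx 70.5^\circ). \]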
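
The same computation can be carried out in MATLAB with the dot, norm and acos commands. This is a minimal sketch; the names theta and theta_deg are chosen here for illustration.

```matlab
a = [1 1 1];
b = [-1 1 1];

% Property 6 solved for the angle.
theta = acos(dot(a, b) / (norm(a) * norm(b)));   % in radians
theta_deg = theta * 180 / pi;                    % converted to degrees

disp(theta)
disp(theta_deg)
```

Since acos returns a value between 0 and π, the result matches the convention used for the angle between two vectors.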
   Here is an example to review the basic operations on vectors we know so far.
Example 2.4 Consider the vectors a = (2, 3) and b = (1, −3) in R2 . Compute
the following:
 (a) a + b
 (b) 3a
 (c) 2a + 4b
 (d) a · b
 (e) $\|\mathbf{b}\|$
   Solutions:
 (a) a + b = (2, 3) + (1, −3) = (3, 0)
 (b) 3a = 3(2, 3) = (6, 9)
 (c) 2a + 4b = 2(2, 3) + 4(1, −3) = (4, 6) + (4, −12) = (8, −6)
 (d) a · b = (2, 3) · (1, −3) = 2 − 9 = −7.
 (e) $\|\mathbf{b}\| = \sqrt{1^2 + (-3)^2} = \sqrt{10}$.

                             Figure 2.11: Projection.
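
The answers to Example 2.4 can be confirmed with a few MATLAB commands. In this sketch the result names r_a to r_e are hypothetical labels matching parts (a) to (e).

```matlab
a = [2 3];
b = [1 -3];

r_a = a + b;        % (a) [3 0]
r_b = 3*a;          % (b) [6 9]
r_c = 2*a + 4*b;    % (c) [8 -6]
r_d = dot(a, b);    % (d) -7
r_e = norm(b);      % (e) sqrt(10), about 3.1623

disp(r_c)
disp(r_e)
```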

2.3.3    Projections
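Suppose a and b are two vectors. The projection of a in the direction of b,
denoted $\mathrm{proj}_{\mathbf{b}}\,\mathbf{a}$, is the vector in the direction of b whose length is determined
by drawing a line perpendicular to b that goes through a. In other words, the
length of $\mathrm{proj}_{\mathbf{b}}\,\mathbf{a}$ is the component of a in the direction of b. This is shown in
Figure 2.11.
    To compute $\mathrm{proj}_{\mathbf{b}}\,\mathbf{a}$, we first note that it is a multiple of b. Thus $\mathrm{proj}_{\mathbf{b}}\,\mathbf{a} = s\mathbf{b}$
for some number s. To compute s, we use the fact that the vector $\mathrm{proj}_{\mathbf{b}}\,\mathbf{a} - \mathbf{a}$
(along the dotted line in the diagram) is orthogonal to b. Thus $(\mathrm{proj}_{\mathbf{b}}\,\mathbf{a} - \mathbf{a}) \cdot \mathbf{b} = 0$,
or $(s\mathbf{b} - \mathbf{a}) \cdot \mathbf{b} = 0$, or $s = (\mathbf{a} \cdot \mathbf{b}) / (\mathbf{b} \cdot \mathbf{b}) = (\mathbf{a} \cdot \mathbf{b}) / \|\mathbf{b}\|^2$. Thus
\[ \mathrm{proj}_{\mathbf{b}}\,\mathbf{a} = \frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{b}\|^2}\,\mathbf{b}. \]

If b is a unit vector (i.e., $\|\mathbf{b}\| = 1$) this expression is even simpler. In this case

\[ \mathrm{proj}_{\mathbf{b}}\,\mathbf{a} = (\mathbf{a} \cdot \mathbf{b})\,\mathbf{b}. \]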

    Projections are useful for computing the components of a vector in various
directions. An easy example is given by the co-ordinates of a vector. These are
simply the components of a vector in the direction of the standard basis vectors.
So in two dimensions

                          a1   = a · i = [a1 , a2 ] · [1, 0]
                          a2   = a · j = [a1 , a2 ] · [0, 1]

2.3.4     MATLAB: norm and dot commands
MATLAB has built-in functions that implement most of the mathematical operations
introduced in this course. For example, the commands

norm(a) returns the length (norm) of the vector a.

dot(a,b) returns the dot product of the vectors a and b (if the vectors do not
     have the same length, an error results as you would expect).

Using these commands and scalar multiplication of vectors, a projection of a
onto the direction b can be implemented:

(dot(a,b)/norm(b)^2)*b

where / denotes division (of scalar quantities in this case) and ^p gives the
p’th power of a quantity.
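
As a worked instance of the projection command, the sketch below uses two made-up vectors and also checks the orthogonality condition from Section 2.3.3 that was used to derive the formula. The names p and r are illustrative.

```matlab
a = [2 3];
b = [4 0];

% Projection of a in the direction of b.
p = (dot(a, b) / norm(b)^2) * b;    % (8/16)*[4 0] = [2 0]

% The difference p - a should be orthogonal to b.
r = p - a;                          % [0 -3]

disp(p)
disp(dot(r, b))                     % 0
```

Because b points along the first co-ordinate axis here, the projection simply keeps the first component of a.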

2.3.5     Problems
Problem 2.6 Consider the vectors a = (1, 2) and b = (1, −2) in R2 (the set
of vectors with 2 components). Compute the following:

  1. a + b

  2. 2a

  3. a − b

  4. a · b

  5. $\|\mathbf{b}\|$

Problem 2.7 A circle in the x1 -x2 plane has centre at (2,5). A given point
on its circumference is (3,3). Write an equation that describes all the points
(x1 , x2 ) on the circle.

Problem 2.8 Find the equation of a sphere centred at a = [a1 , a2 , a3 ] with
radius r. (Hint: the sphere is the set of points x = [x1 , x2 , x3 ] whose distance
from a is r.)

Problem 2.9 Find the equation of a sphere if one of its diameters has endpoints
[2, 1, 4] and [4, 3, 10].






Problem 2.10 Compute the dot product of the vectors a and b and find the
angle between them.
  (i) a = [1, 2], b = [−2, 3]
 (ii) a = [−1, 2], b = [1, 1]
(iii) a = [1, 1], b = [2, 2]
 (iv) a = [1, 2, 1], b = [−1, 1, 1]
 (v) a = [−1, 2, 3], b = [3, 0, 1]
Problem 2.11 Let a = (1, 1, 1) and b = (3, 1, −2). Compute the following:
  1. The angle between a and b.
  2. proja b (the projection of b in the direction of a).
Problem 2.12 Let a = (1, 4, 0) and b = (2, −1, 5). Compute the following:
 (a) The angle between a and b.
 (b) proja b (the projection of b in the direction of a).
Problem 2.13 For which value of s is the vector [1, 2, s] orthogonal to [−1, 1, 1]?
Problem 2.14 Does the triangle with vertices [1, 2, 3], [4, 0, 5] and [3, 4, 6] have
a right angle?
Problem 2.15 Determine the values of c1 and c2 such that the vector [c1 1 c2 ]
is a scalar multiple of [2 -2 3].
Problem 2.16 An airplane with an approach speed of 70 knots is on approach
to runway 26 (i.e., pointing in the direction of 260 degrees). This is shown
in Figure 2.12 (left). If the wind is from 330 degrees at 10 knots, what heading
should the pilot maintain to stay lined up with the runway? What is the
groundspeed of the airplane?

Problem 2.17 Suppose the shaft of the pendulum makes an angle of θ
with the vertical direction as shown in Figure 2.12 (right). The force of gravity
has magnitude (length) equal to mg and points downwards. The force along the
shaft of the pendulum acts to keep the shaft rigid, i.e., the component of the
total force along the shaft is zero. Write down the co-ordinates of the two forces
and the total force using two different sets of co-ordinate axes — one horizontal
and vertical, and one parallel to and orthogonal to the shaft of the pendulum.
Problem 2.18 (Matlab) In Matlab code, if one defines a vector a = [a1 , a2 , · · · , an ],
with the ai ’s being any numbers, the output of typing a(j) would be aj .
   Suppose that you have a two element vector. How would you write a line of
Matlab code to compute the norm of the vector, without using the norm or dot
commands?


Chapter 2. Vectors and Geometry              2.4. Determinants and the Cross Product




Figure 2.12: Runway diagram for problem 2.16 (left) and the pendulum of
problem 2.17 (right)


2.4      Determinants and the Cross Product
2.4.1     The determinant in two and three dimensions
The determinant is a number that is associated with a square matrix, that is, a
square array of numbers.
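   In two dimensions it is defined by

\[ \det \begin{bmatrix} a_1 & a_2 \\ b_1 & b_2 \end{bmatrix} = a_1 b_2 - a_2 b_1. \]

The definition in three dimensions is
\[
\begin{aligned}
\det \begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix}
&= a_1 \det \begin{bmatrix} b_2 & b_3 \\ c_2 & c_3 \end{bmatrix}
 - a_2 \det \begin{bmatrix} b_1 & b_3 \\ c_1 & c_3 \end{bmatrix}
 + a_3 \det \begin{bmatrix} b_1 & b_2 \\ c_1 & c_2 \end{bmatrix} \\
&= a_1 b_2 c_3 - a_1 b_3 c_2 + a_2 b_3 c_1 - a_2 b_1 c_3 + a_3 b_1 c_2 - a_3 b_2 c_1
\end{aligned}
\]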

   We want to determine the relationship between the determinant and the
vectors a = [a1 , a2 ] and b = [b1 , b2 ] (in two dimensions) and a = [a1 , a2 , a3 ],
b = [b1 , b2 , b3 ] and c = [c1 , c2 , c3 ] (in three dimensions). We will do the two
dimensional case now, but postpone the three dimensional case until after we
have discussed the cross product.
   So let a = [a1 , a2 ] and b = [b1 , b2 ] be two vectors in the plane. Define

\[ \hat{\mathbf{a}} = [-a_2, a_1]. \]

Notice that $\hat{\mathbf{a}}$ has the same length as a, and is perpendicular to a, since

\[ \hat{\mathbf{a}} \cdot \mathbf{a} = -a_2 a_1 + a_1 a_2 = 0. \]

There are exactly two vectors with these properties. The vector $\hat{\mathbf{a}}$ is the one
that is obtained from a by a counterclockwise rotation of π/2 (i.e., 90◦ ). To
see this, notice that if a lies in the first quadrant (that is, a1 > 0 and a2 > 0)
then $\hat{\mathbf{a}}$ lies in the second quadrant, and so on. Later in the course we will study
rotations and this will be a special case. Notice now that the determinant can
be written as a dot product.
\[ \hat{\mathbf{a}} \cdot \mathbf{b} = -a_2 b_1 + a_1 b_2 = \det \begin{bmatrix} a_1 & a_2 \\ b_1 & b_2 \end{bmatrix} \]
We want to use the geometric formula for the dot product of $\hat{\mathbf{a}}$ and b. Let θ be
the angle between a and b and π/2 − θ be the angle between $\hat{\mathbf{a}}$ and b, as shown
in Figure 2.13.
    Using the geometric meaning of the dot product, we obtain
\[
\begin{aligned}
\det \begin{bmatrix} a_1 & a_2 \\ b_1 & b_2 \end{bmatrix} &= \hat{\mathbf{a}} \cdot \mathbf{b} \\
&= \|\hat{\mathbf{a}}\| \|\mathbf{b}\| \cos(\pi/2 - \theta) \\
&= \|\mathbf{a}\| \|\mathbf{b}\| \sin(\theta)
\end{aligned}
\]

                           Figure 2.13: The vector $\hat{\mathbf{a}}$.

We need to be a bit careful here. When we were discussing the dot product,
we always assumed that the angle between two vectors was in the range 0 to
π. In fact, the geometric formula for the dot product is not sensitive to how we
measure the angle. Suppose that instead of θ in the range 0 to π we use θ1 = −θ
(measuring the angle “backwards”) or θ2 = 2π − θ (measuring the angle going
the long way around the circle). Since cos(θ) = cos(−θ) = cos(2π − θ) we have

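\[ \mathbf{c} \cdot \mathbf{d} = \|\mathbf{c}\| \|\mathbf{d}\| \cos(\theta) = \|\mathbf{c}\| \|\mathbf{d}\| \cos(\theta_1) = \|\mathbf{c}\| \|\mathbf{d}\| \cos(\theta_2). \]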

In other words, the geometric formula for the dot product still is true.
    In the diagram above, we want to let the angle θ between a and b range
between −π and π. In this case the angle π/2 − θ between $\hat{\mathbf{a}}$ and b is sometimes
not in the range between 0 and 2π. But if this happens, then it is still the
angle between $\hat{\mathbf{a}}$ and b, just “backwards” or “the long way around.” Thus the
geometric formula above still is correct.





    Values of θ between 0 and π correspond to the situation where the direction
of b is obtained from the direction of a by a counterclockwise rotation of less
than π. This is the case in the diagram. On the other hand, θ between −π and
0 corresponds to the case where a clockwise rotation of less than π is needed to
get from the direction of a to the direction of b.
    The quantity sin(θ) can be positive or negative, depending on the orientations
of a and b, but in any case the positive quantity $\|\mathbf{b}\|\,|\sin(\theta)|$ is the height
of the parallelogram spanned by a and b if we take a to be the base. In this
case, the length of the base is $\|\mathbf{a}\|$. Recall that the area of a parallelogram is
the length of the base times the height. Thus

\[ \det \begin{bmatrix} a_1 & a_2 \\ b_1 & b_2 \end{bmatrix} = \pm\,\text{Area of parallelogram spanned by } \mathbf{a} \text{ and } \mathbf{b} \]

The determinant is positive if sin(θ) is positive, that is, if θ is positive. This is
the case if the direction of b is obtained by a counterclockwise rotation of half
a circle or less from the direction of a. Otherwise the determinant is negative.
    Notice that the determinant whose rows are the components of two non-zero
vectors a and b is zero exactly when the vectors a and b are pointing in the
same direction, or in the opposite direction, that is, if one is obtained from the
other by scalar multiplication. The sign of the determinant gives information
about their relative orientation.
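
These statements can be tried out with the MATLAB command det. The sketch below assumes that stacking two row vectors with a semicolon, as in [a; b], builds the 2 × 2 array whose rows are a and b (matrices are only introduced later in the course); the vectors and the names d_ab, d_ba, d_par and d_formula are made up for illustration.

```matlab
a = [3 0];
b = [1 2];

% Stacking the row vectors with a semicolon builds the 2 x 2 array.
d_ab = det([a; b]);                 % a1*b2 - a2*b1 = 6
d_ba = det([b; a]);                 % rows swapped: -6
d_par = det([a; 2*a]);              % parallel vectors: 0
d_formula = a(1)*b(2) - a(2)*b(1);  % the defining formula, also 6

disp([d_ab d_ba d_par d_formula])
```

Here the parallelogram has base 3 and height 2, so its area is 6. The sign is positive for the order a, b because b is reached from a by a counterclockwise rotation of less than π.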

2.4.2     The cross product
Unlike the dot product, the cross product is only defined for vectors in three
dimensions. And unlike the dot product, the cross product of two vectors is
another vector, not a number. If a = [a1 , a2 , a3 ] and b = [b1 , b2 , b3 ], then a × b
is a vector given by

                  a × b = [a2 b3 − a3 b2 , a3 b1 − a1 b3 , a1 b2 − a2 b1 ].

An easy way to remember this is to write down a 3 × 3 matrix whose first row
contains the unit basis vectors and whose second and third rows contain the
components of a and b. Then the cross product is obtained by following the
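usual rules for computing a 3 × 3 determinant.
\[
\begin{aligned}
\det \begin{bmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{bmatrix}
&= \mathbf{i} \det \begin{bmatrix} a_2 & a_3 \\ b_2 & b_3 \end{bmatrix}
 - \mathbf{j} \det \begin{bmatrix} a_1 & a_3 \\ b_1 & b_3 \end{bmatrix}
 + \mathbf{k} \det \begin{bmatrix} a_1 & a_2 \\ b_1 & b_2 \end{bmatrix} \\
&= [a_2 b_3 - a_3 b_2,\; a_3 b_1 - a_1 b_3,\; a_1 b_2 - a_2 b_1]
\end{aligned}
\]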

The geometric meaning of the cross product is given by the following three
properties:

   1. a × b is orthogonal to a and to b.

   2. $\|\mathbf{a} \times \mathbf{b}\| = \|\mathbf{a}\| \|\mathbf{b}\| \sin(\theta)$, where θ is the angle between a and b. In this
    formula, θ lies between 0 and π, so that sin(θ) is positive. This is the same
    as saying that the length of a × b is the area of the parallelogram spanned
    by a and b.

   3. The vectors a, b and a × b obey the right hand rule.

              Figure 2.14: The parallelogram spanned by a and b.

This geometric description of the cross product shows that the definition of the
cross product is independent of how we choose our co-ordinate axes. To verify
1, we compute the dot products a · (a × b) and b · (a × b) and verify that they
are zero. (This is one of the problems below.)
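    To verify 2 we must show that the length of a × b is the area of the
parallelogram spanned by a and b, since the quantity $\|\mathbf{a}\| \|\mathbf{b}\| \sin(\theta)$ is precisely this
area.
    Since both the length and the area are positive quantities, it is enough to
compare their squares. We have
\[ \|\mathbf{a} \times \mathbf{b}\|^2 = (a_2 b_3 - a_3 b_2)^2 + (a_3 b_1 - a_1 b_3)^2 + (a_1 b_2 - a_2 b_1)^2 \tag{2.3} \]

    On the other hand, the area A of the parallelogram spanned by a and b is
length $\|\mathbf{a}\|$ times the height. This height is the length of the vector
$\mathbf{b} - \mathrm{proj}_{\mathbf{a}}\,\mathbf{b} = \mathbf{b} - (\mathbf{a} \cdot \mathbf{b})\,\mathbf{a} / \|\mathbf{a}\|^2$ as shown in Figure 2.14. Using these facts, we arrive at the
following formula for the square of the area of the parallelogram.

\[
\begin{aligned}
A^2 &= \|\mathbf{a}\|^2 \left\| \mathbf{b} - (\mathbf{a} \cdot \mathbf{b})\,\mathbf{a} / \|\mathbf{a}\|^2 \right\|^2 \\
&= \|\mathbf{a}\|^2 \left( \|\mathbf{b}\|^2 + (\mathbf{a} \cdot \mathbf{b})^2 \|\mathbf{a}\|^2 / \|\mathbf{a}\|^4 - 2 (\mathbf{a} \cdot \mathbf{b})^2 / \|\mathbf{a}\|^2 \right) \\
&= \|\mathbf{a}\|^2 \|\mathbf{b}\|^2 - (\mathbf{a} \cdot \mathbf{b})^2 \\
&= (a_1^2 + a_2^2 + a_3^2)(b_1^2 + b_2^2 + b_3^2) - (a_1 b_1 + a_2 b_2 + a_3 b_3)^2
\end{aligned}
\tag{2.4}
\]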

Expanding the expressions in (2.3) and (2.4) reveals that they are equal.
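
Before expanding by hand, it can be reassuring to compare (2.3) and (2.4) for one concrete pair of vectors. The sketch below uses the MATLAB commands cross and dot; the vectors and the names c, lhs and rhs are illustrative assumptions, and agreement for one example is of course not a proof.

```matlab
a = [1 2 3];
b = [4 5 6];

c = cross(a, b);                    % [-3 6 -3]

% Property 1: c is orthogonal to both a and b.
disp(dot(c, a))                     % 0
disp(dot(c, b))                     % 0

% Property 2 in squared form: compare (2.3) with (2.4).
lhs = dot(c, c);                            % squared length of a x b
rhs = dot(a, a)*dot(b, b) - dot(a, b)^2;    % squared area of the parallelogram
disp([lhs rhs])                             % both equal 54
```

Squared lengths are computed with dot rather than norm so that, for integer inputs, the two sides can be compared exactly.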





    Notice that there are exactly two vectors satisfying properties 1 and 2, that
is, perpendicular to the plane spanned by a and b and of a given length. The
cross product of a and b is the one that satisfies the right hand rule. We say
that vectors a, b and c (the order is important) satisfy the right hand rule if
you can point the index finger of your right hand in the direction of a and the
middle finger in the direction of b and the thumb in the direction of c. Try to
convince yourself that if a, b and c (in that order) satisfy the right hand rule,
then so do b, c, a and c, a, b.
    Here are some properties of the cross product that are useful in doing com-
putations. The first two are maybe not what you expect.
  1. a × b = −b × a
  2. a × (b × c) = (c · a)b − (b · a)c.
  3. s(a × b) = (sa) × b = a × (sb).
  4. a × (b + c) = a × b + a × c.
  5. a · (b × c) = (a × b) · c.
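
The five properties can be spot-checked in MATLAB for one choice of vectors. In the sketch below the vectors, the scalar s and the names checks, lhs and rhs are arbitrary illustrative choices. The last two lines show that moving the brackets in a repeated cross product changes the answer.

```matlab
a = [1 2 3];  b = [4 5 6];  c = [-1 0 2];
s = 2;

% Each entry is 1 (true) when the property holds for these integer values.
checks = [ isequal(cross(a, b), -cross(b, a)), ...                        % property 1
           isequal(cross(a, cross(b, c)), dot(c, a)*b - dot(b, a)*c), ... % property 2
           isequal(s*cross(a, b), cross(s*a, b)), ...                     % property 3
           isequal(s*cross(a, b), cross(a, s*b)), ...                     % property 3
           isequal(cross(a, b + c), cross(a, b) + cross(a, c)), ...       % property 4
           dot(a, cross(b, c)) == dot(cross(a, b), c) ];                  % property 5
disp(checks)

% The cross product is not associative in general.
lhs = cross(a, cross(b, c));    % [52 25 -34]
rhs = cross(cross(a, b), c);    % [12 9 6]
```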

Example 2.5 Let a = (1, 3, −2) and b = (−1, 2, 3). Compute the following:
 (a) The area of the parallelogram whose sides are a and b.
 (b) The angle between a and b.
   Solution:
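 (a) The area of the parallelogram is equal to the length of a × b:
     \[ a \times b = \det \begin{bmatrix} \hat{i} & \hat{j} & \hat{k} \\ 1 & 3 & -2 \\ -1 & 2 & 3 \end{bmatrix} = \hat{i}(9 + 4) + \hat{j}(2 - 3) + \hat{k}(2 + 3) = (13, -1, 5) \]
     so the area is
     \[ \|a \times b\| = \sqrt{13^2 + (-1)^2 + 5^2} = \sqrt{195} \approx 13.96 \]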

 (b) Note that the formula ‖a × b‖ = ‖a‖ ‖b‖ sin θ cannot be used for this
     question since it cannot distinguish between θ and π − θ (think about this
     point). Instead, use the dot product formula for cos θ, which should always be
     used to calculate angles between vectors unless you really know
     what you are doing:
     \[ \cos\theta = \frac{a \cdot b}{\|a\| \, \|b\|} = \frac{(1, 3, -2) \cdot (-1, 2, 3)}{\|(1, 3, -2)\| \, \|(-1, 2, 3)\|} = \frac{-1 + 6 - 6}{\sqrt{1 + 9 + 4} \, \sqrt{1 + 4 + 9}} = \frac{-1}{14} \]
     so
     \[ \theta = \cos^{-1}\left(\frac{-1}{14}\right) \approx 1.64 \text{ radians, or } \approx 94.10^\circ \]
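
The numbers in Example 2.5 can be checked in MATLAB. The following is only a sketch: the variable names are chosen here for illustration, and it uses the standard functions norm and acos, which may not have been introduced at this point in the notes.

```matlab
% Numerical check of Example 2.5 (illustrative sketch, names are our own)
a = [1 3 -2];
b = [-1 2 3];

n = cross(a, b);              % the cross product a x b
area = norm(n);               % area of the parallelogram = length of a x b

cosTheta = dot(a, b) / (norm(a) * norm(b));   % dot product formula for cos(theta)
theta = acos(cosTheta);       % angle in radians, always between 0 and pi
thetaDeg = theta * 180 / pi;  % the same angle in degrees

disp(n)
disp(area)
disp(thetaDeg)
```

Because acos always returns an angle between 0 and π, the dot product formula gives the angle directly, with no ambiguity between θ and π − θ.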






                       Figure 2.15: The Triple Product.


2.4.3    The triple product and the determinant in three dimensions
The cross product is defined so that the dot product of a with b × c is a
determinant:
\[ a \cdot (b \times c) = \det \begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix} \]
This determinant is called the triple product of a, b and c.
    Using this fact we can show that the absolute value of the triple product is
the volume of the parallelepiped spanned by a, b and c.
    A diagram of the parallelepiped is shown in Figure 2.15. The absolute value
of the triple product is
\[ |a \cdot (b \times c)| = \|a\| \, |\cos(\theta)| \, \|b \times c\|. \]
Here θ is the angle between a and b × c. The quantity ‖a‖ |cos(θ)| is the height
of the parallelepiped and ‖b × c‖ is the area of the base. The product of these is
the volume of the parallelepiped, as claimed. Thus
\[ \left| \det \begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix} \right| = \text{Volume of the parallelepiped spanned by } a, b \text{ and } c \]
    The sign of the triple product is positive if θ is between zero and π/2 and
negative if θ lies between π/2 and π. It is positive exactly when a is on the same
side of the plane spanned by b and c as b × c. This holds if the vectors b, c and a (in
that order) satisfy the right hand rule. Equivalently, a, b and c (in that order)
satisfy the right hand rule.
    Mathematically, it is more satisfactory to define the right hand rule using
the determinant. That is, we say that vectors a, b and c satisfy the right hand
rule if the determinant a · (b × c) is positive.
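
As a small illustration (a sketch with vectors chosen here for the purpose, not taken from the notes), the triple product can be computed in MATLAB both as a dot product with a cross product and as a determinant, and its sign can be used as the right hand rule test described above.

```matlab
% Triple product computed two ways (example vectors are our own choice)
a = [1 0 1];
b = [2 1 0];
c = [0 1 3];

triple = dot(a, cross(b, c));   % a . (b x c)
A = [a; b; c];                  % matrix with rows a, b and c
d = det(A);                     % should agree with triple up to rounding

volume = abs(triple);           % volume of the parallelepiped
rightHanded = triple > 0;       % true if a, b, c satisfy the right hand rule

disp(triple)
disp(d)
disp(volume)
disp(rightHanded)
```

For these vectors b × c = (3, −6, 2), so the triple product is 3 + 0 + 2 = 5: the volume is 5 and the vectors a, b, c (in that order) satisfy the right hand rule.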





2.4.4    MATLAB: assigning matrices and det and cross commands
cross The command cross(a,b) computes the cross product a × b. An error
     results if a or b are not vectors of length 3. As an example, the command

     cross([1 0 0],[0 1 0])

     gives the vector result [0 0 1].
matrices: The syntax to generate a matrix is shown below using a 2×2 example

     a = [1 2; 3 4]

     This command assigns a matrix to a that has the vector [1 2] in its first row
     and [3 4] in its second. Entries of a matrix can be accessed individually,
     for example a(1,2) is the entry in the first row, second column.
zeros: Many applications can lead to large matrices with many rows and columns.
     Even though MATLAB can do matrix computations, it can be tedious to
     enter these large matrices by hand. In some cases the matrices have mostly
     zeros as entries (these matrices are called sparse). In these cases it is more
     efficient to generate a matrix of all zeros and then modify the entries that
     are not zero. For example

     a = zeros(2,2);
     a(1,1) = 1;

     generates a 2 × 2 matrix with entries that are all zero except the upper
     left entry which is 1. Note that zeros(n,m) generates a matrix with n
     rows and m columns with all zero entries. So zeros(1,m) is a row vector
     of length m and zeros(n,1) is a column vector of length n with all zero
     entries.
rand: rand(n,m) generates a matrix with n rows and m columns with entries
     that are random numbers uniformly distributed in the interval (0,1).
det: The command det(a) returns the determinant of the matrix a. An error
     occurs if a is not a square (same number of rows and columns) matrix.
     Determinants of larger matrices (than 2 × 2 and 3 × 3 discussed in this
     section) are discussed in Chapter 4.

2.4.5    MATLAB: generating scripts with the MATLAB editor
Using the command window in MATLAB to solve a problem can often be
tedious, because if the need arises to redo the problem, or change a parameter,
one has to retype it all. The editor comes in handy for such cases. The editor
is a text window (accessed from the command window: File → New → Blank
M-file) where one can write commands in the same syntax as in the command
window. When one runs the script, the results appear in the command window
exactly as if one had typed the commands there one after the other.
    For example, the code to generate three random orthogonal vectors would
look something like this:
a1 = rand(3,1)
b = rand(3,1);
a2 = cross(a1,b)
a3 = cross(a1,a2)
dot(a1,a2)
dot(a1,a3)
dot(a2,a3)
Note that the last three lines are there to check that the three vectors are
mutually orthogonal. Once the code is written, save it from the editor window:
File → Save as, making sure that the name of the file has a “.m” extension
(and the file name should contain no spaces). There are several different ways
of running the script; the fastest one is to hit the F5 key. Alternatively, from the
editor window it can be run from Debug → Run, or directly from the command
window by typing the name of the script (without the “.m” extension) into the
MATLAB command line.

2.4.6     MATLAB: floating point representation of real numbers
MATLAB can represent integers exactly (up to limited but large size). Using a
“floating point representation”, MATLAB can represent most real numbers only
approximately (but quite accurately - to 16 digits or so). In certain cases, the
errors made in floating point approximation of numbers can be amplified and
lead to noticeable errors in computed results. This will not happen typically in
the examples and computer labs for Math 152, but the reader should be aware
of the possibility.

Example 2.6 Consider the vectors

\[ a = [1 \;\; 1 \;\; 1], \qquad b = [\sqrt{2} \;\; \sqrt{2} \;\; 0] \]

and c = a + b. If a 3 × 3 matrix A is made with rows a, b and c then the
determinant of A is zero (by construction, the vectors lie on the same plane).
If this computation is done in MATLAB,
a = [1 1 1];
b = [sqrt(2) sqrt(2) 0 ];
c = a + b;
A = [a; b; c];
det(A)





the result is 3.1402e-16, not zero, because of floating point approximation in
intermediate computations. You can type eps in MATLAB to see the maximum
relative error made by floating point approximation. On the computer used to
do the computation above, eps had a value of 2.2204e-16, so it is believable that
the calculation error in the determinant was made by the combination of a few
floating point approximations.

   Try to determine how many decimal digits of accuracy the floating point
representation in your calculator uses (typically, the accuracy is greater than
what is displayed).
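
One common way to cope with this in practice is to compare a computed quantity with zero only up to a small tolerance. The sketch below repeats the computation of Example 2.6; the tolerance 1e-12 and the variable names are assumptions made for illustration, not values prescribed by these notes.

```matlab
% Testing "is the determinant zero?" up to a tolerance (illustrative sketch)
a = [1 1 1];
b = [sqrt(2) sqrt(2) 0];
c = a + b;
A = [a; b; c];

d = det(A);       % exactly zero in exact arithmetic, tiny but nonzero in floating point
tol = 1e-12;      % hypothetical tolerance, chosen here only for illustration

if abs(d) < tol
    disp('determinant is zero to within the tolerance')
else
    disp('determinant is not zero')
end
```

A test such as d == 0 would report that the vectors do not lie on the same plane whenever d comes out as a tiny nonzero number, which is why the comparison uses abs(d) < tol instead. A suitable tolerance depends on the size of the entries of the matrix.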

2.4.7     Problems
Problem 2.19 Compute [1, 2, 3] × [4, 5, 6]

Problem 2.20 Use the definition to find the determinant of the matrix
\[ \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 1 & 0 & -1 \end{bmatrix}. \]

Do the computation by hand showing your work, but you can check your result
using MATLAB. From your result, decide if the vectors [1 1 1], [1 2 3] and [1
0 -1] lie in the same plane (justify your answer, very briefly).

Problem 2.21 Verify that a · (a × b) = 0 and b · (a × b) = 0.

Problem 2.22 Simplify each of the following expressions:
 (a) ((1, 4, −1) · (2, 1, 3))((2, 1, 4) × (1, 4, 9))
 (b) (7, 1, 0) · ((2, 0, −1) × (1, 4, 3))
 (c) (a × b) × (b × a)

Problem 2.23 Explain why ‖a‖ ‖b‖ sin(θ) is the area of the parallelogram spanned
by a and b. Here θ is the angle between the two vectors.

Problem 2.24 Find examples to show that in general a × b ≠ b × a and
a × (b × c) ≠ (a × b) × c.

Problem 2.25 (Matlab) The Matlab command a=rand(1,n) generates a 1×n
(row) vector with random entries. Write a script that generates three random vectors
and write what you obtain from a × b − b × a, and from a × (b × c) − (a × b) × c.
Does that constitute a proof?

Problem 2.26 Show that a × (b × c) = (a · c)b − (a · b)c.

Problem 2.27 (Matlab) Write a script that generates three random vectors and
checks that the result from problem 2.26 holds: a × (b × c) = (a · c)b − (a · b)c.





Problem 2.28 Derive an expression for (a × b) · (c × d) that involves dot
products but not cross products.

Problem 2.29 (a) Draw a sketch containing the vectors a, b and a×(a×b).
    Assume that a and b lie in the plane of the paper and have an acute angle
    between them.
 (b) Find a formula for a × (a × b) which involves only ‖a‖, b and proj_a b.
     Hint: use a property of the dot product.

Problem 2.30 What is the analog of the cross product in two dimensions?
How about four dimensions?


2.5      Lines and Planes
2.5.1     Describing linear sets
In the following sections we will consider points, lines, planes and space in two
and three dimensions. Each of these sets has two complementary (or dual)
descriptions. One is called the parametric form and the other the equation
form. Roughly speaking, the parametric form specifies the set using vectors that
are parallel to the set, while the equation form uses vectors that are orthogonal
to the set. Using the parametric description, it is easy to write down explicitly
all the elements of the set, but difficult to check whether a given point lies in
the set. Using the equation description it's the other way around: if someone
gives you a point, it is easy to check whether it lies in the set, but it is difficult
to write down explicitly even a single member of the set.
    One way of thinking about solving a system of linear equations is simply
going from one description to the other. This will (hopefully) become clear
later on.
    These sections will always follow the same pattern. We will consider the
parametric and equation descriptions, first in the special case when the set
passes through the origin. Then we consider the general case. Recall that when
we say “the point x” this means “the point at the head of the vector x whose tail
is at the origin.” In these sections we have used the notation [x1 , x2 ] instead
of [x, y] and [x1 , x2 , x3 ] instead of [x, y, z] for typical points in two and three
dimensions.

2.5.2     Lines in two dimensions: Parametric form
First we consider lines passing through the origin. Let a = [a1 , a2 ] be a vector in
the direction of the line. Then all the points x on the line are of the form

                                       x = sa

for some number s. The number s is called a parameter. Every value of s
corresponds to exactly one point (namely sa) on the line.






     Figure 2.16: A line in two dimensions. (Left: line through the origin. Right: line through q.)

   Now we consider the general case. Let q be a point on the line and a lie in
the direction of the line. Then the points on the line can be thought of as the
points on the line through the origin in the direction of a shifted or translated
by q. Then a point x lies on the line through q in the direction of a exactly
when
                                  x = q + sa
for some value of s.

2.5.3       Lines in two dimensions: Equation form
First we consider lines passing through the origin shown in Figure 2.16 (left).
Let b = [b1 , b2 ] be orthogonal to the direction of the line. The point x is on the line
exactly when x · b = 0. This can be written
                                        x1 b1 + x2 b2 = 0.
    Now we consider the general case shown in Figure 2.16 (right). Let q be a
point on the line and b = [b1 , b2 ] be orthogonal to the direction of the line. A point
x lies on the line through q in the direction of a exactly when x − q lies on the
line through the origin in the direction of a. Thus (x − q) · b = 0. This can be
written
                          (x1 − q1 )b1 + (x2 − q2 )b2 = 0
or
                                        x1 b1 + x2 b2 = c,
where c = q · b.
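
To make the passage from parametric form to equation form concrete, here is a short MATLAB sketch. The point q and direction a are example values chosen here, and the vector b is obtained by the usual trick of swapping the two components of a and changing one sign, which is one of many possible choices of a vector orthogonal to a.

```matlab
% From parametric form x = q + s*a to equation form x . b = c (example values)
q = [2 1];                 % a point on the line
a = [3 1];                 % direction of the line

b = [-a(2) a(1)];          % a vector orthogonal to a, here [-1 3]
c = dot(q, b);             % right-hand side of the equation, c = q . b

% Check: a point generated by the parametric form satisfies the equation
s = 2;
x = q + s*a;               % the point [8 3]
residual = dot(x, b) - c;  % zero if x lies on the line

disp(b)
disp(c)
disp(residual)
```

For these values the equation form is −x1 + 3x2 = 1. Any other value of s gives a point with the same zero residual, since a · b = 0.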

2.5.4       Lines in three dimensions: Parametric form
The parametric form of a line in three (or higher) dimensions looks just the
same as in two dimensions. The points x on the line are obtained by starting
at some point q on the line and then adding all multiples of a vector a pointing
in the direction of the line. So
                                           x = q + sa






                     Figure 2.17: A line in three dimensions.


The only difference is that now q and a are vectors in three dimensions.

2.5.5     Lines in three dimensions: Equation form
We begin with lines through the origin. A line through the origin can be de-
scribed as all vectors orthogonal to a plane. Choose two vectors b1 and b2 lying
in the plane that are not collinear (or zero). Then a vector is orthogonal to
the plane if and only if it is orthogonal to both b1 and b2 . Therefore the line
consists of all points x such that x · b1 = 0 and x · b2 = 0. If b1 = [b1,1 , b1,2 , b1,3 ]
and b2 = [b2,1 , b2,2 , b2,3 ] then these equations can be written

                       b1,1 x1   + b1,2 x2        + b1,3 x3   = 0
                       b2,1 x1   + b2,2 x2        + b2,3 x3   = 0

Notice that there are many possible choices for the vectors b1 and b2 . The
method of Gaussian elimination, studied later in this course, is a method of
replacing the vectors b1 and b2 with equivalent vectors in such a way that the
equations become easier to solve.
    Now consider a line passing through the point q and orthogonal to the
directions b1 and b2 as shown in Figure 2.17. A point x lies on this line
precisely when x − q lies on the line through the origin that is orthogonal to b1
and b2 . Thus (x − q) · b1 = 0 and (x − q) · b2 = 0. This can be written

            b1,1 (x1 − q1 ) + b1,2 (x2 − q2 ) + b1,3 (x3 − q3 ) =            0
            b2,1 (x1 − q1 ) + b2,2 (x2 − q2 ) + b2,3 (x3 − q3 ) =            0,
or
                       b1,1 x1   + b1,2 x2    + b1,3 x3       = c1
                       b2,1 x1   + b2,2 x2    + b2,3 x3       = c2
where c1 = q · b1 and c2 = q · b2
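
The two descriptions of a line in three dimensions can be related numerically. In the sketch below (example values chosen here), the direction a of the line is taken to be b1 × b2. This uses the fact from Section 2.4 that the cross product is orthogonal to both factors; it is one convenient choice and not the only way to find a direction.

```matlab
% Line in 3D: equation form x.b1 = c1, x.b2 = c2 versus parametric form x = q + s*a
q  = [1 2 3];              % a point on the line (example value)
b1 = [1 0 1];              % two non-collinear vectors orthogonal to the line
b2 = [0 1 1];

c1 = dot(q, b1);           % right-hand sides of the two equations
c2 = dot(q, b2);

a = cross(b1, b2);         % orthogonal to b1 and b2, so it points along the line

% Check that a point from the parametric form satisfies both equations
s = 3;
x = q + s*a;
r1 = dot(x, b1) - c1;      % both residuals should be zero
r2 = dot(x, b2) - c2;

disp(a)
disp([c1 c2])
disp([r1 r2])
```

Here a = (−1, −1, 1), c1 = 4 and c2 = 5, and the point x = (−2, −1, 6) satisfies both equations.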






2.5.6     Planes in three dimensions: Parametric form
We begin with planes through the origin. Since a plane is a two dimensional
object, we will need two parameters to describe points on the plane. Let a1 and
a2 be non-collinear vectors in the direction of the plane. Then every point on
the plane can be reached by adding some multiple of a1 to some other multiple
of a2 . In other words, points x on the plane are all points of the form

                                   x = sa1 + ta2

for some values of s and t.
    If the plane passes through some point q in the directions of a1 and a2 , then
we simply shift all the points on the parallel plane through the origin by q. So
x lies on the plane if
                                x = q + sa1 + ta2
for some values of s and t.

2.5.7     Planes in three dimensions: Equation form
A plane through the origin can be described as all vectors orthogonal to a given
vector b as shown in Figure 2.18. (In this situation, if b has unit length it is
called the normal vector to the plane and is often denoted n.) Therefore x lies
on the plane whenever x · b = 0, or

                        b1 x1   + b2 x2    + b3 x3    =   0.

If a plane with normal vector b is translated so that it passes through the point
q, then x lies on the plane whenever x − q lies on the parallel plane through
the origin. Thus x lies on the plane whenever (x − q) · b = 0. Equivalently

             b1 (x1 − q1 )   + b2 (x2 − q2 )   + b3 (x3 − q3 )    =   0,

or
                        b1 x1   + b2 x2    + b3 x3    = c,
where c = q · b.
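
As with lines, a short MATLAB sketch can connect the two forms of a plane. The values of q, a1 and a2 below are examples chosen here, and the vector b is computed as a1 × a2, which is orthogonal to both direction vectors.

```matlab
% Plane in 3D: parametric form x = q + s*a1 + t*a2 versus equation form x . b = c
q  = [1 0 2];              % a point on the plane (example value)
a1 = [1 1 0];              % two non-collinear directions in the plane
a2 = [0 1 1];

b = cross(a1, a2);         % orthogonal to the plane
c = dot(q, b);             % right-hand side, c = q . b

% Check that a point from the parametric form satisfies the equation
s = 2;
t = -1;
x = q + s*a1 + t*a2;
residual = dot(x, b) - c;  % zero if x lies on the plane

disp(b)
disp(c)
disp(residual)
```

For these values b = (1, −1, 1) and c = 3, so the equation form is x1 − x2 + x3 = 3.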

2.5.8     Problems
Problem 2.31 A line orthogonal to b can be described as the set of all points
x whose projections onto b all have the same value. Using the formula for
projections, show that this leads to the equation description of the line.

Problem 2.32 Find both the parametric form and equation form for the line
in Figure 2.19. Write down five points on the line (notice that the parametric
form is more useful for this). Check whether the point [1012/3, 1069/21] is on the line
(notice that the equation form is more useful for this).







                 Figure 2.18: A plane in three dimensions.




                  Figure 2.19: Diagram for problem 2.32 (the figure labels the points [0,1] and [7,5]).







Problem 2.33 Find the equation form for the line [1, 1] + s[−1, 2].

Problem 2.34 Find the parametric form for the line x1 − 3x2 = 5

Problem 2.35 Use a projection to find the distance from the point [−2, 3] to
the line 3x1 − 4x2 = −4

Problem 2.36 Consider the plane x − y + 2z = 7.
     1. What is the normal direction to the plane?
     2. Find the coordinates of any point (your choice) on the plane.

Problem 2.37 Let a, b and c be the vertices of a triangle. By definition, the
median of a triangle is a straight line that passes through a vertex of the triangle
and through the midpoint of the opposite side.
    (i) Find the parametric form of the equation for each median.
    (ii) Do all the medians meet at a common point? If so, which point?

Problem 2.38 Find a pair of equations which define the line

                          {(2, 0, −4) + s(0, 1, 3) : s ∈ R}

Problem 2.39 Find the intersection point between the line

                         {(2, −1, 6) + s(1, −1, 0) : s ∈ R}

and the plane
                        {t(0, 1, −3) + u(−1, 2, 0) : t, u ∈ R}
Hint: first find an equation for the plane.

Problem 2.40 Find the intersection point of the line with parametric form
below
                         (1, 2, 3) + t(1, 0, −1)
and the plane
                                  x + 2y − z = 5.

Problem 2.41 Find the equation of the plane containing the points [1, 0, 1],
[1, 1, 0] and [0, 1, 1].

Problem 2.42 Find the equation of the sphere which has the two planes x1 +
x2 + x3 = 3 and x1 + x2 + x3 = 9 as tangent planes if the centre of the sphere
is on the planes 2x1 − x2 = 0, 3x1 − x3 = 0.

Problem 2.43 The planes x + y + z = 2 and x − y + 2z = 7 intersect in a line.
Find a parametric representation of this line.





Problem 2.44 Find the equation of the plane that passes through the point
[−2, 0, 1] and through the line of intersection of 2x1 + 3x2 − x3 = 0, x1 − 4x2 +
2x3 = −5.

Problem 2.45 What’s wrong with the question “Find the equation for the plane
containing [1, 2, 3], [2, 3, 4] and [3, 4, 5].”?

Problem 2.46 Find the distance from the point p to the plane b · x = c.

Problem 2.47 Find the equation for the line through [2, −1, −1] and parallel to
each of the two planes x1 + x2 = 0 and x1 − x2 + 2x3 = 0. Express the equation
of the line in both parametric and equation form.

Problem 2.48 (Matlab) Plotting figures in Matlab is quite simple. Simply type
plot(1,2) in the command window and observe what you get. Copy the follow-
ing script (call it basicplot.m for example), and run it:

x = -2:0.1:2
m = 2
x_0 = 1
y_0 = 1
y = y_0 + m*(x-x_0)
plot(x,y,'.')

Type help plot in the command window to learn about different options for
the plot command. Without closing the figure window, type hold on in the
command window, and re-run the script after changing the slope to m = -1/2
(the hold on command allows you to overlap plots). Notice that the two lines
should be perpendicular, but because of the scaling they appear not to be. Type
in the command axis equal to fix that. How would you modify the script to
plot a circle of radius 3?


2.6     Introduction to Linear Systems
2.6.1    Description of points and the geometry of solutions
         to systems of equations
So far we have considered the parametric and equation descriptions of lines and
planes in two and three dimensions. We can also try to describe points in the
same way. This will help you get a geometric picture of what it means to solve
a system of equations.
    The “parametric” description of a point doesn’t have any parameters! It
simply is the name of the point x = q. In two dimensions the equation form for
describing a point will look like

                          b1,1 x1   + b1,2 x2    = c1
                          b2,1 x1   + b2,2 x2    = c2






   Figure 2.20: Intersection of lines in 2D that are not collinear is a point.


where the vectors b1 = [b1,1 , b1,2 ] and b2 = [b2,1 , b2,2 ] are not collinear. Each
one of these equations describes a line. The point x = [x1 , x2 ] will satisfy both
equations if it lies on both lines, i.e., on the intersection. Since the vectors b1
and b2 are not collinear, the lines are not parallel, so the intersection is a single
point. This situation is shown in Figure 2.20.
   In three dimensions the equation form for describing a point will look like
                     b1,1 x1   + b1,2 x2    + b1,3 x3   = c1
                     b2,1 x1   + b2,2 x2    + b2,3 x3   = c2
                     b3,1 x1   + b3,2 x2    + b3,3 x3   = c3
where b1 , b2 and b3 don’t all lie on the same plane. This can be interpreted as
the intersection of three planes in a single point.
   Notice that going from the equation description of a point to the parametric
description just means finding the solution of the system of equations. If, in two
dimensions, the vectors b1 and b2 are not collinear, or in three dimensions, b1 ,
b2 and b3 don’t all lie on the same plane, then the system of equations has a
unique solution.
   Now suppose that you are handed an arbitrary system of equations
                     b1,1 x1   + b1,2 x2    + b1,3 x3   = c1
                     b2,1 x1   + b2,2 x2    + b2,3 x3   = c2
                     b3,1 x1   + b3,2 x2    + b3,3 x3   = c3
What does the set of solutions x = [x1 , x2 , x3 ] look like? As we just have seen,
if b1 , b2 and b3 don’t all lie on the same plane, there is a unique solution given
as the intersection of three planes. Recall that the determinant can be used
to test whether the vectors b1 , b2 and b3 lie on the same plane. So a unique
solution to the system of equations exists precisely when
\[ \det \begin{bmatrix} b_{1,1} & b_{1,2} & b_{1,3} \\ b_{2,1} & b_{2,2} & b_{2,3} \\ b_{3,1} & b_{3,2} & b_{3,3} \end{bmatrix} \neq 0. \]
What happens when the determinant is zero and the three vectors b1 , b2 and b3 do
lie on the same plane? Then it could be that the three planes intersect in a line.
In this case every point on that line is a solution of the system of equations,
and the solution set has a parametric description of the form x = q + sa. It
could also be that all three planes are the same, in which case the solution set
is the plane. In this case the solution set has a parametric description of the
form x = q + s1 a1 + s2 a2 . Another possibility is that two of the planes could be
parallel with no intersection. In this case there are no solutions at all! Some of
these possibilities are illustrated in Figure 2.21.

                       Figure 2.21: Planes intersecting.
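
The determinant test can be tried out in MATLAB. The sketch below uses a small example system chosen here and solves it with the backslash operator, MATLAB's built-in linear solver. That operator has not been introduced at this point in the notes, so its use, like the tolerance 1e-12, should be read as an assumption of this illustration.

```matlab
% Does the system B*x = c have a unique solution? (example system, our own values)
B = [1 1 1;
     0 1 2;
     1 0 1];               % rows are the vectors b1, b2 and b3
c = [6; 8; 4];             % right-hand sides c1, c2, c3 as a column vector

d = det(B);                % nonzero exactly when b1, b2, b3 do not lie on one plane

if abs(d) > 1e-12          % compare with a small tolerance rather than with 0
    x = B \ c;             % the unique intersection point of the three planes
    disp(x')
else
    disp('no unique solution: the planes meet in a line, a plane, or not at all')
end
```

For this system the determinant is 2, so there is a unique solution, which is x = [1, 2, 3].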

2.6.2    Describing the whole plane in two dimensions and
         all of space in three dimensions
If the set we are trying to describe is the whole plane in two dimensions or all
of space in three dimensions, then we don’t need any equations, since there are
no restrictions on the points. However it does make sense to think about the
parametric form.
     Let's start with two dimensions. Consider Figure 2.22. If we pick any two
vectors a1 and a2 that don’t lie on the same line, then any vector x = [x1 , x2 ]
in the plane can be written as s1 a1 + s2 a2 . Notice that every choice of s1
and s2 corresponds to exactly one vector x. In this situation we could use the
parameters s1 and s2 as co-ordinates instead of x1 and x2 . In fact if a1 and
a2 are unit vectors orthogonal to each other, this just amounts to changing the
co-ordinate axes to lie along a1 and a2 . (The new co-ordinates [s1 , s2 ] are then
just what we were calling [x1 , x2 ] before.) (In fact, even if the vectors a1 and
a2 are not unit vectors orthogonal to each other, we can still think of them as
lying along new co-ordinate axes. However, now the axes have been stretched
and sheared instead of just rotated, and need not lie at right angles any more.)
     The situation in three dimensions is similar. Now we must pick three vectors
a1 , a2 and a3 that don’t lie on the same plane. Then every vector x has a unique
representation x = s1 a1 + s2 a2 + s3 a3 . Again, we could use s1 , s2 and s3 as
co-ordinates in place of x1 , x2 and x3 . Again, if a1 , a2 and a3 are orthogonal
with unit length, then this amounts to choosing new (orthogonal) co-ordinate
axes.







                              Figure 2.22: A basis in 2D


2.6.3      Linear dependence and independence
The condition in two dimensions that two vectors are not collinear, and the
condition in three dimensions that three vectors do not lie on the same plane,
have now come up several times: in ensuring that a system of equations has a
unique solution and in ensuring that every vector can be written in a unique
way in parametric form using those vectors. This condition can be tested by
computing a determinant.
    We will now give this condition a name and define the analogous condition
in any number of dimensions.
    First, some terminology. If a1 , a2 , . . . , an is a collection of vectors then a
vector of the form
                            s1 a1 + s2 a2 + · · · + sn an
for some choice of numbers s1 , . . . , sn is called a linear combination of a1 , a2 , . . . , an .
    Now, the definition. A collection of vectors a1 , a2 , . . . , an is called linearly
dependent if some linear combination of them equals zero, i.e.,

                               s1 a1 + s2 a2 + · · · + sn an = 0

for s1 , . . . , sn not all zero. A collection of vectors is said to be linearly independent
if it is not linearly dependent. In other words, the vectors a1 , a2 , . . . , an are
linearly independent if the only way a linear combination of them s1 a1 + s2 a2 +
· · · + sn an can equal zero is for s1 = s2 = · · · = sn = 0.
      What does linear dependence mean in three dimensions? Suppose that a1 ,
a2 and a3 are linearly dependent. Then there are some numbers s1 , s2 and s3 ,
not all zero, such that
                                  s1 a1 + s2 a2 + s3 a3 = 0.
Suppose that s1 is one of the non-zero numbers. Then we can divide by −s1
and find that
                           −a1 + s2′ a2 + s3′ a3 = 0
for s2′ = −s2 /s1 and s3′ = −s3 /s1 . Thus

                                    a1 = s2′ a2 + s3′ a3 ,

or a1 is a linear combination of a2 and a3 . But this implies that a1 lies on
the plane spanned by a2 and a3 , i.e., the vectors all lie on the same plane. If
s1 happens to be zero we can repeat the same argument with one of the si ’s
which is not zero. Thus linear dependence implies that all three vectors lie on
the same plane. Conversely, if all three vectors lie on the same plane, then we
can write one vector as a linear combination of the other two, say a1 = s2 a2 + s3 a3 ,
which implies −a1 + s2 a2 + s3 a3 = 0. The coefficient of a1 is −1, so the
coefficients are not all zero, which says that the vectors are linearly
dependent.
     So in three dimensions, linear dependence means the vectors lie on the same
plane. Similarly, in two dimensions, linear dependence means the vectors are
co-linear.
     One final piece of terminology. A collection of n linearly independent vectors
a1 , a2 , . . . an in n dimensional space is called a basis. If a1 , a2 , . . . an is a basis,
then every vector x can be written in a unique way as a linear combination

                               x = s1 a1 + s2 a2 + · · · sn an
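
Finding the numbers s1 , . . . , sn for a given x amounts to solving a linear system whose coefficient matrix has the basis vectors as its columns. The MATLAB sketch below does this for an example basis of two dimensional space chosen here. It relies on the backslash operator, which is not introduced until later in the course, so treat it as an assumption of this illustration.

```matlab
% Coordinates of x with respect to a basis a1, a2 (example values, our own choice)
a1 = [1; 2];
a2 = [3; 1];
x  = [5; 5];

A = [a1 a2];           % matrix whose columns are the basis vectors
d = det(A);            % nonzero, so a1 and a2 are linearly independent

s = A \ x;             % solves s(1)*a1 + s(2)*a2 = x

reconstructed = s(1)*a1 + s(2)*a2;   % should reproduce x

disp(d)
disp(s')
disp(reconstructed')
```

Here det(A) = −5, so the vectors form a basis, and the unique representation is x = 2a1 + a2 .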

2.6.4      Problems
Problem 2.49 Is the collection of vectors a1 = [1, 1], a2 = [1, 0] a basis for two
dimensional space? If so, express the vector x = [0, 1] as a linear combination
of a1 and a2

Problem 2.50 Let a = [2, 2, 2] and b = [3, 4, 1]. Find all vectors c such that
the list of vectors a, b, c is not a basis of R3 .

Problem 2.51 Show that the collection of vectors a = [1, 1, 1], b = [1, 1, 0] and
c = [1, 0, 0] is a basis of R3 . Express [1, 2, 3] as a linear combination of the
vectors a, b and c.

Problem 2.52 Let a = [1, 0, 4], b = [2, −1, 0] and c = [8, −3, 8]. Do
these vectors form a basis of R3 ?

Problem 2.53 Is it possible for four vectors to be linearly independent in three
dimensional space?

Problem 2.54 Show that the collection of vectors a = [2, 1, 3], b = [1, 0, 2] and
c = [3, 0, 0] is a basis of R3 . Express [12, 2, 4] as a linear combination of the
vectors a, b and c.

Problem 2.55 Suppose that a1 , a2 , . . . an is a basis. Show that if some vector
x has representation x = s1 a1 + s2 a2 + · · · sn an and x = t1 a1 + t2 a2 + · · · tn an ,
then s1 = t1 , s2 = t2 ,. . .,sn = tn . (Hint: subtract the two expressions for x and
use the fact that the basis vectors are linearly independent.)

Problem 2.56 (Matlab) Start by convincing yourself that the vectors (1, 3),
and (5, 2) are linearly independent. This can easily be shown by writing a script
that plots the vectors:


                        Figure 2.23: Rotational Motion.



plot([1,0],[3,0])
hold on
plot([5,0],[2,0])

We can extend this script by also plotting a linear combination of the two vectors:

alfa=1;
beta=1;
clf()
plot([alfa*1,0],[alfa*3,0])
hold on
plot([beta*5,0],[beta*2,0])
plot([alfa*1+beta*5,alfa*1],[alfa*3+beta*2,alfa*3],'--')

Play around with the script and find parameters alfa and beta so that the
vector (1, 1) can be written as a linear combination of (1, 3) and (5, 2).


2.7     Additional Topics
These topics are not covered in Math 152 this year.

2.7.1    Application: rotational motion
Consider a rigid body rotating about an axis given by the unit vector a at a
rate of Ω radians per second. Let r be the position vector of a point on the
body as shown in Figure 2.23.
   What is the velocity of the point? The point travels on a circle with radius
‖r‖ sin(θ), where θ is the angle that r makes with the axis. Therefore, in one
second, the point travels a distance of Ω‖r‖ sin(θ). Thus

  (i) the magnitude of the velocity is ‖v‖ = Ω‖r‖ sin(θ).


 (ii) Now notice that v is orthogonal to the plane spanned by a and r.

(iii) Finally notice that Ωa, r and v obey the right hand rule.

   The facts (i), (ii) and (iii) imply that v is exactly the cross product of Ωa
and r. It is customary to let Ω denote the vector Ωa. Then

                                    v = Ω × r.
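
The formula can be checked numerically. The following MATLAB sketch uses illustrative values for the axis a, the rate Omega and the position r (all assumptions made for this example) and confirms facts (i) and (ii) for the computed velocity.

```matlab
% Illustrative values (assumptions for this sketch)
a     = [0; 0; 1];      % unit vector along the rotation axis
Omega = 2;              % rotation rate in radians per second
r     = [3; 0; 4];      % position vector of a point on the body

v = cross(Omega*a, r);  % velocity, v = Omega x r

% Fact (i): the speed equals Omega*||r||*sin(theta)
theta       = acos(dot(a, r) / norm(r));     % angle between r and the axis
speed_check = Omega * norm(r) * sin(theta);
disp([norm(v), speed_check])

% Fact (ii): v is orthogonal to both a and r, so both entries should be zero
disp([dot(v, a), dot(v, r)])
```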

Problem 2.57 A body rotates at an angular velocity of 10 rad/sec about the
axis through the points [1, 1, −1] and [2, −3, 1]. Find the velocity of the point
[1, 2, 3] on the body.

Problem 2.58 The line L passing through the origin and the point [1,1,1]
passes through the midpoint of a thin metal rod of length 2 that is oriented
in the direction [1,0,0]. The rod begins rotating about L at 3 revolutions per
minute. What is the fastest speed (length of velocity vector) of any point on the
rod?

Problem 2.59 Imagine a plate that lies in the xy–plane and is rotating about
the z–axis. Let P be a point that is painted on this plane. Denote by r the
distance from P to the origin, by θ(t) the angle at time t between the line from
the origin to P and the x–axis and by [x(t), y(t)] the co-ordinates of P at the
time t. Find x(t) and y(t) in terms of θ(t). Compute the velocity of P in two
ways: 1. by differentiating [x(t), y(t)] and 2. by computing Ω × r.

2.7.2     Application: 3-D graphics
How can we represent a three dimensional object on a piece of paper or computer
screen? Imagine the object in space and, a certain distance away, a point p
representing the eye of the observer. Between the observer and the object is a
plane called the view plane. The position of the origin of this plane is described
by a point q, and its orientation is given by three mutually orthogonal unit
vectors, denoted e1 , e2 and e3 . (These are not the same as the standard basis
vectors i, j and k in this problem.) This situation is shown in Figure 2.24. As
usual, only the heads of the vectors (points) p, x, y and q are shown on the
diagram. (The origin, where the tails of these vectors lie, is not depicted at all.)
We will assume that the view plane is a distance one from the observer in the
direction e3 . Thus, e3 can be thought of as the direction that the observer is
looking. Think of light rays leaving the object at point x and travelling to the
observer’s eye at p. At some point y this line intersects the view plane. All the
vectors y on the view plane that correspond to some vector x on our object will
furnish the two dimensional representation of the object.
            Figure 2.24: Perspective in 3-D graphics application.

    How do we determine the point y? The parametric form of points on the
plane is q + s1 e1 + s2 e2 . So we must have that y = q + s1 e1 + s2 e2 for some
values of s1 and s2 . We also know that the vector x − p is in the same direction
as y − p. Therefore they must be multiples, i.e., y − p = λ(x − p) for some
number λ. Substituting in our expression for y yields

                        q + s1 e1 + s2 e2 − p = λ(x − p).

Since q − p = e3 this gives

                         e3 + s1 e1 + s2 e2 = λ(x − p).

Let us take the dot product of both sides of this equation with the unit vectors
e3 , e1 and e2 . We can use the fact that ei · ej is zero if i ≠ j and 1 if i = j.
Start with e3 . This gives

$$e_3 \cdot e_3 = 1 = \lambda\, e_3 \cdot (x - p).$$

This determines λ:

$$\lambda = \frac{1}{e_3 \cdot (x - p)}.$$

Now take the dot product with e1 . This gives

$$s_1 = \lambda\, e_1 \cdot (x - p) = \frac{e_1 \cdot (x - p)}{e_3 \cdot (x - p)}.$$

Similarly, taking the dot product with e2 leads to

$$s_2 = \frac{e_2 \cdot (x - p)}{e_3 \cdot (x - p)}.$$

    To plot the image of an object, we now simply plot the co-ordinates s1 and
s2 corresponding to all the points on the object on the s1 –s2 plane.

Example 2.7 Take p = [11, 0, 0], q = [10, 0, 0], e1 = [0, 1, 0], e2 = [0, 0, 1]
and e3 = [−1, 0, 0]. What is the image of the point x = [1, 1, 1]? We compute
x − p = [−10, 1, 1] so that

                                 e1 · (x − p)       =     1
                                 e2 · (x − p)       =     1
                                 e3 · (x − p)       =     10

So s1 = s2 = 1/10.
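
The same computation can be done in MATLAB. This short sketch uses the data of Example 2.7 and the formulas for s1 and s2 derived above; the variable names are our own choice.

```matlab
% Data from Example 2.7
p  = [11, 0, 0];
e1 = [0, 1, 0];
e2 = [0, 0, 1];
e3 = [-1, 0, 0];
x  = [1, 1, 1];

d  = x - p;                      % vector from the eye to the point
s1 = dot(e1, d) / dot(e3, d);    % first view plane co-ordinate
s2 = dot(e2, d) / dot(e3, d);    % second view plane co-ordinate
disp([s1, s2])
```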

Example 2.8 Continue the previous example and compute the image of a line
segment between [1, 1, 1] and [2, 0, 1]. These are all points of the form x =
[1, 1, 1] + t([2, 0, 1] − [1, 1, 1]) = [1 + t, 1 − t, 1] as t varies between 0 and 1. This
time we have x − p = [1 + t − 11, 1 − t, 1] so that

                               e1 · (x − p)     =       1−t
                               e2 · (x − p)     =       1
                               e3 · (x − p)     =       10 − t

Thus s1 = (1−t)/(10−t) and s2 = 1/(10−t). Even though it is not immediately
obvious, the points [s1 , s2 ], as t varies, all lie on a line segment. In fact

        s1 + 9s2 = (1 − t)/(10 − t) + 9/(10 − t) = (10 − t)/(10 − t) = 1.

This shows that the points [s1 , s2 ] lie on a line perpendicular to [1, 9].

    In fact, it is possible to show that any line segment in space maps to a line
segment on the s1–s2 plane. Thus, to plot the image of an object consisting
of straight line segments (such as the tetrahedron in the picture) it is only
necessary to plot the vertices and then join them by straight lines.
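
As an illustration, the following MATLAB sketch projects sample points of the segment from Example 2.8 and checks numerically that their images satisfy s1 + 9 s2 = 1. It is only a sketch: the number of sample points and the variable names are arbitrary choices.

```matlab
% View data from Example 2.7
p  = [11, 0, 0];
e1 = [0, 1, 0];
e2 = [0, 0, 1];
e3 = [-1, 0, 0];

t = linspace(0, 1, 11)';              % parameter values along the segment
X = [1 + t, 1 - t, ones(size(t))];    % points [1+t, 1-t, 1], one per row
D = X - repmat(p, length(t), 1);      % rows are x - p

s1 = (D * e1') ./ (D * e3');          % image co-ordinates of each point
s2 = (D * e2') ./ (D * e3');

max(abs(s1 + 9*s2 - 1))               % should be zero up to rounding error
plot(s1, s2, 'o-')                    % the image points lie on one line
```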

Problem 2.60 What are the s1 and s2 co-ordinates of the point x = [1, 2, 3],
if p, q are as above, e1 = [0, 1/√2, 1/√2] and e2 = [0, −1/√2, 1/√2]?

Problem 2.61 Plot the image on the s1 –s2 plane of the tetrahedron whose
vertices are located at [0, 0, 0], [0, 1, 0], [0, 1/2, √3/2] and [√3/6, √6/3, 1/2]. (Use
the same values as before: p = [11, 0, 0], q = [10, 0, 0], e1 = [0, 1, 0], e2 =
[0, 0, 1] and e3 = [−1, 0, 0].)

Problem 2.62 Suppose that the points x lie on the line x = x0 + tv. Show
that the corresponding planar points [s1 , s2 ] also lie on a line. (Hint: show that
there are numbers a, b, c that do not depend on t, so that as1 + bs2 = c for
every t.)

Problem 2.63 Consider a different drawing procedure where the point x maps
to the point on the view plane given by the intersection of the plane with the line
through x parallel to e3 . Find a formula for the s1 and s2 co-ordinates of x.


                     Figure 2.25: Solution to problem 2.1

2.8     Solutions to Chapter Problems
Solution 2.1 See Figure 2.25.
Solution 2.2
  (i) A straight line passing through the origin in the direction of a.
 (ii) A ray (half line) passing through the origin in the direction of a.
(iii) A straight line parallel to a passing through b.
(iv) If a and b do not lie on the same line: a plane passing through the origin
     parallel to a and b (in two dimensions, this is the whole plane). If a and
     b lie on the same line: a straight line passing through the origin in the
     direction of a (and b).
 (v) If a and b both lie on the same line: a line passing through c parallel to a
     (and b). If a and b do not lie on the same line: a plane passing through
     c parallel to a and b (in two dimensions, this is the whole plane).

Solution 2.3 a − b is the vector that when added to b gives a. So if we draw
it with its tail at b then its head is at a. Similarly b − a when drawn with its
tail at a has its head at b.
Solution 2.4 To find the midpoint we add half of the vector with its tail at a
and head at b to a. So the midpoint is a+(1/2)(b−a) = (1/2)(a+b). Similarly,
the point one third of the way from a to b is a + (1/3)(b − a) = (2/3)a + (1/3)b.


Solution 2.5 The line segment is given by the vectors a + t(b − a) as t varies
between 0 and 1.
Solution 2.6 a = (1, 2);          b = (1, −2)

 (a) a + b = (2, 0)

 (b) 2a = (2, 4)

 (c) a − b = (0, 4)

 (d) a · b = 1 − 4 = −3.
 (e) ‖b‖ = √(1 + 4) = √5.

Solution 2.7 The radius of the circle is

$$\|(2, 5) - (3, 3)\| = \|(-1, 2)\| = \sqrt{1 + 4} = \sqrt{5}.$$

Thus using the standard equation for a circle

$$(x - 2)^2 + (y - 5)^2 = (\sqrt{5})^2 = 5.$$

Solution 2.8 The equation is ‖x − a‖ = r, or

$$\sqrt{(x_1 - a_1)^2 + (x_2 - a_2)^2 + (x_3 - a_3)^2} = r$$

or

$$(x_1 - a_1)^2 + (x_2 - a_2)^2 + (x_3 - a_3)^2 = r^2.$$

Solution 2.9 The centre of the sphere is the midpoint of [2, 1, 4] and [4, 3, 10],
i.e., (1/2)([2, 1, 4] + [4, 3, 10]) = [3, 2, 7]. The radius is half the distance between
[2, 1, 4] and [4, 3, 10], i.e.,

$$\frac{1}{2}\sqrt{(2 - 4)^2 + (1 - 3)^2 + (4 - 10)^2} = \sqrt{11}.$$

Thus the equation is

$$(x_1 - 3)^2 + (x_2 - 2)^2 + (x_3 - 7)^2 = 11.$$

Solution 2.10
 (a) a · b = −2 + 6 = 4, θ = arccos(4/(√5 √13)) = 1.05 . . .

 (b) a · b = 1, θ = 1.249 . . .

 (c) a · b = 4, θ = 0

 (d) a · b = 2, θ = 1.079 . . .

 (e) a · b = 0, θ = π/2

Solution 2.11 Use the dot product to determine angles between vectors,
not the cross product (which only applies in 3D and cannot distinguish angles
θ in the range (0, π/2) from those in (π/2, π), as discussed in the solution to
problem 2.10 in the online notes).

 (a) a = (1, 1, 1), b = (3, 1, −2), so a · b = 3 + 1 − 2 = 2 and

     $$\|a\| = \sqrt{3}, \qquad \|b\| = \sqrt{3^2 + 1^2 + (-2)^2} = \sqrt{14},$$

     $$\cos\theta = \frac{2}{\sqrt{3}\sqrt{14}}, \qquad \theta = \cos^{-1}\left(\frac{2}{\sqrt{3}\sqrt{14}}\right) \approx 1.26 \text{ radians, or } 72.0^\circ.$$

 (b) Using a · b and ‖a‖ above,

     $$\mathrm{proj}_a b = \frac{a \cdot b}{\|a\|^2}\, a = \frac{2}{3}(1, 1, 1) = \left(\frac{2}{3}, \frac{2}{3}, \frac{2}{3}\right).$$

     Remember, proj_a b should be in the direction of a.

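Solution 2.12 Here a = (1, 4, 0) with ‖a‖ = √17, and b = (2, −1, 5) with
‖b‖ = √(4 + 1 + 25) = √30.

 (a) $$\cos\theta = \frac{b \cdot a}{\|a\|\,\|b\|} = \frac{2 - 4}{\sqrt{17}\sqrt{30}} = \frac{-2}{\sqrt{510}},$$

     so

     $$\theta = \cos^{-1}\left(\frac{-2}{\sqrt{510}}\right) \approx 1.66 \text{ radians, or } \approx 95^\circ.$$

 (b) $$\mathrm{proj}_a b = \frac{b \cdot a}{\|a\|^2}\, a = \frac{-2}{17}(1, 4, 0) = (-2/17, -8/17, 0).$$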
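
These values can be checked with a few lines of MATLAB. The sketch below uses the vectors of this solution together with the built-in functions dot, norm and acos; the variable names are our own.

```matlab
a = [1, 4, 0];
b = [2, -1, 5];

theta  = acos(dot(a, b) / (norm(a) * norm(b)))   % angle in radians
projab = (dot(a, b) / norm(a)^2) * a             % projection of b onto a
```

Multiplying theta by 180/pi converts the angle to degrees.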

Solution 2.13 The dot product is zero if −1 + 2 + s = 0, i.e., if s = −1.
Solution 2.14 Let a = [1, 2, 3], b = [4, 0, 5] and c = [3, 4, 6]. The sides of the
triangle are in the directions of a − b = [−3, 2, −2], a − c = [−2, −2, −3], and
b − c = [1, −4, −1]. To see if there are any right angles we compute (a − b) · (a −
c) = [−3, 2, −2]·[−2, −2, −3] = 8, (a−b)·(b−c) = [−3, 2, −2]·[1, −4, −1] = −9
and (a − c) · (b − c) = [−2, −2, −3] · [1, −4, −1] = 9. Since none of these are
zero, there are no right angles.
Solution 2.15 If [c1 , 1, c2 ] = s[2, −2, 3] for some scalar s, then s = −1/2
(from the values in the second component), hence
c1 = −1,     c2 = −3/2.
Solution 2.16 The problem becomes much simpler if we choose the y axis
to run along the runway and the x axis to run perpendicular to the runway.
Replace the wind and plane velocities by vectors w and p as on the diagram in
Figure 2.26.
    The wind vector has components [−10 cos(20◦ ), −10 sin(20◦ )] while the plane
velocity has components [70 sin(θ), 70 cos(θ)]. We want the x component to be
zero hence we need 10 cos(20◦ ) = 70 sin(θ). This gives θ = 7.7◦ , i.e., to get the
heading of the plane we have to add 7.7◦ to the runway direction. Thus the
plane’s heading is 267.7◦ . The groundspeed is the magnitude of the velocity, in
this case simply the y component given by 70 cos(7.7◦ ) − 10 sin(20◦ ) = 66 knots.

               Figure 2.26: Coordinate system for Solution 2.16.

Solution 2.17 If we use the co-ordinate system where the x axis is horizontal
and the y axis is vertical, then the force of gravity is Fg = [0, −mg]. The unit
vector in the direction along the shaft is p = [− sin(θ), cos(θ)]. The force Fs
exerted by the shaft is in the direction of the shaft, hence a multiple of p.
So Fs = tp for some number t. To find t we use that the component of the
total force in the direction of the shaft must be zero. Thus projp Fg + Fs = 0,
i.e. p · Fg + t = 0, i.e., t = mg cos(θ). Thus the total force is Fg + Fs =
[0, −mg] + mg cos(θ)[− sin(θ), cos(θ)] = mg[− cos(θ) sin(θ), −1 + cos2 (θ)]. Note
that this is orthogonal to p, as it must be.
    If we use co-ordinates where the y axis runs along the shaft of the pendulum,
and the x axis perpendicular to it, then Fg = [−mg sin(θ), −mg cos(θ)] while
Fs = [0, t] for some value of t. In this case it is simply the y component that
must be zero, so t = mg cos(θ) and Fg + Fs = [−mg sin(θ), 0].
Solution 2.18

sqrt(a(1)^2+a(2)^2)

Solution 2.19 [1, 2, 3] × [4, 5, 6] = [−3, 6, −3]
Solution 2.20

$$\det\begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 1 & 0 & -1 \end{bmatrix} = 1(-2 - 0) - 1(-1 - 3) + 1(0 - 2) = -2 + 4 - 2 = 0$$

Since the determinant is zero, the volume of the parallelepiped generated by the
row vectors is zero. This implies that the vectors lie on the same plane.
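
The determinant can also be evaluated in MATLAB. This is a small sketch using the built-in functions det and rank; the rank check is an extra cross-check added here, not part of the solution above.

```matlab
A = [1 1 1; 1 2 3; 1 0 -1];   % rows are the three vectors
d = det(A)                    % zero up to rounding error
r = rank(A)                   % a rank below 3 also signals coplanar rows
```

Because det is computed in floating point, the result may be a tiny nonzero number rather than exactly zero.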


Solution 2.21 a · (a × b) = [a1 , a2 , a3 ] · [a2 b3 − a3 b2 , a3 b1 − a1 b3 , a1 b2 − a2 b1 ] =
a1 a2 b3 − a1 a3 b2 + a2 a3 b1 − a2 a1 b3 + a3 a1 b2 − a3 a2 b1 = 0. Now b · (a × b) =
−b · (b × a) = 0 by the previous calculation.
Solution 2.22

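  (a) ((1, 4, −1) · (2, 1, 3))((2, 1, 4) × (1, 4, 9))

      $$\begin{aligned}
      &= (2 + 4 - 3)\det\begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 2 & 1 & 4 \\ 1 & 4 & 9 \end{bmatrix} \\
      &= 3\left(\hat{\imath}(9 - 16) + \hat{\jmath}(4 - 18) + \hat{k}(8 - 1)\right) \\
      &= 3(-7, -14, 7) = (-21, -42, 21)
      \end{aligned}$$

  (b) (7, 1, 0) · ((2, 0, −1) × (1, 4, 3))

      $$\begin{aligned}
      &= (7, 1, 0) \cdot \det\begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 2 & 0 & -1 \\ 1 & 4 & 3 \end{bmatrix} \\
      &= (7, 1, 0) \cdot \left(\hat{\imath}(0 + 4) + \hat{\jmath}(-1 - 6) + \hat{k}(8 - 0)\right) \\
      &= (7, 1, 0) \cdot (4, -7, 8) = 28 - 7 + 0 = 21
      \end{aligned}$$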

  (c) (a × b) × (b × a)

                      = (a × b) × (−a × b) = −(a × b) × (a × b) = 0

      because c × c = 0 for any vector c.

Solution 2.23 Here we are assuming that θ lies in [0, π] so that sin(θ) is positive.
The quantity ‖a‖ is the length of the base of the parallelogram while ‖b‖ sin(θ)
is the height. The area of a parallelogram is the product of these.
Solution 2.24 Since a × b = −b × a it is never true that a × b = b × a,
unless a × b = 0. For the other example, just try i, j and k. We have i × j = k
so i × (i × j) = i × k = −j. On the other hand (i × i) × j = 0 × j = 0.
Solution 2.25

a=rand(3,1);
b=rand(3,1);
c=rand(3,1);
cross(a,b) - cross(b,a)
cross(a,cross(b,c)) - cross(cross(a,b),c)

This actually constitutes a proof, since a single counterexample is enough. We
do not have to show that the identities always fail; to show that they do not
hold in general, one choice of vectors with nonzero output above suffices.


Solution 2.26 This is just a long calculation. One way to break it up is to
consider a = i, j and k separately. Suppose we can prove it for these special
cases. Then write a general a as a1 i + a2 j + a3 k. Then a × (b × c) = a1 i ×
(b × c) + a2 j × (b × c) + a3 k × (b × c). Assuming for the moment we know
that the special cases hold, this equals a1 ((i · c)b − (i · b)c) + a2 ((j ·
c)b − (j · b)c) + a3 ((k · c)b − (k · b)c) = (a · c)b − (a · b)c. Now we still
have to prove the special cases. For example, if a = i we have

$$i \times (b \times c) = \det\begin{bmatrix} i & j & k \\ 1 & 0 & 0 \\ b_2 c_3 - b_3 c_2 & b_3 c_1 - b_1 c_3 & b_1 c_2 - b_2 c_1 \end{bmatrix} = -(b_1 c_2 - b_2 c_1)\, j + (b_3 c_1 - b_1 c_3)\, k = [0, -b_1 c_2 + b_2 c_1, b_3 c_1 - b_1 c_3].$$

On the other hand, (i · c)b − (i · b)c = c1 [b1 , b2 , b3 ] −
b1 [c1 , c2 , c3 ] = [0, c1 b2 − b1 c2 , c1 b3 − b1 c3 ]. The other two are similar.
Solution 2.27

a=rand(3,1);
b=rand(3,1);
c=rand(3,1);
cross(a,cross(b,c)) - (dot(a,c)*b - dot(a,b)*c)

Solution 2.28 (a×b)·(c×d) = ((a×b)×c)·d (by property 5) = −(c×(a×b))·d
(by property 1) = −((c · b)a − (c · a)b) · d = −(c · b)(a · d) + (c · a)(b · d).
Solution 2.29

  (a) Let v = a × (a × b). Since a × b is orthogonal to the paper, v must lie in
      the plane of the paper. v is also orthogonal to a. Checking orientations
      with the right hand rule, there are only two sketches (up to rotation and
      resizing the vectors) shown in Figure 2.27.

 (b) a × (a × b) = (b · a)a − (a · a)b by property 2 in the notes, so

     $$a \times (a \times b) = \|a\|^2\left(\frac{b \cdot a}{\|a\|^2}\, a - b\right) = \|a\|^2(\mathrm{proj}_a b - b).$$


Solution 2.30 Actually there is no one right answer to this problem. But in
two dimensions the analog of the cross product could be an operation on a single
vector, say ×(a), with the property that

$$\times(a) \cdot b = \det\begin{bmatrix} a_1 & a_2 \\ b_1 & b_2 \end{bmatrix} = a_1 b_2 - a_2 b_1.$$

This implies that ×(a) = [−a2 , a1 ]. Note that ×(a) is orthogonal to a. The answer in 4
dimensions is a little more obscure. In this case the cross product could be an
operation on three vectors that produces a vector that is orthogonal to all of
them. One way of producing such a vector would be to do the formal calculation
of a 4 × 4 determinant, where the first row contains the four unit vectors e1 ,
e2 , e3 and e4 , while the other three rows contain the entries of a,b and c. Do
you see why? (There is another possible, perhaps better answer, involving a
different type of product called the wedge product.)


              Figure 2.27: Vector sketch for the solution to 2.29.

Solution 2.31 Let q be a point on the line. Then the projection of x in the
direction of b is x · b/‖b‖². The set of all points whose projection in the direction of
b is the same as the projection of the point q in the direction of b is therefore
the set of points x such that x · b/‖b‖² = q · b/‖b‖². Multiplying both sides by ‖b‖²
gives x · b = q · b.
Solution 2.32 The line is in the direction of a = [7, 5] − [0, 1] = [7, 4] and
is therefore perpendicular to b = [4, −7]. q = [0, 1] is a point on the line.
So the parametric form for the line is [0, 1] + s[7, 4] and the equation form
is 4x1 − 7x2 = −7. So five points on the line are (picking s = −2, −1, 0, 1, 2)
[−14, −7], [−7, −3], [0, 1], [7, 5], [14, 9]. To check whether [1012/3, 1069/21] lies
on the line, plug it into the equation: 4 · 1012/3 − 7 · 1069/21 = 993 ≠ −7, so the point
does not lie on the line.
Solution 2.33 We can take q = [1, 1] and b = [2, 1] so the equation is 2x1 +x2 =
3.
Solution 2.34 We have to find a point on the line. Try for one of the form
[t, 0]. This is on the line if t − 3 · 0 = 5. So [5, 0] is on the line. The vector
orthogonal to the line is [1, −3] so a vector parallel is [3, 1]. Thus the parametric
form is [5, 0] + s[3, 1].
Solution 2.35 Pick a point q on the line and let b be orthogonal to the line.
Then the distance from a point x to the line is the length of the projection of
x − q onto b as shown in Figure 2.28. In this case we can take b = [3, −4],
q = [0, 1] and x = [−2, 3]. The projection is ((x − q) · b/‖b‖²)b. This equals
([−2, 2] · [3, −4]/25)[3, −4] = (−14/25)[3, −4]. The length of this vector is 14/5.
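
The distance computation above can be reproduced in MATLAB. This sketch uses the values b, q and x from this solution; the variable names proj and dist are our own.

```matlab
b = [3, -4];     % vector orthogonal to the line
q = [0, 1];      % point on the line
x = [-2, 3];     % point whose distance to the line we want

proj = (dot(x - q, b) / norm(b)^2) * b;   % projection of x - q onto b
dist = norm(proj)                         % distance from x to the line
```
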
Solution 2.36
 (a) the normal direction to the plane is (1, −1, 2)
 (b) the point (7, 0, 0) is on the plane (look for a solution with y = z = 0 that
     is the intersection of the plane with the line of the x-axis).

              Figure 2.28: The projection used in Solution 2.35.

Solution 2.37 The midpoint between b and c is (b + c)/2. So the vector from
a to this midpoint is (b + c)/2 − a and thus the parametric form for the median
is a + t((b + c)/2 − a). Similarly the parametric forms for the other medians are
b + s((a + c)/2 − b) and c + r((a + b)/2 − c). When t = s = r = 2/3 these lines
meet at the common point (a + b + c)/3.
Solution 2.38 The line is in direction (0, 1, 3). Two vectors which are orthog-
onal to the line are (1, 0, 0) and (0, 3, −1), because (1, 0, 0) · (0, 1, 3) = 0 and
(0, 3, −1) · (0, 1, 3) = 0. Note that there are other choices for these two vectors
that lead to different correct answers below. There will be two equations of the
form
                                       x = c1
                                  3y − z     = c2
Substituting the point (2, 0, 4) that we know is on the line gives c1 = 2 and
c2 = 3(0) − 4 = −4. Thus an equation form of the line is
                                       x =       2
                                  3y − z     =   −4

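Solution 2.39 To find an equation for the plane we need an orthogonal vector

$$(0, 1, -3) \times (-1, 2, 0) = \det\begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 0 & 1 & -3 \\ -1 & 2 & 0 \end{bmatrix} = \hat{\imath}(0 + 6) + \hat{\jmath}(3 - 0) + \hat{k}(0 + 1) = (6, 3, 1).$$
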
So an equation for the plane is 6x + 3y + z = c and we know that c = 0 since
the plane goes through the origin. Any point on the line has the form
                   (2, −1, 6) + s(1, −1, 0) = (2 + s, −1 − s, 6).
Substituting this into the equation for the plane gives
                        6(2 + s) + 3(−1 − s) + 6 = 0
                                         3s + 15 = 0
                                               s = −5
So the intersection point is
          (2, −1, 6) − 5(1, −1, 0) = (2, −1, 6) + (−5, 5, 0) = (−3, 4, 6).
Note: it can be checked that t = −2 and u = 3 in the parametric form of the
plane gives this point as well.
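
The whole calculation can be checked in MATLAB. The sketch below assumes, as the note above suggests, that the plane is given parametrically as t(0, 1, −3) + u(−1, 2, 0); the variable names are our own.

```matlab
n = cross([0, 1, -3], [-1, 2, 0]);   % normal to the plane through the origin
q = [2, -1, 6];                      % point on the line
d = [1, -1, 0];                      % direction of the line

s  = -dot(n, q) / dot(n, d);         % solves n . (q + s*d) = 0
pt = q + s*d                         % intersection point

% Recover t and u from pt = t*[0,1,-3] + u*[-1,2,0] (assumed parametric form)
tu = [0, 1, -3; -1, 2, 0]' \ pt'
```
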
Solution 2.40 Points on the line satisfy
                                     x =      1+t
                                     y =      2
                                     z   =    3 − t.
    To be on the plane, x+2y −z = 5, or (1+t)+2(2)−(3−t) = 5, or 2t+2 = 5,
or t = 3/2.
    Thus the intersection occurs at (1, 2, 3) + (3/2)(1, 0, −1) = (5/2, 2, 3/2).

Solution 2.41 Two vectors in the direction of the plane are [1, 1, 0] − [1, 0, 1] =
[0, 1, −1] and [0, 1, 1] − [1, 0, 1] = [−1, 1, 0]. To find a normal vector take the
cross product [0, 1, −1] × [−1, 1, 0] = [1, 1, 1]. So the equation of the plane is
x1 + x2 + x3 = 2.
Solution 2.42 The centre of the sphere lies in the plane halfway between the
planes x1 + x2 + x3 = 3 and x1 + x2 + x3 = 9. This is the plane x1 + x2 + x3 = 6.
So the centre satisfies the three equations
                                x1   + x2   + x3   = 6
                               2x1   − x2          = 0
                               3x1          − x3   = 0
We will develop efficient techniques for solving such systems of equations. For
now, we can just use brute force: The second equation says x2 = 2x1 and the
third x3 = 3x1 . Substituting this into the first equation yields x1 = 1, so x2 = 2
and x3 = 3. Therefore the centre of the sphere is [1, 2, 3]. To compute the
radius, we must find the distance between the planes x1 + x2 + x3 = 3 and
x1 + x2 + x3 = 6. If we divide the equations by ‖[1, 1, 1]‖ = √3, the first can be
interpreted as the vectors whose projection onto the direction [1, 1, 1] has length
3/√3 = √3, and the second as the vectors whose projection onto the
direction [1, 1, 1] has length 6/√3 = 2√3. The radius is therefore 2√3 − √3 = √3, so the equation
is (x1 − 1)² + (x2 − 2)² + (x3 − 3)² = 3.
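
The brute-force step can also be handed to MATLAB. In the sketch below the three equations are written as a matrix system using the relations x2 = 2x1 and x3 = 3x1 stated above; the variable names are our own, and the backslash operator is used here only as a convenient solver.

```matlab
% Rows encode x1 + x2 + x3 = 6, x2 = 2*x1 and x3 = 3*x1
A   = [1 1 1; 2 -1 0; 3 0 -1];
rhs = [6; 0; 0];
centre = A \ rhs               % centre of the sphere

% Distance between the planes x1+x2+x3 = 3 and x1+x2+x3 = 6
n = [1, 1, 1];
radius = (6 - 3) / norm(n)
```
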
Solution 2.43 The two planes and their normals are
x + y + z = 2,       normal b1 = (1, 1, 1)
x − y + 2z = 7,      normal b2 = (1, −1, 2)
   The line direction can be found as

$$a = b_1 \times b_2 = \det\begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 1 & 1 & 1 \\ 1 & -1 & 2 \end{bmatrix} = (3, -1, -2).$$

   We also need a point on the line. We can look for a point which has z = 0
(the point that is the intersection of the line and the x-y plane). We get the
two by two system

                                x + y = 2,        x − y = 7.

     Solving the system yields x = 9/2 and y = −5/2.
     So (9/2, −5/2, 0) + s(3, −1, −2) is a parametric form of the line with parameter
s.
Solution 2.44 Set x2 = 0 and solve 2x1 − x3 = 0 and x1 + 2x3 = −5 giving
x1 = −1 and x3 = −2. This implies [−1, 0, −2] is in the plane we are looking for.
Set x3 = 0 and solve 2x1 + 3x2 = 0 and x1 − 4x2 = −5 giving x1 = −15/11 and x2 =
10/11. Hence [−15/11, 10/11, 0] also lies on the plane. We now have three points
on the plane. Two vectors in the direction of the plane are therefore [−2, 0, 1] −
[−1, 0, −2] = [−1, 0, 3] and [−15/11, 10/11, 0] − [−1, 0, −2] = [−4/11, 10/11, 2].
Thus a normal vector is [−1, 0, 3] × [−4/11, 10/11, 2] = [−30/11, 10/11, −10/11].
This is parallel to [−3, 1, −1]. Hence the equation is −3x1 + x2 − x3 = 5.
Solution 2.45 The three points are all on the same line. To see this, notice
that the vectors [2, 3, 4] − [1, 2, 3] = [1, 1, 1] and [3, 4, 5] − [1, 2, 3] = [2, 2, 2] are
parallel.
Solution 2.46 The distance of p to the plane b · x = c is the length of the
projection of p − q onto b, where q is any point on the plane. A point on the
plane is [c/b1 , 0, 0] (unless b1 = 0 in which case we choose either [0, c/b2 , 0] or
[0, 0, c/b3 ]). The length of the projection is |(p − q) · b|/‖b‖ = |b · p − c|/‖b‖.
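
As a numerical illustration of this formula, the following MATLAB sketch computes the distance in two ways. The plane b · x = c and the point p are example values chosen here, not data from the problem.

```matlab
% Illustrative plane b . x = c and point p (values assumed for this sketch)
b = [1, 2, 2];
c = 3;
p = [4, 5, 6];

dist = abs(dot(b, p) - c) / norm(b)      % distance from p to the plane

% Cross-check using an explicit point q on the plane, as in the solution
q = [c / b(1), 0, 0];
dist_check = abs(dot(p - q, b)) / norm(b)
```
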
Solution 2.47 If the line is parallel to both planes, then it is orthogonal to both
normal vectors. Therefore [1, 1, 0] × [1, −1, 2] = [2, −2, −2] is in the direction of
the line. Therefore the parametric form of the line is [2, −1, −1] + s[2, −2, −2].
The two equations are x1 + x2 = [1, 1, 0] · [2, −1, −1] = 1 and x1 − x2 + 2x3 =
[1, −1, 2] · [2, −1, −1] = 1.
Solution 2.48

x = -3:0.1:3;           % x values across the circle of radius 3
y = sqrt(9-x.^2);       % upper half of the circle
plot(x,y,'.')
hold on
plot(x,-y,'.')          % lower half of the circle
axis equal

Solution 2.49 Yes, [1, 1] and [1, 0] do form a basis, since they don’t lie on the
same line. (By the way, there is some potential for confusion here. When I
say here that [1, 1] and [1, 0] don’t lie on the same line, I’m thinking of them as
position vectors, drawn with their tails at the origin. It probably would be more
clear to say that the vectors are not parallel.) Every vector in the plane can be
written as a linear combination of these two. In particular [0, 1] = [1, 1] − [1, 0].


Solution 2.50 Note that a and b are linearly independent (they are not multi-
ples of one another). So a, b and c will be linearly independent and so a basis
for R3 unless c is a linear combination of a and b, that is it can be written as

                                    c = ra + sb

for some r and s.
Solution 2.51 The collection is a basis if the vectors are linearly independent.
We consider ra + sb + tc = 0 which leads to

                                 r+s+t = 0
                                   r+s = 0
                                           r    =    0

Starting from the last equation and working up, we see that r = 0, s = 0, and
t = 0 which shows that the vectors are linearly independent and therefore form
a basis. The linear independence could also have been shown by considering the
determinant
\[
\begin{vmatrix} 1 & 1 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 0 \end{vmatrix} = -1 \neq 0.
\]
Since the determinant is not zero the vectors do not lie on the same plane and
so are linearly independent. To write [1, 2, 3] as a linear combination

                    [1, 2, 3] = r[1, 1, 1] + s[1, 1, 0] + t[1, 0, 0]

match components working backwards to get successively r = 3, s = −1 and
t = −1.
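
The same computation can be checked numerically. Here is a minimal MATLAB sketch, assuming we place the three vectors as the columns of a matrix (the names a, b, c, M, d and coeffs are chosen here for illustration). The backslash operator solves the linear system, and the determinant should come out non-zero.

```matlab
% Columns of M are the three candidate basis vectors
a = [1; 1; 1];  b = [1; 1; 0];  c = [1; 0; 0];
M = [a, b, c];

d = det(M);               % non-zero determinant means linearly independent
coeffs = M \ [1; 2; 3];   % solves r*a + s*b + t*c = [1; 2; 3] for [r; s; t]

fprintf('det = %g\n', d);
fprintf('r = %g, s = %g, t = %g\n', coeffs);
```

The three numbers printed on the last line should agree with the values of r, s and t found by matching components.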
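Solution 2.52 The vectors a, b, and c are vectors in 3-D. Thus they form a
basis if and only if they are linearly independent.
   The three vectors are linearly independent if and only if the matrix A with
rows given by the three vectors satisfies

                                    det(A) ≠ 0.

Here
\[
A = \begin{bmatrix} 1 & 0 & 4 \\ 2 & -1 & 0 \\ 8 & -3 & 8 \end{bmatrix},
\]
and expanding along the first row gives det(A) = 1(−8 − 0) − 0 + 4(−6 + 8) = 0.
So the vectors are linearly dependent and do not form a basis.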
Solution 2.53 No, it is not possible for four vectors to be linearly independent
in three-dimensional space. To see this, first notice that if four vectors are
linearly independent, any subset of three vectors must also be linearly independent.
(If three of the vectors lay in the same plane, we could find a non-trivial linear
combination of those three equal to zero. Then by adding 0 times the left over
vector, we would get a non-trivial linear combination of all four vectors equal





to zero, contradicting their independence.) But this means that those three
vectors form a basis for three dimensional space. So the fourth vector must
be expressible as a linear combination of the first three. This means the four
vectors are not independent.
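Solution 2.54 It is enough to check that det(A) ≠ 0. We have that
\[
A = \begin{bmatrix} 2 & 1 & 3 \\ 1 & 0 & 2 \\ 3 & 0 & 0 \end{bmatrix}, \qquad \det(A) = 6 \neq 0,
\]
so they form a basis. We want to find s1 , s2 , and s3 such that
\[
\begin{bmatrix} 12 \\ 2 \\ 4 \end{bmatrix}
= s_1 \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix}
+ s_2 \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix}
+ s_3 \begin{bmatrix} 3 \\ 0 \\ 0 \end{bmatrix}.
\]
We have the following system of equations:
\[
\begin{aligned}
12 &= 2s_1 + s_2 + 3s_3 \\
2 &= s_1 \\
4 &= 3s_1 + 2s_2
\end{aligned}
\]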

Hence s1 = 2, s2 = −1, s3 = 3.
Solution 2.55 If x = s1 a1 + s2 a2 + · · · + sn an and x = t1 a1 + t2 a2 + · · · + tn an ,
then 0 = x − x = (s1 − t1 )a1 + (s2 − t2 )a2 + · · · + (sn − tn )an . Since the vectors
a1 , a2 , . . . , an are linearly independent, the only way this can happen is s1 − t1 =
0, . . . , sn − tn = 0, or s1 = t1 , . . . , sn = tn .
Solution 2.56 The values alpha = 0.23, beta = 0.15 provide a good ap-
proximation to the linear combination.
Solution 2.57 The axis is parallel to [2, −3, 1] − [1, 1, −1] = [1, −4, 2]. The unit
vector in this direction is (1/√21)[1, −4, 2]. (Actually there is an ambiguity in the
problem: the unit vector could also point in the opposite direction. This would
change the sign of the answer.) So Ω = (10/√21)[1, −4, 2]. The vector r has
its tail coinciding with the tail of Ω and its head at [1, 2, 3]. So r = [1, 2, 3] −
[1, 1, −1] = [0, 1, 4] and the velocity is v = Ω × [0, 1, 4] = (10/√21)[−18, −4, 1].
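
The cross product here is easy to get wrong by hand, so a numerical check may be useful. The following MATLAB sketch assumes, as the solution does, that the angular speed is 10 and that the axis passes through [1, 1, −1] and [2, −3, 1]. The variable names are chosen here for illustration.

```matlab
p1 = [1, 1, -1];  p2 = [2, -3, 1];   % two points on the axis of rotation
d = p2 - p1;                         % direction of the axis
Omega = 10 * d / norm(d);            % angular velocity vector of length 10
r = [1, 2, 3] - p1;                  % from a point on the axis to the point of interest
v = cross(Omega, r);                 % velocity v = Omega x r

% Rescale so the result can be compared with the bracketed vector in the solution
disp(v * sqrt(21) / 10)
```

Choosing the opposite direction for the axis (replacing d by -d) only flips the sign of v, which is the ambiguity mentioned in the solution.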
Solution 2.58 Some preliminary discussion:
  (i) it does not matter where the centre of the rod is along L, we can take it
      to be at the origin for convenience.
 (ii) the fastest speed is at the rod tip, and this does not change in time, so
(iii) it is OK to evaluate the speed at the initial position.
   Ω = 3 · 2π = 6π radians/min.
   â = (1/√3)(1, 1, 1) is the unit vector in the direction of the axis of rotation.
   v = Ωâ × r = (6π/√3)(1, 1, 1) × (1, 0, 0), with r the initial position of the tip.
   Now evaluate the cross product:
\[
\begin{vmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 1 & 1 & 1 \\ 1 & 0 & 0 \end{vmatrix} = (0, 1, -1),
\]
   so v = (6π/√3)(0, 1, −1), and hence the speed is ||v|| = 2π√6 m/min.

               Figure 2.29: Mapped tetrahedron of Solution 2.61.

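Solution 2.59 We have [x(t), y(t), 0] = r[cos(θ(t)), sin(θ(t)), 0] so the velocity
is
\[
\begin{aligned}
\left[\dot{x}(t),\ \dot{y}(t),\ 0\right] &= r\left[-\sin(\theta(t))\,\dot{\theta}(t),\ \cos(\theta(t))\,\dot{\theta}(t),\ 0\right] \\
&= r\,\dot{\theta}(t)\left[-\sin(\theta(t)),\ \cos(\theta(t)),\ 0\right].
\end{aligned}
\]
On the other hand Ω = [0, 0, θ̇(t)] so that Ω × r[cos(θ(t)), sin(θ(t)), 0] gives the
same answer.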
Solution 2.60 We have q − p = [20, 0, 0], x − p = [11, 3, 2]. Thus e1 · (q − p) =
e2 · (q − p) = 0, e3 · (q − p) = −20, e1 · (x − p) = 5√2/2, e2 · (x − p) = √2/2,
e3 · (x − p) = −11. So s1 = 50√2/11 and s2 = 10√2/11.
Solution 2.61 The images of the vertices are [0, 0], [2, 0], [1, √3] and some more
horrible expression which is approximately [1.587 . . . , 0.9719 . . .]. So the image
on the screen looks something like what is shown in Figure 2.29. Notice that
the first three points, [0, 0], [2, 0] and [1, √3], are simply double the x2 and x3
co-ordinates of the original points in space. Can you explain this geometrically?
(I know these are supposed to be answers, not questions, but still . . .) (By the
way, I actually intended the last point in the question to be [√6/3, 1/2, √3/6].
Can you say why?)
Solution 2.62 We can write
\[
[s_1, s_2] = \frac{1}{e_3 \cdot (x_0 - p) + t\, e_3 \cdot v}\,
\left[\, e_1 \cdot (x_0 - p) + t\, e_1 \cdot v,\ \ e_2 \cdot (x_0 - p) + t\, e_2 \cdot v \,\right]
\]




This means that as1 + bs2 = c can be rewritten

a(e1 · (x0 − p) + te1 · v) + b(e2 · (x0 − p) + te2 · v) − c(e3 · (x0 − p) + te3 · v) = 0

This holds for every t if both these equations hold:

          a(e1 · (x0 − p))    +b(e2 · (x0 − p)) −c(e3 · (x0 − p)) = 0
              ae1 · v             +be2 · v          −ce3 · v      =0

This is a system of two equations in three unknowns, a, b, and c. Such a system
always has a non-trivial solution (as we will see in the following chapter).
Solution 2.63 In this case we want y − x to point in the same direction as
e3 so y − x = λe3 . On the other hand y lies on the plane of the screen, so
y = q + s1 e1 + s2 e2 . Therefore

                             q − x + s1 e1 + s2 e2 = λe3 .

Now to find s1 and s2 take dot products with e1 and e2 . This yields s1 =
−e1 · (q − x) and s2 = −e2 · (q − x).




Chapter 3

Solving Linear Systems

3.1     Linear Systems
3.1.1    General Form of Linear Systems
So far, we have seen systems of linear equations as the equations that describe
points, lines and planes. However, linear systems of equations show up in many
other ways in engineering problems. We will solve linear systems to find the
behaviour of an electrical circuit. Other examples would be the calculation of
equilibrium temperature distributions or electric fields. Such examples often
involve the discretization of a continuous function. In other words, a continuous
function like the temperature distribution in a body (which has a value for
each of the infinitely many points in the body) will be replaced by a list of
temperatures at a large but finite number n of closely spaced points. This gives
rise to a system of linear equations in n unknowns, where n can be in the tens
of thousands, or higher. Therefore, we want to develop a technique to solve
systems of linear equations in n unknowns when n is large.
    The most general form of a linear system of equations is

\[
\begin{array}{ccccccccc}
a_{1,1}x_1 & + & a_{1,2}x_2 & + & \cdots & + & a_{1,n}x_n & = & c_1 \\
a_{2,1}x_1 & + & a_{2,2}x_2 & + & \cdots & + & a_{2,n}x_n & = & c_2 \\
\vdots & & \vdots & & & & \vdots & & \vdots \\
a_{m,1}x_1 & + & a_{m,2}x_2 & + & \cdots & + & a_{m,n}x_n & = & c_m
\end{array}
\]

Here the numbers ai,j and ci are known, and the goal is to find all values of
x1 , . . . , xn that satisfy all the equations.

3.1.2    Solving Linear Systems by Substitution
Let us start with an example, which we will solve using the method of substitu-
tion.






Example 3.1 Consider the system of equations

                         x1   +     x2   +     x3   =     6
                         x1   −     x2   +     x3   =     0
                        2x1   +     x2   −    8x3   =    −11

One could try to proceed as follows. Solve the first equation for, say, x3 . This
gives
                              x3 = 6 − x1 − x2 .
Now substitute this value for x3 into the second and third equations. This gives

                  x1    − x2      + (6 − x1 − x2 ) =   0
                  2x1   + x2      − 8(6 − x1 − x2 ) = −11
or

                                     −2x2     = −6
                              10x1 + 9x2      =     37

Now solve the first of these equations for x2 and substitute into the last equation.
This gives x2 = 3 and x1 = 1. Finally we can go back and calculate x3 =
6 − 1 − 3 = 2.
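
It is worth checking a hand computation like this one. Here is a minimal MATLAB sketch that substitutes the values found above back into the three equations. The names A, c, x and residual are chosen here for illustration; A holds the coefficients and c the right sides.

```matlab
% Coefficients and right-hand sides of the system in Example 3.1
A = [1  1  1;
     1 -1  1;
     2  1 -8];
c = [6; 0; -11];

x = [1; 3; 2];          % values of x1, x2, x3 found by substitution
residual = A*x - c;     % left side minus right side, one entry per equation
disp(residual')
```

If every entry of the residual is zero, the three equations are all satisfied.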

    Although this procedure works fine for n = 2 or even n = 3, it rapidly be-
comes unwieldy for larger values of n. We will now introduce a technique called
Gaussian elimination that works well for large n and can be easily implemented
on a computer.
    We have already observed that there may be many systems of equations
with the same solution. When there are only two unknowns, this amounts to
saying that different pairs of lines may intersect in the same point. Gaussian
elimination is based on the following idea. We introduce three elementary row
operations. These operations change the system of equations into another
system with exactly the same set of solutions. We then apply these elementary
row operations in a systematic way to change the system of equations into
a system that is easily solved.

3.1.3    Elementary row (equation) operations
The first elementary row operation is
1. Multiplication of a row (equation) by a non-zero number
   For example, if we multiply the first equation in the system of Example 3.1
above by 3, we end up with

                        3x1   +   3x2    + 3x3       = 18
                        x1    −   x2     + x3        =  0
                        2x1   +   x2     − 8x3       = −11





This new system of equations has exactly the same solutions as the original
system, because we can undo the elementary row operation simply by dividing
the first equation by 3. Thus the values x1 , x2 , x3 solve this system if and only
if they solve the original system. (Notice that this would not be true if we
multiplied by zero. In that case we could not undo the operation, and the new
system of equations could well have more solutions than the original system.)
Any row operation we perform can be undone by another row operation, so the set
of solutions of the linear system remains unchanged.
    The second elementary row operation is
2. Adding a multiple of one row (equation) to another row
   For example, if we added 2 times the first row to the second row in our
example we would obtain the system
                         x1   + x2       + x3    =  6
                        3x1   + x2       + 3x3   = 12
                        2x1   + x2       − 8x3   = −11
Again, the new system of equations has exactly the same solutions as the original
system, since we could undo this elementary row operation by subtracting 2
times the first row from the second row.
   The third and final elementary row operation is
3. Interchanging two rows (equations)
   For example, if we swapped the first and second equations in our original
system we would end up with
                         x1   − x2       + x3    =  0
                         x1   + x2       + x3    =  6
                        2x1   + x2       − 8x3   = −11
This obviously doesn’t change the solutions of the system since we have the
same equalities.
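
The three operations can be tried out numerically. In the MATLAB sketch below each equation of Example 3.1 is stored as one row of numbers (the three coefficients followed by the right side); this storage convention and the names A, B1, B2, B3 and res are choices made here for illustration. After each operation we check that x1 = 1, x2 = 3, x3 = 2 still satisfies every equation.

```matlab
A = [1  1  1   6;
     1 -1  1   0;
     2  1 -8 -11];      % one row per equation: coefficients, then right side
x = [1; 3; 2];          % solution of the original system

B1 = A;  B1(1,:) = 3*B1(1,:);             % 1. multiply row 1 by 3
B2 = A;  B2(2,:) = B2(2,:) + 2*B2(1,:);   % 2. add 2 times row 1 to row 2
B3 = A;  B3([1 2],:) = B3([2 1],:);       % 3. interchange rows 1 and 2

% Left side minus right side for each equation stored in M
res = @(M) M(:,1:3)*x - M(:,4);
disp([res(B1), res(B2), res(B3)])         % one column of residuals per operation
```

All residuals should be zero. This only confirms that one particular solution survives each operation; the argument in the text (each operation can be undone) is what shows that no solutions are gained or lost.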

3.1.4       Augmented Matrices
To save unnecessary writing, we now set up a streamlined notation for systems
of linear equations. Notice that the only things that distinguish one system of
equations from another are the coefficients and the right sides. So, as shorthand,
we can write a system of equations such as
                         x1   + x2       + x3    = 3
                         x1   − x2       + x3    = 3
                        2x1   + x2       − 8x3   = −4
simply as
\[
\left[\begin{array}{rrr|r}
1 & 1 & 1 & 3 \\
1 & -1 & 1 & 3 \\
2 & 1 & -8 & -4
\end{array}\right]
\]




This is called an augmented matrix. “Augmented” refers to the column to
the right of the line that contains the information about the right side of each
equation.
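
In MATLAB an augmented matrix is just an ordinary matrix with one extra column. The sketch below builds one from a coefficient matrix and a column of right sides, and then splits it apart again. The names A, c, Aug, A2 and c2 are chosen here for illustration, and MATLAB does not display the vertical line.

```matlab
A = [1  1  1;
     1 -1  1;
     2  1 -8];            % coefficients
c = [3; 3; -4];           % right sides

Aug = [A, c];             % augmented matrix: right sides appended as last column
disp(Aug)

A2 = Aug(:, 1:end-1);     % recover the coefficient part
c2 = Aug(:, end);         % recover the augmented column
```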

3.1.5    Problems
Problem 3.1 Express the system

                        x1   − 2x2       + 3x3          =     6
                       4x1   − 5x2       − 6x3          =     7
                       8x1   + 9x2       + 10x3         =    11

as an augmented matrix.

Problem 3.2 Start with the system

                        x1   +      x2   +    x3    =        6
                        x1   −      x2   +    x3    =        0
                       2x1   +      x2   −    8x3   =       −11

and perform the following sequence of row operations:
1. Subtract the first row from the second row
2. Subtract twice the first row from the third row
3. Multiply the second row by −1/2
4. Add the second row to the third row
5. Multiply the third row by −1/10
   Solve the resulting system of equations by starting with the third equation,
   then the second and then the first.

Problem 3.3 (MATLAB) Let’s try to solve Problem 3.2 using MATLAB.

1. Start by generating an augmented matrix to represent the system of equations:

   A=[1, 1, 1, 6; 1, -1, 1, 0; 2, 1, -8, -11]

2. Subtract the first row from the second row:

    A(2,:)=A(2,:)-A(1,:)

3. Subtract twice the first row from the third row:

    A(3,:)=A(3,:)-2*A(1,:)

4. Multiply the second row by −1/2:





    A(2,:)=-A(2,:)/2

5. etc.
What other commands are necessary to arrive at the solution? Does it coincide
with the paper and pen solution?
Problem 3.4 Start with the system
                         x1   −      x2    + x3   = 10
                         x1   −      x2    − x3   = 6
                        6x1   +     3x2    + x3   = 0
Perform the following sequence of row operations
1. Subtract the second row from the first row
2. Add the first row to the second row
3. Subtract the first row from the third row
4. Multiply the second row by 3
5. Add the second row to the third row
Show that the resulting system of equations can be easily solved and find the
solution of the above system of equations.
Problem 3.5 Without using a calculator or computer, find the solution to
the system
                       2x1 + x2 =             5
                       3x1 + 5x2 = −10
You can leave the solutions as fractions.


3.2       Gaussian Elimination
Recall that we want to use a sequence of elementary row operations to turn an
arbitrary system of equations into an easily solved system of equations (with
exactly the same solutions). What equations are easily solved? Well, the easiest
possible equations to solve are ones of the form below, written as an augmented
matrix:
\[
\left[\begin{array}{rrr|r}
1 & 0 & 0 & 3 \\
0 & 1 & 0 & 3 \\
0 & 0 & 1 & -4
\end{array}\right]
\]
If we translate from the shorthand back to the equations they represent, the first
row says x1 = 3, the second row says x2 = 3 and the third row says x3 = −4.
In other words, we can just read off the values of x1 , x2 and x3 in the rightmost
column. The equations are already solved, and there is nothing left to do!
    Slightly more work, but still easy to do, are upper triangular systems. These
are systems where all the entries below the diagonal are equal to zero, as in
\[
\left[\begin{array}{rrr|r}
1 & 1 & 1 & 3 \\
0 & -1 & 1 & 3 \\
0 & 0 & -4 & -8
\end{array}\right]
\]




The reason these are easy to solve is that the equation represented by the jth
row only involves the variables xj , xj+1 , . . . , xn . So if we start with the last
equation (in the example −4x3 = −8), we can solve it immediately for xn (in
the example x3 = 2). Now we move up one equation. This equation only
involves xn−1 and xn , and we already know xn . So we can solve it for xn−1 (in
the example −x2 + x3 = 3 so −x2 + 2 = 3 so x2 = −1). We can continue in
this way until all the xj ’s have been found. (In the example there is one more
equation x1 + x2 + x3 = 3 or x1 − 1 + 2 = 3 or x1 = 2.)
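
The bottom-up procedure just described (often called back substitution) can be sketched in MATLAB as follows. This is a minimal illustration for the 3 × 3 example above, assuming every diagonal entry is non-zero; the names U, c, n and x are chosen here for illustration.

```matlab
U = [1  1  1;
     0 -1  1;
     0  0 -4];            % upper triangular coefficients
c = [3; 3; -8];           % right sides

n = length(c);
x = zeros(n, 1);
for j = n:-1:1
    % Row j involves only x(j), ..., x(n); the later unknowns are already known,
    % so move their contribution to the right side and divide by the diagonal entry
    x(j) = (c(j) - U(j, j+1:n) * x(j+1:n)) / U(j, j);
end
disp(x')
```

For j = n the range j+1:n is empty, so nothing is subtracted and the last equation is solved directly, just as in the hand computation.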
     In practice (i.e., in a typical computer program) the procedure that is actually
used is to apply a sequence of row operations to turn the system of
equations into an upper triangular system. Then the equations are solved one
by one, starting at the bottom and working up. This is the most efficient way
to solve a system of equations. However, it is sometimes convenient to apply row
operations to bring the equations into the “completely solved” form. Then, you
can just read off the solution from the last column.
     Let us now do a bunch of examples to illustrate this procedure. I’ll cook
them up so that everything that possibly can go wrong, does go wrong. (I’ll also
cook them up so that the numbers come out looking nice. This will definitely not
be the case in an example coming up in a real application!) Here is a shorthand
for indicating which elementary row operation was done. The notation
(1, :) = 3(1, :) means the first row was multiplied by the non-zero number 3. The
notation (2, :) = (2, :) − 4(5, :) means that 4 times the fifth row was subtracted
from the second row. Finally, (2, :) ↔ (3, :) means that the second and third rows
were interchanged.
Example 3.2 Let us start with
\[
\left[\begin{array}{rrrr|r}
1 & 2 & -2 & -7 & -29 \\
1 & 2 & -1 & -5 & -18 \\
0 & 3 & 0 & -3 & -6 \\
-1 & 4 & 1 & 1 & 14
\end{array}\right]
\]
We are trying to put this matrix in upper triangular form. So we start by trying
to produce zero entries in the first column under the top entry. We can do this
by adding multiples of the first row to the other rows. So, the first move is to
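subtract the first row from the second row. The result is
\[
\left[\begin{array}{rrrr|r}
1 & 2 & -2 & -7 & -29 \\
0 & 0 & 1 & 2 & 11 \\
0 & 3 & 0 & -3 & -6 \\
-1 & 4 & 1 & 1 & 14
\end{array}\right]
\begin{array}{l}
{} \\
(2,:) = (2,:) - (1,:) \\
{} \\
{}
\end{array}
\]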
The third row already has a zero in the first column, so there is nothing to do
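here. To put a zero in the fourth row we add the first row to the last row.
\[
\left[\begin{array}{rrrr|r}
1 & 2 & -2 & -7 & -29 \\
0 & 0 & 1 & 2 & 11 \\
0 & 3 & 0 & -3 & -6 \\
0 & 6 & -1 & -6 & -15
\end{array}\right]
\begin{array}{l}
{} \\
{} \\
{} \\
(4,:) = (4,:) + (1,:)
\end{array}
\]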




Now we shift our attention to the second column. We want to produce zeros
below the diagonal. If we attempt to do this by adding multiples of the first row
to other rows, we will destroy the zeros that we have already produced. So we
try to use the second row. This is where we run into the first glitch. Since the
entry in the second column of the second row is zero, adding a multiple of this
row to the others won’t have any effect on the numbers in the second column
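that we are trying to change. To remedy this we simply swap the second and
third rows.
\[
\left[\begin{array}{rrrr|r}
1 & 2 & -2 & -7 & -29 \\
0 & 3 & 0 & -3 & -6 \\
0 & 0 & 1 & 2 & 11 \\
0 & 6 & -1 & -6 & -15
\end{array}\right]
\begin{array}{l}
{} \\
(2,:) \leftrightarrow (3,:) \\
{} \\
{}
\end{array}
\]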
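Now we can complete the job on the second column by subtracting 2 times the
second row from the last row.
\[
\left[\begin{array}{rrrr|r}
1 & 2 & -2 & -7 & -29 \\
0 & 3 & 0 & -3 & -6 \\
0 & 0 & 1 & 2 & 11 \\
0 & 0 & -1 & 0 & -3
\end{array}\right]
\begin{array}{l}
{} \\
{} \\
{} \\
(4,:) = (4,:) - 2(2,:)
\end{array}
\]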

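Now we shift our attention to the third column. To produce a zero in the entry
below the diagonal we must add the third row to the last row.
\[
\left[\begin{array}{rrrr|r}
1 & 2 & -2 & -7 & -29 \\
0 & 3 & 0 & -3 & -6 \\
0 & 0 & 1 & 2 & 11 \\
0 & 0 & 0 & 2 & 8
\end{array}\right]
\begin{array}{l}
{} \\
{} \\
{} \\
(4,:) = (4,:) + (3,:)
\end{array}
\]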

The matrix is now in upper triangular form. Let us find the solution. This last
row is shorthand for the equation 2x4 = 8. So x4 = 4. The third row now gives
x3 + 2(4) = 11, so x3 = 3. The second row gives 3x2 − 3(4) = −6 so x2 = 2.
Finally the first row gives x1 + 2(2) − 2(3) − 7(4) = −29 so x1 = 1.
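
The row operations of this example can be replayed step by step in MATLAB. The sketch below applies the same five operations to the augmented matrix and then solves the triangular system from the bottom up. The names A, n and x are chosen here for illustration, and the final loop assumes the diagonal entries are non-zero.

```matlab
A = [ 1  2 -2 -7 -29;
      1  2 -1 -5 -18;
      0  3  0 -3  -6;
     -1  4  1  1  14];           % augmented matrix of Example 3.2

A(2,:) = A(2,:) - A(1,:);        % (2,:) = (2,:) - (1,:)
A(4,:) = A(4,:) + A(1,:);        % (4,:) = (4,:) + (1,:)
A([2 3],:) = A([3 2],:);         % (2,:) <-> (3,:)
A(4,:) = A(4,:) - 2*A(2,:);      % (4,:) = (4,:) - 2(2,:)
A(4,:) = A(4,:) + A(3,:);        % (4,:) = (4,:) + (3,:)
disp(A)                          % now upper triangular

% Solve from the last equation upwards
n = 4;
x = zeros(n, 1);
for j = n:-1:1
    x(j) = (A(j, end) - A(j, j+1:n) * x(j+1:n)) / A(j, j);
end
disp(x')
```

A general-purpose program would have to detect the zero in the diagonal position and choose the row swap itself; here the swap is written in by hand because we already know where it is needed.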

Example 3.3 There is really no need to do anything more in Example 3.2, but
let us continue with elementary row operations to put the equations into the
“completely solved” form, just to see how this goes. First we divide the second
row by 3.
\[
\left[\begin{array}{rrrr|r}
1 & 2 & -2 & -7 & -29 \\
0 & 1 & 0 & -1 & -2 \\
0 & 0 & 1 & 2 & 11 \\
0 & 0 & 0 & 2 & 8
\end{array}\right]
\begin{array}{l}
{} \\
(2,:) = (1/3)(2,:) \\
{} \\
{}
\end{array}
\]
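Now we subtract twice the second row from the first row.
\[
\left[\begin{array}{rrrr|r}
1 & 0 & -2 & -5 & -25 \\
0 & 1 & 0 & -1 & -2 \\
0 & 0 & 1 & 2 & 11 \\
0 & 0 & 0 & 2 & 8
\end{array}\right]
\begin{array}{l}
(1,:) = (1,:) - 2(2,:) \\
{} \\
{} \\
{}
\end{array}
\]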




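Now add twice the third row to the first row. Then divide the last row by 2.
\[
\left[\begin{array}{rrrr|r}
1 & 0 & 0 & -1 & -3 \\
0 & 1 & 0 & -1 & -2 \\
0 & 0 & 1 & 2 & 11 \\
0 & 0 & 0 & 1 & 4
\end{array}\right]
\begin{array}{l}
(1,:) = (1,:) + 2(3,:) \\
{} \\
{} \\
(4,:) = (1/2)(4,:)
\end{array}
\]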

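Finally, we add various multiples of the last row to the previous rows.
\[
\left[\begin{array}{rrrr|r}
1 & 0 & 0 & 0 & 1 \\
0 & 1 & 0 & 0 & 2 \\
0 & 0 & 1 & 0 & 3 \\
0 & 0 & 0 & 1 & 4
\end{array}\right]
\begin{array}{l}
(1,:) = (1,:) + (4,:) \\
(2,:) = (2,:) + (4,:) \\
(3,:) = (3,:) - 2(4,:) \\
{}
\end{array}
\]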

We now can read the solution off from the last column.
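
Here is the same sequence of operations as a MATLAB sketch, starting from the upper triangular matrix obtained in Example 3.2. The name R is chosen here for illustration. Rows are divided rather than multiplied by a fraction, which for these particular numbers keeps all the entries exact.

```matlab
R = [1 2 -2 -7 -29;
     0 3  0 -3  -6;
     0 0  1  2  11;
     0 0  0  2   8];             % upper triangular form from Example 3.2

R(2,:) = R(2,:) / 3;             % (2,:) = (1/3)(2,:)
R(1,:) = R(1,:) - 2*R(2,:);      % (1,:) = (1,:) - 2(2,:)
R(1,:) = R(1,:) + 2*R(3,:);      % (1,:) = (1,:) + 2(3,:)
R(4,:) = R(4,:) / 2;             % (4,:) = (1/2)(4,:)
R(1,:) = R(1,:) + R(4,:);        % (1,:) = (1,:) + (4,:)
R(2,:) = R(2,:) + R(4,:);        % (2,:) = (2,:) + (4,:)
R(3,:) = R(3,:) - 2*R(4,:);      % (3,:) = (3,:) - 2(4,:)
disp(R)                          % solution appears in the last column
```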

    In the previous example there was a unique solution to the system of equa-
tions. We already know, from the geometrical meaning of the equations, that
sometimes there will be lots of solutions depending on a parameter. This is
expected to happen when there are fewer equations than unknowns (e.g., the
intersections of two planes in three dimensional space is usually a line) but will
also occur in certain degenerate cases when the number of equations is equal
to or more than the number of unknowns (e.g., three, or even four, planes may
intersect in a line too). What happens in the procedure of row reductions when
there are parameters in the solution? Let us look at another example.

Example 3.4 Consider the system
\[
\left[\begin{array}{rrrr|r}
1 & 3 & 2 & -2 & -1 \\
1 & 3 & 4 & -2 & 3 \\
-2 & -6 & -4 & 5 & 5 \\
-1 & -3 & 2 & 1 & 6
\end{array}\right]
\]

Perform Gaussian Elimination on this system. We begin, as before, by trying
to produce zeros in the first column under the diagonal entry. This procedure
yields
\[
\left[\begin{array}{rrrr|r}
1 & 3 & 2 & -2 & -1 \\
0 & 0 & 2 & 0 & 4 \\
0 & 0 & 0 & 1 & 3 \\
0 & 0 & 4 & -1 & 5
\end{array}\right]
\begin{array}{l}
{} \\
(2,:) = (2,:) - (1,:) \\
(3,:) = (3,:) + 2(1,:) \\
(4,:) = (4,:) + (1,:)
\end{array}
\]
As in the previous example, there is now a zero sitting in the diagonal spot in
the second column. Last time, we swapped rows at this point to put a non-zero
entry in this place. But now, all the other entries below this one are zero too!
So there is nothing we can swap in to save the situation. (Clearly, swapping the
first row down is not a good idea, since that would destroy the zero in the first
column.) So we just have to admit defeat, and shift our attention one column






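to the right. We subtract twice the second row from the fourth row.
\[
\left[\begin{array}{rrrr|r}
1 & 3 & 2 & -2 & -1 \\
0 & 0 & 2 & 0 & 4 \\
0 & 0 & 0 & 1 & 3 \\
0 & 0 & 0 & -1 & -3
\end{array}\right]
\begin{array}{l}
{} \\
{} \\
{} \\
(4,:) = (4,:) - 2(2,:)
\end{array}
\]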
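Now we complete the job by adding the third row to the last row.
\[
\left[\begin{array}{rrrr|r}
1 & 3 & 2 & -2 & -1 \\
0 & 0 & 2 & 0 & 4 \\
0 & 0 & 0 & 1 & 3 \\
0 & 0 & 0 & 0 & 0
\end{array}\right]
\begin{array}{l}
{} \\
{} \\
{} \\
(4,:) = (4,:) + (3,:)
\end{array}
\]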
What are the solutions? The third equation says x4 = 3 and the second equation
says x3 = 2. There is nothing new here. However the first equation introduces
not just one, but two new variables x1 and x2 . It reads x1 + 3x2 + 2(2) − 2(3) =
−1, or, x1 + 3x2 = 1. Clearly, there are infinitely many values of x1 and x2 that
satisfy this equation. In fact, if we fix x2 to be any arbitrary value x2 = s, and
then set x1 = 1 − 3s, x1 and x2 will be solutions. So for any choice of s

                       x1 = 1 − 3s, x2 = s, x3 = 2, x4 = 3

is a solution. There are infinitely many solutions depending on a parameter s.
We could also write this as
\[
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}
= \begin{bmatrix} 1 \\ 0 \\ 2 \\ 3 \end{bmatrix}
+ s \begin{bmatrix} -3 \\ 1 \\ 0 \\ 0 \end{bmatrix}
\]
and recognize the solutions as a line in four dimensional space passing through
[1, 0, 2, 3] in the direction [−3, 1, 0, 0].
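
We can check numerically that every point on this line solves the original system. The MATLAB sketch below tries a few arbitrarily chosen values of the parameter s; the names A, c, x0 and d are chosen here for illustration.

```matlab
A = [ 1  3  2 -2;
      1  3  4 -2;
     -2 -6 -4  5;
     -1 -3  2  1];       % coefficients of Example 3.4
c = [-1; 3; 5; 6];       % right sides

x0 = [1; 0; 2; 3];       % point on the line (s = 0)
d  = [-3; 1; 0; 0];      % direction of the line

for s = [-2, 0, 1, 5]    % a few sample parameter values
    x = x0 + s*d;
    fprintf('s = %2d: largest |A*x - c| = %g\n', s, max(abs(A*x - c)));
end
```

The reason this works for every s is that A*x0 equals c while A*d is the zero vector, so adding any multiple of d does not change the left sides of the equations.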

    There is one situation left to consider, namely when there are no solutions
at all. Geometrically, this happens, for example, when we are trying to find the
intersection of two parallel planes. Let us look at an example.

Example 3.5 Perform Gaussian Elimination on the following system
\[
\left[\begin{array}{rr|r}
1 & 3 & 1 \\
1 & 4 & 2 \\
-1 & -3 & 0 \\
2 & 6 & 4
\end{array}\right]
\]
We begin in the usual way.
\[
\left[\begin{array}{rr|r}
1 & 3 & 1 \\
0 & 1 & 1 \\
0 & 0 & 1 \\
0 & 0 & 2
\end{array}\right]
\begin{array}{l}
{} \\
(2,:) = (2,:) - (1,:) \\
(3,:) = (3,:) + (1,:) \\
(4,:) = (4,:) - 2(1,:)
\end{array}
\]

Figure 3.1: After Gaussian Elimination the augmented matrix will be in row
echelon form (left). With further work, the augmented matrix can be put in
reduced row echelon form (right).


There is nothing left to do in the second column, so we shift our attention to
the third column and subtract twice the third row from the fourth row.
\[
\left[\begin{array}{rr|r}
1 & 3 & 1 \\
0 & 1 & 1 \\
0 & 0 & 1 \\
0 & 0 & 0
\end{array}\right]
\begin{array}{l}
{} \\
{} \\
{} \\
(4,:) = (4,:) - 2(3,:)
\end{array}
\]

Now we are done. If we write down the equation corresponding to the third
row, we get 0x1 + 0x2 = 1, or 0 = 1. Clearly there is no choice of x1 or x2 that
makes this true. So this is a system of equations with no solutions.

    Let us summarize what we have done in this section. Every system of equa-
tions can be brought into upper triangular form using a sequence of elementary
row transformations. The resulting upper triangular matrix will look something
like what is shown in Figure 3.1 (left). In this diagram, all the entries below the
staircase line are zero. The boxes represent non-zero entries. The stars represent
arbitrary entries, that may or may not be zero. Each circled star corresponds
to a parameter that must be introduced.
    If we want to put this example in completely reduced form, we use elementary
row operations to zero out the entries lying above the boxes too. Then we
multiply each row by a number so that the corner entries (in the boxes) become
1. The completely reduced form for the example above would look like the
diagram in Figure 3.1 (right). The official name of this form is the reduced row
echelon form.
    If the reduced matrix has a row that is all zeroes, except for the
augmented entry, then the system of equations has no solutions. This is because
that row stands for an equation of the form 0 = c with c ≠ 0. A typical
example is shown in Figure 3.2 (left).
    If all the steps on the staircase in the non-augmented part of the matrix have
size one, then there are no parameters to introduce, and the solution is unique.
Notice that in this case the number of non-zero rows is the same as the number
of variables. A typical example is shown in Figure 3.2 (right).


Figure 3.2: Augmented matrices after Gaussian Elimination with no solutions
(left) and with a single solution (right).


    Finally, we introduce some terminology. The rank of a matrix is the number
of non-zero rows in the matrix obtained after reducing it to the upper triangular
form described above. In other words the rank is the number of boxes in the
diagrams above. We can now rephrase the different possibilities in terms of
rank. If the rank of the augmented matrix is greater than the rank of the
unaugmented matrix (i.e., the matrix without the last column) then there are
no solutions. If the two ranks are equal, and their common value is equal to the
number of unknowns, then the solution is unique. If the two ranks are equal to
a common value r that is less than the number n of unknowns, then there are
n − r parameters in the solution.
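
The three cases can be checked numerically. The following is a minimal sketch, assuming MATLAB's built-in rank command (which computes the rank numerically from singular values rather than by row reduction, but agrees with the definition above for small matrices with integer entries). The matrix Ab is the reduced augmented matrix of the no-solution example above; the variable names are our own.

```matlab
% Reduced augmented matrix of the no-solution example (last column = right sides).
Ab = [1 3 1; 0 1 1; 0 0 1; 0 0 0];
A  = Ab(:, 1:end-1);      % unaugmented matrix (drop the last column)
n  = size(A, 2);          % number of unknowns

rA  = rank(A);            % rank of the unaugmented matrix
rAb = rank(Ab);           % rank of the augmented matrix

if rAb > rA
    outcome = 'no solutions';
elseif rA == n
    outcome = 'unique solution';
else
    outcome = sprintf('%d parameter(s)', n - rA);
end
disp(outcome)
```

Here the unaugmented matrix has rank 2 and the augmented matrix has rank 3, so the script reports that there are no solutions.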

3.2.1    Using MATLAB for row reductions
MATLAB has a built in command called rref that reduces a matrix to reduced
row echelon form. Let us try it on the example in the previous section. First
we define the initial matrix A. Remember that the last column of this matrix is
the augmented part.
>> A = [1 2 -2 -7 -29; 1 2 -1 -5 -18; 0 3 0 -3 -6; -1 4 1 1 14];
To find the reduced row echelon form, simply type
>> rref(A)
ans =
1 0 0 0 1
0 1 0 0 2
0 0 1 0 3
0 0 0 1 4
Notice that this MATLAB command did the work of Examples 3.2 and 3.3. The
solution to the system can be read off the result of the rref command above.
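
As a small sketch of how the solution might be read off in a script: when the left block of the result is the identity, the last column holds the solution. The names R, x and residual are our own, and the check uses MATLAB's matrix-vector product to evaluate the left sides of all four equations at once.

```matlab
A = [1 2 -2 -7 -29; 1 2 -1 -5 -18; 0 3 0 -3 -6; -1 4 1 1 14];
R = rref(A);                         % reduced row echelon form
x = R(:, end);                       % last column: x1, x2, x3, x4

% Substitute back into the original equations; the residual should be
% zero up to rounding error.
residual = A(:, 1:4) * x - A(:, 5);
disp(x')
disp(norm(residual))
```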
    It is important to realize that floating point rounding errors as discussed
in Section 2.4.6 can lead to errors in solutions to linear systems computed by
MATLAB and other computational tools. At worst, these errors will lead to
MATLAB finding “solutions” to problems that do not have exact solutions. In
these cases, solutions will often have very large values and often MATLAB will
give a warning about the problem being “ill-conditioned”.
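
Here is a small illustration of this effect, offered as a sketch rather than a typical case. The 2 × 2 system is made up for the purpose, and it is solved with MATLAB's backslash operator, which we assume here without further discussion. In exact arithmetic the second equation reads 0 x2 = 1, so there is no solution, but rounding turns the zero into a tiny non-zero number.

```matlab
% In exact arithmetic 0.1 + 0.2 - 0.3 = 0, so the second equation is
% 0*x2 = 1 and the system has no solution.
d = 0.1 + 0.2 - 0.3;     % rounding leaves a tiny non-zero value
A = [1 0; 0 d];
b = [1; 1];

x = A \ b;               % backslash solve; MATLAB may also print a warning
disp(d)
disp(x')
```

The computed x2 is 1/d, a number of size roughly 10^16: a very large “solution” to a problem that has none.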

3.2.2     Problems
Problem 3.6 Show that the lower triangular system of equations represented
by
\[
\left[\begin{array}{ccc|c}
1 & 0 & 0 & 3 \\
1 & -1 & 0 & 3 \\
2 & 1 & -8 & -4
\end{array}\right]
\]
is also easily solved, by easily solving it! It’s just a matter of convention whether
we aim for upper triangular or lower triangular systems in the elimination
procedure.
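Problem 3.7 The following equations have already been put in upper triangular
form. In each case there are infinitely many solutions, depending on one or
more parameters. Write down the general expression for the solution in terms
of parameters.
\[
\left[\begin{array}{cccc|c}
1 & 2 & 1 & 2 & 1 \\
0 & 0 & 1 & 1 & 4 \\
0 & 0 & 0 & 1 & 2 \\
0 & 0 & 0 & 0 & 0
\end{array}\right]
\]
\[
\left[\begin{array}{cccc|c}
1 & 2 & 1 & 2 & 1 \\
0 & 0 & 1 & 1 & 4 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0
\end{array}\right]
\]
\[
\left[\begin{array}{cccc|c}
1 & 2 & 1 & 2 & 1 \\
0 & 0 & 1 & 1 & 4
\end{array}\right]
\]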
Problem 3.8 Consider the system of equations represented by the augmented
matrix
\[
\left[\begin{array}{cccc|c}
1 & 2 & 2 & 2 & 1 \\
1 & 3 & -1 & 3 & -1 \\
1 & 0 & 1 & 1 & 5 \\
0 & 3 & -2 & 2 & -6
\end{array}\right]
\]
Show that this set of equations has infinitely many solutions and find a general
parametric representation of the solutions.
Problem 3.9 Consider the system of equations represented by the augmented
matrix
\[
\left[\begin{array}{cccc|c}
1 & 2 & 2 & -7 & 20 \\
3 & 6 & -3 & -5 & -15 \\
0 & 6 & 0 & -6 & -10 \\
2 & -8 & -2 & -2 & 30
\end{array}\right]
\]
Put this matrix in upper triangular form and find the solution of the linear
system of equations. You can leave your answers in fractions but show the
operations you perform in full detail.
Problem 3.10 Solve the following system of equations.
                        x1     −    2x2     +       3x3    =     2
                        2x1    −    3x2     +       2x3    =     2
                        3x1    +    2x2     −       4x3    =     9
Problem 3.11 Solve the following system of equations.
                       2x1     +      x2        −   1x3     =    6
                        x1     −     2x2        −   2x3     =    1
                       −x1     +    12x2        +   8x3     =    7
Problem 3.12 Solve the following system of equations.
                         x1    +    2x2     +       4x3    =     1
                         x1    +     x2     +       3x3    =     2
                        2x1    +    5x2     +       9x3    =     1
Problem 3.13 Solve the following system of equations.
                         x1    +    2x2     +       4x3    =     1
                         x1    +     x2     +       3x3    =     2
                        2x1    +    5x2     +       9x3    =     3
Problem 3.14 Solve the following system of equations.
                 3x1   +       x2   −      x3       +     2x4    =       7
                2x1    −      2x2   +     5x3       −      7x4   =       1
                −4x1   −      4x2   +     7x3       −     11x4   =      −13
Problem 3.15 For what values of a, b, c, d, α and β does the system of equa-
tions
                         ax1 + bx2 = α
                         cx1 + dx2 = β
have a unique solution?
Problem 3.16 Consider the system of equations represented by the augmented
matrix
\[
\left[\begin{array}{ccc|c}
1 & 2 & 0 & 7 \\
4 & 8 & 6 & 10 \\
-4 & -8 & 10 & 81
\end{array}\right]
\]
How many solutions does this linear system of equations have?
Problem 3.17 (Matlab) Consider the following system of equations
                         x1    +    2x2     +       4x3    =     7
                        4x1    +     x2     +       3x3    =     2
                         0     +    5x2     +       9x3    =     a
Write a script in Matlab that generates the augmented matrix and solves the
system with the rref command for various values of a. Specifically, how does
the solution of x1 vary when you range a from 1 to 10 (equally spaced)?


3.3      Homogeneous Equations
If the coefficients on the right sides of a system of equations are all zero, the
system is said to be homogeneous. In other words, a homogeneous system is a
system of equations of the form
\begin{align*}
b_{1,1} x_1 + b_{1,2} x_2 + \cdots + b_{1,n} x_n &= 0 \\
b_{2,1} x_1 + b_{2,2} x_2 + \cdots + b_{2,n} x_n &= 0 \\
&\;\;\vdots \\
b_{m,1} x_1 + b_{m,2} x_2 + \cdots + b_{m,n} x_n &= 0
\end{align*}
Given a system of equations, the associated homogeneous system is the homo-
geneous system of equations you get by setting all the right sides to zero.
    Geometrically, homogeneous systems describe points, lines and planes that
pass through the origin. In fact x = 0, i.e., x1 = 0, x2 = 0, . . . , xn = 0 is always
a solution to a homogeneous system of equations.
    When are there other (nonzero) solutions to the above homogeneous sys-
tem? We have n unknowns and m equations. When we perform the Gaussian
reduction, the right-hand sides of the equations will stay zero so the augmented
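matrix will generally have the form
\[
\left[\begin{array}{cccccccc|c}
1 & * & * & * & \cdots & \cdots & \cdots & * & 0 \\
0 & 1 & * & * & \cdots & \cdots & \cdots & * & 0 \\
0 & 0 & 0 & 1 & * & \cdots & \cdots & * & 0 \\
\cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots \\
0 & 0 & 0 & \cdots & 0 & 1 & * & * & 0 \\
0 & 0 & 0 & \cdots & \cdots & \cdots & 1 & * & 0 \\
0 & 0 & 0 & \cdots & 0 & 0 & 0 & 1 & 0 \\
\cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots \\
0 & 0 & 0 & \cdots & 0 & 0 & 0 & 0 & 0
\end{array}\right].
\]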
The last several rows may be identically zero. In the last section we saw that
there are solutions depending on parameters if the number of variables is greater
than the rank of the matrix. Thus, if n (the number of unknowns) is bigger than
the number of non-zero rows in the above row-reduced matrix, then there exists
a non-zero solution. Otherwise only the trivial solution x1 = 0, x2 = 0, . . . , xn = 0
is present. We illustrate the idea with some examples below.

Example 3.6 Consider a homogeneous system
                           3x1   +     6x2     +    x3   =    0
                           6x1   +     2x2     +   2x3   =    0
                            x1   +      x2     +   3x3   =    0
The augmented matrix can be reduced by row operations to the form (check!)
\[
\left[\begin{array}{ccc|c}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0
\end{array}\right],
\]
which implies x1 = x2 = x3 = 0. And, in agreement with our above statement,
the number of variables (3) is not more than the number of non-zero rows (also
3).

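Example 3.7 Consider another homogeneous system:

                            −x1    +    2x2    +    4x3    =    0
                            2x1    −    4x2    −    8x3    =    0
                            x1     −     x2    +    3x3    =    0

Its augmented matrix is
\[
\left[\begin{array}{ccc|c}
-1 & 2 & 4 & 0 \\
2 & -4 & -8 & 0 \\
1 & -1 & 3 & 0
\end{array}\right]
\to
\left[\begin{array}{ccc|c}
-1 & 2 & 4 & 0 \\
0 & 0 & 0 & 0 \\
0 & 1 & 7 & 0
\end{array}\right]
\to
\left[\begin{array}{ccc|c}
1 & 0 & 10 & 0 \\
0 & 1 & 7 & 0 \\
0 & 0 & 0 & 0
\end{array}\right],
\]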

and the number of nonzero rows is 2, which is less than the number of unknowns,
3. Hence by the above statement there must be a nonzero solution. We find
x1 = −10x3 , x2 = −7x3 , with no requirement on x3 . Hence x3 is any number
t, and we obtain infinitely many nonzero solutions

                  x1 = −10t, x2 = −7t, x3 = t,              t ∈ (−∞, ∞),

one for each value of t.
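
The reduction in Example 3.7 can be reproduced in MATLAB. This is only a sketch: the augmented column of zeros is left out because row operations never change it, the built-in rank command stands in for counting non-zero rows, and the value of t is an arbitrary choice of ours.

```matlab
B = [-1 2 4; 2 -4 -8; 1 -1 3];   % coefficients of Example 3.7
R = rref(B);                     % reduced row echelon form
r = rank(B);                     % number of non-zero rows after reduction
numParams = size(B, 2) - r;      % unknowns minus rank

t = 2.5;                         % any value of the parameter will do
x = t * [-10; -7; 1];            % x1 = -10t, x2 = -7t, x3 = t

disp(R)
disp(numParams)
disp((B * x)')                   % left sides of the three equations
```

The reduced form has two non-zero rows, so there is one parameter, and substituting x back gives zero in all three equations.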

    In a similar manner, if for some homogeneous system with 4 variables the
augmented matrix has only 2 nonzero rows, then the general solution has 4-2=2
free (undefined) variables on which the other two depend.

3.3.1      Properties of solutions of homogeneous systems.
   1. A homogeneous system has either exactly one solution, the zero solution
      (x1 = ... = xn = 0), or infinitely many solutions that depend on parameters.

   2. If (x1 , ..., xn ) and (y1 , ..., yn ) are solutions to a given homogeneous system,
      (x1 + y1 , ..., xn + yn ) is also a solution. (Solutions are additive.)

   3. If (x1 , ..., xn ) is a solution to a given homogeneous system, (ax1 , ..., axn )
      is also a solution, for any number a. (Solutions are scalable.)

   The first statement follows from our previous discussion; the other two are
easy to verify, using the initial homogeneous system.
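
A quick numerical check of the second and third properties, using the system of Example 3.7. This is a sketch with particular solutions x and y (the cases t = 1 and t = −2) and an arbitrary number a chosen by us; it illustrates the properties but does not prove them.

```matlab
B = [-1 2 4; 2 -4 -8; 1 -1 3];   % homogeneous system of Example 3.7
x = [-10; -7; 1];                % solution with t = 1
y = [20; 14; -2];                % solution with t = -2
a = 3.5;                         % an arbitrary number

disp((B * x)')                   % zero: x is a solution
disp((B * y)')                   % zero: y is a solution
disp((B * (x + y))')             % zero: the sum is a solution
disp((B * (a * x))')             % zero: the multiple is a solution
```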

3.3.2      Connection of solutions to homogeneous and inhomogeneous systems.
The importance of homogeneous equations comes from the following fact. If x =
[x1 , x2 , . . . , xn ] and y = [y1 , y2 , . . . , yn ] are two solutions to a (not necessarily
homogeneous) system of equations,
\begin{align*}
b_{1,1} x_1 + b_{1,2} x_2 + \cdots + b_{1,n} x_n &= c_1 \\
b_{2,1} x_1 + b_{2,2} x_2 + \cdots + b_{2,n} x_n &= c_2 \\
&\;\;\vdots \\
b_{m,1} x_1 + b_{m,2} x_2 + \cdots + b_{m,n} x_n &= c_m
\end{align*}

then the difference x − y = [x1 − y1 , x2 − y2 , . . . , xn − yn ] solves the associated
homogeneous system. This is a simple calculation:
\begin{align*}
b_{1,1}(x_1 - y_1) + b_{1,2}(x_2 - y_2) + \cdots + b_{1,n}(x_n - y_n) &= c_1 - c_1 = 0 \\
b_{2,1}(x_1 - y_1) + b_{2,2}(x_2 - y_2) + \cdots + b_{2,n}(x_n - y_n) &= c_2 - c_2 = 0 \\
&\;\;\vdots \\
b_{m,1}(x_1 - y_1) + b_{m,2}(x_2 - y_2) + \cdots + b_{m,n}(x_n - y_n) &= c_m - c_m = 0
\end{align*}
To see the implications of this let us suppose that q is any particular
solution to a (non-homogeneous) system of equations. Then if y is any other
solution, y − q = z is a solution of the corresponding homogeneous system. So
y = q + z. In other words any solution can be written as q + (some solution of
the corresponding homogeneous system). Going the other way, if z is any solution
of the corresponding homogeneous system, then q + z solves the original system.
This can be seen by plugging q + z into the equation. So the structure of the
set of solutions is

                  x = q + ( solution to homogeneous system)

As you run through all solutions to the homogeneous system on the right, x
runs through all solutions of the original system. Notice that it doesn’t matter
which q you choose as the starting point. This is completely analogous to the
parametric form for a line, where the base point can be any point on the line.
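
To spell out the "plugging in" step, here is the calculation for the i-th equation, written with components q = [q1 , . . . , qn ] and z = [z1 , . . . , zn ] (these component names are introduced here for illustration). The first sum equals ci because q solves the original system, and the second is zero because z solves the homogeneous one.

```latex
\[
\sum_{j=1}^{n} b_{i,j}\,(q_j + z_j)
  = \sum_{j=1}^{n} b_{i,j}\, q_j + \sum_{j=1}^{n} b_{i,j}\, z_j
  = c_i + 0
  = c_i ,
\qquad i = 1, \dots, m .
\]
```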
   If we have applied the process of Gaussian elimination to the original system,
and concluded that the general solution has k parameters, we will end up with a
general solution of the form

                                 q + s1 a1 + · · · + sk ak .

Notice that q is a particular solution (corresponding to all parameters equal
to zero) and s1 a1 + · · · + sk ak is the general solution to the corresponding
homogeneous system.
    These considerations have practical importance if you have to solve a bunch
of systems, all with the same coefficients on the left side, but with different coef-
ficients on the right. In this situation, you could first find the general solution to
the corresponding homogeneous system of equations. Then to find the general
solution to one of the systems, you would only need to find a single particular
solution, and then add the general solution to the homogeneous system to ob-
tain all solutions. The only trouble with this is that it might not really be any
easier to find a single particular solution than it is to find all solutions.


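Example 3.8 Find the general solution of the system of equations
\[
\left[\begin{array}{cccc|c}
1 & 1 & 0 & 0 & 1 \\
-1 & -1 & 1 & 2 & 1 \\
3 & 3 & -1 & -2 & 1
\end{array}\right]
\]
in the form x = q + s1 a1 + s2 a2 . Verify that a1 and a2 solve the corresponding
homogeneous equation. The matrix
\[
\left[\begin{array}{cccc|c}
1 & 1 & 0 & 0 & 1 \\
-1 & -1 & 1 & 2 & 1 \\
3 & 3 & -1 & -2 & 1
\end{array}\right]
\]
reduces to
\[
\left[\begin{array}{cccc|c}
1 & 1 & 0 & 0 & 1 \\
0 & 0 & 1 & 2 & 2 \\
0 & 0 & 0 & 0 & 0
\end{array}\right]
\]
so the solutions are x1 = 1 − s1 , x2 = s1 , x3 = 2 − 2s2 , x4 = s2 . This can be
written
\[
x = \begin{bmatrix} 1 \\ 0 \\ 2 \\ 0 \end{bmatrix}
  + s_1 \begin{bmatrix} -1 \\ 1 \\ 0 \\ 0 \end{bmatrix}
  + s_2 \begin{bmatrix} 0 \\ 0 \\ -2 \\ 1 \end{bmatrix}
\]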
It’s easy to check that a1 = [−1, 1, 0, 0] and a2 = [0, 0, −2, 1] solve the corre-
sponding homogeneous system.
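
The checks in Example 3.8 can also be carried out in MATLAB. The following is a sketch with variable names of our own choosing; the parameter values s1 and s2 are arbitrary.

```matlab
A = [1 1 0 0; -1 -1 1 2; 3 3 -1 -2];   % coefficients of Example 3.8
b = [1; 1; 1];                         % right sides

q  = [1; 0; 2; 0];                     % particular solution
a1 = [-1; 1; 0; 0];
a2 = [0; 0; -2; 1];

disp((A * q - b)')                     % zero: q solves the original system
disp(A * [a1 a2])                      % zero columns: a1, a2 solve the homogeneous system

s1 = 3;  s2 = -5;                      % arbitrary parameter values
x = q + s1 * a1 + s2 * a2;
disp((A * x - b)')                     % zero: x solves the original system too
```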

3.3.3    Problems
Problem 3.18 Find the general solution of the system of equations
\[
\left[\begin{array}{cccc|c}
1 & 0 & 1 & 0 & 10 \\
-1 & 1 & 1 & 1 & 4 \\
0 & 1 & 2 & 1 & 14
\end{array}\right]
\]
in the form x = q + s1 a1 + s2 a2 , where a1 and a2 solve the corresponding
homogeneous system of equations.
Problem 3.19 Consider the system of equations
\[
\left[\begin{array}{cccc|c}
1 & 1 & 0 & 0 & 4 \\
-1 & -1 & 1 & 2 & -1 \\
3 & 3 & -1 & -2 & 9
\end{array}\right]
\]
Verify that [4, 0, 3, 0] is a solution and write down the general solution.
Problem 3.20 Consider the system of equations given by the augmented matrix
\[
\left[\begin{array}{ccc|c}
1 & -3 & 4 & 6 \\
1 & 9 & -10 & 10 \\
0 & 6 & -7 & 2
\end{array}\right]
\]
Put this matrix in reduced row echelon form and comment on the number of
solutions of this system of equations.


3.4      Geometric Applications
Now we will apply Gaussian elimination to some of the geometry problems we
studied in the first part of this course.
    Let us start with the question of linear independence. Recall that a collection
of vectors x1 , x2 , . . . , xn is called linearly dependent if we can find
coefficients c1 , c2 , . . . , cn , not all zero, such that

                           c1 x1 + c2 x2 + · · · + cn xn = 0

This is actually a homogeneous system of linear equations for the numbers
c1 , . . . , cn . If c1 = c2 = · · · = cn = 0 is the only solution, then the vectors are
linearly independent. Otherwise, they are linearly dependent. To decide, we
must set up the matrix for the system of equations and perform a row reduction
to decide if there is a unique solution or not. In setting up the equations, it is
convenient to treat the xi ’s as column vectors.

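Example 3.9 Decide if
\[
x_1 = \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix}, \qquad
x_2 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \qquad
x_3 = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}
\]
are linearly independent. The equation c1 x1 + c2 x2 + c3 x3 = 0 can be written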

                                c1   +c2    +c3     =0
                               2c1   +c2    +2c3    =0
                               0c1   +c2    +c3     =0

The matrix for this system of equations is
\[
\begin{bmatrix}
1 & 1 & 1 \\
2 & 1 & 2 \\
0 & 1 & 1
\end{bmatrix}
\]

Since this is a homogeneous system, we don’t have to write the augmented part
of the matrix. Performing a row reduction yields
\[
\begin{bmatrix}
1 & 1 & 1 \\
0 & 1 & 0 \\
0 & 0 & 1
\end{bmatrix}
\]

Since the number of non-zero rows is the same as the number of variables (three)
there are no non-zero solutions. Therefore the vectors are linearly independent.
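
The same test can be sketched in MATLAB by placing the vectors in the columns of a matrix and comparing its rank with the number of vectors. We assume the built-in rank command here in place of counting non-zero rows by hand; the names X and independent are our own.

```matlab
x1 = [1; 2; 0];  x2 = [1; 1; 1];  x3 = [1; 2; 1];
X = [x1 x2 x3];                    % vectors as columns

r = rank(X);                       % number of non-zero rows after reduction
independent = (r == size(X, 2));   % true when only c1 = c2 = c3 = 0 works
disp(independent)
```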

   The row reduction in Example 3.9 also shows that any vector y in R3 can
be written as a linear combination of x1 , x2 and x3 . Writing y as a linear
combination of x1 , x2 and x3 means finding coefficients c1 , c2 and c3 such that
c1 x1 + c2 x2 + c3 x3 = y. This is a (non-homogeneous) system of linear equations
with augmented matrix
\[
\left[\begin{array}{ccc|c}
1 & 1 & 1 & y_1 \\
2 & 1 & 2 & y_2 \\
0 & 1 & 1 & y_3
\end{array}\right]
\]
Using the same Gaussian elimination steps as above, this matrix reduces to
\[
\left[\begin{array}{ccc|c}
1 & 1 & 1 & * \\
0 & 1 & 0 & * \\
0 & 0 & 1 & *
\end{array}\right]
\]
where the ∗’s are some numbers. This system has a (unique) solution.
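
For a concrete case, the coefficients can be found with rref. In this sketch the vector y = [1, 2, 3] is a hypothetical choice of ours and is not taken from the notes.

```matlab
X = [1 1 1; 2 1 2; 0 1 1];      % columns are x1, x2, x3 of Example 3.9
y = [1; 2; 3];                  % hypothetical right side

R = rref([X y]);                % reduce the augmented matrix
c = R(:, end);                  % coefficients c1, c2, c3

disp(c')
disp((X * c - y)')              % zero: c1*x1 + c2*x2 + c3*x3 equals y
```

For this y the coefficients come out as c1 = −2, c2 = 0, c3 = 3.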
Example 3.10 Here is another geometric example. Do the planes whose equations
are given by x1 + x2 + x3 = 1, 2x1 + x2 + 2x3 = 1 and x2 = 1 intersect in
a single point? To answer this, we note that the intersection of the three planes
is given by the set of points that satisfy all three equations. In other words they
satisfy the system of equations whose augmented matrix is
\[
\left[\begin{array}{ccc|c}
1 & 1 & 1 & 1 \\
2 & 1 & 2 & 1 \\
0 & 1 & 0 & 1
\end{array}\right]
\]
A row reduction yields
\[
\left[\begin{array}{ccc|c}
1 & 1 & 1 & 1 \\
0 & -1 & 0 & -1 \\
0 & 0 & 0 & 0
\end{array}\right]
\]
Thus solutions are given by
\[
\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + s \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}
\]
This is the parametric equation of a line. Thus the three planes intersect in a
line, not a point.
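
A short MATLAB sketch of the same computation follows. Note that rref goes further than the row reduction shown in the example and returns the reduced row echelon form; the value of s is an arbitrary choice of ours.

```matlab
P = [1 1 1 1; 2 1 2 1; 0 1 0 1];    % augmented matrix of the three planes
R = rref(P);
numParams = 3 - rank(P(:, 1:3));    % unknowns minus rank

s = -4;                             % any value of the parameter
x = [0; 1; 0] + s * [1; 0; -1];     % a point on the line of intersection

disp(R)
disp(numParams)
disp((P(:, 1:3) * x - P(:, 4))')    % zero: x lies on all three planes
```

One parameter means the planes meet in a line.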

3.4.1    Problems
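Problem 3.21 Are the following vectors linearly dependent or independent?
\[
x_1 = \begin{bmatrix} 1 \\ 2 \\ 0 \\ 2 \end{bmatrix}, \qquad
x_2 = \begin{bmatrix} 1 \\ 1 \\ -1 \\ 1 \end{bmatrix}, \qquad
x_3 = \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix}
\]
Can every vector in R4 be written as a linear combination of these vectors?
How about the vector
\[
y_1 = \begin{bmatrix} 2 \\ 4 \\ -3 \\ 4 \end{bmatrix}?
\]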


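Problem 3.22 Consider the following vectors a1 , a2 and a3 such that:
\[
a_1 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \qquad
a_2 = \begin{bmatrix} 3 \\ 1 \end{bmatrix}, \qquad
a_3 = \begin{bmatrix} -3 \\ 4 \end{bmatrix}
\]
Are they linearly independent? Can the vector
\[
y = \begin{bmatrix} -15 \\ 5 \end{bmatrix}
\]
be written as a linear combination of a1 and a2 ?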
Problem 3.23 Consider the following 4 dimensional vectors a1 , a2 , and a3
such that
\[
a_1 = \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \qquad
a_2 = \begin{bmatrix} 0 \\ 0 \\ 4 \\ -3 \end{bmatrix}, \qquad
a_3 = \begin{bmatrix} 10 \\ 0 \\ -5 \\ 0 \end{bmatrix}
\]
Are these linearly independent? Can the vector y below be written as linear
combination of the above three vectors?
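Problem 3.24 (Matlab) Consider the following 3 vectors a1 , a2 , and a3 :
\[
a_1 = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \qquad
a_2 = \begin{bmatrix} -1 \\ 2 \\ -1 \end{bmatrix}, \qquad
a_3 = \begin{bmatrix} 4 \\ 1 \\ -1 \end{bmatrix}
\]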
The Matlab command
  plot3([a,0], [b,0], [c,0])
draws a line between (0, 0, 0) and a point with coordinates (a, b, c). Write a
script that draws the three vectors (use the command hold on to be able to
overlap them). Are the vectors linearly independent?


3.5     Resistor Networks
3.5.1    Elements of Basic Circuits
Electrical current, often denoted with the variable I, is a measure of charge flow
with MKS units of Amperes or Amps (Coulombs per second, where a Coulomb is
a measure of electrical charge). The voltage V is an electrical potential measured
in Volts (Joules per Coulomb). Thus IV has the units of power (J/s or Watts).
    A resistor is a simple electrical component that obeys Ohm’s law, that the
current through it is proportional to the voltage drop across it. The constant of
proportionality is called the resistance R in Ohms (abbreviated Ω, V/A). The
current I through a resistor across which there is a voltage drop V satisfies
                                     V = IR.
The current goes from high to low potential through the resistor.
    The resistor networks considered in these notes are circuits with three types
of components:


                  Figure 3.3: Elements in resistor networks.


  1. Resistors
  2. Voltage sources
  3. Current sources
The notation we will use for these elements in circuit schematics is shown in
Figure 3.3. Later (in Chapter 5) we will see that such a network represents a
network with additional elements (inductors and capacitors) at a given instant
in time. At a given time, an inductor acts as a current source and a capacitor
acts as a voltage source. The current through the capacitor at that instant
determines the rate of change of voltage across it, and the voltage across the
inductor determines the rate of change of current through it.
    In the network, the resistances of all resistors will be given. The voltage
across all voltage sources and the currents through all current sources will be
given. There are two basic questions to answer about a resistor network with
these components.
basic problem: Find the currents through each resistor and each power source
     and also the voltage drops across each current source. This problem is a
     linear system of equations and so serves as an example of the mathematical
     techniques learnt in this chapter.
fundamental problem: An important sub-problem is to find the currents
    through every power source and the voltage drops across each current
    source. These quantities can be written in terms of the given voltages and
    current sources directly (eliminating the terms involving resistor currents).
    Later, solving this problem will tell us how to write a differential equation
    for the currents through inductors and the voltages across capacitors.
   There are two fundamental laws governing the behaviour of circuits that can
be used to set up equations that can be solved to answer the questions above.
They are Kirchhoff’s laws:
  1. The sum of voltage drops around any closed loop in the network must be
     zero.
  2. The sum of currents entering a node must be zero (currents leaving the
     node count as negative).

3.5.2    Two Simple Examples Made Complicated
Consider the simple network with one current source and one resistor shown in
Figure 3.4 (left). Clearly, the current through the resistor must be I and the
voltage drop across the resistor must be IR by Ohm’s Law (use the signs in
the diagram for the direction of the drop).

Example 3.11 Consider the same example, introducing two nodes into the net-
work as shown in Figure 3.4 (right). We now have three unknowns, V1 , V2 and
I2 . Find a linear system for these unknowns and then solve the system. We
have made this simple example more complicated but we will learn something
as we work through it. Note that by specifying voltages at nodes (which will
determine voltage drops across components) we will always satisfy Kirchhoff’s
first law. Let us write down every other law that applies to this diagram:

                I − I2   = 0, Kirchhoff’s second law at node 2
                I2 − I   = 0, Kirchhoff’s second law at node 1
              V2 − V1    = I2 R, Ohm’s Law over the resistor

If you didn’t look too closely, you might be happy thinking these are three
equations for the three unknowns I2 , V1 and V2 . However, rewriting gives

                                            I2   = I
                                         I2      = I
                             V2 − V1 − I2 R      = 0

In augmented matrix form we can write the system and do Gaussian Elimina-
tion:                                                 
                1     0 0     I         1 0 0         I
              1      0 0     I  ∼  0 −1 1        −RI 
               −R −1 1        0         0 0 0         0
In the augmented matrix above, the unknowns are ordered I2 , V1 and then V2 .
The solutions are I2 = I (expected), V2 = s and V1 = s − RI where s is a
parameter that can take any value. This seems much more complicated that
the intuitive solution at the beginning of this section. However, the conclusions
are the same: the current through the resistor is I and the voltage drop across
the resistor is
                          V2 − V1 = s − (s − RI) = RI.
The arbitrary constant in the voltage occurs here because no reference voltage
has been specified (no point in the circuit has been grounded).
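The non-uniqueness in Example 3.11 can be checked numerically. The following is a minimal sketch in Python with exact fractions (the course labs use MATLAB; Python is used here only for illustration). The values I = 3 and R = 2 and the function name `residuals` are assumptions made for the sketch, not part of the example.

```python
from fractions import Fraction

def residuals(I2, V1, V2, I, R):
    """Left-hand side minus right-hand side of the three equations of Example 3.11."""
    return (
        I - I2,              # Kirchhoff's second law at node 2
        I2 - I,              # Kirchhoff's second law at node 1
        V2 - V1 - I2 * R,    # Ohm's Law over the resistor
    )

# Hypothetical source current (amperes) and resistance (ohms).
I, R = Fraction(3), Fraction(2)

# Every value of the parameter s gives a solution I2 = I, V1 = s - R*I, V2 = s.
for s in (Fraction(0), Fraction(5), Fraction(-7, 2)):
    I2, V1, V2 = I, s - R * I, s
    assert residuals(I2, V1, V2, I, R) == (0, 0, 0)
    print(f"s = {s}: V1 = {V1}, V2 = {V2}, drop V2 - V1 = {V2 - V1}")
```

Each choice of s shifts both node voltages by the same amount, so the voltage drop V2 − V1 printed on every line is the same value RI.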




Chapter 3. Solving Linear Systems                         3.5. Resistor Networks




    Figure 3.4: The simple resistor network considered in Example 3.11.


Example 3.12 Consider the circuit in Figure 3.5. Learning from the last example,
we have set a reference voltage at the lower left corner of the circuit, so the
voltage at the upper left corner is known. The current I1 is the branch
current from the V2 node to the V1 node. Form a linear system matching currents
and voltages across resistors to Ohm’s Law, and matching branch currents
at the two nodes. Solve the linear system. There are 5 unknowns V1 , V2 , I1 , I2
and I3 in the circuit as shown in the figure (the augmented matrices below will
be written with the unknowns in this order). Ohm’s Law on the four resistors
gives the following linear equations (in order of small to large resistance):

                               12 − V1        = I1
                               V1 − V2        =   2I2
                               V1 − V2        =   3I3
                                     V2       =   4I1

and matching the currents at the two nodes gives

                                    I1    = I2 + I3
                              I2 + I3     = I1

This gives six equations in five unknowns! (maybe you already see why this
happened to us). Writing the six equations above in an augmented matrix gives
                                                                     
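\[
\left[\begin{array}{ccccc|c}
 -1 & 0 & -1 & 0 & 0 & -12 \\
 1 & -1 & 0 & -2 & 0 & 0 \\
 1 & -1 & 0 & 0 & -3 & 0 \\
 0 & 1 & -4 & 0 & 0 & 0 \\
 0 & 0 & 1 & -1 & -1 & 0 \\
 0 & 0 & -1 & 1 & 1 & 0
\end{array}\right]
\sim
\left[\begin{array}{ccccc|c}
 1 & 0 & 0 & 0 & 0 & 2184/217 \\
 0 & 1 & 0 & 0 & 0 & 1680/217 \\
 0 & 0 & 1 & 0 & 0 & 420/217 \\
 0 & 0 & 0 & 1 & 0 & 252/217 \\
 0 & 0 & 0 & 0 & 1 & 168/217 \\
 0 & 0 & 0 & 0 & 0 & 0
\end{array}\right]
\]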

On the right above is the result of Gaussian elimination to reduced row echelon
form. The solutions for V1 , V2 , I1 , I2 and I3 can be read off the last column
of the augmented matrix after reduction (V1 = 2184/217 etc.). Notice that the
“extra” equation became the bottom row (all zeros) in the reduced form, which
indicates that there was redundant information given in the linear system. If
you go back to the expressions for conservation of current at the two nodes
above, it is easy to see that these two equations carry the same information.

        Figure 3.5: The resistor network considered in Example 3.12.
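The reduction in Example 3.12 can be reproduced with exact arithmetic. The sketch below is written in Python rather than the MATLAB used in the course labs, and the helper `rref` is a hypothetical name for a plain Gauss-Jordan elimination written for this illustration. It is not taken from the notes.

```python
from fractions import Fraction

def rref(rows):
    """Reduce an augmented matrix (a list of rows) to reduced row echelon form."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(n_cols - 1):              # the last column is the right-hand side
        # Look for a row at or below pivot_row with a nonzero entry in this column.
        pivot = next((r for r in range(pivot_row, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue                           # no pivot in this column
        m[pivot_row], m[pivot] = m[pivot], m[pivot_row]
        scale = m[pivot_row][col]
        m[pivot_row] = [x / scale for x in m[pivot_row]]
        for r in range(n_rows):                # clear the column above and below the pivot
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return m

# The six equations of Example 3.12, unknowns ordered V1, V2, I1, I2, I3.
system = [
    [-1,  0, -1,  0,  0, -12],
    [ 1, -1,  0, -2,  0,   0],
    [ 1, -1,  0,  0, -3,   0],
    [ 0,  1, -4,  0,  0,   0],
    [ 0,  0,  1, -1, -1,   0],
    [ 0,  0, -1,  1,  1,   0],
]

reduced = rref(system)
solution = [row[-1] for row in reduced[:5]]
for name, value in zip(["V1", "V2", "I1", "I2", "I3"], solution):
    print(name, "=", value)
print("last row:", reduced[5])
```

Python's `Fraction` prints values in lowest terms, so V1 appears as 312/31, which is the same number as 2184/217. The last row of the reduced matrix consists of zeros, reflecting the redundant node equation.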

   As the two previous examples show, one has to be a bit careful picking the
unknowns and equations in a circuit to get a unique solution without introduc-
ing redundant equations. There are several ways to do this and when solving
small circuits by hand the linear system can be made much simpler to solve if
you pick the “right” technique. In the next section we will describe the “loop
current” technique which always leads to a solvable linear system with no extra
(redundant) equations.

3.5.3    Loop Currents
We want to be able to see any resistor network and write down equations that
will solve it uniquely, with no redundant equations like in the previous example
and no non-uniqueness (like that coming from the lack of a reference potential).
This can always be done using the following variables: loop currents, that is
currents in every elementary loop of the network, and voltage drops across any
current sources. This technique is described in more detail below.
    Consider a circuit that can be drawn on a piece of paper with no branches
overlapping (a so-called planar network). The branches in the circuit divide the
diagram into smaller areas. The set of branches around each of these small areas
is called an elementary loop. By assigning a loop current to each elementary
closed loop of the circuit, the second of Kirchhoff’s Circuit Laws is satisfied
automatically because in a closed loop the current entering any one point is
equal to the current travelling away from that point. Consider Figure 3.7 (left).
There are three elementary loops and loop currents i1 , i2 and i3 associated with
each of them. Loop currents sum when they overlap in a branch. For example,
the current downwards through the 3 Ω resistor in Figure 3.7 is i2 − i3 . Be
careful of signs as you sum loop currents. In the example above, i2 is downwards
through the 3 Ω resistor but i3 is upwards, hence it appears with a negative sign
in our expression.
    In a circuit, it is convenient to take loop currents and voltage drops across
current sources as the unknowns. The first step in solving any electric network
is to identify the number of elementary loops in the network. If there are m
independent loops present, then variables i1 , i2 , . . . , im must be introduced to
represent the loop currents of each. If there are n current sources, then the
variables v1 , v2 , . . . vn must be introduced to represent the voltage drop across
each source. Together there are n + m unknowns.
    We can apply Kirchhoff’s voltage law to each of the m loops and obtain m
linear equations for the unknowns. The current through each current source
must match the loop currents through it. This gives n more linear equations
for a total of n + m linear equations for the n + m unknowns.

Example 3.13 Solve the resistor network in Figure 3.6. Note that this is the
same circuit as Example 3.12 but here the loop current method will be used.
The unknowns are the loop currents i1 and i2 (there are no current sources).
Remember that the loop currents add in shared components. For example, the
current downwards in the 2Ω resistor is i1 − i2 . Using loop currents Kirch-
hoff’s second law is always satisfied. The equations needed to solve for the loop
currents are obtained by summing voltage drops around each elementary loop:

     i1 + 2(i1 − i2 ) + 4i1 − 12    =   0, voltage drops going around loop 1
               3i2 + 2(i2 − i1 )    =   0, voltage drops going around loop 2.

Collecting terms:

                                    7i1 − 2i2   =   12
                                   −2i1 + 5i2   =   0

which can be solved to give i1 = 60/31 (= 420/217, the value of I1 found in
Example 3.12) and i2 = 24/31. With these values of i1 and i2 the current through
each resistor and the power source can be determined, solving the problem.
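As a check on Example 3.13, the two loop equations can be solved exactly and the branch quantities compared with Example 3.12. This is a sketch in Python (not the course's MATLAB). The helper `solve_2x2` is a hypothetical name for Cramer's rule on a 2 by 2 system, and the node-voltage formulas in the comments rely on the Ohm's Law equations of Example 3.12.

```python
from fractions import Fraction

def solve_2x2(A, B, C, D, E, F):
    """Solve A*x + B*y = E, C*x + D*y = F by Cramer's rule (needs A*D - B*C != 0)."""
    det = Fraction(A * D - B * C)
    if det == 0:
        raise ValueError("singular system")
    return (D * E - B * F) / det, (A * F - C * E) / det

# Loop equations of Example 3.13:  7*i1 - 2*i2 = 12,  -2*i1 + 5*i2 = 0.
i1, i2 = solve_2x2(7, -2, -2, 5, 12, 0)

# Branch quantities, for comparison with Example 3.12.
current_2_ohm = i1 - i2      # downwards through the 2 ohm resistor (I2 in Example 3.12)
current_3_ohm = i2           # through the 3 ohm resistor (I3 in Example 3.12)
V1 = 12 - 1 * i1             # from 12 - V1 = I1 across the 1 ohm resistor
V2 = 4 * i1                  # from V2 = 4*I1 across the 4 ohm resistor

print("i1 =", i1, " i2 =", i2)
print("V1 =", V1, " V2 =", V2)
```

The two loop currents reproduce all five unknowns of the earlier six-equation system, which is the sense in which the loop current method uses fewer unknowns.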

   There are easier ways to solve this particular small problem (the easiest is
probably to use combinations of the series and parallel resistor laws). However,
the loop current rule works for networks of arbitrarily large size and leads to
systems of equations with a relatively small number of unknowns. On tests
and exams, it is expected that students will be able to apply the idea of loop
currents.

               Figure 3.6: The resistor network for Example 3.13

Example 3.14 Solve the network shown in Figure 3.7. There are three independent
loop currents which are labelled i1 , i2 and i3 . There is a single current
source of 4A with voltage drop v across it (minus to plus in the direction of the
current). There are also two 10V voltage sources. Remember that the current
in an electrical branch shared by two loop currents is equal to the (signed) sum
of the two loop currents, i.e. the current moving to the left through the 5Ω
resistor of Figure 3.7 is i1 − i3 and thus the voltage drop is 5(i1 − i3 ). Be careful
also of the sign of voltage drops. Moving around loop 1 in the circuit in Figure 3.7
clockwise, when the current source is crossed, there is a voltage increase
of v, so this would be a voltage drop of −v in the expression for Kirchhoff’s
first law (the voltage law) for this loop written below. We sum the voltage drops
around loop 1 beginning at the current source and moving clockwise (the same
direction as i1 ) to obtain
                       −v + 2i1 + 5(i1 − i3 ) + 2(i1 − i2 ) = 0
which can be simplified to
                               9i1 − 2i2 − 5i3 − v = 0
The equations for the voltage drops around loops 2 and 3 are derived similarly
                        −2i1    + 5i2     − 3i3     = −10
                        −5i1    − 3i2     + 8i3     =  10
The final linear equation comes from matching the loop currents to the current
source:
                                   i1 = 4
The four linear equations above can be solved for the four unknowns i1 , i2 , i3 , v.
The solution can be found using MATLAB (the details are in the computer lab
#4 guide):
                               i1 =       4A
                               i2 ≈ 2.3871A
                               i3 ≈ 4.6452A
                               v =        8V
Now that the loop currents are determined, the branch currents can be written
down. For example the current over the 5Ω resistor to the right is i3 − i1 =
0.6452A. The right hand panel of Figure 3.7 displays all six branch currents.
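The four equations of Example 3.14 can also be solved exactly. The following is a sketch in Python, not the MATLAB computation from the lab guide. The helper `rref` is a hypothetical Gauss-Jordan routine written for this illustration.

```python
from fractions import Fraction

def rref(rows):
    """Reduce an augmented matrix (a list of rows) to reduced row echelon form."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_rows, n_cols = len(m), len(m[0])
    pivot_row = 0
    for col in range(n_cols - 1):              # the last column is the right-hand side
        pivot = next((r for r in range(pivot_row, n_rows) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[pivot_row], m[pivot] = m[pivot], m[pivot_row]
        scale = m[pivot_row][col]
        m[pivot_row] = [x / scale for x in m[pivot_row]]
        for r in range(n_rows):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[pivot_row])]
        pivot_row += 1
        if pivot_row == n_rows:
            break
    return m

# Unknowns ordered i1, i2, i3, v.
system = [
    [ 9, -2, -5, -1,   0],   # voltage drops around loop 1
    [-2,  5, -3,  0, -10],   # voltage drops around loop 2
    [-5, -3,  8,  0,  10],   # voltage drops around loop 3
    [ 1,  0,  0,  0,   4],   # loop current matches the 4 A source
]

i1, i2, i3, v = (row[-1] for row in rref(system))
print("i1 =", i1, " i2 =", i2, " i3 =", i3, " v =", v)
print("i2 ~", round(float(i2), 4), " i3 ~", round(float(i3), 4))
print("current to the right through the 5 ohm resistor ~", round(float(i3 - i1), 4))
```

The exact values i2 = 74/31 and i3 = 144/31 round to the decimals quoted in the example.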






Figure 3.7: The left panel displays the schematic circuit from Example 3.14
with two 10V voltage sources, four resistors and a current source of 4A. The
loop currents i1 , i2 and i3 represent the current in each of the independent closed
loops. The right panel is the solution to the electric network on the left panel
with branch currents shown.


Example 3.15 Solve the resistor network in Figure 3.8. In this case, solve both
the basic problem when V = 9 and I = 1, and then the fundamental problem
for arbitrary V and I. The unknowns for the problem are i1 , i2 , i3 and E. One
equation of the system comes from the fact that the loop current
variables must match the current source:

                                    i3 = −I.

Note that this equation is so simple we will no longer consider i3 as a variable
but replace i3 by the known value −I in the equations below. Voltage drops
across the three loops give:

                       i1 + 2(i1 − i2 ) + 5i1 − V     =   0
                          3(i2 − i3 ) + 2(i2 − i1 )   =   0
                            5i3 + E + 3(i3 − i2 )     =   0

Since I and V and i3 (by the discussion above) are known quantities, we move
them to the right hand side of the linear equations for i1 , i2 and E which are
written below

                               8i1 − 2i2    = V
                             −2i1 + 5i2     = −3I                              (3.1)
                              −3i2 + E      =   8I.






              Figure 3.8: The resistor network for Example 3.15


With V = 9 and I = 1 this is solved using Gaussian Elimination to give the
solution E = 15/2 = 7.5, i2 = −1/6 and i1 = 13/12. Note that the negative value for i2
means that this loop current physically goes in the opposite direction to that in
the Figure. For the fundamental problem, we consider (3.1) for arbitrary values
of V and I. We write the system as an augmented matrix and do Gaussian
Elimination with symbolic terms in the right hand sides:
\[
\left[\begin{array}{ccc|c}
 8 & -2 & 0 & V \\
 -2 & 5 & 0 & -3I \\
 0 & -3 & 1 & 8I
\end{array}\right]
\sim
\left[\begin{array}{ccc|c}
 1 & -1/4 & 0 & V/8 \\
 0 & 1 & 0 & -\tfrac{2}{3}I + V/18 \\
 0 & 0 & 1 & 6I + V/6
\end{array}\right]
\]
so E = 6I + V /6 (the voltage across the current source in terms of the given
voltage and currents of sources) and (after some algebra) i1 = (10/72)V − (1/6)I
(the current through the power source). An alternate approach to solving the
fundamental problem is given in the next section.
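Because system (3.1) is linear in V and I, the fundamental problem can also be explored by solving it for V = 1, I = 0 and for V = 0, I = 1 and combining the results. The sketch below does this in Python (not the course's MATLAB). The function name `solve_network` is an assumption made for the illustration.

```python
from fractions import Fraction

def solve_network(V, I):
    """Return (i1, i2, E) solving system (3.1) for given source values V and I."""
    V, I = Fraction(V), Fraction(I)
    det = Fraction(8 * 5 - (-2) * (-2))       # determinant of the i1, i2 block (= 36)
    i1 = (5 * V - (-2) * (-3 * I)) / det      # Cramer's rule on the first two equations
    i2 = (8 * (-3 * I) - (-2) * V) / det
    E = 8 * I + 3 * i2                        # third equation: -3*i2 + E = 8*I
    return i1, i2, E

# Basic problem of Example 3.15.
print("V = 9, I = 1:", solve_network(9, 1))

# Responses to each source on its own give the coefficients of V and I.
per_volt = solve_network(1, 0)    # coefficients of V in (i1, i2, E)
per_amp = solve_network(0, 1)     # coefficients of I in (i1, i2, E)
print("coefficients of V:", per_volt)
print("coefficients of I:", per_amp)

# Superposition: the solution for any V and I is V*per_volt + I*per_amp.
V, I = 9, 1
combined = tuple(V * pv + I * pa for pv, pa in zip(per_volt, per_amp))
assert combined == solve_network(V, I)
```

The coefficients found this way agree with the symbolic result: E = 6I + V/6 and i1 = (5/36)V − (1/6)I, where 5/36 = 10/72.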


3.5.4    Alternate Presentation of Resistor Networks
Linear systems from resistor networks were presented in a previous version of
the notes in a different way. That previous presentation is reproduced here
beginning in the next paragraph. This alternate explanation may be helpful to
some students.
    Consider the circuit shown in Figure 3.9. We won’t be able to solve this
circuit until we have studied differential equations in the last part of this course.
However we can make some progress using what we know already.
    There are three types of components: resistors, inductors (coils) and ca-
pacitors. Associated with each component is the current I flowing through that
component, and the voltage drop V across that component. If there are n differ-
ent components in a circuit, then there are 2n variables (currents and voltages)
to determine. In the circuit of Figure 3.9 there are 8.
    Of course, these variables are not all independent. They satisfy two types
of linear relations: algebraic and differential. We won’t touch the differential
relations for now, but we can consider the algebraic relations.


(Diagram not reproduced; labelled components: resistor R1 with I1 , V1 ; resistor R2
with I2 , V2 ; inductor L with I3 , V3 ; capacitor C with I4 , V4 .)

                        Figure 3.9: A resistor network.


    The first algebraic relation relates the current and voltage drop across a
resistor. If R is the resistance and I and V are the current and voltage drop
respectively, then V = IR. In our example, this gives two equations

                                  V1   = I1 R1
                                  V2   = I2 R2


    The other two algebraic relations are Kirchhoff’s laws. The first of these
states that the total voltage drop across any loop in the circuit is zero. For the
two loops in the example circuit, this gives the equations

                                    V4 − V2     =      0
                              V1 + V3 + V2      =      0

Notice we have to take the direction of the arrows into account. The second
Kirchhoff law states that current cannot accumulate at a node. At each node,
the current flowing in must equal the current flowing out. In the example circuit
there are three nodes, giving the equations.

                               I4 + I2 − I1     =   0
                                    I1 − I3     =   0
                               I3 − I2 − I4     =   0

    We now want to pick a few of the variables, and solve for all the rest in terms
of these. In a small circuit like the example, this can be done “by hand.” For
example, it’s pretty obvious that I1 = I3 and V2 = V4 so one could eliminate two
variables right off the bat. However, it is also useful to have a systematic way
of doing it, that will work for any circuit (but probably will require a computer
for anything but the simplest circuit).
    As a rule of thumb, you can pick the voltages across the capacitors and the
currents through the inductors as basic variables and solve for the rest in terms
of these. In other words, we want I3 and V4 to be parameters when we solve
the system of equations. To accomplish this we will choose the order of the
variables with I3 and V4 at the end of the list. With this in mind we choose the
order I1 , I2 , I4 , V1 , V2 , V3 , I3 , V4 . Then the equations become

           R1 I1                       −V1                                 =0
                   R2 I2                         −V2                       =0
                                                 −V2                 +V4   =0
                                       V1        +V2    +V3                =0
            −I1    +I2      +I4                                            =0
             I1                                                −I3         =0
                   −I2      −I4                                +I3         =0

The matrix for this system is (since it is a homogeneous system of equations,
we don’t have to bother writing the augmented part)
\[
\begin{bmatrix}
 R_1 & 0 & 0 & -1 & 0 & 0 & 0 & 0 \\
 0 & R_2 & 0 & 0 & -1 & 0 & 0 & 0 \\
 0 & 0 & 0 & 0 & -1 & 0 & 0 & 1 \\
 0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 \\
 -1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 \\
 1 & 0 & 0 & 0 & 0 & 0 & -1 & 0 \\
 0 & -1 & -1 & 0 & 0 & 0 & 1 & 0
\end{bmatrix}
\]

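Here is the reduced form of this matrix.
\[
\begin{bmatrix}
 1 & 0 & 0 & 0 & 0 & 0 & -1 & 0 \\
 0 & 1 & 0 & 0 & 0 & 0 & 0 & -\frac{1}{R_2} \\
 0 & 0 & 1 & 0 & 0 & 0 & -1 & \frac{1}{R_2} \\
 0 & 0 & 0 & 1 & 0 & 0 & -R_1 & 0 \\
 0 & 0 & 0 & 0 & 1 & 0 & 0 & -1 \\
 0 & 0 & 0 & 0 & 0 & 1 & R_1 & 1 \\
 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]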

Thus
\[
\begin{aligned}
I_1 &= I_3 \\
I_2 &= \frac{1}{R_2} V_4 \\
I_4 &= I_3 - \frac{1}{R_2} V_4 \\
V_1 &= R_1 I_3 \\
V_2 &= V_4 \\
V_3 &= -R_1 I_3 - V_4
\end{aligned}
\]


So we have succeeded in expressing all the variables in terms of I3 and V4 . We
therefore need only determine these to solve the circuit completely.
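The expressions above can be checked by substituting them back into the seven equations. The sketch below does this in Python for one set of sample values. The numbers R1 = 2, R2 = 5, I3 = 3, V4 = 7 and the function names are assumptions chosen only for illustration.

```python
from fractions import Fraction

def circuit_solution(R1, R2, I3, V4):
    """All eight variables expressed in terms of the parameters I3 and V4."""
    return {
        "I1": I3,
        "I2": V4 / R2,
        "I4": I3 - V4 / R2,
        "V1": R1 * I3,
        "V2": V4,
        "V3": -R1 * I3 - V4,
        "I3": I3,
        "V4": V4,
    }

def residuals(x, R1, R2):
    """Left-hand sides of the seven homogeneous equations, in the order given in the text."""
    return [
        R1 * x["I1"] - x["V1"],           # Ohm's law, resistor R1
        R2 * x["I2"] - x["V2"],           # Ohm's law, resistor R2
        -x["V2"] + x["V4"],               # voltage drops around the first loop
        x["V1"] + x["V2"] + x["V3"],      # voltage drops around the second loop
        -x["I1"] + x["I2"] + x["I4"],     # current balance, node 1
        x["I1"] - x["I3"],                # current balance, node 2
        -x["I2"] - x["I4"] + x["I3"],     # current balance, node 3
    ]

R1, R2 = Fraction(2), Fraction(5)         # hypothetical resistances
x = circuit_solution(R1, R2, I3=Fraction(3), V4=Fraction(7))
print(residuals(x, R1, R2))               # every entry should be zero
```

Any other choice of I3 and V4 also gives zero residuals, which is what it means for these two variables to be free parameters.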






   Figure 3.10: Circuit diagrams for Problems 3.25 (left) and 3.26 (right).


3.5.5    Problems
Problem 3.25 Find the currents and voltages in each component of the circuit
shown in Figure 3.10 (left).

Problem 3.26 The resistances of the resistors in the circuit shown in
Figure 3.10 (right) are R1 = 4Ω, R2 = 1Ω, R3 = R4 = 2Ω. Find the current
I2 that flows through resistor R2 if the voltages across the batteries are E1 = 5V
and E2 = 3V . In which direction does I2 flow? Solve the problem as a system
of linear equations for the loop currents i1 and i2 shown in the diagram.


Problem 3.27 Consider the resistor network:




Given R1 = 3[Ω], R2 = 1[Ω], R3 = 4[Ω], V = 26[V ] and I = 2[A], answer the
following questions:

a) What is the voltage drop across R3 ?

b) What is the current flow through R2 ?

c) What is the voltage drop across R1 ?

Problem 3.28 Consider the following resistor network:








Suppose that R1 = 4[Ω], R2 = 1[Ω], R3 = 2[Ω] and that the current flow through
R3 is 1.5[A]. Solve the resistor network and answer the following questions:
a) What is the voltage drop across R3 ?
b) What is the current flow through R2 ?
c) What is the voltage drop across V ?

Problem 3.29 Consider the following resistor network:




Suppose that R1 = 4[Ω], R2 = 2[Ω], R3 = 10[Ω] and that V = 60[V ]. Solve the
resistor network and answer the following questions:

a) What is the voltage drop across R2 ?
b) What is the current flow through R1 ?
c) What is the current flow through R3 ?

Problem 3.30 Find the loop currents i1 , i2 , i3 in the following network:




   where R1 = 1[Ω], R2 = 3[Ω], R3 = 5[Ω], R4 = 2[Ω], E1 = 10[V ] and
E2 = 4[V ].





Problem 3.31 Consider the following network:




   where R1 = 1[Ω], R2 = 2[Ω], R3 = [Ω], R4 = [Ω], V = 25[V ] and I = 3[A].

a) Set up and solve the linear system for the loop currents i1 , i2 , i3 and the
   voltage E across the current source.

b) What is the voltage drop across R2 ?

c) What is the current flow through V ?

d) What is the voltage drop across I?

Problem 3.32 Consider the resistor network:




    with R1 = 2[Ω], R2 = 5[Ω], R3 = 3[Ω], V = 10[V ], I2 = 3[A]. Suppose that
the voltage drop across I2 is E2 = 5[V ].

a) What is the current flow through R2 ?

b) What is the voltage drop across I1 ?

c) What is the current flow through I1 ?

Problem 3.33 If a circuit contains only resistors, then we can solve it com-
pletely using the ideas of this section. Write down the linear equations satisfied
by the currents in the circuit shown in Figure 3.11. In this diagram, the compo-
nent on the far left is a voltage source (battery). The voltage across the voltage
source is always E.




Chapter 3. Solving Linear Systems                            3.6. Additional Topics



(Diagram not reproduced; labelled components: voltage source E and resistors
R1 , R2 , R3 , R4 , . . . , R2n−1 , R2n .)

                 Figure 3.11: The circuit from Problem 3.33.




   Figure 3.12: The minimization of a quadratic function in one variable.


3.6     Additional Topics
3.6.1    Quadratic Functions
Let us begin by recalling how we would find the minimum of a quadratic function
in one variable, namely a parabola given by f (x) = ax² + bx + c as shown in
Figure 3.12. We simply find the value of x for which the derivative is zero, that
is, we solve f ′(x) = 0. Notice that since f is quadratic, this is a linear equation

                                    2ax + b = 0

which is easily solved for x = −b/(2a) (provided a ≠ 0). So the minimum value
is f (−b/(2a)) = −b²/(4a) + c.
    Of course, if a is negative, then the parabola points downwards, and we have
found the maximum value, not the minimum value.
    A quadratic function of two variables x1 and x2 is a function of the form
\[
f(x_1, x_2) = a x_1^2 + 2b x_1 x_2 + c x_2^2 + d x_1 + e x_2 + f.
\]

(The 2 in front of b is just for convenience.) For what values of x1 and x2 is
f (x1 , x2 ) the smallest? Just like with the parabola in one variable, there may
be no such values. It could be that f has a maximum instead, or that f has
what is called a saddle point. However if f does have a minimum, the procedure
described below is guaranteed to find it. (If f has a maximum or saddle point,
the procedure will find these points instead.)
   The idea behind finding the minimum is simple. Suppose that x1 and x2 are
the values for which f (x1 , x2 ) is smallest. Then the function g(s) = f (x1 + s, x2 )
must have a minimum at s = 0. So g′(0) = 0. But
\[
\begin{aligned}
g'(s) &= \frac{d}{ds} f(x_1 + s, x_2) \\
      &= \frac{d}{ds} \left( a(x_1 + s)^2 + 2b(x_1 + s)x_2 + c x_2^2 + d(x_1 + s) + e x_2 + f \right) \\
      &= 2a(x_1 + s) + 2b x_2 + d
\end{aligned}
\]
so that the condition is
\[
g'(0) = 2a x_1 + 2b x_2 + d = 0.
\]
Notice that this expression can be obtained by holding x2 fixed and differentiating
with respect to x1 . It is called the partial derivative of f with respect to
x1 and is denoted ∂f /∂x1 .
    The same argument can be applied to h(s) = f (x1 , x2 + s) (or ∂f /∂x2 ). This
yields
\[
h'(0) = \frac{\partial f(x_1, x_2)}{\partial x_2} = 2b x_1 + 2c x_2 + e = 0.
\]
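    Therefore we conclude that the pair of values x1 and x2 at which f achieves
its minimum satisfy the system of linear equations
\[
\begin{aligned}
2a x_1 + 2b x_2 &= -d \\
2b x_1 + 2c x_2 &= -e
\end{aligned}
\]
This is a 2 by 2 system with augmented matrix
\[
\left[\begin{array}{cc|c}
 2a & 2b & -d \\
 2b & 2c & -e
\end{array}\right]
\]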
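To illustrate the procedure, the 2 by 2 system can be solved for a specific quadratic function. The sketch below is in Python. The coefficients a = 2, b = 1, c = 3, d = −4, e = 6 and constant term 1 are made up for the illustration, and the function names are hypothetical. The final loop only samples a few nearby points as a sanity check. It is not a proof that the point is a minimum.

```python
from fractions import Fraction

def make_quadratic(a, b, c, d, e, f0):
    """Return f(x1, x2) = a*x1^2 + 2*b*x1*x2 + c*x2^2 + d*x1 + e*x2 + f0."""
    def f(x1, x2):
        return a * x1 * x1 + 2 * b * x1 * x2 + c * x2 * x2 + d * x1 + e * x2 + f0
    return f

def critical_point(a, b, c, d, e):
    """Solve 2a*x1 + 2b*x2 = -d, 2b*x1 + 2c*x2 = -e by Cramer's rule."""
    det = Fraction(4 * a * c - 4 * b * b)
    if det == 0:
        raise ValueError("the 2 by 2 system has no unique solution")
    x1 = (2 * c * (-d) - 2 * b * (-e)) / det
    x2 = (2 * a * (-e) - 2 * b * (-d)) / det
    return x1, x2

# Hypothetical coefficients, chosen so that f has a minimum.
a, b, c, d, e, f0 = 2, 1, 3, -4, 6, 1
f = make_quadratic(a, b, c, d, e, f0)
x1, x2 = critical_point(a, b, c, d, e)
print("critical point:", x1, x2, " value:", f(x1, x2))

# Sample a few nearby points: each should give a larger value of f.
step = Fraction(1, 10)
for dx1, dx2 in [(step, 0), (-step, 0), (0, step), (0, -step), (step, step)]:
    assert f(x1 + dx1, x2 + dx2) > f(x1, x2)
```

For a function with a maximum or a saddle point the same `critical_point` call still returns the point where both partial derivatives vanish, but the sampling check would fail.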
    This is easily generalized to n variables. In this case the quadratic function
is given by
\[
f(x_1, x_2, \ldots, x_n) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{i,j} x_i x_j + \sum_{i=1}^{n} b_i x_i + c
\]
To see this is the same, let us expand out the first term when n = 2. Then
\[
\begin{aligned}
\sum_{i=1}^{n} \sum_{j=1}^{n} a_{i,j} x_i x_j
  &= a_{1,1} x_1 x_1 + a_{1,2} x_1 x_2 + a_{2,1} x_2 x_1 + a_{2,2} x_2 x_2 \\
  &= a_{1,1} x_1^2 + (a_{1,2} + a_{2,1}) x_1 x_2 + a_{2,2} x_2^2
\end{aligned}
\]
So this is just the same as before with a1,1 = a, a1,2 + a2,1 = 2b and a2,2 = c.
Notice that we might as well assume that ai,j = aj,i , since replacing both ai,j
and aj,i with (ai,j + aj,i )/2 doesn’t change f .


(Plot not reproduced; it shows data points (x1 , y1 ), . . . , (xi , yi ), . . . , (xn , yn ),
the line value axi + b at xi , and the vertical deviation yi − axi − b.)

                          Figure 3.13: Fitting a line through data.


    If this function f has a minimum we can find it by generalizing the procedure
above. In other words we try to find values of x1 , . . . , xn for which ∂f /∂x1 =
∂f /∂x2 = · · · = ∂f /∂xn = 0. This leads to a system of n linear equations whose
associated augmented matrix is
\[
\left[\begin{array}{cccc|c}
 2a_{1,1} & 2a_{1,2} & \cdots & 2a_{1,n} & -b_1 \\
 2a_{2,1} & 2a_{2,2} & \cdots & 2a_{2,n} & -b_2 \\
 \vdots & \vdots & & \vdots & \vdots \\
 2a_{n,1} & 2a_{n,2} & \cdots & 2a_{n,n} & -b_n
\end{array}\right]
\]
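As a worked step, the k-th row of this augmented matrix can be derived by differentiating f with respect to xk. The derivation below assumes, as suggested earlier, that the coefficients have been chosen symmetric (ai,j = aj,i). The index k is introduced here only for the derivation.

```latex
\[
\frac{\partial f}{\partial x_k}
  = \sum_{j=1}^{n} a_{k,j} x_j + \sum_{i=1}^{n} a_{i,k} x_i + b_k
  = 2 \sum_{j=1}^{n} a_{k,j} x_j + b_k ,
\]
\[
\frac{\partial f}{\partial x_k} = 0
  \quad\Longleftrightarrow\quad
  2a_{k,1} x_1 + 2a_{k,2} x_2 + \cdots + 2a_{k,n} x_n = -b_k .
\]
```

The first sum comes from the terms in which xk appears as the first factor and the second from the terms in which it appears as the second factor. Symmetry makes the two sums equal.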

3.6.2      Least squares fit
As a first application let us consider the problem of finding the “best” straight
line going through a collection of data points (x1 , y1 ), (x2 , y2 ), . . . , (xn , yn ). (Care-
ful! the xi ’s are not the unknowns in this problem, but rather the known fixed
data points, together with the yi ’s.) Consider Figure 3.13. Which straight line
fits best? There is no one answer. One can measure how good the fit of a
straight line is in various ways. However the following way of measuring the fit
results in a problem that is easy to solve.
    Each line is given by an equation y = ax+b. So the variables in this problem
are a and b. We want to find the values of a and b that give the best fitting
line. The vertical distance between the point (xi , yi ) and the line is given by
|yi − axi − b|. We will take as a measure of the fit, the square of this quantity,
added up over all the data points. So

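\[
\begin{aligned}
f(a, b) &= \sum_i (y_i - a x_i - b)^2 \\
        &= \sum_i \left( y_i^2 + x_i^2 a^2 + b^2 - 2 x_i y_i a - 2 y_i b + 2 x_i a b \right) \\
        &= \Big( \sum_i x_i^2 \Big) a^2 + 2 \Big( \sum_i x_i \Big) ab + n b^2
           - 2 \Big( \sum_i x_i y_i \Big) a - 2 \Big( \sum_i y_i \Big) b + \sum_i y_i^2
\end{aligned}
\]
Here we used that ∑i 1 = n, the number of points. Therefore the linear
equations we must solve for a and b are
\[
\left[\begin{array}{cc|c}
 2 \sum_i x_i^2 & 2 \sum_i x_i & 2 \sum_i x_i y_i \\
 2 \sum_i x_i & 2n & 2 \sum_i y_i
\end{array}\right]
\]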

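We could solve these equations numerically in each particular case, but since it’s
just a 2 by 2 system we can also write down the answer explicitly. In fact, the
solution to
\[
\left[\begin{array}{cc|c}
 A & B & E \\
 C & D & F
\end{array}\right]
\]
is
\[
\begin{bmatrix}
 \dfrac{DE - BF}{AD - BC} \\[2ex]
 \dfrac{AF - CE}{AD - BC}
\end{bmatrix},
\]
provided AD − BC ≠ 0, as you may check directly, or derive using a sequence
of row transformations. So in this case
\[
\begin{aligned}
a &= \frac{n \left( \sum x_i y_i \right) - \left( \sum x_i \right) \left( \sum y_i \right)}
          {n \left( \sum x_i^2 \right) - \left( \sum x_i \right)^2} \\
b &= \frac{\left( \sum x_i^2 \right) \left( \sum y_i \right) - \left( \sum x_i \right) \left( \sum x_i y_i \right)}
          {n \left( \sum x_i^2 \right) - \left( \sum x_i \right)^2}
\end{aligned}
\]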


Example 3.16 Suppose we want to find the best straight line through the points
(1, 1), (2, 2), (2, 3), (3, 3) and (3, 4). Calculate
\[
\begin{aligned}
\sum 1       &= n = 5 \\
\sum x_i     &= 1 + 2 + 2 + 3 + 3 = 11 \\
\sum y_i     &= 1 + 2 + 3 + 3 + 4 = 13 \\
\sum x_i^2   &= 1 + 4 + 4 + 9 + 9 = 27 \\
\sum y_i^2   &= 1 + 4 + 9 + 9 + 16 = 39 \\
\sum x_i y_i &= 1 + 4 + 6 + 9 + 12 = 32
\end{aligned}
\]
so
\[
a = (5 \cdot 32 - 11 \cdot 13)/(5 \cdot 27 - 11^2) = 17/14 = 1.214\ldots
\]
and
\[
b = (27 \cdot 13 - 11 \cdot 32)/(5 \cdot 27 - 11^2) = -1/14 = -0.0714\ldots
\]
The result is shown in Figure 3.14.
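The explicit formulas for a and b are easy to turn into a short program. The following Python sketch (the function name `fit_line` is an assumption, and the course labs themselves use MATLAB) reproduces the numbers of Example 3.16 with exact fractions.

```python
from fractions import Fraction

def fit_line(points):
    """Least squares line y = a*x + b through points with integer or Fraction coordinates."""
    n = len(points)
    sx = sum(x for x, _ in points)          # sum of x_i
    sy = sum(y for _, y in points)          # sum of y_i
    sxx = sum(x * x for x, _ in points)     # sum of x_i^2
    sxy = sum(x * y for x, y in points)     # sum of x_i * y_i
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("all x values coincide; the line is not determined")
    a = Fraction(n * sxy - sx * sy, denom)
    b = Fraction(sxx * sy - sx * sxy, denom)
    return a, b

points = [(1, 1), (2, 2), (2, 3), (3, 3), (3, 4)]
a, b = fit_line(points)
print("a =", a, "~", float(a))
print("b =", b, "~", float(b))
```

The denominator is zero exactly when all the xi are equal, in which case the 2 by 2 system has no unique solution and the sketch raises an error instead of dividing by zero.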


3.6.3    Equilibrium configuration of hanging weights and
         springs
Consider the problem of n vertically hanging weights connected by springs. What
is the equilibrium configuration? We can solve this problem by calculating the
total potential energy of the system. The equilibrium configuration minimizes
the total potential energy. A diagram of the setup is shown in Figure 3.15. Our
goal is to compute the numbers x1 , . . ., xn . In the diagram n = 3.






                       Figure 3.14: The solution in Example 3.16.


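                  Figure 3.15: Equilibrium configuration of springs.
   (Diagram: weights m1 , m2 , m3 hang in a vertical chain; spring i has spring
   constant ki and natural length li , and xi is the position of weight i,
   measured downward.)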


   There are two sources of potential energy. One is the potential energy stored
in the spring. This is equal to $ks^2/2$, where k is the spring constant that
measures the stiffness of the spring, and s is the amount that the spring has
been stretched from its natural length. In our problem, suppose that the spring
constant of the ith spring is $k_i$ and its natural length is $l_i$. Then the potential
energy stored in the ith spring is $k_i(x_i - x_{i-1} - l_i)^2/2$. To make this formula
work out correctly for the first spring we set $x_0 = 0$.
   The other source of potential energy is gravity. The gravitational potential
energy of the ith weight is $-m_i g x_i$. The reason for the minus sign is that we
are measuring distances downward.
   Thus the total potential energy in the system for n weights is the function
\[
f(x_1, x_2, \ldots, x_n) = \sum_{i=1}^{n} \left[ \frac{k_i}{2}(x_i - x_{i-1} - l_i)^2 - m_i g x_i \right].
\]
When n = 3 this becomes
\[
f(x_1, x_2, x_3) = \frac{k_1}{2}(x_1 - l_1)^2 + \frac{k_2}{2}(x_2 - x_1 - l_2)^2 + \frac{k_3}{2}(x_3 - x_2 - l_3)^2 - m_1 g x_1 - m_2 g x_2 - m_3 g x_3
\]
This is a quadratic function, so we know how to find the minimum. The equations
are obtained by taking partial derivatives: to get the first equation we
hold x2 and x3 fixed and differentiate with respect to x1 , and so on. Thus the
equations are
\[
\begin{aligned}
k_1(x_1 - l_1) - k_2(x_2 - x_1 - l_2) - m_1 g &= 0 \\
k_2(x_2 - x_1 - l_2) - k_3(x_3 - x_2 - l_3) - m_2 g &= 0 \\
k_3(x_3 - x_2 - l_3) - m_3 g &= 0
\end{aligned}
\]

The augmented matrix for this system is
\[
\left[\begin{array}{ccc|c}
k_1 + k_2 & -k_2 & 0 & m_1 g + k_1 l_1 - k_2 l_2 \\
-k_2 & k_2 + k_3 & -k_3 & m_2 g + k_2 l_2 - k_3 l_3 \\
0 & -k_3 & k_3 & m_3 g + k_3 l_3
\end{array}\right]
\]

Example 3.17 Suppose that the spring constants are k1 = 1, k2 = 2 and k3 =
1. The masses are all equal to 1, g = 10 and the natural length of the springs is
1 for all springs (in appropriate units). Find the equilibrium configuration. We
must solve
\[
\left[\begin{array}{ccc|c}
3 & -2 & 0 & 9 \\
-2 & 3 & -1 & 11 \\
0 & -1 & 1 & 11
\end{array}\right]
\]
Gaussian elimination gives
\[
\left[\begin{array}{ccc|c}
3 & -2 & 0 & 9 \\
0 & -1 & 1 & 11 \\
0 & 0 & 2 & 106
\end{array}\right]
\]
which can be solved to give x1 = 31, x2 = 42, x3 = 53.
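
Example 3.17 can also be checked numerically. The sketch below is an illustration rather than part of the original notes: it builds the coefficient matrix and right-hand side for n = 3 from the augmented matrix given above and solves with the backslash operator. The names k, l, m, K and rhs are chosen here for convenience.

```matlab
% Parameters of Example 3.17 (in the example's units)
k = [1 2 1];   % spring constants k1, k2, k3
l = [1 1 1];   % natural lengths l1, l2, l3
m = [1 1 1];   % masses m1, m2, m3
g = 10;

% Coefficient matrix and right-hand side of the n = 3 system
K = [k(1)+k(2), -k(2),      0;
     -k(2),     k(2)+k(3),  -k(3);
     0,         -k(3),      k(3)];
rhs = [m(1)*g + k(1)*l(1) - k(2)*l(2);
       m(2)*g + k(2)*l(2) - k(3)*l(3);
       m(3)*g + k(3)*l(3)];

% Equilibrium positions x1, x2, x3 (measured downward)
x = K \ rhs
```

Changing the entries of k, l or m and re-running shows how the equilibrium responds, for example how stiffer springs shorten the chain.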

3.6.4     Problems
Problem 3.34 Find the “best” straight line going through the points (1, 1),
(2, 1), (2, 3), (3, 4), (3, 5) and (4, 4).

Problem 3.35 Consider the problem of finding the parabola y = ax² + bx + c
that best fits the n data points (x1 , y1 ), . . . , (xn , yn ). Derive the system of three
linear equations which determine a, b and c. (You need not solve them!)

Problem 3.36 Write down the augmented matrix for a system of n weights
and springs.

Problem 3.37 Write down the system of equations you would have to solve if
there are 5 identical springs with ki = 1 and li = 1 and five weights with m1 = 1,
m2 = 2, m3 = 3, m4 = 4, and m5 = 5.







3.7     Solutions to Chapter Problems
Solution 3.1 The augmented matrix is
\[
A = \left[\begin{array}{ccc|c}
1 & -2 & 3 & 6 \\
4 & -5 & -6 & 7 \\
8 & 9 & 10 & 11
\end{array}\right]
\]

Solution 3.2 Here is the sequence of systems of equations you would get:
                        x1     + x2    + x3           =  6
                        x1     − x2    + x3           =  0
                       2x1     + x2    − 8x3          = −11

                        x1     + x2    + x3           =  6
                               − 2x2                  = −6
                       2x1     + x2    − 8x3          = −11

                        x1     + x2    + x3           =  6
                               − 2x2                  = −6
                               − x2    − 10x3         = −23

                        x1     + x2    + x3           =  6
                                 x2                   =  3
                               − x2    − 10x3         = −23

                        x1     + x2    + x3           =  6
                                 x2                   =  3
                                       − 10x3         = −20
So x3 = 2, x2 = 3 and x1 = 1.
Solution 3.3
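A = [1, 1, 1, 6; 1, -1, 1, 0; 2, 1, -8, -11]   % augmented matrix
A(2,:) = A(2,:) - A(1,:)      % eliminate x1 from row 2
A(3,:) = A(3,:) - 2*A(1,:)    % eliminate x1 from row 3
A(2,:) = -A(2,:)/2            % scale row 2 so that its pivot is 1
A(3,:) = A(3,:) + A(2,:)      % eliminate x2 from row 3
A(3,:) = -A(3,:)/10           % scale row 3 so that its pivot is 1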
We get the same solution as in the pen-and-paper version in problem 3.2.
Solution 3.4 Perform the sequence of operations to finally end up with the
system

                                          2x3     =    4
                             3x1 − 3x2 + 3x3      =    30
                             9x1      + 2x3       =    26.

Hence, x1 = 22/9, x2 = −50/9, and x3 = 2.





Solution 3.5 We want to solve the system
                                   2x1 + x2   =   5
                                  3x1 + 5x2   = −10.


We shall use the method of substitution.
  From the first equation, we solve for x2 :
                                      x2 = 5 − 2x1
We then substitute it into the second equation:
                                 3x1 + 5(5 − 2x1 ) = −10.
This equation involves only x1 ; it gives x1 = 5, and then x2 = 5 − 2x1 = −5.
Solution 3.6 The first equation reads x1 = 3. The second reads x1 − x2 = 3, or
3 − x2 = 3, or x2 = 0. The third reads 2x1 + x2 − 8x3 = −4, or 6 + 0 − 8x3 = −4
or x3 = 5/4.
Solution 3.7 The last equation gives x4 = 2 and the second last one x3 = 2.
Then we have to introduce a parameter x2 = s, and then we find that x1 = −5 − 2s.
Thus
\[
x = \begin{bmatrix} -5 \\ 0 \\ 2 \\ 2 \end{bmatrix} + s \begin{bmatrix} -2 \\ 1 \\ 0 \\ 0 \end{bmatrix}.
\]
In the second system, we have to introduce a parameter right off the bat. So
x4 = s1 and x3 = 4 − s1 . Moving up one row, we have to introduce another
parameter x2 = s2 , and then x1 = 1 − 2s2 − (4 − s1 ) − 2s1 = −3 − s1 − 2s2 , so
\[
x = \begin{bmatrix} -3 \\ 0 \\ 4 \\ 0 \end{bmatrix} + s_1 \begin{bmatrix} -1 \\ 0 \\ -1 \\ 1 \end{bmatrix} + s_2 \begin{bmatrix} -2 \\ 1 \\ 0 \\ 0 \end{bmatrix}.
\]
The third system is just the same. The extra rows of zeros have no effect.
Solution 3.8 Perform the following sequence of row operations:
 i) (2, :) = (2, :) − (1, :)
 ii) (3, :) = (3, :) − (1, :)
iii) (3, :) = (3, :) + 2(2, :)
iv) (4, :) = (4, :) − 3(2, :)
 v) (4, :) = (4, :) + (3, :),
in order to get the row echelon form
\[
\left[\begin{array}{cccc|c}
1 & 2 & 2 & 2 & 1 \\
0 & 1 & -3 & 1 & -2 \\
0 & 0 & -7 & 1 & 0 \\
0 & 0 & 0 & 0 & 0
\end{array}\right].
\]




Thus the rank of the augmented matrix, which is the same as the rank of the
unaugmented matrix, is r = 3, while the number of unknowns is n = 4.
Therefore, the system has infinitely many solutions, which can be
represented as the general solution

                                        x = q + sa,

where s is any real number, q = (q1 , q2 , q3 , q4 ) is any particular solution of
the original system, and a = (a1 , a2 , a3 , a4 ) is any non-zero solution of the
corresponding homogeneous system.
   From the row echelon form we can see that q has to satisfy −7q3 + q4 = 0.
Let q4 = 7, then q3 = 1. For the other two q values we have

                          q2      = −2 + 3q3 − q4 = −6
                          q1      =   1 − 2q2 − 2q3 − 2q4 = −3.

Hence, we have that q = (−3, −6, 1, 7).
   Now, to find a, the homogeneous row echelon form is
\[
\left[\begin{array}{cccc|c}
1 & 2 & 2 & 2 & 0 \\
0 & 1 & -3 & 1 & 0 \\
0 & 0 & -7 & 1 & 0 \\
0 & 0 & 0 & 0 & 0
\end{array}\right].
\]
Thus, a has to satisfy −7a3 + a4 = 0. Take a4 = 7, then a3 = 1.
   From the second row we have that a2 = 3a3 − a4 = −4, and from the first
row we get that a1 = −2a2 − 2a3 − 2a4 = −8. So a = (−8, −4, 1, 7), and a
general form of the solution is
\[
x = \begin{bmatrix} -3 \\ -6 \\ 1 \\ 7 \end{bmatrix} + s \begin{bmatrix} -8 \\ -4 \\ 1 \\ 7 \end{bmatrix}
\]
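
A quick way to gain confidence in a general solution like this one is to substitute it back. The following MATLAB sketch (an addition for illustration; the names R, A, b and the residual variables are chosen here) uses the row echelon form from Solution 3.8, which has the same solutions as the original system.

```matlab
% Row echelon form from Solution 3.8; the last column is the right-hand side
R = [1 2  2 2  1;
     0 1 -3 1 -2;
     0 0 -7 1  0;
     0 0  0 0  0];
A = R(:, 1:4);
b = R(:, 5);

q = [-3; -6; 1; 7];   % particular solution
a = [-8; -4; 1; 7];   % solution of the homogeneous system

res_q = A*q - b       % should be the zero vector
res_a = A*a           % should be the zero vector

% Any x = q + s*a then solves the system; s = 2.5 is an arbitrary choice
s = 2.5;
res_x = A*(q + s*a) - b
```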

Solution 3.9 Perform the following sequence of row operations

 i) (2, :) = (2, :) − 3(1, :)

 ii) (4, :) = (4, :) − 2(1, :)

iii) (4, :) = (4, :) + 2(3, :)

iv) (3, :) ↔ (2, :)

 v) (4, :) ↔ (3, :)

vi) (4, :) = (4, :) − (3/2)(3, :),

to transform the augmented matrix into the row echelon form
\[
\left[\begin{array}{cccc|c}
1 & 2 & 2 & -7 & 20 \\
0 & 6 & 0 & -6 & -10 \\
0 & 0 & -6 & 0 & -30 \\
0 & 0 & 0 & 16 & -30
\end{array}\right].
\]
We can now solve the system starting from the bottom:
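\[
\begin{aligned}
x_4 &= -\frac{30}{16} = -\frac{15}{8} \\
x_3 &= \frac{-30}{-6} = 5 \\
x_2 &= \frac{-10 + 6x_4}{6} = -\frac{85}{24} \\
x_1 &= 20 - 2x_2 - 2x_3 + 7x_4 = \frac{95}{24}
\end{aligned}
\]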
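
Solving "from the bottom" is back substitution, and it can be written as a short loop. This is a minimal sketch under the assumption that the matrix is square and upper triangular with non-zero pivots, as in Solution 3.9; the names U, c and x are chosen here for illustration.

```matlab
% Echelon form from Solution 3.9, split into coefficients U and right-hand side c
U = [1 2  2 -7;
     0 6  0 -6;
     0 0 -6  0;
     0 0  0 16];
c = [20; -10; -30; -30];

n = length(c);
x = zeros(n, 1);
for i = n:-1:1
    % subtract the terms with already-known unknowns, then divide by the pivot
    x(i) = (c(i) - U(i, i+1:n) * x(i+1:n)) / U(i, i);
end
x

% Compare with the fractions found by hand
err = norm(x - [95/24; -85/24; 5; -15/8])
```

For i = n the index range i+1:n is empty, so the product contributes zero and the last unknown is simply c(n)/U(n,n).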

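Solution 3.10 The matrix
\[
\left[\begin{array}{ccc|c}
1 & -2 & 3 & 2 \\
2 & -3 & 2 & 2 \\
3 & 2 & -4 & 9
\end{array}\right]
\]
reduces to
\[
\left[\begin{array}{ccc|c}
1 & -2 & 3 & 2 \\
0 & 1 & -4 & -2 \\
0 & 0 & 19 & 19
\end{array}\right]
\]
which has as solution x1 = 3, x2 = 2, x3 = 1.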
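Solution 3.11 The matrix
\[
\left[\begin{array}{ccc|c}
2 & 1 & -1 & 6 \\
1 & -2 & -2 & 1 \\
-1 & 12 & 8 & 7
\end{array}\right]
\]
reduces to
\[
\left[\begin{array}{ccc|c}
2 & 1 & -1 & 6 \\
0 & -5/2 & -3/2 & -2 \\
0 & 0 & 0 & 0
\end{array}\right]
\]
which has as solution x1 = 13/5 + 4s/5, x2 = 4/5 − 3s/5, x3 = s.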
Solution 3.12 The matrix
\[
\left[\begin{array}{ccc|c}
1 & 2 & 4 & 1 \\
1 & 1 & 3 & 2 \\
2 & 5 & 9 & 1
\end{array}\right]
\]
reduces to
\[
\left[\begin{array}{ccc|c}
1 & 2 & 4 & 1 \\
0 & -1 & -1 & 1 \\
0 & 0 & 0 & 0
\end{array}\right]
\]
Thus, setting x3 = s, we obtain x2 = −1 − s and x1 = 3 − 2s.
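Solution 3.13 The matrix
\[
\left[\begin{array}{ccc|c}
1 & 2 & 4 & 1 \\
1 & 1 & 3 & 2 \\
2 & 5 & 9 & 3
\end{array}\right]
\]
reduces to
\[
\left[\begin{array}{ccc|c}
1 & 2 & 4 & 1 \\
0 & -1 & -1 & 1 \\
0 & 0 & 0 & 2
\end{array}\right]
\]
which has no solutions.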
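
Solutions 3.12 and 3.13 share the same coefficient matrix and differ only in the right-hand side, so they make a convenient illustration of the rank test. The sketch below is an addition to the notes; it uses MATLAB's rank function, and the names b12, b13, rA, r12 and r13 are chosen here.

```matlab
A   = [1 2 4; 1 1 3; 2 5 9];   % coefficient matrix shared by both problems
b12 = [1; 2; 1];               % right-hand side in Solution 3.12
b13 = [1; 2; 3];               % right-hand side in Solution 3.13

rA  = rank(A)          % rank of the unaugmented matrix
r12 = rank([A b12])    % equal to rA: consistent, with 3 - rA free parameters
r13 = rank([A b13])    % larger than rA: the system has no solutions
```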

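Solution 3.14 The matrix
\[
\left[\begin{array}{cccc|c}
3 & 1 & -1 & 2 & 7 \\
2 & -2 & 5 & -7 & 1 \\
-4 & -4 & 7 & -11 & -13
\end{array}\right]
\]
reduces to
\[
\left[\begin{array}{cccc|c}
3 & 1 & -1 & 2 & 7 \\
0 & -8/3 & 17/3 & -25/3 & -11/3 \\
0 & 0 & 0 & 0 & 0
\end{array}\right]
\]
which has as solution x1 = 15/8 − 3s1 /8 + 3s2 /8, x2 = 11/8 + 17s1 /8 − 25s2 /8,
x3 = s1 , x4 = s2 .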
Solution 3.15 Probably the easiest way to do this problem is to think
geometrically. This system of equations describes the intersection of two lines. The
lines will intersect in a single point if they are not parallel. This will happen
exactly when the two vectors [a, b] and [c, d] are not parallel. Recall that this
can be tested using the determinant. So the equations have a unique solution
exactly when
\[
\det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc \neq 0.
\]
Solution 3.16 Perform the following sequence of row operations:
 i) (2, :) = (2, :) − 4(1, :)
 ii) (3, :) = (3, :) + 4(1, :)
iii) (3, :) = (3, :) + (5/3)(2, :)
to get the row echelon matrix
\[
\left[\begin{array}{ccc|c}
1 & 2 & 0 & 7 \\
0 & 0 & 6 & -18 \\
0 & 0 & 0 & 139
\end{array}\right].
\]
The last equation says 0 = 139, which cannot happen, therefore this linear
system has no solutions.
Solution 3.17 The script is the following:

for a = 1:10
    % augmented matrix; only the last right-hand side entry depends on a
    A = [1 2 4 7; 4 1 3 2; 0 5 9 a];
    rref(A)   % display the reduced row echelon form for this value of a
end

As a increases in equal steps, x1 changes in equal steps too, that is, x1 depends
linearly on a (and the same holds for x2 and x3 ).
Solution 3.18 Perform the following sequence of row operations:

  i) (2, :) = (2, :) + (1, :)

 ii) (3, :) = (3, :) − (2, :)

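to get the reduced row echelon matrix
\[
\left[\begin{array}{cccc|c}
1 & 0 & 1 & 0 & 10 \\
0 & 1 & 2 & 1 & 14 \\
0 & 0 & 0 & 0 & 0
\end{array}\right].
\]
Thus the row echelon form for the corresponding homogeneous system is
\[
\left[\begin{array}{cccc|c}
1 & 0 & 1 & 0 & 0 \\
0 & 1 & 2 & 1 & 0 \\
0 & 0 & 0 & 0 & 0
\end{array}\right].
\]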

Let q be any solution to the original system. For example, check that q =
(9, 11, 1, 1) solves the system.
    Now, let a1 and a2 be any two linearly independent vectors which solve the
corresponding homogeneous system. For example, check that a1 = (1, 1, −1, 1),
and a2 = (1, −1, −1, 3) are two linearly independent vectors, which solve the
homogeneous system.
    Thus, one representation for the solutions of the system is
\[
x = \begin{bmatrix} 9 \\ 11 \\ 1 \\ 1 \end{bmatrix} + s_1 \begin{bmatrix} 1 \\ 1 \\ -1 \\ 1 \end{bmatrix} + s_2 \begin{bmatrix} 1 \\ -1 \\ -1 \\ 3 \end{bmatrix}.
\]

Solution 3.19 The general solution is [4, 0, 3, 0] + s1 [−1, 1, 0, 0] + s2 [0, 0, −2, 1].
Solution 3.20 Perform the following sequence of row operations:

  i) (2, :) = (2, :) − (1, :)

 ii) (3, :) = (3, :) − (1/2)(2, :)

iii) (2, :) = (1/12)(2, :)

 iv) (3, :) = (1/4)(3, :)

  v) (2, :) = (2, :) + (1/3)(3, :)

 vi) (1, :) = (1, :) − 6(3, :)

vii) (1, :) = (1, :) + 3(2, :)

to get the reduced row echelon matrix
\[
\left[\begin{array}{ccc|c}
1 & 0 & 1/2 & 0 \\
0 & 1 & -7/6 & 0 \\
0 & 0 & 0 & 1
\end{array}\right].
\]
Here the rank of the augmented matrix (ra = 3) is greater than the rank of the
unaugmented matrix (ru = 2), therefore the linear system has no solutions.
Solution 3.21 To decide whether the vectors are linearly independent we must
decide whether the homogeneous system of equations represented by the matrix
\[
\begin{bmatrix} 1 & 1 & 1 \\ 2 & 1 & 0 \\ 0 & -1 & 1 \\ 2 & 1 & 0 \end{bmatrix}
\]
has a non-zero solution. Row reduction yields
\[
\begin{bmatrix} 1 & 1 & 1 \\ 0 & -1 & -2 \\ 0 & 0 & 3 \\ 0 & 0 & 0 \end{bmatrix}.
\]
This shows that the zero solution is unique and therefore the vectors are
independent. It can’t happen that three vectors span a four dimensional space. To test
whether y1 is a linear combination of the xi ’s we try to solve the equation with
augmented matrix
\[
\left[\begin{array}{ccc|c}
1 & 1 & 1 & 2 \\
2 & 1 & 0 & 4 \\
0 & -1 & 1 & -3 \\
2 & 1 & 0 & 4
\end{array}\right].
\]
Row reduction gives
\[
\left[\begin{array}{ccc|c}
1 & 1 & 1 & 2 \\
0 & -1 & -2 & 0 \\
0 & 0 & 3 & -3 \\
0 & 0 & 0 & 0
\end{array}\right].
\]
This system does have a solution, therefore the vector y1 is a linear combination
of the xi ’s.
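
The two tests used in this solution can be repeated numerically. In the sketch below (an illustration added to the notes, with the names X, rX, rXy and c chosen here) the columns of X are the vectors x1 , x2 , x3 read off from the matrix above.

```matlab
X  = [1 1 1; 2 1 0; 0 -1 1; 2 1 0];   % columns are the vectors x1, x2, x3
y1 = [2; 4; -3; 4];

rX  = rank(X)        % equals the number of columns when the vectors are independent
rXy = rank([X y1])   % equals rX when y1 is a linear combination of the columns

% Coefficients of the combination (backslash solves in the least-squares
% sense, which is exact here because the system is consistent)
c = X \ y1
residual = norm(X*c - y1)
```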
Solution 3.22 Three vectors in the plane cannot be linearly independent.
   There are two ways to see that y is a linear combination of a1 and a2 :

    • a1 and a2 are linearly independent because
\[
\det \begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix} = -5 \neq 0.
\]
      Therefore {a1 , a2 } form a basis of R2 , and every vector is a linear
      combination of a1 and a2 .

    • Solve the system
                                           x1 a1 + x2 a2 = y
      i.e.
\[
x_1 \begin{bmatrix} 1 \\ 2 \end{bmatrix} + x_2 \begin{bmatrix} 3 \\ 1 \end{bmatrix} = \begin{bmatrix} -15 \\ 5 \end{bmatrix}.
\]
      This system yields the augmented matrix
\[
\left[\begin{array}{cc|c} 1 & 3 & -15 \\ 2 & 1 & 5 \end{array}\right]
\xrightarrow{(2,:) = (2,:) - 2(1,:)}
\left[\begin{array}{cc|c} 1 & 3 & -15 \\ 0 & -5 & 35 \end{array}\right]
\xrightarrow{(2,:) = -\frac{1}{5}(2,:)}
\left[\begin{array}{cc|c} 1 & 3 & -15 \\ 0 & 1 & -7 \end{array}\right],
\]
      which gives x2 = −7, and x1 = −15 − 3x2 = 6.
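
Both arguments in Solution 3.22 can be checked in a few lines of MATLAB. This is an added sketch, not part of the original solution; the columns of A are a1 and a2 as they appear in the system above, and the names A, y, d, x and residual are chosen here.

```matlab
A = [1 3; 2 1];   % columns are a1 = (1,2) and a2 = (3,1)
y = [-15; 5];

d = det(A)        % non-zero, so a1 and a2 form a basis of the plane
x = A \ y         % coefficients x1, x2 with x1*a1 + x2*a2 = y

% Substitute back to confirm the coefficients
residual = norm(x(1)*A(:,1) + x(2)*A(:,2) - y)
```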

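Solution 3.23 The vectors a1 , a2 , and a3 are linearly independent if the only
solution to the system
                         x1 a1 + x2 a2 + x3 a3 = 0
is x1 = x2 = x3 = 0. So,
\[
\left[\begin{array}{ccc|c} 1 & 0 & 10 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 4 & -5 & 0 \\ 0 & -3 & 0 & 0 \end{array}\right]
\xrightarrow{\substack{(2,:) = (2,:) - (1,:) \\ (4,:) = -\frac{1}{3}(4,:)}}
\left[\begin{array}{ccc|c} 1 & 0 & 10 & 0 \\ 0 & 0 & -10 & 0 \\ 0 & 4 & -5 & 0 \\ 0 & 1 & 0 & 0 \end{array}\right]
\]
\[
\xrightarrow{\substack{(2,:) = -\frac{1}{10}(2,:) \\ (3,:) \leftrightarrow (4,:)}}
\left[\begin{array}{ccc|c} 1 & 0 & 10 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 4 & -5 & 0 \end{array}\right]
\xrightarrow{\substack{(2,:) \leftrightarrow (3,:) \\ (4,:) = (4,:) - 4(3,:) + 5(2,:)}}
\left[\begin{array}{ccc|c} 1 & 0 & 10 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]
\]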

The system has a unique solution x1 = x2 = x3 = 0, hence the three vectors
are linearly independent.
   To find if y can be written as a linear combination of the three vectors, we
look for a solution of the system
\[
\left[\begin{array}{ccc|c} 1 & 0 & 10 & 11 \\ 1 & 0 & 0 & 1 \\ 0 & 4 & -5 & -1 \\ 0 & -3 & 0 & 10 \end{array}\right]
\xrightarrow{\substack{(2,:) = (2,:) - (1,:) \\ (3,:) = (3,:) + (4,:)}}
\left[\begin{array}{ccc|c} 1 & 0 & 10 & 11 \\ 0 & 0 & -10 & -10 \\ 0 & 1 & -5 & 9 \\ 0 & -3 & 0 & 10 \end{array}\right]
\]
\[
\xrightarrow{\substack{(2,:) = -\frac{1}{10}(2,:) \\ (4,:) = (4,:) + 3(3,:)}}
\left[\begin{array}{ccc|c} 1 & 0 & 10 & 11 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & -5 & 9 \\ 0 & 0 & -15 & 37 \end{array}\right]
\]
The second row of the last matrix implies that x3 = 1, while
the fourth row yields that x3 = −37/15. This contradiction means that the system
has no solution, and y is therefore not a linear combination of a1 , a2 , and a3 .
Solution 3.24

% draw each vector as a line segment from its tip to the origin
plot3([1,0],[2,0],[3,0])
hold on   % keep the first segment while adding the others
plot3([-1,0],[2,0],[1,0])
plot3([4,0],[1,0],[-1,0])
In the figure window, if you go to Tools → Rotate 3D, you will see right away
that the three vectors do not share the same plane, therefore they must be
linearly independent.
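
The visual check can be backed up with a determinant, since three vectors in three dimensions are linearly independent exactly when the matrix having them as rows has non-zero determinant. The sketch below is an addition to the notes; the vectors are read off from the plot3 calls above and the names V, d and r are chosen here.

```matlab
% Rows are the three vectors drawn by the plot3 commands
V = [ 1 2  3;
     -1 2  1;
      4 1 -1];

d = det(V)    % non-zero exactly when the three vectors are independent
r = rank(V)   % equivalently, the rank is 3
```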
Solution 3.25 We apply Kirchhoff’s junction rule
                                  I1 + I2 − I3 = 0
and Kirchhoff’s loop rules, moving around first the left loop and then the right
loop in a clockwise direction:
                              −2I1 + 5I2 + 6       =    0
                              −5 − 6I3 − 5I2       =    0
Writing these three equations in an augmented matrix with the unknowns
ordered I1 , I2 and I3 gives
\[
\left[\begin{array}{ccc|c}
1 & 1 & -1 & 0 \\
-2 & 5 & 0 & -6 \\
0 & -5 & -6 & 5
\end{array}\right]
\]
which is solved to give I1 = 41/52, I2 = −23/26 and I3 = −5/52 with units
of Amperes. The voltage drops are I1 R1 = 41/26, I2 R2 = 115/26 and I3 R3 =
15/26 across the three resistors, with drops in the direction of positive currents.
Note that this problem is more easily solved using two loop currents (try it and
make sure you get the same solution).
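
The three currents can be confirmed by solving the augmented system numerically. This is a minimal added sketch; the names M, r and I are chosen here, and the unknowns are ordered I1 , I2 , I3 as in the augmented matrix above.

```matlab
% Coefficient matrix and right-hand side from the augmented matrix
M = [ 1  1 -1;
     -2  5  0;
      0 -5 -6];
r = [0; -6; 5];

I = M \ r   % currents I1, I2, I3 in Amperes

% Compare with the fractions quoted in the solution
err = norm(I - [41/52; -23/26; -5/52])
```

A negative entry of I means that the current flows opposite to the direction assumed when the equations were written down.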
Solution 3.26 We write voltage drops around the two loops in the direction of
the loop currents:
                 4i1 + (i1 − i2 ) + 5   =   0 or 5i1 − i2 = −5
                 (i2 − i1 ) + 6i2 − 3   =   0 or − i1 + 7i2 = 3
which is solved to give i2 = 5/17 and i1 = −16/17. Thus, I2 = i2 − i1 = 21/17.
Since it is positive, this current flows to the right.
Solution 3.27 We use loop currents to solve the network depicted in Figure
3.16.
    Kirchhoff’s second law gives that i2 = −I, and the first law applied to the
closed loop on the left gives
                             R1 i1 + R2 (i1 − i2 ) = V.
As R1 = 3[Ω], R2 = 1[Ω], R3 = 4[Ω], V = 26[V ], and I = 2[A], we get
                                    i2 = −2[A]                                (3.2)
                                    3i1 + i1 + 2 = 26                         (3.3)






                         Figure 3.16: Problem 3.27.




                         Figure 3.17: Problem 3.28.


Hence i1 = 6[A], and from the negative sign of i2 we see that i2 flows in the
direction opposite to the one initially supposed.
a) Voltage drop = R3 i2 = −8[V ]. The - sign indicates that the potential is
   higher at the left of R3 than on the right.
b) The flow through R2 is i1 − i2 = 8[A].
c) Voltage drop = R1 i1 = 18[V ].

Solution 3.28 Again, we use loop currents to solve the problem depicted in
Figure 3.17.


                              R1 i1 + R2 (i1 − i2 ) = V
                              R3 i2 + R2 (i2 − i1 ) = 0,

with R1 = 4[Ω], R2 = 1[Ω], R3 = 2[Ω].
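   The flow through R3 is i2 , which is 1.5[A]. The system becomes
\[
\begin{aligned}
4i_1 + (i_1 - 1.5) &= V \\
3 + (1.5 - i_1) &= 0
\end{aligned}
\]
that is,
\[
\begin{aligned}
5i_1 - V &= \frac{3}{2} \\
i_1 &= \frac{9}{2}
\end{aligned}
\]
Therefore V = 21[V ], and i1 = 4.5[A].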






                         Figure 3.18: Problem 3.29.


a) Voltage drop = R3 i2 = 3[V ].
b) Current flow = i1 − i2 = 3[A].
c) Voltage drop = V = 21[V ]

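Solution 3.29 As before:
\[
\begin{aligned}
R_1(i_1 - i_2) &= V \\
R_2 i_2 + R_3 i_2 + R_1(i_2 - i_1) &= 0,
\end{aligned}
\]
with R1 = 4[Ω], R2 = 2[Ω], R3 = 10[Ω], V = 60[V ].
   Hence, the system becomes
\[
\begin{aligned}
4(i_1 - i_2) &= 60 \\
2i_2 + 10i_2 + 4(i_2 - i_1) &= 0,
\end{aligned}
\]
which simplifies to
\[
\begin{aligned}
i_1 - i_2 &= 15 \\
4i_2 - i_1 &= 0.
\end{aligned}
\]
Solving the system we get i1 = 20[A], i2 = 5[A].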
a) Voltage drop = R2 i2 = 10[V ].
b) Current flow through R1 = i1 − i2 = 15[A].
c) Current flow through R3 = i2 = 5[A].

Solution 3.30 Using Kirchhoff’s laws we get the following equations:
\[
\begin{cases}
R_1 i_1 + R_2(i_1 - i_2) = E_1 \\
R_3 i_2 + R_2(i_2 - i_1) = -E_2 \\
R_4 i_3 = E_2
\end{cases}
\]
Plugging in the values of R1 , R2 , R3 , R4 , E1 , and E2 , we get
\[
\begin{cases}
i_1 + 3(i_1 - i_2) = 10 \\
5i_2 + 3(i_2 - i_1) = -4 \\
2i_3 = 4
\end{cases}
\]
We have that i3 = 2. To find i1 and i2 we solve the augmented system
\[
\left[\begin{array}{cc|c} 4 & -3 & 10 \\ -3 & 8 & -4 \end{array}\right]
\xrightarrow{(1,:) = (1,:) + (2,:)}
\left[\begin{array}{cc|c} 1 & 5 & 6 \\ -3 & 8 & -4 \end{array}\right]
\xrightarrow{(2,:) = (2,:) + 3(1,:)}
\left[\begin{array}{cc|c} 1 & 5 & 6 \\ 0 & 23 & 14 \end{array}\right].
\]
So i1 = 68/23 [A], i2 = 14/23 [A], i3 = 2[A].
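
As a cross-check of the hand elimination, the 2 by 2 system for i1 and i2 can be solved directly. The sketch below is an addition to the notes, with the names M, r and i12 chosen here for illustration.

```matlab
% Loop equations of Solution 3.30 after collecting terms:
%    4*i1 - 3*i2 = 10
%   -3*i1 + 8*i2 = -4
M = [4 -3; -3 8];
r = [10; -4];

i12 = M \ r   % loop currents i1 and i2 in Amperes
i3  = 4/2     % from the third equation, 2*i3 = 4

% Compare with the fractions found by row reduction
err = norm(i12 - [68/23; 14/23])
```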
Solution 3.31 From Kirchhoff’s first law:

                                             I + i3 = i2

From Kirchhoff’s second law:

                                 R1 i1 + R2 (i1 − i2 ) = V
                                        E + R2 (i2 − i1 ) = 0
                                           R4 i3 + R3 i3 = E

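Substituting in the actual values, we get
\[
\begin{cases}
i_1 + (i_1 - i_2) = 25 \\
i_2 - i_3 = 3 \\
E + 2(i_2 - i_1) = 0 \\
2 i_3 = E
\end{cases}
\]
The system, with respect to $i_1$, $i_2$, $i_3$, and $E$, yields the following augmented
matrix:
\[
\left[\begin{array}{cccc|c}
2 & -1 & 0 & 0 & 25 \\
0 & 1 & -1 & 0 & 3 \\
-2 & 2 & 0 & 1 & 0 \\
0 & 0 & 2 & -1 & 0
\end{array}\right]
\to (3,:) = (3,:) + (1,:) \to
\left[\begin{array}{cccc|c}
2 & -1 & 0 & 0 & 25 \\
0 & 1 & -1 & 0 & 3 \\
0 & 1 & 0 & 1 & 0 \\
0 & 0 & 2 & -1 & 0
\end{array}\right]
\]
\[
\to (3,:) = (3,:) - (2,:) \to
\left[\begin{array}{cccc|c}
2 & -1 & 0 & 0 & 25 \\
0 & 1 & -1 & 0 & 3 \\
0 & 0 & 1 & 1 & -3 \\
0 & 0 & 2 & -1 & 0
\end{array}\right]
\to (4,:) = (4,:) - 2(3,:) \to
\left[\begin{array}{cccc|c}
2 & -1 & 0 & 0 & 25 \\
0 & 1 & -1 & 0 & 3 \\
0 & 0 & 1 & 1 & -3 \\
0 & 0 & 0 & -3 & 6
\end{array}\right]
\]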


Chapter 3. Solving Linear Systems                  3.7. Solutions to Chapter Problems




                              Figure 3.19: Problem 3.32.

a) Then $i_1 = \frac{27}{2}$ [A], $i_2 = 2$ [A], $i_3 = -1$ [A], $E = -2$ [V].

b) Voltage drop: $R_2 (i_1 - i_2) = 23$ [V].

c) Current flow through $V$: $i_1 = \frac{27}{2}$ [A].

d) Voltage drop through $I$: $E = -2$ [V].

Solution 3.32 From Kirchhoff’s first law:
\[
I_1 = i_1, \qquad I_2 = -i_2 .
\]
From Kirchhoff’s second law:
\[
\begin{aligned}
R_1 i_1 + R_2 (i_1 - i_2) &= E_1 \\
V + R_3 i_2 + E_2 + R_2 (i_2 - i_1) &= 0,
\end{aligned}
\]
which becomes
\[
\begin{cases}
i_1 = I_1 \\
i_2 = -3 \\
7 i_1 - 5 i_2 = E_1 \\
10 + 8 i_2 - 5 i_1 + 5 = 0
\end{cases}
\]
We seek $i_1$ and $E_1$:
\[
\begin{cases}
7 i_1 - E_1 = -15 \\
-5 i_1 = 9
\end{cases}
\]
Therefore we have that $i_1 = -\frac{9}{5}$ [A], $E_1 = \frac{12}{5}$ [V].

a) Current flow through $R_2$: $i_1 - i_2 = \frac{6}{5}$ [A].






b) Voltage drop through $I_1$: $E_1 = \frac{12}{5}$ [V].

c) Current flow through $I_1$: $i_1 = -\frac{9}{5}$ [A].

Solution 3.33 Note that this solution follows the alternate description of
resistor networks in the notes. Using the identities $V_i = I_i R_i$, the voltage
equations for the loops can be written
\[
\begin{aligned}
I_1 R_1 + I_2 R_2 + E &= 0 \\
I_3 R_3 + I_4 R_4 - I_2 R_2 &= 0 \\
&\;\;\vdots \\
I_{2n-1} R_{2n-1} + I_{2n} R_{2n} - I_{2n-2} R_{2n-2} &= 0
\end{aligned}
\]
The current equations for the nodes are
\[
\begin{aligned}
I_1 - I_3 - I_2 &= 0 \\
I_3 - I_5 - I_4 &= 0 \\
&\;\;\vdots \\
I_{2n-1} - I_{2n} &= 0
\end{aligned}
\]



Solution 3.34 We have $n = 6$, $\sum x_i = 15$, $\sum x_i^2 = 43$, $\sum y_i = 18$, $\sum x_i y_i = 52$,
so $a = 14/11 \approx 1.27\ldots$ and $b = -2/11 \approx -0.18\ldots$.
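
As a check on these values, the sums can be placed in the 2 × 2 least-squares system and solved numerically. This sketch assumes the fitted line has the form y = ax + b, which is consistent with the values quoted above; the variable names are illustrative.

```matlab
% Sums quoted in Solution 3.34
n = 6;  Sx = 15;  Sxx = 43;  Sy = 18;  Sxy = 52;

% Normal equations for the line y = a*x + b (assumed form of the fit):
%   Sxx*a + Sx*b = Sxy
%   Sx*a  + n*b  = Sy
N   = [Sxx Sx;
       Sx  n ];
rhs = [Sxy; Sy];

ab = N \ rhs;                  % ab(1) = a, ab(2) = b
fprintf('a = %.4f, b = %.4f\n', ab(1), ab(2));
```
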
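Solution 3.35 We would want to minimize the quadratic function
\[
\begin{aligned}
f(a, b, c) &= \sum_i (a x_i^2 + b x_i + c - y_i)^2 \\
&= \sum_i \left( a^2 x_i^4 + b^2 x_i^2 + c^2 + y_i^2 + 2ab x_i^3 + 2ac x_i^2 - 2a x_i^2 y_i + 2bc x_i - 2b x_i y_i - 2c y_i \right) \\
&= \Big(\sum x_i^4\Big) a^2 + \Big(\sum x_i^2\Big) b^2 + n c^2 + 2\Big(\sum x_i^3\Big) ab + 2\Big(\sum x_i^2\Big) ac + 2\Big(\sum x_i\Big) bc \\
&\quad - 2\Big(\sum x_i^2 y_i\Big) a - 2\Big(\sum x_i y_i\Big) b - 2\Big(\sum y_i\Big) c + \sum y_i^2
\end{aligned}
\]
The corresponding system of equations is
\[
\left[\begin{array}{ccc|c}
\sum x_i^4 & \sum x_i^3 & \sum x_i^2 & \sum x_i^2 y_i \\
\sum x_i^3 & \sum x_i^2 & \sum x_i & \sum x_i y_i \\
\sum x_i^2 & \sum x_i & n & \sum y_i
\end{array}\right]
\]
(I’ve divided each equation by two.)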
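
The system above can be assembled and solved numerically for a given data set. The following is a minimal sketch; the data are hypothetical (chosen to lie exactly on y = x^2 + 1, so the fit should recover a = 1, b = 0, c = 1 up to rounding) and the variable names are not from the notes.

```matlab
% Hypothetical data lying exactly on y = x^2 + 1
x = [0 1 2 3 4]';
y = x.^2 + 1;
n = numel(x);

% Coefficient matrix and right-hand side of the system for a, b, c
N   = [sum(x.^4) sum(x.^3) sum(x.^2);
       sum(x.^3) sum(x.^2) sum(x);
       sum(x.^2) sum(x)    n        ];
rhs = [sum(x.^2 .* y); sum(x .* y); sum(y)];

abc = N \ rhs;                 % abc = [a; b; c]
disp(abc')
```
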







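Solution 3.36 The matrix is
\[
\left[\begin{array}{cccccccc|c}
k_1 + k_2 & -k_2 & 0 & 0 & \cdots & 0 & 0 & 0 & m_1 g + k_1 l_1 - k_2 l_2 \\
-k_2 & k_2 + k_3 & -k_3 & 0 & \cdots & 0 & 0 & 0 & m_2 g + k_2 l_2 - k_3 l_3 \\
0 & -k_3 & k_3 + k_4 & -k_4 & \cdots & 0 & 0 & 0 & m_3 g + k_3 l_3 - k_4 l_4 \\
0 & 0 & -k_4 & k_4 + k_5 & \cdots & 0 & 0 & 0 & m_4 g + k_4 l_4 - k_5 l_5 \\
0 & 0 & 0 & -k_5 & \cdots & 0 & 0 & 0 & m_5 g + k_5 l_5 - k_6 l_6 \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & -k_{n-1} & k_{n-1} + k_n & -k_n & m_{n-1} g + k_{n-1} l_{n-1} - k_n l_n \\
0 & 0 & 0 & 0 & \cdots & 0 & -k_n & k_n & m_n g + k_n l_n
\end{array}\right]
\]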

Solution 3.37 The system of equations would be
\[
\left[\begin{array}{ccccc|c}
2 & -1 & 0 & 0 & 0 & g \\
-1 & 2 & -1 & 0 & 0 & 2g \\
0 & -1 & 2 & -1 & 0 & 3g \\
0 & 0 & -1 & 2 & -1 & 4g \\
0 & 0 & 0 & -1 & 1 & 5g + 1
\end{array}\right]
\]




Chapter 4

Matrices and Determinants

4.1      Matrix operations
A matrix is a rectangular array of numbers. Here is an example of an m × n
matrix.
\[
A = \begin{bmatrix}
a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\
a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\
\vdots & \vdots & & \vdots \\
a_{m,1} & a_{m,2} & \cdots & a_{m,n}
\end{bmatrix}
\]
This is sometimes abbreviated $A = [a_{i,j}]$. An m × 1 matrix is called a column
vector and a 1 × n matrix is called a row vector. (The convention is that m × n
means m rows and n columns.)
   Addition and scalar multiplication are defined for matrices exactly as for
vectors. If s is a number
\[
s \begin{bmatrix}
a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\
a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\
\vdots & \vdots & & \vdots \\
a_{m,1} & a_{m,2} & \cdots & a_{m,n}
\end{bmatrix}
=
\begin{bmatrix}
s a_{1,1} & s a_{1,2} & \cdots & s a_{1,n} \\
s a_{2,1} & s a_{2,2} & \cdots & s a_{2,n} \\
\vdots & \vdots & & \vdots \\
s a_{m,1} & s a_{m,2} & \cdots & s a_{m,n}
\end{bmatrix},
\]
and
\[
\begin{bmatrix}
a_{1,1} & \cdots & a_{1,n} \\
a_{2,1} & \cdots & a_{2,n} \\
\vdots & & \vdots \\
a_{m,1} & \cdots & a_{m,n}
\end{bmatrix}
+
\begin{bmatrix}
b_{1,1} & \cdots & b_{1,n} \\
b_{2,1} & \cdots & b_{2,n} \\
\vdots & & \vdots \\
b_{m,1} & \cdots & b_{m,n}
\end{bmatrix}
=
\begin{bmatrix}
a_{1,1} + b_{1,1} & \cdots & a_{1,n} + b_{1,n} \\
a_{2,1} + b_{2,1} & \cdots & a_{2,n} + b_{2,n} \\
\vdots & & \vdots \\
a_{m,1} + b_{m,1} & \cdots & a_{m,n} + b_{m,n}
\end{bmatrix}
\]
   The product of an m × n matrix $A = [a_{i,j}]$ with an n × p matrix $B = [b_{i,j}]$ is
an m × p matrix $C = [c_{i,j}]$ whose entries are defined by
\[
c_{i,j} = \sum_{k=1}^{n} a_{i,k} b_{k,j}.
\]


Chapter 4. Matrices and Determinants                              4.1. Matrix operations




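\[
\begin{bmatrix} A_1 \\ A_2 \\ A_3 \\ A_4 \end{bmatrix}
\begin{bmatrix} B_1 & B_2 & B_3 \end{bmatrix}
=
\begin{bmatrix}
A_1 B_1 & A_1 B_2 & A_1 B_3 \\
A_2 B_1 & A_2 B_2 & A_2 B_3 \\
A_3 B_1 & A_3 B_2 & A_3 B_3 \\
A_4 B_1 & A_4 B_2 & A_4 B_3
\end{bmatrix}
\]

Figure 4.1: Schematic of matrix multiplication as the inner product of rows and
columns of the product matrices.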


An easy way to remember this is to chop the matrix A into m row vectors of
length n and to chop B into p column vectors also of length n. The (i, j)th
entry of the product is then the dot product $A_i \cdot B_j$. This
is shown schematically in Figure 4.1.
    It is important to notice that the matrix product AB only makes sense if
the number of columns of A equals the number of rows of B. So $A^2 = AA$ only
makes sense for a square matrix.
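    Here is an example
\[
\begin{bmatrix} 1 & 0 & 1 & 2 \\ 1 & 1 & 1 & 4 \\ 0 & 0 & 1 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \\ 7 & 8 \end{bmatrix}
=
\begin{bmatrix}
1 \times 1 + 0 \times 3 + 1 \times 5 + 2 \times 7 & 1 \times 2 + 0 \times 4 + 1 \times 6 + 2 \times 8 \\
1 \times 1 + 1 \times 3 + 1 \times 5 + 4 \times 7 & 1 \times 2 + 1 \times 4 + 1 \times 6 + 4 \times 8 \\
0 \times 1 + 0 \times 3 + 1 \times 5 + 1 \times 7 & 0 \times 2 + 0 \times 4 + 1 \times 6 + 1 \times 8
\end{bmatrix}
=
\begin{bmatrix} 20 & 24 \\ 37 & 44 \\ 12 & 14 \end{bmatrix}
\]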
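
The entry-by-entry definition can be checked against MATLAB's built-in product on this same example. The loop below is a sketch written for clarity rather than speed; the loop variable names are illustrative.

```matlab
A = [1 0 1 2;
     1 1 1 4;
     0 0 1 1];
B = [1 2;
     3 4;
     5 6;
     7 8];

[m, n] = size(A);
p = size(B, 2);
C = zeros(m, p);
for i = 1:m
    for j = 1:p
        % c(i,j) = sum over k of a(i,k)*b(k,j): row i of A dotted with column j of B
        C(i,j) = sum(A(i,:) .* B(:,j).');
    end
end

disp(C)
disp(isequal(C, A*B))          % compare with the built-in product
```
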

     Notice that if A is an m × n matrix, and
\[
x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_n \end{bmatrix}
\quad\text{and}\quad
b = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ \vdots \\ b_m \end{bmatrix},
\]
then the equation
\[
Ax = b
\]
is a short way of writing the system of linear equations corresponding to the
augmented matrix [A|b].
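
A small numerical illustration of this correspondence follows. The 2 × 2 system is hypothetical (it does not come from the notes); the backslash operator is MATLAB's standard way of solving Ax = b.

```matlab
% Hypothetical system:  x1 + 2*x2 = 3,  3*x1 + 4*x2 = 7
A = [1 2;
     3 4];
b = [3; 7];

x = [1; 1];                    % candidate solution
disp(A*x)                      % left-hand sides of the two equations

xs = A \ b;                    % let MATLAB solve A*x = b
disp(xs')
```

Here A*x reproduces the right-hand side b, so x = [1; 1] solves the system, and the backslash solve should return the same vector up to rounding.
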





   We will see shortly why matrix multiplication is defined the way it is. For
now, you should be aware of some important properties that don’t hold for
matrix multiplication, even though they are true for multiplication of numbers.
First of all, in general, AB is not equal to BA, even when both products are
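defined and have the same size. For example, if
\[
A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}
\quad\text{and}\quad
B = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}
\]
then
\[
AB = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\]
but
\[
BA = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}.
\]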
This example also shows that two non-zero matrices can be multiplied together
to give the zero matrix.
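
The same two matrices can be multiplied in MATLAB in both orders as a quick check:

```matlab
A = [0 1; 0 0];
B = [1 0; 0 0];

AB = A*B;
BA = B*A;
disp(AB)                       % the zero matrix
disp(BA)                       % equal to A, not zero
disp(isequal(AB, BA))          % 0 (false): the two products differ
```
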
    Here is a list of properties that do hold for matrix addition, scalar
multiplication and matrix multiplication, whenever the sizes are compatible.

  1. A + B = B + A

  2. A + (B + C) = (A + B) + C

  3. s(A + B) = sA + sB

  4. (s + t)A = sA + tA

  5. (st)A = s(tA)

  6. 1A = A

  7. A + 0 = A (here 0 is the matrix with all entries zero)

  8. A − A = A + (−1)A = 0

  9. A(B + C) = AB + AC

 10. (A + B)C = AC + BC

 11. A(BC) = (AB)C

 12. s(AB) = (sA)B = A(sB)
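
The multiplication properties 9 to 12 can be spot-checked numerically. The matrices and the scalar below are arbitrary illustrative choices (not from the notes), and agreement on one example is of course not a proof.

```matlab
% Small integer matrices chosen for illustration
A = [1 2; 3 4];  B = [0 1; 1 0];  C = [2 -1; 5 3];  s = 7;

p9   = isequal(A*(B + C), A*B + A*C);      % property 9
p10  = isequal((A + B)*C, A*C + B*C);      % property 10
p11  = isequal(A*(B*C), (A*B)*C);          % property 11
p12a = isequal(s*(A*B), (s*A)*B);          % property 12, first equality
p12b = isequal(s*(A*B), A*(s*B));          % property 12, second equality

disp([p9 p10 p11 p12a p12b])
```
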







4.1.1     MATLAB
Multiplication of matrices can be done using the * operator just as for
multiplication of scalars. An error results if the matrices are not of compatible size.
Powers of matrices can be found using the ^ operator, as for scalars. The
MATLAB command
A^4
produces the same result as
A*A*A*A
where A is a previously defined, square matrix. Using these commands might
be helpful in working out the details of Problem 4.4 below, although you should
work out the first few matrix powers by hand for practice. Taking high powers
of a matrix will also be helpful in understanding the long-time behaviour of
random walks described in Section 4.3.
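
A short session illustrating these commands is sketched below. The matrix A is an arbitrary example (not one of the problem matrices), and the try/catch block is only there to display the error message produced by an incompatible product.

```matlab
A = [0 1; 1 1];                % illustrative square matrix

P1 = A^4;
P2 = A*A*A*A;
disp(P1)
disp(isequal(P1, P2))          % the two commands agree

% Multiplying a 1 x 3 matrix by a 1 x 3 matrix is not defined
msg = '';
try
    [1 2 3] * [4 5 6];
catch err
    msg = err.message;         % text of the size-mismatch error
end
disp(msg)
```
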

4.1.2     Problems
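Problem 4.1 Define
\[
A = \begin{bmatrix} 1 & 2 & 3 \\ 1 & 2 & 1 \end{bmatrix}, \quad
B = \begin{bmatrix} -1 & 2 \\ -3 & 1 \\ -2 & 1 \end{bmatrix}, \quad
C = \begin{bmatrix} 2 & -2 & 0 \end{bmatrix}, \quad
D = \begin{bmatrix} 2 \\ -11 \\ 2 \end{bmatrix}
\]
Compute all products of two of these (i.e., AB, AC, etc.) that are defined.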

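Problem 4.2 Consider the following matrices:
\[
A = \begin{bmatrix} 3 & 0 \\ -1 & 2 \\ 1 & 1 \end{bmatrix}; \quad
B = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}; \quad
C = \begin{bmatrix} 1 & 4 & 2 \\ 3 & 1 & 5 \end{bmatrix}.
\]
Compute all the possible products between them.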

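Problem 4.3 Compute $A^2 = AA$ and $A^3 = AAA$ for
\[
A = \begin{bmatrix} 0 & a & b \\ 0 & 0 & c \\ 0 & 0 & 0 \end{bmatrix}
\]
and
\[
A = \begin{bmatrix} 1 & 0 & a \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]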

Problem 4.4 Let
\[
A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}
\]


Chapter 4. Matrices and Determinants       4.2. Linear Transformations and Matrices



(a) Find $A^2$, $A^3$ and $A^4$.
(b) Find $A^k$ for all positive integers k.
(c) Find $e^{tA}$ (part of the problem is to invent a reasonable definition!)
(d) Find a square root of A (i.e., a matrix B with $B^2 = A$).
(e) Find all square roots of A.

Problem 4.5 Compute $A^k$ for k = 2, 3, 4 when
\[
A = \begin{bmatrix}
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0
\end{bmatrix}
\]

4.2     Linear Transformations and Matrices
4.2.1    Linear Transformations
Recall that a function f is a rule that takes an input value x and produces
an output value y = f (x). Functions are sometimes called transformations or
maps (since they transform, or map, the input value to the output value). In
calculus, you have mostly dealt with functions whose input values and output
values are real numbers. However, it is also useful to consider functions whose
input values and output values are vectors.
    We have already encountered this idea when we discussed quadratic
functions. A quadratic function such as $f(x_1, x_2, x_3) = x_1^2 + x_2^2 + x_3^2$ can be
considered as a transformation (or map) whose input is the vector $x = [x_1, x_2, x_3]$
and whose output is the number $y = x_1^2 + x_2^2 + x_3^2$. In this case we could write
$f(x_1, x_2, x_3)$ as $f(x)$.
    An example of a transformation whose inputs and outputs are both vectors
in two dimensions is rotation by some angle, say 45°. If x is the input vector,
then the output vector R(x) is the vector you get by rotating x by 45° in the
counter-clockwise direction as shown in Figure 4.2 (left).
    A transformation T is called linear if for any two input vectors x and y and
any two numbers s and t,
                            T (sx + ty) = sT (x) + tT (y)                     (4.1)
This condition is saying that when we scalar multiply and add two vectors, it
doesn’t matter whether we (i) do scalar multiplication and addition first and
then apply a linear transformation, or (ii) do a linear transformation first and
then do scalar multiplication and addition. In both cases we get the same an-
swer. The linearity condition (4.1) is equivalent to the following two conditions:
 (i) For any two vectors x and y,
                                T (x + y) = T (x) + T (y).






[Figure: two panels, each titled "Rotation by 45 degrees". Left: a vector x and
its image y = R(x). Right: vectors x, y and x + y together with R(x), R(y) and
R(x + y) = R(x) + R(y).]

Figure 4.2: Rotation of a vector in 2D (left), graphical evidence that property
(i) of linear transformations holds for rotation in 2D (right).


 (ii) For any vector x and any scalar s,
                                         T (sx) = sT (x)

    Notice that the quadratic function f above is not a linear transformation,
since
\[
f(2x) = (2x_1)^2 + (2x_2)^2 + (2x_3)^2 = 4(x_1^2 + x_2^2 + x_3^2) = 4 f(x).
\]
So f(2x) is not equal to 2f(x) as would need to be true if f were linear.
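
This failure of condition (ii) is easy to see numerically. In the sketch below the sample vector x is an arbitrary choice and f is written as an anonymous function.

```matlab
f = @(x) sum(x.^2);            % f(x) = x1^2 + x2^2 + x3^2
x = [1; 2; 3];                 % sample input, chosen for illustration

vals = [f(2*x), 4*f(x), 2*f(x)];
disp(vals)                     % f(2x) matches 4 f(x), not 2 f(x)
```
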
    However, rotation by 45° is a linear transformation. The picture in Figure 4.2
(right) demonstrates that condition (i) holds.
    The most important example of a linear transformation is multiplication
by a matrix. If we regard vectors as column vectors, then multiplying an
n-dimensional vector x by an m × n matrix A results in an m-dimensional
vector y = Ax. The linearity property (4.1) is a consequence of properties 9
and 12 of matrix multiplication listed in Section 4.1: by property 9,
A(sx + ty) = A(sx) + A(ty), and by property 12 this equals sAx + tAy. We will
see that in fact every linear transformation is of this form.
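
The linearity property (4.1) for multiplication by a matrix can be spot-checked as follows. The matrix, vectors and scalars are arbitrary illustrative values with integer entries, so the comparison is exact.

```matlab
A = [1 2 3; 4 5 6];            % 2 x 3 example: maps 3D vectors to 2D vectors
x = [1; 0; -1];  y = [2; 1; 0];
s = 3;  t = -2;

lhs = A*(s*x + t*y);           % combine first, then transform
rhs = s*(A*x) + t*(A*y);       % transform first, then combine
disp(isequal(lhs, rhs))
```
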

4.2.2    Rotations in two dimensions
Let us obtain a formula for the transformation that rotates a vector in two
dimensions counterclockwise by θ degrees. Let x be an arbitrary vector. Denote
by $\mathrm{Rot}_\theta x$ the vector obtained by rotating x counterclockwise by θ degrees. If the
angle between x and the x axis is φ, then the components of x can be written
$x = [x_1, x_2]$ with $x_1 = \|x\| \cos(\phi)$ and $x_2 = \|x\| \sin(\phi)$. This is shown graphically
in Figure 4.3 (right). Rotation does not change the length of x, so to obtain the
vector that has been rotated by θ degrees, we simply need to add θ to φ in this
representation. Thus $y = \mathrm{Rot}_\theta x = [y_1, y_2]$,
where $y_1 = \|x\| \cos(\phi + \theta)$ and $y_2 = \|x\| \sin(\phi + \theta)$.
    To simplify this we can use the addition formulae for sin and cos. Recall
that
\[
\begin{aligned}
\cos(a + b) &= \cos(a)\cos(b) - \sin(a)\sin(b) \\
\sin(a + b) &= \cos(a)\sin(b) + \sin(a)\cos(b)
\end{aligned}
\]






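[Figure: left panel shows vectors a and b and the projection proj_b a. Right
panel shows x at angle φ to the x axis and y = Rot_θ x at angle φ + θ, with
x_1 = ||x|| cos(φ) and y_1 = ||x|| cos(φ + θ) marked on the x axis.]

Figure 4.3: Graphical representation of a projection (left). Details of
components in a rotation by an angle θ (right).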


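Thus
\[
\begin{aligned}
y_1 &= \|x\| \cos(\phi + \theta) \\
    &= \|x\| (\cos(\phi)\cos(\theta) - \sin(\phi)\sin(\theta)) \\
    &= \cos(\theta) x_1 - \sin(\theta) x_2
\end{aligned}
\]
and
\[
\begin{aligned}
y_2 &= \|x\| \sin(\phi + \theta) \\
    &= \|x\| (\sin(\phi)\cos(\theta) + \cos(\phi)\sin(\theta)) \\
    &= \sin(\theta) x_1 + \cos(\theta) x_2
\end{aligned}
\]
Notice now that this can be written as a matrix product:
\[
\begin{bmatrix} y_1 \\ y_2 \end{bmatrix}
=
\begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
\]
The matrix
\[
\begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix},
\]
also denoted $\mathrm{Rot}_\theta$, is called a rotation matrix. What this formula is saying is
that the linear transformation of rotation by θ degrees is the same as the linear
transformation of multiplication by the matrix. In other words, if we want to
know the co-ordinates of the vector obtained by rotating x by θ degrees, we
simply calculate $\mathrm{Rot}_\theta x$.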
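
A minimal MATLAB sketch of the rotation matrix follows. The function handle name Rot and the test vectors are illustrative assumptions; cosd and sind are the MATLAB trigonometric functions that take their argument in degrees.

```matlab
% Rotation matrix for an angle theta given in degrees
Rot = @(theta) [cosd(theta), -sind(theta); sind(theta), cosd(theta)];

x = [1; 0];
y = Rot(90) * x;               % quarter turn counterclockwise
disp(y')

% A rotation should not change the length of a vector
z = Rot(45) * [3; 4];
disp(norm(z))                  % same length as [3; 4], up to rounding
```

Rotating the first co-ordinate direction by 90 degrees gives the second co-ordinate direction, as expected.
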

4.2.3    Projections in two dimensions
Now we consider the transformation which projects a vector x in the direction
of another vector a as shown in Figure 4.3 (left). We already have a formula
for this transformation. In the special case that a has unit length, the formula
is
\[
\mathrm{Proj}_a x = (x \cdot a) a.
\]





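It follows from the properties of the dot product that
\[
\begin{aligned}
\mathrm{Proj}_a (sx + ty) &= ((sx + ty) \cdot a) a \\
&= (s\, x \cdot a + t\, y \cdot a) a \\
&= s((x \cdot a) a) + t((y \cdot a) a) \\
&= s\, \mathrm{Proj}_a x + t\, \mathrm{Proj}_a y
\end{aligned}
\]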
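Thus $\mathrm{Proj}_a$ is a linear transformation. Let us now see that $\mathrm{Proj}_a$ is also given
by multiplication by a matrix. If $a = [a_1, a_2]$, then
\[
\begin{aligned}
\mathrm{Proj}_a x &= \begin{bmatrix} (x_1 a_1 + x_2 a_2) a_1 \\ (x_1 a_1 + x_2 a_2) a_2 \end{bmatrix} \\
&= \begin{bmatrix} a_1^2 x_1 + a_1 a_2 x_2 \\ a_2 a_1 x_1 + a_2^2 x_2 \end{bmatrix} \\
&= \begin{bmatrix} a_1^2 & a_1 a_2 \\ a_2 a_1 & a_2^2 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
\end{aligned}
\]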
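If a is the unit vector making an angle of θ with the x axis, then $a_1 = \cos(\theta)$
and $a_2 = \sin(\theta)$. Using half angle formulae, we have
\[
\begin{aligned}
a_1^2 &= \cos^2(\theta) = \frac{1 + \cos(2\theta)}{2} \\
a_2^2 &= \sin^2(\theta) = \frac{1 - \cos(2\theta)}{2} \\
a_1 a_2 &= \cos(\theta)\sin(\theta) = \frac{\sin(2\theta)}{2}
\end{aligned}
\]
Thus the matrix which when multiplied by x produces the projection of x onto
the line making an angle of θ with the x axis is given by
\[
\mathrm{Proj}_\theta = \frac{1}{2}
\begin{bmatrix} 1 + \cos(2\theta) & \sin(2\theta) \\ \sin(2\theta) & 1 - \cos(2\theta) \end{bmatrix}
\]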
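
The two forms of the projection matrix derived above can be compared numerically. In this sketch the angle of 30 degrees is an arbitrary choice and the variable names are illustrative; a*a' is the 2 × 2 matrix with entries a_i a_j.

```matlab
theta = 30;                            % angle of the line, in degrees
a = [cosd(theta); sind(theta)];        % unit vector along the line

P1 = a * a';                           % [a1^2 a1*a2; a2*a1 a2^2]
P2 = 0.5 * [1 + cosd(2*theta), sind(2*theta);
            sind(2*theta),     1 - cosd(2*theta)];

disp(norm(P1 - P2))                    % should be zero up to rounding
disp(norm(P1*P1 - P1))                 % projecting twice changes nothing
```
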

4.2.4    Reflections in two dimensions
A third example of a geometric linear transformation is reflection across a line.
The following figure illustrates reflection across a line making an angle θ with
the x axis. Let Ref θ x denote the reflected vector as shown in Figure 4.4.
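   We can obtain the matrix for reflection from the following observation. The
vector with tail at x and head at $\mathrm{Proj}_\theta x$ is $\mathrm{Proj}_\theta x - x$. If we add twice this
vector to x, we arrive at $\mathrm{Ref}_\theta x$. Therefore
\[
\begin{aligned}
\mathrm{Ref}_\theta x &= x + 2(\mathrm{Proj}_\theta x - x) \\
&= 2\, \mathrm{Proj}_\theta x - x
\end{aligned}
\]
Now if $I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$, then Ix = x for any vector x, since
\[
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
=
\begin{bmatrix} 1 x_1 + 0 x_2 \\ 0 x_1 + 1 x_2 \end{bmatrix}
=
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
\]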





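[Figure: a line at angle θ to the x axis, a vector x, its projection Proj_θ(x)
onto the line, and the reflected vector Ref_θ(x) on the other side of the line.]

              Figure 4.4: Graphical representation of a reflection.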


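I is called the identity matrix.
    Now we can write
\[
\mathrm{Ref}_\theta x = 2\, \mathrm{Proj}_\theta x - Ix = (2\, \mathrm{Proj}_\theta - I) x.
\]
This means that the matrix for reflections is $2\, \mathrm{Proj}_\theta - I$. Explicitly
\[
\begin{aligned}
\mathrm{Ref}_\theta &= \begin{bmatrix} 1 + \cos(2\theta) & \sin(2\theta) \\ \sin(2\theta) & 1 - \cos(2\theta) \end{bmatrix}
- \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \\
&= \begin{bmatrix} \cos(2\theta) & \sin(2\theta) \\ \sin(2\theta) & -\cos(2\theta) \end{bmatrix}
\end{aligned}
\]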
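
As a sanity check on the reflection matrix, consider the line at 45 degrees (the line y = x), across which reflection should swap the two co-ordinates. The function handle name Ref and the test vector are illustrative assumptions.

```matlab
% Reflection across the line making an angle theta (in degrees) with the x axis
Ref = @(theta) [cosd(2*theta), sind(2*theta); sind(2*theta), -cosd(2*theta)];

R = Ref(45);                   % reflection across the line y = x
v = R * [2; 5];
disp(v')                       % the two co-ordinates are swapped

RR = R * R;
disp(RR)                       % reflecting twice gives the identity
```
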

4.2.5    Every linear transformation is multiplication by a
         matrix
We have just seen three examples of linear transformations whose action on a
vector is given by multiplication by a matrix. Now we will see that for any linear
transformation T (x) there is a matrix T such that T (x) is the matrix product
T x.
    To illustrate this suppose that T is a linear transformation that takes three
dimensional vectors as input.
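    Let $e_1$, $e_2$ and $e_3$ be the standard basis vectors in three dimensions, that is
\[
e_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad
e_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \quad
e_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
\]
Then any vector can be written
\[
x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
= x_1 \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}
+ x_2 \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}
+ x_3 \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
= x_1 e_1 + x_2 e_2 + x_3 e_3
\]
Now, using the linearity property of the linear transformation T, we obtain
\[
T(x) = T(x_1 e_1 + x_2 e_2 + x_3 e_3) = x_1 T(e_1) + x_2 T(e_2) + x_3 T(e_3)
\]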






[Figure: the co-ordinate directions e_1 and e_2 and their images under rotation
by θ, labelled T(e_1) = [cos(θ), sin(θ)] and T(e_2) = [-sin(θ), cos(θ)].]

Figure 4.5: Derivation of the matrix representing 2D rotation by rotating
coordinate directions.


Now take the three vectors T (e1 ), T (e2 ) and T (e3 ) and put them in the columns
of a matrix which we’ll also call T . Then
                                  
                                   x1
   T x = T (e1 ) T (e2 ) T (e3 )  x2  = T (e1 )x1 + T (e2 )x2 + T (e3 )x3 = T (x)
                                   x3

In other words, the action of the transformation T on a vector x is the same as

multiplying x by the matrix T = T (e1 ) T (e2 ) T (e3 )

    The same idea works in any dimension. To find the matrix of a linear
transformation T (x) we take the standard basis vectors e1 , e2 , . . . , en (where
ek has zeros everywhere except for a 1 in the kth spot) and calculate the action
of the linear transformation on each one. We then take the transformed
vectors T (e1 ), T (e2 ), . . . , T (en ) and put them into the columns of a matrix
$T = \begin{bmatrix} T(e_1) & T(e_2) & \cdots & T(e_n) \end{bmatrix}$.
This matrix T then reproduces the action of the linear transformation, that is, T (x) = T x.
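    To see how this works in practice, let’s recalculate the matrix for rotations
in two dimensions. Under a rotation by an angle of θ, the vector
$e_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ gets transformed to
$T(e_1) = \begin{bmatrix} \cos(\theta) \\ \sin(\theta) \end{bmatrix}$, while the vector
$e_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$ gets transformed to
$T(e_2) = \begin{bmatrix} -\sin(\theta) \\ \cos(\theta) \end{bmatrix}$.
This is shown graphically in Figure 4.5.
    According to our prescription, we must now put these two vectors into the
columns of a matrix. This gives the matrix
\[
T = \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix},
\]
which is exactly the same as $\mathrm{Rot}_\theta$.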
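
The column-by-column prescription can also be tried numerically. The following MATLAB lines are only a sketch: the angle, the test vector and the variable names (theta, Te1, Te2, T) are choices made here for illustration and are not part of the derivation above.

```matlab
% Build the matrix of a 2D rotation from the images of e1 and e2.
theta = pi/6;                      % example angle (arbitrary choice)

Te1 = [cos(theta); sin(theta)];    % image of e1 = [1; 0]
Te2 = [-sin(theta); cos(theta)];   % image of e2 = [0; 1]

T = [Te1, Te2];                    % transformed basis vectors become the columns

x = [2; -1];                       % any test vector
lhs = T*x;                         % matrix product T x
rhs = x(1)*Te1 + x(2)*Te2;         % x1 T(e1) + x2 T(e2), from linearity

disp(norm(lhs - rhs))              % should be zero up to rounding
```

Any other angle or test vector should give the same agreement, since only linearity is used.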





4.2.6    Composition of linear transformations and matrix product
Suppose we apply one linear transformation T and then follow it by another
linear transformation S. For example, think of first rotating a vector and then
reflecting it across a line. Then S(T (x)) is again a linear transformation, since

         S(T (sx + ty)) = S(sT (x) + tT (y)) = sS(T (x)) + tS(T (y)).

What is the matrix for the composition S(T (x))? We know that there are
matrices S and T that reproduce the action of S(x) and T (x). So T (x) is
the matrix product T x and S(T (x)) is the matrix product S(T x) (here the
parentheses just indicate the order in which we are doing the matrix products).
But matrix multiplication is associative, so S(T x) = (ST )x. In other words, the
matrix for the composition of S(x) and T (x) is simply the matrix product of
the corresponding matrices S and T . Note the order: the transformation that is
applied first appears on the right.
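   For example, to compute the matrix for the transformation of rotation by
45◦ followed by reflection about the line making an angle of 30◦ with the x axis
we simply compute the product
\[
\begin{aligned}
\mathrm{Ref}_{30^\circ} \mathrm{Rot}_{45^\circ}
 &= \begin{bmatrix} \cos(60^\circ) & \sin(60^\circ) \\ \sin(60^\circ) & -\cos(60^\circ) \end{bmatrix}
    \begin{bmatrix} \cos(45^\circ) & -\sin(45^\circ) \\ \sin(45^\circ) & \cos(45^\circ) \end{bmatrix} \\
 &= \begin{bmatrix} \frac{1}{2} & \frac{\sqrt{3}}{2} \\ \frac{\sqrt{3}}{2} & -\frac{1}{2} \end{bmatrix}
    \begin{bmatrix} \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \end{bmatrix} \\
 &= \begin{bmatrix} \frac{\sqrt{2}+\sqrt{6}}{4} & \frac{-\sqrt{2}+\sqrt{6}}{4} \\
                    \frac{\sqrt{6}-\sqrt{2}}{4} & \frac{-\sqrt{6}-\sqrt{2}}{4} \end{bmatrix}
\end{aligned}
\]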
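
The product above can be checked numerically. The MATLAB sketch below assumes the helper names Rot and Ref (introduced here only for illustration) for the rotation and reflection matrices of this section, and compares the computed product with the closed form.

```matlab
% Rotation by 45 degrees followed by reflection about the line at 30 degrees.
Rot = @(t) [cos(t), -sin(t); sin(t), cos(t)];        % rotation by angle t
Ref = @(p) [cos(2*p), sin(2*p); sin(2*p), -cos(2*p)]; % reflection, line at angle p

S = Ref(pi/6);     % reflection about the line at 30 degrees
T = Rot(pi/4);     % rotation by 45 degrees
M = S*T;           % rotate first, then reflect: the first map is on the right

closedForm = [ sqrt(2)+sqrt(6), -sqrt(2)+sqrt(6);
               sqrt(6)-sqrt(2), -sqrt(6)-sqrt(2)] / 4;

x = [1; 2];                     % any test vector
disp(norm(M - closedForm))      % should be zero up to rounding
disp(norm(S*(T*x) - M*x))       % S(Tx) and (ST)x agree
```

Swapping the order to T*S generally gives a different matrix, which is one way to see that the order of composition matters.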


4.2.7    Problems
Problem 4.6 Let a be a fixed nonzero vector. Show that the transformation
T (x) = x + a is not a linear transformation.

Problem 4.7 Let a be a fixed vector. Show that the transformation T (x) = a·x
is a linear transformation (whose output values are numbers).

Problem 4.8 Find the matrices which project on the lines

(a) x1 = x2

(b) 3x1 + 4x2 = 0

Problem 4.9 Find the matrices that reflect about the lines

(a) x1 = x2

(b) 3x1 + 4x2 = 0






Problem 4.10 Find the matrices which rotate about the origin in two dimen-
sions by
(a) π/4
(b) π/2
(c) π

Problem 4.11 Find the matrix which first reflects about the line making an
angle of φ with the x axis, and then reflects about the line making an angle of θ
with the x axis. Give another geometric interpretation of this matrix.

Problem 4.12 Let f : R2 → R2 be the linear transformation that reflects points
across the line x = 2y.
  1. What is the image of (1, 10), that is, the vector
\[
f\left( \begin{bmatrix} 1 \\ 10 \end{bmatrix} \right)?
\]

  2. Write down the matrix of f .


Problem 4.13 Let g : R2 → R2 be the linear transformation that first reflects
points across the line x = −y and then rotates points by π/2 radians. Write down
the matrix of g.

Problem 4.14 Find the matrix that rotates about the z axis by an angle of θ
in three dimensions.

Problem 4.15 Let T : R4 → R3 be the map defined by

   T (x1 , x2 , x3 , x4 ) = (x1 + 4x2 + 5x3 , 3x1 − 2x2 + x3 − x4 , −x1 − x3 + x4 )

Show that T is a linear transformation.

Problem 4.16 Let T : R2 → R3 be given by T (x) = Ax, where
\[
A = \begin{bmatrix} 1 & 2 \\ 0 & 1 \\ 1 & 1 \end{bmatrix}.
\]
Determine whether each given vector is in the range of T . Recall that the range
of T is every vector that can “come out” of T , that is, every vector in R3 that
can be written as T (x) for some x in R2 . Note that answering these questions
is the same as determining if a linear system has a solution.

   1. $\begin{bmatrix} 1 \\ 4 \\ 2 \end{bmatrix}$.

   2. $\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$.

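Problem 4.17 Suppose T : R3 → R2 is a linear transformation such that
\[
T\left( \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \right) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad
T\left( \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} \right) = \begin{bmatrix} 2 \\ 0 \end{bmatrix}, \quad
T\left( \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right) = \begin{bmatrix} -1 \\ 5 \end{bmatrix}.
\]
   1. Find $T\left( \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \right)$.
   2. Write down the matrix of T .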

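Problem 4.18 Let g : R3 → R2 and h : R2 → R4 be linear transformations
given by
\[
g\left( \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \right)
  = \begin{bmatrix} x_1 - x_3 + x_2 \\ x_3 - 2x_2 \end{bmatrix}, \quad
h\left( \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \right)
  = \begin{bmatrix} x_1 \\ -x_1 \\ x_2 \\ -x_2 \end{bmatrix}.
\]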

The questions below concern h ◦ g, the composition of h and g defined by

                                 h ◦ g(x) = h(g(x))

for x ∈ R3 . Note that this composition is defined because the output of g and
the input of h have the same dimension, two.
   1. Find $h \circ g\left( \begin{bmatrix} 1 \\ 0 \\ -5 \end{bmatrix} \right)$.
   2. Find the matrices of g and h.

   3. Find the matrix of h ◦ g.


4.3       Application: random walks
Consider a system with three states, labelled 1, 2 and 3 in Figure 4.6.
    To make the problem more vivid, one can imagine these as being actual
locations. A random walker starts off at some location, say location 1 at time 0.
Then at a sequence of times, 1, 2, . . . , n . . ., the walker either stays where he is,
or moves to one of the other locations. The next location is chosen randomly,
but according to the transition probabilities pi,j . These are numbers between
0 and 1 that measure how likely it is that, starting from location j, the walker
will move to location i. If pi,j = 0, then there is no chance that the walker will
move from j to i, and if pi,j = 1, then the walker will move for sure from j to i.

[Figure 4.6: Graphical description of a random walk with three states. The diagram shows the states 1, 2 and 3 with transitions labelled p11 , p12 , p13 , p21 , p22 , p23 , p31 , p32 and p33 .]

    Since the walker must either move from j to another site or stay put, the
sum of these probabilities must equal one:
\[
p_{1,j} + p_{2,j} + p_{3,j} = \sum_{i=1}^{3} p_{i,j} = 1
\]

At each time n there is a vector
\[
x_n = \begin{bmatrix} x_{n,1} \\ x_{n,2} \\ x_{n,3} \end{bmatrix}
\]

that gives the probabilities that the walker is in location 1, 2 or 3 at time n. Let
us compute the vector xn+1 , given xn . To start, we must compute xn+1,1 , the
probability that the walker is at location 1 at time n + 1. There are three ways
the walker can end up at location 1. The walker might have been at location
1 at time n and have stayed there. The probability of this is p1,1 xn,1 . Or he
might have been at location 2 and have moved to 1. The probability of this is
p1,2 xn,2 . Finally, he might have been at location 3 and have moved to 1. The
probability of this is p1,3 xn,3 . Thus the total probability that the walker is at
location 1 at time n + 1 is the sum

                       xn+1,1 = p1,1 xn,1 + p1,2 xn,2 + p1,3 xn,3

Similarly, for all i

                       xn+1,i = pi,1 xn,1 + pi,2 xn,2 + pi,3 xn,3

But this is exactly the formula for matrix multiplication. So

                                        xn+1 = P xn

where P is the matrix with entries [pij ].





  Now suppose the initial probabilities are given by some vector x0 . For example,
if the walker starts off at location 1, then
\[
x_0 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}
\]
Then after one time step, we have
\[
x_1 = P x_0
\]
Then,
\[
x_2 = P x_1 = P^2 x_0
\]
and so on, so after n time steps
\[
x_n = P^n x_0
\]
   Notice that the sum $x_{n,1} + x_{n,2} + x_{n,3} = \sum_{i=1}^{3} x_{n,i}$ should equal one, since
the total probability of being in one of the three locations must add up to one.
   If the initial vector has this property, then it is preserved for all later times.
To see this, suppose that $\sum_{j=1}^{3} x_j = 1$. Then, since $\sum_{i=1}^{3} p_{ij} = 1$,
\[
\begin{aligned}
\sum_{i=1}^{3} (P x)_i &= \sum_{i=1}^{3} \sum_{j=1}^{3} p_{ij} x_j \\
 &= \sum_{j=1}^{3} \left( \sum_{i=1}^{3} p_{ij} \right) x_j \\
 &= \sum_{j=1}^{3} x_j = 1
\end{aligned}
\]

In other words, the vector for the next time step also has components summing
to one. Of course, one can generalize this to a system with an arbitrary number
of states or locations. Later in the course we will see how to use eigenvalues
and eigenvectors to efficiently compute the limit as n tends to infinity of this
expression.
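
The update rule and the conservation of total probability can be illustrated in MATLAB. In the sketch below the entries of P are made up for illustration (they are not taken from these notes); the only requirement is that each column sums to one.

```matlab
% Three-state random walk with hypothetical transition probabilities.
% Column j of P holds the probabilities of moving from location j.
P = [0.50 0.20 0.30;
     0.25 0.30 0.30;
     0.25 0.50 0.40];

x0 = [1; 0; 0];            % walker starts at location 1

x1 = P*x0;                 % probabilities after one time step
x2 = P*x1;                 % probabilities after two time steps
x10 = P^10*x0;             % after ten time steps, using a matrix power

disp(sum(P, 1))            % each column of P sums to one
disp(norm(x2 - P^2*x0))    % x2 agrees with P^2 x0 up to rounding
disp(sum(x10))             % components still sum to one
```

Replacing the exponent 10 by larger values gives a numerical impression of the limit mentioned above.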
    A specific but somewhat nerdy example is given below.
Example 4.1 Ydnew the sorcerer and his apprentice, Xavier, have a magical
duel as part of a circus act. They take turns casting spells, which don’t always
work, but when they do the opponent is knocked out. Xavier always goes first
in the duel but his spells only work 1/3 of the time. Ydnew’s spells work 1/2 of
the time. When they practise, they find that each wins half the time. However,
in performances, the duel is limited to three attempts by Xavier. After that, if
there has been no knock-out, Ydnew is declared the winner.
(a) Describe the duel as a random walk.





(b) Write the matrix for the random walk.
(c) Use the matrix to analyze the probability that Ydnew will win the duel when
    Xavier’s attempts are limited and also when they are unlimited.
There are four possible states:
(1) No winner yet, Xavier’s turn
(2) No winner yet, Ydnew’s turn
(3) Xavier has won.
(4) Ydnew has won.
These are shown with transition probabilities in Figure 4.7. The transition
matrix with the node ordering we have chosen is
\[
P = \begin{bmatrix}
0 & 1/2 & 0 & 0 \\
2/3 & 0 & 0 & 0 \\
1/3 & 0 & 1 & 0 \\
0 & 1/2 & 0 & 1
\end{bmatrix}
\]
The duel begins in state (1), that is Xavier’s turn, no winner yet:
\[
x^{(0)} = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}
\]
The circus duel ends after 3 attempts by Xavier, that is after 5 transitions in
the state diagram (five rounds XYXYX). Using MATLAB we compute
\[
x^{(5)} = P^5 x^{(0)} \approx \begin{bmatrix} 0 \\ 0.0741 \\ 0.4815 \\ 0.4444 \end{bmatrix}.
\]

This means that Ydnew will win with probability 0.5185, the sum of the second
and fourth components in x(5) (he can win by a knock-out or if Xavier has
not defeated him after his third attempt). To investigate what happens in the
unlimited turn version of the duel they use when they practise, you can compute

                                  x(n) = P n x(0)

for n increasingly large. This gives strong numerical evidence that
\[
\lim_{n \to \infty} x^{(n)} = \begin{bmatrix} 0 \\ 0 \\ 1/2 \\ 1/2 \end{bmatrix}.
\]





    Figure 4.7: Diagram of the sorcerers’ duel described in Example 4.1.


That is, in the unlimited duel they each win half the time, as they observed.
This unlimited case can be analyzed rigorously. The techniques used are not
part of the course, but it is done below for completeness. Xavier has a chance
to win on any odd turn. On turn 1 he can win if his spell works (1/3 chance).
On turn 3 Xavier can win if his spell missed on turn 1, Ydnew missed on turn 2
but Xavier succeeds on turn 3 ($\frac{2}{3} \cdot \frac{1}{2} \cdot \frac{1}{3} = \frac{1}{9}$ chance). On turn 5 he can win if there
were failures in turns 1-4 but success on turn 5, with a chance of
$\frac{2}{3} \cdot \frac{1}{2} \cdot \frac{2}{3} \cdot \frac{1}{2} \cdot \frac{1}{3} = \frac{1}{27}$.

Notice the pattern that each successive chance to win goes down by a factor
of 3. The total chance of success is the sum of the chances to win at every
opportunity:
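\[
\frac{1}{3} + \frac{1}{9} + \frac{1}{27} + \cdots
 = \frac{1}{3} \left( 1 + \frac{1}{3} + \frac{1}{9} + \cdots \right)
 = \frac{1}{3} \sum_{i=0}^{\infty} \left( \frac{1}{3} \right)^i
 = \frac{1}{3} \cdot \frac{1}{1 - 1/3}
 = \frac{1}{2}
\]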
as predicted from the MATLAB numerical results. To get the second last term
above, the expression for the sum of a geometric series was used.
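
The numbers quoted in this example can be reproduced with a few lines of MATLAB. This is a sketch only: the variable names (x5, ydnewCircus, xLarge) and the choice of n = 100 as a "large" power are made here for illustration.

```matlab
% Transition matrix of the sorcerers' duel, states ordered as in Example 4.1.
P = [ 0   1/2  0  0;
     2/3   0   0  0;
     1/3   0   1  0;
      0   1/2  0  1];
x0 = [1; 0; 0; 0];               % duel starts in state (1)

x5 = P^5*x0;                     % circus duel: five transitions
ydnewCircus = x5(2) + x5(4);     % Ydnew wins by default or by knock-out
disp(x5')
disp(ydnewCircus)                % approximately 0.5185

xLarge = P^100*x0;               % unlimited duel, approximated by a large n
disp(xLarge')                    % close to [0 0 1/2 1/2]
```

Increasing the exponent further should not change the displayed digits, which is the numerical evidence referred to above.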


Example 4.2 Investigate a modification of the sorcerers’ duel described in
Example 4.1, in which both sorcerers are given a shield which can protect them from
one successful spell cast on them. Here, there are 10 possible states:
(1) No winner yet, Xavier’s turn, both have shields still.

(2) No winner yet, Xavier’s turn, Xavier has lost his shield, but Ydnew still
     has his.

(3) No winner yet, Xavier’s turn, Xavier still has his shield, but Ydnew has
     lost his.

(4) No winner yet, Xavier’s turn, both have lost their shields.

(5-8) Same as (1)-(4) above, but Ydnew’s turn.

(9) Xavier has won.





(10) Ydnew has won.
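Using the ordering of states above, the transition matrix is
\[
P = \begin{bmatrix}
0 & 0 & 0 & 0 & 1/2 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1/2 & 1/2 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1/2 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1/2 & 1/2 & 0 & 0 \\
2/3 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 2/3 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1/3 & 0 & 2/3 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1/3 & 0 & 2/3 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1/3 & 1/3 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1/2 & 0 & 1/2 & 0 & 1
\end{bmatrix}
\]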

This matrix can be entered in MATLAB (remember the techniques for enter-
ing sparse matrices, that is matrices with mostly zero entries, learnt in your
computer lab #2). Starting with x(0) = e1 we can compute

                                x(5) = P 5 x(0)

numerically in MATLAB. Adding components 5-8 (Ydnew wins by default)
and component 10 (Ydnew wins by knock-out) gives approximately 0.7778, the
chance that Ydnew will win the circus duel. The unlimited duel can be investigated
numerically, giving
\[
\lim_{n \to \infty} P^n x^{(0)} \approx
\begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0.3750 \\ 0.6250 \end{bmatrix}
\]

So in an unlimited match, Ydnew will win about 62.5% of the time.
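
One possible way to set up this computation in MATLAB is sketched below. It fills in the nonzero entries one at a time, following the state descriptions (for instance, from state 8 a failed spell by Ydnew leads to state 4). The variable names and the choice of n = 200 as a "large" power are assumptions made for this illustration, not the method of the computer lab.

```matlab
% Shielded duel: start from a zero matrix and set the nonzero entries P(i,j),
% the probability of moving from state j to state i.
P = zeros(10);

% Xavier's turn (states 1-4): his spell works with probability 1/3.
P(5,1) = 2/3;  P(7,1) = 1/3;     % hit removes Ydnew's shield
P(6,2) = 2/3;  P(8,2) = 1/3;
P(7,3) = 2/3;  P(9,3) = 1/3;     % hit on an unshielded Ydnew: Xavier wins
P(8,4) = 2/3;  P(9,4) = 1/3;

% Ydnew's turn (states 5-8): his spell works with probability 1/2.
P(1,5) = 1/2;  P(2,5) = 1/2;     % hit removes Xavier's shield
P(2,6) = 1/2;  P(10,6) = 1/2;    % hit on an unshielded Xavier: Ydnew wins
P(3,7) = 1/2;  P(4,7) = 1/2;
P(4,8) = 1/2;  P(10,8) = 1/2;

% States 9 and 10 (a winner has been decided) are never left.
P(9,9) = 1;    P(10,10) = 1;

x0 = zeros(10,1);  x0(1) = 1;    % start in state (1)

x5 = P^5*x0;
ydnewCircus = sum(x5(5:8)) + x5(10);   % approximately 0.7778
xLarge = P^200*x0;                     % last two entries near 0.3750 and 0.6250
disp(ydnewCircus)
disp(xLarge')
```

A quick sanity check on any transition matrix entered this way is that sum(P, 1) returns a row of ones.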

4.3.1    Problems
Problem 4.19 Consider a random walk with 3 states, where the probability of
staying in the same location is zero. Suppose
   • the probability of moving from location 1 to location 2 is 1/2
   • the probability of moving from location 2 to location 1 is 1/3
   • the probability of moving from location 3 to location 1 is 1/4





Write down the matrix P . What is the probability that a walker starting in
location 1 is in location 2 after two time steps?
Problem 4.20 Consider a random walk with 3 states, where the probability of
staying in the same location is zero. Suppose
   • the probability of moving from location 1 to location 2 is 1/3, that is p2,1 = 1/3.

   • the probability of moving from location 2 to location 1 is 2/5, that is p1,2 = 2/5.

   • the probability of moving from location 3 to location 1 is 1/4, that is p1,3 = 1/4.

  1. Write down the matrix P.
  2. What is the probability that a walker starting in location 1 is in location 2
     after three time steps?
  3. What is the probability that a walker starting in location 3 is in location 1
     after two time steps?
Problem 4.21 Consider a random walk with 4 states, where all the probabilities
pi,j are equal to 1/4.
  1. Compute P and P n , for every positive integer n.
  2. Compute the probability location vectors xn (xn = P n x0 ) in each of the
     following cases:
\[
x_0 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \quad \text{or} \quad
x_0 = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}.
\]

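Problem 4.22 Suppose that the matrix P given below is the matrix of a random
walk:
\[
P = \begin{bmatrix}
\frac{1}{3} & \frac{1}{4} & \frac{2}{5} \\
0 & \frac{1}{2} & \frac{1}{5} \\
\frac{2}{3} & \frac{1}{4} & \frac{2}{5}
\end{bmatrix}.
\]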

  1. What is the probability that a walker starting in location 2 is in location 1
     after one time step?
  2. What is the probability that a walker starting in location 2 is in location 1
     after two time steps?
  3. If $x_0 = \begin{bmatrix} 1/3 \\ 1/3 \\ 2/3 \end{bmatrix}$ what is the probability that the walker is in location 3
     after two time steps?






          Figure 4.8: Diagram of the random walk in Problem 4.24.


Problem 4.23 Consider a random walk with 3 states, where all the probabilities
pi,j are equal to 1/3. What are P and P n ? Compute the probabilities P n x0 when
$x_0 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$ (i.e. the walker starts in location 1),
$x_0 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$ (i.e. the walker starts in location 2), and
$x_0 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$ (i.e. the walker starts in location 3).

Problem 4.24 Consider a random walk with three states with transition prob-
abilities shown in Figure 4.8.

  1. Suppose the system starts in state 3. What is the probability that it is in
     state 2 after 2 steps?

  2. Given that
\[
P^k \to \begin{bmatrix}
0.25 & 0.25 & 0.25 \\
0.375 & 0.375 & 0.375 \\
0.375 & 0.375 & 0.375
\end{bmatrix}
\]
      as k tends to infinity, what are the probabilities of the system being in each
      state after the system has been running for a long time? Show that these
      probabilities do not depend on the initial state.



4.4     The Transpose
If A is an m × n matrix, then its transpose $A^T$ is the matrix obtained by flipping
A about its diagonal. So the columns of $A^T$ are the rows of A (in the same order)
and vice versa. For example, if
\[
A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}
\]
then
\[
A^T = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}
\]
Another way of saying this is that the i, jth entry of $A^T$ is the same as the j, ith
entry of A, that is,
\[
a^T_{i,j} = a_{j,i} .
\]

   There are two important formulae to remember for the transpose of a matrix.
The first gives a relation between the transpose and the dot product. If A is an
m × n matrix, then for every x ∈ Rn and y ∈ Rm we have

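\[
y \cdot (Ax) = (A^T y) \cdot x \tag{4.2}
\]
The proof of this formula is a simple calculation.
\[
\begin{aligned}
y \cdot (Ax) &= \sum_{i=1}^{m} y_i \left( \sum_{j=1}^{n} a_{i,j} x_j \right) \\
 &= \sum_{i=1}^{m} \sum_{j=1}^{n} y_i a_{i,j} x_j \\
 &= \sum_{j=1}^{n} \sum_{i=1}^{m} a^T_{j,i} y_i x_j \\
 &= \sum_{j=1}^{n} \left( \sum_{i=1}^{m} a^T_{j,i} y_i \right) x_j \\
 &= (A^T y) \cdot x
\end{aligned}
\]
In fact, the formula (4.2) could be used to define the transpose. Given A, there
is exactly one matrix $A^T$ for which (4.2) is true for every x and y, and this
matrix is the transpose.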
    The second important formula relates the transpose of a product of matrices
to the transposes of each one. For two matrices A and B such that AB is defined
the formula reads
\[
(AB)^T = B^T A^T \tag{4.3}
\]
Notice that the order of the factors is reversed on the right side. To see why
(4.3) is true, notice that on the one hand
\[
y \cdot (ABx) = ((AB)^T y) \cdot x
\]
while on the other hand
\[
y \cdot (ABx) = y \cdot (A(Bx)) = (A^T y) \cdot (Bx) = (B^T A^T y) \cdot x
\]
Thus $((AB)^T y) \cdot x = (B^T A^T y) \cdot x$ for every x and y. This can only be true if
(4.3) holds.





4.4.1    MATLAB
The MATLAB operator ' (an apostrophe) can be used to take the transpose of a matrix.
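
As a brief illustration, the following sketch applies the operator to the 2 × 3 example from this section and checks formulas (4.2) and (4.3) on sample data. The vectors x and y and the matrix B are arbitrary values chosen here. For matrices with complex entries the operator ' also takes complex conjugates, and .' gives the plain transpose; for real matrices the two agree.

```matlab
A = [1 2 3; 4 5 6];       % the 2x3 example matrix from this section
At = A';                  % its transpose, a 3x2 matrix
disp(At)

x = [1; -2; 3];           % arbitrary vector in R^3
y = [4; 5];               % arbitrary vector in R^2
lhs = dot(y, A*x);        % y . (Ax)
rhs = dot(A'*y, x);       % (A^T y) . x
disp([lhs, rhs])          % the two sides of (4.2) agree

B = [1 0; 2 1; 0 3];      % arbitrary 3x2 matrix, so that AB is defined
disp(isequal((A*B)', B'*A'))   % formula (4.3)
```

All entries here are integers, so the comparisons are exact; with non-integer data a small tolerance would be more appropriate than isequal.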

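4.4.2    Problems
Problem 4.25 Verify formula (4.2) for
$x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$,
$y = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$ and
\[
A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}
\]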

Problem 4.26 What is (AT )T ?

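Problem 4.27 Verify (4.3) for
\[
A = \begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix} \quad \text{and} \quad
B = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}
\]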

Problem 4.28 Show that if A and B are both m × n matrices such that y ·
(Ax) = y · (Bx) for every y ∈ Rm and every x ∈ Rn , then A = B.

Problem 4.29 Show that if you think of (column) vectors in Rn as n × 1
matrices then
                             x · y = xT y
Now use this formula and (4.3) to derive (4.2).


4.5     Matrix Inverses
To solve the (scalar) equation
                                      ax = b
for x we simply multiply both sides by $a^{-1} = \frac{1}{a}$. Then, since $a^{-1} a = 1$, we find
\[
x = a^{-1} b.
\]
Of course, if a = 0 this doesn’t work, since we cannot divide by zero. In fact, if
a = 0 the equation ax = b either has no solutions, if b ≠ 0, or infinitely many
solutions (every value of x), if b = 0.
    We have seen that a system of linear equations can be rewritten

                                      Ax = b

where A is a known matrix, x is the unknown vector to be solved for, and b
is a known vector. Suppose we could find an inverse matrix B (analogous to
$a^{-1}$) with the property that BA = I (recall that I denotes the identity matrix).
Then we could multiply both sides of the equation by B, yielding

                                     BAx = Bb

But BAx = Ix = x, so x = Bb. Thus there is a unique solution and we have a
formula for it.
   Just as in the numerical case, where a could be zero, we can’t expect to
find an inverse matrix in all cases. After all, we know that there are linear
systems of equations with no solutions and with infinitely many solutions. In
these situations there can be no inverse matrix.
   When considering matrix inverses, we will always assume that we are dealing
with square (i.e., n × n) matrices.

Definition 4.1 If A is an n × n matrix, then B is called the inverse of A, and
denoted B = A−1 , if
                                BA = I
where I is the n × n identity matrix (with each diagonal entry equal to 1 and all
other entries 0).

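   Here is an example. Suppose
\[
A = \begin{bmatrix} 2 & 1 \\ 5 & 3 \end{bmatrix}
\]
then the inverse matrix is
\[
B = A^{-1} = \begin{bmatrix} 3 & -1 \\ -5 & 2 \end{bmatrix}
\]
since
\[
BA = \begin{bmatrix} 3 & -1 \\ -5 & 2 \end{bmatrix} \begin{bmatrix} 2 & 1 \\ 5 & 3 \end{bmatrix}
   = \begin{bmatrix} 6 - 5 & 3 - 3 \\ -10 + 10 & -5 + 6 \end{bmatrix}
   = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]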
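This means that to solve the linear system
\[ \begin{aligned} 2x_1 + x_2 &= 2 \\ 5x_1 + 3x_2 &= 4 \end{aligned} \]
we write it as a matrix equation
\[ \begin{bmatrix} 2 & 1 \\ 5 & 3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 2 \\ 4 \end{bmatrix} \]
and then multiply both sides by the inverse to obtain
\[ \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 3 & -1 \\ -5 & 2 \end{bmatrix} \begin{bmatrix} 2 \\ 4 \end{bmatrix} = \begin{bmatrix} 2 \\ -2 \end{bmatrix}. \]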
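The computation above can be checked mechanically. The following is a minimal sketch in Python rather than the MATLAB used in the labs; the helper name `matmul` and the variable names are assumptions made for illustration. It confirms that BA = I and that x = Bb satisfies Ax = b.

```python
def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]

A = [[2, 1], [5, 3]]
B = [[3, -1], [-5, 2]]      # the claimed inverse of A
b = [[2], [4]]              # right-hand side as a column vector

BA = matmul(B, A)           # should be the 2 x 2 identity
x = matmul(B, b)            # x = B b
residual = matmul(A, x)     # should reproduce b

print(BA)   # [[1, 0], [0, 1]]
print(x)    # [[2], [-2]]
```

Multiplying A by the computed x and comparing with b is a cheap way to catch arithmetic slips in a hand-computed inverse.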







       Figure 4.9: Diagram of the reduced form of an invertible matrix.


   Here is an example of a matrix that doesn’t have an inverse. Let
\[ A = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}. \]
To see that A doesn’t have an inverse, notice that the homogeneous equation
Ax = 0 has a non-zero solution
\[ x = \begin{bmatrix} 0 \\ 1 \end{bmatrix}. \]
If A had an inverse B, then we could multiply both sides of the equation
Ax = 0 by B to obtain x = B0 = 0. But this is false. Therefore there cannot
be an inverse for A.
    Clearly, having an inverse is somehow connected to whether or not there
are any non-zero solutions of the homogeneous equation Ax = 0. Recall that
Ax = 0 has only the zero solution precisely when Ax = b has a unique solution
for any b.
    Let A be an n × n matrix. The following conditions are equivalent:

(1) A is invertible.

(2) The equation Ax = b always has a unique solution.

(3) The equation Ax = 0 has as the only solution x = 0.

(4) The rank of A is n.

(5) The reduced form of A is as shown in Figure 4.9.

     We already know that conditions (2), (3), (4) and (5) are all equivalent.
We have also just seen that if A is invertible with inverse B, then the solution
of Ax = b is x = Bb, so it exists, and since we have a formula for it, it is unique.
     So we just have to show that if Ax = b always has a unique solution, then
A has an inverse. Consider the transformation T that takes a vector b to the
unique solution x of Ax = b, i.e., T b = x. It is easy to check that this
is a linear transformation: if T b1 = x1, i.e., Ax1 = b1, and T b2 = x2,
i.e., Ax2 = b2, then A(t1 x1 + t2 x2) = t1 Ax1 + t2 Ax2 = t1 b1 + t2 b2, so that
T (t1 b1 + t2 b2) = t1 x1 + t2 x2 = t1 T b1 + t2 T b2. Since T is a linear transformation,
it is given by some matrix B, and since T (Ax) = x, we must have BAx = x for
every x, which implies that BA is the identity matrix.
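    Going back to our first example, notice that not only is BA = I, but AB = I
too, since
\[ AB = \begin{bmatrix} 2 & 1 \\ 5 & 3 \end{bmatrix} \begin{bmatrix} 3 & -1 \\ -5 & 2 \end{bmatrix} = \begin{bmatrix} 6-5 & -2+2 \\ 15-15 & -5+6 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}. \]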

For a general choice of A and B, BA need not be equal to AB. But if B is the
inverse of A, then it is always true that AB = BA = I.
   To see this, suppose that A is invertible with BA = I but we don’t know yet
whether AB = I. So what we need to show is that for every vector x, ABx = x.
First notice that if A is invertible, then any vector x can be written in the form
Ay for some y, since this is just the same as saying that the equation Ay = x
has a solution y. Thus ABx = ABAy = AIy = Ay = x.

4.5.1     Computing the inverse
How can we compute the inverse of an n × n matrix A? Suppose that A is
invertible and B is the inverse. Then AB = I. We can rewrite this equation as
follows. Think of the columns of B as being column vectors so that
\[ B = \begin{bmatrix} b_1 & b_2 & \cdots & b_n \end{bmatrix}. \]
Then the rules of matrix multiplication imply that
\[ AB = \begin{bmatrix} Ab_1 & Ab_2 & \cdots & Ab_n \end{bmatrix}. \]
Now the identity matrix can also be written as a matrix of column vectors. In
this case the kth column is simply the vector with zeros everywhere except for
a 1 in the kth place, in other words the vector ek. Thus
\[ I = \begin{bmatrix} e_1 & e_2 & \cdots & e_n \end{bmatrix}. \]
   So if AB = I then the n equations
\[ \begin{aligned} Ab_1 &= e_1 \\ Ab_2 &= e_2 \\ &\;\;\vdots \\ Ab_n &= e_n \end{aligned} \]
hold. If we solve each of these equations for b1, b2, . . ., bn, then we have found
the inverse B.





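   Here is a simple example. Suppose we want to find the inverse for
\[ A = \begin{bmatrix} 2 & 1 \\ 5 & 3 \end{bmatrix}. \]
According to our discussion, we must solve Ab1 = e1 and Ab2 = e2. The
augmented matrix for Ab1 = e1 is
\[ \left[\begin{array}{cc|c} 2 & 1 & 1 \\ 5 & 3 & 0 \end{array}\right] \]
We now perform a sequence of row operations. First divide the first row by 2.
This gives
\[ \left[\begin{array}{cc|c} 1 & 1/2 & 1/2 \\ 5 & 3 & 0 \end{array}\right]. \]
Now subtract 5 times the first row from the second row. This gives
\[ \left[\begin{array}{cc|c} 1 & 1/2 & 1/2 \\ 0 & 1/2 & -5/2 \end{array}\right]. \]
Now subtract the second row from the first row. This gives
\[ \left[\begin{array}{cc|c} 1 & 0 & 3 \\ 0 & 1/2 & -5/2 \end{array}\right]. \]
Finally, multiply the second row by 2. This gives
\[ \left[\begin{array}{cc|c} 1 & 0 & 3 \\ 0 & 1 & -5 \end{array}\right]. \]
Therefore
\[ b_1 = \begin{bmatrix} 3 \\ -5 \end{bmatrix}. \]
The augmented matrix for Ab2 = e2 is
\[ \left[\begin{array}{cc|c} 2 & 1 & 0 \\ 5 & 3 & 1 \end{array}\right] \]
We now perform a sequence of row operations. First divide the first row by 2.
This gives
\[ \left[\begin{array}{cc|c} 1 & 1/2 & 0 \\ 5 & 3 & 1 \end{array}\right]. \]
Now subtract 5 times the first row from the second row. This gives
\[ \left[\begin{array}{cc|c} 1 & 1/2 & 0 \\ 0 & 1/2 & 1 \end{array}\right]. \]
Now subtract the second row from the first row. This gives
\[ \left[\begin{array}{cc|c} 1 & 0 & -1 \\ 0 & 1/2 & 1 \end{array}\right]. \]
Finally, multiply the second row by 2. This gives
\[ \left[\begin{array}{cc|c} 1 & 0 & -1 \\ 0 & 1 & 2 \end{array}\right]. \]
Therefore
\[ b_2 = \begin{bmatrix} -1 \\ 2 \end{bmatrix}. \]
So
\[ B = \begin{bmatrix} b_1 & b_2 \end{bmatrix} = \begin{bmatrix} 3 & -1 \\ -5 & 2 \end{bmatrix}. \]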

Notice that we performed exactly the same sequence of row operations in finding
b1 and b2 . This is because the row operations only depend on the left side of
the augmented matrix, in other words, the matrix A. If we used this procedure
to find the inverse of an n × n matrix, we would end up doing exactly the same
row operations n times. Clearly this is a big waste of effort! We can save a lot
of work by solving all the equations at the same time. To do this we make a
super-augmented matrix with both right sides.

\[ \left[\begin{array}{cc|cc} 2 & 1 & 1 & 0 \\ 5 & 3 & 0 & 1 \end{array}\right] \]

Now we only have to go through the sequence of row operations once, keeping
track of both right sides simultaneously. Going through the same sequence, we
obtain
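\[ \left[\begin{array}{cc|cc} 1 & 1/2 & 1/2 & 0 \\ 5 & 3 & 0 & 1 \end{array}\right] \]
\[ \left[\begin{array}{cc|cc} 1 & 1/2 & 1/2 & 0 \\ 0 & 1/2 & -5/2 & 1 \end{array}\right] \]
\[ \left[\begin{array}{cc|cc} 1 & 0 & 3 & -1 \\ 0 & 1/2 & -5/2 & 1 \end{array}\right] \]
\[ \left[\begin{array}{cc|cc} 1 & 0 & 3 & -1 \\ 0 & 1 & -5 & 2 \end{array}\right] \]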
Notice that the vectors b1 and b2 are automatically arranged as columns on
the right side, so the matrix on the right is the inverse B.
    The same procedure works for any size of square matrix. To find the inverse
of A, form the super-augmented matrix [A|I]. Then do a sequence of row
operations to reduce A to the identity. If the resulting matrix is [I|B] then B is the
inverse matrix.
    What happens if A doesn’t have an inverse? In this case it will be impossible
to reduce A to the identity matrix, since the rank of A is less than n. So the
procedure will fail, as it must.
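The [A|I] → [I|B] procedure can be sketched in a few lines. This is an illustrative Python version under stated assumptions, not the course's MATLAB code: the function name `inverse` is hypothetical, exact fractions are used so that no rounding occurs, and a row exchange is added for the case of a zero pivot (a step the 2 × 2 example above did not need). When no pivot can be found the function returns `None`, which corresponds to the rank of A being less than n.

```python
from fractions import Fraction

def inverse(A):
    """Return the inverse of a square matrix A by reducing [A | I] to [I | B].

    Returns None when A cannot be reduced to the identity (rank < n).
    """
    n = len(A)
    # Build the super-augmented matrix [A | I] with exact rational entries.
    M = [[Fraction(A[i][j]) for j in range(n)] +
         [Fraction(int(i == j)) for j in range(n)]
         for i in range(n)]
    for col in range(n):
        # Look for a row at or below the diagonal with a non-zero entry here.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None                       # no pivot: A is not invertible
        M[col], M[pivot] = M[pivot], M[col]   # row exchange (no-op if pivot == col)
        p = M[col][col]
        M[col] = [v / p for v in M[col]]      # scale so the pivot equals 1
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                # Subtract a multiple of the pivot row to clear this column.
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    # The right half of [I | B] is the inverse.
    return [row[n:] for row in M]

B = inverse([[2, 1], [5, 3]])        # the example worked by hand above
singular = inverse([[1, 0], [0, 0]]) # the non-invertible example: None
```

All n right-hand sides are carried along in one pass, which is exactly the saving the super-augmented matrix provides.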
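    As another example, let us now compute the inverse of an arbitrary invertible
2 × 2 matrix
\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}. \]
We will see that A is invertible precisely when its determinant ∆ = ad − bc is
non-zero. So let’s assume this is the case, and do the computation. To start
with, let’s assume that ac ≠ 0. Then neither a nor c is zero. Here is the
sequence of row transformations.
\[ \left[\begin{array}{cc|cc} a & b & 1 & 0 \\ c & d & 0 & 1 \end{array}\right] \]
\[ \left[\begin{array}{cc|cc} ac & bc & c & 0 \\ ac & ad & 0 & a \end{array}\right] \quad \begin{array}{l} c(1) \\ a(2) \end{array} \]
Notice that multiplication by a and by c would not be legal row transformations
if either a or c were zero.
\[ \left[\begin{array}{cc|cc} ac & bc & c & 0 \\ 0 & ad - bc & -c & a \end{array}\right] \quad \begin{array}{l} \phantom{(1)} \\ (2) - (1) \end{array} \]
\[ \left[\begin{array}{cc|cc} 1 & b/a & 1/a & 0 \\ 0 & 1 & -c/\Delta & a/\Delta \end{array}\right] \quad \begin{array}{l} (1/ac)(1) \\ (1/\Delta)(2) \end{array} \]
\[ \left[\begin{array}{cc|cc} 1 & 0 & d/\Delta & -b/\Delta \\ 0 & 1 & -c/\Delta & a/\Delta \end{array}\right] \quad \begin{array}{l} (1) - (b/a)(2) \\ \phantom{(2)} \end{array} \]
Thus the inverse matrix is
\[ A^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}. \]
This was derived under the additional assumption that ac ≠ 0. However one
can check directly that the same formula works, so long as ∆ = ad − bc ≠ 0.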
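As a quick check of the 2 × 2 formula, here is a small Python sketch; the function name `inverse_2x2` is an assumption for illustration, and exact fractions are used to avoid rounding. It includes a case with a = 0, where the derivation's extra assumption ac ≠ 0 fails but the formula still applies.

```python
from fractions import Fraction

def inverse_2x2(a, b, c, d):
    """Inverse of [[a, b], [c, d]] from the closed-form formula.

    Returns None when the determinant ad - bc is zero.
    """
    det = a * d - b * c
    if det == 0:
        return None
    s = Fraction(1) / det            # the factor 1 / (ad - bc)
    return [[d * s, -b * s],
            [-c * s, a * s]]

first = inverse_2x2(2, 1, 5, 3)      # the earlier example, determinant 1
swap = inverse_2x2(0, 1, 1, 0)       # a = 0, determinant -1
none = inverse_2x2(1, 2, 2, 4)       # determinant 0, so no inverse
```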

4.5.2    Inverses of Products
If both A and B are invertible, then so is AB. The inverse of AB is given by
B^{-1} A^{-1}. To check this, simply compute
\[ A B B^{-1} A^{-1} = A I A^{-1} = A A^{-1} = I. \]
   If one of A or B is not invertible then AB is not invertible. To see this recall
that a matrix C is not invertible exactly when there is a non-zero solution
x to Cx = 0. If B is not invertible, then there is a non-zero vector x with
Bx = 0. Then ABx = A0 = 0, so AB is not invertible either. If B is invertible,
but A is not, then there is a non-zero x with Ax = 0. Let y = B^{-1} x. Since
B^{-1} is invertible, y cannot be zero. We have ABy = ABB^{-1} x = Ax = 0, so
AB is not invertible.
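The order of the factors in B^{-1} A^{-1} matters. The sketch below, in Python with hypothetical helper names `matmul` and `inv2`, compares the inverse of AB with both B^{-1} A^{-1} and A^{-1} B^{-1} for two 2 × 2 matrices taken from this section, and also shows a product with a non-invertible factor.

```python
from fractions import Fraction

def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]

def inv2(M):
    """Inverse of a 2 x 2 integer matrix via the formula, or None if det = 0."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        return None
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

A = [[2, 1], [5, 3]]
B = [[1, 2], [3, 5]]
S = [[1, 0], [0, 0]]                         # not invertible

lhs = inv2(matmul(A, B))                     # (AB)^{-1}
rhs = matmul(inv2(B), inv2(A))               # B^{-1} A^{-1}
wrong_order = matmul(inv2(A), inv2(B))       # A^{-1} B^{-1}
product_with_singular = inv2(matmul(S, B))   # None: SB is not invertible

print(lhs == rhs)           # True
print(lhs == wrong_order)   # False
```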







4.5.3         MATLAB
If A is an invertible n × n matrix then the MATLAB command inv(A) will
return its inverse. A system of linear equations corresponding to Ax = b where
A is an invertible n × n matrix and b is a column vector with n components can
be solved with the command
x = A\b
Note that if A is not square or A is square but not invertible the corresponding
system cannot be solved in this way. In this case, use the MATLAB command
rref on the augmented matrix of the system as described in Chapter 3.

4.5.4         Problems
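Problem 4.30 Which of the following matrices are invertible?
(a) \[ \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \]
(b) \[ \begin{bmatrix} 1 & 2 & 3 \\ 0 & 3 & 4 \\ 0 & 1 & 1 \end{bmatrix} \]
(c) \[ \begin{bmatrix} 1 & 2 & 3 \\ 0 & 3 & 4 \\ 0 & 1 & 2 \end{bmatrix} \]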

Problem 4.31 Find the inverse for
\[ \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix} \]

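Problem 4.32 Determine which of these matrices are invertible, and find the
inverse for the invertible ones.
(a) \[ \begin{bmatrix} 2 & 3 & -1 \\ 1 & 2 & 3 \\ -1 & -1 & 4 \end{bmatrix} \]
(b) \[ \begin{bmatrix} 1 & -1 & 1 \\ -1 & 2 & -1 \\ 2 & -1 & 1 \end{bmatrix} \]
(c) \[ \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 1 & 4 & 9 \end{bmatrix} \]
(d) \[ \begin{bmatrix} 2 & 1 & 4 \\ 3 & 2 & 5 \\ 0 & -1 & 1 \end{bmatrix} \]
(e) \[ \begin{bmatrix} 1 & 0 & a \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]
(f) \[ \begin{bmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{bmatrix} \]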

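Problem 4.33 The following matrices are invertible. Find their inverses.
\[ A = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 2 & 3 \\ 5 & 5 & 1 \end{bmatrix}, \qquad B = \begin{bmatrix} 1 & 2 & -3 & 1 \\ -1 & 3 & -3 & -2 \\ 2 & 0 & 1 & 5 \\ 3 & 1 & -2 & 5 \end{bmatrix}. \]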

Problem 4.34 Consider the system
\[ \begin{aligned} x + 5z &= 6 \\ x - 2y + 3z &= 14 \\ 2x + y - 3z &= -2 \end{aligned} \]

   1. Write the above system in matrix form, in other words find a matrix
      A such that
      \[ Ax = \begin{bmatrix} 6 \\ 14 \\ -2 \end{bmatrix}. \]
   2. Find the inverse of A, if possible.
   3. How many solutions does the system have? Write them all down.


4.6        Return to Resistor Networks
Let us revisit the resistor networks we considered in section 3.5. These networks
consist of resistors and voltage and current sources. The fundamental problem
introduced there was to write the current through the voltage sources and
the voltage across the current sources in terms of arbitrary values of the sources.
   Recall example 3.15 which considered the circuit shown again in figure 4.10.
The sources are V and I. The fundamental problem is to determine E and J
(the current through the voltage source) in terms of V and I. The solution
found in section 3.5.3 was

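\[ \begin{aligned} E &= 6I + V/6 \\ J &= -\frac{1}{6} I + \frac{10}{72} V \end{aligned} \tag{4.4} \]
This solution can be given in matrix-vector form as
\[ \begin{bmatrix} E \\ J \end{bmatrix} = \begin{bmatrix} 6 & \frac{1}{6} \\ -\frac{1}{6} & \frac{10}{72} \end{bmatrix} \begin{bmatrix} I \\ V \end{bmatrix} \tag{4.5} \]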





            Figure 4.10: The circuit considered in Example 3.15.

where we will call the 2 × 2 matrix in the equation above F . It can be shown
that for any circuit with n sources, the solution of the fundamental problem can
be written as multiplication by an n × n matrix. We will show below how F
can be constructed in a systematic way. Look back to section 3.5.3 to see how
we arrived at (4.4). There were three parts to the process:
  1. We wrote the linear system for the loop currents and the voltage drop
     across the current source. The equations for the system had the sources
     I and V in the right hand sides.
  2. We solved the system symbolically in terms of the right hand side involving
     V and I.
  3. We identified the elements of the solution that solved the fundamental
     problem.
We will now describe these three steps by matrix multiplication and do the
multiplication as an alternate way to determine the matrix F in (4.5). Recall
that the unknowns for the circuit are the three loop currents and E, the voltage
across the current source. Define the intermediate unknown column vector
\[ x = (i_1, i_2, i_3, E)^T. \]
Proceeding as in example 3.15 we find equations for x by considering the voltage
drops around each elementary loop (the first three equations) and matching
the current through the current source (the last equation below):
\[ \begin{array}{rrrrcr}
8i_1 & -2i_2 & & & = & V \\
-2i_1 & +5i_2 & -3i_3 & & = & 0 \\
 & -3i_2 & +8i_3 & +E & = & 0 \\
 & & i_3 & & = & -I
\end{array} \tag{4.6} \]
Here, we did not eliminate i3 from the equations as was done in section 3.5.3.
The system (4.6) can be written
\[ F_2 x = F_1 \begin{bmatrix} I \\ V \end{bmatrix} \tag{4.7} \]




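where
\[ F_2 = \begin{bmatrix} 8 & -2 & 0 & 0 \\ -2 & 5 & -3 & 0 \\ 0 & -3 & 8 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix} \quad \text{and} \quad F_1 = \begin{bmatrix} 0 & 1 \\ 0 & 0 \\ 0 & 0 \\ -1 & 0 \end{bmatrix}. \]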
Since F2 is invertible (always true for our approach using loop currents and
current source voltages as variables) we can proceed from (4.7) to
\[ x = F_2^{-1} F_1 \begin{bmatrix} I \\ V \end{bmatrix} \tag{4.8} \]
Now to solve the fundamental problem of the circuit we want J = i1, the current
through the voltage source, and E, the voltage across the current source. We
can write
\[ \begin{bmatrix} E \\ J \end{bmatrix} = F_3 x \tag{4.9} \]
where
\[ F_3 = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{bmatrix}. \]
Combining (4.8) and (4.9) we obtain
\[ \begin{bmatrix} E \\ J \end{bmatrix} = F_3 F_2^{-1} F_1 \begin{bmatrix} I \\ V \end{bmatrix} \tag{4.10} \]
Comparing (4.10) to (4.5) we see that
\[ F = F_3 F_2^{-1} F_1 \]
where F is the matrix representing the fundamental solution. Computation
does indeed show that F3 F2^{-1} F1 equals F with the matrices defined above.
    The solution of the fundamental problem for any circuit can be written as
the product of three matrices in the same process described above. We illustrate
this with a more complex circuit below.

Example 4.3 Find the matrix for the fundamental solution of the circuit shown
in Figure 4.11. Here there are four sources I1, I2, V1 and V2 so the fundamental
problem will be written
\[ \begin{bmatrix} E_1 \\ E_2 \\ J_1 \\ J_2 \end{bmatrix} = F \begin{bmatrix} I_1 \\ I_2 \\ V_1 \\ V_2 \end{bmatrix} \tag{4.11} \]
where F is a 4 × 4 matrix to be determined. The intermediate unknowns in the
circuit will be
\[ x = (i_1, i_2, i_3, i_4, E_1, E_2)^T \]








  Figure 4.11: The circuit considered in Example 4.3. All resistors are 1Ω.


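as shown in the figure. Equations for these unknowns are found using the loop
current method:
\[ \begin{array}{rrrrrrcl}
3i_1 & -i_2 & & -i_4 & & -E_2 & = & 0 \\
-i_1 & +3i_2 & -i_3 & & +E_1 & & = & 0 \\
 & -i_2 & +3i_3 & -i_4 & & & = & V_1 - V_2 \\
-i_1 & & -i_3 & +3i_4 & & +E_2 & = & V_1 \\
 & i_2 & & & & & = & -I_1 \\
-i_1 & & & +i_4 & & & = & -I_2
\end{array} \tag{4.12} \]
where the first four equations above come from matching voltage drops around
the four elementary loops and the last two equations come from matching the
loop currents to the current sources. Note that (4.12) can be written
\[ F_2 x = F_1 \begin{bmatrix} I_1 \\ I_2 \\ V_1 \\ V_2 \end{bmatrix} \tag{4.13} \]
where
\[ F_2 = \begin{bmatrix} 3 & -1 & 0 & -1 & 0 & -1 \\ -1 & 3 & -1 & 0 & 1 & 0 \\ 0 & -1 & 3 & -1 & 0 & 0 \\ -1 & 0 & -1 & 3 & 0 & 1 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ -1 & 0 & 0 & 1 & 0 & 0 \end{bmatrix} \quad \text{and} \quad F_1 = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 1 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \end{bmatrix} \]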






Now J2 = i4 − i3 and J1 = i3 so we can write
\[ \begin{bmatrix} E_1 \\ E_2 \\ J_1 \\ J_2 \end{bmatrix} = F_3 x \tag{4.14} \]
where
\[ F_3 = \begin{bmatrix} 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & -1 & 1 & 0 & 0 \end{bmatrix}. \]
As above we can combine (4.13) and (4.14) to obtain
\[ \begin{bmatrix} E_1 \\ E_2 \\ J_1 \\ J_2 \end{bmatrix} = F_3 F_2^{-1} F_1 \begin{bmatrix} I_1 \\ I_2 \\ V_1 \\ V_2 \end{bmatrix}. \]
Comparing the equation above to the desired form (4.11) we see that the
fundamental matrix F can be computed as
\[ F = F_3 F_2^{-1} F_1. \]
A MATLAB computation with the matrices defined above gives
\[ F \approx \begin{bmatrix} 2.1818 & 0.2727 & 0.8182 & -0.4545 \\ 0.2727 & 1.9091 & 0.7273 & -0.1818 \\ -0.4545 & -0.1818 & 0.4545 & -0.3636 \\ 0.0909 & -0.3636 & -0.0909 & 0.2727 \end{bmatrix} \]

    This technique can be used to find the fundamental solution of circuits even
if they are very large. You can imagine that the process of setting up the
matrices F1, F2 and F3, computing an approximation of F2^{-1} and multiplying
the matrices together can all be automated and done computationally.
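As an illustration of how this can be automated, here is a minimal Python sketch (the notes themselves use MATLAB; the function names `solve`, `matmul` and `fundamental_matrix` are assumptions for this example). Instead of forming F2^{-1} explicitly it row-reduces [F2 | F1], which yields the same product F2^{-1} F1. Exact fractions are used, so the results can be compared directly with (4.5) and with the decimal values quoted in Example 4.3.

```python
from fractions import Fraction

def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]

def solve(A, B):
    """Return X with A X = B by row-reducing [A | B]; A is assumed invertible."""
    n = len(A)
    M = [[Fraction(v) for v in A[i]] + [Fraction(v) for v in B[i]]
         for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]      # row exchange if needed
        p = M[col][col]
        M[col] = [v / p for v in M[col]]         # make the pivot equal to 1
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]

def fundamental_matrix(F1, F2, F3):
    """F = F3 * F2^{-1} * F1, computed by solving F2 X = F1."""
    return matmul(F3, solve(F2, F1))

# Circuit of Example 3.15, matrices from (4.7) and (4.9).
F_small = fundamental_matrix(
    [[0, 1], [0, 0], [0, 0], [-1, 0]],
    [[8, -2, 0, 0], [-2, 5, -3, 0], [0, -3, 8, 1], [0, 0, 1, 0]],
    [[0, 0, 0, 1], [1, 0, 0, 0]])

# Circuit of Example 4.3, matrices from (4.13) and (4.14).
F_big = fundamental_matrix(
    [[0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 1, -1],
     [0, 0, 1, 0], [-1, 0, 0, 0], [0, -1, 0, 0]],
    [[3, -1, 0, -1, 0, -1], [-1, 3, -1, 0, 1, 0], [0, -1, 3, -1, 0, 0],
     [-1, 0, -1, 3, 0, 1], [0, 1, 0, 0, 0, 0], [-1, 0, 0, 1, 0, 0]],
    [[0, 0, 0, 0, 1, 0], [0, 0, 0, 0, 0, 1],
     [0, 0, 1, 0, 0, 0], [0, 0, -1, 1, 0, 0]])

# Decimal view of F_big for comparison with the values quoted above.
F_big_decimal = [[round(float(v), 4) for v in row] for row in F_big]
```

Every entry of `F_big` turns out to be a multiple of 1/11 (for example 24/11 ≈ 2.1818), which is a pattern the rounded decimals hide.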
    In Chapter 5, fundamental matrices for circuits will be used to investigate the
transient behaviour of circuits with capacitors and inductors as well as resistors.


4.7     Determinants
4.7.1    Definition of Determinants
We have already encountered determinants for 2 × 2 and 3 × 3 matrices. For
2 × 2 matrices
\[ \det \begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{bmatrix} = a_{1,1} a_{2,2} - a_{1,2} a_{2,1}. \]






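For 3 × 3 matrices we can define the determinant by expanding along the top
row:
\[ \det \begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} \\ a_{2,1} & a_{2,2} & a_{2,3} \\ a_{3,1} & a_{3,2} & a_{3,3} \end{bmatrix} = a_{1,1} \det \begin{bmatrix} a_{2,2} & a_{2,3} \\ a_{3,2} & a_{3,3} \end{bmatrix} - a_{1,2} \det \begin{bmatrix} a_{2,1} & a_{2,3} \\ a_{3,1} & a_{3,3} \end{bmatrix} + a_{1,3} \det \begin{bmatrix} a_{2,1} & a_{2,2} \\ a_{3,1} & a_{3,2} \end{bmatrix} \]
If we multiply out the 2 × 2 determinants in this definition we arrive at the
expression
\[ \det \begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} \\ a_{2,1} & a_{2,2} & a_{2,3} \\ a_{3,1} & a_{3,2} & a_{3,3} \end{bmatrix} = a_{1,1} a_{2,2} a_{3,3} - a_{1,1} a_{2,3} a_{3,2} + a_{1,2} a_{2,3} a_{3,1} - a_{1,2} a_{2,1} a_{3,3} + a_{1,3} a_{2,1} a_{3,2} - a_{1,3} a_{2,2} a_{3,1} \]
We now make a similar definition for an n × n matrix. Let A be an n × n matrix.
Define M_{i,j} to be the (n − 1) × (n − 1) matrix obtained by crossing out the ith
row and the jth column. So, for example, if
\[ A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 9 & 0 & 1 & 2 \\ 3 & 4 & 5 & 6 \end{bmatrix} \]
then
\[ M_{1,2} = \begin{bmatrix} \times & \times & \times & \times \\ 5 & \times & 7 & 8 \\ 9 & \times & 1 & 2 \\ 3 & \times & 5 & 6 \end{bmatrix} = \begin{bmatrix} 5 & 7 & 8 \\ 9 & 1 & 2 \\ 3 & 5 & 6 \end{bmatrix} \]
    We now define the determinant of an n × n matrix A to be
\[ \det(A) = a_{1,1} \det(M_{1,1}) - a_{1,2} \det(M_{1,2}) + \cdots \pm a_{1,n} \det(M_{1,n}) = \sum_{j=1}^{n} (-1)^{j+1} a_{1,j} \det(M_{1,j}). \]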

Of course, this formula still contains determinants on the right hand side. How-
ever, they are determinants of (n − 1) × (n − 1) matrices. If we apply this
definition to those determinants we get a more complicated formula involving
(n − 2) × (n − 2) matrices, and so on, until we arrive at an extremely long
expression (with n! terms) involving only numbers.
    Calculating an expression with n! terms is completely impossible, even with the
fastest computers, when n gets reasonably large. For example 100! ≈ 10^158. Yet,
your computer at home can compute the determinant of a 100 × 100 matrix in
less than a second. The secret, of course, is to compute the determinant in a
different way. We start by computing the determinant of triangular matrices.
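The recursive definition translates almost word for word into code. The following Python sketch (function names `minor` and `det` are assumptions for illustration) follows the expansion along the top row with the same 1-based indices as the text. It is only practical for small matrices, for the reason just given.

```python
import math

def minor(A, i, j):
    """The matrix M_{i,j}: delete row i and column j (1-based, as in the text)."""
    return [row[:j - 1] + row[j:]
            for k, row in enumerate(A, start=1) if k != i]

def det(A):
    """Determinant by expansion along the top row."""
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum((-1) ** (j + 1) * A[0][j - 1] * det(minor(A, 1, j))
               for j in range(1, n + 1))

A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 0, 1, 2],
     [3, 4, 5, 6]]

print(minor(A, 1, 2))            # [[5, 7, 8], [9, 1, 2], [3, 5, 6]]
print(det([[2, 1], [5, 3]]))     # 1

# Number of decimal digits of 100!, the term count for a 100 x 100 expansion.
digits_in_100_factorial = len(str(math.factorial(100)))
```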

4.7.2        Determinants of Triangular matrices
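Recall that triangular matrices are matrices whose entries above or below the
diagonal are all zero. For 2 × 2 matrices
\[ \det \begin{bmatrix} a_{1,1} & a_{1,2} \\ 0 & a_{2,2} \end{bmatrix} = a_{1,1} a_{2,2} - a_{1,2} \cdot 0 = a_{1,1} a_{2,2} \]
and
\[ \det \begin{bmatrix} a_{1,1} & 0 \\ a_{2,1} & a_{2,2} \end{bmatrix} = a_{1,1} a_{2,2} - 0 \cdot a_{2,1} = a_{1,1} a_{2,2} \]
so the determinant is the product of the diagonal elements. For 3 × 3 matrices
\[ \begin{aligned} \det \begin{bmatrix} a_{1,1} & 0 & 0 \\ a_{2,1} & a_{2,2} & 0 \\ a_{3,1} & a_{3,2} & a_{3,3} \end{bmatrix} &= a_{1,1} \det \begin{bmatrix} a_{2,2} & 0 \\ a_{3,2} & a_{3,3} \end{bmatrix} - 0 + 0 \\ &= a_{1,1} a_{2,2} a_{3,3} \end{aligned} \]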

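A similar expansion shows that the determinant of an n × n lower triangular
matrix is the product of the diagonal elements. For upper triangular matrices
we have
\[ \det \begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} \\ 0 & a_{2,2} & a_{2,3} \\ 0 & 0 & a_{3,3} \end{bmatrix} = a_{1,1} \det \begin{bmatrix} a_{2,2} & a_{2,3} \\ 0 & a_{3,3} \end{bmatrix} - a_{1,2} \det \begin{bmatrix} 0 & a_{2,3} \\ 0 & a_{3,3} \end{bmatrix} + a_{1,3} \det \begin{bmatrix} 0 & a_{2,2} \\ 0 & 0 \end{bmatrix} \]
Since we already know that the determinant of a 2 × 2 triangular matrix is
the product of the diagonals, we can see easily that the last two terms in this
expression are zero. Thus we get
\[ \begin{aligned} \det \begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} \\ 0 & a_{2,2} & a_{2,3} \\ 0 & 0 & a_{3,3} \end{bmatrix} &= a_{1,1} \det \begin{bmatrix} a_{2,2} & a_{2,3} \\ 0 & a_{3,3} \end{bmatrix} \\ &= a_{1,1} a_{2,2} a_{3,3} \end{aligned} \]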

Once we know that the determinant of a 3 × 3 upper triangular matrix is the
product of the diagonal elements, we can do a similar calculation to the one
above to conclude that the determinant of a 4 × 4 upper triangular matrix is the
product of the diagonal elements, and so on.
    Thus, the determinant of any (upper or lower) triangular n × n matrix is the
product of the diagonal elements.
    We know that an arbitrary n×n matrix can be reduced to an upper (or lower)
triangular matrix by a sequence of row operations. This is the key to computing
the determinant efficiently. We need to determine how the determinant of a
matrix changes when we do an elementary row operation on it.

4.7.3    Summary of determinant calculation rules
We summarize the results on determinant calculation shown in more detail in
the additional topics sections 4.8.6 to 4.8.12 below. The first three rules below
show how a determinant changes when row operations are applied to it. This
can be used to simplify determinant calculation as shown in the next section.
In the statements below, A is a square matrix.
  1. If B is obtained from A by multiplying one row of A by the constant c
     then det(B) = c det(A).


Chapter 4. Matrices and Determinants                             4.7. Determinants



  2. If B is obtained from A by switching two rows of A then det(B) =
     − det(A).

  3. If B is obtained from A by adding a multiple of one row to another then
     det(B) = det(A)

  4. det(A) = 0 if and only if A is not invertible.

  5. For all square matrices B of the same size as A, det(AB) = det(A) det(B).

  6. det(A^T) = det(A).
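
The rules above can be checked numerically on a single example. The following is only a sketch: the matrices A and B, the constant c and the tolerance tol are illustrative choices of ours, and a numerical check on one example is of course not a proof.

```matlab
% Numerical check of rules 1-3, 5 and 6 on one example.
% A, B, c and tol are illustrative choices, not values from the notes.
A = [1 2 3; 1 2 1; 2 3 0];
B = [2 0 1; 1 1 0; 0 3 1];
c = 5;
tol = 1e-10;   % det works in floating point, so compare with a tolerance

% Rule 1: multiply row 2 by c
A1 = A;  A1(2,:) = c*A(2,:);
rule1 = abs(det(A1) - c*det(A)) < tol;

% Rule 2: switch rows 1 and 3
A2 = A([3 2 1],:);
rule2 = abs(det(A2) + det(A)) < tol;

% Rule 3: add 4 times row 1 to row 3
A3 = A;  A3(3,:) = A(3,:) + 4*A(1,:);
rule3 = abs(det(A3) - det(A)) < tol;

% Rule 5: determinant of a product
rule5 = abs(det(A*B) - det(A)*det(B)) < tol;

% Rule 6: determinant of the transpose
rule6 = abs(det(A') - det(A)) < tol;

disp([rule1 rule2 rule3 rule5 rule6])   % every entry should be 1 (true)
```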

4.7.4    Calculation of determinant using row operations
We can now use elementary row operations to compute the determinant of
\[
\begin{bmatrix} 1 & 2 & 3 \\ 1 & 2 & 1 \\ 2 & 3 & 0 \end{bmatrix}
\]
The sequence of row operations that transforms this matrix into an upper
triangular one is (R2)-(R1), (R3)-2(R1), exchange (R2) and (R3). The determinant
doesn’t change under the first two transformations, and changes sign under the
third. Thus
\begin{align*}
\det\begin{bmatrix} 1 & 2 & 3 \\ 1 & 2 & 1 \\ 2 & 3 & 0 \end{bmatrix}
&= \det\begin{bmatrix} 1 & 2 & 3 \\ 0 & 0 & -2 \\ 2 & 3 & 0 \end{bmatrix} \\
&= \det\begin{bmatrix} 1 & 2 & 3 \\ 0 & 0 & -2 \\ 0 & -1 & -6 \end{bmatrix} \\
&= -\det\begin{bmatrix} 1 & 2 & 3 \\ 0 & -1 & -6 \\ 0 & 0 & -2 \end{bmatrix} \\
&= -(1)(-1)(-2) = -2
\end{align*}
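
The same procedure can be sketched in MATLAB. This is a minimal illustration under our own assumptions, not a robust routine: it uses only row exchanges and "add a multiple of one row to another", keeps track of the sign in a variable s, and tests pivots for exact zeros (which is reasonable for small integer examples but not for general floating point data).

```matlab
% Determinant by reduction to upper triangular form (illustrative sketch).
A = [1 2 3; 1 2 1; 2 3 0];   % the matrix from the example above
U = A;
n = size(U,1);
s = 1;                        % sign, flipped on every row exchange
for k = 1:n-1
    % find a row at or below row k with a non-zero entry in column k
    p = find(U(k:n,k) ~= 0, 1) + k - 1;
    if isempty(p)
        continue              % nothing to eliminate; U(k,k) stays 0
    end
    if p ~= k
        U([k p],:) = U([p k],:);   % exchange rows: determinant changes sign
        s = -s;
    end
    for r = k+1:n
        % subtract a multiple of row k from row r: determinant unchanged
        U(r,:) = U(r,:) - (U(r,k)/U(k,k))*U(k,:);
    end
end
d = s*prod(diag(U));          % product of the diagonal, with the sign
```

For this matrix the loop performs the same three row operations as the hand calculation, and d agrees with the value −2 found above.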

4.7.5    More expansion formulae
We can use the properties of the determinant to derive alternative expansion
formulae. Recall that we defined the determinant to be
\[
\det(A) = \sum_{j=1}^{n} (-1)^{j+1} a_{1,j} \det(M_{1,j}).
\]

In other words, we expanded along the top row. Now let’s see that we can
expand along other rows as well. Let $A$ be the original matrix with rows
$a_1 = [a_{1,1}, a_{1,2}, \ldots, a_{1,n}]$, \ldots, $a_n = [a_{n,1}, a_{n,2}, \ldots, a_{n,n}]$.
For example, if $A$ is a 5 × 5 matrix then
\[
A = \begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \\ a_5 \end{bmatrix}
\]
Suppose we want to expand along the fourth row. Let $A'$ be the matrix where
the fourth row of $A$ has been moved to the first row, with all other rows still in
the same order, i.e.,
\[
A' = \begin{bmatrix} a_4 \\ a_1 \\ a_2 \\ a_3 \\ a_5 \end{bmatrix}
\]
How is the determinant of $A'$ related to the determinant of $A$? We can change
$A$ to $A'$ by a series of row flips as follows:
\[
A = \begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \\ a_5 \end{bmatrix}, \quad
\begin{bmatrix} a_1 \\ a_2 \\ a_4 \\ a_3 \\ a_5 \end{bmatrix}, \quad
\begin{bmatrix} a_1 \\ a_4 \\ a_2 \\ a_3 \\ a_5 \end{bmatrix}, \quad
\begin{bmatrix} a_4 \\ a_1 \\ a_2 \\ a_3 \\ a_5 \end{bmatrix} = A'
\]
We have performed 3 flips, so $\det(A') = (-1)^3 \det(A) = -\det(A)$.
   In general, to move the $i$th row to the top in this way, we must perform $i - 1$
flips, so $\det(A') = (-1)^{i-1} \det(A)$.
   Notice that $A'$ is a matrix with the properties

(1) $a'_{1,j} = a_{i,j}$, since we have moved the $i$th row to the top

(2) $M'_{1,j} = M_{i,j}$, since we haven’t changed the order of the other rows.

Therefore
\begin{align*}
\det(A) &= (-1)^{i-1} \det(A') \\
&= (-1)^{i-1} \sum_{j=1}^{n} (-1)^{j+1} a'_{1,j} \det(M'_{1,j}) \\
&= (-1)^{i-1} \sum_{j=1}^{n} (-1)^{j+1} a_{i,j} \det(M_{i,j}) \\
&= \sum_{j=1}^{n} (-1)^{i+j} a_{i,j} \det(M_{i,j})
\end{align*}

This is the formula for expansion along the ith row.






   As an example let’s compute the determinant of a 3 × 3 matrix by expanding
along the second row.
                   
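\begin{align*}
\det\begin{bmatrix} 1 & 2 & 3 \\ 1 & 3 & 1 \\ 1 & 2 & 1 \end{bmatrix}
&= -\det\begin{bmatrix} 2 & 3 \\ 2 & 1 \end{bmatrix}
+ 3\det\begin{bmatrix} 1 & 3 \\ 1 & 1 \end{bmatrix}
- \det\begin{bmatrix} 1 & 2 \\ 1 & 2 \end{bmatrix} \\
&= -2 + 6 + 3 - 9 - 2 + 2 = -2
\end{align*}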

The formula for expanding along the ith row is handy if the matrix happens to
have a row with many zeros.
   Using the fact that $\det(A) = \det(A^T)$ we can also write down expansion
formulae along columns, since the columns of $A$ are the rows of $A^T$. We end up
with the formula
\[
\det(A) = \sum_{i=1}^{n} (-1)^{i+j} a_{i,j} \det(M_{i,j})
\]

   As an example let’s compute the determinant of a 3 × 3 matrix by expanding
along the second column.
                  
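\begin{align*}
\det\begin{bmatrix} 1 & 2 & 3 \\ 1 & 3 & 1 \\ 1 & 2 & 1 \end{bmatrix}
&= -2\det\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}
+ 3\det\begin{bmatrix} 1 & 3 \\ 1 & 1 \end{bmatrix}
- 2\det\begin{bmatrix} 1 & 3 \\ 1 & 1 \end{bmatrix} \\
&= -2 + 2 + 3 - 9 - 2 + 6 = -2
\end{align*}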

The formula for expanding along the jth column is handy if the matrix happens
to have a column with many zeros.
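
The row expansion formula translates directly into a short recursive function. The sketch below is ours: the name detrow is hypothetical, the function is meant to be saved in its own file detrow.m, and the sub-determinants are simply expanded along their first row. It is intended to illustrate the formula, not to be efficient.

```matlab
function d = detrow(A, i)
% DETROW  Determinant by expansion along row i (hypothetical helper).
%   Sub-determinants are expanded along their first row.
n = size(A,1);
if n == 1
    d = A(1,1);
    return
end
d = 0;
for j = 1:n
    if A(i,j) ~= 0                                % zero entries contribute nothing
        M = A([1:i-1, i+1:n], [1:j-1, j+1:n]);    % delete row i and column j
        d = d + (-1)^(i+j) * A(i,j) * detrow(M, 1);
    end
end
end
```

For example, detrow([1 2 3; 1 3 1; 1 2 1], 2) reproduces the second row expansion above and returns -2. Since the columns of A are the rows of its transpose, detrow(A', j) gives the expansion along the jth column of A. The test A(i,j) ~= 0 is where a row with many zeros saves work.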

4.7.6    MATLAB
The MATLAB command det(A) can be used to determine the determinant of
a square matrix as introduced in Chapter 3.
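
As a small illustration (the matrices here are our own choices), det can be applied to the matrix from section 4.7.4 and to a matrix that is not invertible. Because MATLAB computes in floating point, the result may differ from the exact value by a tiny rounding error, so it is safer to compare with a tolerance than to test for exact equality.

```matlab
A = [1 2 3; 1 2 1; 2 3 0];
d = det(A);              % approximately -2, up to rounding error

S = [1 2; 2 4];          % second row is twice the first, so S is not invertible
dS = det(S);             % approximately 0

tol = 1e-10;             % illustrative tolerance
A_invertible = abs(d)  > tol;
S_invertible = abs(dS) > tol;
```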

4.7.7    Problems
Problem 4.35 Find the determinant of

\[
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 4 & 8 \\ 1 & 3 & 9 & 27 \\ 1 & 4 & 16 & 64 \end{bmatrix}
\]

Problem 4.36 Find the determinant of

\[
\begin{bmatrix} 1 & -1 & 1 & -1 \\ 1 & 2 & 4 & 8 \\ 1 & -2 & 4 & -8 \\ 1 & 1 & 1 & 1 \end{bmatrix}
\]






Problem 4.37 Consider the matrix
\[
A = \begin{bmatrix} 2 & 0 & 2 & 4 \\ 0 & 0 & 3 & 2 \\ 2 & 2 & 4 & 4 \\ 3 & 0 & 6 & 2 \end{bmatrix}.
\]
  1. Find the determinant of A by expanding along the first row.
  2. Find the determinant of A by expanding along the row or column of your
     choice. This expansion should be simpler than the one used in part 1.
Problem 4.38 Compute
\[
\det\begin{bmatrix} 1 & 0 & 1 \\ 1 & 2 & 3 \\ 3 & 0 & 1 \end{bmatrix}
\]
by expanding along the second row, and by expanding along the third column.
Problem 4.39 Find the determinant of A given above by using row operations
to put A into echelon form.
Problem 4.40 We know that the determinant of the n × n matrix
\[
\begin{bmatrix} a_1 & & & \\ & a_2 & & \\ & & \ddots & \\ & & & a_n \end{bmatrix}
\]
is $a_1 a_2 \cdots a_n$ if the blank entries represent 0. What is the equivalent formula
for the determinant of the n × n matrix
\[
\begin{bmatrix} & & & a_n \\ & & \iddots & \\ & a_2 & & \\ a_1 & & & \end{bmatrix} ?
\]
Justify your answer.
Problem 4.41 Find all values of λ for which the matrix
\[
\begin{bmatrix} 2-\lambda & 1 & 0 \\ -1 & -\lambda & 1 \\ 1 & 3 & 1-\lambda \end{bmatrix}
\]
is not invertible.
Problem 4.42 Determine whether each of the following statements is true or
false. If true, explain briefly why it is true without showing specific examples.
If false, explain briefly why it is false or give a simple counterexample, and give
a correct statement by making small changes to the underlined text.





  1. The determinant of a square matrix is the product of the entries on its
     main diagonal.
  2. For every matrix A, we have det(A A^T) = det(A^T A).

  3. If A and B are both invertible n×n matrices, then det(B^{-1} A B) = det(B A B^{-1}) = det(A).
  4. If an n × n matrix A is multiplied by a scalar c, the determinant of the
     resulting matrix is c · det(A).
  5. If the column vectors of an n × n matrix are linearly independent, then the
     row vectors of the same matrix are also linearly independent.


4.8     Additional Topics
4.8.1    Application: General Least Squares
Let us restate our results from Chapter 3 on minimization of quadratic functions
using matrix notation. A quadratic function of
\[
x = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}
\]
in $R^n$ can be written in matrix form as
\[
f(x) = x \cdot Ax + b \cdot x + c
\]
where $A$ is an n×n matrix and $b \in R^n$ and $c$ is a number. The vector containing
the partial derivatives can be computed to be
\[
\begin{bmatrix} \partial f/\partial x_1 \\ \vdots \\ \partial f/\partial x_n \end{bmatrix} = (A + A^T)x + b
\]
Recall that we made the assumption that $a_{ij} = a_{ji}$ when we considered this
problem before. This property can be stated in compact form as $A = A^T$. If
this is true then $(A + A^T) = 2A$ so
\[
\begin{bmatrix} \partial f/\partial x_1 \\ \vdots \\ \partial f/\partial x_n \end{bmatrix} = 2Ax + b
\]

To find the minimum value of f (if it exists) we need to find the value of x for
which the vector above is zero. In other words, x solves the equation

                                  2Ax = −b.

This is the same equation that we derived before.
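
Here is a small numerical illustration of this recipe. It is only a sketch with values of our own choosing: the symmetric matrix A, the vector b, the number c and the step size h are all assumptions, and A was picked so that f does have a minimum. We solve 2Ax = −b and then confirm with finite differences that the partial derivatives of f vanish at the solution.

```matlab
% Minimise f(x) = x.Ax + b.x + c for a symmetric A (illustrative values).
A = [2 1; 1 3];          % symmetric: A equals its transpose
b = [-4; 2];
c = 1;
f = @(x) x'*A*x + b'*x + c;

xmin = (2*A) \ (-b);     % solve 2Ax = -b

% finite-difference approximation of the partial derivatives at xmin
h = 1e-6;
g = zeros(2,1);
for k = 1:2
    e = zeros(2,1);  e(k) = h;
    g(k) = (f(xmin + e) - f(xmin - e)) / (2*h);
end
% both entries of g should be very close to zero
```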





4.8.2    Least squares solutions
Let’s take another look at the situation where a system of linear equations, which
we now can write
\[
Bx = c,
\]
has no solution. Typically this will be the case if there are more equations than
variables, that is, $B$ is a matrix with more rows than columns. In this case
there is no value of $x$ that makes the left side equal the right side. However, we
may try to find the value of $x$ for which the left side $Bx$ is closest to the right
side $c$.
    One way to go about this is to try to minimize the distance between the left and
right sides. It is more convenient to minimize the square of the distance. This
quantity can be written
\begin{align*}
\|Bx - c\|^2 &= (Bx - c) \cdot (Bx - c) \\
&= (Bx) \cdot (Bx) - (Bx) \cdot c - c \cdot (Bx) + c \cdot c \\
&= x \cdot (B^T Bx) - 2(B^T c) \cdot x + c \cdot c
\end{align*}
In the last step we used the identity $(By) \cdot z = y \cdot (B^T z)$.

This is a quadratic function, written in matrix form. We want to use the formula
of the previous section with $A = B^T B$, $b = -2B^T c$ and constant term $c \cdot c$.
Before we can do so, we must verify that $A = B^T B$ satisfies $A^T = A$. This is
true because
\[
(B^T B)^T = B^T (B^T)^T = B^T B
\]
Thus the formula of the previous section, $2Ax = -b$, implies that the minimum
occurs at the value of $x$ that solves the linear equation
\[
B^T Bx = B^T c
\]
Here we have cancelled a factor of 2 on each side.
    Now let’s derive the same result in another way. Think of all the values of $Bx$,
as $x$ ranges through all possible values in $R^n$, as forming a (high dimensional)
plane in $R^m$. Our goal is to find the value of $x$ so that the corresponding value
of $Bx$ on the plane is closest to $c$. Using the analogy to the geometric picture in
three dimensions, we see that the minimum will occur when $Bx - c$ is orthogonal
to the plane. This means that the dot product of $Bx - c$ with every vector in
the plane, that is, every vector of the form $By$, should be zero. Thus we have
\[
(By) \cdot (Bx - c) = 0
\]
for every $y \in R^n$. This is the same as
\[
y \cdot (B^T (Bx - c)) = y \cdot (B^T Bx - B^T c) = 0
\]
for every $y \in R^n$. This can happen only if
\[
B^T Bx = B^T c
\]
which is the same result we obtained before.
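
Both derivations can be illustrated numerically. In the sketch below, the matrix B and the vector c are made-up data (four equations, two unknowns), not taken from the notes. We solve the equation B^T Bx = B^T c, check that the residual Bx − c is orthogonal to the columns of B, and compare with MATLAB's backslash operator, which returns a least squares solution when B has more rows than columns.

```matlab
% Least squares solution of an overdetermined system Bx = c (illustrative data).
B = [1 0; 1 1; 1 2; 1 3];
c = [1; 2; 2; 4];

x = (B'*B) \ (B'*c);     % solve B'Bx = B'c
r = B*x - c;             % residual; not zero, since Bx = c has no solution
orth = B'*r;             % should be (numerically) zero

x_backslash = B \ c;     % built-in least squares solve, for comparison
```

With this data the solution is x = (0.9, 0.9), and orth is zero up to rounding, which is the orthogonality condition from the second derivation.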





4.8.3    Problems
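Problem 4.43 Find the least squares solution to
\begin{align*}
x_1 + x_2 &= 1 \\
x_1 &= 1 \\
x_1 + x_2 &= 0
\end{align*}
Compare $Bx$ and $c$, where the system is written in the form $Bx = c$.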

Problem 4.44 Refer back to the least squares fit example, where we tried to
find the best straight line going through a collection of points $(x_i, y_i)$. Another
way of formulating this problem is this. The line $y = ax + b$ passes through the
point $(x_i, y_i)$ if
\[
a x_i + b = y_i \qquad (4.15)
\]
So, saying that the straight line passes through all n points is the same as saying
that a and b solve the system of n linear equations given by (4.15) for
$i = 1, \ldots, n$. Of course, unless the points all actually lie on the same line, this
system of equations has no solutions. Show that the least squares solution to
this problem is the same as we obtained before. (You may simplify the problem
by assuming there are only three points $(x_1, y_1)$, $(x_2, y_2)$ and $(x_3, y_3)$.)

4.8.4    Elementary matrices
Recall that there are three row operations that are used in Gaussian elimination:
(1) multiplying a row by a non-zero number, (2) adding a multiple of one row
to another row and (3) exchanging two rows.
    It turns out that each elementary row operation can be implemented by left
multiplication by a matrix. In other words, for each elementary row operation
there is a matrix Q such that QA is what you get by doing that row operation
to the matrix A.
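    Here is an example. Suppose
\[
A = \begin{bmatrix} 1 & 0 & 2 & 1 \\ 2 & 0 & 0 & 1 \\ 1 & 2 & 3 & 4 \end{bmatrix}
\]
and suppose that the row operation is multiplying the first row by 2. Then the
matrix you get by doing that row operation to the matrix $A$ is
\[
A' = \begin{bmatrix} 2 & 0 & 4 & 2 \\ 2 & 0 & 0 & 1 \\ 1 & 2 & 3 & 4 \end{bmatrix}
\]
In this case the matrix $Q$ turns out to be
\[
Q = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]
since
\[
\begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 2 & 1 \\ 2 & 0 & 0 & 1 \\ 1 & 2 & 3 & 4 \end{bmatrix}
= \begin{bmatrix} 2 & 0 & 4 & 2 \\ 2 & 0 & 0 & 1 \\ 1 & 2 & 3 & 4 \end{bmatrix}
\]
i.e., $QA = A'$.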
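     Now suppose that the elementary row operation is subtracting twice the first
row from the second row. Then the matrix you get by doing that row operation
to the matrix $A$ is
\[
A' = \begin{bmatrix} 1 & 0 & 2 & 1 \\ 0 & 0 & -4 & -1 \\ 1 & 2 & 3 & 4 \end{bmatrix}
\]
In this case the matrix $Q$ turns out to be
\[
Q = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]
since
\[
\begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 2 & 1 \\ 2 & 0 & 0 & 1 \\ 1 & 2 & 3 & 4 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 2 & 1 \\ 0 & 0 & -4 & -1 \\ 1 & 2 & 3 & 4 \end{bmatrix}
\]
i.e., again, $QA = A'$.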
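     Finally, suppose that the elementary row operation is exchanging the second
and the third rows. Then the matrix you get by doing that row operation to
the matrix $A$ is
\[
A' = \begin{bmatrix} 1 & 0 & 2 & 1 \\ 1 & 2 & 3 & 4 \\ 2 & 0 & 0 & 1 \end{bmatrix}
\]
In this case the matrix $Q$ turns out to be
\[
Q = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}
\]
since
\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 2 & 1 \\ 2 & 0 & 0 & 1 \\ 1 & 2 & 3 & 4 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 2 & 1 \\ 1 & 2 & 3 & 4 \\ 2 & 0 & 0 & 1 \end{bmatrix}
\]
i.e., again, $QA = A'$.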
     How can we find the matrices Q (called elementary matrices)? Here is the
procedure. Start with the identity matrix I and do the row transformation to it.
The resulting matrix Q is the matrix that implements that row transformation
by multiplication from the left. Notice that this is true in the examples above.







In the first example, the row transformation was multiplying the first row by 2.
If you multiply the first row of
\[
I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]
by two you get
\[
Q = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.
\]
In the second example, the row transformation was subtracting twice the first
row from the second row. If you subtract twice the first row of $I$ from its second
row you get
\[
Q = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.
\]
In the third example, the row transformation was exchanging the second and
third rows. If you exchange the second and third rows of $I$, you get
\[
Q = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}.
\]
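
The procedure is easy to try out in MATLAB. In this sketch (the variable names are ours) each elementary matrix is built by applying the row operation to the identity matrix eye(3), and then Q*A is compared with the result of applying the same operation directly to A.

```matlab
A = [1 0 2 1; 2 0 0 1; 1 2 3 4];   % the matrix from the examples above
I = eye(3);

% elementary matrices: do the row operation to the identity
Q1 = I;  Q1(1,:) = 2*I(1,:);            % multiply row 1 by 2
Q2 = I;  Q2(2,:) = I(2,:) - 2*I(1,:);   % subtract twice row 1 from row 2
Q3 = I([1 3 2],:);                      % exchange rows 2 and 3

% the same row operations done directly to A
A1 = A;  A1(1,:) = 2*A(1,:);
A2 = A;  A2(2,:) = A(2,:) - 2*A(1,:);
A3 = A([1 3 2],:);

% left multiplication by Q reproduces each row operation
ok = isequal(Q1*A, A1) && isequal(Q2*A, A2) && isequal(Q3*A, A3);
```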

   Elementary matrices are useful in theoretical studies of the Gaussian
elimination process. We will use them briefly when studying determinants.
   Suppose $A$ is a matrix, and $R$ is its reduced form. Then we can obtain $R$ from
$A$ via a sequence of elementary row operations. Suppose that the corresponding
elementary matrices are $Q_1, Q_2, \ldots, Q_k$. Then, starting with $A$, the matrix after
the first elementary row operation is $Q_1 A$, then after the second elementary row
operation is $Q_2 Q_1 A$, and so on, until we have
\[
Q_k Q_{k-1} \cdots Q_2 Q_1 A = R.
\]
Now let us apply the inverse matrices, starting with $Q_k^{-1}$. This gives
\[
Q_k^{-1} Q_k Q_{k-1} \cdots Q_2 Q_1 A = Q_{k-1} \cdots Q_2 Q_1 A = Q_k^{-1} R.
\]
Continuing in this way we see that
\[
A = Q_1^{-1} Q_2^{-1} \cdots Q_k^{-1} R
\]
    In the special case that $A$ is an n × n invertible matrix, $A$ can be reduced
to the identity matrix. In other words, we can take $R = I$. In this case $A$ can
be written as a product of elementary matrices.
\[
A = Q_1^{-1} Q_2^{-1} \cdots Q_k^{-1} I = Q_1^{-1} Q_2^{-1} \cdots Q_k^{-1}
\]






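Notice that in this case
\[
A^{-1} = Q_k Q_{k-1} \cdots Q_2 Q_1.
\]
As an example, let us write the matrix
\[
A = \begin{bmatrix} 2 & 1 \\ 5 & 3 \end{bmatrix}
\]
as a product of elementary matrices. The sequence of row transformations that
reduces $A$ to the identity is:
1) (1/2)(R1)
2) (R2)-5(R1)
3) (R1)-(R2)
4) 2(R2)
     The corresponding elementary matrices and their inverses are
\begin{align*}
Q_1 &= \begin{bmatrix} 1/2 & 0 \\ 0 & 1 \end{bmatrix} & Q_1^{-1} &= \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix} \\
Q_2 &= \begin{bmatrix} 1 & 0 \\ -5 & 1 \end{bmatrix} & Q_2^{-1} &= \begin{bmatrix} 1 & 0 \\ 5 & 1 \end{bmatrix} \\
Q_3 &= \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix} & Q_3^{-1} &= \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \\
Q_4 &= \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} & Q_4^{-1} &= \begin{bmatrix} 1 & 0 \\ 0 & 1/2 \end{bmatrix}
\end{align*}
Therefore
\[
A = Q_1^{-1} Q_2^{-1} Q_3^{-1} Q_4^{-1}
\]
or
\[
\begin{bmatrix} 2 & 1 \\ 5 & 3 \end{bmatrix}
= \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 5 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & 1/2 \end{bmatrix}
\]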
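
This factorization can be verified in MATLAB. The following sketch (variable names and the comparison tolerance are ours) multiplies out the elementary matrices from the example and checks the three claims: the Q's reduce A to the identity, the product of their inverses gives back A, and the product of the Q's themselves is the inverse of A.

```matlab
A  = [2 1; 5 3];
Q1 = [1/2 0; 0 1];    % (1/2)(R1)
Q2 = [1 0; -5 1];     % (R2)-5(R1)
Q3 = [1 -1; 0 1];     % (R1)-(R2)
Q4 = [1 0; 0 2];      % 2(R2)

R    = Q4*Q3*Q2*Q1*A;                      % should be the identity
Arec = inv(Q1)*inv(Q2)*inv(Q3)*inv(Q4);    % should reproduce A
Ainv = Q4*Q3*Q2*Q1;                        % should equal inv(A)

tol = 1e-12;
checks = [norm(R - eye(2)) < tol, norm(Arec - A) < tol, norm(Ainv*A - eye(2)) < tol];
```

Note the order of the factors: the matrix for the first row operation, Q1, stands next to A on the right end of the product Q4*Q3*Q2*Q1*A.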

4.8.5      Problems
Problem 4.45 Each elementary matrix is invertible, and the inverse is also
an elementary matrix. Find the inverses of the three examples of elementary
matrices above. Notice that the inverse elementary matrix is the matrix for the
row transformation that undoes the original row transformation.
Problem 4.46 Write the matrix
\[
\begin{bmatrix} 2 & 3 & -1 \\ 1 & 2 & 3 \\ -1 & -1 & 1 \end{bmatrix}
\]
as a product of elementary matrices.





4.8.6    Exchanging two rows changes the sign of the determinant
We start with the elementary row operation of exchanging two rows. For 2 × 2
determinants,
\[
\det\begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc,
\]
while
\[
\det\begin{bmatrix} c & d \\ a & b \end{bmatrix} = cb - da = -(ad - bc),
\]
so exchanging two rows changes the sign of the determinant.
     We can do a similar calculation for 3 × 3 matrices. It’s a bit messier, but
still manageable. Again, we find that exchanging two rows changes the sign of
the determinant.
     How about the n × n case? We will assume that we have already proved
the result for the (n − 1) × (n − 1) case, and show how we can use this to show
the result for an n × n matrix. Thus knowing the result for 2 × 2 matrices
implies it for 3 × 3, which in turn implies it for 4 × 4 matrices, and so on. We
consider three cases, depending on which rows we are exchanging. Suppose $A$
is the original matrix and $A'$ is the matrix with two rows exchanged.
(1) Exchanging two rows other than the first row: In this case, if we cross out the
    first row and any column from $A'$ we obtain $M'_{1,j}$, which is the same as the
    matrix $M_{1,j}$ (corresponding to $A$) except with two of its rows exchanged.
    Since the size of $M_{1,j}$ is n − 1 we know that $\det(M'_{1,j}) = -\det(M_{1,j})$.
    The first row is unchanged, so $a'_{1,j} = a_{1,j}$ and
\begin{align*}
\det(A') &= \sum_{j=1}^{n} (-1)^{j+1} a_{1,j} \det(M'_{1,j}) \\
&= -\sum_{j=1}^{n} (-1)^{j+1} a_{1,j} \det(M_{1,j}) \\
&= -\det(A)
\end{align*}

(2) Exchanging the first and second row. To see that this changes the sign
    of the determinant we have to expand the expansion. The following is a
    bit sketchy. I’ll probably skip it in class, but give the argument here for
    completeness. If we expand $\det(M_{1,j})$ along its first row we get
\[
\det(M_{1,j}) = \sum_{k=1}^{j-1} (-1)^{k+1} a_{2,k} \det(M_{1,2,j,k}) + \sum_{k=j+1}^{n} (-1)^{k} a_{2,k} \det(M_{1,2,j,k})
\]
    where $M_{1,2,j,k}$ is the matrix obtained from $A$ by deleting the first and second
    rows, and the jth and kth columns. Inserting this into the expansion for $A$
    gives
\[
\det(A) = \sum_{j=1}^{n} \sum_{k=1}^{j-1} (-1)^{j+k} a_{1,j} a_{2,k} \det(M_{1,2,j,k}) - \sum_{j=1}^{n} \sum_{k=j+1}^{n} (-1)^{j+k} a_{1,j} a_{2,k} \det(M_{1,2,j,k})
\]
    The sum splits into two parts; call the first double sum $S$ and the second $R$,
    so that $\det(A) = S - R$. Flipping the first two rows of $A$ just exchanges
    the two sums. In other words $S - R$ becomes $R - S$ which is $-(S - R)$. So
    exchanging the first two rows also changes the sign of the determinant.
(3) Exchanging the first row with the kth row. We can effect this exchange by
    first exchanging the kth and the second row, then exchanging the first and
    the second row, then exchanging the kth and the second row again. Each
    flip changes the determinant by a minus sign, and since there are three flips,
    the overall change is by a minus sign.

   Thus we can say that for any n × n matrix, exchanging two rows changes
the sign of the determinant.
   One immediate consequence of this fact is that a matrix with two rows
the same has determinant zero. This is because if we exchange the two rows the
determinant changes by a minus sign, but the matrix doesn’t change. Thus
det(A) = − det(A) which is only possible if det(A) = 0.

4.8.7      The determinant is linear in each row separately
To say that the determinant is linear in the jth row means that if we write a
matrix as a matrix of row vectors,
\[
A = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_j \\ \vdots \\ a_n \end{bmatrix}
\]
then
\[
\det\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ sb + tc \\ \vdots \\ a_n \end{bmatrix}
= s \det\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ b \\ \vdots \\ a_n \end{bmatrix}
+ t \det\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ c \\ \vdots \\ a_n \end{bmatrix}
\]
   It is easy to see from the expansion formula that the determinant is linear in
the first row. For a 3 × 3 example we have
\begin{align*}
\det\begin{bmatrix} sb_1 + tc_1 & sb_2 + tc_2 & sb_3 + tc_3 \\ a_{2,1} & a_{2,2} & a_{2,3} \\ a_{3,1} & a_{3,2} & a_{3,3} \end{bmatrix}
&= (sb_1 + tc_1) \det(M_{1,1}) - (sb_2 + tc_2) \det(M_{1,2}) + (sb_3 + tc_3) \det(M_{1,3}) \\
&= s(b_1 \det(M_{1,1}) - b_2 \det(M_{1,2}) + b_3 \det(M_{1,3})) \\
&\quad + t(c_1 \det(M_{1,1}) - c_2 \det(M_{1,2}) + c_3 \det(M_{1,3})) \\
&= s \det\begin{bmatrix} b_1 & b_2 & b_3 \\ a_{2,1} & a_{2,2} & a_{2,3} \\ a_{3,1} & a_{3,2} & a_{3,3} \end{bmatrix}
+ t \det\begin{bmatrix} c_1 & c_2 & c_3 \\ a_{2,1} & a_{2,2} & a_{2,3} \\ a_{3,1} & a_{3,2} & a_{3,3} \end{bmatrix}
\end{align*}
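
Linearity in the first row can also be checked numerically. The sketch below uses row vectors and numbers of our own choosing (b, c_row, a2, a3, s and t are all illustrative, with c_row playing the role of the vector c above), and compares the two sides of the identity.

```matlab
% Check linearity of det in the first row of a 3x3 matrix (illustrative values).
a2 = [1 2 1];  a3 = [2 3 0];       % the rows that stay fixed
b  = [1 2 3];  c_row = [0 1 4];    % the two candidate first rows
s = 3;  t = -2;

lhs = det([s*b + t*c_row; a2; a3]);
rhs = s*det([b; a2; a3]) + t*det([c_row; a2; a3]);
% lhs and rhs should agree up to rounding error
```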

A similar calculation can be done for any n × n matrix to show linearity in the
first row. To show linearity in some other row, we first swap that row and the
first row, then use linearity in the first row, and then swap back again. So

\[
\begin{aligned}
\det\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ sb + tc \\ \vdots \\ a_n \end{bmatrix}
&= -\det\begin{bmatrix} sb + tc \\ a_2 \\ \vdots \\ a_1 \\ \vdots \\ a_n \end{bmatrix} \\
&= -s \det\begin{bmatrix} b \\ a_2 \\ \vdots \\ a_1 \\ \vdots \\ a_n \end{bmatrix}
   - t \det\begin{bmatrix} c \\ a_2 \\ \vdots \\ a_1 \\ \vdots \\ a_n \end{bmatrix} \\
&= s \det\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ b \\ \vdots \\ a_n \end{bmatrix}
   + t \det\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ c \\ \vdots \\ a_n \end{bmatrix}
\end{aligned}
\]

Notice that linearity in each row separately does not mean that det(A + B) =
det(A) + det(B).
   Note that multiplying a row by a constant multiplies the determinant by the
constant. This is a special case of linearity.
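
Both remarks can be checked on small examples. The following Python sketch is not part of the original notes: the helper det (cofactor expansion along the first row) and the sample rows are assumptions chosen only for illustration.

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # Minor for the first row and column j: delete that row and column.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total

# Hypothetical sample rows and scalars.
a1 = [1, 2, 0]
a3 = [4, 0, 1]
b = [2, -1, 3]
c = [0, 5, 1]
s, t = 3, -2

# Linearity in the second row: det(..., sb + tc, ...) = s det(..., b, ...) + t det(..., c, ...).
combo = [s * bi + t * ci for bi, ci in zip(b, c)]
lhs = det([a1, combo, a3])
rhs = s * det([a1, b, a3]) + t * det([a1, c, a3])
print(lhs == rhs)  # True

# Linearity in each row does not make det additive: take A = B = I (2 x 2).
A = [[1, 0], [0, 1]]
B = [[1, 0], [0, 1]]
A_plus_B = [[A[i][j] + B[i][j] for j in range(2)] for i in range(2)]
print(det(A_plus_B), det(A) + det(B))  # 4 2
```

Here A + B = 2I doubles both rows, so linearity applied to each row in turn gives a factor of 2 · 2 = 4 rather than 2.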

4.8.8    Adding a multiple of one row to another doesn’t
         change the determinant
Now we will see that the most often used row operation, adding a multiple of
one row to another, doesn’t change the determinant at all. Let A be an n × n
matrix. Write A as a matrix of rows.
\[
A = \begin{bmatrix} a_1 \\ \vdots \\ a_i \\ \vdots \\ a_j \\ \vdots \\ a_n \end{bmatrix}
\]

Adding s times the ith row to the jth row yields
\[
A' = \begin{bmatrix} a_1 \\ \vdots \\ a_i \\ \vdots \\ a_j + s a_i \\ \vdots \\ a_n \end{bmatrix}
\]
So
\[
\det(A') = \det\begin{bmatrix} a_1 \\ \vdots \\ a_i \\ \vdots \\ a_j + s a_i \\ \vdots \\ a_n \end{bmatrix}
= \det\begin{bmatrix} a_1 \\ \vdots \\ a_i \\ \vdots \\ a_j \\ \vdots \\ a_n \end{bmatrix}
+ s \det\begin{bmatrix} a_1 \\ \vdots \\ a_i \\ \vdots \\ a_i \\ \vdots \\ a_n \end{bmatrix}
= \det(A) + 0
\]

Here we used linearity in a row and the fact that the determinant of a matrix
with two rows the same is zero.
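
As a small worked example (the numbers are chosen here for illustration and are not from the original notes), add 5 times the first row of a 2 × 2 matrix to its second row:
\[
\det\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = 4 - 6 = -2,
\qquad
\det\begin{bmatrix} 1 & 2 \\ 3 + 5\cdot 1 & 4 + 5\cdot 2 \end{bmatrix}
= \det\begin{bmatrix} 1 & 2 \\ 8 & 14 \end{bmatrix} = 14 - 16 = -2.
\]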

4.8.9     The determinant of QA
To begin, we compute the determinants of the elementary matrices. Recall that
if A′ is the matrix obtained from A by an elementary row operation, then

(1) det(A′) = − det(A) if the row operation is swapping two rows

(2) det(A′) = s det(A) if the row operation is multiplying a row by s

(3) det(A′) = det(A) if the row operation is adding a multiple of one row to
    another

Recall that the elementary matrices are obtained from the identity matrix I by
an elementary row operation. So we can take A = I and A′ = Q in the formulae
above to obtain

(1) det(Q) = − det(I) = −1 if the row operation is swapping two rows

(2) det(Q) = s det(I) = s if the row operation is multiplying a row by s

(3) det(Q) = det(I) = 1 if the row operation is adding a multiple of one row to
    another

Going back to the first set of formulae, we have that in each case A′ = QA. In
each case the factor in front of det(A) is exactly det(Q). So we see that in each
case
                           det(QA) = det(Q) det(A).
This formula can be generalized. If Q1 , Q2 , . . ., Qk are elementary matrices then
det(Q1 Q2 Q3 · · · Qk A) = det(Q1 ) det(Q2 Q3 · · · Qk A) = det(Q1 ) det(Q2 ) det(Q3 · · · Qk A)
and so on, so we arrive at the formula

          det(Q1 Q2 Q3 · · · Qk A) = det(Q1 ) det(Q2 ) · · · det(Qk ) det(A).

4.8.10      The determinant of A is zero exactly when A is not
            invertible
Recall that if R denotes the reduced form of A, obtained by performing the
sequence of row reductions corresponding to Q1, Q2, ..., Qk, then
\[
A = Q_1^{-1} Q_2^{-1} \cdots Q_k^{-1} R
\]
Each $Q_i^{-1}$ is an elementary matrix, therefore
\[
\det(A) = \det(Q_1^{-1}) \det(Q_2^{-1}) \cdots \det(Q_k^{-1}) \det(R)
\]
If A is not invertible, then R has a row of zeros along the bottom. Thus R is an
upper triangular matrix with at least one zero on the diagonal. The determinant
of R is the product of the diagonal elements so det(R) = 0. Thus det(A) = 0
too.
    If A is invertible, then we can reduce A to the identity matrix. In other
words, we can take R = I. Then det(R) = 1. Each $\det(Q_i^{-1})$ is non-zero too,
so det(A) ≠ 0.
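
The same bookkeeping gives a way to compute a determinant by row reduction. The Python sketch below is an illustration only, not the notes' own method: the function name det_by_row_reduction is an assumption. It uses only row swaps (each contributing a factor of −1) and additions of a multiple of one row to another (each contributing a factor of 1), and stops at an upper triangular matrix instead of the fully reduced form.

```python
from fractions import Fraction

def det_by_row_reduction(rows):
    """Reduce to upper triangular form, tracking det(Q) for each row operation."""
    a = [[Fraction(x) for x in row] for row in rows]
    n = len(a)
    factor = Fraction(1)  # product of det(Q) over the operations used so far
    for col in range(n):
        # Look for a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            # The triangular form would have a zero on the diagonal: not invertible.
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            factor *= -1  # swapping two rows has det(Q) = -1
        for r in range(col + 1, n):
            m = a[r][col] / a[col][col]
            # Adding a multiple of one row to another has det(Q) = 1.
            a[r] = [x - m * y for x, y in zip(a[r], a[col])]
    # Now the reduced matrix T is upper triangular and det(T) = factor * det(A).
    diag = Fraction(1)
    for i in range(n):
        diag *= a[i][i]
    return diag / factor

print(det_by_row_reduction([[0, 2], [3, 4]]))                   # -6
print(det_by_row_reduction([[1, 2], [2, 4]]))                   # 0
print(det_by_row_reduction([[2, 1, 0], [1, 3, 1], [0, 1, 4]]))  # 18
```

Exact fractions are used so that the test for a zero pivot is reliable; with floating point numbers a tolerance would be needed instead.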

4.8.11      The product formula: det(AB) = det(A) det(B)
If either A or B is non-invertible, then AB is non-invertible too. Thus det(AB) =
0 and one of det(A) or det(B) is zero, so det(A) det(B) = 0 too. Thus det(AB) =
det(A) det(B).
     If both A and B are invertible, then
\[
A = Q_1^{-1} Q_2^{-1} \cdots Q_k^{-1}
\]
so
\[
\det(A) = \det(Q_1^{-1}) \det(Q_2^{-1}) \cdots \det(Q_k^{-1})
\]
and
\[
B = \tilde{Q}_1^{-1} \tilde{Q}_2^{-1} \cdots \tilde{Q}_j^{-1}
\]
so
\[
\det(B) = \det(\tilde{Q}_1^{-1}) \det(\tilde{Q}_2^{-1}) \cdots \det(\tilde{Q}_j^{-1})
\]
Therefore
\[
AB = Q_1^{-1} Q_2^{-1} \cdots Q_k^{-1} \tilde{Q}_1^{-1} \tilde{Q}_2^{-1} \cdots \tilde{Q}_j^{-1}
\]
so
\[
\det(AB) = \det(Q_1^{-1}) \det(Q_2^{-1}) \cdots \det(Q_k^{-1})
           \det(\tilde{Q}_1^{-1}) \det(\tilde{Q}_2^{-1}) \cdots \det(\tilde{Q}_j^{-1})
         = \det(A) \det(B)
\]

4.8.12      The determinant of the transpose
Recall that the transpose $A^T$ of a matrix A is the matrix you get when you flip
A about its diagonal. If A is an n × n matrix, so is $A^T$, and we can ask what
the relationship between the determinants of these two matrices is. It turns out
that they are the same.
\[
\det(A^T) = \det(A).
\]
If A is an upper or lower triangular matrix, this follows from the fact that the
determinant of a triangular matrix is the product of the diagonal entries. If A
is an arbitrary n × n matrix then the formula follows from two facts.

(1) The transpose of a product of two matrices is given by $(AB)^T = B^T A^T$.
    This implies that $(A_1 A_2 \cdots A_n)^T = A_n^T \cdots A_2^T A_1^T$.

(2) For an elementary matrix Q we have $\det(Q^T) = \det(Q)$.

If you accept these two facts, then we may write
\[
A = Q_1^{-1} Q_2^{-1} \cdots Q_k^{-1} R
\]
where R is upper triangular. Thus
\[
A^T = R^T (Q_k^{-1})^T \cdots (Q_2^{-1})^T (Q_1^{-1})^T
\]
so
\[
\begin{aligned}
\det(A^T) &= \det(R^T) \det((Q_k^{-1})^T) \cdots \det((Q_1^{-1})^T) \\
          &= \det(R) \det(Q_k^{-1}) \cdots \det(Q_1^{-1}) \\
          &= \det(Q_1^{-1}) \cdots \det(Q_k^{-1}) \det(R) \\
          &= \det(A)
\end{aligned}
\]







4.8.13     An impractical formula for the inverse
We can use the expansion formulae of the previous section to obtain a formula
for the inverse of a matrix A. This formula is really only practical for 3 × 3
matrices, since for larger matrices, the work involved in computing the
determinants appearing is prohibitive.
    We begin with the expansion formula
\[
\det(A) = \sum_{j=1}^{n} (-1)^{i+j} a_{i,j} \det(M_{i,j})
\]
If A is invertible, then det(A) ≠ 0 so we can divide by it to obtain
\[
1 = \sum_{j=1}^{n} a_{i,j} \frac{(-1)^{i+j} \det(M_{i,j})}{\det(A)}
\]
Now suppose we take the matrix A and replace the ith row by the kth row for
some k ≠ i. The resulting matrix A′ has two rows the same, so its determinant
is zero. Its expansion is the same as that for A, except that $a_{i,j}$ is replaced by
$a_{k,j}$. Thus, if k ≠ i
\[
0 = \sum_{j=1}^{n} (-1)^{i+j} a_{k,j} \det(M_{i,j})
\]
Dividing by det(A) yields
\[
0 = \sum_{j=1}^{n} a_{k,j} \frac{(-1)^{i+j} \det(M_{i,j})}{\det(A)}
\]
   Now let B be the matrix with entries
\[
b_{i,j} = \frac{(-1)^{i+j} \det(M_{j,i})}{\det(A)}
\]
This turns out to be the inverse $A^{-1}$. Note that the indices on M are swapped.
   It gets a bit confusing with all the indices, but let’s think about what we
need to show. The k, ith entry of the product AB is given by
\[
\begin{aligned}
(AB)_{k,i} &= \sum_{j=1}^{n} a_{k,j} b_{j,i} \\
           &= \sum_{j=1}^{n} a_{k,j} \frac{(-1)^{i+j} \det(M_{i,j})}{\det(A)}
\end{aligned}
\]
According to the formulae above, this sum is equal to 1 if k = i and equal to 0 if
k ≠ i. In other words, AB is the identity matrix I. This shows that $B = A^{-1}$.
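
The formula for B can be tried out directly. The Python sketch below is an illustration under stated assumptions: the helper names minor, det, inverse_by_cofactors and matmul, and the sample matrix, are not from the original notes, and indices are 0-based rather than 1-based.

```python
from fractions import Fraction

def minor(m, i, j):
    """The matrix M_{i,j}: delete row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]

def det(m):
    """Cofactor expansion along the first row; the empty matrix has determinant 1."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))

def inverse_by_cofactors(m):
    """Entries b_{i,j} = (-1)^(i+j) det(M_{j,i}) / det(A); note the swapped indices."""
    n = len(m)
    d = Fraction(det(m))
    if d == 0:
        raise ValueError("matrix is not invertible")
    return [[(-1) ** (i + j) * det(minor(m, j, i)) / d for j in range(n)]
            for i in range(n)]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]

A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]  # hypothetical example with det(A) = 18
B = inverse_by_cofactors(A)
identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
print(matmul(A, B) == identity)  # True
```

The recursive cofactor expansion does on the order of n! multiplications, which is the reason the section title calls the formula impractical.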



4.8.14      Cramer’s rule, an impractical way to solve systems
Given an impractical way to compute the inverse, we can derive an impractical
formula for the solution of a system of n equations in n unknowns, i.e., a matrix
equation
\[
Ax = b
\]
The solution x is equal to $A^{-1} b$. So if
\[
x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
\quad\text{and}\quad
b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix},
\]
then using the formula of the previous section for the inverse, we get
\[
x_i = \sum_{j=1}^{n} (A^{-1})_{i,j} b_j = \sum_{j=1}^{n} \frac{(-1)^{i+j} \det(M_{j,i})}{\det(A)} b_j
\]
but this is exactly (1/ det(A)) times the formula for expanding, along the ith
column, the determinant of the matrix obtained from A by replacing the ith
column with b. Thus
\[
x_i = \frac{\det(\text{matrix obtained from } A \text{ by replacing the } i\text{th column with } b)}{\det(A)}
\]
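
Cramer's rule as stated can be sketched in a few lines of Python. This is a minimal illustration, assuming a square system with exact rational entries; the function names det and cramer and the sample system are assumptions made here, not part of the original notes.

```python
from fractions import Fraction

def det(m):
    """Cofactor expansion along the first row; the empty matrix has determinant 1."""
    if not m:
        return 1
    return sum((-1) ** j * m[0][j] * det([r[:j] + r[j + 1:] for r in m[1:]])
               for j in range(len(m)))

def cramer(a, b):
    """Solve a x = b via x_i = det(a with column i replaced by b) / det(a)."""
    d = Fraction(det(a))
    if d == 0:
        raise ValueError("det(A) = 0, so Cramer's rule does not apply")
    xs = []
    for i in range(len(a)):
        # Build the matrix obtained from a by replacing column i with b.
        replaced = [row[:i] + [b[k]] + row[i + 1:] for k, row in enumerate(a)]
        xs.append(det(replaced) / d)
    return xs

A = [[2, 1], [1, 3]]  # hypothetical 2 x 2 system
b = [3, 5]
x = cramer(A, b)
print(x)  # [Fraction(4, 5), Fraction(7, 5)]
```

Substituting back gives 2(4/5) + 7/5 = 3 and 4/5 + 3(7/5) = 5, as required.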

4.8.15      Problems
                                                         
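Problem 4.47 Find the inverse of
\[
\begin{bmatrix} 1 & 0 & 1 \\ 1 & 2 & 3 \\ 3 & 0 & 1 \end{bmatrix}
\]
using the “impractical” formula.

Problem 4.48 Solve the equation
\[
\begin{bmatrix} 1 & 0 & 1 \\ 1 & 2 & 3 \\ 3 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
= \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}
\]
using Cramer’s rule.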


4.9      Solutions to Chapter Problems
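Solution 4.1
\[
AB = \begin{bmatrix} -13 & 7 \\ -9 & 5 \end{bmatrix}, \qquad
BA = \begin{bmatrix} 1 & 2 & -1 \\ -2 & -4 & -8 \\ -1 & -2 & -5 \end{bmatrix}, \qquad
AD = \begin{bmatrix} -14 \\ -18 \end{bmatrix},
\]
\[
CB = \begin{bmatrix} 4 & 2 \end{bmatrix}, \qquad
CD = \begin{bmatrix} 26 \end{bmatrix}, \qquad
DC = \begin{bmatrix} 4 & -4 & 0 \\ -22 & 22 & 0 \\ 4 & -4 & 0 \end{bmatrix}.
\]
AC, CA, DA, BC, BD and DB are not defined.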
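Solution 4.2
\[
A = \begin{bmatrix} 3 & 0 \\ -1 & 2 \\ 1 & 1 \end{bmatrix}; \qquad
B = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}; \qquad
C = \begin{bmatrix} 1 & 4 & 2 \\ 3 & 1 & 5 \end{bmatrix};
\]
The matrix A is 3 × 2, the matrix B is 2 × 2, and the matrix C is 2 × 3. For a
product of two matrices to be possible, the number of columns in the left matrix
has to be equal to the number of rows in the right matrix. Hence there are 5
possible products:

1. \( A \cdot B = \begin{bmatrix} 0 & 3 \\ 0 & -1 \\ 0 & 1 \end{bmatrix} \)

2. \( A \cdot C = \begin{bmatrix} 3 & 12 & 6 \\ 5 & -2 & 8 \\ 4 & 5 & 7 \end{bmatrix} \)

3. \( B \cdot B = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \)

4. \( B \cdot C = \begin{bmatrix} 3 & 1 & 5 \\ 0 & 0 & 0 \end{bmatrix} \)

5. \( C \cdot A = \begin{bmatrix} 1 & 10 \\ 13 & 7 \end{bmatrix} \)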
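
The dimension rule and the five products can be checked mechanically. The following Python sketch is only an illustration: the helper matmul and the dictionary names are assumptions introduced here.

```python
def matmul(x, y):
    """Return the product x*y, or None when columns of x differ from rows of y."""
    if len(x[0]) != len(y):
        return None
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]

A = [[3, 0], [-1, 2], [1, 1]]   # 3 x 2
B = [[0, 1], [0, 0]]            # 2 x 2
C = [[1, 4, 2], [3, 1, 5]]      # 2 x 3
mats = {"A": A, "B": B, "C": C}

# Try all nine ordered pairs and keep the products that are defined.
defined = {}
for p in mats:
    for q in mats:
        product = matmul(mats[p], mats[q])
        if product is not None:
            defined[p + q] = product

print(sorted(defined))  # ['AB', 'AC', 'BB', 'BC', 'CA']
print(defined["CA"])    # [[1, 10], [13, 7]]
```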

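Solution 4.3 If
\[
A = \begin{bmatrix} 0 & a & b \\ 0 & 0 & c \\ 0 & 0 & 0 \end{bmatrix}
\]
then
\[
A^2 = \begin{bmatrix} 0 & 0 & ac \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}, \qquad
A^3 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.
\]
If
\[
A = \begin{bmatrix} 1 & 0 & a \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]
then
\[
A^2 = \begin{bmatrix} 1 & 0 & 2a \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \qquad
A^3 = \begin{bmatrix} 1 & 0 & 3a \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.
\]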

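Solution 4.4
a,b)
\[
A^k = \begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix}
\]
c) Use the power series formula
\[
e^x = 1 + x + \frac{x^2}{2} + \frac{x^3}{3!} + \cdots
\]
       and substitute in the matrix tA for each occurrence of x. Substitute the
       identity matrix for 1. This gives
\[
\begin{aligned}
e^{tA} &= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
        + t \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}
        + \frac{t^2}{2} \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix} + \cdots \\
       &= \begin{bmatrix}
            1 + t + \frac{t^2}{2} + \frac{t^3}{3!} + \cdots & t + 2\frac{t^2}{2} + 3\frac{t^3}{3!} + \cdots \\
            0 & 1 + t + \frac{t^2}{2} + \frac{t^3}{3!} + \cdots
          \end{bmatrix} \\
       &= \begin{bmatrix}
            1 + t + \frac{t^2}{2} + \frac{t^3}{3!} + \cdots & t + t^2 + \frac{t^3}{2!} + \cdots \\
            0 & 1 + t + \frac{t^2}{2} + \frac{t^3}{3!} + \cdots
          \end{bmatrix} \\
       &= \begin{bmatrix} e^t & t e^t \\ 0 & e^t \end{bmatrix}
\end{aligned}
\]

d,e) We are looking for all matrices that satisfy $B^2 = A$. Let
       $B = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$. Then
       $B^2 = \begin{bmatrix} a^2 + bc & ab + bd \\ ca + dc & bc + d^2 \end{bmatrix}$,
       so we need to satisfy the equations
\[
\begin{aligned}
a^2 + bc &= 1 \\
ab + bd  &= 1 \\
ca + dc  &= 0 \\
bc + d^2 &= 1
\end{aligned}
\]
       The third equation says c(a + d) = 0 so either c = 0 or a + d = 0. But
       a + d = 0 would contradict the second equation, so we must have c = 0.
       So
\[
\begin{aligned}
a^2      &= 1 \\
b(a + d) &= 1 \\
d^2      &= 1
\end{aligned}
\]
       So a = ±1 and d = ±1. If a = 1 then to satisfy the second equation we
       must have d = 1 and b = 1/2. If a = −1 then to satisfy the second
       equation we must have d = −1 and b = −1/2. Thus the two square roots
       are
\[
\begin{bmatrix} 1 & 1/2 \\ 0 & 1 \end{bmatrix}
\quad\text{and}\quad
\begin{bmatrix} -1 & -1/2 \\ 0 & -1 \end{bmatrix}.
\]
       In general, a matrix may have more than two square roots.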
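
The closed form in part (c) can be compared against partial sums of the series. The Python sketch below is an illustration only: the helpers matmul2 and exp_series, the number of terms, and the value of t are assumptions chosen here.

```python
import math

def matmul2(x, y):
    """Product of two 2 x 2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def exp_series(A, t, terms=30):
    """Partial sum I + tA + (tA)^2/2! + ... using `terms` terms in total."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]  # current term (tA)^k / k!, starting at k = 0
    tA = [[t * A[i][j] for j in range(2)] for i in range(2)]
    for k in range(1, terms):
        # (tA)^k / k! = ((tA)^(k-1) / (k-1)!) * tA / k
        term = [[v / k for v in row] for row in matmul2(term, tA)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result

A = [[1, 1], [0, 1]]
t = 0.5  # arbitrary sample value
approx = exp_series(A, t)
exact = [[math.exp(t), t * math.exp(t)], [0.0, math.exp(t)]]
max_error = max(abs(approx[i][j] - exact[i][j]) for i in range(2) for j in range(2))
print(max_error < 1e-12)  # True
```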

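Solution 4.5
\[
A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad
A^2 = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}, \quad
A^3 = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix},
\]
\[
A^4 = A^5 = \cdots = 0
\]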
Solution 4.6 T (x + y) = x + y + a whereas T (x) + T (y) = x + a + y + a =
x + y + 2a. Since these are not equal, T is not linear.
Solution 4.7 It follows from the properties of the dot product that a·(sx+ty) =
sa · x + ta · y.
Solution 4.8 One way to do these problems is to determine the angle θ that
the lines make with the x axis, and then substitute into the formula. So, for
part (a), we have θ = π/4 (45°). Thus the projection matrix is
\[
\frac{1}{2} \begin{bmatrix} 1 + \cos(2\theta) & \sin(2\theta) \\ \sin(2\theta) & 1 - \cos(2\theta) \end{bmatrix}
= \frac{1}{2} \begin{bmatrix} 1 + \cos(\pi/2) & \sin(\pi/2) \\ \sin(\pi/2) & 1 - \cos(\pi/2) \end{bmatrix}
= \frac{1}{2} \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}
\]
Another way to do this problem is to go back to the derivation of the projection
matrix, and redo the formula for the matrix for projection in the direction of a
when a is not necessarily a unit vector. This gives
\[
\frac{1}{a_1^2 + a_2^2} \begin{bmatrix} a_1^2 & a_1 a_2 \\ a_2 a_1 & a_2^2 \end{bmatrix}.
\]
For part (a) the vector a = [1, 1] so this formula gives the same answer. For
part (b) the vector a = [4, −3] so the projection matrix is
\[
\frac{1}{25} \begin{bmatrix} 16 & -12 \\ -12 & 9 \end{bmatrix}.
\]





Solution 4.9 Since we have already computed the corresponding projection
matrices, we can use the formula 2P − I to get the reflection matrices. This
gives
(a)
\[
\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
\]
(b)
\[
\begin{bmatrix} 7/25 & -24/25 \\ -24/25 & -7/25 \end{bmatrix}
\]

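Solution 4.10 Here we can just substitute into the formula for the rotation
matrix. This gives
(a)
\[
\begin{bmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix}
\]
(b)
\[
\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}
\]
(c)
\[
\begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}
\]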

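Solution 4.11 We need to compute the matrix product
\[
\begin{bmatrix} \cos(2\theta) & \sin(2\theta) \\ \sin(2\theta) & -\cos(2\theta) \end{bmatrix}
\begin{bmatrix} \cos(2\phi) & \sin(2\phi) \\ \sin(2\phi) & -\cos(2\phi) \end{bmatrix}
\]
This equals
\[
\begin{bmatrix}
\cos(2\theta)\cos(2\phi) + \sin(2\theta)\sin(2\phi) & \cos(2\theta)\sin(2\phi) - \sin(2\theta)\cos(2\phi) \\
\sin(2\theta)\cos(2\phi) - \cos(2\theta)\sin(2\phi) & \sin(2\theta)\sin(2\phi) + \cos(2\theta)\cos(2\phi)
\end{bmatrix}
\]
Using the addition formulae for cos and sin this can be rewritten
\[
\begin{bmatrix} \cos(2\theta - 2\phi) & -\sin(2\theta - 2\phi) \\ \sin(2\theta - 2\phi) & \cos(2\theta - 2\phi) \end{bmatrix}
\]
This is the matrix for rotation by 2θ − 2φ.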
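
The identity can be spot-checked numerically for particular angles. In the Python sketch below, the helper names and the sample angles are assumptions made for illustration; a check at one pair of angles is not a proof, only a guard against sign errors.

```python
import math

def reflection(theta):
    """Matrix of reflection across the line at angle theta to the x axis."""
    c, s = math.cos(2 * theta), math.sin(2 * theta)
    return [[c, s], [s, -c]]

def rotation(alpha):
    """Matrix of counterclockwise rotation by the angle alpha."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c, -s], [s, c]]

def matmul2(x, y):
    """Product of two 2 x 2 matrices."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

theta, phi = 0.7, 0.2  # arbitrary sample angles in radians
product = matmul2(reflection(theta), reflection(phi))
expected = rotation(2 * theta - 2 * phi)
max_error = max(abs(product[i][j] - expected[i][j])
                for i in range(2) for j in range(2))
print(max_error < 1e-12)  # True
```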






                           Figure 4.12: Problem 4.12.


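Solution 4.12 We first find the angle θ between the line x = 2y and the x-axis.
   Note that (using Pythagoras) $\cos\theta = \frac{2}{\sqrt{5}}$, $\sin\theta = \frac{1}{\sqrt{5}}$.
   Now,
\[
f\left( \begin{bmatrix} 1 \\ 10 \end{bmatrix} \right)
= \begin{bmatrix} \cos(2\theta) & \sin(2\theta) \\ \sin(2\theta) & -\cos(2\theta) \end{bmatrix}
  \begin{bmatrix} 1 \\ 10 \end{bmatrix}
\]
So, we need to calculate cos(2θ) and sin(2θ).
    We have $\cos(2\theta) = \cos^2\theta - \sin^2\theta = \frac{4}{5} - \frac{1}{5} = \frac{3}{5}$, and
$\sin(2\theta) = 2 \sin\theta \cos\theta = 2 \cdot \frac{1}{\sqrt{5}} \cdot \frac{2}{\sqrt{5}} = \frac{4}{5}$.
    So,
\[
f\left( \begin{bmatrix} 1 \\ 10 \end{bmatrix} \right)
= \begin{bmatrix} \frac{3}{5} & \frac{4}{5} \\ \frac{4}{5} & -\frac{3}{5} \end{bmatrix}
  \begin{bmatrix} 1 \\ 10 \end{bmatrix}
= \frac{1}{5} \begin{bmatrix} 3 & 4 \\ 4 & -3 \end{bmatrix} \begin{bmatrix} 1 \\ 10 \end{bmatrix}
= \frac{1}{5} \begin{bmatrix} 43 \\ -26 \end{bmatrix}.
\]
Therefore, the matrix of f is
\[
\frac{1}{5} \begin{bmatrix} 3 & 4 \\ 4 & -3 \end{bmatrix}.
\]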

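Solution 4.13 Notice that g is in fact the composition of $\mathrm{Ref}_\theta$ and $R_{\pi/2}$, that
is:
\[
g = R_{\pi/2} \circ \mathrm{Ref}_\theta
\]
where θ is the angle between the line x = −y and the x-axis (so, θ = π/2 + π/4 =
3π/4).
   The matrix of g is given by the product of the matrices $R_{\pi/2}$ and $\mathrm{Ref}_{3\pi/4}$.
   Note that
\[
\begin{aligned}
[g] &= [R_{\pi/2}][\mathrm{Ref}_{3\pi/4}] \\
    &= \begin{bmatrix} \cos \pi/2 & -\sin \pi/2 \\ \sin \pi/2 & \cos \pi/2 \end{bmatrix}
       \begin{bmatrix} \cos 6\pi/4 & \sin 6\pi/4 \\ \sin 6\pi/4 & -\cos 6\pi/4 \end{bmatrix} \\
    &= \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}
       \begin{bmatrix} 0 & -1 \\ -1 & 0 \end{bmatrix}
     = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}
\end{aligned}
\]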

Solution 4.14 Under this rotation, the vector e1 = [1, 0, 0] gets transformed
to [cos(θ), sin(θ), 0], e2 = [0, 1, 0] gets transformed to [− sin(θ), cos(θ), 0], and
e3 = [0, 0, 1] is transformed to itself (i.e., it doesn’t change). Putting these in
the columns of a matrix yields
\[
\begin{bmatrix} \cos(\theta) & -\sin(\theta) & 0 \\ \sin(\theta) & \cos(\theta) & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]

Solution 4.15 Let

                                  x = (x1 , x2 , x3 , x4 )
                                   y = (y1 , y2 , y3 , y4 )

1.

     T (x + y)    = T (x1 + y1 , x2 + y2 , x3 + y3 , x4 + y4 )
                  =   ((x1 + y1 ) + 4(x2 + y2 ) + 5(x3 + y3 ), 3(x1 + y1 ) − 2(x2 + y2 ) + ...
                      (x3 + y3 ) − (x4 + y4 ), −(x1 + y1 ) − (x3 + y3 ) + (x4 + y4 ))
                  =   (x1 + 4x2 + 5x3 + y1 + 4y2 + 5y3 , 3x1 − 2x2 + x3 − x4 + ...
                      3y1 − 2y2 + y3 − y4 , −x1 − x3 + x4 − y1 − y3 + y4 )
                  =   (x1 + 4x2 + 5x3 , 3x1 − 2x2 + x3 − x4 , −x1 − x3 + x4 ) + ...
                      (y1 + 4y2 + 5y3 , 3y1 − 2y2 + y3 − y4 , −y1 − y3 + y4 )
                  = T (x) + T (y).

     This holds for any x, y ∈ R4 .
2.

     T (cx)   = T (cx1 , cx2 , cx3 , cx4 )
              =    (cx1 + 4cx2 + 5cx3 , 3cx1 − 2cx2 + cx3 − cx4 , −cx1 − cx3 + cx4 )
              =    (c(x1 + 4x2 + 5x3 ), c(3x1 − 2x2 + x3 − x4 ), c(−x1 − x3 + x4 ))
              = c(x1 + 4x2 + 5x3 , 3x1 − 2x2 + x3 − x4 , −x1 − x3 + x4 )
              = cT (x).

     This holds for any x ∈ R4 and any c ∈ R. Since T preserves both vector
     addition and scalar multiplication, T is linear.

Solution 4.16 First recall the definition: a vector y = (y1 , y2 , y3 ) is said to be
in the range of T if there exists a vector x = (x1 , x2 ) such that T (x) = y.


Chapter 4. Matrices and Determinants             4.9. Solutions to Chapter Problems



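a) We are looking for x = (x1 , x2 ), if any, such that T (x) = (1, 4, 2). But
   \[
   \begin{bmatrix} 1 \\ 4 \\ 2 \end{bmatrix}
   = T \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
   = A \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
   = \begin{bmatrix} 1 & 2 \\ 0 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
   = \begin{bmatrix} x_1 + 2x_2 \\ x_2 \\ x_1 + x_2 \end{bmatrix}.
   \]

   We get the following system of equations:
   \[
   \begin{cases} x_1 + 2x_2 = 1 \\ x_2 = 4 \\ x_1 + x_2 = 2 \end{cases}
   \quad\Rightarrow\quad
   \begin{cases} x_1 + 8 = 1 \\ x_1 + 4 = 2 \end{cases}
   \quad\Rightarrow\quad
   \begin{cases} x_1 = -7 \\ x_1 = -2 \end{cases}
   \]

   The two values of x1 contradict each other, so the system has no solution.
   In other words, there is no x such that T (x) = (1, 4, 2). Hence, (1, 4, 2) is
   not in the range of T .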

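b) We are looking for x = (x1 , x2 ), if any, such that T (x) = (1, 1, 1). So
   \[
   \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}
   = T \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
   = A \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
   = \begin{bmatrix} 1 & 2 \\ 0 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
   = \begin{bmatrix} x_1 + 2x_2 \\ x_2 \\ x_1 + x_2 \end{bmatrix}.
   \]

   We get the following system of equations:
   \[
   \begin{cases} x_1 + 2x_2 = 1 \\ x_2 = 1 \\ x_1 + x_2 = 1 \end{cases}
   \quad\Rightarrow\quad
   \begin{cases} x_1 + 2 = 1 \\ x_1 + 1 = 1 \end{cases}
   \quad\Rightarrow\quad
   \begin{cases} x_1 = -1 \\ x_1 = 0 \end{cases}
   \]

   So, as in the previous case, the system does not have any solutions; in other
   words, (1, 1, 1) is not in the range of T .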
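
The two range checks above can be repeated mechanically. The following is a minimal Python sketch using only the standard library; the helper names `apply` and `preimage` are illustrative and not part of the notes. It relies on the special shape of this particular A (the second equation gives x2 directly), so it is not a general solver.

```python
from fractions import Fraction

# Matrix of T from Solution 4.16, stored as a list of rows.
A = [[1, 2], [0, 1], [1, 1]]

def apply(A, x):
    """Return the matrix-vector product A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def preimage(y):
    """Return x with A x = y, or None if y is not in the range of T.

    Row 2 of A forces x2 = y2, row 1 then forces x1 = y1 - 2*x2,
    and the final comparison checks the remaining equation.
    """
    x2 = Fraction(y[1])
    x1 = Fraction(y[0]) - 2 * x2
    x = [x1, x2]
    return x if apply(A, x) == list(y) else None

print(preimage((1, 4, 2)))  # part a)
print(preimage((1, 1, 1)))  # part b)
print(preimage((3, 1, 2)))  # a vector that does lie in the range
```

Both vectors from the problem give `None`, while (3, 1, 2) is returned with the preimage x = (1, 1).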

Solution 4.17

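a) Note that
   \[
   \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}
   = 3 \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}
   - 2 \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}
   + 3 \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}.
   \]
   So,
   \begin{align*}
   T \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}
     &= 3\,T \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}
      - 2\,T \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}
      + 3\,T \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \\
     &= 3 \begin{bmatrix} 1 \\ 1 \end{bmatrix}
      - 2 \begin{bmatrix} 2 \\ 0 \end{bmatrix}
      + 3 \begin{bmatrix} -1 \\ 5 \end{bmatrix}
      = \begin{bmatrix} -4 \\ 18 \end{bmatrix}.
   \end{align*}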







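b) In order to write the matrix of T , we need to find T (0, 1, 0). Note that
   \[
   \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}
   = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} - \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}.
   \]
   So,
   \begin{align*}
   T \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}
     &= T \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} - T \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} \\
     &= \begin{bmatrix} 1 \\ 1 \end{bmatrix} - \begin{bmatrix} 2 \\ 0 \end{bmatrix}
      = \begin{bmatrix} -1 \\ 1 \end{bmatrix}.
   \end{align*}
   Now, the matrix of T is given by
   \[
   T = \begin{bmatrix}
         T \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} &
         T \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} &
         T \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
       \end{bmatrix}
     = \begin{bmatrix} 1 & -1 & -1 \\ 1 & 1 & 5 \end{bmatrix}.
   \]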

Solution 4.18
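a)
   \begin{align}
   (h \circ g) \begin{bmatrix} 1 \\ 0 \\ -5 \end{bmatrix}
     &= h \left( g \begin{bmatrix} 1 \\ 0 \\ -5 \end{bmatrix} \right)
      = h \begin{bmatrix} 1 + 5 + 0 \\ -5 - 2 \times 0 \end{bmatrix} \tag{4.16} \\
     &= h \begin{bmatrix} 6 \\ -5 \end{bmatrix}
      = \begin{bmatrix} 6 \\ -6 \\ -5 \\ 5 \end{bmatrix}. \tag{4.17}
   \end{align}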

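b) We first find the matrix of g. To do so, we need to know g(1, 0, 0), g(0, 1, 0),
   and g(0, 0, 1). We have that
   \[
   g \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad
   g \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ -2 \end{bmatrix}, \qquad
   g \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} -1 \\ 1 \end{bmatrix}.
   \]
   So, the matrix of g is
   \[
   g = \begin{bmatrix} 1 & 1 & -1 \\ 0 & -2 & 1 \end{bmatrix}.
   \]
   Now we find the matrix of h. As with g, we need to find h(1, 0) and h(0, 1).
   We have that
   \[
   h \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \\ 0 \\ 0 \end{bmatrix}, \qquad
   h \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 1 \\ -1 \end{bmatrix}.
   \]
   So, the matrix of h is
   \[
   h = \begin{bmatrix} 1 & 0 \\ -1 & 0 \\ 0 & 1 \\ 0 & -1 \end{bmatrix}.
   \]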

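c) The matrix of h ◦ g is given by
   \[
   h \circ g = [h]_{4 \times 2}\,[g]_{2 \times 3}
   = \begin{bmatrix} 1 & 0 \\ -1 & 0 \\ 0 & 1 \\ 0 & -1 \end{bmatrix}
     \begin{bmatrix} 1 & 1 & -1 \\ 0 & -2 & 1 \end{bmatrix}
   = \begin{bmatrix} 1 & 1 & -1 \\ -1 & -1 & 1 \\ 0 & -2 & 1 \\ 0 & 2 & -1 \end{bmatrix}_{4 \times 3}
   \]

   This is consistent with h ◦ g : R3 → R4 : the matrix of h ◦ g has
   4 rows and 3 columns, as expected.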
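
Parts a) and c) should agree: multiplying the matrix of h ◦ g by (1, 0, −5) must reproduce the vector found in part a). A minimal Python sketch of that check follows; `matmul` is an illustrative helper written for this example, with matrices stored as lists of rows.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]

H = [[1, 0], [-1, 0], [0, 1], [0, -1]]   # matrix of h (4 x 2)
G = [[1, 1, -1], [0, -2, 1]]             # matrix of g (2 x 3)

HG = matmul(H, G)                        # matrix of h o g (4 x 3)
v = [[1], [0], [-5]]                     # column vector from part a)

print(HG)
print(matmul(HG, v))
```

The second line printed is the column (6, −6, −5, 5), matching equation (4.17).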

Solution 4.19 We can determine the transition matrix, since we know each
pi,i = 0, and the sum over each column is one. We are given that p2,1 = 1/2.
Then, since p1,1 + p2,1 + p3,1 = 1, we have 0 + 1/2 + p3,1 = 1, so p3,1 = 1/2.
Similarly p1,2 = 1/3, p3,2 = 2/3, and p1,3 = 1/4, p2,3 = 3/4. Thus
\[
P = \begin{bmatrix} 0 & 1/3 & 1/4 \\ 1/2 & 0 & 3/4 \\ 1/2 & 2/3 & 0 \end{bmatrix}.
\]
If we start out with the probability vector $x_0 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$ (i.e., the walker is in
location 1) then after two time steps the probability vector is $P^2 x_0$, i.e.,
\[
x_2 = \begin{bmatrix} 7/24 & 1/6 & 1/4 \\ 3/8 & 2/3 & 1/8 \\ 1/3 & 1/6 & 5/8 \end{bmatrix}
      \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}
    = \begin{bmatrix} 7/24 \\ 3/8 \\ 1/3 \end{bmatrix}.
\]
So the probability that the walker is in position 2 after two time steps is 3/8.
Solution 4.20
a) Since Pi,i = 0, and the sum of entries in each column is 1, the matrix P is
   \[
   P = \begin{bmatrix} 0 & 2/5 & 1/4 \\ 1/3 & 0 & 3/4 \\ 2/3 & 3/5 & 0 \end{bmatrix}.
   \]

b) The random walker starts at location 1, so x0 = (1, 0, 0)T . The probability
   vectors after one, two and three time steps are
   \[
   x_1 = P x_0 = \begin{bmatrix} 0 \\ 1/3 \\ 2/3 \end{bmatrix}, \qquad
   x_2 = P x_1 = \begin{bmatrix} 3/10 \\ 1/2 \\ 1/5 \end{bmatrix}, \qquad
   x_3 = P x_2 = \begin{bmatrix} 1/4 \\ 1/4 \\ 1/2 \end{bmatrix}.
   \]
   So, x3,2 , that is, the probability that the random walker is in location 2 after
   3 steps, is 1/4 = 0.25.

c)
   \[
   x_0 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \qquad
   x_1 = P x_0 = \begin{bmatrix} 1/4 \\ 3/4 \\ 0 \end{bmatrix}, \qquad
   x_2 = P x_1 = \frac{1}{60} \begin{bmatrix} 18 \\ 5 \\ 37 \end{bmatrix}.
   \]
   So, x2,1 , the probability that the random walker is in location 1 after 2 steps,
   is 18/60 = 3/10 = 0.3.
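
The repeated products in parts b) and c) are easy to get wrong by hand, so it can help to redo them in exact arithmetic. The following is a minimal Python sketch using `fractions.Fraction`; the helper names `step` and `walk` are illustrative and not part of the notes.

```python
from fractions import Fraction as F

# Transition matrix from part a), stored as a list of rows.
P = [[F(0),    F(2, 5), F(1, 4)],
     [F(1, 3), F(0),    F(3, 4)],
     [F(2, 3), F(3, 5), F(0)]]

def step(P, x):
    """One time step of the random walk: return P x."""
    return [sum(p * xi for p, xi in zip(row, x)) for row in P]

def walk(P, x0, n):
    """Probability vector after n time steps, starting from x0."""
    x = list(x0)
    for _ in range(n):
        x = step(P, x)
    return x

x3_from_1 = walk(P, [1, 0, 0], 3)   # part b): start in location 1
x2_from_3 = walk(P, [0, 0, 1], 2)   # part c): start in location 3

print(x3_from_1[1])   # probability of location 2 after 3 steps
print(x2_from_3[0])   # probability of location 1 after 2 steps
```

The two printed values are 1/4 and 3/10.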

Solution 4.21
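a)
   \begin{align}
   P &= \begin{bmatrix} 1/4 & 1/4 & 1/4 & 1/4 \\ 1/4 & 1/4 & 1/4 & 1/4 \\ 1/4 & 1/4 & 1/4 & 1/4 \\ 1/4 & 1/4 & 1/4 & 1/4 \end{bmatrix}
      = \frac{1}{4} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix} \tag{4.18} \\
   P^2 &= \frac{1}{4} \cdot \frac{1}{4} \begin{bmatrix} 4 & 4 & 4 & 4 \\ 4 & 4 & 4 & 4 \\ 4 & 4 & 4 & 4 \\ 4 & 4 & 4 & 4 \end{bmatrix}
        = \frac{1}{4 \cdot 4} \cdot 4 \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix}. \tag{4.19}
   \end{align}
   So, $P^2 = P$, thus $P^3 = P \cdot P^2 = P \cdot P = P$. Hence, $P^n = P$ for every n ≥ 1.

b) Let
   \[
   x_0 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}.
   \]
   Then
   \[
   x_1 = P x_0 = \frac{1}{4} \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}.
   \]
   In general, $x_n = P^n x_0$. But $P^n = P$ for every n ≥ 1. So,
   \[
   x_n = P^n x_0 = P x_0 = x_1 = \frac{1}{4} \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}.
   \]

   Now, let
   \[
   x_0 = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}, \qquad
   x_1 = P x_0 = \frac{1}{4} \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}.
   \]
   Again
   \[
   x_n = P^n x_0 = P x_0 = \frac{1}{4} \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}.
   \]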

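Solution 4.22
\[
P = \begin{bmatrix} 1/3 & 1/4 & 2/5 \\ 0 & 1/2 & 1/5 \\ 2/3 & 1/4 & 2/5 \end{bmatrix}
\]

a) We are given
   \[
   x_0 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}.
   \]
   Then
   \[
   x_1 = P x_0 = \begin{bmatrix} 1/4 \\ 1/2 \\ 1/4 \end{bmatrix}.
   \]
   So, x1,1 = 1/4 is the probability that the random walker is in location 1 after one
   time step.

b) Again
   \[
   x_0 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}.
   \]
   Then
   \[
   x_1 = P x_0 = \begin{bmatrix} 1/4 \\ 1/2 \\ 1/4 \end{bmatrix}, \qquad
   x_2 = P x_1 = \begin{bmatrix} 37/120 \\ 3/10 \\ 47/120 \end{bmatrix}
   \quad\Rightarrow\quad x_{2,1} = \frac{37}{120}.
   \]

c)
   \[
   x_0 = \begin{bmatrix} 1/3 \\ 1/3 \\ 1/3 \end{bmatrix}.
   \]
   We have
   \[
   x_1 = P x_0 = \begin{bmatrix} 59/180 \\ 7/30 \\ 79/180 \end{bmatrix}, \qquad
   x_2 = P x_1 = \begin{bmatrix} 1853/5400 \\ 46/225 \\ 2443/5400 \end{bmatrix}.
   \]
   Therefore, x2,3 = 2443/5400.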







Solution 4.23 In this situation the transition matrix is
\[
P = \frac{1}{3} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}
\]
so $P = P^2 = P^3 = \cdots = P^n$. In this case, if we start out with any probability
vector $x_0 = [x_1, x_2, x_3]^T$ with x1 + x2 + x3 = 1, then
\[
P x_0 = \begin{bmatrix} 1/3 \\ 1/3 \\ 1/3 \end{bmatrix} = P^n x_0
\]
for each n.
Solution 4.24
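(a) The transition matrix is
    \[
    P = \begin{bmatrix} 1/2 & 0 & 1/3 \\ 1/4 & 1/2 & 1/3 \\ 1/4 & 1/2 & 1/3 \end{bmatrix}
    \]

    The system starts in state 3 so $x^{(0)} = e_3$. After one step, the probability
    vector is
    \[
    x^{(1)} = P x^{(0)}
    = \begin{bmatrix} 1/2 & 0 & 1/3 \\ 1/4 & 1/2 & 1/3 \\ 1/4 & 1/2 & 1/3 \end{bmatrix}
      \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
    = \begin{bmatrix} 1/3 \\ 1/3 \\ 1/3 \end{bmatrix}
    \]

    That is, there is an equal probability of being in any state. After another
    step,
    \[
    x^{(2)} = P x^{(1)}
    = \begin{bmatrix} 1/2 & 0 & 1/3 \\ 1/4 & 1/2 & 1/3 \\ 1/4 & 1/2 & 1/3 \end{bmatrix}
      \begin{bmatrix} 1/3 \\ 1/3 \\ 1/3 \end{bmatrix}
    = \begin{bmatrix} 5/18 \\ 13/36 \\ 13/36 \end{bmatrix}
    \]
    So the probability that the system is in state 2 after two steps is 13/36.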
(b) Suppose that the initial probabilities are given by the vector
    \[
    x^{(0)} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}.
    \]

    Then after k steps, the probabilities will be $x^{(k)} = P^k x^{(0)}$. After many
    steps (using the limiting behaviour of $P^k$ stated in the question):
    \begin{align*}
    x^{(k)} &\approx \begin{bmatrix} 0.25 & 0.25 & 0.25 \\ 0.375 & 0.375 & 0.375 \\ 0.375 & 0.375 & 0.375 \end{bmatrix}
                     \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \\
            &= \begin{bmatrix} 0.25 x_1 + 0.25 x_2 + 0.25 x_3 \\ 0.375 x_1 + 0.375 x_2 + 0.375 x_3 \\ 0.375 x_1 + 0.375 x_2 + 0.375 x_3 \end{bmatrix} \\
            &= (x_1 + x_2 + x_3) \begin{bmatrix} 0.25 \\ 0.375 \\ 0.375 \end{bmatrix} \\
            &= \begin{bmatrix} 0.25 \\ 0.375 \\ 0.375 \end{bmatrix}
    \end{align*}

    where the last line follows because the initial probabilities must sum to one.
    Note that the state tends to this probability distribution no matter what
    the initial state is (or initial probabilities are). This is known as the steady
    state probability vector of this random walk.
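
The limiting behaviour can also be observed numerically by applying P repeatedly. The sketch below is a minimal Python illustration under two assumptions made for this example: the walk starts in state 3, as in part (a), and 30 steps is taken as "many". The helper name `step` is illustrative.

```python
from fractions import Fraction as F

# Transition matrix from part (a), stored as a list of rows.
P = [[F(1, 2), F(0),    F(1, 3)],
     [F(1, 4), F(1, 2), F(1, 3)],
     [F(1, 4), F(1, 2), F(1, 3)]]

def step(P, x):
    """One time step of the random walk: return P x."""
    return [sum(p * xi for p, xi in zip(row, x)) for row in P]

# Start in state 3 and take many steps.
x = [F(0), F(0), F(1)]
for _ in range(30):
    x = step(P, x)
approx = [float(xi) for xi in x]

# The steady state vector is unchanged by one more step.
steady = [F(1, 4), F(3, 8), F(3, 8)]
print(approx)
print(step(P, steady) == steady)
```

The iterates approach (0.25, 0.375, 0.375), and the exact comparison confirms that this vector satisfies P x = x.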

Solution 4.25
\[
Ax = \begin{bmatrix} x_1 + 2x_2 + 3x_3 \\ 4x_1 + 5x_2 + 6x_3 \end{bmatrix}
\]
so y · Ax = y1 x1 + 2y1 x2 + 3y1 x3 + 4y2 x1 + 5y2 x2 + 6y2 x3 . On the other hand
\[
A^T y = \begin{bmatrix} y_1 + 4y_2 \\ 2y_1 + 5y_2 \\ 3y_1 + 6y_2 \end{bmatrix}
\]
so (AT y) · x = y1 x1 + 4y2 x1 + 2y1 x2 + 5y2 x2 + 3y1 x3 + 6y2 x3 . These expressions
are equal.
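Solution 4.26 $(A^T)^T = A$.
Solution 4.27 $AB = \begin{bmatrix} 9 & 12 & 15 \\ 7 & 11 & 15 \end{bmatrix}$ so
$(AB)^T = \begin{bmatrix} 9 & 7 \\ 12 & 11 \\ 15 & 15 \end{bmatrix}$. On the other hand
\[
B^T A^T = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix} \begin{bmatrix} 1 & 3 \\ 2 & 1 \end{bmatrix}
        = \begin{bmatrix} 9 & 7 \\ 12 & 11 \\ 15 & 15 \end{bmatrix}.
\]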
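Solution 4.28 If y = ei and x = ej (standard basis vectors, which are all zeros
except for a 1 in one spot) then y · (Ax) = ei · (Aej ) is the matrix entry aij ,
and in the same way y · (Bx) is the entry bij . Since the two products agree for
these choices of x and y, aij = bij for every i and j. So all the matrix entries
of A are the same as those of B, which means the matrices must be the same.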
Solution 4.29 x · (Ay) = xT Ay = (AT x)T y = (AT x) · y
Solution 4.30 To determine whether these matrices are invertible we reduce them
using Gaussian elimination. This gives

(a) $\begin{bmatrix} 1 & 2 \\ 0 & -2 \end{bmatrix}$

(b) $\begin{bmatrix} 1 & 2 & 3 \\ 0 & 3 & 4 \\ 0 & 0 & -1/3 \end{bmatrix}$

(c) $\begin{bmatrix} 1 & 2 & 3 \\ 0 & 3 & 4 \\ 0 & 0 & 2/3 \end{bmatrix}$

Since these all have rank equal to their size, they are all invertible.
Solution 4.31 The inverse is
\[
\begin{bmatrix} -5 & 2 \\ 3 & -1 \end{bmatrix}
\]

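Solution 4.32

(a) This matrix reduces to $\begin{bmatrix} 2 & 3 & -1 \\ 0 & 1/2 & 7/2 \\ 0 & 0 & 0 \end{bmatrix}$ and so is not invertible

(b) The inverse is $\begin{bmatrix} -1 & 0 & 1 \\ 1 & 1 & 0 \\ 3 & 1 & -1 \end{bmatrix}$

(c) $\begin{bmatrix} 3 & -5/2 & 1/2 \\ -3 & 4 & -1 \\ 1 & -3/2 & 1/2 \end{bmatrix}$

(d) $\begin{bmatrix} -7 & 5 & 3 \\ 3 & -2 & -2 \\ 3 & -2 & -1 \end{bmatrix}$

(e) $\begin{bmatrix} 1 & 0 & -a \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

(f) $\begin{bmatrix} 1 & -a & ac - b \\ 0 & 1 & -c \\ 0 & 0 & 1 \end{bmatrix}$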

Solution 4.33 In order to find the inverse of A, we form the augmented matrix
[A|I]. Then we apply elementary row operations to reduce A to its reduced row
echelon form (rref). If A is invertible, this method transforms [A|I] into [I|B],
where B is the inverse of A.
   Now, let
\[
A = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 2 & 3 \\ 5 & 5 & 1 \end{bmatrix}
\]
and form the augmented matrix:
\[
\left[ \begin{array}{ccc|ccc} 1 & 1 & 1 & 1 & 0 & 0 \\ 0 & 2 & 3 & 0 & 1 & 0 \\ 5 & 5 & 1 & 0 & 0 & 1 \end{array} \right]
\]

The reduced row echelon form of the augmented matrix is
\[
\left[ \begin{array}{ccc|ccc} 1 & 0 & 0 & 13/8 & -1/2 & -1/8 \\ 0 & 1 & 0 & -15/8 & 1/2 & 3/8 \\ 0 & 0 & 1 & 5/4 & 0 & -1/4 \end{array} \right]
\]

Therefore, the inverse of A is
\[
A^{-1} = \begin{bmatrix} 13/8 & -1/2 & -1/8 \\ -15/8 & 1/2 & 3/8 \\ 5/4 & 0 & -1/4 \end{bmatrix}.
\]

     Let
\[
B = \begin{bmatrix} 1 & 2 & -3 & 1 \\ -1 & 3 & -3 & -2 \\ 2 & 0 & 1 & 5 \\ 3 & 1 & -2 & 5 \end{bmatrix}
\]
Applying the same method, we observe that B is not invertible because
\[
\mathrm{rref}(B) = \begin{bmatrix} 1 & 0 & 0 & 2 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}.
\]

Note that the rank of B is 3, the number of non-zero rows in rref(B), which is
less than its size 4.
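
The [A|I] → [I|B] procedure described above can be written out in a few lines. The following is a minimal Python sketch in exact rational arithmetic; the function name `inverse` and the list-of-rows representation are choices made for this example, not something taken from the notes.

```python
from fractions import Fraction

def inverse(A):
    """Return the inverse of the square matrix A, or None if A is singular."""
    n = len(A)
    # Build the augmented matrix [A | I] with exact rational entries.
    M = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Look for a non-zero pivot at or below the diagonal.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None                      # rank < n, so no inverse
        M[col], M[pivot] = M[pivot], M[col]  # row swap
        p = M[col][col]
        M[col] = [v / p for v in M[col]]     # scale the pivot row to get a leading 1
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    # The left block is now I; the right block is the inverse.
    return [row[n:] for row in M]

A = [[1, 1, 1], [0, 2, 3], [5, 5, 1]]
B = [[1, 2, -3, 1], [-1, 3, -3, -2], [2, 0, 1, 5], [3, 1, -2, 5]]

print(inverse(A))
print(inverse(B))
```

For A this reproduces the inverse found above, and for B it returns `None` because no pivot can be found in the fourth column.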
Solution 4.34
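a) Let
   \[
   A = \begin{bmatrix} 1 & 0 & 5 \\ 1 & -2 & 3 \\ 2 & 1 & -3 \end{bmatrix}
   \]
   be the coefficient matrix. Then
   \[
   Ax = \begin{bmatrix} 6 \\ 14 \\ -2 \end{bmatrix}
   \]

b)
   \[
   A^{-1} = \begin{bmatrix} 3/28 & 5/28 & 5/14 \\ 9/28 & -13/28 & 1/14 \\ 5/28 & -1/28 & -1/14 \end{bmatrix}
          = \frac{1}{28} \begin{bmatrix} 3 & 5 & 10 \\ 9 & -13 & 2 \\ 5 & -1 & -2 \end{bmatrix}
   \]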





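c) In order to solve the system Ax = b, we multiply both sides by A−1 :
   \[
   A^{-1} A x = A^{-1} b = A^{-1} \begin{bmatrix} 6 \\ 14 \\ -2 \end{bmatrix}
   = \frac{1}{7} \begin{bmatrix} 17 \\ -33 \\ 5 \end{bmatrix}
   \quad\Rightarrow\quad
   I x = x = \frac{1}{7} \begin{bmatrix} 17 \\ -33 \\ 5 \end{bmatrix}.
   \]
   So, the system has the unique solution
   \[
   x = \frac{1}{7} \begin{bmatrix} 17 \\ -33 \\ 5 \end{bmatrix}.
   \]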

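Solution 4.35 We reduce the matrix as follows:
\[
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 4 & 8 \\ 1 & 3 & 9 & 27 \\ 1 & 4 & 16 & 64 \end{bmatrix}
\]
\[
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 3 & 7 \\ 0 & 2 & 8 & 26 \\ 0 & 3 & 15 & 63 \end{bmatrix}
\begin{array}{l} \phantom{0} \\ (R2) - (R1) \\ (R3) - (R1) \\ (R4) - (R1) \end{array}
\]
\[
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 3 & 7 \\ 0 & 0 & 2 & 12 \\ 0 & 0 & 6 & 42 \end{bmatrix}
\begin{array}{l} \phantom{0} \\ \phantom{0} \\ (R3) - 2(R2) \\ (R4) - 3(R2) \end{array}
\]
\[
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 3 & 7 \\ 0 & 0 & 2 & 12 \\ 0 & 0 & 0 & 6 \end{bmatrix}
\begin{array}{l} \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ (R4) - 3(R3) \end{array}
\]
None of these operations affects the determinant, so the determinant of the
original matrix is the same as the determinant of the reduced upper triangular
matrix. This determinant is the product of the diagonal elements,
1 × 1 × 2 × 6 = 12.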
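Solution 4.36 We reduce the matrix as follows:
\[
\begin{bmatrix} 1 & -1 & 1 & -1 \\ 1 & 2 & 4 & 8 \\ 1 & -2 & 4 & -8 \\ 1 & 1 & 1 & 1 \end{bmatrix}
\]
\[
\begin{bmatrix} 1 & -1 & 1 & -1 \\ 0 & 3 & 3 & 9 \\ 0 & -1 & 3 & -7 \\ 0 & 2 & 0 & 2 \end{bmatrix}
\begin{array}{l} \phantom{0} \\ (R2) - (R1) \\ (R3) - (R1) \\ (R4) - (R1) \end{array}
\]
\[
\begin{bmatrix} 1 & -1 & 1 & -1 \\ 0 & 1 & 1 & 3 \\ 0 & -1 & 3 & -7 \\ 0 & 2 & 0 & 2 \end{bmatrix}
\begin{array}{l} \phantom{0} \\ (1/3)(R2) \\ \phantom{0} \\ \phantom{0} \end{array}
\]
\[
\begin{bmatrix} 1 & -1 & 1 & -1 \\ 0 & 1 & 1 & 3 \\ 0 & 0 & 4 & -4 \\ 0 & 0 & -2 & -4 \end{bmatrix}
\begin{array}{l} \phantom{0} \\ \phantom{0} \\ (R3) + (R2) \\ (R4) - 2(R2) \end{array}
\]
\[
\begin{bmatrix} 1 & -1 & 1 & -1 \\ 0 & 1 & 1 & 3 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & -2 & -4 \end{bmatrix}
\begin{array}{l} \phantom{0} \\ \phantom{0} \\ (1/4)(R3) \\ \phantom{0} \end{array}
\]
\[
\begin{bmatrix} 1 & -1 & 1 & -1 \\ 0 & 1 & 1 & 3 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & -6 \end{bmatrix}
\begin{array}{l} \phantom{0} \\ \phantom{0} \\ \phantom{0} \\ (R4) + 2(R3) \end{array}
\]
The two operations that changed the determinant were multiplying the
second row by 1/3 and multiplying the third row by 1/4. Thus the determinant
of the reduced triangular matrix, 1 × 1 × 1 × (−6) = −6, is (1/3)(1/4) × the
determinant of the original matrix.
Hence the determinant of the original matrix is 3 × 4 × (−6) = −72.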
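
The bookkeeping in Solutions 4.35 and 4.36 (which row operations change the determinant, and by how much) can be sketched in code. The Python function below is a minimal illustration, and its name is hypothetical. It differs slightly from the hand computation: instead of scaling pivot rows to 1 and undoing the scaling at the end, it leaves the pivots in place and multiplies them together, which gives the same result.

```python
from fractions import Fraction

def det_by_elimination(A):
    """Determinant via reduction to upper triangular form."""
    M = [[Fraction(v) for v in row] for row in A]
    n = len(M)
    det = Fraction(1)
    for col in range(n):
        # Look for a non-zero pivot at or below the diagonal.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)               # a zero column below the diagonal
        if pivot != col:
            M[col], M[pivot] = M[pivot], M[col]
            det = -det                       # a row swap flips the sign
        det *= M[col][col]                   # accumulate the diagonal entry
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            # Subtracting a multiple of another row leaves det unchanged.
            M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return det

V1 = [[1, 1, 1, 1], [1, 2, 4, 8], [1, 3, 9, 27], [1, 4, 16, 64]]      # Solution 4.35
V2 = [[1, -1, 1, -1], [1, 2, 4, 8], [1, -2, 4, -8], [1, 1, 1, 1]]     # Solution 4.36

print(det_by_elimination(V1))
print(det_by_elimination(V2))
```

The two printed values are 12 and -72, in agreement with the hand reductions.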
Solution 4.37
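a) Expanding along the first row:
\[
\begin{aligned}
\det A &= \det\begin{pmatrix} 2 & 0 & 2 & 4 \\ 0 & 0 & 3 & 2 \\ 2 & 2 & 4 & 4 \\ 3 & 0 & 6 & 2 \end{pmatrix} \\
&= 2\det\begin{pmatrix} 0 & 3 & 2 \\ 2 & 4 & 4 \\ 0 & 6 & 2 \end{pmatrix} - 0
 + 2\det\begin{pmatrix} 0 & 0 & 2 \\ 2 & 2 & 4 \\ 3 & 0 & 2 \end{pmatrix}
 - 4\det\begin{pmatrix} 0 & 0 & 3 \\ 2 & 2 & 4 \\ 3 & 0 & 6 \end{pmatrix} \\
&= 2\left(0 - 3\det\begin{pmatrix} 2 & 4 \\ 0 & 2 \end{pmatrix} + 2\det\begin{pmatrix} 2 & 4 \\ 0 & 6 \end{pmatrix}\right)
 + 2(2)\det\begin{pmatrix} 2 & 2 \\ 3 & 0 \end{pmatrix}
 - 4(3)\det\begin{pmatrix} 2 & 2 \\ 3 & 0 \end{pmatrix} \\
&= 2(0 - 3(4 - 0) + 2(12 - 0)) + 4(0 - 6) - 12(0 - 6) \\
&= 2(12) - 24 + 72 = 72.
\end{aligned}
\]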

b)
\[
\det A = \det\begin{pmatrix} 2 & 0 & 2 & 4 \\ 0 & 0 & 3 & 2 \\ 2 & 2 & 4 & 4 \\ 3 & 0 & 6 & 2 \end{pmatrix}
\]

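Chapter 4. Matrices and Determinants                  4.9. Solutions to Chapter Problems

Expanding along the second column:
\[
\begin{aligned}
\det A &= -0 + 0 - 2\det\begin{pmatrix} 2 & 2 & 4 \\ 0 & 3 & 2 \\ 3 & 6 & 2 \end{pmatrix} + 0 \\
&= -2\left(-0 + 3\det\begin{pmatrix} 2 & 4 \\ 3 & 2 \end{pmatrix} - 2\det\begin{pmatrix} 2 & 2 \\ 3 & 6 \end{pmatrix}\right) \\
&= -2(3(4 - 12) - 2(12 - 6)) \\
&= -2(-24 - 12) = 72
\end{aligned}
\]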
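
As a quick numerical check of the value 72, here is a minimal MATLAB sketch that computes the determinant with the built-in `det` and again with a first-row cofactor expansion. The variable names (`A`, `d_builtin`, `d_cofactor`, `M1j`) are illustrative choices and not part of the problem statement.

```matlab
% Matrix from Solution 4.37
A = [2 0 2 4; 0 0 3 2; 2 2 4 4; 3 0 6 2];

% Built-in determinant (computed numerically, so expect rounding error)
d_builtin = det(A);

% Cofactor expansion along the first row
d_cofactor = 0;
for j = 1:4
    M1j = A(2:4, [1:j-1, j+1:4]);        % delete row 1 and column j
    d_cofactor = d_cofactor + (-1)^(1+j) * A(1,j) * det(M1j);
end

disp([d_builtin, d_cofactor])            % both should be close to 72
```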


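Solution 4.38 Expanding along the second row:
\[
\begin{aligned}
\det\begin{pmatrix} 1 & 0 & 1 \\ 1 & 2 & 3 \\ 3 & 0 & 1 \end{pmatrix}
&= -1\det\begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix} + 2\det\begin{pmatrix} 1 & 1 \\ 3 & 1 \end{pmatrix} - 3\det\begin{pmatrix} 1 & 0 \\ 3 & 0 \end{pmatrix} \\
&= -1 \times 0 + 2 \times (1 - 3) - 3 \times 0 = -4
\end{aligned}
\]
Expanding along the third column:
\[
\begin{aligned}
\det\begin{pmatrix} 1 & 0 & 1 \\ 1 & 2 & 3 \\ 3 & 0 & 1 \end{pmatrix}
&= 1\det\begin{pmatrix} 1 & 2 \\ 3 & 0 \end{pmatrix} - 3\det\begin{pmatrix} 1 & 0 \\ 3 & 0 \end{pmatrix} + 1\det\begin{pmatrix} 1 & 0 \\ 1 & 2 \end{pmatrix} \\
&= 1 \times (0 - 6) - 3 \times 0 + 1 \times (2 - 0) = -4
\end{aligned}
\]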

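Solution 4.39
\[
D = \det\begin{pmatrix} 2 & 0 & 2 & 4 \\ 0 & 0 & 3 & 2 \\ 2 & 2 & 4 & 4 \\ 3 & 0 & 6 & 2 \end{pmatrix}
\quad\rightarrow\quad (1,:) = \tfrac{1}{2}(1,:)
\]
\[
\tfrac{1}{2} D = \det\begin{pmatrix} 1 & 0 & 1 & 2 \\ 0 & 0 & 3 & 2 \\ 2 & 2 & 4 & 4 \\ 3 & 0 & 6 & 2 \end{pmatrix}
\quad\rightarrow\quad
\begin{array}{l} (3,:) = (3,:) - 2(1,:) \\ (4,:) = (4,:) - 3(1,:) \end{array}
\]
\[
\tfrac{1}{2} D = \det\begin{pmatrix} 1 & 0 & 1 & 2 \\ 0 & 0 & 3 & 2 \\ 0 & 2 & 2 & 0 \\ 0 & 0 & 3 & -4 \end{pmatrix}
\quad\rightarrow\quad (4,:) = (4,:) - (2,:)
\]
\[
\tfrac{1}{2} D = \det\begin{pmatrix} 1 & 0 & 1 & 2 \\ 0 & 0 & 3 & 2 \\ 0 & 2 & 2 & 0 \\ 0 & 0 & 0 & -6 \end{pmatrix}
\quad\rightarrow\quad (2,:) \leftrightarrow (3,:)
\]
\[
-\tfrac{1}{2} D = \det\begin{pmatrix} 1 & 0 & 1 & 2 \\ 0 & 2 & 2 & 0 \\ 0 & 0 & 3 & 2 \\ 0 & 0 & 0 & -6 \end{pmatrix}
\]
Therefore, $-\tfrac{1}{2} D = (1)(2)(3)(-6) = -36 \;\rightarrow\; D = 72$.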
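
The bookkeeping in this solution can be mirrored in MATLAB. The sketch below applies the same row operations and keeps a running `factor` such that det(U) = factor × D at every step. The names `U` and `factor` are illustrative assumptions, not notation from the problem.

```matlab
% Matrix from Solution 4.39
U = [2 0 2 4; 0 0 3 2; 2 2 4 4; 3 0 6 2];
factor = 1;                          % invariant: det(U) = factor * D

U(1,:) = U(1,:) / 2;                 % scaling a row scales the determinant
factor = factor / 2;
U(3,:) = U(3,:) - 2*U(1,:);          % adding a multiple of a row: no change
U(4,:) = U(4,:) - 3*U(1,:);
U(4,:) = U(4,:) - U(2,:);
U([2 3],:) = U([3 2],:);             % a row exchange flips the sign
factor = -factor;

% U is now upper triangular, so det(U) is the product of its diagonal
D = prod(diag(U)) / factor;
disp(D)                              % 72
```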





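Solution 4.40 If n is even, then $n/2$ row interchanges are required to reduce the
matrix to a diagonal matrix. Therefore, the determinant is $(-1)^{n/2} a_1 a_2 \cdots a_n$.
    If n is odd, then $(n-1)/2$ row interchanges are required to reduce the matrix to
a diagonal matrix (note that the centre row stays). Therefore, the determinant
is $(-1)^{(n-1)/2} a_1 a_2 \cdots a_n$.
    In other words:
\[
\det\begin{pmatrix} & & & a_n \\ & & {\cdot}^{{\cdot}^{\cdot}} & \\ & a_2 & & \\ a_1 & & & \end{pmatrix}
= (-1)^k \det\begin{pmatrix} a_1 & & & \\ & a_2 & & \\ & & \ddots & \\ & & & a_n \end{pmatrix}
= (-1)^k a_1 a_2 \cdots a_n,
\]
where
\[
k = \begin{cases} \dfrac{n}{2} & n \text{ even} \\[1ex] \dfrac{n-1}{2} & n \text{ odd} \end{cases}
\]
Alternatively, we can expand the determinant on the first row:
\[
\begin{aligned}
\det\begin{pmatrix} & & & a_n \\ & & {\cdot}^{{\cdot}^{\cdot}} & \\ & a_2 & & \\ a_1 & & & \end{pmatrix}
&= (-1)^{1+n} a_n \det\begin{pmatrix} & & & a_{n-1} \\ & & {\cdot}^{{\cdot}^{\cdot}} & \\ & a_2 & & \\ a_1 & & & \end{pmatrix} \\
&= (-1)^{1+n} a_n (-1)^{1+(n-1)} a_{n-1} \det\begin{pmatrix} & & & a_{n-2} \\ & & {\cdot}^{{\cdot}^{\cdot}} & \\ & a_2 & & \\ a_1 & & & \end{pmatrix} \\
&= \cdots \\
&= (-1)^{1+n} a_n (-1)^{1+(n-1)} a_{n-1} \cdots (-1)^{1+1} a_1 \\
&= (-1)^{(1+n)+n+\cdots+2}\, a_1 a_2 \cdots a_n \\
&= (-1)^{\frac{(n+1)(n+2)}{2} - 1}\, a_1 a_2 \cdots a_n
\end{aligned}
\]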
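
The two sign formulas look different, so it is worth checking numerically that they agree. The following MATLAB sketch does this for n = 1, ..., 6, using the sample entries a_i = i (an assumption made only for the test) and `floor(n/2)` to cover both the even and odd cases of k.

```matlab
% Compare det of the anti-diagonal matrix with both closed-form expressions
max_err = 0;
for n = 1:6
    a = 1:n;                              % sample entries a_1, ..., a_n
    A = flipud(diag(a));                  % a_n top right, a_1 bottom left
    k = floor(n/2);                       % n/2 if n even, (n-1)/2 if n odd
    f1 = (-1)^k * prod(a);                % row-interchange formula
    f2 = (-1)^((n+1)*(n+2)/2 - 1) * prod(a);   % first-row expansion formula
    max_err = max([max_err, abs(det(A) - f1), abs(det(A) - f2)]);
end
disp(max_err)                             % zero up to rounding
```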

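Solution 4.41 A matrix is not invertible if and only if its determinant is equal
to zero. Therefore, all we need to do is find the determinant of the matrix, and
determine the values of λ that make the determinant zero.
\[
\begin{aligned}
\det\begin{pmatrix} 2-\lambda & 1 & 0 \\ -1 & -\lambda & 1 \\ 1 & 3 & 1-\lambda \end{pmatrix}
&= (2-\lambda)\det\begin{pmatrix} -\lambda & 1 \\ 3 & 1-\lambda \end{pmatrix} - \det\begin{pmatrix} -1 & 1 \\ 1 & 1-\lambda \end{pmatrix} + 0 \\
&= (2-\lambda)(-\lambda(1-\lambda) - 3) - (-(1-\lambda) - 1) \\
&= (2-\lambda)\left(\lambda^2 - \lambda - 3\right) - (\lambda - 2) \\
&= (2-\lambda)(\lambda^2 - \lambda - 2) \\
&= (2-\lambda)(\lambda - 2)(\lambda + 1) = 0
\end{aligned}
\]
\[
\Rightarrow \quad \lambda = 2, -1
\]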
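
A short MATLAB check of these values: the matrix above is M − λI for the constant matrix `M` below, so we can evaluate the determinant at λ = 2 and λ = −1 directly. The names `M` and `p` are illustrative choices.

```matlab
% The matrix in Solution 4.41 equals M - lambda*I for this M
M = [2 1 0; -1 0 1; 1 3 1];
p = @(lambda) det(M - lambda*eye(3));

disp([p(2), p(-1)])     % both should be (numerically) zero
disp(p(0))              % -4, so the matrix is invertible for lambda = 0
```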





Solution 4.42
a) False     Counterexample:

\[
\det\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} = 0 \neq 1
\]

   The correct statement should be:
   The determinant of a triangular (or diagonal) matrix is the product of
   the entries of its main diagonal.
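b) False     Counterexample:

\[
A = \begin{pmatrix} 1 & 2 \end{pmatrix} \in \mathbb{R}^{1 \times 2}.
\]

   Then
\[
\begin{aligned}
\det(AA^T) &= \det\left(\begin{pmatrix} 1 & 2 \end{pmatrix}\begin{pmatrix} 1 \\ 2 \end{pmatrix}\right) = \det([5]) = 5 \\
\det(A^TA) &= \det\left(\begin{pmatrix} 1 \\ 2 \end{pmatrix}\begin{pmatrix} 1 & 2 \end{pmatrix}\right) = \det\begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix} = 0
\end{aligned}
\]

   Therefore, $\det(AA^T) \neq \det(A^TA)$.
   The correct statement should be:
   For every square matrix A, we have $\det(AA^T) = \det(A^TA)$.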
c) True     Since
\[
\begin{aligned}
\det(B^{-1}AB) &= \det(B^{-1})\det(A)\det(B) = \frac{\det(A)\det(B)}{\det(B)} = \det(A) \\
\det(BAB^{-1}) &= \det(B)\det(A)\det(B^{-1}) = \frac{\det(B)\det(A)}{\det(B)} = \det(A).
\end{aligned}
\]

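d) False     Counterexample: Let

\[
A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\]

   Then

\[
\det(A) = 1, \qquad \det(2A) = \det\begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix} = 4
\quad\Rightarrow\quad \det(2A) \neq 2\det(A).
\]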





   The correct statement should be:
   If an n × n matrix A is multiplied by a scalar c, the determinant of the
   resulting matrix is $\det(cA) = c^n \cdot \det(A)$.

e) True       Suppose the column vectors of $A \in \mathbb{R}^{n \times n}$ are linearly
   independent. Then Ax = 0 has the unique solution x = 0. The row vectors of $A^T$
   will also be linearly independent, hence $A^T x = 0$ also has the unique solution
   x = 0. Furthermore, since the columns are linearly independent,

\[
\det(A) \neq 0 \quad\Rightarrow\quad \det(A^T) = \det(A) \neq 0.
\]
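
The claims in parts (b), (c) and (d) are easy to test numerically. In the MATLAB sketch below, the matrices `S` and `T` in part (c) are arbitrary sample matrices chosen for illustration (with `T` invertible); they are not taken from the problem.

```matlab
% (b) non-square counterexample
A = [1 2];
disp([det(A*A'), det(A'*A)])             % 5 and 0

% (c) det(T^(-1) S T) = det(T S T^(-1)) = det(S)
S = [1 2; 3 4];
T = [2 1; 1 1];                          % invertible, det(T) = 1
disp([det(S), det(T\S*T), det(T*S/T)])   % all close to -2

% (d) det(c*A) = c^n * det(A) for an n-by-n matrix
c = 2;  n = 2;
disp([det(c*eye(n)), c^n * det(eye(n))]) % both 4
```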

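   Solution 4.43 We have
\[
B = \begin{pmatrix} 1 & 1 \\ 1 & 0 \\ 1 & 1 \end{pmatrix}
\quad\text{and}\quad
b = \begin{pmatrix} 1 \\ 1 \\ 0 \end{pmatrix}
\quad\text{so}\quad
B^TB = \begin{pmatrix} 3 & 2 \\ 2 & 2 \end{pmatrix}
\quad\text{and}\quad
B^Tb = \begin{pmatrix} 2 \\ 1 \end{pmatrix}.
\]
   The equation $B^TBx = B^Tb$ has solution $x = \begin{pmatrix} 1 \\ -1/2 \end{pmatrix}$. For this vector
   x we have $Bx = \begin{pmatrix} 1/2 \\ 1 \\ 1/2 \end{pmatrix}$.

   Solution 4.44 We obtain
\[
B = \begin{pmatrix} x_1 & 1 \\ x_2 & 1 \\ x_3 & 1 \end{pmatrix}
\quad\text{and}\quad
b = \begin{pmatrix} y_1 \\ y_2 \\ y_3 \end{pmatrix}
\quad\text{so that}\quad
B^TB = \begin{pmatrix} \sum x_i^2 & \sum x_i \\ \sum x_i & n \end{pmatrix}
\]
   (where n = 3 in this example) and $B^Tb = \begin{pmatrix} \sum x_i y_i \\ \sum y_i \end{pmatrix}$. Thus
   we end up with the same equations as before.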
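
Both solutions can be reproduced in MATLAB by solving the normal equations with the backslash operator. The first part uses the data of Solution 4.43. The second part fits a line through three made-up sample points (`xs`, `ys`), which are assumptions introduced only to illustrate the structure of Solution 4.44.

```matlab
% Data from Solution 4.43
B = [1 1; 1 0; 1 1];
b = [1; 1; 0];

% Solve the normal equations B'*B*x = B'*b
x = (B'*B) \ (B'*b);
disp(x')                 % 1  -0.5
disp((B*x)')             % 0.5  1  0.5

% Line fit in the form of Solution 4.44, with hypothetical sample points
xs = [0; 1; 2];
ys = [1; 3; 4];
Bfit = [xs, ones(3,1)];              % rows (x_i, 1)
coef = (Bfit'*Bfit) \ (Bfit'*ys);    % [slope; intercept]
disp(coef')
```

Here `Bfit'*Bfit` is exactly the 2 × 2 matrix of sums from Solution 4.44, and `Bfit'*ys` is the right-hand side vector of sums.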

   Solution 4.45 The inverse operation to multiplying the first row by two is
   multiplying the first row by 1/2. Therefore the inverse elementary matrix to
\[
\begin{pmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\quad\text{is}\quad
\begin{pmatrix} 1/2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
\]
   The inverse operation to subtracting twice the first row from the second
   row is adding twice the first row to the second row. Therefore the inverse
   elementary matrix to
\[
\begin{pmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\quad\text{is}\quad
\begin{pmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}.
\]
   The inverse operation to exchanging the last two rows is exchanging them
   again. Therefore the inverse elementary matrix to
\[
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}
\quad\text{is the same matrix}\quad
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}.
\]






   In each case one can check directly that the inverse matrices when multiplied
   by the original matrices give the identity matrix.

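   Solution 4.46 We can reduce A to the identity with the following row
   operations: (R2) − (1/2)(R1), (R3) + (1/2)(R1), (R3) − (R2), (R1) − 6(R2),
   (R2) − (5/4)(R3), (R1) + 7(R3), (1/2)(R1), 2(R2), (1/2)(R3). So
\[
\begin{aligned}
&\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1/2 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1/2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 7 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & -5/4 \\ 0 & 0 & 1 \end{pmatrix} \\
&\quad\times
\begin{pmatrix} 1 & -6 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -1 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1/2 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ -1/2 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
A = I
\end{aligned}
\]
   so
\[
\begin{aligned}
A = {}&\begin{pmatrix} 1 & 0 & 0 \\ 1/2 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1/2 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 6 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 5/4 \\ 0 & 0 & 1 \end{pmatrix} \\
&\times
\begin{pmatrix} 1 & 0 & -7 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1/2 & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{pmatrix}
\end{aligned}
\]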
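
A factorization this long is easy to get wrong by hand, so a numerical check is worthwhile. The MATLAB sketch below multiplies the nine inverse elementary matrices in the order written above. Multiplying these factors out gives the matrix shown in the final comment, which should agree with the matrix A of the problem statement. The cell array name `F` is an illustrative choice.

```matlab
% Inverse elementary matrices, in the order they appear in the product for A
F = { [1 0 0; 1/2 1 0; 0 0 1], ...
      [1 0 0; 0 1 0; -1/2 0 1], ...
      [1 0 0; 0 1 0; 0 1 1], ...
      [1 6 0; 0 1 0; 0 0 1], ...
      [1 0 0; 0 1 5/4; 0 0 1], ...
      [1 0 -7; 0 1 0; 0 0 1], ...
      [2 0 0; 0 1 0; 0 0 1], ...
      [1 0 0; 0 1/2 0; 0 0 1], ...
      [1 0 0; 0 1 0; 0 0 2] };

A = eye(3);
for k = 1:numel(F)
    A = A * F{k};        % accumulate the product from left to right
end
disp(A)                  % [2 3 1; 1 2 3; -1 -1 4]
```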




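   Solution 4.47 The determinant is −4 by problem 4.38, and
\[
\begin{aligned}
M_{1,1} &= \begin{pmatrix} 2 & 3 \\ 0 & 1 \end{pmatrix} &
M_{1,2} &= \begin{pmatrix} 1 & 3 \\ 3 & 1 \end{pmatrix} &
M_{1,3} &= \begin{pmatrix} 1 & 2 \\ 3 & 0 \end{pmatrix} \\
M_{2,1} &= \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix} &
M_{2,2} &= \begin{pmatrix} 1 & 1 \\ 3 & 1 \end{pmatrix} &
M_{2,3} &= \begin{pmatrix} 1 & 0 \\ 3 & 0 \end{pmatrix} \\
M_{3,1} &= \begin{pmatrix} 0 & 1 \\ 2 & 3 \end{pmatrix} &
M_{3,2} &= \begin{pmatrix} 1 & 1 \\ 1 & 3 \end{pmatrix} &
M_{3,3} &= \begin{pmatrix} 1 & 0 \\ 1 & 2 \end{pmatrix}
\end{aligned}
\]
   Thus the inverse is
\[
\frac{1}{-4}\begin{pmatrix} 2 & -0 & -2 \\ -(-8) & -2 & -2 \\ -6 & -0 & 2 \end{pmatrix}
= \begin{pmatrix} -\tfrac{1}{2} & 0 & \tfrac{1}{2} \\ -2 & \tfrac{1}{2} & \tfrac{1}{2} \\ \tfrac{3}{2} & 0 & -\tfrac{1}{2} \end{pmatrix}
\]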
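
The cofactor construction of the inverse can be sketched in MATLAB as follows. The loop builds the matrix of cofactors `C` from the 2 × 2 minors, and the inverse is its transpose divided by the determinant. The names `C`, `rows`, `cols` and `Ainv` are illustrative assumptions.

```matlab
% Matrix from problems 4.38 and 4.47
A = [1 0 1; 1 2 3; 3 0 1];
n = size(A, 1);

C = zeros(n);                             % matrix of cofactors
for i = 1:n
    for j = 1:n
        rows = [1:i-1, i+1:n];            % delete row i
        cols = [1:j-1, j+1:n];            % delete column j
        C(i,j) = (-1)^(i+j) * det(A(rows, cols));
    end
end

Ainv = C' / det(A);                       % transposed cofactors over det(A)
disp(Ainv)
disp(A * Ainv)                            % should be close to the identity
```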


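   Solution 4.48 According to Cramer’s rule
\[
\begin{aligned}
x_1 &= \frac{1}{-4}\det\begin{pmatrix} 1 & 0 & 1 \\ 0 & 2 & 3 \\ 1 & 0 & 1 \end{pmatrix} = 0 \\
x_2 &= \frac{1}{-4}\det\begin{pmatrix} 1 & 1 & 1 \\ 1 & 0 & 3 \\ 3 & 1 & 1 \end{pmatrix} = -\frac{3}{2} \\
x_3 &= \frac{1}{-4}\det\begin{pmatrix} 1 & 0 & 1 \\ 1 & 2 & 0 \\ 3 & 0 & 1 \end{pmatrix} = 1
\end{aligned}
\]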
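
Cramer's rule translates into a short MATLAB loop. In the sketch below, the right-hand side `b` is read off from the columns that were replaced in the three determinants above, and `Ak` is an illustrative name for the matrix A with column k replaced by b.

```matlab
A = [1 0 1; 1 2 3; 3 0 1];
b = [1; 0; 1];               % right-hand side, read off from the determinants

x = zeros(3, 1);
for k = 1:3
    Ak = A;
    Ak(:,k) = b;             % replace column k of A by b
    x(k) = det(Ak) / det(A);
end
disp(x')                     % 0  -1.5  1
disp((A*x - b)')             % residual, zero up to rounding
```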




Chapter 5

Complex numbers

5.1       Complex arithmetic
Complex numbers can be thought of as points on the xy plane. The point
(x, y), thought of as a complex number, is written x + iy (or x + jy if you are
an electrical engineer). The i stands for an “imaginary” quantity such that
$i^2 = -1$. If z = x + iy then x is called the real part of z and y is called the
imaginary part of z.
    Complex numbers are added just as if they were vectors in two dimensions.
If z = x + iy and w = s + it, then

                  z + w = (x + iy) + (s + it) = (x + s) + i(y + t)

   To multiply two complex numbers, just remember that $i^2 = -1$. So if
z = x + iy and w = s + it, then

\[
zw = (x + iy)(s + it) = xs + i^2 yt + iys + ixt = (xs - yt) + i(xt + ys)
\]

The modulus of a complex number z, denoted |z|, is simply the length of the
corresponding vector in two dimensions. If z = x + iy, then

\[
|z| = |x + iy| = \sqrt{x^2 + y^2}
\]

An important property is
                                   |zw| = |z||w|
just like for real numbers.
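
MATLAB handles complex numbers directly, so these rules can be tried out at the command line. The values of `z` and `w` below are sample numbers chosen only for illustration.

```matlab
z = 3 + 4i;                 % sample values, not from the text
w = 1 - 2i;

disp(z + w)                 % (3 + 1) + i(4 - 2) = 4 + 2i
disp(z * w)                 % (3*1 - 4*(-2)) + i(3*(-2) + 4*1) = 11 - 2i
disp(abs(z))                % sqrt(3^2 + 4^2) = 5
disp(abs(z*w) - abs(z)*abs(w))   % zero up to rounding
```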
    The complex conjugate of a complex number z, denoted $\bar{z}$, is the reflection
of z across the x axis (also called the real axis). Thus $\overline{x + iy} = x - iy$. In other
words, the complex conjugate is obtained by changing all the i’s to −i’s. We have

\[
\overline{zw} = \bar{z}\,\bar{w}
\]

and
\[
z\bar{z} = |z|^2
\]


Chapter 5. Complex numbers                                   5.2. Complex exponential



This last equality is useful for simplifying fractions of complex numbers by
turning the denominator into a real number, since
\[
\frac{z}{w} = \frac{z\bar{w}}{|w|^2}
\]
For example, to simplify (1 + i)/(1 − i) we can write

\[
\frac{1+i}{1-i} = \frac{(1+i)^2}{(1-i)(1+i)} = \frac{1 - 1 + 2i}{2} = i
\]
A complex number z is real (i.e. the y part in x + iy is zero) whenever $\bar{z} = z$. We
also have the following formulas for the real and imaginary parts. If z = x + iy
then $x = (z + \bar{z})/2$ and $y = (z - \bar{z})/(2i)$.
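
The conjugate formulas can be checked in MATLAB with the built-in `conj`. The sketch below repeats the example (1 + i)/(1 − i) and recovers the real and imaginary parts from z and its conjugate. The variable names are illustrative.

```matlab
z = 1 + 1i;
w = 1 - 1i;

q = z * conj(w) / abs(w)^2;      % z/w written with a real denominator
disp(q)                          % i
disp(z / w)                      % built-in division gives the same value

x = (z + conj(z)) / 2;           % real part of z
y = (z - conj(z)) / (2i);        % imaginary part of z
disp([x, y])                     % 1  1
```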
    Complex numbers are indispensable in many practical calculations. We will
discuss complex exponentials when we talk about differential equations. The
reason why we are interested in them now is the following fact:

Theorem 5.1 If we use complex numbers, every polynomial can be completely
factored. In other words, given a polynomial $\lambda^n + a_{n-1}\lambda^{n-1} + \cdots + a_1\lambda + a_0$,
there exist (possibly complex) numbers $r_1, r_2, \ldots, r_n$ such that

\[
\lambda^n + a_{n-1}\lambda^{n-1} + \cdots + a_1\lambda + a_0 = (\lambda - r_1)(\lambda - r_2)\cdots(\lambda - r_n)
\]

The numbers $r_i$ are the values of λ for which the polynomial is zero.

   So for example the polynomial $\lambda^2 + 1$ has no real roots, since there is no real
number λ for which it is zero. However there are two complex roots, ±i, and

\[
\lambda^2 + 1 = (\lambda + i)(\lambda - i)
\]

Of course, actually finding the roots of a high degree polynomial is difficult.
Here are some points to keep in mind.
   You can always find the roots of a quadratic polynomial using the quadratic
formula. In other words, the roots of $a\lambda^2 + b\lambda + c$ are

\[ \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \]

If the quantity inside the square root is negative, then the roots are complex.
So, for example the roots of $\lambda^2 + \lambda + 1$ are

\[ \frac{-1 \pm \sqrt{1^2 - 4}}{2} = \frac{-1 \pm \sqrt{-3}}{2} = \frac{-1 \pm \sqrt{-1}\sqrt{3}}{2} = -\frac{1}{2} \pm i\frac{\sqrt{3}}{2}. \]
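The same calculation can be sketched in MATLAB, where sqrt of a negative real number returns a complex result. The variable names below are illustrative only.

```matlab
% Quadratic formula applied to lambda^2 + lambda + 1.
a = 1; b = 1; c = 1;
disc = b^2 - 4*a*c;                 % negative here, so the roots are complex

% sqrt of a negative real returns a complex number in MATLAB.
r_plus  = (-b + sqrt(disc)) / (2*a);
r_minus = (-b - sqrt(disc)) / (2*a);

% Compare with the closed form -1/2 +/- i*sqrt(3)/2 from the text.
err = abs(r_plus - (-1/2 + 1i*sqrt(3)/2)) + abs(r_minus - (-1/2 - 1i*sqrt(3)/2));
disp([r_plus; r_minus])
disp(err)
```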

5.2     Complex exponential
We begin by considering the differential equation

\[ y'(t) = y(t) \]


In other words we are looking for a function whose derivative is equal to the
function. The exponential is such a function, so $y(t) = e^t$ is a solution to this
differential equation. So is $y(t) = Ce^t$, where C is a constant.
   Now consider the equation

\[ y'(t) = ay(t) \]

where a is a real number. Then, using the chain rule, we see that $y(t) = Ce^{at}$ is
a solution for any choice of constant C. Notice that the constant C is y(0), the
value of the solution at time zero. If we insist that the solution at time t = 0
take on a particular value

\[ y(0) = y_0, \]

then this forces the constant C to be $y_0$.
   Now consider the equation

\[ y'(t) = iy(t) \]

A solution to this equation is given by

\[ y(t) = \cos(t) + i\sin(t) \]

To check this, just differentiate.

\[ \begin{aligned}
y'(t) &= -\sin(t) + i\cos(t) \\
      &= i(\cos(t) + i\sin(t)) \\
      &= iy(t)
\end{aligned} \]

So it is natural to define the exponential, $e^{it}$, of a purely imaginary number it
to be

\[ e^{it} = \cos(t) + i\sin(t) \]

The complex exponential satisfies the familiar rule $e^{i(s+t)} = e^{is}e^{it}$ since by the
addition formulas for sine and cosine

\[ \begin{aligned}
e^{i(s+t)} &= \cos(s+t) + i\sin(s+t) \\
 &= \cos(s)\cos(t) - \sin(s)\sin(t) + i(\sin(s)\cos(t) + \cos(s)\sin(t)) \\
 &= (\cos(s) + i\sin(s))(\cos(t) + i\sin(t)) \\
 &= e^{is}e^{it}
\end{aligned} \]
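Both the definition of $e^{it}$ and the product rule can be spot-checked in MATLAB, whose built-in exp accepts complex arguments. The angles s and t below are arbitrary values picked for this sketch.

```matlab
% Numerical check of Euler's formula and the rule e^{i(s+t)} = e^{is} e^{it}.
s = 0.7;  t = 1.9;                         % arbitrary test angles (assumption)

euler_err = abs(exp(1i*t) - (cos(t) + 1i*sin(t)));
rule_err  = abs(exp(1i*(s + t)) - exp(1i*s) * exp(1i*t));

disp([euler_err, rule_err])                % both should be at rounding level
```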

Now it is easy to check that solutions to

\[ y'(t) = iby(t) \]

are given by $y(t) = Ce^{ibt}$, where C is an arbitrary constant. Since we are dealing
with complex numbers, we allow C to be complex too.
    The exponential of a number that has both a real and imaginary part is
defined in the natural way.

\[ e^{a+ib} = e^a e^{ib} = e^a(\cos(b) + i\sin(b)) \]


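[Figure 5.1: Polar representation of a complex number. The point $e^{i\theta}$ lies on the unit circle at angle θ; the point $re^{i\theta}$ lies in the same direction at distance r from the origin.]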


and it is easy to check that the solution to the differential equation

\[ y'(t) = \lambda y(t) = (a + ib)y(t) \]

is given by $y(t) = Ce^{\lambda t} = Ce^{(a+ib)t}$. As before, if we insist that the solution at
time t = 0 take on a particular value

\[ y(0) = y_0, \]

then this forces the constant C to be $y_0$.
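One way to convince yourself of this without doing the calculus is to approximate the derivative by a finite difference. The sketch below does this in MATLAB; the values of λ and $y_0$, the time t, and the step h are all arbitrary choices made for illustration.

```matlab
% Check numerically that y(t) = y0*exp(lambda*t) satisfies y' = lambda*y.
lambda = -0.5 + 2i;                % sample complex rate a + ib (assumption)
y0     = 1 - 1i;                   % sample initial value (assumption)
y      = @(t) y0 * exp(lambda * t);

t = 0.8;  h = 1e-6;
dydt = (y(t + h) - y(t - h)) / (2*h);    % centred finite-difference derivative

residual = abs(dydt - lambda * y(t));    % should be small, not exactly zero
disp(residual)
disp(abs(y(0) - y0))                     % initial condition check
```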


5.3     Polar representation of a complex number
Notice that the number $e^{i\theta} = \cos(\theta) + i\sin(\theta)$ lies on the unit circle in the
complex plane, at the point making an angle of θ (radians) with the x axis. If
we multiply $e^{i\theta}$ by a positive real number r, then we obtain the complex number whose
polar co-ordinates are r and θ. This is shown in Figure 5.1.
    Notice that r is exactly the length (sometimes called the modulus) of the
complex number $re^{i\theta}$. The angle θ is called the argument. This representation
of complex numbers makes the definition of complex multiplication more
transparent. We have

\[ r_1 e^{i\theta_1} r_2 e^{i\theta_2} = r_1 r_2 e^{i(\theta_1 + \theta_2)} \]

   In other words, when we multiply two complex numbers, the moduli get
multiplied and the arguments get added.
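In MATLAB the modulus and argument are returned by abs and angle. The following sketch rebuilds a product from the polar data of its factors; the two sample numbers are arbitrary.

```matlab
% Polar form of two complex numbers and of their product.
z1 = 1 + 1i;   z2 = -2 + 3i;       % sample values (assumption)

r1 = abs(z1);  th1 = angle(z1);    % modulus and argument, angle in (-pi, pi]
r2 = abs(z2);  th2 = angle(z2);

% Rebuild the product from the polar data: moduli multiply, arguments add.
w = (r1 * r2) * exp(1i * (th1 + th2));

disp([w, z1 * z2])                 % the two entries should agree
```

Note that th1 + th2 may fall outside the range returned by angle; this does not matter, since arguments that differ by a multiple of 2π describe the same complex number.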


5.4     MATLAB
Complex numbers are handled naturally by MATLAB. Specifying a complex
number is done with the syntax z = 1+3i. The commands real(z) and imag(z)
return the real and imaginary parts of a complex number z. The commands
abs(z) and conj(z) return the length and conjugate of z. Typing sqrt(-1)
returns


ans =
           0 + 1.0000i


5.5      Problems
Problem 5.1 Show that |zw| = |z||w| for complex numbers z and w.

Problem 5.2 Show that $\overline{zw} = \bar{z}\,\bar{w}$ for complex numbers z and w.

Problem 5.3 Show that $z\bar{z} = |z|^2$ for every complex number z.

Problem 5.4 Simplify the following expressions to the form x + iy.

  1. i(2 − 3i)(−2 + i)

  2. $\dfrac{5}{(1 - i)(2 - i)(3 - i)}$

  3. $(-1 + i)^{50}$

Problem 5.5 Prove that

  1. $\left|\dfrac{z_1}{z_2}\right| = \dfrac{|z_1|}{|z_2|}$;

  2. $\overline{(z^n)} = (\bar{z})^n$ for all n ∈ N;

  3. $\dfrac{1}{z} = \bar{z}$ if |z| = 1;

  4. z is either real or pure imaginary if $(\bar{z})^2 = z^2$.

Problem 5.6 There are three values of z which give $z^3 = -i$. Find all three
values in the form x + iy.


5.6      Solutions to Chapter Problems
Solution 5.1 If z = x + iy and w = s + it then zw = xs − yt + i(xt + ys) so
$|zw|^2 = (xs - yt)^2 + (xt + ys)^2 = x^2s^2 + y^2t^2 - 2xyst + x^2t^2 + y^2s^2 + 2xyst = (x^2 + y^2)(s^2 + t^2) = |z|^2|w|^2$.
Taking square roots gives |zw| = |z||w|.
Solution 5.2 If z = x + iy and w = s + it then zw = xs − yt + i(xt + ys)
so $\overline{zw} = xs - yt - i(xt + ys)$. On the other hand $\bar{z}\,\bar{w} = (x - iy)(s - it) = xs - yt - i(xt + ys)$.
Solution 5.3 If z = x + iy then $z\bar{z} = (x + iy)(x - iy) = x^2 + y^2 = |z|^2$.
Solution 5.4



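a)

\[ \begin{aligned}
i(2 - 3i)(-2 + i) &= (3 + 2i)(-2 + i) \\
 &= -6 - 4i + 3i + 2i^2 \\
 &= -8 - i
\end{aligned} \]

b)

\[ \begin{aligned}
\frac{5}{(1-i)(2-i)(3-i)} &= \frac{5(1+i)(2+i)(3+i)}{(1-i)(2-i)(3-i)(1+i)(2+i)(3+i)} \\
 &= \frac{5(1+3i)(3+i)}{2 \cdot 5 \cdot 10} \\
 &= \frac{5(10i)}{100} = i/2
\end{aligned} \]

c)

\[ \begin{aligned}
(-1 + i)^{50} &= \left(\sqrt{2}\,e^{i\frac{3\pi}{4}}\right)^{50} \\
 &= 2^{25} e^{i\frac{75\pi}{2}} \\
 &= 2^{25} e^{i\left(\frac{3\pi}{2} + 36\pi\right)} \\
 &= 2^{25}(-i) = -2^{25} i
\end{aligned} \]
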

Solution 5.5

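a) Let $z_1 = |z_1|e^{i\theta_1}$, $z_2 = |z_2|e^{i\theta_2}$. Then

\[ \left|\frac{z_1}{z_2}\right| = \left|\frac{|z_1|e^{i\theta_1}}{|z_2|e^{i\theta_2}}\right| = \left|\frac{|z_1|}{|z_2|}e^{i(\theta_1 - \theta_2)}\right| = \frac{|z_1|}{|z_2|} \]

     Alternatively,

\[ \left|\frac{z_1}{z_2}\right|^2 = \left(\frac{z_1}{z_2}\right)\overline{\left(\frac{z_1}{z_2}\right)} = \left(\frac{z_1}{z_2}\right)\left(\frac{\bar{z}_1}{\bar{z}_2}\right) = \frac{z_1\bar{z}_1}{z_2\bar{z}_2} = \frac{|z_1|^2}{|z_2|^2} \]

     Since $|z_1/z_2| \ge 0$, $|z_1| \ge 0$ and $|z_2| > 0$, we have that

\[ \left|\frac{z_1}{z_2}\right| = \frac{|z_1|}{|z_2|} \]

b) Let $z = re^{i\theta}$. Then

\[ \overline{z^n} = \overline{r^n e^{in\theta}} = r^n e^{-in\theta} = \left(re^{-i\theta}\right)^n = (\bar{z})^n \]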




c) If |z| = 1, then $z = e^{i\theta}$. So

\[ \frac{1}{z} = \frac{1}{e^{i\theta}} = e^{-i\theta} = \bar{z} \]

d) Let z = x + iy. Then

\[ \begin{aligned}
(\bar{z})^2 &= (x - iy)^2 = x^2 - y^2 - 2ixy \\
z^2 &= (x + iy)^2 = x^2 - y^2 + 2ixy.
\end{aligned} \]

   If $(\bar{z})^2 = z^2$, then $x^2 - y^2 - 2ixy = x^2 - y^2 + 2ixy$, therefore the condition
   is that −2ixy = 2ixy, that is, xy = 0.
   Hence, $(\bar{z})^2 = z^2$ only when z is either real or pure imaginary.

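Solution 5.6 If $z^3 = -i$, then

\[ z = (-i)^{1/3} = \left(e^{i\left(\frac{3\pi}{2} + 2k\pi\right)}\right)^{1/3} \]

\[ \begin{aligned}
k = 0 &\Rightarrow \left(e^{i\frac{3\pi}{2}}\right)^{1/3} = e^{i\frac{\pi}{2}} = i \\
k = 1 &\Rightarrow \left(e^{i\frac{7\pi}{2}}\right)^{1/3} = e^{i\frac{7\pi}{6}} = \cos\frac{7\pi}{6} + i\sin\frac{7\pi}{6} = -\frac{\sqrt{3}}{2} - \frac{1}{2}i \\
k = 2 &\Rightarrow \left(e^{i\frac{11\pi}{2}}\right)^{1/3} = e^{i\frac{11\pi}{6}} = \cos\frac{11\pi}{6} + i\sin\frac{11\pi}{6} = \frac{\sqrt{3}}{2} - \frac{1}{2}i
\end{aligned} \]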
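The three values can be checked in MATLAB by evaluating the formula for k = 0, 1, 2 and cubing the results. This is a small sketch, not part of the original solution.

```matlab
% The three cube roots of -i from k = 0, 1, 2 in the formula above.
k = 0:2;
z = exp(1i * (3*pi/2 + 2*k*pi) / 3);

% Each entry cubed should return -i up to rounding.
disp(z.')
disp(max(abs(z.^3 - (-1i))))
```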




Chapter 6

Eigen-analysis

6.1     Eigenvalues and eigenvectors
Definition 6.1 Let A be an n × n matrix. A number λ and a vector x are
called an eigenvalue eigenvector pair if

(1) x ≠ 0

(2) Ax = λx

    In other words, the action of A on the vector x is to stretch or shrink it by
an amount λ without changing its direction. We say λ is an eigenvalue of A if
there exists a vector x so that λ and x are an eigenvalue eigenvector pair. Notice
that we do not allow x = 0. If we did, any number λ would be an eigenvalue.
However we do allow λ = 0. Saying that 0 is an eigenvalue of A means that
there is a non-zero solution x (the eigenvector) of Ax = 0x = 0. So we see that
0 is an eigenvalue of A precisely when A is not invertible.
    Let’s look at some examples. Consider first the matrix of reflection about
a line making an angle of θ with the x axis shown in Figure 6.1. Let x be
any vector that lies along the line. Then the reflection doesn’t affect x. This
means that Rx = x. In other words, x is an eigenvector with eigenvalue 1. On
the other hand, suppose that y is a vector at right angles to the line. Then
the reflection flips y into minus itself. So Ry = −y. In other words, y is an
eigenvector with eigenvalue −1. If we take any other vector and reflect it, we
don’t end up with a vector that lies on the same line as the original vector.
Thus there are no further eigenvectors or eigenvalues.
    An important point to notice is that the eigenvector is not uniquely determined.
The vector x could be any vector along the line, and y could be any
vector orthogonal to the line. In fact, if we go back to the original definition of
eigenvalue and eigenvector we can see that if λ and x are an eigenvalue eigenvector
pair, then so are λ and sx for any non-zero number s, since sx ≠ 0 and
Asx = sAx = sλx = λsx. So the important thing about an eigenvector is its


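[Figure 6.1: Eigenvalues and eigenvectors of a 2D reflection. The line of reflection makes an angle θ with the x axis; a vector x along the line satisfies Rx = x, and a vector y at right angles to it satisfies Ry = −y.]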


direction, not its length. However, there is no such ambiguity in the definition
of the eigenvalue. The reflection matrix has exactly two eigenvalues: 1 and −1.
    In some sense, the reflection matrix R illustrates the most satisfactory sit-
uation. R is a 2 × 2 matrix with two distinct eigenvalues. The corresponding
eigenvectors x and y are linearly independent (in fact they are orthogonal) and
form a basis for two dimensional space. It will be important in applications to
determine whether or not there exists a basis of eigenvectors of a given matrix.
In this example, x and y are a basis of eigenvectors of R.
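The reflection example can be tried out numerically. The sketch below assumes the standard formula for the matrix of reflection about a line at angle θ, with entries cos 2θ and sin 2θ; the particular angle is an arbitrary choice.

```matlab
% Reflection about the line through the origin at angle theta.
theta = pi/6;                                   % sample angle (assumption)
R = [cos(2*theta)  sin(2*theta);
     sin(2*theta) -cos(2*theta)];               % standard reflection matrix

x = [cos(theta); sin(theta)];                   % vector along the line
y = [-sin(theta); cos(theta)];                  % vector at right angles to it

disp(norm(R*x - x))      % near zero: x is an eigenvector with eigenvalue 1
disp(norm(R*y + y))      % near zero: y is an eigenvector with eigenvalue -1
disp(x' * y)             % near zero: the two eigenvectors are orthogonal
```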
    As our next example, consider the identity matrix I. Since the identity
matrix doesn’t change a vector, we have Ix = x for any vector x. Thus any
vector x is an eigenvector of I with eigenvalue 1. This example shows that a
given eigenvalue may have many eigenvectors associated with it. However, in
this example, there still exists a basis of eigenvectors: any basis at all is a basis
of eigenvectors of I.
    Next we will consider the rotation matrix . . . and run into trouble. Suppose
R is the matrix of rotation by π/4 (i.e. 45°). Then Rx is never in the same
direction as x, since R changes the direction of x by π/4. So R has no real eigenvalues
and no real eigenvectors. This unfortunate state of affairs will cause us to make a
considerable detour into the theory of complex numbers. It turns out that if we
work with complex numbers rather than real numbers, then the rotation matrix
has eigenvalues too.
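MATLAB works over the complex numbers by default, so its eig command will report eigenvalues for the rotation matrix. The sketch below assumes the usual counterclockwise rotation matrix.

```matlab
% Eigenvalues of the rotation by pi/4, computed over the complex numbers.
theta = pi/4;
R = [cos(theta) -sin(theta);
     sin(theta)  cos(theta)];       % counterclockwise rotation matrix

lambda = eig(R);                    % complex conjugate pair

disp(lambda)
disp(abs(lambda))                   % both moduli should be close to 1
```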

6.1.1     Computing the eigenvalues and eigenvectors
We now consider the problem of finding all eigenvalue eigenvector pairs for a
given n × n matrix A. To start, suppose someone tells you that a particular
value λ is an eigenvalue of A. How can you find the corresponding eigenvector
x? This amounts to solving the equation Ax = λx for x. This can be rewritten

                                  (A − λI)x = 0,

where I denotes the identity matrix. In other words x is a non-zero solution to
a homogeneous equation. It can be found by Gaussian elimination.


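   For example, suppose you know that 4 is an eigenvalue of

\[ \begin{bmatrix} 3 & -6 & -7 \\ 1 & 8 & 5 \\ -1 & -2 & 1 \end{bmatrix}. \]

To find the corresponding eigenvector, we must solve

\[ \left( \begin{bmatrix} 3 & -6 & -7 \\ 1 & 8 & 5 \\ -1 & -2 & 1 \end{bmatrix} - 4 \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \right) \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}. \]

This can be written

\[ \begin{bmatrix} -1 & -6 & -7 \\ 1 & 4 & 5 \\ -1 & -2 & -3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}. \]

To solve this we reduce the matrix. This yields

\[ \begin{bmatrix} -1 & -6 & -7 \\ 0 & -2 & -2 \\ 0 & 0 & 0 \end{bmatrix} \]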

The fact that the rank of this matrix is less than 3 confirms that 4 is indeed
an eigenvalue. If the rank of the matrix were 3 then the only solution to the
equation would be 0 which is not a valid eigenvector.
   Taking $x_3 = s$ as a parameter, we find that $x_2 = -s$ and $x_1 = -s$. Thus

\[ x = s \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix} \]

is an eigenvector (for any non-zero choice of s). In particular, we could take
s = 1. Then

\[ x = \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix}. \]
When doing calculations by hand, it makes sense to take the scalar multiple s
in the eigenvector calculation so that it simplifies the form of the eigenvector
(clears common denominators in the components, for example). Eigenvectors
computed in MATLAB are normalized (scaled so that they have length 1).
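The hand calculation above can be mirrored in MATLAB. In this sketch, rank confirms that A − 4I is not invertible and null returns a unit-length solution of the homogeneous system, which is then rescaled to match the vector found by hand. The rescaling step is just a convenience for comparison.

```matlab
% Find an eigenvector of A for the known eigenvalue 4 by solving (A - 4I)x = 0.
A = [3 -6 -7; 1 8 5; -1 -2 1];
M = A - 4*eye(3);

r = rank(M);            % less than 3 when 4 really is an eigenvalue
v = null(M);            % orthonormal basis for the solutions of M*x = 0

% Rescale so the last component is 1, for comparison with the hand calculation.
x = v / v(3);

disp(r)
disp(x.')
disp(norm(A*x - 4*x))   % near zero if x is an eigenvector for eigenvalue 4
```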
    Now that we have a method for finding the eigenvectors once we know the
eigenvalues, the natural question is: Is there a way to determine the eigenvalues
without knowing the eigenvectors? This is where determinants come in. The
number λ is an eigenvalue if there is some non-zero solution x to the equation
(A − λI)x = 0. In other words, λ is an eigenvalue if the matrix (A − λI) is not
invertible. This happens precisely when det(A − λI) = 0.


   This gives us a method for finding the eigenvalues. Compute det(A − λI).
This will be a polynomial in λ. The eigenvalues will be exactly the values of λ
that make this polynomial zero, i.e., the roots of the polynomial.
   So here is the algorithm for finding the eigenvalues and eigenvectors:

(1) Compute det(A − λI) and find the values of λ for which it is zero. These
    are the eigenvalues.

(2) For each eigenvalue, find the non-zero solutions to (A − λI)x = 0. These
    are the eigenvectors.
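The two steps can be sketched in MATLAB for a small matrix. This is only an illustration of the algorithm, not how eig works internally. Two choices here are assumptions made for the sketch: the sample matrix, and the use of svd to pick a solution of (A − λI)x = 0, which tolerates small rounding errors in the computed λ. Note that poly(A) returns the coefficients of det(λI − A), which has the same roots as det(A − λI).

```matlab
% Sketch of the two-step procedure for a small matrix.
A = [2 1; 1 2];                     % sample matrix (assumption)
n = size(A, 1);

% Step 1: poly(A) returns the coefficients of det(lambda*I - A), which has the
% same roots as det(A - lambda*I).
lambdas = roots(poly(A));

% Step 2: for each eigenvalue, solve (A - lambda*I)x = 0.
for k = 1:n
    M = A - lambdas(k)*eye(n);
    [~, ~, V] = svd(M);
    x = V(:, end);                  % direction for the smallest singular value
    fprintf('lambda = %g, residual = %g\n', lambdas(k), norm(A*x - lambdas(k)*x));
end
```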

   I should mention that this is actually only a practical way to find eigenvalues
when the matrix is small. Finding eigenvalues of large matrices is an important
problem and many efficient methods have been developed for use on computers.

Example 6.1 Find the eigenvalues and eigenvectors of

\[ A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}. \]

First we compute

\[ \begin{aligned}
\det(A - \lambda I) &= (2 - \lambda)(2 - \lambda) - 1 \\
 &= \lambda^2 - 4\lambda + 3
\end{aligned} \]

We can find the roots of this polynomial using the quadratic formula or by
factoring it by inspection. We get

\[ \lambda^2 - 4\lambda + 3 = (\lambda - 1)(\lambda - 3), \]

so the eigenvalues are 1 and 3. Now we find the eigenvector for λ = 1. We must
solve (A − I)x = 0. The matrix for this homogeneous system of equations is

\[ \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}. \]

Reducing this matrix yields

\[ \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix} \]

so an eigenvector is

\[ \begin{bmatrix} 1 \\ -1 \end{bmatrix}. \]

Next we find the eigenvector for λ = 3. We must solve (A − 3I)x = 0. The
matrix for this homogeneous system of equations is

\[ \begin{bmatrix} -1 & 1 \\ 1 & -1 \end{bmatrix}. \]

Reducing this matrix yields

\[ \begin{bmatrix} -1 & 1 \\ 0 & 0 \end{bmatrix} \]

so an eigenvector is

\[ \begin{bmatrix} 1 \\ 1 \end{bmatrix}. \]

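Example 6.2 Let us find the eigenvalues and eigenvectors of

\[ A = \begin{bmatrix} 3 & -6 & -7 \\ 1 & 8 & 5 \\ -1 & -2 & 1 \end{bmatrix}. \]

First we compute

\[ \begin{aligned}
\det(A - \lambda I) &= \det \begin{bmatrix} 3-\lambda & -6 & -7 \\ 1 & 8-\lambda & 5 \\ -1 & -2 & 1-\lambda \end{bmatrix} \\
 &= (3-\lambda)((8-\lambda)(1-\lambda) + 10) + 6((1-\lambda) + 5) - 7(-2 + (8-\lambda)) \\
 &= -\lambda^3 + 12\lambda^2 - 44\lambda + 48
\end{aligned} \]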

It is not always easy to find the zeros of a polynomial of degree 3. However, if
we already know one solution, we can find the other two. Sometimes, one can
find one solution by guessing. In this case we already know that 4 is a solution
(since this is the same matrix that appeared in the example in the last section).
We can check this:

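\[ -64 + 12 \times 16 - 44 \times 4 + 48 = 0 \]

This means that $-\lambda^3 + 12\lambda^2 - 44\lambda + 48$ can be factored as
$-\lambda^3 + 12\lambda^2 - 44\lambda + 48 = (\lambda - 4)q(\lambda)$, where q(λ) is a second degree polynomial.
To find q(λ) we can use long division of polynomials.

\[ \begin{array}{r|rrrr}
 & -\lambda^2 & +8\lambda & -12 & \\ \hline
\lambda - 4 & -\lambda^3 & +12\lambda^2 & -44\lambda & +48 \\
 & -\lambda^3 & +4\lambda^2 & & \\ \hline
 & & 8\lambda^2 & -44\lambda & \\
 & & 8\lambda^2 & -32\lambda & \\ \hline
 & & & -12\lambda & +48 \\
 & & & -12\lambda & +48 \\ \hline
 & & & & 0
\end{array} \]

This yields $q(\lambda) = -\lambda^2 + 8\lambda - 12$. This can be factored using the quadratic
formula (or by inspection) as $q(\lambda) = -(\lambda - 2)(\lambda - 6)$. So we conclude

\[ -\lambda^3 + 12\lambda^2 - 44\lambda + 48 = -(\lambda - 4)(\lambda - 2)(\lambda - 6) \]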
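The long division can be reproduced in MATLAB with deconv, which divides one coefficient vector by another and returns the quotient and remainder. A minimal sketch:

```matlab
% Divide -lambda^3 + 12*lambda^2 - 44*lambda + 48 by (lambda - 4).
p = [-1 12 -44 48];          % coefficients, highest degree first
d = [1 -4];                  % the known factor lambda - 4

[q, r] = deconv(p, d);       % quotient and remainder of the division

disp(q)                      % coefficients of q(lambda)
disp(r)                      % all zeros when 4 is a root
disp(roots(q).')             % the remaining two eigenvalues
```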




and the eigenvalues are 2, 4 and 6. Now we find the eigenvector for λ = 2. We
must solve (A − 2I)x = 0. The matrix for this homogeneous system of equations
is

\[ \begin{bmatrix} 1 & -6 & -7 \\ 1 & 6 & 5 \\ -1 & -2 & -1 \end{bmatrix} \]

Reducing this matrix yields

\[ \begin{bmatrix} 1 & -6 & -7 \\ 0 & -8 & -8 \\ 0 & 0 & 0 \end{bmatrix} \]

so an eigenvector is

\[ \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} \]

Next we find the eigenvector for λ = 4. We must solve (A − 4I)x = 0. The
matrix for this homogeneous system of equations is

\[ \begin{bmatrix} -1 & -6 & -7 \\ 1 & 4 & 5 \\ -1 & -2 & -3 \end{bmatrix} \]

Reducing this matrix yields

\[ \begin{bmatrix} -1 & -6 & -7 \\ 0 & -2 & -2 \\ 0 & 0 & 0 \end{bmatrix} \]

so an eigenvector is

\[ \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix} \]

Finally we find the eigenvector for λ = 6. We must solve (A − 6I)x = 0. The
matrix for this homogeneous system of equations is

\[ \begin{bmatrix} -3 & -6 & -7 \\ 1 & 2 & 5 \\ -1 & -2 & -5 \end{bmatrix} \]

Reducing this matrix yields

\[ \begin{bmatrix} -3 & -6 & -7 \\ 0 & 0 & 8 \\ 0 & 0 & 0 \end{bmatrix} \]

so an eigenvector is

\[ \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix} \]
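All three eigenvalue eigenvector pairs can be checked at once in MATLAB by placing the eigenvectors in the columns of a matrix. The names X and lambdas are just labels used in this sketch.

```matlab
% Check the three hand-computed eigenpairs of Example 6.2.
A = [3 -6 -7; 1 8 5; -1 -2 1];
lambdas = [2 4 6];
X = [ 1 -1 -2;
     -1 -1  1;
      1  1  0];              % columns are the eigenvectors found above

% A*X should equal X with column k scaled by lambdas(k).
residual = norm(A*X - X*diag(lambdas));
disp(residual)               % zero if every pair satisfies A*x = lambda*x
disp(rank(X))                % 3 means the eigenvectors form a basis
```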



Example 6.3 (Repeated Eigenvalues) Find the eigenvalues and eigenvectors of

\[ A = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & -1 & 1 \end{bmatrix}. \]

Consider

\[ \det(A - \lambda I) = \det \begin{bmatrix} 1-\lambda & 1 & 0 \\ 0 & 2-\lambda & 0 \\ 0 & -1 & 1-\lambda \end{bmatrix}. \]

In this case it makes sense to expand along the last column. This yields

\[ \det(A - \lambda I) = 0 - 0 + (1-\lambda)(1-\lambda)(2-\lambda) = (1-\lambda)^2(2-\lambda) \]

This is already factored, so the zeros are λ = 1 and λ = 2. Notice that the
factor (1 − λ) occurs to the second power. In this situation there are
fewer distinct eigenvalues than we expect. Let's compute the eigenvectors. To
find the eigenvectors for λ = 1 we must solve the homogeneous equation with
matrix A − I, i.e.,

\[ \begin{bmatrix} 0 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & -1 & 0 \end{bmatrix} \]

This reduces to

\[ \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \]

and we find that there are two parameters in the solution. The set of solutions
in parametric form is

\[ s \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + t \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \]

We can find two linearly independent solutions by setting s = 1, t = 0 and
s = 0, t = 1. This gives

\[ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \]

To find the eigenvectors for λ = 2 we must solve the homogeneous equation
with matrix A − 2I, i.e.,

\[ \begin{bmatrix} -1 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & -1 & -1 \end{bmatrix} \]

This reduces to

\[ \begin{bmatrix} -1 & 1 & 0 \\ 0 & -1 & -1 \\ 0 & 0 & 0 \end{bmatrix} \]



and we find that the set of solutions in parametric form is

\[ s \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix} \]

Setting s = 1 gives the eigenvector

\[ \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix} \]
In this 3 × 3 example, even though there are only two distinct eigenvalues, 1
and 2, there are still three independent eigenvectors (i.e., a basis), because the
eigenvalue 1 has two independent eigenvectors associated to it.

Example 6.4 (Repeated Eigenvalues with “Missing” Eigenvectors) Find
the eigenvalues and eigenvectors of

\[ A = \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}. \]

Here

\[ \det(A - \lambda I) = (\lambda - 2)^2 \]

so there is only one eigenvalue, λ = 2. To find the eigenvectors, we must solve
the homogeneous system with matrix

\[ \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}. \]

The solutions are

\[ s \begin{bmatrix} 1 \\ 0 \end{bmatrix} \]

so there is only one eigenvector direction. So here is a matrix that does not
have a basis of eigenvectors. Matrices like this, that have too few eigenvectors,
will not be studied further in this course, but they do occur in applications.
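The difference between Examples 6.3 and 6.4 can be seen by counting free parameters: the number of independent solutions of (A − λI)x = 0 is n minus the rank of A − λI. The MATLAB sketch below applies this to both examples; the variable names are illustrative.

```matlab
% Count independent eigenvectors for a repeated eigenvalue:
% the solutions of (A - lambda*I)x = 0 have n - rank(A - lambda*I) free parameters.
A3 = [1 1 0; 0 2 0; 0 -1 1];      % Example 6.3, repeated eigenvalue 1
A4 = [2 1; 0 2];                  % Example 6.4, repeated eigenvalue 2

count3 = 3 - rank(A3 - 1*eye(3)); % independent eigenvectors for lambda = 1
count4 = 2 - rank(A4 - 2*eye(2)); % independent eigenvectors for lambda = 2

fprintf('Example 6.3: %d eigenvector direction(s) for lambda = 1\n', count3);
fprintf('Example 6.4: %d eigenvector direction(s) for lambda = 2\n', count4);
```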

6.1.2    Complex eigenvalues and eigenvectors
Since eigenvalues are found as roots of polynomials, we now see that they can be
complex. A discussion of complex eigenvalues and eigenvectors is given below.

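Example 6.5 Let's consider the matrix of rotation by π/2. This is the matrix

\[ A = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}. \]

We compute

\[ \det(A - \lambda I) = \det \begin{bmatrix} -\lambda & 1 \\ -1 & -\lambda \end{bmatrix} = \lambda^2 + 1 \]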

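The roots are ±i so the eigenvalues are i and −i. Now we compute the eigenvector
corresponding to the eigenvalue i. We must solve the homogeneous equation
with matrix

\[ \begin{bmatrix} -i & 1 \\ -1 & -i \end{bmatrix} \]

Notice that we will have to do complex arithmetic to achieve this, since the
matrix now has complex entries. To reduce this matrix we have to add i times
the first row to the second row. This gives

\[ \begin{bmatrix} -i & 1 \\ -1 + (-i^2) & -i + i \end{bmatrix} = \begin{bmatrix} -i & 1 \\ 0 & 0 \end{bmatrix} \]

So if we let $x_2 = s$, then $-ix_1 + s = 0$, or $x_1 = -is$. So the solution is

\[ s \begin{bmatrix} -i \\ 1 \end{bmatrix} \]

and we may choose s = 1. Let's check that this is really an eigenvector:

\[ \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} -i \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ i \end{bmatrix} = i \begin{bmatrix} -i \\ 1 \end{bmatrix}. \]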

To find the other eigenvector we can use a trick. Suppose that the original
matrix A has only real entries. This will always be the case in our examples.
Suppose that A has a complex eigenvalue eigenvector pair λ and x. Then
Ax = λx. Taking the complex conjugate of this equation, we obtain $\bar{A}\bar{x} = \bar{\lambda}\bar{x}$.
(Here conjugating a matrix or a vector just means conjugating each entry).
Now, since A has real entries, $\bar{A} = A$. Hence $A\bar{x} = \bar{\lambda}\bar{x}$. In other words $\bar{\lambda}$ is an
eigenvalue with eigenvector $\bar{x}$. In the present example, we already know that
$\bar{i} = -i$ is an eigenvalue. But now we don’t have to compute the eigenvector that
goes along with it. It is simply the conjugate of the one we already computed.
So the eigenvector corresponding to −i is

\[ \begin{bmatrix} i \\ 1 \end{bmatrix} \]
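The conjugation trick is easy to confirm in MATLAB for the matrix of Example 6.5. In the sketch below, conj conjugates each entry of the vector.

```matlab
% Conjugate eigenpairs of a real matrix, using the rotation by pi/2 from Example 6.5.
A = [0 1; -1 0];
lambda = 1i;
x = [-1i; 1];                        % eigenvector found by hand for lambda = i

% Conjugating every entry should give an eigenvector for conj(lambda) = -i.
xc = conj(x);

disp(norm(A*x  - lambda*x))          % zero: (lambda, x) is an eigenpair
disp(norm(A*xc - conj(lambda)*xc))   % zero: (conj(lambda), conj(x)) is too
```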

    The eigenvalues of A are the zeros or roots of the polynomial det(A − λI).
If we use complex numbers then det(A − λI) can be completely factored, i.e.,

\[ \det(A - \lambda I) = \pm(\lambda - \lambda_1)(\lambda - \lambda_2)\cdots(\lambda - \lambda_n) \]

Finding the roots may be difficult. However for 2 × 2 matrices we may use the
quadratic formula.
    If all the roots are distinct (i.e., $\lambda_i \ne \lambda_j$ for i ≠ j) then the corresponding
eigenvectors $x_1, x_2, \ldots, x_n$ are linearly independent (I didn’t show you why this
is true, so I’m just asking you to believe it!) and therefore form a basis.
    If there are repeated roots, then there are fewer than n distinct eigenvalues.
In this situation, it might happen that there are not enough eigenvectors to


form a basis. However it also might happen that there is more than one independent eigenvector
associated to a given eigenvalue, so that in the end there are enough eigenvectors
to form a basis. Compare Examples 6.3 and 6.4 from earlier, where we saw that
either situation can occur. Unfortunately, the only way we have to find out is
to try to compute them all.

6.1.3     MATLAB
When applied to a square matrix A, eig(A) will return the eigenvalues and the
eigenvectors of A. To use this command enter the following in MATLAB:
>> [P,D] = eig(A)
What will be returned is a matrix P with the normalized (unit length) eigenvectors
of A in its columns, and a matrix D with the eigenvalues of A along
its diagonal. The eigenvalue corresponding to the ith column of P is found
in the (i, i) position of D. Using the eig command above will return complex
eigenvalues and eigenvectors when present.

Example 6.6 Consider

\[ A = \begin{bmatrix} 1 & 4 & 5 \\ 6 & 3 & 9 \\ 2 & 7 & 8 \end{bmatrix} \tag{6.1} \]
We can enter A into MATLAB and find its eigenvectors and eigenvalues with
the following commands:
>> A=[1 4 5; 6 3 9; 2 7 8];
>> [P,D] = eig(A)

P =

      -0.3919   -0.5895       0.2238
      -0.6401   -0.5446      -0.8511
      -0.6609    0.5966       0.4750


D =

      15.9657         0            0
            0   -0.3653            0
            0         0      -3.6004
    These results tell us that A has eigenvectors {v1 , v2 , v3 } and corresponding
eigenvalues {λ1 , λ2 , λ3 } as follows:
\[
\{v_1, v_2, v_3\} \approx \left\{
\begin{bmatrix} -0.3919 \\ -0.6401 \\ -0.6609 \end{bmatrix},
\begin{bmatrix} -0.5895 \\ -0.5446 \\ 0.5966 \end{bmatrix},
\begin{bmatrix} 0.2238 \\ -0.8511 \\ 0.4750 \end{bmatrix}
\right\}
\]
\[
\{\lambda_1, \lambda_2, \lambda_3\} \approx \{15.9657, -0.3653, -3.6004\}
\]
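The output of eig can be checked directly against the definition of an eigenpair. The following is a minimal sketch, assuming only the matrix A of Example 6.6; the variable names residual and colNorms are chosen here for illustration.

```matlab
% Check the eigenpairs returned by eig for the matrix of Example 6.6.
A = [1 4 5; 6 3 9; 2 7 8];
[P, D] = eig(A);

% Each column v of P should satisfy A*v = lambda*v, i.e. A*P = P*D.
residual = norm(A*P - P*D);

% The columns of P should have unit length.
colNorms = sqrt(sum(abs(P).^2, 1));

fprintf('residual = %.2e\n', residual);
disp(colNorms);
```

A residual at the level of rounding error indicates that every column of P is an eigenvector with the eigenvalue in the matching diagonal position of D.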

6.1.4     Problems
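Problem 6.1 Show that $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$ are eigenvectors for the matrix
$\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$. What are the corresponding eigenvalues?

Problem 6.2 Suppose P is a projection matrix. What are the eigenvalues and
eigenvectors of P ?

Problem 6.3 Find the eigenvalues and eigenvectors for
\[
\text{a) } \begin{bmatrix} 0 & 3 \\ 3 & 0 \end{bmatrix} \quad
\text{b) } \begin{bmatrix} -2 & -8 \\ 4 & 10 \end{bmatrix} \quad
\text{c) } \begin{bmatrix} 29 & -10 \\ 105 & -36 \end{bmatrix} \quad
\text{d) } \begin{bmatrix} -9 & -14 \\ 7 & 12 \end{bmatrix}
\]

Problem 6.4 Find the eigenvalues and the corresponding eigenvectors of the
matrix
\[
\begin{bmatrix} 2 & 3 \\ 2 & 1 \end{bmatrix}.
\]

Problem 6.5 Find the eigenvalues and eigenvectors for
\[
\text{a) } \begin{bmatrix} 0 & -1 & 1 \\ 1 & 0 & 2 \\ 2 & 0 & 2 \end{bmatrix} \quad
\text{b) } \begin{bmatrix} 1 & 1 & 1 \\ 1 & 0 & -2 \\ 1 & -1 & 1 \end{bmatrix} \quad
\text{c) } \begin{bmatrix} 7 & -9 & -15 \\ 0 & 4 & 0 \\ 3 & -9 & -11 \end{bmatrix} \quad
\text{d) } \begin{bmatrix} 31 & -100 & 70 \\ 18 & -59 & 42 \\ 12 & -40 & 29 \end{bmatrix}
\]

Problem 6.6 Let P be a 2 × 2 transition probability matrix of the form
\[
\begin{bmatrix} p_{11} & 1 - p_{22} \\ 1 - p_{11} & p_{22} \end{bmatrix}.
\]
Prove that one of the eigenvalues of P must be 1, and that the other one must be in
the interval [−1, 1]. (Hint: Let c = p11 + p22 .)

Problem 6.7 Find the eigenvalues and the corresponding eigenvectors of the
following matrix.
\[
A = \begin{bmatrix} 0 & 1 & -1 \\ 5 & 0 & 1 \\ 0 & 1 & -1 \end{bmatrix}
\]

Problem 6.8 Find the eigenvalues and the corresponding eigenvectors of the
following matrix.
\[
A = \begin{bmatrix} 2 & 0 & 1 \\ 0 & 2 & 1 \\ 1 & 0 & 2 \end{bmatrix}
\]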






Problem 6.9 Is there a rank two matrix M, such that the vectors $\vec{v}_{\mu_1} = [1, 2, 3]^T$
and $\vec{v}_{\mu_2} = [3, 2, 1]^T$ are eigenvectors of M, both corresponding to the same
eigenvalue µ1 = µ2 = −1? If your answer is yes then find such a matrix, and
if your answer is no, then justify your answer.

Problem 6.10 Find the eigenvalues and the corresponding eigenvectors of the
matrix
\[
\begin{bmatrix} 2 & 3 \\ -2 & -1 \end{bmatrix}.
\]

Problem 6.11 Given a 2 × 2 matrix
\[
A = \begin{bmatrix} a & i \\ i & b \end{bmatrix}
\]
   a) Find values for a and b such that $A^2 = A$.
   b) Find values for a and b such that $A^3 = A$ and $A^4 = A$.
   c) Find values for a and b such that $A^8 = A$ and $A^9 = A$.

Problem 6.12 As in the previous problem, consider the 2 × 2 matrix
\[
A = \begin{bmatrix} a & i \\ i & b \end{bmatrix}
\]
    a) Find values for a and b such that the two eigenvalues of A are µ1 = 2 + i and
µ2 = 2 − i.
    b) Find the eigenvectors of A from part a) (where the two eigenvalues
are 2 ± i).


6.2     Eigenanalysis simplifies matrix powers
In the previous section we learnt how to find eigenvalues and eigenvectors of a
matrix, including the case when they are complex. There are two main uses
of this eigenanalysis: efficiently computing powers of a matrix (studied in this
section) and solving systems of differential equations (considered in Section 6.3
below).
    Recall that in the random walk application in Section 4.3 we were interested
in high powers of a matrix, specifically
\[
\lim_{n \to \infty} P^n x^{(0)}
\]
where P is the matrix of transition probabilities and x(0) is the column vector
of initial probabilities. We will explore the use of eigenanalysis to simplify our
understanding of these kinds of problems in two examples below.







Example 6.7 We consider again the sorcerers’ duel in Example 4.1. We will
consider what happens if the duel is allowed to continue without limit until there
is a winner. Rather than compute
\[
\lim_{n \to \infty} P^n x^{(0)}
\]
numerically in MATLAB as was done in Section 4.3 we will use an eigenanalysis
of P to understand this limit. The transition matrix for this problem is
\[
P = \begin{bmatrix} 0 & 1/2 & 0 & 0 \\ 2/3 & 0 & 0 & 0 \\ 1/3 & 0 & 1 & 0 \\ 0 & 1/2 & 0 & 1 \end{bmatrix}
\]

with initial state $x^{(0)} = (1, 0, 0, 0)^T$. The eigenanalysis of P is summarized
below:
\[
\begin{aligned}
\lambda_1 &= 1, & k_1 &= (0, 0, 1, 0)^T \\
\lambda_2 &= 1, & k_2 &= (0, 0, 0, 1)^T \\
\lambda_3 &= \tfrac{1}{\sqrt{3}}, & k_3 &= \left(1 - \sqrt{3},\ \tfrac{2}{\sqrt{3}}(1 - \sqrt{3}),\ \tfrac{1}{\sqrt{3}},\ 1\right)^T \\
\lambda_4 &= -\tfrac{1}{\sqrt{3}}, & k_4 &= \left(1 + \sqrt{3},\ -\tfrac{2}{\sqrt{3}}(1 + \sqrt{3}),\ -\tfrac{1}{\sqrt{3}},\ 1\right)^T
\end{aligned}
\]
Note that λ = 1 is a repeated eigenvalue but there is still a basis of
eigenvectors (the set of eigenvectors associated with λ = 1 is two-dimensional).
We can write
\[
x^{(0)} = c_1 k_1 + c_2 k_2 + c_3 k_3 + c_4 k_4 \tag{6.2}
\]
for some uniquely determined coefficients c1 , c2 , c3 and c4 . Equation (6.2) can
be written in matrix-vector form
\[
T c = x^{(0)} \tag{6.3}
\]
where $c = (c_1, c_2, c_3, c_4)^T$ and T is the 4 × 4 matrix with eigenvectors k1 , k2 , k3
and k4 in its columns. Solving (6.3) (I used MATLAB) gives c1 = 1/2, c2 = 1/2,
c3 ≈ −0.6830 and c4 ≈ 0.1830. With these values of c, (6.2) is a representation
of x(0) as a linear combination of eigenvectors of P . This makes working out
later states x(n) easy, as shown below.
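\[
\begin{aligned}
x^{(1)} &= P x^{(0)} = P (c_1 k_1 + c_2 k_2 + c_3 k_3 + c_4 k_4) \\
        &= c_1 \lambda_1 k_1 + c_2 \lambda_2 k_2 + c_3 \lambda_3 k_3 + c_4 \lambda_4 k_4 \\
        &= c_1 k_1 + c_2 k_2 + \frac{1}{\sqrt{3}} c_3 k_3 - \frac{1}{\sqrt{3}} c_4 k_4
\end{aligned}
\]
where in the middle line we have remembered that the ki vectors are eigenvectors,
so $P k_i = \lambda_i k_i$ for i = 1, 2, 3, 4. Similarly
\[
x^{(2)} = P x^{(1)} = P^2 x^{(0)} = c_1 k_1 + c_2 k_2 + \frac{1}{3} c_3 k_3 + \frac{1}{3} c_4 k_4
\]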




and
\[
x^{(n)} = P^n x^{(0)} = c_1 k_1 + c_2 k_2 + \left(\frac{1}{\sqrt{3}}\right)^n c_3 k_3 + \left(-\frac{1}{\sqrt{3}}\right)^n c_4 k_4 \tag{6.4}
\]

This formula (6.4) is a simple formula for the state at any time n that does
not involve much computational work. In addition, since both $(1/\sqrt{3})^n$ and
$(-1/\sqrt{3})^n$ tend to zero as n → ∞ and c1 = c2 = 1/2, it is easy to see from this
formula that
\[
\lim_{n \to \infty} x^{(n)} = (0, 0, 1/2, 1/2)^T
\]
as found numerically in Example 4.1.
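The steps of this example can be reproduced in MATLAB. The sketch below assumes the eigenvalues and eigenvectors exactly as listed in the example; the names lam, xnFormula, xnDirect and xLimit, and the choice n = 10, are illustrative only.

```matlab
% Transition matrix and initial state from Example 6.7.
P  = [0 1/2 0 0; 2/3 0 0 0; 1/3 0 1 0; 0 1/2 0 1];
x0 = [1; 0; 0; 0];

% Eigenvalues and eigenvectors as listed in the example
% (eigenvectors k1..k4 are the columns of T).
s   = sqrt(3);
lam = [1; 1; 1/s; -1/s];
T   = [0, 0, 1-s,          1+s;
       0, 0, (2/s)*(1-s), -(2/s)*(1+s);
       1, 0, 1/s,         -1/s;
       0, 1, 1,            1];

% Coefficients in x0 = c1*k1 + c2*k2 + c3*k3 + c4*k4, equation (6.3).
c = T \ x0;

% State after n steps: formula (6.4) versus direct matrix powers.
n = 10;
xnFormula = T * (lam.^n .* c);
xnDirect  = P^n * x0;

% In the limit only the terms with eigenvalue 1 survive.
xLimit = T * ([1; 1; 0; 0] .* c);

disp(c');
disp([xnFormula, xnDirect, xLimit]);
```

The two columns for step n should agree to rounding error, which is the point of (6.4): once c is known, any later state costs only a few scalar powers.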

Example 6.8 The weather in Vancouver is either good, average or bad on any
given day. If the weather is good on any day, there is a 60% chance the weather
will be good, 30% chance average, and 10% bad on the next day. If the weather
is average, then on the next day there is a 40% chance of good, 30% of average
and 30% of bad weather. If the weather is bad then on the next day there is
a 40% chance of good, 50% of average and 10% of bad. If the weather is good
today, what will the weather be like a long time from now? We number the
states

1) good

2) average

3) bad

The corresponding transition matrix is
\[
P = \frac{1}{10} \begin{bmatrix} 6 & 4 & 4 \\ 3 & 3 & 5 \\ 1 & 3 & 1 \end{bmatrix}
\]

The initial state is $x^{(0)} = [1, 0, 0]^T$. The eigenanalysis of P is summarized below:
\[
\begin{aligned}
\lambda_1 &= 1, & k_1 &= [1/2, 1/3, 1/6]^T \\
\lambda_2 &= 0.2, & k_2 &= [-2, 1, 1]^T \\
\lambda_3 &= -0.2, & k_3 &= [0, -1, 1]^T
\end{aligned}
\]

As above, we put the eigenvectors into the columns of a matrix T and solve
\[
T c = (1, 0, 0)^T
\]
for $c = (1, -1/4, 1/12)^T$. The right hand side $(1, 0, 0)^T$ of the system above
corresponds to the initial state x(0) of good weather. As before, this gives us
the representation of the initial state as a linear combination of the eigenvectors:
\[
x^{(0)} = k_1 - \frac{1}{4} k_2 + \frac{1}{12} k_3 .
\]




Again, we see that multiplying by P in this representation leads to an easy
formula since the ki are eigenvectors:
\[
x^{(n)} = P^n x^{(0)} = k_1 - \frac{1}{4} (0.2)^n k_2 + \frac{1}{12} (-0.2)^n k_3
\]
Note that after a long time (n → ∞) the second and third terms above tend to
zero, so
\[
\lim_{n \to \infty} x^{(n)} = k_1 = (1/2, 1/3, 1/6)^T
\]
so a long time after the first nice day, the weather will have a 1/2 chance
of being good, 1/3 average and 1/6 bad.
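As a check on the weather example, the representation of the initial state and the formula for x(n) can be compared with direct matrix powers. This is a minimal sketch using the eigenvectors as computed in the example; the names xn, err and the choice n = 5 are illustrative.

```matlab
% Transition matrix and eigenvectors from Example 6.8.
P  = [6 4 4; 3 3 5; 1 3 1] / 10;
k1 = [1/2; 1/3; 1/6];
k2 = [-2; 1; 1];
k3 = [0; -1; 1];
T  = [k1, k2, k3];

% Coefficients of the initial state (good weather) in the eigenvector basis.
x0 = [1; 0; 0];
c  = T \ x0;

% State after n days from the eigenvector formula, compared with P^n * x0.
n   = 5;
xn  = c(1)*k1 + c(2)*(0.2)^n*k2 + c(3)*(-0.2)^n*k3;
err = norm(xn - P^n * x0);

disp(c');
fprintf('difference from P^n*x0: %.2e\n', err);
```

The coefficients come out as 1, −1/4 and 1/12 (up to rounding), matching the representation of x(0) used in the example.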
    Let us consider the example above a bit more closely. Intuitively, the weather
after a long time should not depend on what it was like the day you started. In
fact, we can show that
\[
\lim_{n \to \infty} x^{(n)} = k_1 \tag{6.5}
\]
for any starting probability x(0) (for which the entries must be non-negative and
sum to one).
    To show this, we have to show that writing
\[
x^{(0)} = c_1 k_1 + c_2 k_2 + c_3 k_3 \tag{6.6}
\]
always gives c1 = 1 no matter what x(0) is. Note that the entries of k2 and
k3 sum to zero and the entries of k1 and x(0) sum to one. So by summing the
entries of (6.6) we see that c1 = 1 which shows that (6.5) is true for any starting
probability x(0) as our intuition predicted. The probability k1 that all initial
states tend to is called an equilibrium probability. In some cases described in the
theorem below you can guarantee the existence of an equilibrium probability.
Theorem 6.1 If P is a transition matrix (non-negative entries with all columns
summing to one) that in addition has all positive entries then P has an eigenvalue
1 with a single eigenvector k1 that can be chosen to be a probability vector.
All other eigenvalues λ satisfy |λ| < 1 with eigenvectors with components that
sum to zero. Thus,
\[
\lim_{n \to \infty} x^{(n)} = k_1
\]
for any initial probability x(0) . That is, k1 is an equilibrium probability.
    Note that in Example 6.8 the transition matrix satisfied the conditions
of the theorem and had an equilibrium probability. The transition matrix of
Example 6.7 did not satisfy the conditions of the theorem and does not have an
equilibrium probability (depending on the initial state, you can tend to different
fractions of times that each sorcerer wins the duel).
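The contrast between the two examples can be seen numerically. The sketch below is an illustration under stated assumptions: the three starting vectors and the powers 50 and 200 are arbitrary choices made here, and the names Pw, Pd, startsW and so on are hypothetical.

```matlab
% Example 6.8: all entries positive, so Theorem 6.1 applies.
Pw = [6 4 4; 3 3 5; 1 3 1] / 10;
[V, D] = eig(Pw);
[~, idx] = min(abs(diag(D) - 1));        % locate the eigenvalue 1
k1 = V(:, idx) / sum(V(:, idx));         % rescale so the entries sum to one

startsW = [1 0 0; 0 1 0; 0.2 0.3 0.5]';  % columns are initial probabilities
limitsW = Pw^50 * startsW;               % every column should be close to k1

% Example 6.7: some entries are zero, so the theorem does not apply.
Pd = [0 1/2 0 0; 2/3 0 0 0; 1/3 0 1 0; 0 1/2 0 1];
limitFrom1 = Pd^200 * [1; 0; 0; 0];      % duel starting in state 1
limitFrom2 = Pd^200 * [0; 1; 0; 0];      % duel starting in state 2

disp([k1, limitsW]);
disp([limitFrom1, limitFrom2]);
```

For the weather matrix every column of limitsW matches k1, while for the duel the two long-run vectors differ, so there is no single equilibrium probability.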
    We can summarize the process used to analyze these examples: write
a vector x (the initial probability in our examples) as a linear combination
of eigenvectors of a matrix A (the transition matrix for our examples), then
easily write $A^n x$ as a linear combination of eigenvectors. This process can
be summarized in matrix-vector notation as diagonalization, discussed in more
detail in Section 6.5.1.





6.2.1     Problems
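Problem 6.13 Find the eigenvalues and corresponding eigenvectors of the stochastic
matrix P below. Use the eigenvectors and eigenvalues to describe
\[
\lim_{n \to \infty} P^n x^{(0)},
\]
where $x^{(0)} = [1, 0, 0]^T$, and
\[
P = \begin{bmatrix} 0 & \frac{1}{4} & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{4} & 0 \end{bmatrix}
\]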
Problem 6.14 What condition on a and b is necessary for P to have an
equilibrium probability? Find the equilibrium probability vector (if it exists) for
a = b = 1/4.
\[
P = \begin{bmatrix} \frac{1}{4} & \frac{1}{3} & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{3} & b \\ \frac{1}{4} & \frac{1}{3} & a \end{bmatrix}
\]

6.3      Systems of linear differential equations
Consider the system of differential equations
\[
\begin{aligned}
y_1'(t) &= a_{1,1} y_1(t) + a_{1,2} y_2(t) \\
y_2'(t) &= a_{2,1} y_1(t) + a_{2,2} y_2(t)
\end{aligned} \tag{6.7}
\]
This system of equations describes a situation where we have two quantities y1
and y2 , where the rate of change of each one of the quantities depends on the
values of both. We can rewrite this as a matrix equation. Let y(t) be the vector
\[
y(t) = \begin{bmatrix} y_1(t) \\ y_2(t) \end{bmatrix},
\]
and define the derivative of a vector to be the vector of derivatives, i.e.,
\[
y'(t) = \begin{bmatrix} y_1'(t) \\ y_2'(t) \end{bmatrix}.
\]
Define A to be the matrix
\[
A = \begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{bmatrix}.
\]
Then the system of equations (6.7) can be rewritten
\[
y'(t) = A y(t).
\]
   A general system of linear differential equations has this form, except y(t) is an
n-dimensional vector and A is an n × n matrix. How can we find solutions to such





a system of equations? Taking a hint from the scalar case, we can try to find
solutions of the form
\[
y(t) = e^{\lambda t} x
\]
where x is a fixed vector (not depending on t). With this definition
\[
y'(t) = \lambda e^{\lambda t} x
\]
so that y' = Ay whenever
\[
\lambda e^{\lambda t} x = A e^{\lambda t} x = e^{\lambda t} A x .
\]
Dividing by $e^{\lambda t}$, this condition becomes
\[
\lambda x = A x .
\]
In other words, $y(t) = e^{\lambda t} x$ is a solution exactly whenever λ and x are an
eigenvalue-eigenvector pair for A. So we can find as many solutions as we have
eigenvalue-eigenvector pairs.
    To proceed we first notice that if y1 (t) and y2 (t) are two solutions to the
equation y' = Ay, then a linear combination c1 y1 (t) + c2 y2 (t) is also a solution,
since
\[
\begin{aligned}
\frac{d}{dt} \bigl( c_1 y_1(t) + c_2 y_2(t) \bigr) &= c_1 y_1'(t) + c_2 y_2'(t) \\
  &= c_1 A y_1(t) + c_2 A y_2(t) \\
  &= A \bigl( c_1 y_1(t) + c_2 y_2(t) \bigr)
\end{aligned}
\]
Notice that we are assuming that the constants c1 and c2 do not depend on t.
Similarly, if y1 (t), y2 (t), . . ., yn (t) are n solutions then c1 y1 (t) + c2 y2 (t) + · · · +
cn yn (t) is a solution for any choice of constants c1 , c2 , . . . , cn .
    Now suppose that A is an n × n matrix. Suppose that λ1 , λ2 , . . . , λk are its
eigenvalues with eigenvectors x1 , x2 , . . . , xk . Then we have that for any choice
of constants c1 , c2 , . . . , ck ,
\[
y(t) = c_1 e^{\lambda_1 t} x_1 + c_2 e^{\lambda_2 t} x_2 + \cdots + c_k e^{\lambda_k t} x_k \tag{6.8}
\]
is a solution. Have we found all solutions? In other words, could there be a
solution of the equation that is not of this form, or is every solution of the form
(6.8) for some choice of c1 , c2 , . . . , ck ?
    There is a theorem in differential equations that says that given an initial
condition y0 there is one and only one solution of y' = Ay satisfying y(0) = y0 .
So our theoretical question above is equivalent to the following quite practical
question. Given an initial vector y0 , does there exist a solution y(t) of the form
(6.8) whose value at zero is the given initial condition, i.e., y(0) = y0 ?
    This will be true if, given any vector x0 , one can find c1 , c2 , . . . , ck so that
\[
y(0) = c_1 x_1 + c_2 x_2 + \cdots + c_k x_k = x_0
\]





This is exactly the condition that the eigenvectors form a basis. It turns out
that in the “bad” cases where there are not enough eigenvectors of A to form a
basis, there are solutions that don’t have the form (6.8).
   Now suppose that there are n eigenvectors that do form a basis. How can
we actually find the numbers c1 , c2 , . . . , cn such that
\[
c_1 x_1 + c_2 x_2 + \cdots + c_n x_n = x_0 ?
\]
Just notice that this is a system of linear equations
\[
\begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}
\begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix} = x_0
\]
so you know what to do.
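Example 6.9 Find the general solution to the system of equations
\[
\begin{aligned}
y_1'(t) &= 2 y_1(t) + y_2(t) \\
y_2'(t) &= y_1(t) + 2 y_2(t)
\end{aligned}
\]
This is equivalent to the matrix equation
\[
y'(t) = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} y(t)
\]
The matrix $\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$ has eigenvalues and eigenvectors $\lambda_1 = 1$, $x_1 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$ and
$\lambda_2 = 3$, $x_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$. The eigenvectors x1 and x2 form a basis, so the general
solution is
\[
y(t) = c_1 e^{\lambda_1 t} x_1 + c_2 e^{\lambda_2 t} x_2 = c_1 e^{t} \begin{bmatrix} 1 \\ -1 \end{bmatrix} + c_2 e^{3t} \begin{bmatrix} 1 \\ 1 \end{bmatrix}
\]
Example 6.10 Continue Example 6.9 above and find the solution satisfying
the initial condition
\[
y(0) = \begin{bmatrix} 2 \\ 1 \end{bmatrix}
\]
We have to find constants c1 and c2 so that
\[
c_1 \begin{bmatrix} 1 \\ -1 \end{bmatrix} + c_2 \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \end{bmatrix}
\]
This is the same as solving
\[
\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \end{bmatrix}
\]
The solution is
\[
\begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} 1/2 \\ 3/2 \end{bmatrix}
\]
so that
\[
y(t) = \frac{1}{2} e^{t} \begin{bmatrix} 1 \\ -1 \end{bmatrix} + \frac{3}{2} e^{3t} \begin{bmatrix} 1 \\ 1 \end{bmatrix} .
\]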
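The two steps of Examples 6.9 and 6.10 (solve for the coefficients, then assemble the solution) can be sketched in MATLAB. The function handles y and dy and the test time t = 0.7 are assumptions made for illustration; dy is the term-by-term derivative of the general solution.

```matlab
% Matrix, eigenvalues and eigenvectors from Example 6.9.
A   = [2 1; 1 2];
lam = [1; 3];
X   = [1 1; -1 1];        % eigenvectors x1, x2 in the columns

% Coefficients for the initial condition of Example 6.10.
y0 = [2; 1];
c  = X \ y0;

% y(t) = c1*exp(t)*x1 + c2*exp(3t)*x2 and its derivative.
y  = @(t) X * (c .* exp(lam*t));
dy = @(t) X * (c .* lam .* exp(lam*t));

% Check the initial condition and the equation y' = A*y at a sample time.
t = 0.7;
odeResidual = norm(dy(t) - A*y(t));

disp(c');
fprintf('|y(0) - y0| = %.2e, |y''(t) - A*y(t)| = %.2e\n', norm(y(0) - y0), odeResidual);
```

Using the backslash operator here is just the "you know what to do" step above: it solves the linear system whose coefficient matrix has the eigenvectors as columns.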





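Example 6.11 Now let’s do an example where the eigenvalues are complex.
Consider the equation
\[
y'(t) = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} y(t).
\]
Find the general solution of this differential equation. The matrix $\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$ has
eigenvalues and eigenvectors $\lambda_1 = i$, $x_1 = \begin{bmatrix} -i \\ 1 \end{bmatrix}$ and complex conjugate $\lambda_2 = -i$,
$x_2 = \begin{bmatrix} i \\ 1 \end{bmatrix}$. The eigenvectors x1 and x2 form a basis, so the general solution is
\[
y(t) = c_1 e^{\lambda_1 t} x_1 + c_2 e^{\lambda_2 t} x_2 = c_1 e^{it} \begin{bmatrix} -i \\ 1 \end{bmatrix} + c_2 e^{-it} \begin{bmatrix} i \\ 1 \end{bmatrix}
\]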

    In most applications, the solutions that we are interested in are real. The
solution above looks decidedly complex! Remember, however, that the constants
c1 and c2 can be complex too. Perhaps for special choices of c1 and c2 the
solution will turn out to be real. This is in fact always true when the original
matrix is real. In this case the complex eigenvalues and eigenvectors occur in
conjugate pairs. So if
\[
y_1(t) = e^{\lambda t} x
\]
is a solution, then so is
\[
\bar{y}_1(t) = e^{\bar{\lambda} t} \bar{x}
\]
So if we choose c1 = a/2 and c2 = a/2 for a real number a, then
\[
\begin{aligned}
c_1 e^{\lambda t} x + c_2 e^{\bar{\lambda} t} \bar{x} &= \frac{a}{2} \left( e^{\lambda t} x + e^{\bar{\lambda} t} \bar{x} \right) \\
  &= a \operatorname{Re}(e^{\lambda t} x)
\end{aligned}
\]
(here Re stands for the real part; we used that for a complex number z, $z + \bar{z} = 2 \operatorname{Re} z$).
Similarly, if we choose c1 = a/(2i) and c2 = −a/(2i), then
\[
\begin{aligned}
c_1 e^{\lambda t} x + c_2 e^{\bar{\lambda} t} \bar{x} &= \frac{a}{2i} \left( e^{\lambda t} x - e^{\bar{\lambda} t} \bar{x} \right) \\
  &= a \operatorname{Im}(e^{\lambda t} x)
\end{aligned}
\]
where Im denotes the imaginary part of the argument. The upshot is that the
real and imaginary parts of a solution are also solutions. It’s sometimes easier to
just start with one of the complex solutions and find its real and imaginary parts.
This gives us two real solutions to work with. Notice that it doesn’t matter
which one of the complex solutions we pick. Because they are conjugate, their
real parts are the same, and their imaginary parts differ by only a minus sign.

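Example 6.12 (Continuing Example 6.11) In the example we have
\[
\begin{aligned}
y_1(t) &= e^{it} \begin{bmatrix} -i \\ 1 \end{bmatrix} \\
  &= \begin{bmatrix} -i e^{it} \\ e^{it} \end{bmatrix} \\
  &= \begin{bmatrix} -i(\cos(t) + i \sin(t)) \\ \cos(t) + i \sin(t) \end{bmatrix} \\
  &= \begin{bmatrix} -i \cos(t) + \sin(t) \\ \cos(t) + i \sin(t) \end{bmatrix} \\
  &= \begin{bmatrix} \sin(t) \\ \cos(t) \end{bmatrix} + i \begin{bmatrix} -\cos(t) \\ \sin(t) \end{bmatrix}
\end{aligned}
\]
The real part and imaginary part are
\[
\begin{bmatrix} \sin(t) \\ \cos(t) \end{bmatrix}
\]
and
\[
\begin{bmatrix} -\cos(t) \\ \sin(t) \end{bmatrix}
\]
One can check directly that these are solutions to the original equation. The
general solution can also be written
\[
a_1 \begin{bmatrix} \sin(t) \\ \cos(t) \end{bmatrix} + a_2 \begin{bmatrix} -\cos(t) \\ \sin(t) \end{bmatrix}
\]
The advantage of this way of writing the solution is that if we choose a1 and a2
to be real the solution is real too.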
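The direct check mentioned in this example can be written out as a short worked computation, using the matrix of Example 6.11. For the real part, differentiating and multiplying by the matrix give the same vector, and likewise for the imaginary part:

```latex
\[
\frac{d}{dt} \begin{bmatrix} \sin(t) \\ \cos(t) \end{bmatrix}
= \begin{bmatrix} \cos(t) \\ -\sin(t) \end{bmatrix}
= \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} \sin(t) \\ \cos(t) \end{bmatrix},
\qquad
\frac{d}{dt} \begin{bmatrix} -\cos(t) \\ \sin(t) \end{bmatrix}
= \begin{bmatrix} \sin(t) \\ \cos(t) \end{bmatrix}
= \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} -\cos(t) \\ \sin(t) \end{bmatrix}.
\]
```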
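Example 6.13 (Continuing Example 6.12) Now suppose we want to satisfy
an initial condition. Let’s find the solution y(t) of the equation that satisfies
\[
y(0) = \begin{bmatrix} 2 \\ -2 \end{bmatrix}.
\]
There are two ways to proceed. We could use the complex form of the general
solution. Then we must find c1 and c2 such that
\[
c_1 \begin{bmatrix} -i \\ 1 \end{bmatrix} + c_2 \begin{bmatrix} i \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ -2 \end{bmatrix}
\]
This amounts to solving
\[
\begin{bmatrix} -i & i \\ 1 & 1 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} 2 \\ -2 \end{bmatrix}
\]
The solution is
\[
\begin{aligned}
\begin{bmatrix} c_1 \\ c_2 \end{bmatrix}
  &= \begin{bmatrix} -i & i \\ 1 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 2 \\ -2 \end{bmatrix} \\
  &= \frac{1}{-2i} \begin{bmatrix} 1 & -i \\ -1 & -i \end{bmatrix} \begin{bmatrix} 2 \\ -2 \end{bmatrix} \\
  &= \frac{1}{-2i} \begin{bmatrix} 2 + 2i \\ -2 + 2i \end{bmatrix} \\
  &= \begin{bmatrix} -1 + i \\ -1 - i \end{bmatrix}
\end{aligned}
\]
So c1 = −1 + i and c2 = −1 − i. If we plug these into the expression for the
general solution we get the right answer. However, there is still a fair amount
of complex arithmetic needed to show explicitly that the solution is real. It’s
easier to start with the real solutions. In this approach we must find a1 and a2
so that
\[
a_1 \begin{bmatrix} \sin(0) \\ \cos(0) \end{bmatrix} + a_2 \begin{bmatrix} -\cos(0) \\ \sin(0) \end{bmatrix}
= a_1 \begin{bmatrix} 0 \\ 1 \end{bmatrix} + a_2 \begin{bmatrix} -1 \\ 0 \end{bmatrix}
= \begin{bmatrix} 2 \\ -2 \end{bmatrix}
\]
Thus a1 = a2 = −2 so the solution is
\[
-2 \begin{bmatrix} \sin(t) \\ \cos(t) \end{bmatrix} - 2 \begin{bmatrix} -\cos(t) \\ \sin(t) \end{bmatrix}
= \begin{bmatrix} -2 \sin(t) + 2 \cos(t) \\ -2 \cos(t) - 2 \sin(t) \end{bmatrix}
\]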
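The complex arithmetic in this example is easy to delegate to MATLAB, which handles complex matrices directly. This is a minimal sketch; the names yComplex and yReal and the sample time t = 1.3 are chosen here for illustration.

```matlab
% Complex eigenvectors of [0 1; -1 0] from Example 6.11, as columns.
X  = [-1i 1i; 1 1];
y0 = [2; -2];

% Complex coefficients c1, c2 of the general solution.
c = X \ y0;

% Evaluate the complex form and the real form at a sample time.
t = 1.3;
yComplex = c(1)*exp(1i*t)*X(:,1) + c(2)*exp(-1i*t)*X(:,2);
yReal    = [-2*sin(t) + 2*cos(t); -2*cos(t) - 2*sin(t)];

disp(c.');
fprintf('imaginary part: %.2e, difference: %.2e\n', ...
        norm(imag(yComplex)), norm(yComplex - yReal));
```

The two coefficients come out as complex conjugates of each other, which is why the imaginary parts cancel and the complex form agrees with the real form.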

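Example 6.14 Now let’s do an example where the eigenvalues are complex,
and have both a real and imaginary part. Let’s solve
\[
y'(t) = \begin{bmatrix} -1 & 1 \\ -1 & -1 \end{bmatrix} y(t)
\]
with initial condition
\[
y(0) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}
\]
The first step is to find the eigenvalues and eigenvectors. I’ll omit the computations.
The result is $\lambda_1 = -1 + i$ with eigenvector $x_1 = \begin{bmatrix} 1 \\ i \end{bmatrix}$ and the complex
conjugate $\lambda_2 = -1 - i$ with eigenvector $x_2 = \begin{bmatrix} 1 \\ -i \end{bmatrix}$. Thus a solution is
\[
y_1(t) = e^{(-1+i)t} \begin{bmatrix} 1 \\ i \end{bmatrix}
\]
To find real solutions we calculate the real and imaginary parts of this.
\[
\begin{aligned}
y_1(t) &= \begin{bmatrix} e^{(-1+i)t} \\ i e^{(-1+i)t} \end{bmatrix} \\
  &= \begin{bmatrix} e^{-t} e^{it} \\ i e^{-t} e^{it} \end{bmatrix} \\
  &= \begin{bmatrix} e^{-t} (\cos(t) + i \sin(t)) \\ i e^{-t} (\cos(t) + i \sin(t)) \end{bmatrix} \\
  &= \begin{bmatrix} e^{-t} \cos(t) \\ -e^{-t} \sin(t) \end{bmatrix} + i \begin{bmatrix} e^{-t} \sin(t) \\ e^{-t} \cos(t) \end{bmatrix}
\end{aligned}
\]
So the general solution can be written
\[
a_1 \begin{bmatrix} e^{-t} \cos(t) \\ -e^{-t} \sin(t) \end{bmatrix} + a_2 \begin{bmatrix} e^{-t} \sin(t) \\ e^{-t} \cos(t) \end{bmatrix}
\]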





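To satisfy the initial condition, we need
\[
a_1 \begin{bmatrix} 1 \\ 0 \end{bmatrix} + a_2 \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}
\]
so that a1 = 1 and a2 = 1. Thus the solution is
\[
y(t) = e^{-t} \begin{bmatrix} \cos(t) + \sin(t) \\ -\sin(t) + \cos(t) \end{bmatrix}
\]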
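As an independent check of this example, the formula can be compared with MATLAB's matrix exponential expm. Note that expm is not covered in these notes; it is used here only on the assumption that the reader accepts expm(A*t)*y0 as the solution of y' = Ay with y(0) = y0. The names yFormula, yExpm and the sample time t = 0.9 are illustrative.

```matlab
% Matrix and initial condition from Example 6.14.
A  = [-1 1; -1 -1];
y0 = [1; 1];

% Real-form solution derived in the example.
yFormula = @(t) exp(-t) * [cos(t) + sin(t); -sin(t) + cos(t)];

% Reference value from the matrix exponential at a sample time.
t = 0.9;
yExpm = expm(A*t) * y0;
err = norm(yFormula(t) - yExpm);

fprintf('difference between formula and expm: %.2e\n', err);
```

Agreement to rounding error at a few sample times is good evidence that the real and imaginary parts were extracted correctly.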

6.3.1    Problems
Problem 6.15 Find the general solution to y' = Ay when $A = \begin{bmatrix} -2 & -8 \\ 4 & 10 \end{bmatrix}$.
(Hint: This matrix appeared in Problem 6.3.) Find the solution
satisfying the initial condition $y(0) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$.

Problem 6.16 Find the general solution to y' = Ay when $A = \begin{bmatrix} 1 & -2 \\ 2 & 1 \end{bmatrix}$.
Find both the complex form and the real form. Find the solution satisfying the
initial condition $y(0) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$.

Problem 6.17 Find the general solution to y' = Ay when $A = \begin{bmatrix} 6 & 0 & 13 \\ 5 & 1 & 13 \\ -2 & 0 & -4 \end{bmatrix}$.
Find both the complex form and the real form. Find the solution satisfying the
initial condition $y(0) = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$.

Problem 6.18 Find the general solution of the following system of differential
equations:
\[
y'(t) = \begin{bmatrix} 1 & 1 \\ 5 & 1 \end{bmatrix} y(t)
\]
Problem 6.19 Find the solution of the following system of differential equations
that satisfies the initial condition $y(0) = [0, 1]^T$.
\[
y'(t) = \begin{bmatrix} 3 & -1 \\ 1 & -1 \end{bmatrix} y(t)
\]
Problem 6.20 Find the real form of the solution of the following system of
differential equations that satisfies the initial condition $y(0) = [2, -1]^T$.
\[
y'(t) = \begin{bmatrix} -3 & -1 \\ 7 & 1 \end{bmatrix} y(t)
\]
Problem 6.21 Is it true that every 3 × 3 matrix with real entries always has
at least one real eigenvalue? Why?






              Figure 6.2: Diagrams of capacitors and inductors.


6.4     LCR circuits
6.4.1    Capacitors and inductors
We introduce new elements into our study of electrical circuits: capacitors and
inductors. At an instant in time t, a capacitor acts as a voltage source with
voltage V (t). At a given instant in time an inductor acts as a current source
with current I(t). These circuit elements are shown in Figure 6.2.
    The voltage across the capacitor changes at a rate proportional to the current $i$ through it (with direction as in Figure 6.2):
\[
\frac{dV}{dt} = -\frac{i}{C} \tag{6.9}
\]
where the constant of proportionality $C$ is the capacitance of the capacitor, with MKS units of farads. Note that a capacitor with large capacitance can provide more current for the same rate of drop in voltage.
    The current through an inductor changes at a rate proportional to the voltage $v$ across it (with direction as in Figure 6.2):
\[
\frac{dI}{dt} = -\frac{v}{L} \tag{6.10}
\]
where the constant of proportionality $L$ is the inductance, with MKS units of henrys.
    We consider here, at a high level, how to determine the behaviour in time of circuits with many capacitors, inductors and resistors. At a given instant in time, capacitors can be treated as voltage sources and inductors as current sources. To determine how these sources change in time, the currents through the capacitors (6.9) and the voltages across the inductors (6.10) are needed. However, determining these from the sources is the fundamental circuit problem considered in Section 3.5.3 and Section 4.6. Thus, a system of differential equations can be derived for circuit networks with capacitors and inductors by solving the fundamental problem for the circuit and scaling the result by the capacitance and inductance of the elements. Some specific examples are given below.






                       Figure 6.3: A simple LCR circuit.


6.4.2    Differential equations for LCR circuits
Example 6.15 The simplest example of an LCR circuit is the simple series circuit shown in Figure 6.3. We will derive a system of differential equations for $V(t)$ and $I(t)$ in the circuit and determine which combinations of values for the components $L$, $R$, and $C$ lead to oscillations. The circuit is simple enough that the solution to the fundamental problem can just be written down. The current through the capacitor is $I$ and the voltage $v$ across the inductor is $IR - V$. Using the relationships (6.9) and (6.10) we write
\[
\begin{aligned}
\frac{dV}{dt} &= -I/C \\
\frac{dI}{dt} &= -(IR - V)/L = V/L - IR/L
\end{aligned}
\]
or in matrix vector form
\[
x' = Ax
\]
where $x = (V, I)^T$ and
\[
A = \begin{bmatrix} 0 & -1/C \\ 1/L & -R/L \end{bmatrix}
\]
The eigenvalues $\lambda$ of $A$ satisfy
\[
\det(A - \lambda I) = \lambda^2 + \frac{R}{L}\lambda + \frac{1}{LC} = 0
\]
with solutions
\[
\lambda = \frac{1}{2L}\left(-R \pm \sqrt{R^2 - 4L/C}\right).
\]
Considering this expression carefully you can see that if the $\lambda$ values are real they are negative (since then $\sqrt{R^2 - 4L/C} < R$). If they are complex, the real part is $-R/(2L)$. Thus solutions always decay exponentially. This is to be expected in a circuit with no external power. For the circuit to have oscillations, the eigenvalues must be complex, so
\[
R^2 - 4L/C < 0
\]






                Figure 6.4: Circuit diagram for Example 6.16.


which is commonly rewritten as
\[
\frac{R}{2}\sqrt{\frac{C}{L}} < 1.
\]
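The eigenvalue formula and the oscillation condition for the series circuit can be explored with a few lines of code. This is a minimal sketch; the function names and the component values are hypothetical and chosen only for illustration.

```python
import cmath

def series_lcr_eigenvalues(L, R, C):
    """Roots of lambda^2 + (R/L) lambda + 1/(L C) = 0 for the series circuit."""
    root = cmath.sqrt(R * R - 4.0 * L / C)
    return ((-R + root) / (2.0 * L), (-R - root) / (2.0 * L))

def oscillates(L, R, C):
    """Oscillation condition (R/2) sqrt(C/L) < 1."""
    return (R / 2.0) * (C / L) ** 0.5 < 1.0

# Hypothetical component values, not taken from the notes.
underdamped = series_lcr_eigenvalues(L=1.0, R=1.0, C=1.0)  # complex pair
overdamped = series_lcr_eigenvalues(L=1.0, R=3.0, C=1.0)   # two real roots

print(underdamped, oscillates(1.0, 1.0, 1.0))
print(overdamped, oscillates(1.0, 3.0, 1.0))
```

In the first case the eigenvalues are complex with real part $-R/(2L)$; in the second both are real and negative, so the solution decays without oscillating.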

Example 6.16 Consider the circuit shown in Figure 6.4. The capacitor has
capacitance C = 1 Farad and the inductor inductance L = 1 Henry. The
capacitor is charged up to 12V and then the switch is closed (so E(0) = 12 and
I(0) = 0).

(a) Derive a system of differential equations for I(t) and E(t).

(b) We expect that E(t) and I(t) → 0 as t → ∞ (no external power). Will there
    be oscillations in E and I in time?

Remember that if $i(t)$ is the current upward through the capacitor then
\[
\frac{dE}{dt} = -\frac{i}{C} = -i \tag{6.11}
\]
since $C = 1$. If $e(t)$ is the voltage across the inductor as shown then
\[
\frac{dI}{dt} = -\frac{e}{L} = -e \tag{6.12}
\]
since $L = 1$. It is still necessary to work out $i(t)$ and $e(t)$ in terms of $E(t)$ and $I(t)$. This is the fundamental problem for the circuit. We solved this problem in Example 3.13 in Section 3.6 and considered it from a different perspective in the additional notes to Chapter 4. We found that
\[
\begin{aligned}
i &= \frac{5}{36}E - \frac{1}{6}I \\
e &= \frac{1}{6}E + 6I
\end{aligned}
\]







Figure 6.5: Circuit diagram. (Diagram labels: resistor R1 with I1, V1; resistor R2 with I2, V2; inductor L with I3, V3; capacitor C with I4, V4.)


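Inserting these into (6.11) and (6.12) gives the desired system for $E$ and $I$:
\[
\begin{aligned}
\frac{dE}{dt} &= -\frac{5}{36}E + \frac{1}{6}I \\
\frac{dI}{dt} &= -\frac{1}{6}E - 6I
\end{aligned}
\]
or in vector form $x' = Ax$ where
\[
x = \begin{bmatrix} E \\ I \end{bmatrix} \quad \text{and} \quad A = \begin{bmatrix} -5/36 & 1/6 \\ -1/6 & -6 \end{bmatrix}
\]
An eigenanalysis of $A$ gives $\lambda_1 \approx -0.1436$ and $\lambda_2 \approx -5.9953$. Since these are not complex, the circuit does not exhibit oscillations.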
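The eigenanalysis of this $2 \times 2$ matrix can be reproduced with the quadratic formula applied to $\lambda^2 - \operatorname{tr}(A)\lambda + \det(A) = 0$. The following is a small sketch; the variable names are our own.

```python
import math

# Matrix of the system for (E, I) in Example 6.16.
A = [[-5.0 / 36.0, 1.0 / 6.0],
     [-1.0 / 6.0, -6.0]]

trace = A[0][0] + A[1][1]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]

# A positive discriminant means two distinct real eigenvalues.
discriminant = trace * trace - 4.0 * det
lambda_1 = (trace + math.sqrt(discriminant)) / 2.0
lambda_2 = (trace - math.sqrt(discriminant)) / 2.0

print(discriminant > 0, lambda_1, lambda_2)
```

Because the discriminant is positive, both eigenvalues are real, which matches the conclusion that there are no oscillations.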

6.4.3    Alternate description of LCR circuits
LCR circuits were presented differently in a previous version of the notes. That
approach is reproduced below starting in the next paragraph. It follows the
discussion in Section 3.5.4.
    We now return to the circuit that we discussed previously, shown in Fig-
ure 6.5. Recall that we chose as basic variables I3 and V4 and solved for all the
other variables in terms of these. The result was

\[
\begin{aligned}
I_1 &= I_3 \\
I_2 &= \frac{1}{R_2} V_4 \\
I_4 &= I_3 - \frac{1}{R_2} V_4 \\
V_1 &= R_1 I_3 \\
V_2 &= V_4 \\
V_3 &= -R_1 I_3 - V_4
\end{aligned}
\]

Now we can complete the job and determine I3 and V4 . We have to take into
account now that the currents and voltages are functions of time. The relations





between currents and voltages across capacitors and inductors involve time derivatives.
    If $I$ and $V$ are the current flowing through and the voltage across a capacitor with capacitance $C$, then
\[
\frac{dV}{dt} = \frac{1}{C} I
\]
If $I$ and $V$ are the current flowing through and the voltage across an inductor with inductance $L$, then
\[
\frac{dI}{dt} = \frac{1}{L} V
\]
Notice that we have chosen as basic the variables that get differentiated.
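    Using these relations for $I_3$ and $V_4$ yields
\[
\begin{aligned}
\frac{dI_3}{dt} &= \frac{1}{L} V_3 \\
\frac{dV_4}{dt} &= \frac{1}{C} I_4
\end{aligned}
\]
Now we re-express everything in terms of $I_3$ and $V_4$ using the equations we derived previously.
\[
\begin{aligned}
\frac{dI_3}{dt} &= \frac{1}{L}(-R_1 I_3 - V_4) = -\frac{R_1}{L} I_3 - \frac{1}{L} V_4 \\
\frac{dV_4}{dt} &= \frac{1}{C}\left(I_3 - \frac{1}{R_2} V_4\right) = \frac{1}{C} I_3 - \frac{1}{R_2 C} V_4
\end{aligned}
\]
This can be written as
\[
\begin{bmatrix} I_3' \\ V_4' \end{bmatrix} = \begin{bmatrix} -\frac{R_1}{L} & -\frac{1}{L} \\ \frac{1}{C} & -\frac{1}{R_2 C} \end{bmatrix} \begin{bmatrix} I_3 \\ V_4 \end{bmatrix}
\]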

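Let's try to determine for what values of $R_1$, $L$, $C$ and $R_2$ the circuit exhibits oscillations. Recall that the solution will involve sines and cosines whenever the matrix has complex eigenvalues.
    The characteristic polynomial is $\det(A - \lambda I) = \lambda^2 + b\lambda + c$, where
\[
b = \frac{R_1}{L} + \frac{1}{R_2 C}
\]
and
\[
c = \frac{R_1}{R_2 LC} + \frac{1}{LC}.
\]
The eigenvalues are the roots of this polynomial, given by $(-b \pm \sqrt{b^2 - 4c})/2$. These will be complex if $b^2 < 4c$, i.e., if
\[
\left(\frac{R_1}{L} + \frac{1}{R_2 C}\right)^2 < 4\left(\frac{R_1}{R_2 LC} + \frac{1}{LC}\right)
\]
Moving the term $4R_1/(R_2 LC)$ to the left side, this is equivalent to
\[
\left(\frac{R_1}{L} - \frac{1}{R_2 C}\right)^2 < \frac{4}{LC}.
\]
Notice that this can be achieved by decreasing $R_1$ and increasing $R_2$, since both terms inside the square on the left then tend to zero.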
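The condition $b^2 < 4c$ is easy to test for particular component values. The sketch below uses hypothetical values and hypothetical function names; it simply evaluates $b$ and $c$ as defined above.

```python
def characteristic_coefficients(R1, R2, L, C):
    """Coefficients b and c of lambda^2 + b lambda + c for the circuit of Figure 6.5."""
    b = R1 / L + 1.0 / (R2 * C)
    c = R1 / (R2 * L * C) + 1.0 / (L * C)
    return b, c

def has_oscillations(R1, R2, L, C):
    """True when the eigenvalues are complex, i.e. when b^2 < 4c."""
    b, c = characteristic_coefficients(R1, R2, L, C)
    return b * b < 4.0 * c

# Hypothetical values: small R1 with L = C = 1 oscillates, large R1 does not.
print(has_oscillations(R1=0.5, R2=2.0, L=1.0, C=1.0))
print(has_oscillations(R1=5.0, R2=2.0, L=1.0, C=1.0))
```

With $R_1 = 0.5$ we get $b = 1$ and $c = 1.25$, so $b^2 < 4c$; with $R_1 = 5$ we get $b = 5.5$ and $c = 3.5$, so the eigenvalues are real.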






Figure 6.6: Circuit diagram for Problem 6.23. (Diagram labels: resistor R, capacitor C with voltage V, inductor L with current I.)


6.4.4    Problems
Problem 6.22 In the circuit in Figure 6.5, suppose that $R_1 = R_2 = 1$ ohm, $C = 1$ farad and $L = 1$ henry. If the initial current through the inductor is $I_3(0) = 1$ ampere and the initial voltage across the capacitor is $V_4(0) = 1$ volt, find $I_3(t)$ and $V_4(t)$ for all later times. What is $V_1(t)$?


Problem 6.23 Consider the circuit with diagram in Figure 6.6. Write down the system of equations satisfied by $I$ and $V$. For what values of $L$, $C$ and $R$ does the circuit exhibit oscillations? Suppose that $R = 1$ ohm, $C = 1$ farad and $L = 1$ henry. If the initial current through the inductor is $I(0) = 1$ ampere and the initial voltage across the capacitor is $V(0) = 1$ volt, find $I(t)$ and $V(t)$ for all later times.


6.5     Additional Topics
6.5.1    Diagonalization
Diagonal matrices (that is, matrices that have zero entries except on the diago-
nal) are extremely easy to work with. For a start, the eigenvalues of a diagonal
matrix are exactly the diagonal entries. If

\[
D = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}
\]

then det(D − λI) = (λ1 − λ)(λ2 − λ) · · · (λn − λ) which is zero precisely when
λ equals one of λ1 , λ2 , . . ., λn . The corresponding eigenvectors are just the
standard basis vectors e1 , e2 , . . ., en .







      It is also easy to compute powers of a diagonal matrix. We simply obtain
\[
D^k = \begin{bmatrix} \lambda_1^k & 0 & \cdots & 0 \\ 0 & \lambda_2^k & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n^k \end{bmatrix}
\]

This formula makes it easy to compute the matrix exponential of $D$. Recall that the matrix $e^{tD}$ is defined to be the matrix power series
\[
e^{tD} = I + tD + \frac{t^2}{2} D^2 + \frac{t^3}{3!} D^3 + \cdots
\]
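Using the formula above we find that
\[
\begin{aligned}
e^{tD} &= \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix}
+ \begin{bmatrix} t\lambda_1 & 0 & \cdots & 0 \\ 0 & t\lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & t\lambda_n \end{bmatrix}
+ \begin{bmatrix} \frac{t^2\lambda_1^2}{2} & 0 & \cdots & 0 \\ 0 & \frac{t^2\lambda_2^2}{2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \frac{t^2\lambda_n^2}{2} \end{bmatrix} + \cdots \\
&= \begin{bmatrix} 1 + t\lambda_1 + \frac{t^2\lambda_1^2}{2} + \cdots & 0 & \cdots & 0 \\ 0 & 1 + t\lambda_2 + \frac{t^2\lambda_2^2}{2} + \cdots & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 + t\lambda_n + \frac{t^2\lambda_n^2}{2} + \cdots \end{bmatrix} \\
&= \begin{bmatrix} e^{t\lambda_1} & 0 & \cdots & 0 \\ 0 & e^{t\lambda_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & e^{t\lambda_n} \end{bmatrix}
\end{aligned}
\]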
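To see the power series at work, the sketch below sums a truncated series for a diagonal matrix, using the fact that the diagonal entries of $D^k$ are $\lambda_j^k$, and compares the result with the entries $e^{t\lambda_j}$. The diagonal entries, the value of $t$ and the number of terms are hypothetical choices.

```python
import math

def diag_exponential_series(diagonal, t, terms=30):
    """Diagonal of the partial sum I + tD + (tD)^2/2! + ... for a diagonal D.

    Only the diagonal is stored, since every power of D is again diagonal.
    """
    result = []
    for lam in diagonal:
        total, term = 0.0, 1.0          # term holds (t * lam)^k / k!
        for k in range(terms):
            total += term
            term *= t * lam / (k + 1)
        result.append(total)
    return result

# Hypothetical diagonal entries and time.
diagonal = [-1.0, 0.5, 2.0]
t = 1.5
series_values = diag_exponential_series(diagonal, t)
exact_values = [math.exp(t * lam) for lam in diagonal]
print(series_values)
print(exact_values)
```

The two lists should agree to many digits, since thirty terms of the series are more than enough for arguments of this size.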

Things are not quite so simple for an arbitrary $n \times n$ matrix $A$. However, if $A$ has a basis of eigenvectors then it turns out that there exists an invertible matrix $B$ such that $AB = BD$, where $D$ is the diagonal matrix whose diagonal elements are the eigenvalues of $A$. Multiplying by $B^{-1}$ from the right or from the left gives, respectively,
\[
A = BDB^{-1}, \qquad D = B^{-1}AB
\]
In fact, $B$ is simply the matrix whose columns are the eigenvectors of $A$. In other words, if $x_1, x_2, \ldots, x_n$ are a basis of eigenvectors for $A$, with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$, then $B = \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix}$.
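To see this notice that
\[
\begin{aligned}
AB &= A \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix} \\
&= \begin{bmatrix} Ax_1 & Ax_2 & \cdots & Ax_n \end{bmatrix} \\
&= \begin{bmatrix} \lambda_1 x_1 & \lambda_2 x_2 & \cdots & \lambda_n x_n \end{bmatrix} \\
&= \begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} \\
&= BD
\end{aligned}
\]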
The assumption that A has a basis of eigenvectors implies that the matrix B is
invertible.
   Using the representation $A = BDB^{-1}$ it is easy to calculate powers of $A$. We have
\[
A^2 = BDB^{-1}BDB^{-1} = BDIDB^{-1} = BD^2B^{-1}
\]
and similarly
\[
A^k = BD^kB^{-1}
\]
Therefore we can now also sum the power series for the exponential and obtain
\[
e^{tA} = Be^{tD}B^{-1}
\]
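As a concrete check of $AB = BD$ and $A^k = BD^kB^{-1}$, here is a minimal sketch in exact rational arithmetic. The matrix $A$ below is a hypothetical example of our own (eigenvalues 3 and 1 with eigenvectors $(1, 1)^T$ and $(1, -1)^T$), not one taken from the notes.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse_2x2(M):
    """Inverse of an invertible 2x2 matrix via the adjugate formula."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [[M[1][1] / det, -M[0][1] / det],
            [-M[1][0] / det, M[0][0] / det]]

# Hypothetical example matrix with its eigenvector matrix B and eigenvalue matrix D.
A = [[Fraction(2), Fraction(1)], [Fraction(1), Fraction(2)]]
B = [[Fraction(1), Fraction(1)], [Fraction(1), Fraction(-1)]]
D = [[Fraction(3), Fraction(0)], [Fraction(0), Fraction(1)]]
B_inv = inverse_2x2(B)

k = 5
D_k = [[D[0][0] ** k, Fraction(0)], [Fraction(0), D[1][1] ** k]]
via_diagonalization = matmul(matmul(B, D_k), B_inv)

# Direct computation of A^k by repeated multiplication, for comparison.
direct = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
for _ in range(k):
    direct = matmul(direct, A)

print(via_diagonalization == direct)
```

The diagonalization route needs only the $k$-th powers of two numbers, while the direct route needs $k$ matrix products; this is the saving exploited in the next section.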

6.5.2    Computing high powers of a matrix
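Recall that when we were discussing the random walk problem, we ended up with the problem of computing the limit for large $n$ of $P^n x_0$ where $P$ is the matrix of transition probabilities.
   We can now solve this problem using diagonalization. Let's do an example. Suppose that
\[
P = \begin{bmatrix} \frac{1}{3} & \frac{1}{4} \\ \frac{2}{3} & \frac{3}{4} \end{bmatrix}
\]
We wish to compute $P^n$ for large $n$.
    We begin by diagonalizing $P$. This involves finding the eigenvalues and eigenvectors. I won't give the details of this computation. The results are $\lambda_1 = 1$, $x_1 = \begin{bmatrix} 1 \\ 8/3 \end{bmatrix}$ and $\lambda_2 = 1/12$, $x_2 = \begin{bmatrix} -1 \\ 1 \end{bmatrix}$. So
\[
P = \begin{bmatrix} 1 & -1 \\ \frac{8}{3} & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & \frac{1}{12} \end{bmatrix} \begin{bmatrix} 1 & -1 \\ \frac{8}{3} & 1 \end{bmatrix}^{-1}
\]
and
\[
P^n = \begin{bmatrix} 1 & -1 \\ \frac{8}{3} & 1 \end{bmatrix} \begin{bmatrix} 1^n & 0 \\ 0 & \left(\frac{1}{12}\right)^n \end{bmatrix} \begin{bmatrix} 1 & -1 \\ \frac{8}{3} & 1 \end{bmatrix}^{-1}
\]
But $1^n = 1$ for all $n$ and $\left(\frac{1}{12}\right)^n \to 0$ as $n \to \infty$, since $\frac{1}{12} < 1$. So
\[
\begin{aligned}
\lim_{n \to \infty} P^n &= \begin{bmatrix} 1 & -1 \\ \frac{8}{3} & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & -1 \\ \frac{8}{3} & 1 \end{bmatrix}^{-1} \\
&= \begin{bmatrix} 1 & -1 \\ \frac{8}{3} & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \frac{3}{11} & \frac{3}{11} \\ -\frac{8}{11} & \frac{3}{11} \end{bmatrix} \\
&= \begin{bmatrix} \frac{3}{11} & \frac{3}{11} \\ \frac{8}{11} & \frac{8}{11} \end{bmatrix}
\end{aligned}
\]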
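The limit can be checked by brute force: multiply $P$ by itself many times and compare with the limiting matrix. This sketch uses exact fractions and takes the first column of $P$ to be $(1/3, 2/3)^T$, which is the choice consistent with the stated eigenvalues $1$ and $1/12$ and with the columns of a transition matrix summing to one. The choice of 20 steps is arbitrary.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

P = [[Fraction(1, 3), Fraction(1, 4)],
     [Fraction(2, 3), Fraction(3, 4)]]
limit = [[Fraction(3, 11), Fraction(3, 11)],
         [Fraction(8, 11), Fraction(8, 11)]]

# Compute P^20 by repeated multiplication.
power = [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]]
for _ in range(20):
    power = matmul(power, P)

# Largest entrywise distance between P^20 and the limiting matrix.
max_gap = max(abs(power[i][j] - limit[i][j]) for i in range(2) for j in range(2))
print(float(max_gap))
```

The gap shrinks like $(1/12)^n$, so it is already far below machine precision after 20 steps.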






6.5.3    Another formula for the determinant
If $A$ has a basis of eigenvectors, then we can get another formula for the determinant. Using the multiplicative property of the determinant, we have
\[
\det(A) = \det(BDB^{-1}) = \det(B)\det(D)\det(B)^{-1} = \det(D).
\]
But $\det(D)$ is just the product of the diagonal elements, i.e., the eigenvalues. Thus the determinant of $A$ is the product of its eigenvalues:
\[
\det(A) = \lambda_1 \lambda_2 \cdots \lambda_n.
\]
Actually, this is true even if $A$ doesn't have a basis of eigenvectors and isn't diagonalizable.
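One way to see why the formula holds without diagonalizability is sketched below. It assumes that the eigenvalues $\lambda_1, \ldots, \lambda_n$ are listed with multiplicity as the (possibly complex) roots of the characteristic polynomial; this argument is a standard one and is not taken from these notes.

```latex
% Factor the characteristic polynomial over the complex numbers.
\[
\det(A - \lambda I) = (\lambda_1 - \lambda)(\lambda_2 - \lambda)\cdots(\lambda_n - \lambda)
\]
% Setting \lambda = 0 on both sides gives the determinant formula.
\[
\det(A) = \lambda_1 \lambda_2 \cdots \lambda_n
\]
% Example: a matrix with no basis of eigenvectors.
\[
A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \qquad
\det(A - \lambda I) = (1 - \lambda)^2, \qquad
\lambda_1 = \lambda_2 = 1, \qquad
\det(A) = 1 = \lambda_1 \lambda_2 .
\]
```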

6.5.4    The matrix exponential and differential equations
The matrix exponential $e^{tA}$ can be used to solve the differential equation
\[
y'(t) = Ay(t)
\]
with initial condition
\[
y(0) = x_0
\]
To see this notice that $e^{tA}$ satisfies the differential equation $\frac{d}{dt} e^{tA} = Ae^{tA}$. This follows from the power series representation
\[
e^{tA} = I + tA + \frac{t^2}{2} A^2 + \frac{t^3}{3!} A^3 + \cdots
\]
since
\[
\begin{aligned}
\frac{d}{dt} e^{tA} &= A + \frac{2t}{2} A^2 + \frac{3t^2}{3!} A^3 + \cdots \\
&= A + tA^2 + \frac{t^2}{2!} A^3 + \cdots \\
&= A\left(I + tA + \frac{t^2}{2} A^2 + \cdots\right) \\
&= Ae^{tA}
\end{aligned}
\]
Also
\[
e^{0A} = I
\]
These two facts imply that $y(t) = e^{tA} x_0$ is the solution to our differential equation and initial condition, since $y'(t) = \frac{d}{dt} e^{tA} x_0 = Ae^{tA} x_0 = Ay(t)$ and $y(0) = e^{0A} x_0 = Ix_0 = x_0$.
    The matrix exponential is a nice theoretical construction. However, to actually compute the matrix exponential using diagonalization involves just the same ingredients (computing the eigenvalues and eigenvectors) as our original solution. In fact it is more work.





    However, there is one situation where the matrix exponential gives us something new. This is the situation where $A$ does not have a basis of eigenvectors. The power series definition of the matrix exponential still makes sense, and we can compute it in certain special cases. Consider the matrix $A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$. This matrix does not have a basis of eigenvectors. So it cannot be diagonalized. However, in a homework problem, you showed that $e^{tA} = \begin{bmatrix} e^t & te^t \\ 0 & e^t \end{bmatrix}$. Thus the solution to
\[
y'(t) = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} y(t)
\]
with initial condition
\[
y(0) = \begin{bmatrix} 2 \\ 1 \end{bmatrix}
\]
is
\[
y(t) = e^{tA} \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} e^t & te^t \\ 0 & e^t \end{bmatrix} \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 2e^t + te^t \\ e^t \end{bmatrix}
\]
Notice that this solution involves a power of $t$ in addition to exponentials.
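Since this matrix cannot be diagonalized, the power series itself is the natural thing to test. The sketch below sums a truncated series for $e^{tA}$ and compares it with the closed form quoted above; the time $t = 0.7$ and the number of terms are hypothetical choices.

```python
import math

def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def exp_series(A, t, terms=40):
    """Partial sum of I + tA + (tA)^2/2! + ... for a 2x2 matrix A."""
    total = [[1.0, 0.0], [0.0, 1.0]]
    term = [[1.0, 0.0], [0.0, 1.0]]      # holds (tA)^k / k!
    for k in range(1, terms):
        term = [[entry * t / k for entry in row] for row in matmul(term, A)]
        total = [[total[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return total

A = [[1.0, 1.0], [0.0, 1.0]]
t = 0.7
E = exp_series(A, t)
closed_form = [[math.exp(t), t * math.exp(t)], [0.0, math.exp(t)]]

# Solution of the initial value problem with y(0) = (2, 1).
y = [E[0][0] * 2.0 + E[0][1] * 1.0, E[1][0] * 2.0 + E[1][1] * 1.0]
print(E, y)
```

The off-diagonal entry of the computed matrix reproduces the $te^t$ term, which is where the power of $t$ in the solution comes from.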

6.5.5    Converting higher order equations into first order
         systems
So far we have only considered first order differential equations. In other words, the equations have only involved first derivatives $y'(t)$ and not higher derivatives like $y''(t)$. However higher order equations, especially second order equations, occur often in practical problems. In this section we will show that a higher order linear differential equation can be converted into an equivalent first order system.
    Suppose we want to solve the equation
\[
y''(t) + ay'(t) + by(t) = 0
\]
with initial conditions
\[
\begin{aligned}
y(0) &= y_0 \\
y'(0) &= y_0'
\end{aligned}
\]
Define the new functions $z_1(t)$ and $z_2(t)$ by
\[
\begin{aligned}
z_1(t) &= y(t) \\
z_2(t) &= y'(t)
\end{aligned}
\]
Then
\[
\begin{aligned}
z_1'(t) &= y'(t) = z_2(t) \\
z_2'(t) &= y''(t) = -ay'(t) - by(t) = -az_2(t) - bz_1(t)
\end{aligned}
\]
and
\[
\begin{aligned}
z_1(0) &= y_0 \\
z_2(0) &= y_0'
\end{aligned}
\]
   In other words the vector $\begin{bmatrix} z_1(t) \\ z_2(t) \end{bmatrix}$ satisfies the equation
\[
\frac{d}{dt} \begin{bmatrix} z_1(t) \\ z_2(t) \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -b & -a \end{bmatrix} \begin{bmatrix} z_1(t) \\ z_2(t) \end{bmatrix}
\]
with initial condition
\[
\begin{bmatrix} z_1(0) \\ z_2(0) \end{bmatrix} = \begin{bmatrix} y_0 \\ y_0' \end{bmatrix}.
\]
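Example 6.17 Suppose we want to solve the second order equation
\[ y'' + 4y' + y = 0 \]
with initial conditions
\[ y(0) = 1, \qquad y'(0) = 0. \]
If we let $z_1(t) = y(t)$ and $z_2(t) = y'(t)$ then
\[ \frac{d}{dt}\begin{bmatrix} z_1(t) \\ z_2(t) \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -1 & -4 \end{bmatrix}\begin{bmatrix} z_1(t) \\ z_2(t) \end{bmatrix} \]
with initial condition
\[ \begin{bmatrix} z_1(0) \\ z_2(0) \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}. \]
To solve this we first find the eigenvalues and eigenvectors. They are
\[ \lambda_1 = -2+\sqrt{3},\ \mathbf{x}_1 = \begin{bmatrix} 1 \\ -2+\sqrt{3} \end{bmatrix} \quad\text{and}\quad \lambda_2 = -2-\sqrt{3},\ \mathbf{x}_2 = \begin{bmatrix} 1 \\ -2-\sqrt{3} \end{bmatrix}. \]
So the general solution is
\[ c_1 e^{(-2+\sqrt{3})t}\begin{bmatrix} 1 \\ -2+\sqrt{3} \end{bmatrix} + c_2 e^{(-2-\sqrt{3})t}\begin{bmatrix} 1 \\ -2-\sqrt{3} \end{bmatrix}. \]
To satisfy the initial condition, we need
\[ c_1 \begin{bmatrix} 1 \\ -2+\sqrt{3} \end{bmatrix} + c_2 \begin{bmatrix} 1 \\ -2-\sqrt{3} \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}. \]
The solution is
\[ \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} \sqrt{3}/3 + 1/2 \\ -\sqrt{3}/3 + 1/2 \end{bmatrix}. \]
Thus
\[ \begin{bmatrix} z_1(t) \\ z_2(t) \end{bmatrix} = (\sqrt{3}/3+1/2)\,e^{(-2+\sqrt{3})t}\begin{bmatrix} 1 \\ -2+\sqrt{3} \end{bmatrix} + (-\sqrt{3}/3+1/2)\,e^{(-2-\sqrt{3})t}\begin{bmatrix} 1 \\ -2-\sqrt{3} \end{bmatrix} \]
and so
\[ y(t) = z_1(t) = (\sqrt{3}/3 + 1/2)\,e^{(-2+\sqrt{3})t} + (-\sqrt{3}/3 + 1/2)\,e^{(-2-\sqrt{3})t}. \]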
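The formula for $y(t)$ in Example 6.17 can be checked numerically. The following is a minimal sketch, not part of the original notes: it evaluates the closed form and its derivatives and confirms that the initial conditions hold and that $y'' + 4y' + y$ vanishes up to rounding. The function names are chosen here for illustration.

```python
import math

SQRT3 = math.sqrt(3.0)
LAM1, LAM2 = -2.0 + SQRT3, -2.0 - SQRT3      # eigenvalues of [[0, 1], [-1, -4]]
C1, C2 = SQRT3 / 3 + 0.5, -SQRT3 / 3 + 0.5   # constants found in Example 6.17


def y(t):
    """Closed-form solution y(t) = z1(t)."""
    return C1 * math.exp(LAM1 * t) + C2 * math.exp(LAM2 * t)


def dy(t):
    """First derivative y'(t) = z2(t)."""
    return C1 * LAM1 * math.exp(LAM1 * t) + C2 * LAM2 * math.exp(LAM2 * t)


def d2y(t):
    """Second derivative y''(t)."""
    return C1 * LAM1 ** 2 * math.exp(LAM1 * t) + C2 * LAM2 ** 2 * math.exp(LAM2 * t)


def residual(t):
    """Left side of y'' + 4y' + y = 0; should vanish up to rounding."""
    return d2y(t) + 4.0 * dy(t) + y(t)


if __name__ == "__main__":
    print("y(0)  =", y(0.0))    # should be close to 1
    print("y'(0) =", dy(0.0))   # should be close to 0
    for t in (0.0, 0.5, 2.0):
        print("t =", t, "residual =", residual(t))
```

Differentiating the exponentials by hand, rather than using finite differences, keeps the check accurate to machine precision.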


[Figure 6.7: A simple mass-spring system. Labels: spring $k, l$; weight $m$; position $x(t)$.]
   Actually, to solve the equation
\[ y''(t) + a y'(t) + b y(t) = 0 \]
it is not really necessary to turn it into a first order system. We can simply try
to find solutions of the form $y(t) = e^{\lambda t}$. If we plug this into the left side of the
equation we get $(\lambda^2 + a\lambda + b)e^{\lambda t}$, which is zero if $\lambda$ is a root of $\lambda^2 + a\lambda + b$.
This polynomial has two roots, which (when they are distinct) yield two solutions.
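The trial-solution idea can be sketched in a few lines of Python. This is an illustration only, with function names invented here: it finds the roots of $\lambda^2 + a\lambda + b$ with the quadratic formula (using complex arithmetic so that a negative discriminant is handled) and checks that plugging $e^{\lambda t}$ into the equation gives zero.

```python
import cmath


def characteristic_roots(a, b):
    """Roots of lambda^2 + a*lambda + b = 0 (complex if the discriminant is negative)."""
    disc = cmath.sqrt(a * a - 4.0 * b)
    return (-a + disc) / 2.0, (-a - disc) / 2.0


def trial_solution_residual(lam, a, b, t):
    """Plug y = exp(lam*t) into y'' + a y' + b y: (lam^2 + a lam + b) exp(lam t)."""
    return (lam * lam + a * lam + b) * cmath.exp(lam * t)


if __name__ == "__main__":
    # Coefficients of Example 6.17: y'' + 4y' + y = 0
    r1, r2 = characteristic_roots(4.0, 1.0)
    print(r1, r2)  # approximately -2 + sqrt(3) and -2 - sqrt(3)
    print(abs(trial_solution_residual(r1, 4.0, 1.0, 1.0)))  # close to 0
```

The two roots agree with the eigenvalues of the matrix in Example 6.17, as they should, since both come from the same polynomial.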
     Still, the procedure of turning a higher order equation into a first order
system is important. This is because on a computer it is much easier to solve a first
order system than a high order equation. If the coefficients $a$ and $b$ are functions
of $t$, then exact solutions (like exponentials) usually can’t be found. However,
one can still turn the equation into a first order system $\mathbf{y}'(t) = A(t)\mathbf{y}(t)$, where
the matrix now depends on $t$, and solve this on a computer.
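To illustrate what solving the first order system on a computer might look like, here is a minimal sketch using the classical fourth order Runge-Kutta method. The choice of method, the function names and the sample coefficients are assumptions made for this illustration; the notes do not prescribe a particular numerical scheme.

```python
import math


def rhs(t, z, a, b):
    """Right side of the first order system for y'' + a(t) y' + b(t) y = 0."""
    z1, z2 = z
    return (z2, -b(t) * z1 - a(t) * z2)


def rk4_solve(a, b, y0, dy0, t_end, steps):
    """Integrate from t = 0 to t_end; returns (y(t_end), y'(t_end))."""
    h = t_end / steps
    t, z = 0.0, (y0, dy0)
    for _ in range(steps):
        k1 = rhs(t, z, a, b)
        k2 = rhs(t + h / 2, (z[0] + h / 2 * k1[0], z[1] + h / 2 * k1[1]), a, b)
        k3 = rhs(t + h / 2, (z[0] + h / 2 * k2[0], z[1] + h / 2 * k2[1]), a, b)
        k4 = rhs(t + h, (z[0] + h * k3[0], z[1] + h * k3[1]), a, b)
        z = (z[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             z[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
        t += h
    return z


if __name__ == "__main__":
    # Constant coefficients (Example 6.17), so the exact answer is available.
    s = math.sqrt(3.0)
    exact = (s / 3 + 0.5) * math.exp(-2 + s) + (0.5 - s / 3) * math.exp(-2 - s)
    approx = rk4_solve(lambda t: 4.0, lambda t: 1.0, 1.0, 0.0, 1.0, 1000)[0]
    print(approx, exact)
    # Time-dependent coefficients (illustrative): y'' + t*y = 0.
    print(rk4_solve(lambda t: 0.0, lambda t: t, 1.0, 0.0, 1.0, 1000))
```

The same routine handles constant and time-dependent coefficients, which is the practical advantage of the first order form.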

6.5.6    Springs and weights
To begin, let’s consider the situation where we have a single weight hanging on
a spring, as shown in Figure 6.7. We want to determine how the weight moves in
time. To do this we calculate the forces acting on the weight and use Newton’s
law of motion $F = ma$.
    One force acting on the weight is the force of gravity. This acts in the
positive $x$ direction (i.e., downward) and has magnitude $mg$. The other force
is due to the spring. Its magnitude is $k(x - l)$ in the negative $x$ direction.
The acceleration is the second derivative $x''(t)$. Thus the total force is $F =
mg - k(x(t) - l)$ and $ma = mx''(t)$. Newton’s law reads
\[ m x''(t) = mg - k(x - l) = -kx + mg + lk. \]
    This is not quite in the form we can handle, due to the term $mg + lk$ on
the right. What we must do is first find the equilibrium solution. In a previous
lecture we found the equilibrium position by minimizing the potential energy.
There is another, equivalent, way. That is to find the value of $x$ for which the
total force is zero. In other words
\[ -k x_{eq} + mg + lk = 0 \]





or
\[ x_{eq} = (mg + lk)/k. \]
Notice that the total force can be written
\[ -kx + mg + lk = -k(x - x_{eq}). \]
Now let $y(t) = x(t) - x_{eq}$ be the displacement from the equilibrium point. Notice
that $y'(t) = x'(t)$ and $y''(t) = x''(t)$, since $x_{eq}$ is a constant. So the equation for
$y(t)$ is
\[ m y''(t) = -k y(t) \]
or
\[ y''(t) + \frac{k}{m}\, y(t) = 0. \]
    We could turn this into a first order system. However, it is easier to try
solutions of the form $e^{\lambda t}$. Substituting this into the equation yields
\[ (\lambda^2 + k/m)\, e^{\lambda t} = 0 \]
so we require that $\lambda^2 + k/m = 0$, or $\lambda = \pm i\sqrt{k/m}$. Thus, solutions are $e^{i\sqrt{k/m}\,t}$
and $e^{-i\sqrt{k/m}\,t}$. To obtain real solutions, we can take the real and imaginary
parts. This gives as solutions $\sin(\sqrt{k/m}\,t)$ and $\cos(\sqrt{k/m}\,t)$, and the general
solution is
\[ c_1 \sin(\sqrt{k/m}\,t) + c_2 \cos(\sqrt{k/m}\,t). \]
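The step from the complex exponentials to sine and cosine can be written out explicitly. In the worked lines below, $\omega = \sqrt{k/m}$ is a shorthand introduced only for this illustration:

```latex
\[ e^{\pm i\omega t} = \cos(\omega t) \pm i\,\sin(\omega t), \qquad \omega = \sqrt{k/m}, \]
\[ \frac{d^2}{dt^2}\cos(\omega t) = -\omega^2\cos(\omega t) = -\frac{k}{m}\cos(\omega t), \qquad
   \frac{d^2}{dt^2}\sin(\omega t) = -\omega^2\sin(\omega t) = -\frac{k}{m}\sin(\omega t). \]
```

So the real part $\cos(\omega t)$ and the imaginary part $\sin(\omega t)$ each satisfy $y'' + (k/m)y = 0$ on their own.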
    We can make the equation a little more interesting by adding friction. A
frictional force is proportional to the velocity and acts in the direction opposite
to the motion. Thus the equation with friction reads
\[ y''(t) + \beta y'(t) + \frac{k}{m}\, y(t) = 0. \]
This can be solved by turning it into a first order system, or directly, using trial
solutions of the form $e^{\lambda t}$ as above.
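As a sketch of the direct approach (assuming $\beta > 0$), substituting $e^{\lambda t}$ into the equation with friction gives a quadratic for $\lambda$:

```latex
\[ \lambda^2 + \beta\lambda + \frac{k}{m} = 0
   \quad\Longrightarrow\quad
   \lambda = \frac{-\beta \pm \sqrt{\beta^2 - 4k/m}}{2}. \]
```

If $\beta^2 < 4k/m$ the roots are complex with real part $-\beta/2$, so the real solutions are oscillations multiplied by the decaying factor $e^{-\beta t/2}$. If $\beta^2 > 4k/m$ both roots are real and negative, and the solutions decay without oscillating.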
    Now we turn to the problem with several weights and springs shown in
Figure 6.8. In this problem matrices play an essential role.
    We begin by computing the forces acting on each weight. Let us start with
the first weight. Gravity is pulling down, and the springs above and below are
pushing or pulling with a force proportional to their extensions. Thus the total
force on the first weight is $m_1 g - k_1(x_1 - l_1) + k_2(x_2 - x_1 - l_2)$. To get the
signs right on the spring forces, think of what happens when one of the $x_i$’s gets
large. For example, when $x_1$ gets large, the first spring stretches and pulls up,
so the sign of the force should be negative for large $x_1$. So Newton’s equation
for the first weight is
\[ m_1 x_1''(t) = m_1 g - k_1(x_1 - l_1) + k_2(x_2 - x_1 - l_2) = -(k_1 + k_2)x_1 + k_2 x_2 + m_1 g + k_1 l_1 - k_2 l_2 \]
or
\[ x_1''(t) = -\frac{k_1 + k_2}{m_1}\,x_1 + \frac{k_2}{m_1}\,x_2 + g + \frac{k_1 l_1 - k_2 l_2}{m_1}. \]

[Figure 6.8: A more complicated mass-spring system. Three weights $m_1$, $m_2$, $m_3$ at positions $x_1$, $x_2$, $x_3$ hang in a chain on springs labelled $k_1, l_1$; $k_2, l_2$; $k_3, l_3$.]


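Similarly the equations for the second and third weights are
\[ \begin{aligned}
x_2''(t) &= \frac{k_2}{m_2}\,x_1 - \frac{k_2 + k_3}{m_2}\,x_2 + \frac{k_3}{m_2}\,x_3 + g + \frac{k_2 l_2 - k_3 l_3}{m_2} \\
x_3''(t) &= \frac{k_3}{m_3}\,x_2 - \frac{k_3}{m_3}\,x_3 + g + \frac{k_3 l_3}{m_3}
\end{aligned} \]
These can be written as a second order matrix equation
\[ \mathbf{x}''(t) = K\mathbf{x}(t) + \mathbf{b} \]
where
\[ \mathbf{x}(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \\ x_3(t) \end{bmatrix}, \]
\[ K = \begin{bmatrix}
-\frac{k_1 + k_2}{m_1} & \frac{k_2}{m_1} & 0 \\
\frac{k_2}{m_2} & -\frac{k_2 + k_3}{m_2} & \frac{k_3}{m_2} \\
0 & \frac{k_3}{m_3} & -\frac{k_3}{m_3}
\end{bmatrix} \]
and
\[ \mathbf{b} = \begin{bmatrix}
g + \frac{k_1 l_1 - k_2 l_2}{m_1} \\
g + \frac{k_2 l_2 - k_3 l_3}{m_2} \\
g + \frac{k_3 l_3}{m_3}
\end{bmatrix}. \]
With this notation, the equilibrium solution is the value of $\mathbf{x}$ that makes all the
forces zero. That is,
\[ K\mathbf{x}_{eq} + \mathbf{b} = \mathbf{0} \]
or,
\[ \mathbf{x}_{eq} = -K^{-1}\mathbf{b}. \]
As in the case of one weight the force side of the equation can now be written
as
\[ K\mathbf{x} + \mathbf{b} = K(\mathbf{x} + K^{-1}\mathbf{b}) = K(\mathbf{x} - \mathbf{x}_{eq}) \]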





so if we define
\[ \mathbf{y}(t) = \mathbf{x}(t) - \mathbf{x}_{eq}, \]
the equation for $\mathbf{y}(t)$ is
\[ \mathbf{y}''(t) = K\mathbf{y}(t). \]
    To solve this second order 3 × 3 system, we could turn it into a first order
6 × 6 system. However, just as in the case of a single higher order equation we
can proceed directly. We try trial solutions of the form $e^{\kappa t}\mathbf{y}$. Substituting this
into the equation, we see that the equation is satisfied if
\[ \kappa^2 \mathbf{y} = K\mathbf{y}, \]
in other words, $\kappa^2$ is an eigenvalue of $K$ with eigenvector $\mathbf{y}$, or $\kappa$ is one of the
two square roots of an eigenvalue.
    So, if $K$ has eigenvalues $\lambda_1$, $\lambda_2$ and $\lambda_3$ with eigenvectors $\mathbf{y}_1$, $\mathbf{y}_2$ and $\mathbf{y}_3$,
then six solutions of the equation are given by $e^{\sqrt{\lambda_1}\,t}\mathbf{y}_1$, $e^{-\sqrt{\lambda_1}\,t}\mathbf{y}_1$, $e^{\sqrt{\lambda_2}\,t}\mathbf{y}_2$,
$e^{-\sqrt{\lambda_2}\,t}\mathbf{y}_2$, $e^{\sqrt{\lambda_3}\,t}\mathbf{y}_3$ and $e^{-\sqrt{\lambda_3}\,t}\mathbf{y}_3$. If some of the $\lambda_i$’s are negative, then these
solutions are complex exponentials, and we may take their real and imaginary
parts to get real solutions. The general solution is a linear combination of these,
and the coefficients in the linear combination may be chosen to satisfy an initial
condition.
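To see why the real and imaginary parts are again solutions, here is the calculation for one negative eigenvalue. The notation $\lambda_j = -\mu_j^2$ with $\mu_j > 0$ is introduced for this illustration, and $\mathbf{y}_j$ is assumed to be a real eigenvector:

```latex
\[ e^{\pm i\mu_j t}\,\mathbf{y}_j = \cos(\mu_j t)\,\mathbf{y}_j \pm i\,\sin(\mu_j t)\,\mathbf{y}_j, \]
\[ \frac{d^2}{dt^2}\bigl[\cos(\mu_j t)\,\mathbf{y}_j\bigr]
   = -\mu_j^2\cos(\mu_j t)\,\mathbf{y}_j
   = \cos(\mu_j t)\,K\mathbf{y}_j
   = K\bigl[\cos(\mu_j t)\,\mathbf{y}_j\bigr]. \]
```

The same calculation works with $\sin(\mu_j t)$ in place of $\cos(\mu_j t)$, so each negative eigenvalue contributes two real solutions.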
    To make this clear we will do an example. Suppose that all the masses $m_i$,
lengths $l_i$ and spring constants $k_i$ are equal to 1. Then
\[ K = \begin{bmatrix} -2 & 1 & 0 \\ 1 & -2 & 1 \\ 0 & 1 & -1 \end{bmatrix} \]
Suppose that the initial position of the weights is $x_1 = 30$, $x_2 = 60$ and $x_3 = 70$,
and that the initial velocities are $x_1' = 1$ and $x_2' = x_3' = 0$. We will determine the
positions of the weights for all subsequent times. The numbers in this problem
don’t turn out particularly nicely, so I’ll just give them to 3 significant figures.
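   The first step is to find the eigenvalues and eigenvectors of $K$. They are
given by
\[ \lambda_1 = -0.198, \qquad \lambda_2 = -1.55, \qquad \lambda_3 = -3.25 \]
\[ \mathbf{x}_1 = \begin{bmatrix} 0.445 \\ 0.802 \\ 1.00 \end{bmatrix}, \qquad
   \mathbf{x}_2 = \begin{bmatrix} -1.25 \\ -0.555 \\ 1.00 \end{bmatrix}, \qquad
   \mathbf{x}_3 = \begin{bmatrix} 1.80 \\ -2.25 \\ 1.00 \end{bmatrix} \]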
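Since the eigenvalues and eigenvectors are quoted to only 3 significant figures, a quick consistency check is worthwhile. The sketch below (helper names invented here) computes the largest entry of $K\mathbf{x}_i - \lambda_i\mathbf{x}_i$ for each quoted pair. The residuals are not exactly zero because of the rounding, but they should be small, of order $10^{-2}$ or less.

```python
K = [[-2.0, 1.0, 0.0],
     [1.0, -2.0, 1.0],
     [0.0, 1.0, -1.0]]

# Eigenvalues and eigenvectors as quoted in the text (3 significant figures).
PAIRS = [
    (-0.198, [0.445, 0.802, 1.00]),
    (-1.55, [-1.25, -0.555, 1.00]),
    (-3.25, [1.80, -2.25, 1.00]),
]


def matvec(m, v):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]


def eigen_residual(m, lam, v):
    """Largest entry of |m v - lam v|; small when (lam, v) is an eigenpair."""
    mv = matvec(m, v)
    return max(abs(mvi - lam * vi) for mvi, vi in zip(mv, v))


if __name__ == "__main__":
    for lam, v in PAIRS:
        print(lam, eigen_residual(K, lam, v))
```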
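Let $\mu_1 = \sqrt{0.198} = 0.445$, $\mu_2 = \sqrt{1.55} = 1.25$ and $\mu_3 = \sqrt{3.25} = 1.80$. Then if
$\mathbf{y}(t) = \mathbf{x}(t) - \mathbf{x}_{eq}$, the general solution for $\mathbf{y}(t)$ is
\[ \mathbf{y}(t) = (c_1 e^{i\mu_1 t} + d_1 e^{-i\mu_1 t})\mathbf{x}_1 + (c_2 e^{i\mu_2 t} + d_2 e^{-i\mu_2 t})\mathbf{x}_2 + (c_3 e^{i\mu_3 t} + d_3 e^{-i\mu_3 t})\mathbf{x}_3 \]
where $c_1, d_1, c_2, d_2, c_3, d_3$ are arbitrary constants. Taking real and imaginary
parts, we can also write the general solution as
\[ \mathbf{y}(t) = (a_1\cos(\mu_1 t) + b_1\sin(\mu_1 t))\mathbf{x}_1 + (a_2\cos(\mu_2 t) + b_2\sin(\mu_2 t))\mathbf{x}_2 + (a_3\cos(\mu_3 t) + b_3\sin(\mu_3 t))\mathbf{x}_3 \]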





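where $a_1, b_1, a_2, b_2, a_3, b_3$ are arbitrary constants. Notice that we can find the
general solution for $\mathbf{y}(t) = \mathbf{x}(t) - \mathbf{x}_{eq}$ without knowing $\mathbf{x}_{eq}$. However, since the
initial conditions were given in terms of $\mathbf{x}$ and not $\mathbf{y}$, we now have to find $\mathbf{x}_{eq}$
to be able to convert initial conditions for $\mathbf{x}$ to initial conditions for $\mathbf{y}$. If we
work in units where $g = 10$ then
\[ \mathbf{b} = \begin{bmatrix} 10 \\ 10 \\ 11 \end{bmatrix} \]
and
\[ \mathbf{x}_{eq} = -K^{-1}\mathbf{b} = \begin{bmatrix} 31 \\ 52 \\ 63 \end{bmatrix} \]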
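The equilibrium position can be reproduced without forming $K^{-1}$, by solving $K\mathbf{x}_{eq} = -\mathbf{b}$ directly. The sketch below uses Gauss-Jordan elimination with exact fractions. The helper `solve_linear` is written for this illustration and assumes the matrix is invertible.

```python
from fractions import Fraction


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gauss-Jordan elimination with exact fractions.

    Assumes the matrix is square and invertible (otherwise no pivot is found).
    """
    n = len(rhs)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a non-zero pivot and swap it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row, then clear the column in every other row.
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]


K = [[-2, 1, 0], [1, -2, 1], [0, 1, -1]]
b = [10, 10, 11]

# Equilibrium: K x_eq + b = 0, i.e. K x_eq = -b.
x_eq = solve_linear(K, [-v for v in b])
print([int(v) for v in x_eq])  # [31, 52, 63]
```

Exact fractions avoid any rounding, so the result matches the values quoted in the text exactly.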
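so
\[ \mathbf{y}(0) = \mathbf{x}(0) - \mathbf{x}_{eq} = \begin{bmatrix} 30 \\ 60 \\ 70 \end{bmatrix} - \begin{bmatrix} 31 \\ 52 \\ 63 \end{bmatrix} = \begin{bmatrix} -1 \\ 8 \\ 7 \end{bmatrix}. \]
Also
\[ \mathbf{y}'(0) = \mathbf{x}'(0) = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}. \]
So to satisfy the first initial condition, since $\cos(0) = 1$ and $\sin(0) = 0$, we need
that
\[ \mathbf{y}(0) = a_1\mathbf{x}_1 + a_2\mathbf{x}_2 + a_3\mathbf{x}_3 = \begin{bmatrix} -1 \\ 8 \\ 7 \end{bmatrix}. \]
Explicitly, we need to solve
\[ [\mathbf{x}_1 | \mathbf{x}_2 | \mathbf{x}_3] \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} = \begin{bmatrix} -1 \\ 8 \\ 7 \end{bmatrix}, \]
or
\[ \begin{bmatrix} 0.445 & -1.25 & 1.80 \\ 0.802 & -0.555 & -2.25 \\ 1.00 & 1.00 & 1.00 \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} = \begin{bmatrix} -1 \\ 8 \\ 7 \end{bmatrix}. \]
This is not a pretty sight, but I can punch the numbers into my computer and
find that
\[ \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} = \begin{bmatrix} 7.04 \\ 1.33 \\ -1.37 \end{bmatrix}. \]
To satisfy the second initial condition, we differentiate the expression for the
general solution of $\mathbf{y}(t)$ and set $t = 0$. This gives
\[ \mu_1 b_1 \mathbf{x}_1 + \mu_2 b_2 \mathbf{x}_2 + \mu_3 b_3 \mathbf{x}_3 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \]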




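Solving this numerically gives
\[ \begin{bmatrix} \mu_1 b_1 \\ \mu_2 b_2 \\ \mu_3 b_3 \end{bmatrix} = \begin{bmatrix} 0.242 \\ -0.435 \\ 0.194 \end{bmatrix} \]
Finally, we divide by the $\mu_i$’s to give
\[ \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} 0.543 \\ -0.348 \\ 0.108 \end{bmatrix} \]
Now we have completely determined all the constants, so the solution is complete.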
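The whole computation can be assembled in a short program. The following is a sketch under stated assumptions: it uses the rounded eigenvectors and $\mu_i$ from the text, solves the two $3 \times 3$ systems by Cramer's rule (chosen here only to keep the code short), and then evaluates $\mathbf{x}(t) = \mathbf{x}_{eq} + \mathbf{y}(t)$. All function names are invented for the illustration, and the results are accurate only to about 3 significant figures because the inputs are rounded.

```python
import math

X = [[0.445, 0.802, 1.00],    # eigenvector x1 (rounded, from the text)
     [-1.25, -0.555, 1.00],   # eigenvector x2
     [1.80, -2.25, 1.00]]     # eigenvector x3
MU = [0.445, 1.25, 1.80]      # mu_i = sqrt(-lambda_i)
X_EQ = [31.0, 52.0, 63.0]     # equilibrium position


def det3(c0, c1, c2):
    """Determinant of the 3x3 matrix whose columns are c0, c1, c2."""
    return (c0[0] * (c1[1] * c2[2] - c2[1] * c1[2])
            - c1[0] * (c0[1] * c2[2] - c2[1] * c0[2])
            + c2[0] * (c0[1] * c1[2] - c1[1] * c0[2]))


def solve_in_eigenbasis(rhs):
    """Cramer's rule for [x1|x2|x3] c = rhs."""
    d = det3(X[0], X[1], X[2])
    return [det3(rhs, X[1], X[2]) / d,
            det3(X[0], rhs, X[2]) / d,
            det3(X[0], X[1], rhs) / d]


a = solve_in_eigenbasis([-1.0, 8.0, 7.0])    # from y(0) = x(0) - x_eq
mu_b = solve_in_eigenbasis([1.0, 0.0, 0.0])  # from y'(0) = x'(0)
b = [mb / mu for mb, mu in zip(mu_b, MU)]


def position(t):
    """x(t) = x_eq + sum_i (a_i cos(mu_i t) + b_i sin(mu_i t)) x_i."""
    x = list(X_EQ)
    for ai, bi, mu, xi in zip(a, b, MU, X):
        amp = ai * math.cos(mu * t) + bi * math.sin(mu * t)
        x = [xj + amp * xij for xj, xij in zip(x, xi)]
    return x


if __name__ == "__main__":
    print("a =", a)
    print("b =", b)
    print("x(0) =", position(0.0))  # should reproduce the initial positions
    print("x(1) =", position(1.0))
```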

6.5.7        Problems
Problem 6.24 Consider the second order equation
\[ y'' - 5y' + 6y = 0 \]
with initial conditions $y(0) = 1$, $y'(0) = 0$. Solve this by turning it into a 2 × 2
system. Then solve it directly by using trial solutions $e^{\lambda t}$.

Problem 6.25 Consider the second order equation
\[ y'' + y' + y = 0 \]
with initial conditions $y(0) = 1$, $y'(0) = 0$. Solve this by turning it into a 2 × 2
system. Then solve it directly by using trial solutions $e^{\lambda t}$.

Problem 6.26 How can you turn a third order equation
\[ y''' + a y'' + b y' + c y = 0 \]
into an equivalent 3 × 3 system of equations?

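Problem 6.27 Suppose $K$ is a 3 × 3 matrix with eigenvalues and eigenvectors
given by
\[ \lambda_1 = -1, \qquad \lambda_2 = -4, \qquad \lambda_3 = -9 \]
\[ \mathbf{x}_1 = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \qquad
   \mathbf{x}_2 = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}, \qquad
   \mathbf{x}_3 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \]
Find the solution of
\[ \mathbf{y}''(t) = K\mathbf{y}(t) \]
satisfying
\[ \mathbf{y}(0) = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} \]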

Chapter 6. Eigen-analysis                    6.6. Solutions to Chapter Problems


                                           
\[ \mathbf{y}'(0) = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} \]
Problem 6.28 Consider a system of two hanging weights and springs. Suppose
that all the masses, spring constants and spring lengths are equal to one, and
that $g = 10$. Find the positions $x_1(t)$ and $x_2(t)$ for all times if $x_1(0) = 20$,
$x_2(0) = 30$, $x_1'(0) = 1$, $x_2'(0) = 1$.


6.6     Solutions to Chapter Problems
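Solution 6.1 We compute $\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 2 \end{bmatrix} = 2\begin{bmatrix} 1 \\ 1 \end{bmatrix}$, so $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ is an
eigenvector with eigenvalue 2. Also $\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} 1 \\ -1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} = 0\begin{bmatrix} 1 \\ -1 \end{bmatrix}$, so $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$ is an
eigenvector with eigenvalue 0.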
Solution 6.2 If P projects onto some line, then a vector x lying on that line
doesn’t get changed by P so P x = x and x is an eigenvector with eigenvalue 1.
On the other hand, if x is perpendicular to the line, then P x = 0 = 0x so x is
an eigenvector with eigenvalue 0.
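Solution 6.3
a) $\det(A - \lambda I) = \lambda^2 - 9 = (\lambda - 3)(\lambda + 3)$ so the eigenvalues are $\lambda_1 = 3$ and
   $\lambda_2 = -3$. To find the eigenvector for $\lambda_1 = 3$ we must solve the homogeneous
   equation with matrix $\begin{bmatrix} -3 & 3 \\ 3 & -3 \end{bmatrix}$. The matrix reduces to $\begin{bmatrix} -3 & 3 \\ 0 & 0 \end{bmatrix}$ and the
   eigenvector is $\mathbf{x}_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$. To find the eigenvector for $\lambda_2 = -3$ we must
   solve the homogeneous equation with matrix $\begin{bmatrix} 3 & 3 \\ 3 & 3 \end{bmatrix}$. The matrix reduces
   to $\begin{bmatrix} 3 & 3 \\ 0 & 0 \end{bmatrix}$ and the eigenvector is $\mathbf{x}_2 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$.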
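b) $\det(A - \lambda I) = \lambda^2 - 8\lambda + 12 = (\lambda - 2)(\lambda - 6)$ so the eigenvalues are $\lambda_1 = 2$ and
   $\lambda_2 = 6$. To find the eigenvector for $\lambda_1 = 2$ we must solve the homogeneous
   equation with matrix $\begin{bmatrix} -4 & -8 \\ 4 & 8 \end{bmatrix}$. The matrix reduces to $\begin{bmatrix} -4 & -8 \\ 0 & 0 \end{bmatrix}$ and the
   eigenvector is $\mathbf{x}_1 = \begin{bmatrix} 2 \\ -1 \end{bmatrix}$. To find the eigenvector for $\lambda_2 = 6$ we must solve
   the homogeneous equation with matrix $\begin{bmatrix} -8 & -8 \\ 4 & 4 \end{bmatrix}$. The matrix reduces to
   $\begin{bmatrix} -8 & -8 \\ 0 & 0 \end{bmatrix}$ and the eigenvector is $\mathbf{x}_2 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$.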
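c) $\det(A - \lambda I) = \lambda^2 + 7\lambda + 6 = (\lambda + 6)(\lambda + 1)$ so the eigenvalues are $\lambda_1 = -6$
   and $\lambda_2 = -1$. To find the eigenvector for $\lambda_1 = -6$ we must solve the
   homogeneous equation with matrix $\begin{bmatrix} 35 & -10 \\ 105 & -30 \end{bmatrix}$. The matrix reduces to
   $\begin{bmatrix} 35 & -10 \\ 0 & 0 \end{bmatrix}$ and the eigenvector is $\mathbf{x}_1 = \begin{bmatrix} 2 \\ 7 \end{bmatrix}$. To find the eigenvector for
   $\lambda_2 = -1$ we must solve the homogeneous equation with matrix $\begin{bmatrix} 30 & -10 \\ 105 & -35 \end{bmatrix}$.
   The matrix reduces to $\begin{bmatrix} 30 & -10 \\ 0 & 0 \end{bmatrix}$ and the eigenvector is $\mathbf{x}_2 = \begin{bmatrix} 1 \\ 3 \end{bmatrix}$.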

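d) $\det(A - \lambda I) = \lambda^2 - 3\lambda - 10 = (\lambda + 2)(\lambda - 5)$ so the eigenvalues are $\lambda_1 = -2$ and
   $\lambda_2 = 5$. To find the eigenvector for $\lambda_1 = -2$ we must solve the homogeneous
   equation with matrix $\begin{bmatrix} -7 & -14 \\ 7 & 14 \end{bmatrix}$. The matrix reduces to $\begin{bmatrix} -7 & -14 \\ 0 & 0 \end{bmatrix}$ and
   the eigenvector is $\mathbf{x}_1 = \begin{bmatrix} 2 \\ -1 \end{bmatrix}$. To find the eigenvector for $\lambda_2 = 5$ we must
   solve the homogeneous equation with matrix $\begin{bmatrix} -14 & -14 \\ 7 & 7 \end{bmatrix}$. The matrix
   reduces to $\begin{bmatrix} -14 & -14 \\ 0 & 0 \end{bmatrix}$ and the eigenvector is $\mathbf{x}_2 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$.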

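Solution 6.4
\[ \begin{aligned}
\det\begin{bmatrix} 2-\lambda & 3 \\ 2 & 1-\lambda \end{bmatrix} &= (2-\lambda)(1-\lambda) - 6 = 2 - 3\lambda + \lambda^2 - 6 \\
&= \lambda^2 - 3\lambda - 4 = (\lambda - 4)(\lambda + 1) = 0 \\
\Rightarrow \quad \lambda &= -1, 4.
\end{aligned} \]

The eigenvector for $\lambda = -1$:
\[ \begin{bmatrix} 3 & 3 \\ 2 & 2 \end{bmatrix}\begin{bmatrix} k_1 \\ k_2 \end{bmatrix} = \mathbf{0}
   \quad\Rightarrow\quad \begin{cases} 3k_1 + 3k_2 = 0 \\ 2k_1 + 2k_2 = 0 \end{cases}
   \quad\Rightarrow\quad k_1 = -k_2 \]

If we set $k_2 = 1$ for convenience, we have that the first eigenpair is
\[ \lambda_1 = -1, \qquad \mathbf{v}_1 = \begin{bmatrix} -1 \\ 1 \end{bmatrix}. \]

Now, the eigenvector for $\lambda = 4$:
\[ \begin{bmatrix} -2 & 3 \\ 2 & -3 \end{bmatrix}\begin{bmatrix} k_1 \\ k_2 \end{bmatrix} = \mathbf{0}
   \quad\Rightarrow\quad \begin{cases} -2k_1 + 3k_2 = 0 \\ 2k_1 - 3k_2 = 0 \end{cases}
   \quad\Rightarrow\quad k_1 = \tfrac{3}{2}k_2 \]
If we set $k_2 = 1$ for convenience, we have that the second eigenpair is
\[ \lambda_2 = 4, \qquad \mathbf{v}_2 = \begin{bmatrix} 3/2 \\ 1 \end{bmatrix}. \]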
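The two eigenpairs of Solution 6.4 are easy to confirm by direct multiplication. The sketch below (helper names invented here) recovers the eigenvalues from the trace and determinant and then checks $A\mathbf{v} = \lambda\mathbf{v}$ exactly, using fractions for the entry $3/2$.

```python
import math
from fractions import Fraction

A = [[2, 3], [2, 1]]


def eigenvalues_2x2(m):
    """Real eigenvalues of a 2x2 matrix from lambda^2 - tr*lambda + det = 0."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = math.sqrt(tr * tr - 4 * det)  # assumes a non-negative discriminant
    return (tr - root) / 2, (tr + root) / 2


def is_eigenpair(m, lam, v):
    """Check m v == lam v exactly."""
    mv = [sum(mij * vj for mij, vj in zip(row, v)) for row in m]
    return mv == [lam * vi for vi in v]


if __name__ == "__main__":
    print(eigenvalues_2x2(A))                       # (-1.0, 4.0)
    print(is_eigenpair(A, -1, [-1, 1]))             # True
    print(is_eigenpair(A, 4, [Fraction(3, 2), 1]))  # True
```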

Solution 6.5





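a) $\det(A - \lambda I) = -\lambda^3 + 2\lambda^2 + \lambda - 2 = -(\lambda - 1)(\lambda - 2)(\lambda + 1)$ so the eigenvalues
   are $\lambda_1 = 1$ and $\lambda_2 = 2$ and $\lambda_3 = -1$. To find the eigenvector for $\lambda_1 = 1$
   we must solve the homogeneous equation with matrix $\begin{bmatrix} -1 & -1 & 1 \\ 1 & -1 & 2 \\ 2 & 0 & 1 \end{bmatrix}$. The
   matrix reduces to $\begin{bmatrix} -1 & -1 & 1 \\ 0 & -2 & 3 \\ 0 & 0 & 0 \end{bmatrix}$ and the eigenvector is $\mathbf{x}_1 = \begin{bmatrix} 1 \\ -3 \\ -2 \end{bmatrix}$. To
   find the eigenvector for $\lambda_2 = 2$ we must solve the homogeneous equation
   with matrix $\begin{bmatrix} -2 & -1 & 1 \\ 1 & -2 & 2 \\ 2 & 0 & 0 \end{bmatrix}$. The matrix reduces to $\begin{bmatrix} -2 & -1 & 1 \\ 0 & -1 & 1 \\ 0 & 0 & 0 \end{bmatrix}$ and the
   eigenvector is $\mathbf{x}_2 = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}$. To find the eigenvector for $\lambda_3 = -1$ we must solve
   the homogeneous equation with matrix $\begin{bmatrix} 1 & -1 & 1 \\ 1 & 1 & 2 \\ 2 & 0 & 3 \end{bmatrix}$. The matrix reduces
   to $\begin{bmatrix} 1 & -1 & 1 \\ 0 & 2 & 1 \\ 0 & 0 & 0 \end{bmatrix}$ and the eigenvector is $\mathbf{x}_3 = \begin{bmatrix} 3 \\ 1 \\ -2 \end{bmatrix}$.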
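b) $\det(A - \lambda I) = -\lambda^3 + 2\lambda^2 + 3\lambda - 6 = -(\lambda - 2)(\lambda - \sqrt{3})(\lambda + \sqrt{3})$ so the
   eigenvalues are $\lambda_1 = 2$ and $\lambda_2 = \sqrt{3}$ and $\lambda_3 = -\sqrt{3}$. To find the eigenvector
   for $\lambda_1 = 2$ we must solve the homogeneous equation with matrix
   $\begin{bmatrix} -1 & 1 & 1 \\ 1 & -2 & -2 \\ 1 & -1 & -1 \end{bmatrix}$. The matrix reduces to $\begin{bmatrix} 0 & -1 & -1 \\ 0 & 0 & 0 \\ -1 & 1 & 1 \end{bmatrix}$ and the eigenvector
   is $\mathbf{x}_1 = \begin{bmatrix} 0 \\ -1 \\ 1 \end{bmatrix}$. To find the eigenvector for $\lambda_2 = \sqrt{3}$ we must solve
   the homogeneous equation with matrix $\begin{bmatrix} 1-\sqrt{3} & 1 & 1 \\ 1 & -\sqrt{3} & -2 \\ 1 & -1 & 1-\sqrt{3} \end{bmatrix}$. The
   matrix reduces to $\begin{bmatrix} 1-\sqrt{3} & 1 & 1 \\ 0 & 2-\sqrt{3} & -3+2\sqrt{3} \\ 0 & 0 & 0 \end{bmatrix}$ and the eigenvector is
   $\mathbf{x}_2 = \begin{bmatrix} -1 \\ -\sqrt{3} \\ 1 \end{bmatrix}$. To find the eigenvector for $\lambda_3 = -\sqrt{3}$ we must solve the
   homogeneous equation with matrix $\begin{bmatrix} 1+\sqrt{3} & 1 & 1 \\ 1 & \sqrt{3} & -2 \\ 1 & -1 & 1+\sqrt{3} \end{bmatrix}$. The matrix
   reduces to $\begin{bmatrix} 1+\sqrt{3} & 1 & 1 \\ 0 & 2+\sqrt{3} & -3-2\sqrt{3} \\ 0 & 0 & 0 \end{bmatrix}$ and the eigenvector is $\mathbf{x}_3 = \begin{bmatrix} -1 \\ \sqrt{3} \\ 1 \end{bmatrix}$.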

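c) $\det(A - \lambda I) = -\lambda^3 + 48\lambda - 128 = -(\lambda + 8)(\lambda - 4)^2$ so the eigenvalues
   are $\lambda_1 = -8$ and $\lambda_2 = 4$. To find the eigenvector for $\lambda_1 = -8$ we must
   solve the homogeneous equation with matrix $\begin{bmatrix} 15 & -9 & -15 \\ 0 & 12 & 0 \\ 3 & -9 & -3 \end{bmatrix}$. The matrix
   reduces to $\begin{bmatrix} 15 & -9 & -15 \\ 0 & 12 & 0 \\ 0 & 0 & 0 \end{bmatrix}$ and the eigenvector is $\mathbf{x}_1 = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$. To find
   the eigenvector(s) for $\lambda_2 = 4$ we must solve the homogeneous equation with
   matrix $\begin{bmatrix} 3 & -9 & -15 \\ 0 & 0 & 0 \\ 3 & -9 & -15 \end{bmatrix}$. The matrix reduces to $\begin{bmatrix} 3 & -9 & -15 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$. Thus
   there are two eigenvectors corresponding to this eigenvalue, for example
   $\mathbf{x}_2 = \begin{bmatrix} 3 \\ 1 \\ 0 \end{bmatrix}$ and $\mathbf{x}_3 = \begin{bmatrix} 5 \\ 0 \\ 1 \end{bmatrix}$.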

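d) $\det(A - \lambda I) = -\lambda^3 + \lambda^2 + \lambda - 1 = -(\lambda + 1)(\lambda - 1)^2$ so the eigenvalues
   are $\lambda_1 = -1$ and $\lambda_2 = 1$. To find the eigenvector for $\lambda_1 = -1$ we must
   solve the homogeneous equation with matrix $\begin{bmatrix} 32 & -100 & 70 \\ 18 & -58 & 42 \\ 12 & -40 & 30 \end{bmatrix}$. The matrix
   reduces to $\begin{bmatrix} 32 & -100 & 70 \\ 0 & -14 & 21 \\ 0 & 0 & 0 \end{bmatrix}$ and the eigenvector is $\mathbf{x}_1 = \begin{bmatrix} 5 \\ 3 \\ 2 \end{bmatrix}$. To find the
   eigenvector for $\lambda_2 = 1$ we must solve the homogeneous equation with matrix
   $\begin{bmatrix} 30 & -100 & 70 \\ 18 & -60 & 42 \\ 12 & -40 & 28 \end{bmatrix}$. The matrix reduces to $\begin{bmatrix} 30 & -100 & 70 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$. Thus there are
   two eigenvectors corresponding to this eigenvalue, for example $\mathbf{x}_2 = \begin{bmatrix} 7 \\ 0 \\ -3 \end{bmatrix}$
   and $\mathbf{x}_3 = \begin{bmatrix} 0 \\ 7 \\ 10 \end{bmatrix}$.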

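Solution 6.6
\[ \begin{aligned}
\det\begin{bmatrix} p_{11} - \lambda & 1 - p_{22} \\ 1 - p_{11} & p_{22} - \lambda \end{bmatrix}
&= (p_{11} - \lambda)(p_{22} - \lambda) - (1 - p_{11})(1 - p_{22}) \\
&= p_{11}p_{22} - (p_{11} + p_{22})\lambda + \lambda^2 - (1 - (p_{11} + p_{22}) + p_{11}p_{22}) \\
&= \lambda^2 - (p_{11} + p_{22})\lambda + (p_{11} + p_{22}) - 1 = 0
\end{aligned} \]

Let $c = p_{11} + p_{22}$, then
\[ \begin{aligned}
\lambda^2 - c\lambda + c - 1 &= 0 \\
\lambda &= \frac{c \pm \sqrt{c^2 - 4(c - 1)}}{2} \\
&= \frac{c \pm \sqrt{c^2 - 4c + 4}}{2} \\
&= \frac{c \pm (c - 2)}{2}
\end{aligned} \]

Since $p_{11}$ and $p_{22}$ are in the interval $[0, 1]$, we have that $c \in [0, 2]$.
   The first eigenvalue is $\lambda_1 = \frac{c + c - 2}{2} = c - 1$, therefore $\lambda_1 \in [-1, 1]$.
   The second eigenvalue is $\lambda_2 = \frac{c - (c - 2)}{2} = 1$.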
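The conclusion of Solution 6.6 can be spot-checked numerically. The sketch below (function name and sample values chosen for illustration) applies the quadratic formula to $\lambda^2 - c\lambda + c - 1$ and returns the two eigenvalues, which should come out as $c - 1$ and $1$.

```python
import math


def stochastic_eigenvalues(p11, p22):
    """Eigenvalues of [[p11, 1 - p22], [1 - p11, p22]] via the quadratic formula."""
    c = p11 + p22
    # c^2 - 4(c - 1) = (c - 2)^2 >= 0; clamp to guard against rounding below zero.
    root = math.sqrt(max(0.0, c * c - 4.0 * (c - 1.0)))
    return (c - root) / 2.0, (c + root) / 2.0


if __name__ == "__main__":
    for p11, p22 in [(0.9, 0.6), (0.25, 0.5), (1.0, 1.0), (0.0, 0.0)]:
        lam1, lam2 = stochastic_eigenvalues(p11, p22)
        print(p11, p22, lam1, lam2)  # lam1 = p11 + p22 - 1, lam2 = 1
```

Because $c \le 2$, the square root equals $2 - c$, so the smaller value returned is $c - 1$ and the larger is always $1$.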

Solution 6.7
                                                          
                                     −λ        1     −1
         det(A − λI)       =   det  5        −λ      1    
                                      0        1    −1 − λ
                                          −λ         1                1    −1
                           = −λ det                         − 5 det
                                           1       −1 − λ             1   −1 − λ
                           = −λ(λ + λ2 − 1) − 5(−λ) = −λ3 − λ2 + λ + 5λ = 0
                           = λ(λ2 + λ − 6) = λ(λ − 2)(λ + 3)
 ⇒       λ = 0, 2, −3.

To find the eigenvector corresponding to λ1 = 0, we find the RREF of the
extended matrix (A − λ1 I, 0), keeping in mind that one of the variables is a free
variable.
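\[
(A - \lambda_1 I, 0) = \left(\begin{array}{ccc|c} 0 & 1 & -1 & 0 \\ 5 & 0 & 1 & 0 \\ 0 & 1 & -1 & 0 \end{array}\right)
\to \left(\begin{array}{ccc|c} 1 & 0 & \tfrac{1}{5} & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right);
\qquad v_{\lambda_1} = \alpha \begin{pmatrix} -1 \\ 5 \\ 5 \end{pmatrix}.
\]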

Now, for the eigenvector corresponding to λ2 = 2:
                                                                                           
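\[
(A - \lambda_2 I, 0) = \left(\begin{array}{ccc|c} -2 & 1 & -1 & 0 \\ 5 & -2 & 1 & 0 \\ 0 & 1 & -3 & 0 \end{array}\right)
\to \left(\begin{array}{ccc|c} 1 & 0 & -1 & 0 \\ 0 & 1 & -3 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right);
\qquad v_{\lambda_2} = \alpha \begin{pmatrix} 1 \\ 3 \\ 1 \end{pmatrix}.
\]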

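And for the eigenvector corresponding to λ3 = −3:
\[
(A - \lambda_3 I, 0) = \left(\begin{array}{ccc|c} 3 & 1 & -1 & 0 \\ 5 & 3 & 1 & 0 \\ 0 & 1 & 2 & 0 \end{array}\right)
\to \left(\begin{array}{ccc|c} 1 & 0 & -1 & 0 \\ 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right);
\qquad v_{\lambda_3} = \alpha \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix}.
\]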
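The three eigenpairs can be checked by direct multiplication. The MATLAB sketch below assumes that A is the matrix read off from det(A − λI) above, and takes α = 1 for each eigenvector.

```matlab
% A as read off from det(A - lambda*I) in this solution (assumed).
A = [0 1 -1;
     5 0  1;
     0 1 -1];

lambdas = [0, 2, -3];
V = [-1 1  1;
      5 3 -2;
      5 1  1];   % columns are the eigenvectors found above (alpha = 1)

for k = 1:3
    residual = A * V(:, k) - lambdas(k) * V(:, k);
    fprintf('lambda = %2d, residual norm = %g\n', lambdas(k), norm(residual));
end
```

Each residual norm should be zero if the eigenpairs are correct.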







Solution 6.8
                                                  
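\begin{align*}
\det(A - \lambda I) &= \det \begin{pmatrix} 2 - \lambda & 0 & 1 \\ 0 & 2 - \lambda & 1 \\ 1 & 0 & 2 - \lambda \end{pmatrix} \\
 &= (2 - \lambda) \det \begin{pmatrix} 2 - \lambda & 1 \\ 0 & 2 - \lambda \end{pmatrix} + \det \begin{pmatrix} 0 & 1 \\ 2 - \lambda & 1 \end{pmatrix} \\
 &= (2 - \lambda)^3 - (2 - \lambda) = (2 - \lambda)\left[(2 - \lambda)^2 - 1\right] \\
 &= (2 - \lambda)(\lambda^2 - 4\lambda + 3) = -(\lambda - 2)(\lambda - 1)(\lambda - 3) = 0 \\
\Rightarrow \quad \lambda &= 1,\ 2,\ 3.
\end{align*}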

To find the eigenvector corresponding to λ1 = 1, we again find the RREF of the
extended matrix (A − λ1 I, 0), keeping in mind that one of the variables is a free
variable.
                                                                                
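\[
(A - \lambda_1 I, 0) = \left(\begin{array}{ccc|c} 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 \\ 1 & 0 & 1 & 0 \end{array}\right)
\to \left(\begin{array}{ccc|c} 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right);
\qquad v_{\lambda_1} = \alpha \begin{pmatrix} 1 \\ 1 \\ -1 \end{pmatrix}.
\]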

Now, for the eigenvector corresponding to λ2 = 2:
                                                                             
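\[
(A - \lambda_2 I, 0) = \left(\begin{array}{ccc|c} 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \end{array}\right)
\to \left(\begin{array}{ccc|c} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right);
\qquad v_{\lambda_2} = \alpha \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}.
\]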

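And for the eigenvector corresponding to λ3 = 3:
\[
(A - \lambda_3 I, 0) = \left(\begin{array}{ccc|c} -1 & 0 & 1 & 0 \\ 0 & -1 & 1 & 0 \\ 1 & 0 & -1 & 0 \end{array}\right)
\to \left(\begin{array}{ccc|c} 1 & 0 & -1 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right);
\qquad v_{\lambda_3} = \alpha \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.
\]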
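The row reductions in this solution can be reproduced with MATLAB's rref. This is only a sketch: A is assumed to be the matrix read off from det(A − λI) above, and the zero right-hand-side column is omitted because it never changes.

```matlab
% A as read off from det(A - lambda*I) in this solution (assumed).
A = [2 0 1;
     0 2 1;
     1 0 2];

for lambda = [1 2 3]
    R = rref(A - lambda * eye(3));   % reduced row-echelon form of A - lambda*I
    fprintf('lambda = %d\n', lambda);
    disp(R);
end
```

Each printed matrix has one zero row, so one variable is free and the eigenvector can be read off as in the text.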

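Solution 6.9 The answer is yes. The two eigenvectors and their eigenvalues define six equations. There are nine unknowns (the entries of M), but there is an extra condition: the rank of M is two. There are several solutions; here we present one.
    Let's try
\[
M = \begin{pmatrix} a_{11} & 0 & a_{13} \\ a_{21} & 0 & a_{23} \\ a_{31} & 0 & a_{33} \end{pmatrix}.
\]
We have here six unknowns and six equations, so there is a chance to find a solution. If the two non-zero column vectors are independent, then we are done: the matrix has rank 2.
\[
\begin{pmatrix} a_{11} & 0 & a_{13} \\ a_{21} & 0 & a_{23} \\ a_{31} & 0 & a_{33} \end{pmatrix}
\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} = \begin{pmatrix} -1 \\ -2 \\ -3 \end{pmatrix},
\quad \text{and} \quad
\begin{pmatrix} a_{11} & 0 & a_{13} \\ a_{21} & 0 & a_{23} \\ a_{31} & 0 & a_{33} \end{pmatrix}
\begin{pmatrix} 3 \\ 2 \\ 1 \end{pmatrix} = \begin{pmatrix} -3 \\ -2 \\ -1 \end{pmatrix}
\]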





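\[
a_{11} + 3a_{13} = -1, \quad \text{and} \quad 3a_{11} + a_{13} = -3
\]
From this we have $a_{13} = 0$ and $a_{11} = -1$.
\[
a_{21} + 3a_{23} = -2, \quad \text{and} \quad 3a_{21} + a_{23} = -2
\]
From this we have that $a_{21} = a_{23} = -\tfrac{1}{2}$.
\[
a_{31} + 3a_{33} = -3, \quad \text{and} \quad 3a_{31} + a_{33} = -1
\]
This is similar to the first pair: $a_{33} = -1$ and $a_{31} = 0$.
   The resulting matrix M is
\[
M = \begin{pmatrix} -1 & 0 & 0 \\ -\tfrac{1}{2} & 0 & -\tfrac{1}{2} \\ 0 & 0 & -1 \end{pmatrix}.
\]
M has rank 2, and
\[
\begin{pmatrix} -1 & 0 & 0 \\ -\tfrac{1}{2} & 0 & -\tfrac{1}{2} \\ 0 & 0 & -1 \end{pmatrix}
\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} = \begin{pmatrix} -1 \\ -2 \\ -3 \end{pmatrix},
\quad \text{and} \quad
\begin{pmatrix} -1 & 0 & 0 \\ -\tfrac{1}{2} & 0 & -\tfrac{1}{2} \\ 0 & 0 & -1 \end{pmatrix}
\begin{pmatrix} 3 \\ 2 \\ 1 \end{pmatrix} = \begin{pmatrix} -3 \\ -2 \\ -1 \end{pmatrix}
\]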
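A short MATLAB check of this particular M is sketched below. It verifies the rank and the two products shown above; the variable names are chosen here for illustration.

```matlab
% The matrix M constructed in this solution.
M = [-1    0   0;
     -1/2  0  -1/2;
      0    0  -1];

v1 = [1; 2; 3];
v2 = [3; 2; 1];

disp(rank(M));          % expected to be 2
disp([M * v1, -v1]);    % the two columns should match
disp([M * v2, -v2]);    % the two columns should match
```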

Solution 6.10
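\begin{align*}
\det \begin{pmatrix} 2 - \lambda & 3 \\ -2 & -1 - \lambda \end{pmatrix}
 &= (2 - \lambda)(-1 - \lambda) + 6 = -2 - 2\lambda + \lambda + \lambda^2 + 6 \\
 &= \lambda^2 - \lambda + 4
  = \left(\lambda - \frac{1}{2} + i\frac{\sqrt{15}}{2}\right)\left(\lambda - \frac{1}{2} - i\frac{\sqrt{15}}{2}\right) = 0 \\
\Rightarrow \quad \lambda &= \frac{1}{2} \pm i\frac{\sqrt{15}}{2}
\end{align*}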
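The same result can be obtained numerically. The sketch below assumes the matrix is the one appearing inside the determinant above, and uses poly and roots to recover the characteristic polynomial and its complex roots.

```matlab
% Matrix read off from the determinant above (assumed).
A = [ 2  3;
     -2 -1];

p = poly(A);          % coefficients of det(lambda*I - A), expected [1 -1 4]
lambda = roots(p);    % expected 1/2 +/- i*sqrt(15)/2

disp(p);
disp(lambda);
disp(sqrt(15) / 2);   % imaginary part predicted above
```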

Solution 6.11

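a) Find values for a and b such that $A^2 = A$:
\[
\begin{pmatrix} a & i \\ i & b \end{pmatrix} \begin{pmatrix} a & i \\ i & b \end{pmatrix} = \begin{pmatrix} a & i \\ i & b \end{pmatrix}
\quad \Rightarrow \quad
\begin{array}{ll} a^2 - 1 = a, & ai + bi = i, \\ ai + bi = i, & -1 + b^2 = b. \end{array}
\]
   Hence $a^2 - a - 1 = 0$ and $a + b = 1$. Upon solving the system we get
\[
a = \frac{1 \pm \sqrt{5}}{2}, \qquad b = \frac{1 \mp \sqrt{5}}{2}
\]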

b) If $A^3 = A$, then we can try
\[
A^{-1} A^3 = A^{-1} A \quad \Rightarrow \quad A^2 = I
\]





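\[
\begin{pmatrix} a & i \\ i & b \end{pmatrix} \begin{pmatrix} a & i \\ i & b \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}
\quad \Rightarrow \quad
\begin{array}{ll} a^2 - 1 = 1, & ai + bi = 0, \\ ai + bi = 0, & -1 + b^2 = 1. \end{array}
\]
     Hence $a^2 = 2$ and $a = -b$. We get
\[
a = \pm\sqrt{2}, \qquad b = \mp\sqrt{2}
\]
\[
\begin{pmatrix} \sqrt{2} & i \\ i & -\sqrt{2} \end{pmatrix}^2 = I \quad \Rightarrow \quad A^3 = A; \qquad A^4 = I \neq A
\]

c) From the previous solution, we want $A^8 = I \neq A$. If $A^4 = I$, then $A^4 \cdot A^4 = A^8 = I$, hence $A^9 = A$.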
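The matrices found in parts a) to c) can be checked numerically. The sketch below takes the upper sign in each case (a = (1 + √5)/2 with b = 1 − a, and a = √2 with b = −√2); the names A and B are chosen here for illustration.

```matlab
% Part (a): a^2 - a - 1 = 0 and b = 1 - a.
a = (1 + sqrt(5)) / 2;
b = 1 - a;
A = [a 1i; 1i b];
disp(norm(A^2 - A));        % should be close to zero

% Parts (b) and (c): a = sqrt(2), b = -sqrt(2).
B = [sqrt(2) 1i; 1i -sqrt(2)];
disp(norm(B^2 - eye(2)));   % B^2 = I
disp(norm(B^3 - B));        % hence B^3 = B
disp(norm(B^9 - B));        % and B^9 = B
```

All four printed norms should be at the level of rounding error.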

Solution 6.12
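a)
\[
A = \begin{pmatrix} a & i \\ i & b \end{pmatrix}
\]
     The two eigenvalues are $2 + i$ and $2 - i$. The characteristic polynomial is
\begin{align*}
(a - \lambda)(b - \lambda) + 1 &= 0 \\
\lambda^2 - (a + b)\lambda + ab + 1 &= 0 \\
\lambda_{1,2} &= \frac{a + b \pm \sqrt{(a + b)^2 - 4ab - 4}}{2} \\
\lambda_{1,2} &= \frac{a + b}{2} \pm \sqrt{\left(\frac{a - b}{2}\right)^2 - 1}
\end{align*}
     $\lambda_{1,2}$ should be $2 \pm i$. This is only possible if $\frac{a - b}{2} = 0$, hence $a = b$, and
     $a + b = 4$. Therefore $a = b = 2$, and the matrix is
\[
A = \begin{pmatrix} 2 & i \\ i & 2 \end{pmatrix}
\]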

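b) Now, to find the eigenvectors, we know that the eigenvalues are $\lambda_1 = 2 + i$ and $\lambda_2 = 2 - i$. For the first eigenvector we have
\[
(A - \lambda_1 I) v_1 = \begin{pmatrix} -i & i \\ i & -i \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = 0
\]
     Hence $x_1 = x_2$, and $v_1 = \alpha (1, 1)^T$.
     For the second eigenvector we have
\[
(A - \lambda_2 I) v_2 = \begin{pmatrix} i & i \\ i & i \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = 0
\]
     Hence $x_1 = -x_2$, and $v_2 = \alpha (1, -1)^T$.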
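A quick MATLAB check of the matrix with a = b = 2 and of both eigenvectors (with α = 1) is sketched below.

```matlab
% Matrix with a = b = 2 and off-diagonal entries i.
A = [2 1i; 1i 2];

v1 = [1;  1];   % eigenvector for lambda1 = 2 + i
v2 = [1; -1];   % eigenvector for lambda2 = 2 - i

disp(norm(A * v1 - (2 + 1i) * v1));   % should be zero
disp(norm(A * v2 - (2 - 1i) * v2));   % should be zero
disp(eig(A));                         % should list 2 + i and 2 - i
```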





Solution 6.13
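\[
P = \begin{pmatrix} 0 & \tfrac{1}{4} & \tfrac{1}{2} \\ \tfrac{1}{2} & \tfrac{1}{2} & \tfrac{1}{2} \\ \tfrac{1}{2} & \tfrac{1}{4} & 0 \end{pmatrix}
\]
The characteristic polynomial is $\lambda^3 - \tfrac{1}{2}\lambda^2 - \tfrac{1}{2}\lambda$. From this, the eigenvalues are
found to be $\lambda_{1,2,3} = 0,\ 1,\ -\tfrac{1}{2}$.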
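   To find the eigenvector associated with $\lambda_1$, we find the reduced row-echelon
form of the matrix $(P - \lambda_1 I)$:
\[
\left(\begin{array}{ccc|c} 0 & \tfrac{1}{4} & \tfrac{1}{2} & 0 \\ \tfrac{1}{2} & \tfrac{1}{2} & \tfrac{1}{2} & 0 \\ \tfrac{1}{2} & \tfrac{1}{4} & 0 & 0 \end{array}\right)
\to \left(\begin{array}{ccc|c} 1 & 1 & 1 & 0 \\ 0 & 1 & 2 & 0 \\ 2 & 1 & 0 & 0 \end{array}\right)
\to \left(\begin{array}{ccc|c} 1 & 0 & -1 & 0 \\ 0 & 1 & 2 & 0 \\ 0 & -1 & -2 & 0 \end{array}\right)
\]
\[
\to \left(\begin{array}{ccc|c} 1 & 0 & -1 & 0 \\ 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right)
\quad \to \quad v_1 = \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix}
\]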
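   To find the eigenvector associated with $\lambda_2$, we find the RREF of the matrix
$(P - \lambda_2 I)$:
\[
\left(\begin{array}{ccc|c} -1 & \tfrac{1}{4} & \tfrac{1}{2} & 0 \\ \tfrac{1}{2} & -\tfrac{1}{2} & \tfrac{1}{2} & 0 \\ \tfrac{1}{2} & \tfrac{1}{4} & -1 & 0 \end{array}\right)
\to \left(\begin{array}{ccc|c} 1 & -\tfrac{1}{4} & -\tfrac{1}{2} & 0 \\ 0 & -\tfrac{3}{4} & \tfrac{3}{2} & 0 \\ 0 & \tfrac{3}{4} & -\tfrac{3}{2} & 0 \end{array}\right)
\to \left(\begin{array}{ccc|c} 1 & 0 & -1 & 0 \\ 0 & 1 & -2 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right)
\]
\[
\to \quad v_2 = \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix}
\]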
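   Finally, to find the eigenvector associated with $\lambda_3$, we find the RREF of the
matrix $(P - \lambda_3 I)$:
\[
\left(\begin{array}{ccc|c} \tfrac{1}{2} & \tfrac{1}{4} & \tfrac{1}{2} & 0 \\ \tfrac{1}{2} & 1 & \tfrac{1}{2} & 0 \\ \tfrac{1}{2} & \tfrac{1}{4} & \tfrac{1}{2} & 0 \end{array}\right)
\to \left(\begin{array}{ccc|c} 1 & \tfrac{1}{2} & 1 & 0 \\ 0 & \tfrac{3}{4} & 0 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right)
\to \left(\begin{array}{ccc|c} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right)
\]
\[
\to \quad v_3 = \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix}
\]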

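We can express $x^{(0)}$ as a linear combination of the eigenvectors:
\[
x^{(0)} = \frac{1}{4} \begin{pmatrix} 1 \\ -2 \\ 1 \end{pmatrix} + \frac{1}{4} \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix} - \frac{1}{2} \begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}.
\]
From the notes, we know that for $n > 0$, $x^{(n)} = P^n x^{(0)}$. The contributions of the zero
eigenvalue and of the eigenvalue $-1/2$ therefore disappear for large n, and we have that
\[
\lim_{n \to \infty} x^{(n)} = \begin{pmatrix} \tfrac{1}{4} \\ \tfrac{1}{2} \\ \tfrac{1}{4} \end{pmatrix}
\]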
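The limit can be illustrated numerically by applying a large power of P to the initial state. In the sketch below n = 50 is an arbitrary choice of "large" n, and the eigenvector for λ = 1 is taken to be (1, 2, 1)ᵀ, the vector used in the linear combination above.

```matlab
P = [0   1/4 1/2;
     1/2 1/2 1/2;
     1/2 1/4 0];
x0 = [1; 0; 0];

% Eigenvectors for lambda = 0, 1 and -1/2, in that order.
v1 = [1; -2; 1];
v2 = [1;  2; 1];
v3 = [-1; 0; 1];

coeffs = [v1 v2 v3] \ x0;   % expected [1/4; 1/4; -1/2]
x50 = P^50 * x0;            % n = 50 is an arbitrary "large" n

disp(coeffs');
disp(x50');                 % should be close to [1/4 1/2 1/4]
```

Only the coefficient of the λ = 1 eigenvector survives, since 0ⁿ and (−1/2)ⁿ both tend to zero.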





Solution 6.14
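\[
P = \begin{pmatrix} \tfrac{1}{4} & \tfrac{1}{3} & \tfrac{1}{2} \\ \tfrac{1}{2} & \tfrac{1}{3} & b \\ \tfrac{1}{4} & \tfrac{1}{3} & a \end{pmatrix}
\]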
We know that a ≥ 0, b ≥ 0, and that a + b = 1/2. If a > 0, and b > 0,
by theorem 6.1 we know that the existence of an equilibrium probability is
guaranteed. There are two cases left to be analyzed:

 i) a = 0 and b = 1/2

 ii) a = 1/2 and b = 0

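In both cases one entry of the matrix is zero, so we can't apply theorem 6.1
directly. On the other hand, $P^2$ for the two cases is

 i)
\[
P^2 = \begin{pmatrix} \tfrac{17}{48} & \tfrac{13}{36} & \tfrac{7}{24} \\ \tfrac{5}{12} & \tfrac{4}{9} & \tfrac{5}{12} \\ \tfrac{11}{48} & \tfrac{7}{36} & \tfrac{7}{24} \end{pmatrix}
\]

 ii)
\[
P^2 = \begin{pmatrix} \tfrac{17}{48} & \tfrac{13}{36} & \tfrac{3}{8} \\ \tfrac{7}{24} & \tfrac{5}{18} & \tfrac{1}{4} \\ \tfrac{17}{48} & \tfrac{13}{36} & \tfrac{3}{8} \end{pmatrix}
\]

with all positive entries in both cases, so there is always an equilibrium probability.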
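    Now, for the case a = b = 1/4 we have to find the aforementioned equilibrium
probability vector. From theorem 6.1, the vector will be the eigenvector
corresponding to the eigenvalue λ = 1, with all its entries summing to one. We
find the eigenvector by finding the reduced row-echelon form of the extended
matrix (P − λI, 0):
\[
(P - I, 0) = \left(\begin{array}{ccc|c} -\tfrac{3}{4} & \tfrac{1}{3} & \tfrac{1}{2} & 0 \\ \tfrac{1}{2} & -\tfrac{2}{3} & \tfrac{1}{4} & 0 \\ \tfrac{1}{4} & \tfrac{1}{3} & -\tfrac{3}{4} & 0 \end{array}\right)
\to \left(\begin{array}{ccc|c} 4 & 0 & -5 & 0 \\ 0 & 16 & -21 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right)
\quad \to \quad v = \alpha \begin{pmatrix} 20 \\ 21 \\ 16 \end{pmatrix}
\]

Since the entries of v have to sum to one, we get that the equilibrium probability
vector is
\[
\frac{1}{57} \begin{pmatrix} 20 \\ 21 \\ 16 \end{pmatrix}
\]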
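The equilibrium vector for a = b = 1/4 can be cross-checked in MATLAB. The sketch below uses null to get a basis of the eigenspace for λ = 1 and then rescales it so that its entries sum to one; this is one possible approach, not necessarily the one used in the labs.

```matlab
% Transition matrix for the case a = b = 1/4.
P = [1/4 1/3 1/2;
     1/2 1/3 1/4;
     1/4 1/3 1/4];

v = null(P - eye(3));   % basis for the eigenspace of lambda = 1
v = v / sum(v);         % rescale so the entries sum to one

disp(v');
disp([20 21 16] / 57);  % vector found above, for comparison
```

Dividing by sum(v) also fixes the overall sign, which null leaves arbitrary.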







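Solution 6.15 The eigenvalues and eigenvectors are $\lambda_1 = 2$, $x_1 = \begin{pmatrix} -2 \\ 1 \end{pmatrix}$ and $\lambda_2 = 6$, $x_2 = \begin{pmatrix} -1 \\ 1 \end{pmatrix}$. Thus the general solution is
\[
c_1 e^{2t} \begin{pmatrix} -2 \\ 1 \end{pmatrix} + c_2 e^{6t} \begin{pmatrix} -1 \\ 1 \end{pmatrix}
\]
To satisfy the initial condition we need
\[
c_1 \begin{pmatrix} -2 \\ 1 \end{pmatrix} + c_2 \begin{pmatrix} -1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}
\]
or
\[
\begin{pmatrix} -2 & -1 \\ 1 & 1 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}
\]
which has solution
\[
\begin{pmatrix} c_1 \\ c_2 \end{pmatrix} = \begin{pmatrix} -2 \\ 3 \end{pmatrix}
\]
so the solution is
\[
-2 e^{2t} \begin{pmatrix} -2 \\ 1 \end{pmatrix} + 3 e^{6t} \begin{pmatrix} -1 \\ 1 \end{pmatrix}
\]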

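Solution 6.16 The eigenvalues and eigenvectors are $\lambda_1 = 1 + 2i$, $x_1 = \begin{pmatrix} i \\ 1 \end{pmatrix}$ and
$\lambda_2 = 1 - 2i$, $x_2 = \begin{pmatrix} -i \\ 1 \end{pmatrix}$. Thus the general solution in complex form is
\[
c_1 e^{(1+2i)t} \begin{pmatrix} i \\ 1 \end{pmatrix} + c_2 e^{(1-2i)t} \begin{pmatrix} -i \\ 1 \end{pmatrix}
\]
To find two real solutions we take the real and imaginary parts of
\begin{align*}
e^{(1+2i)t} \begin{pmatrix} i \\ 1 \end{pmatrix} &= \begin{pmatrix} i e^{(1+2i)t} \\ e^{(1+2i)t} \end{pmatrix} \\
 &= e^t \begin{pmatrix} i(\cos(2t) + i\sin(2t)) \\ \cos(2t) + i\sin(2t) \end{pmatrix} \\
 &= e^t \begin{pmatrix} -\sin(2t) + i\cos(2t) \\ \cos(2t) + i\sin(2t) \end{pmatrix}
\end{align*}
Thus the general solution in real form is
\[
a_1 e^t \begin{pmatrix} -\sin(2t) \\ \cos(2t) \end{pmatrix} + a_2 e^t \begin{pmatrix} \cos(2t) \\ \sin(2t) \end{pmatrix}
\]
To satisfy the initial condition we need
\[
a_1 \begin{pmatrix} 0 \\ 1 \end{pmatrix} + a_2 \begin{pmatrix} 1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}
\]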




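or
\[
\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}
\]
which has solution
\[
\begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix}
\]
so the solution is
\[
e^t \begin{pmatrix} -\sin(2t) + \cos(2t) \\ \cos(2t) + \sin(2t) \end{pmatrix}
\]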
                                                                    
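Solution 6.17 The eigenvalues and eigenvectors are $\lambda_1 = 1$, $x_1 = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}$, $\lambda_2 = 1 + i$,
$x_2 = \begin{pmatrix} -5 - i \\ -5 - i \\ 2 \end{pmatrix}$ and $\lambda_3 = 1 - i$, $x_3 = \begin{pmatrix} -5 + i \\ -5 + i \\ 2 \end{pmatrix}$. Thus the general solution in
complex form is
\[
c_1 e^t \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} + c_2 e^{(1+i)t} \begin{pmatrix} -5 - i \\ -5 - i \\ 2 \end{pmatrix} + c_3 e^{(1-i)t} \begin{pmatrix} -5 + i \\ -5 + i \\ 2 \end{pmatrix}
\]
The two complex solutions are conjugates of each other. To find real solutions
we must find the real and imaginary parts of one of them. These are given by
\[
e^t \begin{pmatrix} -5\cos(t) + \sin(t) \\ -5\cos(t) + \sin(t) \\ 2\cos(t) \end{pmatrix}
\quad \text{and} \quad
e^t \begin{pmatrix} -5\sin(t) - \cos(t) \\ -5\sin(t) - \cos(t) \\ 2\sin(t) \end{pmatrix}
\]
Thus the general solution in real form is given by
\[
a_1 e^t \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} + a_2 e^t \begin{pmatrix} -5\cos(t) + \sin(t) \\ -5\cos(t) + \sin(t) \\ 2\cos(t) \end{pmatrix} + a_3 e^t \begin{pmatrix} -5\sin(t) - \cos(t) \\ -5\sin(t) - \cos(t) \\ 2\sin(t) \end{pmatrix}
\]
To solve the initial value problem, we must solve
\[
a_1 \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix} + a_2 \begin{pmatrix} -5 \\ -5 \\ 2 \end{pmatrix} + a_3 \begin{pmatrix} -1 \\ -1 \\ 0 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}
\]
This means solving the system of equations with augmented matrix
\[
\left(\begin{array}{ccc|c} 0 & -5 & -1 & 1 \\ 1 & -5 & -1 & 1 \\ 0 & 2 & 0 & 1 \end{array}\right)
\]
Reducing this matrix yields
\[
\left(\begin{array}{ccc|c} 1 & -5 & -1 & 1 \\ 0 & -5 & -1 & 1 \\ 0 & 0 & -2 & 7 \end{array}\right)
\]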




which yields the solution $a_1 = 0$, $a_2 = 1/2$ and $a_3 = -7/2$. So the final answer
is
\[
\frac{1}{2} e^t \begin{pmatrix} -5\cos(t) + \sin(t) \\ -5\cos(t) + \sin(t) \\ 2\cos(t) \end{pmatrix}
- \frac{7}{2} e^t \begin{pmatrix} -5\sin(t) - \cos(t) \\ -5\sin(t) - \cos(t) \\ 2\sin(t) \end{pmatrix}
= e^t \begin{pmatrix} \cos(t) + 18\sin(t) \\ \cos(t) + 18\sin(t) \\ \cos(t) - 7\sin(t) \end{pmatrix}
\]
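The coefficients can be recovered with MATLAB's backslash operator, and the combined form of the answer checked at a sample time. In the sketch below the names B and rhs, and the sample time t = 0.7, are arbitrary illustrative choices.

```matlab
% Columns of B are the three real solutions evaluated at t = 0.
B = [0 -5 -1;
     1 -5 -1;
     0  2  0];
rhs = [1; 1; 1];

a = B \ rhs;     % expected [0; 1/2; -7/2]
disp(a');

t = 0.7;         % arbitrary sample time
combo = a(1) * exp(t) * [0; 1; 0] ...
      + a(2) * exp(t) * [-5*cos(t) + sin(t); -5*cos(t) + sin(t); 2*cos(t)] ...
      + a(3) * exp(t) * [-5*sin(t) - cos(t); -5*sin(t) - cos(t); 2*sin(t)];
closed = exp(t) * [cos(t) + 18*sin(t); cos(t) + 18*sin(t); cos(t) - 7*sin(t)];

disp(norm(combo - closed));   % should be at rounding-error level
```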
Solution 6.18
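\[
y'(t) = \begin{pmatrix} 1 & 1 \\ 5 & 1 \end{pmatrix} y(t)
\]
The characteristic polynomial is $(1 - \lambda)^2 - 5 = 0$, hence $\lambda_{1,2} = 1 \pm \sqrt{5}$.
   We need to find the eigenvectors:
\[
\begin{pmatrix} -\sqrt{5} & 1 \\ 5 & -\sqrt{5} \end{pmatrix} \to v_1 = \begin{pmatrix} 1 \\ \sqrt{5} \end{pmatrix};
\qquad
\begin{pmatrix} \sqrt{5} & 1 \\ 5 & \sqrt{5} \end{pmatrix} \to v_2 = \begin{pmatrix} 1 \\ -\sqrt{5} \end{pmatrix}
\]
The general solution is given by
\begin{align*}
y(t) &= c_1 e^{\lambda_1 t} v_1 + c_2 e^{\lambda_2 t} v_2 \\
     &= c_1 e^{(1+\sqrt{5})t} \begin{pmatrix} 1 \\ \sqrt{5} \end{pmatrix} + c_2 e^{(1-\sqrt{5})t} \begin{pmatrix} 1 \\ -\sqrt{5} \end{pmatrix}
\end{align*}
Or, equivalently,
\begin{align*}
y_1(t) &= c_1 e^{(1+\sqrt{5})t} + c_2 e^{(1-\sqrt{5})t} \\
y_2(t) &= c_1 \sqrt{5}\, e^{(1+\sqrt{5})t} - c_2 \sqrt{5}\, e^{(1-\sqrt{5})t}
\end{align*}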

Solution 6.19
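\[
y'(t) = \begin{pmatrix} 3 & -1 \\ 1 & -1 \end{pmatrix} y(t)
\]
The characteristic polynomial is $\lambda^2 - 2\lambda - 2 = 0$, hence $\lambda_{1,2} = 1 \pm \sqrt{3}$.
   We need to find the eigenvectors:
\[
\begin{pmatrix} 2 - \sqrt{3} & -1 \\ 1 & -2 - \sqrt{3} \end{pmatrix} \to v_1 = \begin{pmatrix} 2 + \sqrt{3} \\ 1 \end{pmatrix}
\]
\[
\begin{pmatrix} 2 + \sqrt{3} & -1 \\ 1 & -2 + \sqrt{3} \end{pmatrix} \to v_2 = \begin{pmatrix} 2 - \sqrt{3} \\ 1 \end{pmatrix}
\]
The general solution is given by
\[
y(t) = c_1 e^{(1+\sqrt{3})t} \begin{pmatrix} 2 + \sqrt{3} \\ 1 \end{pmatrix} + c_2 e^{(1-\sqrt{3})t} \begin{pmatrix} 2 - \sqrt{3} \\ 1 \end{pmatrix}
\]
The initial condition states that $y(0) = [0, 1]^T$. Hence
\begin{align*}
c_1 (2 + \sqrt{3}) + c_2 (2 - \sqrt{3}) &= 0 \\
c_1 + c_2 &= 1
\end{align*}
\[
c_1 = \frac{1}{2} - \frac{1}{\sqrt{3}}, \qquad c_2 = \frac{1}{2} + \frac{1}{\sqrt{3}}
\]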




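Or, equivalently,
\begin{align*}
y_1(t) &= -\frac{1}{2\sqrt{3}} e^{(1+\sqrt{3})t} + \frac{1}{2\sqrt{3}} e^{(1-\sqrt{3})t} \\
y_2(t) &= \left(\frac{1}{2} - \frac{1}{\sqrt{3}}\right) e^{(1+\sqrt{3})t} + \left(\frac{1}{2} + \frac{1}{\sqrt{3}}\right) e^{(1-\sqrt{3})t}
\end{align*}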
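One way to cross-check this closed form is to compare it with the matrix-exponential solution y(t) = e^{At} y(0), computed with expm. The sketch below writes y2 as c1 e^{(1+√3)t} + c2 e^{(1−√3)t} with c1 = 1/2 − 1/√3 and c2 = 1/2 + 1/√3; the sample time t = 0.5 is an arbitrary choice.

```matlab
A  = [3 -1; 1 -1];
y0 = [0; 1];
t  = 0.5;          % arbitrary sample time
s  = sqrt(3);

% Closed-form components of the solution.
y1 = -1/(2*s) * exp((1 + s)*t) + 1/(2*s) * exp((1 - s)*t);
y2 = (1/2 - 1/s) * exp((1 + s)*t) + (1/2 + 1/s) * exp((1 - s)*t);

y_ref = expm(A * t) * y0;   % matrix-exponential solution of y' = A*y

disp([y1, y2]);
disp(y_ref');               % should agree with the line above
```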

Solution 6.20
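\[
y'(t) = \begin{pmatrix} -3 & -1 \\ 7 & 1 \end{pmatrix} y(t)
\]
The characteristic polynomial is $\lambda^2 + 2\lambda + 4 = 0$, hence $\lambda_{1,2} = -1 \pm i\sqrt{3}$.
   Finding the eigenvectors:
\[
\begin{pmatrix} -2 - i\sqrt{3} & -1 \\ 7 & 2 - i\sqrt{3} \end{pmatrix} \to v_1 = \begin{pmatrix} -2 + i\sqrt{3} \\ 7 \end{pmatrix}
\]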

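The function below is a solution. We will separate the real and imaginary parts
to get two independent solutions:
\begin{align*}
e^{(-1+i\sqrt{3})t} \begin{pmatrix} -2 + i\sqrt{3} \\ 7 \end{pmatrix}
 &= \frac{1}{e^t} e^{i\sqrt{3}t} \begin{pmatrix} -2 + i\sqrt{3} \\ 7 \end{pmatrix} \\
 &= \frac{1}{e^t} \begin{pmatrix} -2 + i\sqrt{3} \\ 7 \end{pmatrix} \left(\cos(\sqrt{3}t) + i\sin(\sqrt{3}t)\right) \\
 &= \frac{1}{e^t} \begin{pmatrix} -2\cos(\sqrt{3}t) - \sqrt{3}\sin(\sqrt{3}t) \\ 7\cos(\sqrt{3}t) \end{pmatrix}
  + \frac{i}{e^t} \begin{pmatrix} \sqrt{3}\cos(\sqrt{3}t) - 2\sin(\sqrt{3}t) \\ 7\sin(\sqrt{3}t) \end{pmatrix}
\end{align*}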

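The general solution is
\[
y(t) = \frac{c_1}{e^t} \begin{pmatrix} -2\cos(\sqrt{3}t) - \sqrt{3}\sin(\sqrt{3}t) \\ 7\cos(\sqrt{3}t) \end{pmatrix}
     + \frac{c_2}{e^t} \begin{pmatrix} \sqrt{3}\cos(\sqrt{3}t) - 2\sin(\sqrt{3}t) \\ 7\sin(\sqrt{3}t) \end{pmatrix}
\]
   The initial condition states that $y(0) = [2, -1]^T$. Hence
\begin{align*}
-2c_1 + \sqrt{3}\, c_2 &= 2 \\
7c_1 &= -1
\end{align*}
\[
c_1 = -\frac{1}{7}, \qquad c_2 = \frac{12}{7\sqrt{3}}
\]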
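The constants can be checked against the matrix-exponential solution. The sketch below takes the coefficient matrix to be [−3 −1; 7 1], which is the matrix consistent with the characteristic polynomial λ² + 2λ + 4 and with the eigenvector computation above; the sample time t = 0.4 is arbitrary.

```matlab
% Coefficient matrix consistent with lambda^2 + 2*lambda + 4 = 0 (assumed).
A  = [-3 -1; 7 1];
y0 = [2; -1];
t  = 0.4;          % arbitrary sample time
s  = sqrt(3);

c1 = -1/7;
c2 = 12/(7*s);

% Real-form general solution with the constants found above.
y = exp(-t) * ( c1 * [-2*cos(s*t) - s*sin(s*t); 7*cos(s*t)] ...
              + c2 * [ s*cos(s*t) - 2*sin(s*t); 7*sin(s*t)] );

y_ref = expm(A * t) * y0;   % matrix-exponential solution of y' = A*y

disp(y');
disp(y_ref');               % should agree with the line above
```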

Solution 6.21 If A is a 3 × 3 matrix with real entries, then det(A − λI) is
a polynomial of the form $-\lambda^3 + a\lambda^2 + b\lambda + c$. If you plot the graph of this
polynomial, it tends to negative infinity for large positive λ and to positive infinity for
large negative λ. Thus there must be a place where the polynomial crosses
the real axis, that is, a real zero. This means that there is always at least one





real eigenvalue. The other two eigenvalues can be either both real, or complex
conjugates (as in the previous problem).
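Solution 6.22 We have to solve the equation
\[
  x'(t) = \begin{bmatrix} -1 & -1 \\ 1 & -1 \end{bmatrix} x(t)
\]
with
\[
  x(0) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}.
\]
The eigenvalues and eigenvectors of the matrix are
$\lambda_1 = -1 + i$, $x_1 = \begin{bmatrix} 1 \\ -i \end{bmatrix}$, and
$\lambda_2 = -1 - i$, $x_2 = \begin{bmatrix} 1 \\ i \end{bmatrix}$.
Real solutions are the real and imaginary parts of
\[
  e^{(-1+i)t} \begin{bmatrix} 1 \\ -i \end{bmatrix},
\]
which are
\[
  e^{-t} \begin{bmatrix} \cos(t) \\ \sin(t) \end{bmatrix}
  \quad\text{and}\quad
  e^{-t} \begin{bmatrix} \sin(t) \\ -\cos(t) \end{bmatrix}.
\]
Thus the general solution is
\[
  x(t) = a_1 e^{-t} \begin{bmatrix} \cos(t) \\ \sin(t) \end{bmatrix}
       + a_2 e^{-t} \begin{bmatrix} \sin(t) \\ -\cos(t) \end{bmatrix}.
\]
To satisfy the initial conditions, we need
\[
  a_1 \begin{bmatrix} 1 \\ 0 \end{bmatrix}
  + a_2 \begin{bmatrix} 0 \\ -1 \end{bmatrix}
  = \begin{bmatrix} 1 \\ 1 \end{bmatrix},
\]
so $a_1 = 1$, $a_2 = -1$ and the solution is
\[
  x(t) = e^{-t} \begin{bmatrix} \cos(t) - \sin(t) \\ \sin(t) + \cos(t) \end{bmatrix}.
\]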
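As a sanity check, the final answer can be substituted back into the differential equation. The Python sketch below is an assumption-laden illustration (the helper names and the central-difference step are my own choices, not part of the notes); it compares a numerical derivative of the proposed solution with the matrix applied to the solution.

```python
import math


def x(t):
    """Proposed solution e^{-t} [cos t - sin t, sin t + cos t]."""
    e = math.exp(-t)
    return [e * (math.cos(t) - math.sin(t)), e * (math.sin(t) + math.cos(t))]


def rhs(v):
    """Right-hand side A v with A = [[-1, -1], [1, -1]]."""
    return [-v[0] - v[1], v[0] - v[1]]


def derivative(t, h=1e-6):
    """Central-difference approximation to x'(t)."""
    xp, xm = x(t + h), x(t - h)
    return [(xp[i] - xm[i]) / (2 * h) for i in range(2)]


def residual(t):
    """Largest component of x'(t) - A x(t); close to zero for a true solution."""
    d, r = derivative(t), rhs(x(t))
    return max(abs(d[i] - r[i]) for i in range(2))
```

Evaluating `x(0)` checks the initial condition, and `residual(t)` at a few values of t checks the equation itself, up to the error of the finite-difference approximation.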

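Solution 6.23 The current is the same through each component. The voltage
across the resistor is $IR$, so if $V_L$ is the voltage across the inductor, then
$V_L + V + IR = 0$ so $V_L = -V - IR$. The equations are therefore
\[
  I' = \frac{1}{L} V_L = -\frac{R}{L} I - \frac{1}{L} V, \qquad
  V' = \frac{1}{C} I,
\]
or
\[
  \begin{bmatrix} I \\ V \end{bmatrix}'
  = \begin{bmatrix} -\frac{R}{L} & -\frac{1}{L} \\ \frac{1}{C} & 0 \end{bmatrix}
    \begin{bmatrix} I \\ V \end{bmatrix}.
\]
The characteristic polynomial (i.e., $\det(A - \lambda I)$) is $\lambda^2 + (R/L)\lambda + 1/(LC)$, so the
eigenvalues are $\lambda = \left(-R/L \pm \sqrt{R^2/L^2 - 4/(LC)}\right)/2$. Oscillations occur when
the eigenvalues are complex, that is, if $R^2/L^2 < 4/(LC)$ or $R < 2\sqrt{L/C}$. If we set
all the values to 1 the equation becomes
\[
  \begin{bmatrix} I \\ V \end{bmatrix}'
  = \begin{bmatrix} -1 & -1 \\ 1 & 0 \end{bmatrix}
    \begin{bmatrix} I \\ V \end{bmatrix}
\]
and the initial conditions are
\[
  \begin{bmatrix} I(0) \\ V(0) \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}.
\]
The eigenvalues and eigenvectors are
$\lambda_1 = (-1 + i\sqrt{3})/2$, $x_1 = \begin{bmatrix} (-1 + i\sqrt{3})/2 \\ 1 \end{bmatrix}$, and
$\lambda_2 = (-1 - i\sqrt{3})/2$, $x_2 = \begin{bmatrix} (-1 - i\sqrt{3})/2 \\ 1 \end{bmatrix}$.
Real solutions are given by the real and imaginary parts of
\[
  e^{(-1+i\sqrt{3})t/2} \begin{bmatrix} (-1 + i\sqrt{3})/2 \\ 1 \end{bmatrix}
  = e^{-t/2} \begin{bmatrix}
      (-1/2 + i\sqrt{3}/2)\left(\cos(\sqrt{3}t/2) + i\sin(\sqrt{3}t/2)\right) \\
      \cos(\sqrt{3}t/2) + i\sin(\sqrt{3}t/2)
    \end{bmatrix}
\]
\[
  = e^{-t/2} \begin{bmatrix}
      -\cos(\sqrt{3}t/2)/2 - \sqrt{3}\sin(\sqrt{3}t/2)/2
        + i\left(-\sin(\sqrt{3}t/2)/2 + \sqrt{3}\cos(\sqrt{3}t/2)/2\right) \\
      \cos(\sqrt{3}t/2) + i\sin(\sqrt{3}t/2)
    \end{bmatrix}.
\]
So the general solution is
\[
  a_1 e^{-t/2} \begin{bmatrix}
      -\cos(\sqrt{3}t/2)/2 - \sqrt{3}\sin(\sqrt{3}t/2)/2 \\ \cos(\sqrt{3}t/2)
    \end{bmatrix}
  + a_2 e^{-t/2} \begin{bmatrix}
      -\sin(\sqrt{3}t/2)/2 + \sqrt{3}\cos(\sqrt{3}t/2)/2 \\ \sin(\sqrt{3}t/2)
    \end{bmatrix}.
\]
To satisfy the initial conditions, we evaluate at $t = 0$ and need
\[
  a_1 \begin{bmatrix} -1/2 \\ 1 \end{bmatrix}
  + a_2 \begin{bmatrix} \sqrt{3}/2 \\ 0 \end{bmatrix}
  = \begin{bmatrix} 1 \\ 1 \end{bmatrix}.
\]
The second row gives $a_1 = 1$, and then the first row gives $a_2 = \sqrt{3}$.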
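The oscillation condition can be explored for other component values. The Python sketch below is a hedged illustration: the function names are hypothetical, and the numerical values of R, L and C other than the all-ones case are arbitrary examples.

```python
import cmath
import math


def lcr_eigenvalues(R, L, C):
    """Roots of lam^2 + (R/L) lam + 1/(L C) = 0, returned as complex numbers."""
    disc = (R / L) ** 2 - 4.0 / (L * C)
    s = cmath.sqrt(disc)          # complex square root, valid for disc < 0 too
    return ((-R / L + s) / 2, (-R / L - s) / 2)


def oscillates(R, L, C):
    """True when the eigenvalues are complex, i.e. R < 2 sqrt(L/C)."""
    return R < 2 * math.sqrt(L / C)


# The case treated in the solution: R = L = C = 1.
lam1, lam2 = lcr_eigenvalues(1.0, 1.0, 1.0)
```

With all values equal to 1 the eigenvalues have a nonzero imaginary part, while a larger resistance such as R = 3 (an example value) gives two real eigenvalues and no oscillation.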
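Solution 6.24 The equivalent 2 × 2 system is
\[
  z' = \begin{bmatrix} 0 & 1 \\ -6 & 5 \end{bmatrix} z
\]
with
\[
  z(0) = \begin{bmatrix} 1 \\ 0 \end{bmatrix}.
\]
The eigenvalues and eigenvectors are
$\lambda_1 = 2$, $x_1 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$, and
$\lambda_2 = 3$, $x_2 = \begin{bmatrix} 1 \\ 3 \end{bmatrix}$.
The general solution is
\[
  c_1 e^{2t} \begin{bmatrix} 1 \\ 2 \end{bmatrix}
  + c_2 e^{3t} \begin{bmatrix} 1 \\ 3 \end{bmatrix}.
\]
The initial conditions are satisfied if
\[
  c_1 \begin{bmatrix} 1 \\ 2 \end{bmatrix}
  + c_2 \begin{bmatrix} 1 \\ 3 \end{bmatrix}
  = \begin{bmatrix} 1 \\ 0 \end{bmatrix},
\]
which has solution $c_1 = 3$, $c_2 = -2$. So $y(t) = z_1(t) = 3e^{2t} - 2e^{3t}$. To solve this
directly, plug in $e^{\lambda t}$. This solves the equation if $\lambda^2 - 5\lambda + 6 = 0$, or $\lambda = 2$ or 3.
So the general solution is $y(t) = c_1 e^{2t} + c_2 e^{3t}$. Satisfying the initial conditions
$y(0) = 1$ and $y'(0) = 0$ leads to $c_1 = 3$ and $c_2 = -2$ as above.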
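The answer can be checked by direct substitution. The Python sketch below assumes the second order equation is $y'' - 5y' + 6y = 0$, which is inferred here from the characteristic polynomial $\lambda^2 - 5\lambda + 6$ rather than quoted from the problem statement; the function names are illustrative.

```python
import math


def y(t):
    """Proposed solution 3 e^{2t} - 2 e^{3t}."""
    return 3 * math.exp(2 * t) - 2 * math.exp(3 * t)


def dy(t):
    """First derivative, computed by hand: 6 e^{2t} - 6 e^{3t}."""
    return 6 * math.exp(2 * t) - 6 * math.exp(3 * t)


def d2y(t):
    """Second derivative, computed by hand: 12 e^{2t} - 18 e^{3t}."""
    return 12 * math.exp(2 * t) - 18 * math.exp(3 * t)


def residual(t):
    """Left-hand side y'' - 5 y' + 6 y (assumed form of the equation)."""
    return d2y(t) - 5 * dy(t) + 6 * y(t)
```

The initial conditions correspond to `y(0)` and `dy(0)`, and `residual(t)` should vanish up to rounding for any t.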
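Solution 6.25 The equivalent 2 × 2 system is
\[
  z' = \begin{bmatrix} 0 & 1 \\ -1 & -1 \end{bmatrix} z
\]
with
\[
  z(0) = \begin{bmatrix} 1 \\ 0 \end{bmatrix}.
\]
The eigenvalues and eigenvectors are
$\lambda_1 = -1/2 + i\sqrt{3}/2$, $x_1 = \begin{bmatrix} 1 \\ -1/2 + i\sqrt{3}/2 \end{bmatrix}$, and
$\lambda_2 = -1/2 - i\sqrt{3}/2$, $x_2 = \begin{bmatrix} 1 \\ -1/2 - i\sqrt{3}/2 \end{bmatrix}$.
Real solutions are the real and imaginary parts of
\[
  e^{(-1/2 + i\sqrt{3}/2)t} \begin{bmatrix} 1 \\ -1/2 + i\sqrt{3}/2 \end{bmatrix},
\]
which leads to the general solution
\[
  a_1 e^{-t/2} \begin{bmatrix} \cos \\ -(1/2)\cos - (\sqrt{3}/2)\sin \end{bmatrix}
  + a_2 e^{-t/2} \begin{bmatrix} \sin \\ (\sqrt{3}/2)\cos - (1/2)\sin \end{bmatrix}
\]
(the sines and cosines are evaluated at $\sqrt{3}t/2$). Initial conditions require
\[
  a_1 \begin{bmatrix} 1 \\ -1/2 \end{bmatrix}
  + a_2 \begin{bmatrix} 0 \\ \sqrt{3}/2 \end{bmatrix}
  = \begin{bmatrix} 1 \\ 0 \end{bmatrix},
\]
which has solution $a_1 = 1$ and $a_2 = \sqrt{3}/3$. Thus the solution is
$y(t) = z_1(t) = e^{-t/2}\left(\cos(\sqrt{3}t/2) + (\sqrt{3}/3)\sin(\sqrt{3}t/2)\right)$. If we try to solve
directly by substituting $e^{\lambda t}$, we get $\lambda = -1/2 \pm i\sqrt{3}/2$. The real and imaginary
parts of $e^{\lambda t}$ are $e^{-t/2}\cos(\sqrt{3}t/2)$ and $e^{-t/2}\sin(\sqrt{3}t/2)$, so we obtain the general
solution $y(t) = a_1 e^{-t/2}\cos(\sqrt{3}t/2) + a_2 e^{-t/2}\sin(\sqrt{3}t/2)$. Choosing $a_1$ and $a_2$ to solve
the initial condition yields the same answer as above.
Solution 6.26 Set $z_1(t) = y(t)$, $z_2(t) = y'(t)$ and $z_3(t) = y''(t)$. Then
$z = \begin{bmatrix} z_1 \\ z_2 \\ z_3 \end{bmatrix}$ solves the equation
\[
  z' = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ -c & -b & -a \end{bmatrix} z.
\]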
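The same construction works for an equation of any order. The Python sketch below is an illustration under a stated assumption: the equation is taken to be $y^{(n)} + a\,y^{(n-1)} + \dots = 0$ with the coefficients listed from the highest derivative down, which for n = 3 corresponds to $y''' + ay'' + by' + cy = 0$ (inferred from the last row of the matrix above, not quoted from the problem). The function name `companion` is hypothetical.

```python
def companion(coeffs):
    """Matrix A of the first-order system z' = A z for the equation
    y^(n) + coeffs[0]*y^(n-1) + ... + coeffs[n-1]*y = 0,
    where z = [y, y', ..., y^(n-1)]."""
    n = len(coeffs)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        A[i][i + 1] = 1.0                 # z_i' = z_{i+1}
    for j in range(n):
        A[n - 1][j] = -coeffs[n - 1 - j]  # last row: -c, -b, -a when n = 3
    return A
```

For example, `companion([-5.0, 6.0])` gives the 2 × 2 matrix used in Solution 6.24, and a three-entry list `[a, b, c]` gives the 3 × 3 matrix of this solution.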




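Solution 6.27 The general solution is
\[
  (a_1\cos(t) + b_1\sin(t)) \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}
  + (a_2\cos(2t) + b_2\sin(2t)) \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}
  + (a_3\cos(3t) + b_3\sin(3t)) \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}.
\]
To satisfy the initial conditions, we need
\[
  a_1 \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}
  + a_2 \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}
  + a_3 \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}
  = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix},
\]
which has solution $a_1 = 1$, $a_2 = 0$ and $a_3 = 2$, and
\[
  b_1 \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}
  + 2b_2 \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}
  + 3b_3 \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}
  = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix},
\]
which has solution $b_1 = 1/2$, $2b_2 = -1/2$ and $3b_3 = 1$, so that $b_1 = 1/2$,
$b_2 = -1/4$ and $b_3 = 1/3$.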
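Solution 6.28 Going through the analysis with two springs and weights, we
end up with
\[
  K = \begin{bmatrix} -2 & 1 \\ 1 & -1 \end{bmatrix}
\]
and
\[
  b = \begin{bmatrix} 10 \\ 11 \end{bmatrix}.
\]
Then
\[
  x_{eq} = K^{-1} b = \begin{bmatrix} 21 \\ 31 \end{bmatrix},
\]
so if we define $y = x - x_{eq}$, we obtain initial conditions
\[
  y(0) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}
\]
and
\[
  y'(0) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}.
\]
The eigenvalues of $K$ are $\lambda_1 = (-3 + \sqrt{5})/2$ and $\lambda_2 = (-3 - \sqrt{5})/2$ with
eigenvectors $x_1 = \begin{bmatrix} (-1 + \sqrt{5})/2 \\ 1 \end{bmatrix}$ and
$x_2 = \begin{bmatrix} (-1 - \sqrt{5})/2 \\ 1 \end{bmatrix}$. If we set
$\mu_1 = \sqrt{-\lambda_1}$ and $\mu_2 = \sqrt{-\lambda_2}$, then the general solution is
\[
  y(t) = (a_1\cos(\mu_1 t) + b_1\sin(\mu_1 t))\,x_1 + (a_2\cos(\mu_2 t) + b_2\sin(\mu_2 t))\,x_2.
\]
The initial conditions are satisfied if
\[
  a_1 x_1 + a_2 x_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}
\]
and
\[
  \mu_1 b_1 x_1 + \mu_2 b_2 x_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}.
\]
These can be solved by inverting the 2 × 2 matrix whose columns are $x_1$ and $x_2$;
its determinant is $\sqrt{5}$. The answers are
\[
  \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}
  = \frac{1}{\sqrt{5}} \begin{bmatrix} 3/2 + \sqrt{5}/2 \\ -3/2 + \sqrt{5}/2 \end{bmatrix}
\]
and
\[
  \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}
  = \frac{1}{\sqrt{5}} \begin{bmatrix} (3/2 + \sqrt{5}/2)/\mu_1 \\ (-3/2 + \sqrt{5}/2)/\mu_2 \end{bmatrix}.
\]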
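The two 2 × 2 systems for the coefficients can also be solved numerically. The Python sketch below is a minimal illustration, assuming the eigenvalues and eigenvectors stated in the solution; the helper `solve2x2` and the variable names are hypothetical and use the explicit inverse formula for a 2 × 2 matrix.

```python
import math


def solve2x2(M, r):
    """Solve M a = r for a 2x2 matrix M using the explicit inverse formula."""
    (p, q), (s, t) = M
    det = p * t - q * s
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [(t * r[0] - q * r[1]) / det, (-s * r[0] + p * r[1]) / det]


s5 = math.sqrt(5.0)
lam1, lam2 = (-3 + s5) / 2, (-3 - s5) / 2      # eigenvalues of K
x1 = [(-1 + s5) / 2, 1.0]                      # eigenvector for lam1
x2 = [(-1 - s5) / 2, 1.0]                      # eigenvector for lam2
mu1, mu2 = math.sqrt(-lam1), math.sqrt(-lam2)  # both eigenvalues are negative

X = [[x1[0], x2[0]], [x1[1], x2[1]]]           # columns are x1 and x2
a1, a2 = solve2x2(X, [1.0, 1.0])               # a1 x1 + a2 x2 = y(0)
c1, c2 = solve2x2(X, [1.0, 1.0])               # c_k stands for mu_k * b_k
b1, b2 = c1 / mu1, c2 / mu2
```

Substituting the computed coefficients back into $a_1 x_1 + a_2 x_2$ and $\mu_1 b_1 x_1 + \mu_2 b_2 x_2$ is a quick way to confirm that both initial conditions hold.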




M152 notes

  • 1. Contents 1 Introduction 5 1.1 Course Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.2 About the Subject . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1.2.1 Connection to Geometry . . . . . . . . . . . . . . . . . . . 6 1.2.2 Linear Systems . . . . . . . . . . . . . . . . . . . . . . . . 6 1.2.3 Eigen-analysis . . . . . . . . . . . . . . . . . . . . . . . . . 7 1.3 About These Notes . . . . . . . . . . . . . . . . . . . . . . . . . . 8 1.4 About The Computer Labs . . . . . . . . . . . . . . . . . . . . . 9 2 Vectors and Geometry 10 2.1 Chapter Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 10 2.2 Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 2.2.1 Multiplication by a number and vector addition . . . . . . 11 2.2.2 Co-ordinates . . . . . . . . . . . . . . . . . . . . . . . . . 12 2.2.3 Properties of vector addition and scalar multiplication . . 16 2.2.4 MATLAB: basic scalar and vector operations . . . . . . . 17 2.2.5 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 2.3 Geometrical Aspects of Vectors . . . . . . . . . . . . . . . . . . . 19 2.3.1 Length of a vector . . . . . . . . . . . . . . . . . . . . . . 19 2.3.2 The dot product . . . . . . . . . . . . . . . . . . . . . . . 19 2.3.3 Projections . . . . . . . . . . . . . . . . . . . . . . . . . . 22 2.3.4 MATLAB: norm and dot commands . . . . . . . . . . . . 23 2.3.5 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 2.4 Determinants and the Cross Product . . . . . . . . . . . . . . . . 25 2.4.1 The determinant in two and three dimensions . . . . . . . 25 2.4.2 The cross product . . . . . . . . . . . . . . . . . . . . . . 27 2.4.3 The triple product and the determinant in three dimensions 30 2.4.4 MATLAB: assigning matrices and det and cross commands 31 2.4.5 MATLAB: generating scripts with the MATLAB editor . 31 2.4.6 MATLAB: floating point representation of real numbers . 32 2.4.7 Problems . . . . . . 
. . . . . . . . . . . . . . . . . . . . . 33 2.5 Lines and Planes . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 2.5.1 Describing linear sets . . . . . . . . . . . . . . . . . . . . 34 2.5.2 Lines in two dimensions: Parametric form . . . . . . . . . 34 1
  • 2. CONTENTS CONTENTS 2.5.3 Lines in two dimensions: Equation form . . . . . . . . . . 35 2.5.4 Lines in three dimensions: Parametric form . . . . . . . . 35 2.5.5 Lines in three dimensions: Equation form . . . . . . . . . 36 2.5.6 Planes in three dimensions: Parametric form . . . . . . . 37 2.5.7 Planes in three dimensions: Equation form . . . . . . . . 37 2.5.8 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 2.6 Introduction to Linear Systems . . . . . . . . . . . . . . . . . . . 40 2.6.1 Description of points and the geometry of solutions to systems of equations . . . . . . . . . . . . . . . . . . . . . 40 2.6.2 Describing the whole plane in two dimensions and all of space in three dimensions . . . . . . . . . . . . . . . . . . 42 2.6.3 Linear dependence and independence . . . . . . . . . . . . 43 2.6.4 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 44 2.7 Additional Topics . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 2.7.1 Application: rotational motion . . . . . . . . . . . . . . . 45 2.7.2 Application: 3-D graphics . . . . . . . . . . . . . . . . . . 46 2.8 Solutions to Chapter Problems . . . . . . . . . . . . . . . . . . . 49 3 Solving Linear Systems 63 3.1 Linear Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 3.1.1 General Form of Linear Systems . . . . . . . . . . . . . . 63 3.1.2 Solving Linear Systems by Substitution . . . . . . . . . . 63 3.1.3 Elementary row (equation) operations . . . . . . . . . . . 64 3.1.4 Augmented Matrices . . . . . . . . . . . . . . . . . . . . . 65 3.1.5 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 3.2 Gaussian Elimination . . . . . . . . . . . . . . . . . . . . . . . . . 67 3.2.1 Using MATLAB for row reductions . . . . . . . . . . . . . 73 3.2.2 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 74 3.3 Homogeneous Equations . . . . . . . . . . . . . . . . . . . . . . . 76 3.3.1 Properties of solutions of homogeneous systems. . 
. . . . 77 3.3.2 Connection of solutions to homogeneous and inhomoge- neous systems. . . . . . . . . . . . . . . . . . . . . . . . . 77 3.3.3 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 3.4 Geometric Applications . . . . . . . . . . . . . . . . . . . . . . . 80 3.4.1 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 3.5 Resistor Networks . . . . . . . . . . . . . . . . . . . . . . . . . . 82 3.5.1 Elements of Basic Circuits . . . . . . . . . . . . . . . . . . 82 3.5.2 Two Simple Examples Made Complicated . . . . . . . . . 84 3.5.3 Loop Currents . . . . . . . . . . . . . . . . . . . . . . . . 86 3.5.4 Alternate Presentation of Resistor Networks . . . . . . . . 90 3.5.5 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 93 3.6 Additional Topics . . . . . . . . . . . . . . . . . . . . . . . . . . . 96 3.6.1 Quadratic Functions . . . . . . . . . . . . . . . . . . . . . 96 3.6.2 Least squares fit . . . . . . . . . . . . . . . . . . . . . . . 98 3.6.3 Equilibrium configuration of hanging weights and springs 99 3.6.4 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 2
  • 3. CONTENTS CONTENTS 3.7 Solutions to Chapter Problems . . . . . . . . . . . . . . . . . . . 102 4 Matrices and Determinants 117 4.1 Matrix operations . . . . . . . . . . . . . . . . . . . . . . . . . . 117 4.1.1 MATLAB . . . . . . . . . . . . . . . . . . . . . . . . . . . 120 4.1.2 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 120 4.2 Linear Transformations and Matrices . . . . . . . . . . . . . . . . 121 4.2.1 Linear Transformations . . . . . . . . . . . . . . . . . . . 121 4.2.2 Rotations in two dimensions . . . . . . . . . . . . . . . . . 122 4.2.3 Projections in two dimensions . . . . . . . . . . . . . . . . 123 4.2.4 Reflections in two dimensions . . . . . . . . . . . . . . . . 124 4.2.5 Every linear transformation is multiplication by a matrix 125 4.2.6 Composition of linear transformations and matrix product 127 4.2.7 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 127 4.3 Application: random walks . . . . . . . . . . . . . . . . . . . . . 129 4.3.1 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 134 4.4 The Transpose . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136 4.4.1 MATLAB . . . . . . . . . . . . . . . . . . . . . . . . . . . 138 4.4.2 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 138 4.5 Matrix Inverses . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138 4.5.1 Computing the inverse . . . . . . . . . . . . . . . . . . . . 141 4.5.2 Inverses of Products . . . . . . . . . . . . . . . . . . . . . 144 4.5.3 MATLAB . . . . . . . . . . . . . . . . . . . . . . . . . . . 145 4.5.4 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 145 4.6 Return to Resistor Networks . . . . . . . . . . . . . . . . . . . . 146 4.7 Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150 4.7.1 Definition of Determinants . . . . . . . . . . . . . . . . . 150 4.7.2 Determinants of Triangular matrices . . . . . . . . . . . . 
151 4.7.3 Summary of determinant calculation rules . . . . . . . . . 152 4.7.4 Calculation of determinant using row operations . . . . . 153 4.7.5 More expansion formulae . . . . . . . . . . . . . . . . . . 153 4.7.6 MATLAB . . . . . . . . . . . . . . . . . . . . . . . . . . . 155 4.7.7 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 155 4.8 Additional Topics . . . . . . . . . . . . . . . . . . . . . . . . . . . 157 4.8.1 Application: General Least Squares . . . . . . . . . . . . 157 4.8.2 Least squares solutions . . . . . . . . . . . . . . . . . . . . 158 4.8.3 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 159 4.8.4 Elementary matrices . . . . . . . . . . . . . . . . . . . . . 159 4.8.5 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 162 4.8.6 Exchanging two rows changes the sign of the determinant 163 4.8.7 The determinant is linear in each row separately . . . . . 164 4.8.8 Adding a multiple of one row to another doesn’t change the determinant . . . . . . . . . . . . . . . . . . . . . . . 165 4.8.9 The determinant of QA . . . . . . . . . . . . . . . . . . . 166 4.8.10 The determinant of A is zero exactly when A is not invertible167 4.8.11 The product formula: det(AB) = det(A) det(B) . . . . . . 167 3
  • 4. CONTENTS CONTENTS 4.8.12 The determinant of the transpose . . . . . . . . . . . . . . 168 4.8.13 An impractical formula for the inverse . . . . . . . . . . . 169 4.8.14 Cramer’s rule, an impractical way to solve systems . . . . 170 4.8.15 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 170 4.9 Solutions to Chapter Problems . . . . . . . . . . . . . . . . . . . 170 5 Complex numbers 193 5.1 Complex arithmetic . . . . . . . . . . . . . . . . . . . . . . . . . 193 5.2 Complex exponential . . . . . . . . . . . . . . . . . . . . . . . . . 194 5.3 Polar representation of a complex number . . . . . . . . . . . . . 196 5.4 MATLAB . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196 5.5 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197 5.6 Solutions to Chapter Problems . . . . . . . . . . . . . . . . . . . 197 6 Eigen-analysis 200 6.1 Eigenvalues and eigenvectors . . . . . . . . . . . . . . . . . . . . 200 6.1.1 Computing the eigenvalues and eigenvectors . . . . . . . . 201 6.1.2 Complex eigenvalues and eigenvectors . . . . . . . . . . . 207 6.1.3 MATLAB . . . . . . . . . . . . . . . . . . . . . . . . . . . 209 6.1.4 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 210 6.2 Eigenanalysis simplifies matrix powers . . . . . . . . . . . . . . . 211 6.2.1 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 215 6.3 Systems of linear differential equations . . . . . . . . . . . . . . . 215 6.3.1 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 221 6.4 LCR circuits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222 6.4.1 Capacitors and inductors . . . . . . . . . . . . . . . . . . 222 6.4.2 Differential equations for LCR circuits . . . . . . . . . . . 223 6.4.3 Alternate description of LCR circuits . . . . . . . . . . . . 225 6.4.4 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 227 6.5 Additional Topics . . . . . . . . . . . . . . . . . . . . . . . . . . . 
227 6.5.1 Diagonalization . . . . . . . . . . . . . . . . . . . . . . . . 227 6.5.2 Computing high powers of a matrix . . . . . . . . . . . . 229 6.5.3 Another formula for the determinant . . . . . . . . . . . . 230 6.5.4 The matrix exponential and differential equations . . . . . 230 6.5.5 Converting higher order equations into first order systems 231 6.5.6 Springs and weights . . . . . . . . . . . . . . . . . . . . . 233 6.5.7 Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . 238 6.6 Solutions to Chapter Problems . . . . . . . . . . . . . . . . . . . 239 4
  • 5. Chapter 1 Introduction Linear Algebra is a branch of Mathematics. Many applied problems from Sci- ence, Engineering and Finance can be written in terms of Linear Algebra ques- tions. This is also true of Calculus, which is why these two fields are stressed in undergraduate Mathematics education at UBC and other universities. By putting a class of commonly occurring problems into a unified, abstract frame- work, the problems can be studied in detail and well understood. This under- standing can then be taken back to problems in many different fields. It is especially important to realize that very large problems (with potentially mil- lions of unknowns) can be understood in the same framework as the model problems we do by hand in these courses. This course has computer labs that involve the mathematical software, MATLAB, that will allow you to solve larger problems using numerical computations. Unlike Calculus, Linear Algebra does not require a lot of background knowl- edge. Basic operations in Linear Algebra are just arithmetic. However, there is a powerful connection between these simple arithmetic operations and geo- metric quantities. Simple ideas in this course start to become abstract when combined together. Linear Algebra is a subject you can study with very limited matematical background, but you are advised to keep up with course lectures, readings, assignments and computer labs so you won’t be left behind at the transition from concrete ideas to abstract ones. Details of the course taught in the Spring of 2011 can be found at the common web page for all sections of the course, which can be found by taking a link at http://www.math.ubc.ca/~wetton/ On this page, you will find details about the grade breakdown for the course, test dates and links to the page on computer labs. Weekly assignments for the course will be posted here. Other useful material such as old tests for practise can also be found on this page. 5
1.1 Course Goals

The goal of the course is to enable students to

1. recognize linear algebra questions (for which there are straightforward analytic and numerical solution techniques) as parts of applied problems
2. make the connection between geometric properties and analytic quantities (determinants, dot and cross products, eigenvalues, etc.)
3. recognize that a linear system of equations can have a unique solution, infinitely many solutions or no solution, and know how to determine all solutions or that none exist
4. recognize matrix multiplication as a linear transformation and that such transformations (to the same dimensional space) can be simplified using eigen-analysis
5. use complex numbers, which arise naturally in the eigen-analysis of matrices

1.2 About the Subject

The subject of the course is Linear Algebra, focussing on three main topics: vectors and matrices and connections to geometry, linear systems, and eigen-analysis of matrices. Several applications are considered, including resistor networks and random walks.

1.2.1 Connection to Geometry

The first topic considered in the course is vectors, which are quantities with both magnitude and direction. A typical quantity represented as a vector is a force $\mathbf{F}$ on an object as shown in Figure 1.1. The force $\mathbf{F}$ in that figure acts in the $x$-$y$ plane as shown. The vector force $\mathbf{F}$ can be represented by its components $(F_x, F_y)$. Some interesting questions you will be able to answer after completing this course are: what directions are perpendicular to this force?; what are the coordinates of the force in the rotated coordinate system $x'$-$y'$?; what are the coordinates of the force if its direction is rotated? These last two questions are related.

1.2.2 Linear Systems

Consider the following simple example. You probably saw something like this in high school.

Example 1.1 Bob and Sue together have 12 dollars. Sue has 2 dollars more than Bob. How much money does each have?

You can probably guess the solution by trial and error, but let us proceed a bit more formally. Let $x$ be the amount of
money Sue has and $y$ the amount Bob has. The two statements in the example can be written mathematically as

\[ x + y = 12, \qquad x - y = 2. \]

The equations above are a linear system for the unknowns $x$ and $y$. A technique that can be used to solve the system (that is, determine the values of $x$ and $y$ that simultaneously solve both equations above) is substitution. The second equation can be written as

\[ x = y + 2. \]

This can be substituted into the first equation above, eliminating $x$ from the problem:

\[ (y + 2) + y = 12 \quad\text{or}\quad 2y + 2 = 12 \quad\text{so}\quad 2y = 10 \quad\text{so}\quad y = 5. \]

The value $y = 5$ determines $x = 7$ from either of the original relationships. Thus it is determined that Bob has 5 dollars and Sue has 7.

Often (but not always) a linear system of $n$ equations for $n$ unknowns has a unique solution. The example above was for the case $n = 2$. However, the substitution technique used above becomes impractical when $n$ is larger than 3. In this course you will learn the Gaussian Elimination technique to solve linear systems. This systematic method can find all solutions when there are any and also determine if the system has no solutions. This method can be implemented in numerical software and used to solve very large systems.

Figure 1.1: Force vector and coordinate systems.

1.2.3 Eigen-analysis

The final subject of the course is eigen-analysis of matrices and its applications. A simple, motivational example comes from the study of discrete dynamical systems. Consider a sequence of values

\[ x_0, x_1, \cdots, x_n, \cdots \]
where the index $n$ is a time level. Suppose that $x_n$ is determined by the previous value $x_{n-1}$ in the same way for every $n$, that is

\[ x_n = f(x_{n-1}) \quad \text{for every } n \ge 1 \tag{1.1} \]

for a given function $f$. This could describe the population number $x_n$ of a species at year $n$. The simple model assumes that the population the next year only depends on the population this year through the function $f$. If the initial value $x_0$ were given, then the values $x_1, x_2, \cdots, x_n, \cdots$ could be determined using (1.1) repeatedly. A linear problem arises when we take the specific example

\[ f(x) = ax \]

where $a$ is a given constant. In this case, it is easy to compute the entries of the sequence:

\[ x_1 = f(x_0) = a x_0 \]
\[ x_2 = f(x_1) = f(a x_0) = a^2 x_0 \]
\[ \vdots \]
\[ x_n = f(x_{n-1}) = a^n x_0 \]

For this example, we can determine how the sequence behaves very well because we have an expression for $x_n$ above that is easy to understand. There are several cases:

1. If $x_0 = 0$ then $x_n = 0$ for all $n$.
2. If $|a| < 1$ then $\lim_{n\to\infty} x_n = 0$.
3. If $a = 1$ then $x_n = x_0$ for all $n$.
4. If $a = -1$ then the values alternate in sign: $x_n$ has the value $x_0$ if $n$ is even, $-x_0$ if $n$ is odd.
5. If $|a| > 1$ and $x_0 \neq 0$ then the values of the sequence grow in absolute value as $n \to \infty$.

Linear discrete dynamical systems for vectors are also of interest. In these cases, multiplication by the number $a$ in the example above is replaced by multiplication by a matrix $A$. Eigen-analysis of the matrix $A$ allows one to understand how the system behaves as $n \to \infty$ in a similar way to the simple example above.

1.3 About These Notes

The first version of these notes was written by Richard Froese for Math 152 taught in the Spring of 2004. There are many text books on elementary linear algebra material, but none have the material in the order we want for Math 152 for Applied Science students. These notes stress geometric concepts in two and three dimensional space. They also treat applications and numerical
approximation using MATLAB in more detail than most texts. For this reason, the authors have felt it was worthwhile to maintain and improve these notes for Math 152. Additionally, we believe it is a social benefit for students to have access to this material without having to purchase an expensive, commercial text.

An update to the notes was made by Richard Froese for the course in 2007, including solutions to the exercises. The version written for the 2009 course had some updates by Brian Wetton: this introductory chapter, some additional comments on MATLAB commands, a reworked section on linear systems arising from electrical networks, and additional problems and solutions. In addition, the notes were converted to standard LaTeX format to make them easier to maintain.

Substantial revisions for the notes for 2010 were done by Ignacio Rozada, who was supported by a UBC Skylight grant over the Summer of 2009 to add the problems and solutions used in weekly assignments the previous year. He also added additional MATLAB material. Brian Wetton also added additional notes to Chapter 4 on the use of matrix multiplication and inverses in the derivation of solutions to the "fundamental problem" of resistor networks.

These notes for 2011 had minor revisions done by Brian Wetton. Further revisions are planned. There will be additional MATLAB material added to the notes and additional problems and solutions. We have considered having the notes printed and available at cost at the UBC bookstore but are not yet sure of the demand from students. If you have any suggestions for additional material or ways to improve the presentation, please send your ideas to Brian Wetton (wetton@math.ubc.ca).

1.4 About The Computer Labs

The course includes six one-hour computer labs. These are given to small groups of students every other week starting in the second week of the term. Locations and times for the lab sections can be found by following a link from the course web page. The labs are being updated slightly for the 2010 course and will be posted during the term as they are finalized. The labs are designed to be done during the lab hour, and must be handed in at the end of the lab period. It is a good idea to read through the lab notes before going to the lab so you are ready to begin in the lab. After your first lab, you will be able to go to the lab rooms in open hours to improve your MATLAB skills and to prepare for upcoming labs if you find you are not able to complete the labs in the lab hour. Computer lab material, including your knowledge of MATLAB commands, will be tested on midterms and the final exam.

There are two main goals for the labs. The first is to gain familiarity with the computational tool, MATLAB, that is commonly used in later courses and Engineering careers. The second is to be able to solve larger, more interesting applied problems that would otherwise be inaccessible using analytic methods. Seeing the algorithms of MATLAB in action may also help you understand the underlying mathematical concepts you see in the lectures.
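As a small preview of the kind of computation done in the labs, the two examples of Section 1.2 can be reproduced in a few lines of MATLAB. This is only a sketch: the matrix notation and the backslash operator used to solve the linear system are not explained until later in the course, and the values a = 0.5, x0 = 8 and n = 3 are assumptions chosen purely for illustration.

```matlab
% Example 1.1: x + y = 12, x - y = 2 (x is Sue's money, y is Bob's).
A = [1 1; 1 -1];      % coefficients of x and y in the two equations
rhs = [12; 2];        % right-hand sides of the two equations
sol = A \ rhs;        % solve the system; sol(1) is x and sol(2) is y

% Section 1.2.3: iterate x_n = a*x_{n-1} and compare with a^n * x_0.
a = 0.5;  x0 = 8;  n = 3;   % illustrative values (assumptions)
x = x0;
for k = 1:n
    x = a*x;          % one step of the recursion (1.1) with f(x) = a*x
end
closed_form = a^n * x0;     % the formula x_n = a^n x_0
```

Here sol should reproduce the values found by substitution (Sue has 7 dollars, Bob has 5), and the looped value x should agree with closed_form.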
Chapter 2

Vectors and Geometry

2.1 Chapter Introduction

This chapter contains an introduction to vectors, which correspond to points in two, three and higher dimensional spaces. In this chapter, you will become familiar with basic vector operations such as addition, scalar multiplication, length, the dot product, and the cross product (for three dimensional vectors). Vector representation of lines in 2D and 3D and planes in 3D is presented. Criteria for when such objects intersect at unique points are given in terms of determinants. This geometric presentation motivates our study of these kinds of problems in higher dimensional settings in later chapters. Throughout this chapter, MATLAB commands are introduced that perform the operations described in the text. For 2D and 3D problems, using MATLAB is only a convenience. For higher dimensions, doing the computations by hand (even with a calculator) is impractical, and a computational framework like MATLAB is essential to be able to solve linear problems.

2.2 Vectors

Vectors are used to describe quantities that have both a magnitude and a direction. You are probably familiar with vector quantities in two and three dimensions, such as forces and velocities.

Later in this course we will see that vectors can also describe the configuration of a mechanical system of weights and springs, or the collections of voltages and currents in an electrical circuit. These more abstract vector quantities are not so easily visualized since they take values in higher dimensional spaces.

We begin this course by discussing the geometry of vectors in two and three dimensions. In two and three dimensions, vectors can be visualized as arrows. Before we can draw a vector, we have to decide where to place the tail of the vector. If we are drawing forces, we usually put the tail of the vector at the
place where the force is applied. For example, in Figure 2.1 (left) the forces acting on a pendulum bob are gravity and the restraining force along the shaft. If we are drawing the velocity of a particle at a given time, we would place the tail of the velocity vector $\mathbf{v}(t)$ at the position of the particle at that time as shown in Figure 2.1 (right).

Figure 2.1: Forces acting on a pendulum (left) and position and velocity of a particle (right).

Figure 2.2: Scalar multiplication.

Once we have chosen a starting point for the tails of our vectors (i.e., an origin for space), every point in space corresponds to exactly one vector, namely the vector whose tail is at the origin and whose head is at the given point. For example, in Figure 2.1 (right) we have chosen an arbitrary point as the origin (marked with a circle) and identified the position of the particle with the vector $\mathbf{r}(t)$.

2.2.1 Multiplication by a number and vector addition

There are two basic operations defined for vectors. One is multiplication of a vector by a number (also called scalar multiplication). The other is addition of two vectors.

A vector $\mathbf{a}$ can be multiplied by a number (or scalar) $s$ to produce a new vector $s\mathbf{a}$. If $s$ is positive then $s\mathbf{a}$ points in the same direction as $\mathbf{a}$ and has length $s$ times the length of $\mathbf{a}$. This is shown in Figure 2.2. If $s$ is negative then $s\mathbf{a}$ points in the direction opposite to $\mathbf{a}$ and has length $|s|$ times the length of $\mathbf{a}$.
To add two vectors $\mathbf{a}$ and $\mathbf{b}$ we draw the parallelogram that has $\mathbf{a}$ and $\mathbf{b}$ as two of its sides as shown in Figure 2.3. The vector $\mathbf{a} + \mathbf{b}$ has its tail at the origin and its head at the vertex of the parallelogram opposite the origin. Alternatively we can imagine sliding (or translating) one of the vectors, without changing its direction, so that its tail sits on the head of the other vector. (In the diagram we translated the vector $\mathbf{a}$.) The sum $\mathbf{a} + \mathbf{b}$ is then the vector whose tail is at the origin and whose head coincides with the head of the vector we moved.

Figure 2.3: Vector Addition.

Example 2.1 Describe and sketch the following set of points $\{s\mathbf{a} : s \in \mathbb{R}\}$ (that is, the set of all scalar multiples of $\mathbf{a}$) where $\mathbf{a}$ is a non-zero vector in $\mathbb{R}^2$.

The set is a straight line going through the origin with direction $\mathbf{a}$ as shown in Figure 2.4.

2.2.2 Co-ordinates

In order to do calculations with vectors, we have to introduce co-ordinate axes. Once we have chosen in what directions the co-ordinate axes will lie, we can specify a vector by giving its components in the co-ordinate directions.

In Figure 2.5 we see two choices of $x$ and $y$ axes. For the first choice of axes, the vector $\mathbf{a}$ has co-ordinates $[5, 3]$ and for the second choice of axes the co-ordinates are $[\sqrt{34}, 0]$. In a given problem, it makes sense to choose the axes so that at least some of the vectors have a simple representation. For example, in analyzing the forces acting on a pendulum, we would choose the $y$ axis either to be vertical, or to lie along the shaft of the pendulum.

We can choose to write the co-ordinates of a vector in a row, as above, or in a column, like

\[ \begin{bmatrix} 5 \\ 3 \end{bmatrix} \]

Later on, we will almost always write vectors as columns. But in this chapter we will write vectors as rows. Writing vectors as rows saves space on the page, but we will learn later how to write vectors in row form even when we want them to be column vectors for other reasons.

Note: When writing vector coordinates
by hand or in this text, either square or round brackets can be used. However, when using MATLAB, vectors must be created with square brackets (round brackets are used for other purposes).

Figure 2.4: Figure for Example 2.1.

A convenient way to choose the co-ordinate axes is to specify unit vectors (that is, vectors of length one) that lie along each of the axes. These vectors are called standard basis vectors and are denoted $\mathbf{i}$ and $\mathbf{j}$ (in two dimensions) and $\mathbf{i}$, $\mathbf{j}$ and $\mathbf{k}$ (in three dimensions). These vectors are shown in Figure 2.6. (Sometimes they are also denoted $\mathbf{e}_1$ and $\mathbf{e}_2$ (in two dimensions) and $\mathbf{e}_1$, $\mathbf{e}_2$ and $\mathbf{e}_3$ (in three dimensions).)

The unit vectors have co-ordinates

\[ \mathbf{i} = \mathbf{e}_1 = [1, 0], \qquad \mathbf{j} = \mathbf{e}_2 = [0, 1] \]

in two dimensions, and

\[ \mathbf{i} = \mathbf{e}_1 = [1, 0, 0], \qquad \mathbf{j} = \mathbf{e}_2 = [0, 1, 0], \qquad \mathbf{k} = \mathbf{e}_3 = [0, 0, 1] \]

in three dimensions.

Figure 2.5: Two choices of co-ordinate axes.

Figure 2.6: Unit vectors in 2D (left) and 3D (right).

Often, we make no distinction between a vector and its co-ordinate representation. In other words, we regard the co-ordinate axes as being fixed once and for all. Then a vector in two dimensions is simply a list of two numbers (the components) $[a_1, a_2]$, and a vector in three dimensions is a list of three numbers $[a_1, a_2, a_3]$. Vectors in higher dimensions are now easy to define. A vector in $n$ dimensions is a list of $n$ numbers $[a_1, a_2, \ldots, a_n]$.

When a vector is multiplied by a number, each component is scaled by the same amount. Thus if $\mathbf{a} = [a_1, a_2]$, then

\[ s\mathbf{a} = s[a_1, a_2] = [s a_1, s a_2] \]

Similarly, when two vectors are added, their co-ordinates are added component-wise. So if $\mathbf{a} = [a_1, a_2]$ and $\mathbf{b} = [b_1, b_2]$, then

\[ \mathbf{a} + \mathbf{b} = [a_1, a_2] + [b_1, b_2] = [a_1 + b_1, a_2 + b_2] \]

This is shown in Figure 2.7.

Figure 2.7: Adding vector co-ordinates.

The analogous formulae hold in three (and higher) dimensions. If $\mathbf{a} = [a_1, a_2, \ldots, a_n]$ and $\mathbf{b} = [b_1, b_2, \ldots, b_n]$, then

\[ s\mathbf{a} = s[a_1, a_2, \ldots, a_n] = [s a_1, s a_2, \ldots, s a_n] \]
\[ \mathbf{a} + \mathbf{b} = [a_1, a_2, \ldots, a_n] + [b_1, b_2, \ldots, b_n] = [a_1 + b_1, a_2 + b_2, \ldots, a_n + b_n] \]

Example 2.2 Sketch axes $x_1$-$x_2$. Add the vectors (1,1) and (2,-1) to your sketch. Draw these vectors with base point at the origin. Now add the vector (1,-2) to your sketch, starting at the base point (1,1). That is, draw the vector with components 1 to the right and 2 down starting at (1,1). Note: your sketch should show graphically that (1,1)+(1,-2)=(2,-1). See Figure 2.8.
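The component-wise rules can also be checked numerically. The following is a minimal sketch in MATLAB (whose syntax is introduced in Section 2.2.4) using the vectors of Example 2.2; the scalar 3 is an arbitrary choice for illustration.

```matlab
% Component-wise operations on row vectors, as in Example 2.2.
a = [1 1];
b = [1 -2];
c = a + b;        % [1+1, 1+(-2)], the vector (2,-1) of Example 2.2
d = 3*a;          % each component of a is scaled by 3
disp(c)
disp(d)
```

The sum c matches the vector (2,-1) obtained graphically in the example, and d has both components equal to 3.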
Figure 2.8: Figure for Example 2.2.

2.2.3 Properties of vector addition and scalar multiplication

Let $\mathbf{0}$ denote the zero vector. This is the vector all of whose components are zero. The following properties are intuitive and easy to verify.

1. $\mathbf{a} + \mathbf{b} = \mathbf{b} + \mathbf{a}$
2. $\mathbf{a} + (\mathbf{b} + \mathbf{c}) = (\mathbf{a} + \mathbf{b}) + \mathbf{c}$
3. $\mathbf{a} + \mathbf{0} = \mathbf{a}$
4. $\mathbf{a} + (-\mathbf{a}) = \mathbf{0}$
5. $s(\mathbf{a} + \mathbf{b}) = s\mathbf{a} + s\mathbf{b}$
6. $(s + t)\mathbf{a} = s\mathbf{a} + t\mathbf{a}$
7. $(st)\mathbf{a} = s(t\mathbf{a})$
8. $1\mathbf{a} = \mathbf{a}$

They follow from similar properties which hold for numbers. For example, for numbers $a_1$ and $b_1$ we know that $a_1 + b_1 = b_1 + a_1$. Thus

\[ \mathbf{a} + \mathbf{b} = [a_1, a_2] + [b_1, b_2] = [a_1 + b_1, a_2 + b_2] = [b_1 + a_1, b_2 + a_2] = [b_1, b_2] + [a_1, a_2] = \mathbf{b} + \mathbf{a}, \]

so property 1 holds. Convince yourself that the rest of these properties are true. (What is the vector $-\mathbf{a}$?) It might seem like a waste of time fussing over obvious properties such as these. However, we will see when we come to the cross product and matrix product that sometimes such "obvious" properties turn out to be false!
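A numerical spot-check is no substitute for the verification asked for above, but it can catch a misremembered rule. The sketch below tests a few of the properties in MATLAB on sample vectors and scalars that are arbitrary choices for illustration (integers, so that the comparisons are exact).

```matlab
% Spot-check some properties of Section 2.2.3 on sample vectors.
a = [1 2];  b = [3 -1];  c = [-2 5];   % arbitrary sample vectors
s = 2;  t = -3;                        % arbitrary sample scalars

p1 = isequal(a + b, b + a);              % property 1
p2 = isequal(a + (b + c), (a + b) + c);  % property 2
p5 = isequal(s*(a + b), s*a + s*b);      % property 5
p6 = isequal((s + t)*a, s*a + t*a);      % property 6
disp([p1 p2 p5 p6])   % an entry is 1 (true) when the property holds for these values
```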
2.2.4 MATLAB: basic scalar and vector operations

On the course web page there is a link to the Math 152 computer lab page, where information on the location of computer labs can be found, as well as how to start up the MATLAB program on the computers in these labs. In the command window at the prompt >>, you can type MATLAB commands directly. Some basic commands are given below.

assignment: Scalar and vector variables can be assigned using the "=" operator. For example

    a = 2

followed by <enter> assigns the scalar value of 2 to the variable a. The result of the command is printed out, although this can be suppressed by using a semicolon at the end of the command:

    a = 2;

Here a is still assigned the value of 2 but no output is generated. Vector variables are assigned with the following notation:

    b = [1 2];

Note that b = [1, 2] has the same meaning in MATLAB, i.e., numbers separated by a comma or a space imply row vectors. For column vectors, the entries have to be separated by semicolons:

    b1 = [2; 3];

Note also that there are no special distinctions between the names of scalar and vector variables.

addition: Both scalar and vector addition can be done with the "+" operator. Keeping the values of scalar a and vector b above, we enter the commands

    a2 = 5;
    b2 = [2 9];
    a+a2
    c = b+b2;

The first two lines above assign a new scalar and vector. The third line prints out the answer 7 (2+5). The last line assigns the resulting vector [3 11] ([1 2] + [2 9]) to the new vector c but prints nothing.

scalar multiplication: Scalar multiplication (of vectors and other scalars) is implemented using the "*" command. Using the variables defined above,
    a*a2
    a*b

would result in 10 (2 times 5) and [2 4] (2 times [1 2]). The "*" command also implements matrix-vector and matrix-matrix multiplication discussed later in the course. Vector-vector products (dot products and cross products) are implemented using different commands as discussed in the next section.

other commands: There are many useful functions built in to MATLAB such as sqrt (square root), cos (cosine, taking an argument in radians), acos (inverse cosine, giving a result in radians) and many more. They are called as follows

    sqrt(2)

which will return $\sqrt{2}$ to 4 decimal places. Typing help followed by a command name gives you a description of that command. Try typing help atan2 since atan2 is a pretty useful function. These MATLAB functions can take vector arguments, in which case they act on each entry of the vector. For example

    sqrt([1 4])

will produce the vector [1 2].

2.2.5 Problems

Problem 2.1 Sketch axes $x_1$-$x_2$. Add the vectors (2,2) and (1,-1) to your sketch. Draw these vectors with base point at the origin. Now add the vector (1,-1) to your sketch, starting at the base point (2,2). That is, draw the vector with components 1 to the right and 1 down starting at (2,2). Note: your sketch should show graphically that (2,2)+(1,-1)=(3,1).

Problem 2.2 Let $\mathbf{a}$, $\mathbf{b}$ and $\mathbf{c}$ be fixed non-zero vectors. Describe and sketch the following sets of points in two and three dimensions:
(i) $\{s\mathbf{a} : s \in \mathbb{R}\}$ (i.e., the set of all scalar multiples of $\mathbf{a}$)
(ii) $\{s\mathbf{a} : s > 0\}$ (i.e., the set of all positive scalar multiples of $\mathbf{a}$)
(iii) $\{\mathbf{b} + s\mathbf{a} : s \in \mathbb{R}\}$
(iv) $\{s\mathbf{a} + t\mathbf{b} : s, t \in \mathbb{R}\}$
(v) $\{\mathbf{c} + s\mathbf{a} + t\mathbf{b} : s, t \in \mathbb{R}\}$

Problem 2.3 Describe the vectors $\mathbf{a} - \mathbf{b}$ and $\mathbf{b} - \mathbf{a}$.

Problem 2.4 Find an expression for the midpoint between $\mathbf{a}$ and $\mathbf{b}$. Find an expression for a point one third of the way between $\mathbf{a}$ and $\mathbf{b}$.

Problem 2.5 Find an expression for the line segment joining $\mathbf{a}$ and $\mathbf{b}$.
2.3 Geometrical Aspects of Vectors

2.3.1 Length of a vector

It follows from the Pythagorean formula that the length $\|\mathbf{a}\|$ of $\mathbf{a} = [a_1, a_2]$ satisfies $\|\mathbf{a}\|^2 = a_1^2 + a_2^2$. This is shown in Figure 2.9.

Figure 2.9: Pythagorean Formula.

Thus

\[ \|\mathbf{a}\| = \sqrt{a_1^2 + a_2^2}. \]

Similarly, for a vector $\mathbf{a} = [a_1, a_2, a_3]$ in three dimensions,

\[ \|\mathbf{a}\| = \sqrt{a_1^2 + a_2^2 + a_3^2}. \]

The distance between two vectors $\mathbf{a}$ and $\mathbf{b}$ is the length of the difference $\mathbf{b} - \mathbf{a}$.

2.3.2 The dot product

The dot product of two vectors is defined in both two and three dimensions (actually in any dimension). The result is a number. Two main uses of the dot product are testing for orthogonality and computing projections.

The dot product of $\mathbf{a} = [a_1, a_2]$ and $\mathbf{b} = [b_1, b_2]$ is given by

\[ \mathbf{a} \cdot \mathbf{b} = a_1 b_1 + a_2 b_2. \]

Similarly, the dot product of $\mathbf{a} = [a_1, a_2, a_3]$ and $\mathbf{b} = [b_1, b_2, b_3]$ is given by

\[ \mathbf{a} \cdot \mathbf{b} = a_1 b_1 + a_2 b_2 + a_3 b_3. \]

The properties of the dot product are as follows:

0. If $\mathbf{a}$ and $\mathbf{b}$ are vectors, then $\mathbf{a} \cdot \mathbf{b}$ is a number.
1. $\mathbf{a} \cdot \mathbf{a} = \|\mathbf{a}\|^2$.
2. $\mathbf{a} \cdot \mathbf{b} = \mathbf{b} \cdot \mathbf{a}$.
3. $\mathbf{a} \cdot (\mathbf{b} + \mathbf{c}) = \mathbf{a} \cdot \mathbf{b} + \mathbf{a} \cdot \mathbf{c}$.
4. $s(\mathbf{a} \cdot \mathbf{b}) = (s\mathbf{a}) \cdot \mathbf{b}$.
5. $\mathbf{0} \cdot \mathbf{a} = 0$.
6. $\mathbf{a} \cdot \mathbf{b} = \|\mathbf{a}\| \|\mathbf{b}\| \cos(\theta)$, where $\theta$ is the angle between $\mathbf{a}$ and $\mathbf{b}$.
7. $\mathbf{a} \cdot \mathbf{b} = 0 \iff \mathbf{a} = \mathbf{0}$ or $\mathbf{b} = \mathbf{0}$ or $\mathbf{a}$ and $\mathbf{b}$ are orthogonal (i.e., perpendicular).

Properties 0 to 5 are easy consequences of the definitions. For example, to verify property 5 we write

\[ \mathbf{0} \cdot \mathbf{a} = [0, 0, 0] \cdot [a_1, a_2, a_3] = 0a_1 + 0a_2 + 0a_3 = 0. \]

Property 6 is the most important property and is often taken as the definition of the angle $\theta$ between vectors $\mathbf{a}$ and $\mathbf{b}$. Notice that our definition is given in terms of the components of the vectors, which depend on how we chose the co-ordinate axes. It is not at all clear that if we change the co-ordinate axes, and hence the co-ordinates of the vectors, we will get the same answer for the dot product. However, property 6 says that the dot product only depends on the lengths of the vectors and the angle between them. These quantities are independent of how co-ordinate axes are chosen, and hence so is the dot product.

To show that property 6 holds we compute $\|\mathbf{a} - \mathbf{b}\|^2$ in two different ways. First of all, using properties 1 to 5, we have

\[ \|\mathbf{a} - \mathbf{b}\|^2 = (\mathbf{a} - \mathbf{b}) \cdot (\mathbf{a} - \mathbf{b}) = \mathbf{a} \cdot \mathbf{a} - \mathbf{a} \cdot \mathbf{b} - \mathbf{b} \cdot \mathbf{a} + \mathbf{b} \cdot \mathbf{b} = \|\mathbf{a}\|^2 + \|\mathbf{b}\|^2 - 2\,\mathbf{a} \cdot \mathbf{b} \tag{2.1} \]

(Which properties were used in each step?) Next we compute $\|\mathbf{a} - \mathbf{b}\|$ as depicted in Figure 2.10 (left).

Figure 2.10: The vectors a, b and a − b (left). Lengths of segments (right).
We mark the lengths of each of the line segments in Figure 2.10 (right). Using Pythagoras' theorem for the right angled triangle on the right of this diagram, we see that

\[ \|\mathbf{a} - \mathbf{b}\|^2 = (\|\mathbf{a}\| - \|\mathbf{b}\| \cos(\theta))^2 + \|\mathbf{b}\|^2 \sin^2(\theta). \]

Thus, using $\cos^2(\theta) + \sin^2(\theta) = 1$,

\[ \|\mathbf{a} - \mathbf{b}\|^2 = \|\mathbf{a}\|^2 + \|\mathbf{b}\|^2 \cos^2(\theta) - 2\|\mathbf{a}\| \|\mathbf{b}\| \cos(\theta) + \|\mathbf{b}\|^2 \sin^2(\theta) = \|\mathbf{a}\|^2 + \|\mathbf{b}\|^2 - 2\|\mathbf{a}\| \|\mathbf{b}\| \cos(\theta) \tag{2.2} \]

Actually, this is just the cosine law applied to the triangle in Figure 2.10 and you may have been able to write (2.2) directly. Now we equate the two expressions (2.1, 2.2) for $\|\mathbf{a} - \mathbf{b}\|^2$. This gives

\[ \|\mathbf{a}\|^2 + \|\mathbf{b}\|^2 - 2\,\mathbf{a} \cdot \mathbf{b} = \|\mathbf{a}\|^2 + \|\mathbf{b}\|^2 - 2\|\mathbf{a}\| \|\mathbf{b}\| \cos(\theta) \]

Subtracting $\|\mathbf{a}\|^2 + \|\mathbf{b}\|^2$ from both sides and dividing by $-2$ now yields

\[ \mathbf{a} \cdot \mathbf{b} = \|\mathbf{a}\| \|\mathbf{b}\| \cos(\theta). \]

This proves property 6.

Property 7 now follows directly from 6. If $\mathbf{a} \cdot \mathbf{b} = 0$ then $\|\mathbf{a}\| \|\mathbf{b}\| \cos(\theta) = 0$ so either $\|\mathbf{a}\| = 0$, in which case $\mathbf{a} = \mathbf{0}$, or $\|\mathbf{b}\| = 0$, in which case $\mathbf{b} = \mathbf{0}$, or $\cos(\theta) = 0$, which implies that $\theta = \pi/2$ (since $\theta$ lies between 0 and $\pi$). This implies $\mathbf{a}$ and $\mathbf{b}$ are orthogonal.

Property 6 can be used to compute the angle between two vectors as shown in the example below.

Example 2.3 What is the angle between the vectors whose tails lie at the centre of a cube and whose heads lie on adjacent vertices? To compute this take a cube of side length 2 and centre it at the origin, so that the vertices lie at the points $[\pm 1, \pm 1, \pm 1]$. Then we must find the angle between $\mathbf{a} = [1, 1, 1]$ and $\mathbf{b} = [-1, 1, 1]$. Since

\[ \mathbf{a} \cdot \mathbf{b} = -1 + 1 + 1 = 1 = \|\mathbf{a}\| \|\mathbf{b}\| \cos(\theta) = \sqrt{3}\sqrt{3} \cos(\theta) \]

we obtain

\[ \theta = \arccos(1/3) \approx 1.231 \quad (\approx 70.5^\circ) \]

Here is an example to review the basic operations on vectors we know so far.

Example 2.4 Consider the vectors $\mathbf{a} = (2, 3)$ and $\mathbf{b} = (1, -3)$ in $\mathbb{R}^2$. Compute the following:
(a) $\mathbf{a} + \mathbf{b}$
(b) $3\mathbf{a}$
(c) $2\mathbf{a} + 4\mathbf{b}$
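(d) $\mathbf{a} \cdot \mathbf{b}$
(e) $\|\mathbf{b}\|$

Solutions:
(a) $\mathbf{a} + \mathbf{b} = (2, 3) + (1, -3) = (3, 0)$
(b) $3\mathbf{a} = 3(2, 3) = (6, 9)$
(c) $2\mathbf{a} + 4\mathbf{b} = 2(2, 3) + 4(1, -3) = (4, 6) + (4, -12) = (8, -6)$
(d) $\mathbf{a} \cdot \mathbf{b} = (2, 3) \cdot (1, -3) = 2 - 9 = -7$
(e) $\|\mathbf{b}\| = \sqrt{1^2 + (-3)^2} = \sqrt{10}$

2.3.3 Projections

Suppose $\mathbf{a}$ and $\mathbf{b}$ are two vectors. The projection of $\mathbf{a}$ in the direction of $\mathbf{b}$, denoted $\mathrm{proj}_{\mathbf{b}}\mathbf{a}$, is the vector in the direction of $\mathbf{b}$ whose length is determined by drawing a line perpendicular to $\mathbf{b}$ that goes through $\mathbf{a}$. In other words, the length of $\mathrm{proj}_{\mathbf{b}}\mathbf{a}$ is the component of $\mathbf{a}$ in the direction of $\mathbf{b}$. This is shown in Figure 2.11.

Figure 2.11: Projection.

To compute $\mathrm{proj}_{\mathbf{b}}\mathbf{a}$, we first note that it is a multiple of $\mathbf{b}$. Thus $\mathrm{proj}_{\mathbf{b}}\mathbf{a} = s\mathbf{b}$ for some number $s$. To compute $s$, we use the fact that the vector $\mathrm{proj}_{\mathbf{b}}\mathbf{a} - \mathbf{a}$ (along the dotted line in the diagram) is orthogonal to $\mathbf{b}$. Thus $(\mathrm{proj}_{\mathbf{b}}\mathbf{a} - \mathbf{a}) \cdot \mathbf{b} = 0$, or $(s\mathbf{b} - \mathbf{a}) \cdot \mathbf{b} = 0$, or $s = (\mathbf{a} \cdot \mathbf{b})/(\mathbf{b} \cdot \mathbf{b}) = (\mathbf{a} \cdot \mathbf{b})/\|\mathbf{b}\|^2$. Thus

\[ \mathrm{proj}_{\mathbf{b}}\mathbf{a} = \frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{b}\|^2}\,\mathbf{b}. \]

If $\mathbf{b}$ is a unit vector (i.e., $\|\mathbf{b}\| = 1$) this expression is even simpler. In this case

\[ \mathrm{proj}_{\mathbf{b}}\mathbf{a} = (\mathbf{a} \cdot \mathbf{b})\,\mathbf{b}. \]

Projections are useful for computing the components of a vector in various directions. An easy example is given by the co-ordinates of a vector. These are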
simply the components of a vector in the direction of the standard basis vectors. So in two dimensions

\[ a_1 = \mathbf{a} \cdot \mathbf{i} = [a_1, a_2] \cdot [1, 0] \]
\[ a_2 = \mathbf{a} \cdot \mathbf{j} = [a_1, a_2] \cdot [0, 1] \]

2.3.4 MATLAB: norm and dot commands

MATLAB has built-in functions that implement most of the mathematical operations introduced in this course. For example, the commands

norm(a) returns the length (norm) of the vector a.

dot(a,b) returns the dot product of the vectors a and b (if the vectors do not have the same length, an error results as you would expect).

Using these commands and scalar multiplication of vectors, a projection of a onto the direction b can be implemented:

    (dot(a,b)/norm(b)^2)*b

where / denotes division (of scalar quantities in this case) and ^p gives the p-th power of a quantity.

2.3.5 Problems

Problem 2.6 Consider the vectors $\mathbf{a} = (1, 2)$ and $\mathbf{b} = (1, -2)$ in $\mathbb{R}^2$ (the set of vectors with 2 components). Compute the following:
1. $\mathbf{a} + \mathbf{b}$
2. $2\mathbf{a}$
3. $\mathbf{a} - \mathbf{b}$
4. $\mathbf{a} \cdot \mathbf{b}$
5. $\|\mathbf{b}\|$

Problem 2.7 A circle in the $x_1$-$x_2$ plane has centre at (2,5). A given point on its circumference is (3,3). Write an equation that describes all the points $(x_1, x_2)$ on the circle.

Problem 2.8 Find the equation of a sphere centred at $\mathbf{a} = [a_1, a_2, a_3]$ with radius $r$. (Hint: the sphere is the set of points $\mathbf{x} = [x_1, x_2, x_3]$ whose distance from $\mathbf{a}$ is $r$.)

Problem 2.9 Find the equation of a sphere if one of its diameters has endpoints $[2, 1, 4]$ and $[4, 3, 10]$.
Problem 2.10 Compute the dot product of the vectors $\mathbf{a}$ and $\mathbf{b}$ and find the angle between them.
(i) $\mathbf{a} = [1, 2]$, $\mathbf{b} = [-2, 3]$
(ii) $\mathbf{a} = [-1, 2]$, $\mathbf{b} = [1, 1]$
(iii) $\mathbf{a} = [1, 1]$, $\mathbf{b} = [2, 2]$
(iv) $\mathbf{a} = [1, 2, 1]$, $\mathbf{b} = [-1, 1, 1]$
(v) $\mathbf{a} = [-1, 2, 3]$, $\mathbf{b} = [3, 0, 1]$

Problem 2.11 Let $\mathbf{a} = (1, 1, 1)$ and $\mathbf{b} = (3, 1, -2)$. Compute the following:
1. The angle between $\mathbf{a}$ and $\mathbf{b}$.
2. $\mathrm{proj}_{\mathbf{a}}\mathbf{b}$ (the projection of $\mathbf{b}$ in the direction of $\mathbf{a}$).

Problem 2.12 Let $\mathbf{a} = (1, 4, 0)$ and $\mathbf{b} = (2, -1, 5)$. Compute the following:
(a) The angle between $\mathbf{a}$ and $\mathbf{b}$.
(b) $\mathrm{proj}_{\mathbf{a}}\mathbf{b}$ (the projection of $\mathbf{b}$ in the direction of $\mathbf{a}$).

Problem 2.13 For which value of $s$ is the vector $[1, 2, s]$ orthogonal to $[-1, 1, 1]$?

Problem 2.14 Does the triangle with vertices $[1, 2, 3]$, $[4, 0, 5]$ and $[3, 4, 6]$ have a right angle?

Problem 2.15 Determine the values of $c_1$ and $c_2$ such that the vector $[c_1 \; 1 \; c_2]$ is a scalar multiple of $[2 \; {-2} \; 3]$.

Problem 2.16 An airplane with an approach speed of 70 knots is on approach to runway 26 (i.e., pointing in the direction of 260 degrees). This is shown in Figure 2.12 (left). If the wind is from 330 degrees at 10 knots, what heading should the pilot maintain to stay lined up with the runway? What is the groundspeed of the airplane?

Problem 2.17 Suppose the pendulum shaft makes an angle of $\theta$ with the vertical direction as shown in Figure 2.12 (right). The force of gravity has magnitude (length) equal to $mg$ and points downwards. The force along the shaft of the pendulum acts to keep the shaft rigid, i.e., the component of the total force along the shaft is zero. Write down the co-ordinates of the two forces and the total force using two different sets of co-ordinate axes: one horizontal and vertical, and one parallel to and orthogonal to the shaft of the pendulum.

Problem 2.18 (MATLAB) In MATLAB code, if one defines a vector $\mathbf{a} = [a_1, a_2, \cdots, a_n]$, with the $a_i$'s being any numbers, the output of typing a(j) would be $a_j$. Suppose that you have a two element vector. How would you write a line of MATLAB code to compute the norm of the vector, without using the norm or dot commands?
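Before moving on, the commands of Section 2.3.4 can be exercised on the worked examples of this section. The sketch below recomputes the angle of Example 2.3 and, as an additional illustration that is not in the text, projects the first vector of Example 2.4 onto the second. The variable names u, v, p and residual are chosen here only for the illustration.

```matlab
% Example 2.3: angle between a = [1 1 1] and b = [-1 1 1].
a = [1 1 1];
b = [-1 1 1];
theta = acos(dot(a,b)/(norm(a)*norm(b)));   % angle in radians, from property 6
theta_deg = theta*180/pi;                   % the same angle in degrees

% Projection of u in the direction of v, using the vectors of Example 2.4.
u = [2 3];
v = [1 -3];
p = (dot(u,v)/norm(v)^2)*v;                 % proj_v u from Section 2.3.3
residual = dot(p - u, v);                   % p - u is orthogonal to v, so this is ~0
```

The value of theta should agree with arccos(1/3) from Example 2.3, and residual should be zero up to round-off, which is the orthogonality condition used to derive the projection formula.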
Figure 2.12: Runway diagram for Problem 2.16 (left) and the pendulum of Problem 2.17 (right).

2.4 Determinants and the Cross Product

2.4.1 The determinant in two and three dimensions

The determinant is a number that is associated with a square matrix, that is, a square array of numbers. In two dimensions it is defined by

\[ \det \begin{bmatrix} a_1 & a_2 \\ b_1 & b_2 \end{bmatrix} = a_1 b_2 - a_2 b_1. \]

The definition in three dimensions is

\[ \det \begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix} = a_1 \det \begin{bmatrix} b_2 & b_3 \\ c_2 & c_3 \end{bmatrix} - a_2 \det \begin{bmatrix} b_1 & b_3 \\ c_1 & c_3 \end{bmatrix} + a_3 \det \begin{bmatrix} b_1 & b_2 \\ c_1 & c_2 \end{bmatrix} \]
\[ = a_1 b_2 c_3 - a_1 b_3 c_2 + a_2 b_3 c_1 - a_2 b_1 c_3 + a_3 b_1 c_2 - a_3 b_2 c_1 \]

We want to determine the relationship between the determinant and the vectors $\mathbf{a} = [a_1, a_2]$ and $\mathbf{b} = [b_1, b_2]$ (in two dimensions) and $\mathbf{a} = [a_1, a_2, a_3]$, $\mathbf{b} = [b_1, b_2, b_3]$ and $\mathbf{c} = [c_1, c_2, c_3]$ (in three dimensions). We will do the two dimensional case now, but postpone the three dimensional case until after we have discussed the cross product.

So let $\mathbf{a} = [a_1, a_2]$ and $\mathbf{b} = [b_1, b_2]$ be two vectors in the plane. Define

\[ \hat{\mathbf{a}} = [-a_2, a_1]. \]

Notice that $\hat{\mathbf{a}}$ has the same length as $\mathbf{a}$, and is perpendicular to $\mathbf{a}$, since

\[ \hat{\mathbf{a}} \cdot \mathbf{a} = -a_2 a_1 + a_1 a_2 = 0. \]

There are exactly two vectors with these properties. The vector $\hat{\mathbf{a}}$ is the one that is obtained from $\mathbf{a}$ by a counterclockwise rotation of $\pi/2$ (i.e., $90^\circ$). To see this, notice that if $\mathbf{a}$ lies in the first quadrant (that is, $a_1 > 0$ and $a_2 > 0$)
then $\hat{\mathbf{a}}$ lies in the second quadrant, and so on. Later in the course we will study rotations and this will be a special case.

Notice now that the determinant can be written as a dot product.

\[ \hat{\mathbf{a}} \cdot \mathbf{b} = -a_2 b_1 + a_1 b_2 = \det \begin{bmatrix} a_1 & a_2 \\ b_1 & b_2 \end{bmatrix} \]

We want to use the geometric formula for the dot product of $\hat{\mathbf{a}}$ and $\mathbf{b}$. Let $\theta$ be the angle between $\mathbf{a}$ and $\mathbf{b}$ and $\pi/2 - \theta$ be the angle between $\hat{\mathbf{a}}$ and $\mathbf{b}$, as shown in Figure 2.13.

Figure 2.13: The vector $\hat{\mathbf{a}}$.

Using the geometric meaning of the dot product, we obtain

\[ \det \begin{bmatrix} a_1 & a_2 \\ b_1 & b_2 \end{bmatrix} = \hat{\mathbf{a}} \cdot \mathbf{b} = \|\hat{\mathbf{a}}\| \|\mathbf{b}\| \cos(\pi/2 - \theta) = \|\mathbf{a}\| \|\mathbf{b}\| \sin(\theta) \]

We need to be a bit careful here. When we were discussing the dot product, we always assumed that the angle between two vectors was in the range 0 to $\pi$. In fact, the geometric formula for the dot product is not sensitive to how we measure the angle. Suppose that instead of $\theta$ in the range 0 to $\pi$ we use $\theta_1 = -\theta$ (measuring the angle "backwards") or $\theta_2 = 2\pi - \theta$ (measuring the angle going the long way around the circle). Since $\cos(\theta) = \cos(-\theta) = \cos(2\pi - \theta)$ we have

\[ \mathbf{c} \cdot \mathbf{d} = \|\mathbf{c}\| \|\mathbf{d}\| \cos(\theta) = \|\mathbf{c}\| \|\mathbf{d}\| \cos(\theta_1) = \|\mathbf{c}\| \|\mathbf{d}\| \cos(\theta_2). \]

In other words, the geometric formula for the dot product is still true.

In the diagram above, we want to let the angle $\theta$ between $\mathbf{a}$ and $\mathbf{b}$ range between $-\pi$ and $\pi$. In this case the angle $\pi/2 - \theta$ between $\hat{\mathbf{a}}$ and $\mathbf{b}$ is sometimes not in the range between 0 and $2\pi$. But if this happens, then it is still the angle between $\hat{\mathbf{a}}$ and $\mathbf{b}$, just "backwards" or "the long way around." Thus the geometric formula above is still correct.
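The chain of equalities above can be checked numerically. The following MATLAB sketch uses two sample vectors that are arbitrary choices for illustration. It relies on the built-in det command, which the notes have not introduced at this point, on the a(j) indexing described in Problem 2.18, and on atan2 to measure the signed angle from a to b; these are assumptions of the sketch rather than part of the derivation.

```matlab
% Check det = ahat . b = |a||b| sin(theta) for sample vectors in the plane.
a = [2 1];
b = [1 3];                          % arbitrary sample vectors
ahat = [-a(2) a(1)];                % a rotated counterclockwise by pi/2
d1 = det([a; b]);                   % determinant of the matrix with rows a and b
d2 = dot(ahat, b);                  % the dot product form
theta = atan2(b(2), b(1)) - atan2(a(2), a(1));   % signed angle from a to b
d3 = norm(a)*norm(b)*sin(theta);    % the geometric form
disp([d1 d2 d3])                    % the three values agree up to round-off
```

For these vectors the definition gives 2·3 − 1·1 = 5, so each of the three computed values should be 5 up to round-off.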
  • 27. Chapter 2. Vectors and Geometry 2.4. Determinants and the Cross Product Values of θ between 0 and π correspond to the situation where the direction of b is obtained from the direction of a by a counterclockwise rotation of less than π. This is the case in the diagram. On the other hand, θ between −π and 0 corresponds to the case where a clockwise rotation of less than π is needed to get from the direction of a to the direction of b. The quantity sin(θ) can be positive or negative, depending on the orienta- tions of a and b, but in any case the positive quantity b | sin(θ)| is the height of the parallelogram spanned by a and b if we take a to be the base. In this case, the length of the base is a . Recall that the area of a parallelogram is the length of the base times the height. Thus a1 a2 det = Area of parallelogram spanned by a and b b1 b2 The determinant is positive if sin(θ) is positive, that is, if θ is positive. This is the case if the direction of b is obtained by a counterclockwise rotation of half a circle or less from the direction of a. Otherwise the determinant is negative. Notice that the determinant whose rows are the components of two non-zero vectors a and b is zero exactly when the vectors a and b are pointing in the same direction, or in the opposite direction, that is, if one is obtained from the other by scalar multiplication. The sign of the determinant gives information about their relative orientation. 2.4.2 The cross product Unlike the dot product, the cross product is only defined for vectors in three dimensions. And unlike the dot product, the cross product of two vectors is another vector, not a number. If a = [a1 , a2 , a3 ] and b = [b1 , b2 , b3 ], then a × b is a vector given by a × b = [a2 b3 − a3 b2 , a3 b1 − a1 b3 , a1 b2 − a2 b1 ]. An easy way to remember this is to write down a 3 × 3 matrix whose first row contains the unit basis vectors and whose second and third rows contain the components of a and b. 
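Then the cross product is obtained by following the usual rules for computing a 3 × 3 determinant.

\[ \det \begin{bmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{bmatrix} = \mathbf{i} \det \begin{bmatrix} a_2 & a_3 \\ b_2 & b_3 \end{bmatrix} - \mathbf{j} \det \begin{bmatrix} a_1 & a_3 \\ b_1 & b_3 \end{bmatrix} + \mathbf{k} \det \begin{bmatrix} a_1 & a_2 \\ b_1 & b_2 \end{bmatrix} = [a_2 b_3 - a_3 b_2, \; a_3 b_1 - a_1 b_3, \; a_1 b_2 - a_2 b_1] \]

The geometric meaning of the cross product is given by the following three properties:

1. a × b is orthogonal to a and to b.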
2. $\|\mathbf{a} \times \mathbf{b}\| = \|\mathbf{a}\| \, \|\mathbf{b}\| \sin(\theta)$, where θ is the angle between a and b. In this formula, θ lies between 0 and π, so that sin(θ) is positive. This is the same as saying that the length of a × b is the area of the parallelogram spanned by a and b.

3. The vectors a, b and a × b obey the right hand rule.

Figure 2.14: The parallelogram spanned by a and b.

This geometric description of the cross product shows that the definition of the cross product is independent of how we choose our co-ordinate axes. To verify 1, we compute the dot products a · (a × b) and b · (a × b) and verify that they are zero. (This is one of the problems below.)

To verify 2 we must show that the length of a × b is the area of the parallelogram spanned by a and b, since the quantity $\|\mathbf{a}\| \, \|\mathbf{b}\| \sin(\theta)$ is precisely this area. Since both the length and the area are positive quantities, it is enough to compare their squares. We have

\[ \|\mathbf{a} \times \mathbf{b}\|^2 = (a_2 b_3 - a_3 b_2)^2 + (a_3 b_1 - a_1 b_3)^2 + (a_1 b_2 - a_2 b_1)^2 \tag{2.3} \]

On the other hand, the area A of the parallelogram spanned by a and b is the length $\|\mathbf{a}\|$ times the height. This height is the length of the vector $\mathbf{b} - \mathrm{proj}_{\mathbf{a}} \mathbf{b} = \mathbf{b} - (\mathbf{a} \cdot \mathbf{b}) \mathbf{a} / \|\mathbf{a}\|^2$, as shown in Figure 2.14. Using these facts, we arrive at the following formula for the square of the area of the parallelogram.

\[ \begin{aligned} A^2 &= \|\mathbf{a}\|^2 \, \left\| \mathbf{b} - (\mathbf{a} \cdot \mathbf{b}) \mathbf{a} / \|\mathbf{a}\|^2 \right\|^2 \\ &= \|\mathbf{a}\|^2 \left( \|\mathbf{b}\|^2 + (\mathbf{a} \cdot \mathbf{b})^2 \|\mathbf{a}\|^2 / \|\mathbf{a}\|^4 - 2 (\mathbf{a} \cdot \mathbf{b})^2 / \|\mathbf{a}\|^2 \right) \\ &= \|\mathbf{a}\|^2 \|\mathbf{b}\|^2 - (\mathbf{a} \cdot \mathbf{b})^2 \\ &= (a_1^2 + a_2^2 + a_3^2)(b_1^2 + b_2^2 + b_3^2) - (a_1 b_1 + a_2 b_2 + a_3 b_3)^2 \end{aligned} \tag{2.4} \]

Expanding the expressions in (2.3) and (2.4) reveals that they are equal.
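Properties 1 and 2 can also be checked numerically for a particular pair of vectors. The MATLAB sketch below is only an illustration under assumed data: the vectors a and b are arbitrary choices, not taken from the notes, and a numerical check for one pair is of course not a proof.

```matlab
% Numerical check of properties 1 and 2 of the cross product.
a = [2 -1 4];               % illustrative vectors (not from the notes)
b = [0 3 1];
c = cross(a, b);            % the cross product a x b

% Property 1: a x b is orthogonal to both a and b.
p1 = dot(a, c);
p2 = dot(b, c);

% Property 2, squared: compare (2.3) with the last lines of (2.4).
lhs = norm(c)^2;
rhs = norm(a)^2 * norm(b)^2 - dot(a, b)^2;

% Area as base times height, with height = || b - proj_a b ||.
h = norm(b - dot(a, b) * a / norm(a)^2);
area = norm(a) * h;

disp([p1 p2])               % both should be zero
disp([lhs rhs area^2])      % all three should agree
```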
Notice that there are exactly two vectors satisfying properties 1 and 2, that is, perpendicular to the plane spanned by a and b and of a given length. The cross product of a and b is the one that satisfies the right hand rule. We say that vectors a, b and c (the order is important) satisfy the right hand rule if you can point the index finger of your right hand in the direction of a, the middle finger in the direction of b and the thumb in the direction of c. Try to convince yourself that if a, b and c (in that order) satisfy the right hand rule, then so do b, c, a and c, a, b.

Here are some properties of the cross product that are useful in doing computations. The first two are maybe not what you expect.

1. a × b = −b × a.
2. a × (b × c) = (c · a)b − (b · a)c.
3. s(a × b) = (sa) × b = a × (sb).
4. a × (b + c) = a × b + a × c.
5. a · (b × c) = (a × b) · c.

Example 2.5 Let a = (1, 3, −2) and b = (−1, 2, 3). Compute the following:

(a) The area of the parallelogram whose sides are a and b.
(b) The angle between a and b.

Solution:

(a) The area of the parallelogram is equal to the length of a × b:

\[ \mathbf{a} \times \mathbf{b} = \det \begin{bmatrix} \hat{\mathbf{i}} & \hat{\mathbf{j}} & \hat{\mathbf{k}} \\ 1 & 3 & -2 \\ -1 & 2 & 3 \end{bmatrix} = \hat{\mathbf{i}}(9 + 4) + \hat{\mathbf{j}}(2 - 3) + \hat{\mathbf{k}}(2 + 3) = (13, -1, 5) \]

so the area is

\[ \|\mathbf{a} \times \mathbf{b}\| = \sqrt{13^2 + (-1)^2 + 5^2} = \sqrt{195} \approx 13.96 \]

(b) Note that the formula $\|\mathbf{a} \times \mathbf{b}\| = \|\mathbf{a}\| \, \|\mathbf{b}\| \sin\theta$ cannot be used for this question since it cannot distinguish between θ and π − θ (think about this point). Instead, use the dot product formula for cos θ, which should always be used for the calculation of angles between vectors unless you really know what you are doing:

\[ \cos\theta = \frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{a}\| \, \|\mathbf{b}\|} = \frac{(1, 3, -2) \cdot (-1, 2, 3)}{\|(1, 3, -2)\| \, \|(-1, 2, 3)\|} = \frac{-1 + 6 - 6}{\sqrt{1 + 9 + 4} \, \sqrt{1 + 4 + 9}} = \frac{-1}{14} \]

so

\[ \theta = \cos^{-1}\left( \frac{-1}{14} \right) \approx 1.64 \text{ radians, or} \approx 94.10^\circ \]
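The computations in Example 2.5 can be reproduced in MATLAB with the cross, dot and norm commands. This is a sketch for checking the hand calculation; the variable names (n, area, theta, theta_from_sin) are illustrative choices. The last line shows what happens if the sine formula is used for the angle: asin returns π − θ here rather than θ, which is the ambiguity mentioned in part (b).

```matlab
% Example 2.5 redone numerically.
a = [1 3 -2];
b = [-1 2 3];

n = cross(a, b);            % cross product, expected (13, -1, 5)
area = norm(n);             % area of the parallelogram, sqrt(195)

% Angle from the dot product formula (radians, then degrees).
theta = acos(dot(a, b) / (norm(a) * norm(b)));
theta_deg = theta * 180 / pi;

% Angle from the sine formula: asin only returns values in [-pi/2, pi/2],
% so for this obtuse angle it gives pi - theta instead of theta.
theta_from_sin = asin(norm(n) / (norm(a) * norm(b)));

disp(n)
disp([area theta theta_deg theta_from_sin])
```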
2.4.3 The triple product and the determinant in three dimensions

The cross product is defined so that the dot product of a with b × c is a determinant:

\[ \mathbf{a} \cdot (\mathbf{b} \times \mathbf{c}) = \det \begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix} \]

This determinant is called the triple product of a, b and c. Using this fact we can show that the absolute value of the triple product is the volume of the parallelepiped spanned by a, b and c. A diagram of the parallelepiped is shown in Figure 2.15.

Figure 2.15: The Triple Product.

The absolute value of the triple product is

\[ |\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c})| = \|\mathbf{a}\| \, |\cos(\theta)| \, \|\mathbf{b} \times \mathbf{c}\|. \]

Here θ is the angle between a and b × c. The quantity $\|\mathbf{a}\| \, |\cos(\theta)|$ is the height of the parallelepiped and $\|\mathbf{b} \times \mathbf{c}\|$ is the area of the base. The product of these is the volume of the parallelepiped, as claimed. Thus

\[ \left| \det \begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix} \right| = \text{Volume of the parallelepiped spanned by } \mathbf{a}, \mathbf{b} \text{ and } \mathbf{c} \]

The sign of the triple product is positive if θ is between zero and π/2 and negative if θ lies between π/2 and π. It is positive exactly when a is on the same side of the plane spanned by b and c as b × c. This holds if the vectors b, c and a (in that order) satisfy the right hand rule. Equivalently, a, b and c (in that order) satisfy the right hand rule.

Mathematically, it is more satisfactory to define the right hand rule using the determinant. That is, we say that vectors a, b and c satisfy the right hand rule if the determinant a · (b × c) is positive.
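As a quick illustration, the triple product can be computed in MATLAB both as a dot product with a cross product and as a determinant. The vectors below are arbitrary illustrative choices, not from the notes. For these particular vectors the triple product is negative, so by the determinant definition above a, b and c (in that order) do not satisfy the right hand rule.

```matlab
% Triple product computed two ways, plus the volume of the parallelepiped.
a = [1 0 2];                % illustrative vectors (not from the notes)
b = [0 3 1];
c = [2 1 0];

t1 = dot(a, cross(b, c));   % a . (b x c)
t2 = det([a; b; c]);        % determinant with rows a, b, c
vol = abs(t1);              % volume of the parallelepiped

% A cyclic reordering (b, c, a) keeps the sign; swapping two vectors flips it.
t3 = dot(b, cross(c, a));
t4 = dot(b, cross(a, c));

disp([t1 t2 t3 t4 vol])
```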
2.4.4 MATLAB: assigning matrices and det and cross commands

cross: The command cross(a,b) computes the cross product a × b. An error results if a or b is not a vector of length 3. As an example, the command cross([1 0 0],[0 1 0]) gives the vector result [0 0 1].

matrices: The syntax to generate a matrix is shown below using a 2 × 2 example

    a = [1 2; 3 4]

This command assigns a matrix to a that has the vector [1 2] in its first row and [3 4] in its second. Entries of a matrix can be accessed individually, for example a(1,2) is the entry in the first row, second column.

zeros: Many applications can lead to large matrices with many rows and columns. Even though MATLAB can do matrix computations, it can be tedious to enter these large matrices by hand. In some cases the matrices have mostly zeros as entries (these matrices are called sparse). In these cases it is more efficient to generate a matrix of all zeros and then modify the entries that are not zero. For example

    a = zeros(2,2);
    a(1,1) = 1;

generates a 2 × 2 matrix with entries that are all zero except the upper left entry, which is 1. Note that zeros(n,m) generates a matrix with n rows and m columns with all zero entries. So zeros(1,m) is a row vector of length m and zeros(n,1) is a column vector of length n, both with all zero entries.

rand: rand(n,m) generates a matrix with n rows and m columns with entries that are random numbers uniformly distributed in the interval (0,1).

det: The command det(a) returns the determinant of the matrix a. An error occurs if a is not a square matrix (same number of rows and columns). Determinants of matrices larger than the 2 × 2 and 3 × 3 cases discussed in this section are discussed in Chapter 4.
2.4.5 MATLAB: generating scripts with the MATLAB editor

Using the command window in MATLAB to solve a problem can often be tedious, because if the need arises to redo the problem, or change a parameter, one has to retype everything. The editor comes in handy for such cases. The editor
is a text window (accessed from the command window: File → New → Blank M-file) where one can write commands in the same syntax as in the command window, and when one runs it, the results appear in the command window exactly as if one had typed them there one after the other. For example, the code to generate three random orthogonal vectors would look something like this:

    a1 = rand(3,1)
    b = rand(3,1);
    a2 = cross(a1,b)
    a3 = cross(a1,a2)
    dot(a1,a2)
    dot(a1,a3)
    dot(a2,a3)

Note that the last three lines are there to check that the three vectors are mutually orthogonal. Once the code is written, save it from the editor window: File → Save as, making sure that the name of the file has a ".m" extension (and the file name should contain no spaces). There are several different ways of running the script; the fastest one is to hit the F5 key. Alternatively, from the editor window it can be run from Debug → Run, or directly from the command window by typing the name of the script into the MATLAB command line.

2.4.6 MATLAB: floating point representation of real numbers

MATLAB can represent integers exactly (up to a limited but large size). Using a "floating point representation", MATLAB can represent most real numbers only approximately (but quite accurately, to 16 digits or so). In certain cases, the errors made in floating point approximation of numbers can be amplified and lead to noticeable errors in computed results. This will not happen typically in the examples and computer labs for Math 152, but the reader should be aware of the possibility.

Example 2.6 Consider the vectors

\[ \mathbf{a} = [1 \;\; 1 \;\; 1], \qquad \mathbf{b} = [\sqrt{2} \;\; \sqrt{2} \;\; 0] \]

and c = a + b. If a 3 × 3 matrix A is made with rows a, b and c then the determinant of A is zero (by construction, the vectors lie on the same plane). If this computation is done in MATLAB,

    a = [1 1 1];
    b = [sqrt(2) sqrt(2) 0];
    c = a + b;
    A = [a; b; c];
    det(A)
the result is 3.1402e-16, not zero, due to floating point approximation of intermediate computations. You can type eps in MATLAB to see the maximum relative error made by floating point approximation. On the computer used to do the computation above, eps had a value of 2.2204e-16, so it is believable that the calculation error in the determinant was made by the combination of a few floating point approximations. Try to determine how many decimal digits of accuracy the floating point representation in your calculator uses (typically, the accuracy is greater than what is displayed).

2.4.7 Problems

Problem 2.19 Compute [1, 2, 3] × [4, 5, 6].

Problem 2.20 Use the definition to find the determinant of the matrix

\[ \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 1 & 0 & -1 \end{bmatrix}. \]

Do the computation by hand showing your work, but you can check your result using MATLAB. From your result, decide if the vectors [1 1 1], [1 2 3] and [1 0 -1] lie in the same plane (justify your answer, very briefly).

Problem 2.21 Verify that a · (a × b) = 0 and b · (a × b) = 0.

Problem 2.22 Simplify each of the following expressions:
(a) ((1, 4, −1) · (2, 1, 3))((2, 1, 4) × (1, 4, 9))
(b) (7, 1, 0) · ((2, 0, −1) × (1, 4, 3))
(c) (a × b) × (b × a)

Problem 2.23 Explain why $\|\mathbf{a}\| \, \|\mathbf{b}\| \sin(\theta)$ is the area of the parallelogram spanned by a and b. Here θ is the angle between the two vectors.

Problem 2.24 Find examples to show that in general a × b ≠ b × a and a × (b × c) ≠ (a × b) × c.

Problem 2.25 (Matlab) The Matlab command a=rand(1,n) generates a 1 × n vector with random entries. Write a script that generates three random vectors and write what you obtain from a × b − b × a, and from a × (b × c) − (a × b) × c. Does that constitute a proof?

Problem 2.26 Show that a × (b × c) = (a · c)b − (a · b)c.

Problem 2.27 (Matlab) Write a script that generates three random vectors and checks that the result from problem 2.26 holds: a × (b × c) = (a · c)b − (a · b)c.
Problem 2.28 Derive an expression for (a × b) · (c × d) that involves dot products but not cross products.

Problem 2.29 (a) Draw a sketch containing the vectors a, b and a × (a × b). Assume that a and b lie in the plane of the paper and have an acute angle between them. (b) Find a formula for a × (a × b) which involves only $\|\mathbf{a}\|$, b and $\mathrm{proj}_{\mathbf{a}} \mathbf{b}$. Hint: use a property of the dot product.

Problem 2.30 What is the analog of the cross product in two dimensions? How about four dimensions?

2.5 Lines and Planes

2.5.1 Describing linear sets

In the following sections we will consider points, lines, planes and space in two and three dimensions. Each of these sets has two complementary (or dual) descriptions. One is called the parametric form and the other the equation form. Roughly speaking, the parametric form specifies the set using vectors that are parallel to the set, while the equation form uses vectors that are orthogonal to the set.

Using the parametric description, it is easy to write down explicitly all the elements of the set, but difficult to check whether a given point lies in the set. Using the equation description it's the other way around: if someone gives you a point, it is easy to check whether it lies in the set, but it is difficult to write down explicitly even a single member of the set. One way of thinking about solving a system of linear equations is simply going from one description to the other. This will (hopefully) become clear later on.

These sections will always follow the same pattern. We will consider the parametric and equation descriptions, first in the special case when the set passes through the origin. Then we consider the general case.
Recall that when we say "the point x" this means "the point at the head of the vector x whose tail is at the origin." In these sections we use the notation $[x_1, x_2]$ instead of [x, y] and $[x_1, x_2, x_3]$ instead of [x, y, z] for typical points in two and three dimensions.

2.5.2 Lines in two dimensions: Parametric form

First we consider lines passing through the origin. Let $\mathbf{a} = [a_1, a_2]$ be a vector in the direction of the line. Then all the points x on the line are of the form

\[ \mathbf{x} = s\mathbf{a} \]

for some number s. The number s is called a parameter. Every value of s corresponds to exactly one point (namely sa) on the line.
Figure 2.16: A line in two dimensions. Left: line through the origin. Right: line through q.

Now we consider the general case. Let q be a point on the line and a lie in the direction of the line. Then the points on the line can be thought of as the points on the line through the origin in the direction of a, shifted or translated by q. Then a point x lies on the line through q in the direction of a exactly when

\[ \mathbf{x} = \mathbf{q} + s\mathbf{a} \]

for some value of s.

2.5.3 Lines in two dimensions: Equation form

First we consider lines passing through the origin, shown in Figure 2.16 (left). Let $\mathbf{b} = [b_1, b_2]$ be orthogonal to the direction of the line. The point x is on the line exactly when x · b = 0. This can be written

\[ x_1 b_1 + x_2 b_2 = 0. \]

Now we consider the general case shown in Figure 2.16 (right). Let q be a point on the line and $\mathbf{b} = [b_1, b_2]$ be orthogonal to the direction of the line. A point x lies on the line through q in the direction of a exactly when x − q lies on the line through the origin in the direction of a. Thus (x − q) · b = 0. This can be written

\[ (x_1 - q_1) b_1 + (x_2 - q_2) b_2 = 0 \]

or

\[ x_1 b_1 + x_2 b_2 = c, \]

where c = q · b.

2.5.4 Lines in three dimensions: Parametric form

The parametric form of a line in three (or higher) dimensions looks just the same as in two dimensions. The points x on the line are obtained by starting at some point q on the line and then adding all multiples of a vector a pointing in the direction of the line. So

\[ \mathbf{x} = \mathbf{q} + s\mathbf{a} \]
The only difference is that now q and a are vectors in three dimensions.

2.5.5 Lines in three dimensions: Equation form

We begin with lines through the origin. A line through the origin can be described as all vectors orthogonal to a plane. Choose two vectors b1 and b2 lying in the plane that are not collinear (or zero). Then a vector is orthogonal to the plane if and only if it is orthogonal to both b1 and b2. Therefore the line consists of all points x such that x · b1 = 0 and x · b2 = 0. If $\mathbf{b}_1 = [b_{1,1}, b_{1,2}, b_{1,3}]$ and $\mathbf{b}_2 = [b_{2,1}, b_{2,2}, b_{2,3}]$ then these equations can be written

\[ \begin{aligned} b_{1,1} x_1 + b_{1,2} x_2 + b_{1,3} x_3 &= 0 \\ b_{2,1} x_1 + b_{2,2} x_2 + b_{2,3} x_3 &= 0 \end{aligned} \]

Notice that there are many possible choices for the vectors b1 and b2. The method of Gaussian elimination, studied later in this course, is a method of replacing the vectors b1 and b2 with equivalent vectors in such a way that the equations become easier to solve.

Now consider a line passing through the point q and orthogonal to the directions b1 and b2, as shown in Figure 2.17. A point x lies on this line precisely when x − q lies on the line through the origin that is orthogonal to b1 and b2. Thus (x − q) · b1 = 0 and (x − q) · b2 = 0. This can be written

\[ \begin{aligned} b_{1,1} (x_1 - q_1) + b_{1,2} (x_2 - q_2) + b_{1,3} (x_3 - q_3) &= 0 \\ b_{2,1} (x_1 - q_1) + b_{2,2} (x_2 - q_2) + b_{2,3} (x_3 - q_3) &= 0, \end{aligned} \]

or

\[ \begin{aligned} b_{1,1} x_1 + b_{1,2} x_2 + b_{1,3} x_3 &= c_1 \\ b_{2,1} x_1 + b_{2,2} x_2 + b_{2,3} x_3 &= c_2 \end{aligned} \]

where c1 = q · b1 and c2 = q · b2.

Figure 2.17: A line in three dimensions.
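To make the two descriptions concrete, here is a hedged MATLAB sketch that starts from a parametric form q + sa and builds one possible equation form. The point q, the direction a and the helper vector e are illustrative assumptions, and using cross products to produce b1 and b2 is just one convenient choice among the many mentioned above.

```matlab
% From parametric form x = q + s*a to an equation form in three dimensions.
q = [1 2 3];                % a point on the line (illustrative)
a = [1 -1 2];               % direction of the line (illustrative)

% Build two non-collinear vectors orthogonal to a using cross products.
e  = [1 0 0];               % any vector that is not a multiple of a
b1 = cross(a, e);           % orthogonal to a
b2 = cross(a, b1);          % orthogonal to both a and b1

% Right-hand sides of the two equations x . b1 = c1 and x . b2 = c2.
c1 = dot(q, b1);
c2 = dot(q, b2);

% Check: a point generated from the parametric form satisfies both equations.
s = 2.5;
x = q + s * a;
r1 = dot(x, b1) - c1;       % should be zero
r2 = dot(x, b2) - c2;       % should be zero
disp([b1; b2])
disp([c1 c2 r1 r2])
```

Going in the other direction, from the two equations back to q and a, amounts to solving a linear system, which is the topic taken up later in the notes.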
2.5.6 Planes in three dimensions: Parametric form

We begin with planes through the origin. Since a plane is a two dimensional object, we will need two parameters to describe points on the plane. Let a1 and a2 be non-collinear vectors in the direction of the plane. Then every point on the plane can be reached by adding some multiple of a1 to some other multiple of a2. In other words, points x on the plane are all points of the form

\[ \mathbf{x} = s\mathbf{a}_1 + t\mathbf{a}_2 \]

for some values of s and t.

If the plane passes through some point q in the directions of a1 and a2, then we simply shift all the points on the parallel plane through the origin by q. So x lies on the plane if

\[ \mathbf{x} = \mathbf{q} + s\mathbf{a}_1 + t\mathbf{a}_2 \]

for some values of s and t.

2.5.7 Planes in three dimensions: Equation form

A plane through the origin can be described as all vectors orthogonal to a given vector b, as shown in Figure 2.18. (In this situation, if b has unit length it is called the normal vector to the plane and is often denoted n.) Therefore x lies on the plane whenever x · b = 0, or

\[ b_1 x_1 + b_2 x_2 + b_3 x_3 = 0. \]

If a plane with normal vector b is translated so that it passes through the point q, then x lies on the plane whenever x − q lies on the parallel plane through the origin. Thus x lies on the plane whenever (x − q) · b = 0. Equivalently

\[ b_1 (x_1 - q_1) + b_2 (x_2 - q_2) + b_3 (x_3 - q_3) = 0, \]

or

\[ b_1 x_1 + b_2 x_2 + b_3 x_3 = c, \]

where c = q · b.

2.5.8 Problems

Problem 2.31 A line orthogonal to b can be described as the set of all points x whose projections onto b all have the same value. Using the formula for projections, show that this leads to the equation description of the line.

Problem 2.32 Find both the parametric form and equation form for the line in Figure 2.19. Write down five points on the line (notice that the parametric form is more useful for this). Check whether the point $[\tfrac{1012}{3}, \tfrac{1069}{21}]$ is on the line (notice that the equation form is more useful for this).
Figure 2.18: A plane in three dimensions.

Figure 2.19: Diagram for problem 2.32. (The figure labels the points [0,1] and [7,5].)
Problem 2.33 Find the equation form for the line [1, 1] + s[−1, 2].

Problem 2.34 Find the parametric form for the line $x_1 - 3x_2 = 5$.

Problem 2.35 Use a projection to find the distance from the point [−2, 3] to the line $3x_1 - 4x_2 = -4$.

Problem 2.36 Consider the plane x − y + 2z = 7.
1. What is the normal direction to the plane?
2. Find the coordinates of any point (your choice) on the plane.

Problem 2.37 Let a, b and c be the vertices of a triangle. By definition, the median of a triangle is a straight line that passes through a vertex of the triangle and through the midpoint of the opposite side. (i) Find the parametric form of the equation for each median. (ii) Do all the medians meet at a common point? If so, which point?

Problem 2.38 Find a pair of equations which define the line {(2, 0, −4) + s(0, 1, 3) : s ∈ R}.

Problem 2.39 Find the intersection point between the line {(2, −1, 6) + s(1, −1, 0) : s ∈ R} and the plane {t(0, 1, −3) + u(−1, 2, 0) : t, u ∈ R}. Hint: first find an equation for the plane.

Problem 2.40 Find the intersection point of the line with parametric form (1, 2, 3) + t(1, 0, −1) and the plane x + 2y − z = 5.

Problem 2.41 Find the equation of the plane containing the points [1, 0, 1], [1, 1, 0] and [0, 1, 1].

Problem 2.42 Find the equation of the sphere which has the two planes $x_1 + x_2 + x_3 = 3$ and $x_1 + x_2 + x_3 = 9$ as tangent planes if the centre of the sphere is on the planes $2x_1 - x_2 = 0$, $3x_1 - x_3 = 0$.

Problem 2.43 The planes x + y + z = 2 and x − y + 2z = 7 intersect in a line. Find a parametric representation of this line.
Problem 2.44 Find the equation of the plane that passes through the point [−2, 0, 1] and through the line of intersection of $2x_1 + 3x_2 - x_3 = 0$, $x_1 - 4x_2 + 2x_3 = -5$.

Problem 2.45 What's wrong with the question "Find the equation for the plane containing [1, 2, 3], [2, 3, 4] and [3, 4, 5]."?

Problem 2.46 Find the distance from the point p to the plane b · x = c.

Problem 2.47 Find the equation for the line through [2, −1, −1] and parallel to each of the two planes $x_1 + x_2 = 0$ and $x_1 - x_2 + 2x_3 = 0$. Express the equation of the line in both parametric and equation form.

Problem 2.48 (Matlab) Plotting figures in Matlab is quite simple. Simply type plot(1,2) in the command window and observe what you get. Copy the following script (call it basicplot.m for example), and run it:

    x = -2:0.1:2
    m = 2
    x_0 = 1
    y_0 = 1
    y = y_0 + m*(x-x_0)
    plot(x,y,'.')

Type help plot in the command window to learn about different options for the plot command. Without closing the figure window, type hold on in the command window, and re-run the script after changing the slope to m = -1/2 (the hold on command allows you to overlap plots). Notice that the two lines should be perpendicular, but because of the scaling they appear not to be. Type in the command axis equal to fix that. How would you modify the script to plot a circle of radius 3?

2.6 Introduction to Linear Systems

2.6.1 Description of points and the geometry of solutions to systems of equations

So far we have considered the parametric and equation descriptions of lines and planes in two and three dimensions. We can also try to describe points in the same way. This will help you get a geometric picture of what it means to solve a system of equations.

The "parametric" description of a point doesn't have any parameters! It simply is the name of the point x = q.
In two dimensions the equation form for describing a point will look like

\[ \begin{aligned} b_{1,1} x_1 + b_{1,2} x_2 &= c_1 \\ b_{2,1} x_1 + b_{2,2} x_2 &= c_2 \end{aligned} \]
where the vectors $\mathbf{b}_1 = [b_{1,1}, b_{1,2}]$ and $\mathbf{b}_2 = [b_{2,1}, b_{2,2}]$ are not collinear. Each one of these equations describes a line. The point $\mathbf{x} = [x_1, x_2]$ will satisfy both equations if it lies on both lines, i.e., on the intersection. Since the vectors b1 and b2 are not collinear, the lines are not parallel, so the intersection is a single point. This situation is shown in Figure 2.20.

Figure 2.20: Intersection of lines in 2D that are not collinear is a point.

In three dimensions the equation form for describing a point will look like

\[ \begin{aligned} b_{1,1} x_1 + b_{1,2} x_2 + b_{1,3} x_3 &= c_1 \\ b_{2,1} x_1 + b_{2,2} x_2 + b_{2,3} x_3 &= c_2 \\ b_{3,1} x_1 + b_{3,2} x_2 + b_{3,3} x_3 &= c_3 \end{aligned} \]

where b1, b2 and b3 don't all lie on the same plane. This can be interpreted as the intersection of three planes in a single point.

Notice that going from the equation description of a point to the parametric description just means finding the solution of the system of equations. If, in two dimensions, the vectors b1 and b2 are not collinear, or in three dimensions, b1, b2 and b3 don't all lie on the same plane, then the system of equations has a unique solution.

Now suppose that you are handed an arbitrary system of equations

\[ \begin{aligned} b_{1,1} x_1 + b_{1,2} x_2 + b_{1,3} x_3 &= c_1 \\ b_{2,1} x_1 + b_{2,2} x_2 + b_{2,3} x_3 &= c_2 \\ b_{3,1} x_1 + b_{3,2} x_2 + b_{3,3} x_3 &= c_3 \end{aligned} \]

What does the set of solutions $\mathbf{x} = [x_1, x_2, x_3]$ look like? As we have just seen, if b1, b2 and b3 don't all lie on the same plane, there is a unique solution given as the intersection of three planes. Recall that the determinant can be used to test whether the vectors b1, b2 and b3 lie on the same plane. So a unique solution to the system exists precisely when

\[ \det \begin{bmatrix} b_{1,1} & b_{1,2} & b_{1,3} \\ b_{2,1} & b_{2,2} & b_{2,3} \\ b_{3,1} & b_{3,2} & b_{3,3} \end{bmatrix} \neq 0 \]

What happens when the determinant is zero and the three vectors b1, b2 and b3 do lie on the same plane? Then it could be that the three planes intersect in a line.
In this case every point on that line is a solution of the system of equations, and the solution set has a parametric description of the form x = q + sa. It could also be that all three planes are the same, in which case the solution set is the plane. In this case the solution set has a parametric description of the form x = q + s1 a1 + s2 a2. Another possibility is that two of the planes could be parallel with no intersection. In this case there are no solutions at all! Some of these possibilities are illustrated in Figure 2.21.

Figure 2.21: Planes intersecting.

2.6.2 Describing the whole plane in two dimensions and all of space in three dimensions

If the set we are trying to describe is the whole plane in two dimensions or all of space in three dimensions, then we don't need any equations, since there are no restrictions on the points. However it does make sense to think about the parametric form.

Let's start with two dimensions. Consider Figure 2.22. If we pick any two vectors a1 and a2 that don't lie on the same line, then any vector $\mathbf{x} = [x_1, x_2]$ in the plane can be written as s1 a1 + s2 a2. Notice that every choice of s1 and s2 corresponds to exactly one vector x. In this situation we could use the parameters s1 and s2 as co-ordinates instead of x1 and x2. In fact if a1 and a2 are unit vectors orthogonal to each other, this just amounts to changing the co-ordinate axes to lie along a1 and a2. (The new co-ordinates [s1, s2] are then just what we were calling [x1, x2] before.) (In fact, even if the vectors a1 and a2 are not unit vectors orthogonal to each other, we can still think of them as lying along new co-ordinate axes. However, now the axes have been stretched and sheared instead of just rotated, and need not lie at right angles any more.)

The situation in three dimensions is similar. Now we must pick three vectors a1, a2 and a3 that don't lie on the same plane.
Then every vector x has a unique representation x = s1 a1 + s2 a2 + s3 a3. Again, we could use s1, s2 and s3 as co-ordinates in place of x1, x2 and x3. Again, if a1, a2 and a3 are orthogonal with unit length, then this amounts to choosing new (orthogonal) co-ordinate axes.
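As an illustration of this unique representation, the MATLAB sketch below finds the co-ordinates s1, s2, s3 of a vector x with respect to three vectors that do not lie on the same plane. The vectors are arbitrary illustrative choices. The backslash operator used to solve the linear system is a standard MATLAB command, but it has not been introduced in these notes at this point, so treat its use here as an assumption made for the sake of the example.

```matlab
% Co-ordinates of x with respect to three vectors a1, a2, a3 in 3D.
a1 = [1; 1; 0];             % illustrative column vectors (not from the notes)
a2 = [0; 1; 1];
a3 = [1; 0; 1];

A = [a1 a2 a3];             % matrix whose columns are a1, a2, a3
d = det(A);                 % nonzero, so the vectors are not coplanar

x = [2; 3; 5];              % the vector to be represented
s = A \ x;                  % solves s(1)*a1 + s(2)*a2 + s(3)*a3 = x

% Rebuild x from its new co-ordinates as a check.
x_check = s(1)*a1 + s(2)*a2 + s(3)*a3;
disp(d)
disp([s x_check])
```

If the determinant d were zero, the three vectors would lie on the same plane and no such unique representation would exist for a general x.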
Figure 2.22: A basis in 2D.

2.6.3 Linear dependence and independence

The condition in two dimensions that two vectors are not collinear, and the condition in three dimensions that three vectors do not lie on the same plane, has now come up several times: in ensuring that a system of equations has a unique solution, and in ensuring that every vector can be written in a unique way in parametric form using those vectors. This condition can be tested by computing a determinant. We will now give this condition a name and define the analogous condition in any number of dimensions.

First, some terminology. If a1, a2, ..., an are a collection of vectors then a vector of the form

\[ s_1 \mathbf{a}_1 + s_2 \mathbf{a}_2 + \cdots + s_n \mathbf{a}_n \]

for some choice of numbers s1, ..., sn is called a linear combination of a1, a2, ..., an.

Now, the definition. A collection of vectors a1, a2, ..., an is called linearly dependent if some linear combination of them equals zero, i.e.,

\[ s_1 \mathbf{a}_1 + s_2 \mathbf{a}_2 + \cdots + s_n \mathbf{a}_n = \mathbf{0} \]

for s1, ..., sn not all zero. A collection of vectors is said to be linearly independent if it is not linearly dependent. In other words, the vectors a1, a2, ..., an are linearly independent if the only way a linear combination of them $s_1 \mathbf{a}_1 + s_2 \mathbf{a}_2 + \cdots + s_n \mathbf{a}_n$ can equal zero is for s1 = s2 = ··· = sn = 0.

What does linear dependence mean in three dimensions? Suppose that a1, a2 and a3 are linearly dependent. Then there are some numbers s1, s2 and s3, not all zero, such that

\[ s_1 \mathbf{a}_1 + s_2 \mathbf{a}_2 + s_3 \mathbf{a}_3 = \mathbf{0}. \]

Suppose that s1 is one of the non-zero numbers. Then we can divide by −s1 and find that

\[ -\mathbf{a}_1 + s_2' \mathbf{a}_2 + s_3' \mathbf{a}_3 = \mathbf{0} \]

for $s_2' = -s_2/s_1$ and $s_3' = -s_3/s_1$. Thus

\[ \mathbf{a}_1 = s_2' \mathbf{a}_2 + s_3' \mathbf{a}_3, \]
or a1 is a linear combination of a2 and a3. But this implies that a1 lies on the plane spanned by a2 and a3, i.e., the vectors all lie on the same plane. If s1 happens to be zero we can repeat the same argument with one of the si's which is not zero. Thus linear dependence implies that all three vectors lie on the same plane. Conversely, if all three vectors lie on the same plane, then we can write one vector as a linear combination of the other two, a1 = s2 a2 + s3 a3, which implies −a1 + s2 a2 + s3 a3 = 0, which says that the vectors are linearly dependent.

So in three dimensions, linear dependence means the vectors lie on the same plane. Similarly, in two dimensions, linear dependence means the vectors are collinear.

One final piece of terminology. A collection of n linearly independent vectors a1, a2, ..., an in n dimensional space is called a basis. If a1, a2, ..., an is a basis, then every vector x can be written in a unique way as a linear combination

\[ \mathbf{x} = s_1 \mathbf{a}_1 + s_2 \mathbf{a}_2 + \cdots + s_n \mathbf{a}_n \]

2.6.4 Problems

Problem 2.49 Is the collection of vectors a1 = [1, 1], a2 = [1, 0] a basis for two dimensional space? If so, express the vector x = [0, 1] as a linear combination of a1 and a2.

Problem 2.50 Let a = [2, 2, 2] and b = [3, 4, 1]. Find all vectors c such that the list of vectors a, b, c is not a basis of R3.

Problem 2.51 Show that the collection of vectors a = [1, 1, 1], b = [1, 1, 0] and c = [1, 0, 0] is a basis of R3. Express [1, 2, 3] as a linear combination of the vectors a, b and c.

Problem 2.52 Let the vectors a = [1, 0, 4], b = [2, −1, 0] and c = [8, −3, 8]. Do these vectors form a basis of R3?

Problem 2.53 Is it possible for four vectors to be linearly independent in three dimensional space?

Problem 2.54 Show that the collection of vectors a = [2, 1, 3], b = [1, 0, 2] and c = [3, 0, 0] is a basis of R3. Express [12, 2, 4] as a linear combination of the vectors a, b and c.
Problem 2.55 Suppose that a1, a2, ..., an is a basis. Show that if some vector x has representation $\mathbf{x} = s_1 \mathbf{a}_1 + s_2 \mathbf{a}_2 + \cdots + s_n \mathbf{a}_n$ and $\mathbf{x} = t_1 \mathbf{a}_1 + t_2 \mathbf{a}_2 + \cdots + t_n \mathbf{a}_n$, then s1 = t1, s2 = t2, ..., sn = tn. (Hint: subtract the two expressions for x and use the fact that the basis vectors are linearly independent.)

Problem 2.56 (Matlab) Start by convincing yourself that the vectors (1, 3) and (5, 2) are linearly independent. This can easily be shown by writing a script that plots the vectors:
    plot([1,0],[3,0])
    hold on
    plot([5,0],[2,0])

We can extend this script by also plotting a linear combination of the two vectors:

    alfa=1;
    beta=1;
    clf()
    plot([alfa*1,0],[alfa*3,0])
    hold on
    plot([beta*5,0],[beta*2,0])
    plot([alfa*1+beta*5,alfa*1],[alfa*3+beta*2,alfa*3],'--')

Play around with the script and find parameters alfa and beta so that the vector (1, 1) can be written as a linear combination of (1, 3) and (5, 2).

2.7 Additional Topics

These topics are not covered in Math 152 this year.

2.7.1 Application: rotational motion

Consider a rigid body rotating about an axis given by the unit vector a at a rate of Ω radians per second. Let r be the position vector of a point on the body as shown in Figure 2.23.

Figure 2.23: Rotational Motion.

What is the velocity of the point? The point travels on a circle with radius $\|\mathbf{r}\| \sin(\theta)$, where θ is the angle that r makes with the axis. Therefore, in one second, the point travels a distance of $\Omega \|\mathbf{r}\| \sin(\theta)$. Thus

(i) the magnitude of the velocity is $\|\mathbf{v}\| = \Omega \|\mathbf{r}\| \sin(\theta)$.
(ii) Now notice that v is orthogonal to the plane spanned by a and r.

(iii) Finally notice that Ωa, r and v obey the right hand rule.

The facts (i), (ii) and (iii) imply that v is exactly the cross product of Ωa and r. It is customary to let Ω denote the vector Ωa. Then v = Ω × r.

Problem 2.57 A body rotates at an angular velocity of 10 rad/sec about the axis through the points [1, 1, −1] and [2, −3, 1]. Find the velocity of the point [1, 2, 3] on the body.

Problem 2.58 The line L passing through the origin and the point [1, 1, 1] passes through the midpoint of a thin metal rod of length 2 that is oriented in the direction [1, 0, 0]. The rod begins rotating about L at 3 revolutions per minute. What is the fastest speed (length of velocity vector) of any point on the rod?

Problem 2.59 Imagine a plate that lies in the xy-plane and is rotating about the z-axis. Let P be a point that is painted on this plate. Denote by r the distance from P to the origin, by θ(t) the angle at time t between the line from the origin to P and the x-axis and by [x(t), y(t)] the co-ordinates of P at the time t. Find x(t) and y(t) in terms of θ(t). Compute the velocity of P in two ways:

1. by differentiating [x(t), y(t)] and
2. by computing Ω × r.

2.7.2 Application: 3-D graphics

How can we represent a three dimensional object on a piece of paper or computer screen? Imagine the object in space and, a certain distance away, a point p representing the eye of the observer. Between the observer and the object is a plane called the view plane. The position of the origin of this plane is described by a point q, and its orientation is given by three mutually orthogonal unit vectors, denoted e1, e2 and e3. (These are not the same as the standard basis vectors i, j and k in this problem.) This situation is shown in Figure 2.24. As usual, only the heads of the vectors (points) p, x, y and q are shown on the diagram.
(The origin, where the tails of these vectors lie, is not depicted at all.) We will assume that the view plane is a distance one from the observer in the direction e3. Thus, e3 can be thought of as the direction that the observer is looking.

Think of light rays leaving the object at point x and travelling to the observer's eye at p. At some point y this line intersects the view plane. All the vectors y on the view plane that correspond to some vector x on our object will furnish the two dimensional representation of the object.

How do we determine the point y? The parametric form of points on the plane is q + s1 e1 + s2 e2. So we must have that y = q + s1 e1 + s2 e2 for some values of s1 and s2. We also know that the vector x − p is in the same direction
as y − p. Therefore they must be multiples, i.e., y − p = λ(x − p) for some number λ.

Figure 2.24: Perspective in 3-D graphics application.

Substituting in our expression for y yields

    q + s1 e1 + s2 e2 − p = λ(x − p).

Since q − p = e3 this gives

    e3 + s1 e1 + s2 e2 = λ(x − p).

Let us take the dot product of both sides of this equation with the unit vectors e3, e1 and e2. We can use the fact that ei · ej is zero if i ≠ j and 1 if i = j. Start with e3. This gives

    e3 · e3 = 1 = λ e3 · (x − p).

This determines λ:

    λ = 1 / (e3 · (x − p)).

Now take the dot product with e1. This gives

    s1 = λ e1 · (x − p) = (e1 · (x − p)) / (e3 · (x − p)).

Similarly, taking the dot product with e2 leads to

    s2 = (e2 · (x − p)) / (e3 · (x − p)).

To plot the image of an object, we now simply plot the co-ordinates s1 and s2 corresponding to all the points on the object on the s1-s2 plane.
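The formulas for s1 and s2 can be tried out numerically. The following is only a minimal sketch, assuming (as in the derivation) that q − p = e3; the variable names are illustrative, and the numbers are those used in Example 2.7, so the script should reproduce the hand computation done there.

```matlab
% Perspective projection of one point onto the view plane (sketch).
% Assumption: q - p = e3, as in the derivation above.
p  = [11, 0, 0];    % eye of the observer
e1 = [0, 1, 0];     % first direction in the view plane
e2 = [0, 0, 1];     % second direction in the view plane
e3 = [-1, 0, 0];    % direction in which the observer is looking
x  = [1, 1, 1];     % point on the object

d  = x - p;                      % vector from the eye to the object point
s1 = dot(e1, d) / dot(e3, d);    % s1 = e1.(x-p) / e3.(x-p)
s2 = dot(e2, d) / dot(e3, d);    % s2 = e2.(x-p) / e3.(x-p)
disp([s1, s2])
```

To draw a whole object made of straight edges, one could apply the same three lines to each vertex and join the resulting points [s1, s2] with plot.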
Example 2.7 Take p = [11, 0, 0], q = [10, 0, 0], e1 = [0, 1, 0], e2 = [0, 0, 1] and e3 = [−1, 0, 0]. What is the image of the point x = [1, 1, 1]? We compute x − p = [−10, 1, 1] so that

    e1 · (x − p) = 1
    e2 · (x − p) = 1
    e3 · (x − p) = 10

So s1 = s2 = 1/10.

Example 2.8 Continue the previous example and compute the image of a line segment between [1, 1, 1] and [2, 0, 1]. These are all points of the form x = [1, 1, 1] + t([2, 0, 1] − [1, 1, 1]) = [1 + t, 1 − t, 1] as t varies between 0 and 1. This time we have x − p = [1 + t − 11, 1 − t, 1] so that

    e1 · (x − p) = 1 − t
    e2 · (x − p) = 1
    e3 · (x − p) = 10 − t

Thus s1 = (1 − t)/(10 − t) and s2 = 1/(10 − t). Even though it is not immediately obvious, the points [s1, s2], as t varies, all lie on a line segment. In fact

    s1 + 9 s2 = (1 − t)/(10 − t) + 9/(10 − t) = (10 − t)/(10 − t) = 1.

This shows that the points [s1, s2] lie on a line perpendicular to [1, 9].

In fact, it is possible to show that any line segment in space maps to a line segment on the s1-s2 plane. Thus, to plot the image of an object consisting of straight line segments (such as the tetrahedron in the picture) it is only necessary to plot the vertices and then join them by straight lines.

Problem 2.60 What are the s1 and s2 co-ordinates of the point x = [1, 2, 3], if p, q are as above, e1 = [0, 1/√2, 1/√2] and e2 = [0, −1/√2, 1/√2]?

Problem 2.61 Plot the image on the s1-s2 plane of the tetrahedron whose vertices are located at [0, 0, 0], [0, 1, 0], [0, 1/2, √3/2] and [√3/6, √6/3, 1/2]. (Use the same values as before: p = [−10, 0, 0], q = [10, 0, 0], e1 = [0, 1, 0], e2 = [0, 0, 1] and e3 = [−1, 0, 0].)

Problem 2.62 Suppose that points x lie on the line x = x0 + tv. Show that the corresponding planar points [s1, s2] also lie on a line. (Hint: show that there are numbers a, b, c that do not depend on t, so that a s1 + b s2 = c for every t.)
Problem 2.63 Consider a different drawing procedure where the point x maps to the point on the view plane given by the intersection of the plane with the line through x parallel to e3. Find a formula for the s1 and s2 co-ordinates of x.
2.8 Solutions to Chapter Problems

Solution 2.1 See Figure 2.25.

Figure 2.25: Solution to problem 2.1

Solution 2.2
(i) A straight line passing through the origin in the direction of a.
(ii) A ray (half line) passing through the origin in the direction of a.
(iii) A straight line parallel to a passing through b.
(iv) If a and b do not lie on the same line: a plane passing through the origin parallel to a and b (in two dimensions, this is the whole plane). If a and b lie on the same line: a straight line passing through the origin in the direction of a (and b).
(v) If a and b both lie on the same line: a line passing through c parallel to a (and b). If a and b do not lie on the same line: a plane passing through c parallel to a and b (in two dimensions, this is the whole plane).

Solution 2.3 a − b is the vector that when added to b gives a. So if we draw it with its tail at b then its head is at a. Similarly b − a when drawn with its tail at a has its head at b.

Solution 2.4 To find the midpoint we add half of the vector with its tail at a and head at b to a. So the midpoint is a + (1/2)(b − a) = (1/2)(a + b). Similarly, the point one third of the way from a to b is a + (1/3)(b − a) = (2/3)a + (1/3)b.
Solution 2.5 The line segment is given by the vectors a + t(b − a) as t varies between 0 and 1.

Solution 2.6 a = (1, 2); b = (1, −2)
(a) a + b = (2, 0)
(b) 2a = (2, 4)
(c) a − b = (0, 4)
(d) a · b = 1 − 4 = −3
(e) ‖b‖ = √(1 + 4) = √5

Solution 2.7 The radius of the circle is

    ‖(2, 5) − (3, 3)‖ = ‖(−1, 2)‖ = √(1 + 4) = √5.

Thus using the standard equation for a circle

    (x − 2)² + (y − 5)² = (√5)² = 5.

Solution 2.8 The equation is ‖x − a‖ = r, or

    √((x1 − a1)² + (x2 − a2)² + (x3 − a3)²) = r

or

    (x1 − a1)² + (x2 − a2)² + (x3 − a3)² = r²

Solution 2.9 The midpoint of the sphere is the midpoint of [2, 1, 4] and [4, 3, 10], i.e., (1/2)([2, 1, 4] + [4, 3, 10]) = [3, 2, 7]. The radius is half the distance between [2, 1, 4] and [4, 3, 10], i.e.,

    (1/2) √((2 − 4)² + (1 − 3)² + (4 − 10)²) = √11.

Thus the equation is

    (x1 − 3)² + (x2 − 2)² + (x3 − 7)² = 11.

Solution 2.10
(a) a · b = −2 + 6 = 4, θ = arccos(4/(√5 √13)) = 1.05 . . .
(b) a · b = 1, θ = 1.249 . . .
(c) a · b = 4, θ = 0
(d) a · b = 2, θ = 1.079 . . .
(e) a · b = 0, θ = π/2
Solution 2.11 Use the dot product to determine angles between vectors, not the cross product (which only applies in 3D and cannot distinguish angles θ in the range (0, π/2) from those in (π/2, π), as discussed in the solution to problem 2.10 in the online notes).

(a) a = (1, 1, 1), b = (3, 1, −2)
    a · b = 3 + 1 − 2 = 2
    ‖a‖ = √3, ‖b‖ = √(3² + 1² + (−2)²) = √14
    cos θ = 2/(√3 √14), so θ = cos⁻¹(2/(√3 √14)) ≈ 1.26 radians, or 72.0°

(b) Using a · b and ‖a‖ above, proj_a b = (2/3)(1, 1, 1) = (2/3, 2/3, 2/3). Remember, proj_a b should be in the direction of a.

Solution 2.12
    a = (1, 4, 0), ‖a‖ = √17
    b = (2, −1, 5), ‖b‖ = √(4 + 1 + 25) = √30

(a) cos θ = (b · a)/(‖a‖ ‖b‖) = (2 − 4)/(√17 √30) = −2/√510, so θ = cos⁻¹(−2/√510) ≈ 1.66 in radians or ≈ 95°

(b) proj_a b = ((b · a)/‖a‖²) a = (−2/17)(1, 4, 0) = (−2/17, −8/17, 0).

Solution 2.13 The dot product is zero if −1 + 2 + s = 0, i.e., if s = −1.

Solution 2.14 Let a = [1, 2, 3], b = [4, 0, 5] and c = [3, 4, 6]. The sides of the triangle are in the directions of a − b = [−3, 2, −2], a − c = [−2, −2, −3], and b − c = [1, −4, −1]. To see if there are any right angles we compute (a − b) · (a − c) = [−3, 2, −2] · [−2, −2, −3] = 8, (a − b) · (b − c) = [−3, 2, −2] · [1, −4, −1] = −9 and (a − c) · (b − c) = [−2, −2, −3] · [1, −4, −1] = 9. Since none of these are zero, there are no right angles.

Solution 2.15 If [c1, 1, c2] = s[2, −2, 3] for some scalar s, then s = −1/2 (from the values in the second component), hence c1 = −1, c2 = −3/2.

Solution 2.16 The problem becomes much simpler if we choose the y axis to run along the runway and the x axis to run perpendicular to the runway. Replace the wind and plane velocities by vectors w and p as on the diagram in Figure 2.26. The wind vector has components [−10 cos(20°), −10 sin(20°)] while the plane velocity has components [70 sin(θ), 70 cos(θ)]. We want the x component to be zero, hence we need 10 cos(20°) = 70 sin(θ).
This gives θ = 7.7°, i.e., to get the heading of the plane we have to add 7.7° to the runway direction. Thus the plane's heading is 267.7°. The groundspeed is the magnitude of the velocity, in this case simply the y component given by 70 cos(7.7°) − 10 sin(20°) = 66 knots.

Figure 2.26: Coordinate system for Solution 2.16.

Solution 2.17 If we use the co-ordinate system where the x axis is horizontal and the y axis is vertical, then the force of gravity is Fg = [0, −mg]. The unit vector in the direction along the shaft is p = [− sin(θ), cos(θ)]. The force Fs exerted by the shaft is in the direction of the shaft, hence a multiple of p. So Fs = tp for some number t. To find t we use that the component of the total force in the direction of the shaft must be zero. Thus proj_p Fg + Fs = 0, i.e., p · Fg + t = 0, i.e., t = mg cos(θ). Thus the total force is

    Fg + Fs = [0, −mg] + mg cos(θ)[− sin(θ), cos(θ)] = mg[− cos(θ) sin(θ), −1 + cos²(θ)].

Note that this is orthogonal to p, as it must be. If we use co-ordinates where the y axis runs along the shaft of the pendulum, and the x axis perpendicular to it, then Fg = [−mg sin(θ), −mg cos(θ)] while Fs = [0, t] for some value of t. In this case it is simply the y component that must be zero, so t = mg cos(θ) and Fg + Fs = [−mg sin(θ), 0].

Solution 2.18

    sqrt(a(1)^2+a(2)^2)

Solution 2.19 [1, 2, 3] × [4, 5, 6] = [−3, 6, −3]

Solution 2.20

\[
\det \begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 1 & 0 & -1 \end{bmatrix}
= 1(-2 - 0) - 1(-1 - 3) + 1(0 - 2) = -2 + 4 - 2 = 0
\]

Since the determinant is zero, the volume of the parallelepiped generated by the row vectors is zero. This implies that the vectors lie on the same plane.
Solution 2.21 a · (a × b) = [a1, a2, a3] · [a2 b3 − a3 b2, a3 b1 − a1 b3, a1 b2 − a2 b1] = a1 a2 b3 − a1 a3 b2 + a2 a3 b1 − a2 a1 b3 + a3 a1 b2 − a3 a2 b1 = 0. Now b · (a × b) = −b · (b × a) = 0 by the previous calculation.

Solution 2.22

(a) ((1, 4, −1) · (2, 1, 3)) ((2, 1, 4) × (1, 4, 9))

\[
= (2 + 4 - 3) \det \begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 2 & 1 & 4 \\ 1 & 4 & 9 \end{bmatrix}
= 3\left(\hat{\imath}(9 - 16) + \hat{\jmath}(4 - 18) + \hat{k}(8 - 1)\right)
= 3(-7, -14, 7) = (-21, -42, 21)
\]

(b) (7, 1, 0) · ((2, 0, −1) × (1, 4, 3))

\[
= (7, 1, 0) \cdot \det \begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 2 & 0 & -1 \\ 1 & 4 & 3 \end{bmatrix}
= (7, 1, 0) \cdot \left(\hat{\imath}(0 + 4) + \hat{\jmath}(-1 - 6) + \hat{k}(8 - 0)\right)
= (7, 1, 0) \cdot (4, -7, 8) = 28 - 7 + 0 = 21
\]

(c) (a × b) × (b × a) = (a × b) × (−a × b) = −(a × b) × (a × b) = 0 because c × c = 0 for any vector c.

Solution 2.23 Here we are assuming that θ lies in [0, π] so that sin(θ) is positive. The quantity ‖a‖ is the length of the base of the parallelogram while ‖b‖ sin(θ) is the height. The area of a parallelogram is the product of these.

Solution 2.24 Since a × b = −b × a it is never true that a × b = b × a, unless a × b = 0. For the other example, just try i, j and k. We have i × j = k so

    i × (i × j) = i × k = −j

On the other hand

    (i × i) × j = 0 × j = 0

Solution 2.25

    a=rand(3,1); b=rand(3,1); c=rand(3,1);
    cross(a,b) - cross(b,a)                      % nonzero: a x b is not b x a
    cross(a,cross(b,c)) - cross(cross(a,b),c)    % nonzero: cross product is not associative

This actually constitutes a proof, since we found a counter example. We don't have to show that it's always the case, we only had to show that in general the two assertions are not true.
Solution 2.26 This is just a long calculation. One way to break it up is to consider a = i, j and k separately. Suppose we can prove it for these special cases. Then write a general a as a1 i + a2 j + a3 k. Then

    a × (b × c) = a1 i × (b × c) + a2 j × (b × c) + a3 k × (b × c)

Assuming for the moment we know that the special cases hold, then this equals

    a1 ((i · c)b − (i · b)c) + a2 ((j · c)b − (j · b)c) + a3 ((k · c)b − (k · b)c) = (a · c)b − (a · b)c

Now we still have to prove the special cases. For example if a = i we have

\[
i \times (b \times c) = \det \begin{bmatrix} i & j & k \\ 1 & 0 & 0 \\ b_2 c_3 - b_3 c_2 & b_3 c_1 - b_1 c_3 & b_1 c_2 - b_2 c_1 \end{bmatrix}
= -(b_1 c_2 - b_2 c_1)\, j + (b_3 c_1 - b_1 c_3)\, k
= [0,\; -b_1 c_2 + b_2 c_1,\; b_3 c_1 - b_1 c_3]
\]

On the other hand (i · c)b − (i · b)c = c1 [b1, b2, b3] − b1 [c1, c2, c3] = [0, c1 b2 − b1 c2, c1 b3 − b1 c3]. The other two are similar.

Solution 2.27

    a=rand(3,1); b=rand(3,1); c=rand(3,1);
    cross(a,cross(b,c)) - (dot(a,c)*b - dot(a,b)*c)    % zero up to rounding

Solution 2.28

    (a × b) · (c × d) = ((a × b) × c) · d          (by property 5)
                      = −(c × (a × b)) · d          (by property 1)
                      = −((c · b)a − (c · a)b) · d
                      = −(c · b)(a · d) + (c · a)(b · d)

Solution 2.29
(a) Let v = a × (a × b). Since a × b is orthogonal to the paper, v must lie in the plane of the paper. v is also orthogonal to a. Checking orientations with the right hand rule, there are only two sketches (up to rotation and resizing the vectors), shown in Figure 2.27.

(b) a × (a × b) = (b · a)a − (a · a)b by property 2 in the notes. Hence

\[
a \times (a \times b) = \|a\|^2 \left( \frac{b \cdot a}{\|a\|^2}\, a - b \right) = \|a\|^2 \left( \mathrm{proj}_a b - b \right).
\]

Solution 2.30 Actually there is no one right answer to this problem. But in two dimensions the analog of the cross product could be an operation on a single vector, say ×(a), with the property that

\[
\times(a) \cdot b = \det \begin{bmatrix} a_1 & a_2 \\ b_1 & b_2 \end{bmatrix}.
\]

This implies that ×(a) = [−a2, a1]. Note that ×(a) is orthogonal to a. The answer in 4 dimensions is a little more obscure.
In this case the cross product could be an operation on three vectors that produces a vector that is orthogonal to all of them. One way of producing such a vector would be to do the formal calculation of a 4 × 4 determinant, where the first row contains the four unit vectors e1, e2, e3 and e4, while the other three rows contain the entries of a, b and c. Do you see why? (There is another possible, perhaps better answer, involving a different type of product called the wedge product.)
Figure 2.27: Vector sketch for the solution to 2.29.

Solution 2.31 Let q be a point on the line. The projection of x in the direction of b is (x · b/‖b‖²) b. The set of all points whose projection in the direction of b is the same as the projection of the point q in the direction of b is therefore the set of points x such that x · b/‖b‖² = q · b/‖b‖². Multiplying both sides by ‖b‖² gives x · b = q · b.

Solution 2.32 The line is in the direction of a = [7, 5] − [0, 1] = [7, 4] and is therefore perpendicular to b = [4, −7]. q = [0, 1] is a point on the line. So the parametric form for the line is [0, 1] + s[7, 4] and the equation form is 4x1 − 7x2 = −7. So five points on the line are (picking s = −2, −1, 0, 1, 2) [−14, −7], [−7, −3], [0, 1], [7, 5], [14, 9]. To check whether [1012/3, 1069/21] lies on the line, plug it into the equation: 4 · 1012/3 − 7 · 1069/21 = 993 ≠ −7, so the point does not lie on the line.

Solution 2.33 We can take q = [1, 1] and b = [2, 1] so the equation is 2x1 + x2 = 3.

Solution 2.34 We have to find a point on the line. Try for one of the form [t, 0]. This is on the line if t − 3 · 0 = 5. So [5, 0] is on the line. The vector orthogonal to the line is [1, −3] so a vector parallel is [3, 1]. Thus the parametric form is [5, 0] + s[3, 1].

Solution 2.35 Pick a point q on the line and let b be orthogonal to the line. Then the distance from a point x to the line is the length of the projection of x − q onto b as shown in Figure 2.28. In this case we can take b = [3, −4], q = [0, 1] and x = [−2, 3]. The projection is ((x − q) · b/‖b‖²) b. This equals ([−2, 2] · [3, −4]/25)[3, −4] = (−14/25)[3, −4]. The length of this vector is 14/5.

Solution 2.36
(a) The normal direction to the plane is (1, −1, 2).
(b) The point (7, 0, 0) is on the plane (look for a solution with y = z = 0, that is, the intersection of the plane with the x-axis).

Solution 2.37 The midpoint between b and c is (b + c)/2.
So the vector from a to this midpoint is (b + c)/2 − a and thus the parametric form for the median is a + t((b + c)/2 − a). Similarly the parametric forms for the other medians are b + s((a + c)/2 − b) and c + r((a + b)/2 − c). When t = s = r = 2/3 these lines meet at the common point (a + b + c)/3.

Figure 2.28: The projection used in Solution 2.35.

Solution 2.38 The line is in direction (0, 1, 3). Two vectors which are orthogonal to the line are (1, 0, 0) and (0, 3, −1), because (1, 0, 0) · (0, 1, 3) = 0 and (0, 3, −1) · (0, 1, 3) = 0. Note that there are other choices for these two vectors that lead to different correct answers below. There will be two equations of the form

    x = c1
    3y − z = c2

Substituting the point (2, 0, 4) that we know is on the line gives c1 = 2 and c2 = −4. Thus an equation form of the line is

    x = 2
    3y − z = −4

Solution 2.39 To find an equation for the plane we need an orthogonal vector

\[
(0, 1, -3) \times (-1, 2, 0) = \det \begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 0 & 1 & -3 \\ -1 & 2 & 0 \end{bmatrix}
= \hat{\imath}(0 + 6) + \hat{\jmath}(3 - 0) + \hat{k}(0 + 1) = (6, 3, 1).
\]

So an equation for the plane is 6x + 3y + z = c and we know that c = 0 since the plane goes through the origin. Any point on the line has the form (2, −1, 6) + s(1, −1, 0) = (2 + s, −1 − s, 6). Substituting this into the equation for the plane gives

    6(2 + s) + 3(−1 − s) + 6 = 0
    3s + 15 = 0
    s = −5

So the intersection point is (2, −1, 6) − 5(1, −1, 0) = (2, −1, 6) + (−5, 5, 0) = (−3, 4, 6). Note: it can be checked that t = −2 and u = 3 in the parametric form of the plane gives this point as well.

Solution 2.40 Points on the line satisfy

    x = 1 + t
    y = 2
    z = 3 − t.

To be on the plane, x + 2y − z = 5, or (1 + t) + 2(2) − (3 − t) = 5, or 2t + 2 = 5, or t = 3/2. Thus the intersection occurs at (1, 2, 3) + (3/2)(1, 0, −1) = (5/2, 2, 3/2).

Solution 2.41 Two vectors in the direction of the plane are [1, 1, 0] − [1, 0, 1] = [0, 1, −1] and [0, 1, 1] − [1, 0, 1] = [−1, 1, 0]. To find a normal vector take the cross product [0, 1, −1] × [−1, 1, 0] = [1, 1, 1]. So the equation of the plane is x1 + x2 + x3 = 2.

Solution 2.42 The centre of the sphere lies in the plane halfway between the planes x1 + x2 + x3 = 3 and x1 + x2 + x3 = 9. This is the plane x1 + x2 + x3 = 6. So the centre satisfies the three equations

    x1 + x2 + x3 = 6
    2x1 − x2 = 0
    3x1 − x3 = 0

We will develop efficient techniques for solving such systems of equations. For now, we can just use brute force: the second equation says x2 = 2x1 and the third x3 = 3x1. Substituting this into the first equation yields x1 = 1, so x2 = 2 and x3 = 3. Therefore the centre of the sphere is [1, 2, 3]. To compute the radius, we must find the distance between the planes x1 + x2 + x3 = 3 and x1 + x2 + x3 = 6. If we divide the equations by ‖[1, 1, 1]‖ = √3, the first can be interpreted as vectors whose projection onto the direction [1, 1, 1] has length 3/√3 = √3, and the second as vectors whose projection onto the direction [1, 1, 1] has length 6/√3 = 2√3. The radius is therefore √3, so the equation is (x1 − 1)² + (x2 − 2)² + (x3 − 3)² = 3.

Solution 2.43

    x + y + z = 2,    normal (1, 1, 1) = b1
    x − y + 2z = 7,   normal (1, −1, 2) = b2

The line direction can be found as

\[
a = b_1 \times b_2 = \det \begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 1 & 1 & 1 \\ 1 & -1 & 2 \end{bmatrix} = (3, -1, -2).
\]
We also need a point on the line. We can look for a point which has z = 0 (the point that is the intersection of the line and the x-y plane). We get the two by two system

    x + y = 2,    x − y = 7

Solving the system yields x = 9/2 and y = −5/2. So (9/2, −5/2, 0) + s(3, −1, −2) is a parametric form of the line with parameter s.

Solution 2.44 Set x2 = 0 and solve 2x1 − x3 = 0 and x1 + 2x3 = −5, giving x1 = −1 and x3 = −2. This implies [−1, 0, −2] is in the plane we are looking for. Set x3 = 0 and solve 2x1 + 3x2 = 0 and x1 − 4x2 = −5, giving x1 = −15/11 and x2 = 10/11. Hence [−15/11, 10/11, 0] also lies on the plane. We now have three points on the plane. Two vectors in the direction of the plane are therefore [−2, 0, 1] − [−1, 0, −2] = [−1, 0, 3] and [−15/11, 10/11, 0] − [−1, 0, −2] = [−4/11, 10/11, 2]. Thus a normal vector is [−1, 0, 3] × [−4/11, 10/11, 2] = [−30/11, 10/11, −10/11]. This is parallel to [−3, 1, −1]. Hence the equation is −3x1 + x2 − x3 = 5.

Solution 2.45 The three points are all on the same line. To see this, notice that the vectors [2, 3, 4] − [1, 2, 3] = [1, 1, 1] and [3, 4, 5] − [1, 2, 3] = [2, 2, 2] are parallel.

Solution 2.46 The distance of p to the plane b · x = c is the length of the projection of p − q onto b, where q is any point on the plane. A point on the plane is [c/b1, 0, 0] (unless b1 = 0, in which case we choose either [0, c/b2, 0] or [0, 0, c/b3]). The length of the projection is |(p − q) · b|/‖b‖ = |b · p − c|/‖b‖.

Solution 2.47 If the line is parallel to both planes, then it is orthogonal to both normal vectors. Therefore [1, 1, 0] × [1, −1, 2] = [2, −2, −2] is in the direction of the line. Therefore the parametric form of the line is [2, −1, −1] + s[2, −2, −2]. The two equations are x1 + x2 = [1, 1, 0] · [2, −1, −1] = 1 and x1 − x2 + 2x3 = [1, −1, 2] · [2, −1, −1] = 1.
Solution 2.48

    x=-3:0.1:3
    y = sqrt(9-x.^2)     % upper half of the circle of radius 3
    plot(x,y,'.')
    hold on
    plot(x,-y,'.')       % lower half
    axis equal

Solution 2.49 Yes, [1, 1] and [1, 0] do form a basis, since they don't lie on the same line. (By the way, there is some potential for confusion here. When I say here that [1, 1] and [1, 0] don't lie on the same line, I'm thinking of them as position vectors, drawn with their tails at the origin. It probably would be more clear to say that the vectors are not parallel.) Every vector in the plane can be written as a linear combination of these two. In particular

    [0, 1] = [1, 1] − [1, 0]
Solution 2.50 Note that a and b are linearly independent (they are not multiples of one another). So a, b and c will be linearly independent, and so a basis for R3, unless c is a linear combination of a and b, that is, it can be written as c = ra + sb for some r and s.

Solution 2.51 The collection is a basis if the vectors are linearly independent. We consider

    ra + sb + tc = 0

which leads to

    r + s + t = 0
    r + s = 0
    r = 0

Starting from the last equation and working up, we see that r = 0, s = 0, and t = 0, which shows that the vectors are linearly independent and therefore form a basis. The linear independence could also have been shown by considering the determinant

\[
\det \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix} = -1 \neq 0
\]

Since the determinant is not zero the vectors do not lie on the same plane and so are linearly independent. To write [1, 2, 3] as a linear combination

    [1, 2, 3] = r[1, 1, 1] + s[1, 1, 0] + t[1, 0, 0]

match components working backwards to get successively r = 3, s = −1 and t = −1.

Solution 2.52 The vectors a, b, and c are vectors in 3-D. Thus they form a basis if and only if they are linearly independent. The three vectors are linearly independent if and only if the matrix A with rows given by the three vectors satisfies det(A) ≠ 0. Here

\[
A = \begin{bmatrix} 1 & 0 & 4 \\ 2 & -1 & 0 \\ 8 & -3 & 8 \end{bmatrix}
\]

A direct calculation gives det(A) = 1(−8 − 0) − 0 + 4(−6 + 8) = 0, so the vectors are linearly dependent and do not form a basis.

Solution 2.53 No, it is not possible for four vectors to be linearly independent in three dimensional space. To see this, first notice that if four vectors are linearly independent, any subset of three vectors must also be linearly independent. (If three vectors would lie on the same plane, we could find a non-trivial linear combination of those three equal to zero. Then by adding 0 times the left over vector, we would get a non-trivial linear combination of all four vectors equal
to zero, contradicting their independence.) But this means that those three vectors form a basis for three dimensional space. So the fourth vector must be expressible as a linear combination of the first three. This means the four vectors are not independent.

Solution 2.54 It is enough to check that det(A) ≠ 0. We have that

\[
A = \begin{bmatrix} 2 & 1 & 3 \\ 1 & 0 & 2 \\ 3 & 0 & 0 \end{bmatrix}, \qquad \det(A) = 6 \neq 0,
\]

so they form a basis. We want to find s1, s2, and s3 such that

\[
\begin{bmatrix} 12 \\ 2 \\ 4 \end{bmatrix}
= s_1 \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix}
+ s_2 \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix}
+ s_3 \begin{bmatrix} 3 \\ 0 \\ 0 \end{bmatrix}.
\]

We have the following system of equations:

    12 = 2s1 + s2 + 3s3
     2 = s1
     4 = 3s1 + 2s2

Hence s1 = 2, s2 = −1, s3 = 3.

Solution 2.55 If x = s1 a1 + s2 a2 + · · · + sn an and x = t1 a1 + t2 a2 + · · · + tn an, then 0 = x − x = (s1 − t1)a1 + (s2 − t2)a2 + · · · + (sn − tn)an. Since the vectors a1, a2, . . . , an are linearly independent, the only way this can happen is s1 − t1 = 0, . . . , sn − tn = 0, or s1 = t1, . . . , sn = tn.

Solution 2.56 The values alfa = 0.23, beta = 0.15 provide a good approximation to the linear combination.

Solution 2.57 The axis is parallel to [2, −3, 1] − [1, 1, −1] = [1, −4, 2]. The unit vector in this direction is (1/√21)[1, −4, 2]. (Actually there is an ambiguity in the problem: the unit vector could also point in the opposite direction. This would change the sign of the answer.) So Ω = (10/√21)[1, −4, 2]. The vector r has its tail co-inciding with the tail of Ω and its head at [1, 2, 3]. So r = [1, 2, 3] − [1, 1, −1] = [0, 1, 4] and the velocity is v = Ω × [0, 1, 4] = (10/√21)[−18, −4, 1].

Solution 2.58 Some preliminary discussion: (i) it does not matter where the centre of the rod is along L, we can take it to be at the origin for convenience; (ii) the fastest speed is at the rod tip, and this does not change in time, so (iii) it is OK to evaluate the speed at the initial position. Ω = 3 · 2π = 6π radians/min.
â = (1/√3)(1, 1, 1) is the unit vector in the direction of the axis of rotation. Then v = Ωâ × r = (6π/√3)(1, 1, 1) × (1, 0, 0), with r the initial position of the tip. Now evaluate the cross product:

\[
\det \begin{bmatrix} \hat{\imath} & \hat{\jmath} & \hat{k} \\ 1 & 1 & 1 \\ 1 & 0 & 0 \end{bmatrix} = (0, 1, -1),
\]

so v = (6π/√3)(0, 1, −1), and hence the speed is ‖v‖ = 2π√6 m/min.

Solution 2.59 We have [x(t), y(t), 0] = r[cos(θ(t)), sin(θ(t)), 0] so the velocity is

\[
[\dot{x}(t), \dot{y}(t), 0] = r[-\sin(\theta(t))\,\dot{\theta}(t),\; \cos(\theta(t))\,\dot{\theta}(t),\; 0] = r\,\dot{\theta}(t)\,[-\sin(\theta(t)),\; \cos(\theta(t)),\; 0]
\]

On the other hand Ω = [0, 0, θ̇(t)], so that Ω × r[cos(θ(t)), sin(θ(t)), 0] gives the same answer.

Solution 2.60 We have q − p = [20, 0, 0], x − p = [11, 2, 3]. Thus e1 · (q − p) = e2 · (q − p) = 0, e3 · (q − p) = −20, e1 · (x − p) = 5√2/2, e2 · (x − p) = √2/2, e3 · (x − p) = −11. So s1 = 50√2/11 and s2 = 10√2/11.

Solution 2.61 The images of the vertices are [0, 0], [2, 0], [1, √3] and some more horrible expression which is approximately [1.587 . . . , 0.9719 . . .]. So the image on the screen looks something like what is shown in Figure 2.29.

Figure 2.29: Mapped tetrahedron of Solution 2.61.

Notice that the first three points ([0, 0], [2, 0] and [1, √3]) are simply double the x2 and x3 co-ordinates of the original points in space. Can you explain this geometrically? (I know these are supposed to be answers, not questions, but still . . .) (By the way, I actually intended the last point in the question to be [√6/3, 1/2, √3/6]. Can you say why?)

Solution 2.62 We can write

\[
[s_1, s_2] = \frac{1}{e_3 \cdot (x_0 - p) + t\, e_3 \cdot v}\, \left[\, e_1 \cdot (x_0 - p) + t\, e_1 \cdot v,\;\; e_2 \cdot (x_0 - p) + t\, e_2 \cdot v \,\right]
\]
This means that a s1 + b s2 = c can be rewritten

    a(e1 · (x0 − p) + t e1 · v) + b(e2 · (x0 − p) + t e2 · v) − c(e3 · (x0 − p) + t e3 · v) = 0

This holds for every t if both these equations hold:

    a(e1 · (x0 − p)) + b(e2 · (x0 − p)) − c(e3 · (x0 − p)) = 0
    a e1 · v + b e2 · v − c e3 · v = 0

This is a system of two equations in three unknowns, a, b, and c. Such a system always has a non-trivial solution (as we will see in the following chapter).

Solution 2.63 In this case we want y − x to point in the same direction as e3, so y − x = λe3. On the other hand y lies on the plane of the screen, so y = q + s1 e1 + s2 e2. Therefore q − x + s1 e1 + s2 e2 = λe3. Now to find s1 and s2 take dot products with e1 and e2. This yields s1 = −e1 · (q − x) and s2 = −e2 · (q − x).
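The formula of Solution 2.63 is easy to try numerically. The lines below are only a sketch: the variable names are illustrative, and the values of q, e1, e2 and x are borrowed from Example 2.7 purely as sample data.

```matlab
% Parallel (orthographic) projection along e3, as in Solution 2.63 (sketch).
q  = [10, 0, 0];    % origin of the view plane
e1 = [0, 1, 0];     % first direction in the view plane
e2 = [0, 0, 1];     % second direction in the view plane
x  = [1, 1, 1];     % point on the object

s1 = -dot(e1, q - x);    % s1 = -e1.(q - x)
s2 = -dot(e2, q - x);    % s2 = -e2.(q - x)
disp([s1, s2])
```

Unlike the perspective formula, no division by e3 · (x − p) appears here, so the image does not shrink as the object moves away from the view plane.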
Chapter 3

Solving Linear Systems

3.1 Linear Systems

3.1.1 General Form of Linear Systems

So far, we have seen systems of linear equations as the equations that describe points, lines and planes. However, linear systems of equations show up in many other ways in engineering problems. We will solve linear systems to find the behaviour of an electrical circuit. Other examples would be the calculation of equilibrium temperature distributions or electric fields. Such examples often involve the discretization of a continuous function. In other words, a continuous function like the temperature distribution in a body (which has a value for each of the infinitely many points in the body) will be replaced by a list of temperatures at a large but finite number n of closely spaced points. This gives rise to a system of linear equations in n unknowns, where n can be in the tens of thousands, or higher.

Therefore, we want to develop a technique to solve systems of linear equations in n unknowns when n is large. The most general form of a linear system of equations is

    a1,1 x1 + a1,2 x2 + · · · + a1,n xn = c1
    a2,1 x1 + a2,2 x2 + · · · + a2,n xn = c2
      . . .
    am,1 x1 + am,2 x2 + · · · + am,n xn = cm

Here the numbers ai,j and cj are known, and the goal is to find all values of x1, . . . , xn that satisfy all the equations.

3.1.2 Solving Linear Systems by Substitution

Let us start with an example, which we will solve using the method of substitution.
Example 3.1 Consider the system of equations

    x1 + x2 + x3 = 6
    x1 − x2 + x3 = 0
    2x1 + x2 − 8x3 = −11

One could try to proceed as follows. Solve the first equation for, say, x3. This gives

    x3 = 6 − x1 − x2.

Now substitute this value for x3 into the second and third equations. This gives

    x1 − x2 + (6 − x1 − x2) = 0
    2x1 + x2 − 8(6 − x1 − x2) = −11

or

    −2x2 = −6
    10x1 + 9x2 = 37

Now solve the first of these equations for x2 and substitute into the last equation. This gives x2 = 3 and x1 = 1. Finally we can go back and calculate x3 = 6 − 1 − 3 = 2.

Although this procedure works fine for n = 2 or even n = 3, it rapidly becomes unwieldy for larger values of n. We will now introduce a technique called Gaussian elimination that works well for large n and can be easily implemented on a computer.

We have already observed that there may be many systems of equations with the same solution. When there are only two unknowns, this amounts to saying that different pairs of lines may intersect in the same point. Gaussian elimination is based on the following idea. We introduce three elementary row operations. These operations change the system of equations into another system with exactly the same set of solutions. We then apply these elementary row operations in a systematic way to change the system of equations into a system that is easily solved.

3.1.3 Elementary row (equation) operations

The first elementary row operation is

1. Multiplication of a row (equation) by a non-zero number

For example, if we multiply the first equation in the system of Example 3.1 above by 3, we end up with

    3x1 + 3x2 + 3x3 = 18
    x1 − x2 + x3 = 0
    2x1 + x2 − 8x3 = −11
This new system of equations has exactly the same solutions as the original system, because we can undo the elementary row operation simply by dividing the first equation by 3. Thus the values $x_1, x_2, x_3$ solve this system if and only if they solve the original system. (Notice that this would not be true if we multiplied by zero. In that case we could not undo the operation, and the new system of equations could well have more solutions than the original system.) Any row operation we do can be undone by another row operation, so the set of solutions of the linear system remains unchanged.

The second elementary row operation is

2. Adding a multiple of one row (equation) to another row

For example, if we added 2 times the first row to the second row in our example we would obtain the system

\[
\begin{aligned}
x_1 + x_2 + x_3 &= 6 \\
3x_1 + x_2 + 3x_3 &= 12 \\
2x_1 + x_2 - 8x_3 &= -11
\end{aligned}
\]

Again, the new system of equations has exactly the same solutions as the original system, since we could undo this elementary row operation by subtracting 2 times the first row from the second row.

The third and final elementary row operation is

3. Interchanging two rows (equations)

For example, if we swapped the first and second equations in our original system we would end up with

\[
\begin{aligned}
x_1 - x_2 + x_3 &= 0 \\
x_1 + x_2 + x_3 &= 6 \\
2x_1 + x_2 - 8x_3 &= -11
\end{aligned}
\]

This obviously doesn’t change the solutions of the system since we have the same equalities.

3.1.4 Augmented Matrices

To save unnecessary writing, we now set up a streamlined notation for systems of linear equations. Notice that the only things that distinguish one system of equations from another are the coefficients and the right sides. So, as shorthand, we can write the system of equations (the coefficients of Example 3.1, here with different right sides)

\[
\begin{aligned}
x_1 + x_2 + x_3 &= 3 \\
x_1 - x_2 + x_3 &= 3 \\
2x_1 + x_2 - 8x_3 &= -4
\end{aligned}
\]

simply as

\[
\left[\begin{array}{ccc|c}
1 & 1 & 1 & 3 \\
1 & -1 & 1 & 3 \\
2 & 1 & -8 & -4
\end{array}\right]
\]
This is called an augmented matrix. “Augmented” refers to the column to the right of the line that contains the information about the right side of each equation.

3.1.5 Problems

Problem 3.1 Express the system

\[
\begin{aligned}
x_1 - 2x_2 + 3x_3 &= 6 \\
4x_1 - 5x_2 - 6x_3 &= 7 \\
8x_1 + 9x_2 + 10x_3 &= 11
\end{aligned}
\]

as an augmented matrix.

Problem 3.2 Start with the system

\[
\begin{aligned}
x_1 + x_2 + x_3 &= 6 \\
x_1 - x_2 + x_3 &= 0 \\
2x_1 + x_2 - 8x_3 &= -11
\end{aligned}
\]

and perform the following sequence of row operations:

1. Subtract the first row from the second row
2. Subtract twice the first row from the third row
3. Multiply the second row by $-1/2$
4. Add the second row to the third row
5. Multiply the third row by $-1/10$

Solve the resulting system of equations by starting with the third equation, then the second and then the first.

Problem 3.3 (Matlab) Let’s try to solve Problem 3.2 using Matlab.

1. Start by generating an augmented matrix to represent the system of equations:

    A=[1, 1, 1, 6; 1, -1, 1, 0; 2, 1, -8, -11]

2. Subtract the first row from the second row:

    A(2,:)=A(2,:)-A(1,:)

3. Subtract twice the first row from the third row:

    A(3,:)=A(3,:)-2*A(1,:)

4. Multiply the second row by $-1/2$:
    A(2,:)=-A(2,:)/2

5. etc.

What other commands are necessary to arrive at the solution? Does it coincide with the pen and paper solution?

Problem 3.4 Start with the system

\[
\begin{aligned}
x_1 - x_2 + x_3 &= 10 \\
x_1 - x_2 - x_3 &= 6 \\
6x_1 + 3x_2 + x_3 &= 0
\end{aligned}
\]

Perform the following sequence of row operations

1. Subtract the second row from the first row
2. Add the first row to the second row
3. Subtract the first row from the third row
4. Multiply the second row by 3
5. Add the second row to the third row

Show that the resulting system of equations can be easily solved and find the solution of the above system of equations.

Problem 3.5 Without using a calculator or computer, find the solution to the system

\[
\begin{aligned}
2x_1 + x_2 &= 5 \\
3x_1 + 5x_2 &= -10
\end{aligned}
\]

You can leave the solutions as fractions.

3.2 Gaussian Elimination

Recall that we want to use a sequence of elementary row operations to turn an arbitrary system of equations into an easily solved system of equations (with exactly the same solutions). What equations are easily solved? Well, the easiest possible equations to solve are ones of the form below, written as an augmented matrix:

\[
\left[\begin{array}{ccc|c}
1 & 0 & 0 & 3 \\
0 & 1 & 0 & 3 \\
0 & 0 & 1 & -4
\end{array}\right]
\]

If we translate from the shorthand back to the equations they represent, the first row says $x_1 = 3$, the second row says $x_2 = 3$ and the third row says $x_3 = -4$. In other words, we can just read off the values of $x_1$, $x_2$ and $x_3$ in the rightmost column. The equations are already solved, and there is nothing left to do!

Slightly more work, but still easy to do, are upper triangular systems. These are systems where all the entries below the diagonal are equal to zero, as in

\[
\left[\begin{array}{ccc|c}
1 & 1 & 1 & 3 \\
0 & -1 & 1 & 3 \\
0 & 0 & -4 & -8
\end{array}\right]
\]
The reason these are easy to solve is that the equation represented by the $j$th row only involves the variables $x_j, x_{j+1}, \ldots, x_n$. So if we start with the last equation (in the example $-4x_3 = -8$), we can solve it immediately for $x_n$ (in the example $x_3 = 2$). Now we move up one equation. This equation only involves $x_{n-1}$ and $x_n$, and we already know $x_n$. So we can solve it for $x_{n-1}$ (in the example $-x_2 + x_3 = 3$ so $-x_2 + 2 = 3$ so $x_2 = -1$). We can continue in this way until all the $x_j$’s have been found. (In the example there is one more equation $x_1 + x_2 + x_3 = 3$ or $x_1 - 1 + 2 = 3$ or $x_1 = 2$.)

In practice (i.e., in a typical computer program) the procedure that is actually used is to apply a sequence of row operations to turn the system of equations into an upper triangular system. Then the equations are solved one by one, starting at the bottom and working up. This is the most efficient way to solve a system of equations. However, it’s sometimes convenient to apply row operations to bring the equations into the “completely solved” form. Then, you can just read off the solution from the last column.

Let us now do a bunch of examples to illustrate this procedure. I’ll cook them up so that everything that possibly can go wrong, does go wrong. (I’ll also cook them up so that the numbers come out looking nice. This will definitely not be the case in an example coming up in a real application!) Here is a shorthand for indicating which elementary row operation was done. The notation $3(1,:)$ means the first row was multiplied by the non-zero number 3. The notation $(2,:) = (2,:) - 4(5,:)$ means that 4 times the fifth row was subtracted from the second row. Finally, $(2,:) \leftrightarrow (3,:)$ means that the second and third rows were interchanged.

Example 3.2 Let us start with

\[
\left[\begin{array}{cccc|c}
1 & 2 & -2 & -7 & -29 \\
1 & 2 & -1 & -5 & -18 \\
0 & 3 & 0 & -3 & -6 \\
-1 & 4 & 1 & 1 & 14
\end{array}\right]
\]

We are trying to put this matrix in upper triangular form.
So we start by trying to produce zero entries in the first column under the top entry. We can do this by adding multiples of the first row to the other rows. So, the first move is to subtract the first row from the second row. The result is

\[
\left[\begin{array}{cccc|c}
1 & 2 & -2 & -7 & -29 \\
0 & 0 & 1 & 2 & 11 \\
0 & 3 & 0 & -3 & -6 \\
-1 & 4 & 1 & 1 & 14
\end{array}\right]
\qquad (2,:) = (2,:) - (1,:)
\]

The third row already has a zero in the first column, so there is nothing to do here. To put a zero in the fourth row we add the first row to the last row.

\[
\left[\begin{array}{cccc|c}
1 & 2 & -2 & -7 & -29 \\
0 & 0 & 1 & 2 & 11 \\
0 & 3 & 0 & -3 & -6 \\
0 & 6 & -1 & -6 & -15
\end{array}\right]
\qquad (4,:) = (4,:) + (1,:)
\]
Now we shift our attention to the second column. We want to produce zeros below the diagonal. If we attempt to do this by adding multiples of the first row to other rows, we will destroy the zeros that we have already produced. So we try to use the second row. This is where we run into the first glitch. Since the entry in the second column of the second row is zero, adding a multiple of this row to the others won’t have any effect on the numbers in the second column that we are trying to change. To remedy this we simply swap the second and third rows.

\[
\left[\begin{array}{cccc|c}
1 & 2 & -2 & -7 & -29 \\
0 & 3 & 0 & -3 & -6 \\
0 & 0 & 1 & 2 & 11 \\
0 & 6 & -1 & -6 & -15
\end{array}\right]
\qquad (2,:) \leftrightarrow (3,:)
\]

Now we can complete the job on the second column by subtracting 2 times the second row from the last row.

\[
\left[\begin{array}{cccc|c}
1 & 2 & -2 & -7 & -29 \\
0 & 3 & 0 & -3 & -6 \\
0 & 0 & 1 & 2 & 11 \\
0 & 0 & -1 & 0 & -3
\end{array}\right]
\qquad (4,:) = (4,:) - 2(2,:)
\]

Now we shift our attention to the third column. To produce a zero in the entry below the diagonal we must add the third row to the last row.

\[
\left[\begin{array}{cccc|c}
1 & 2 & -2 & -7 & -29 \\
0 & 3 & 0 & -3 & -6 \\
0 & 0 & 1 & 2 & 11 \\
0 & 0 & 0 & 2 & 8
\end{array}\right]
\qquad (4,:) = (4,:) + (3,:)
\]

The matrix is now in upper triangular form. Let us find the solution. The last row is shorthand for the equation $2x_4 = 8$. So $x_4 = 4$. The third row now gives $x_3 + 2(4) = 11$, so $x_3 = 3$. The second row gives $3x_2 - 3(4) = -6$ so $x_2 = 2$. Finally the first row gives $x_1 + 2(2) - 2(3) - 7(4) = -29$ so $x_1 = 1$.

Example 3.3 There is really no need to do anything more in Example 3.2, but let us continue with elementary row operations to put the equations into the “completely solved” form, just to see how this goes. First we divide the second row by 3.

\[
\left[\begin{array}{cccc|c}
1 & 2 & -2 & -7 & -29 \\
0 & 1 & 0 & -1 & -2 \\
0 & 0 & 1 & 2 & 11 \\
0 & 0 & 0 & 2 & 8
\end{array}\right]
\qquad (2,:) = (1/3)(2,:)
\]

Now we subtract twice the second row from the first row.

\[
\left[\begin{array}{cccc|c}
1 & 0 & -2 & -5 & -25 \\
0 & 1 & 0 & -1 & -2 \\
0 & 0 & 1 & 2 & 11 \\
0 & 0 & 0 & 2 & 8
\end{array}\right]
\qquad (1,:) = (1,:) - 2(2,:)
\]
Now add twice the third row to the first row. Then divide the last row by 2.

\[
\left[\begin{array}{cccc|c}
1 & 0 & 0 & -1 & -3 \\
0 & 1 & 0 & -1 & -2 \\
0 & 0 & 1 & 2 & 11 \\
0 & 0 & 0 & 1 & 4
\end{array}\right]
\qquad
\begin{array}{l}
(1,:) = (1,:) + 2(3,:) \\
(4,:) = (1/2)(4,:)
\end{array}
\]

Finally, we add various multiples of the last row to the previous rows.

\[
\left[\begin{array}{cccc|c}
1 & 0 & 0 & 0 & 1 \\
0 & 1 & 0 & 0 & 2 \\
0 & 0 & 1 & 0 & 3 \\
0 & 0 & 0 & 1 & 4
\end{array}\right]
\qquad
\begin{array}{l}
(1,:) = (1,:) + (4,:) \\
(2,:) = (2,:) + (4,:) \\
(3,:) = (3,:) - 2(4,:)
\end{array}
\]

We now can read the solution off from the last column.

In the previous example there was a unique solution to the system of equations. We already know, from the geometrical meaning of the equations, that sometimes there will be lots of solutions depending on a parameter. This is expected to happen when there are fewer equations than unknowns (e.g., the intersection of two planes in three dimensional space is usually a line) but will also occur in certain degenerate cases when the number of equations is equal to or more than the number of unknowns (e.g., three, or even four, planes may intersect in a line too). What happens in the procedure of row reductions when there are parameters in the solution? Let us look at another example.

Example 3.4 Consider the system

\[
\left[\begin{array}{cccc|c}
1 & 3 & 2 & -2 & -1 \\
1 & 3 & 4 & -2 & 3 \\
-2 & -6 & -4 & 5 & 5 \\
-1 & -3 & 2 & 1 & 6
\end{array}\right]
\]

Perform Gaussian Elimination on this system. We begin, as before, by trying to produce zeros in the first column under the diagonal entry. This procedure yields

\[
\left[\begin{array}{cccc|c}
1 & 3 & 2 & -2 & -1 \\
0 & 0 & 2 & 0 & 4 \\
0 & 0 & 0 & 1 & 3 \\
0 & 0 & 4 & -1 & 5
\end{array}\right]
\qquad
\begin{array}{l}
(2,:) = (2,:) - (1,:) \\
(3,:) = (3,:) + 2(1,:) \\
(4,:) = (4,:) + (1,:)
\end{array}
\]

As in the previous example, there is now a zero sitting in the diagonal spot in the second column. Last time, we swapped rows at this point to put a non-zero entry in this place. But now, all the other entries below this one are zero too! So there is nothing we can swap in to save the situation.
(Clearly, swapping the first row down is not a good idea, since that would destroy the zero in the first column.) So we just have to admit defeat, and shift our attention one column
to the right. We subtract twice the second row from the fourth row.

\[
\left[\begin{array}{cccc|c}
1 & 3 & 2 & -2 & -1 \\
0 & 0 & 2 & 0 & 4 \\
0 & 0 & 0 & 1 & 3 \\
0 & 0 & 0 & -1 & -3
\end{array}\right]
\qquad (4,:) = (4,:) - 2(2,:)
\]

Now we complete the job by adding the third row to the last row.

\[
\left[\begin{array}{cccc|c}
1 & 3 & 2 & -2 & -1 \\
0 & 0 & 2 & 0 & 4 \\
0 & 0 & 0 & 1 & 3 \\
0 & 0 & 0 & 0 & 0
\end{array}\right]
\qquad (4,:) = (4,:) + (3,:)
\]

What are the solutions? The third equation says $x_4 = 3$ and the second equation says $x_3 = 2$. There is nothing new here. However the first equation introduces not just one, but two new variables $x_1$ and $x_2$. It reads $x_1 + 3x_2 + 2(2) - 2(3) = -1$, or, $x_1 + 3x_2 = 1$. Clearly, there are infinitely many values of $x_1$ and $x_2$ that satisfy this equation. In fact, if we fix $x_2$ to be any arbitrary value $x_2 = s$, and then set $x_1 = 1 - 3s$, $x_1$ and $x_2$ will be solutions. So for any choice of $s$

\[
x_1 = 1 - 3s, \quad x_2 = s, \quad x_3 = 2, \quad x_4 = 3
\]

is a solution. There are infinitely many solutions depending on a parameter $s$. We could also write this as

\[
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}
=
\begin{bmatrix} 1 \\ 0 \\ 2 \\ 3 \end{bmatrix}
+ s
\begin{bmatrix} -3 \\ 1 \\ 0 \\ 0 \end{bmatrix}
\]

and recognize the solutions as a line in four dimensional space passing through $[1, 0, 2, 3]$ in the direction $[-3, 1, 0, 0]$.

There is one situation left to consider, namely when there are no solutions at all. Geometrically, this happens, for example, when we are trying to find the intersection of two parallel planes. Let us look at an example.

Example 3.5 Perform Gaussian Elimination on the following system

\[
\left[\begin{array}{cc|c}
1 & 3 & 1 \\
1 & 4 & 2 \\
-1 & -3 & 0 \\
2 & 6 & 4
\end{array}\right]
\]

We begin in the usual way.

\[
\left[\begin{array}{cc|c}
1 & 3 & 1 \\
0 & 1 & 1 \\
0 & 0 & 1 \\
0 & 0 & 2
\end{array}\right]
\qquad
\begin{array}{l}
(2,:) = (2,:) - (1,:) \\
(3,:) = (3,:) + (1,:) \\
(4,:) = (4,:) - 2(1,:)
\end{array}
\]
[Figure 3.1: After Gaussian Elimination the augmented matrix will be in row echelon form (left). With further work, the augmented matrix can be put in reduced row echelon form (right).]

There is nothing left to do in the second column, so we shift our attention to the third column and subtract twice the third row from the fourth row.

\[
\left[\begin{array}{cc|c}
1 & 3 & 1 \\
0 & 1 & 1 \\
0 & 0 & 1 \\
0 & 0 & 0
\end{array}\right]
\qquad (4,:) = (4,:) - 2(3,:)
\]

Now we are done. If we write down the equation corresponding to the third row, we get $0x_1 + 0x_2 = 1$, or $0 = 1$. Clearly there is no choice of $x_1$ or $x_2$ that makes this true. So this is a system of equations with no solutions.

Let us summarize what we have done in this section. Every system of equations can be brought into upper triangular form using a sequence of elementary row transformations. The resulting upper triangular matrix will look something like what is shown in Figure 3.1 (left). In this diagram, all the entries below the staircase line are zero. The boxes represent non-zero entries. The stars represent arbitrary entries, that may or may not be zero. Each circled star corresponds to a parameter that must be introduced.

If we want to put this example in completely reduced form, we use elementary row operations to zero out the entries lying above the boxes too. Then we multiply each row by a number so that the corner entries (in the boxes) become 1. The completely reduced form for the example above would look like the diagram in Figure 3.1 (right). The official name of this form is the reduced row echelon form.

If the bottom of the matrix has a row that is all zeroes, except for the augmented entry, then the system of equations has no solutions. This is because the bottom row stands for an equation of the form $0 = c$ with $c \neq 0$. A typical example is shown in Figure 3.2 (left).
If all the steps on the staircase in the non-augmented part of the matrix have size one, then there are no parameters to introduce, and the solution is unique. Notice that in this case the number of non-zero rows equals the number of variables. A typical example is shown in Figure 3.2 (right).
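The summary above can be sketched as a short MATLAB script. This is only an illustrative sketch, not the routine MATLAB itself uses: the variable names (r, c, p) are made up for this illustration, and the exact comparison with zero is safe here only because the entries of Example 3.4 are small integers. In floating point work one would compare against a tolerance instead.

```matlab
% Forward elimination to row echelon form (illustrative sketch).
% A is the augmented matrix of Example 3.4.
A = [ 1  3  2 -2 -1;
      1  3  4 -2  3;
     -2 -6 -4  5  5;
     -1 -3  2  1  6];
[m, n] = size(A);
r = 1;                          % row where the next corner entry should go
for c = 1:n-1                   % the last column is the augmented part
    if r > m, break; end
    % first non-zero entry in column c, at or below row r (empty if none)
    p = find(A(r:m, c) ~= 0, 1) + r - 1;
    if isempty(p)
        continue;               % nothing to swap in: move one column right
    end
    A([r p], :) = A([p r], :);  % interchange rows r and p (no-op if p == r)
    for i = r+1:m               % zero out the entries below the corner
        A(i, :) = A(i, :) - (A(i, c) / A(r, c)) * A(r, :);
    end
    r = r + 1;
end
disp(A)
fprintf('%d non-zero rows, %d unknowns\n', r - 1, n - 1);
```

For Example 3.4 the script skips the second column, exactly as in the hand computation, and ends with three non-zero rows for four unknowns, which is why one parameter had to be introduced there.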
[Figure 3.2: Augmented matrices after Gaussian Elimination with no solutions (left) and with a single solution (right).]

Finally, we introduce some terminology. The rank of a matrix is the number of non-zero rows in the matrix obtained after reducing it to the upper triangular form described above. In other words the rank is the number of boxes in the diagrams above. We can now rephrase the different possibilities in terms of rank. If the rank of the augmented matrix is greater than the rank of the unaugmented matrix (i.e., the matrix without the last column) then there are no solutions. If the two ranks agree and are equal to the number of unknowns then the solution is unique. If the rank $r$ of the augmented matrix is equal to the rank of the unaugmented matrix, but less than the number $n$ of unknowns, then there are $n - r$ parameters in the solution.

3.2.1 Using MATLAB for row reductions

MATLAB has a built in command called rref that reduces a matrix to reduced row echelon form. Let us try it on Example 3.2 from the previous section. First we define the initial matrix A. Remember that the last column of this matrix is the augmented part.

    A = [1 2 -2 -7 -29; 1 2 -1 -5 -18; 0 3 0 -3 -6; -1 4 1 1 14]

To find the reduced row echelon form, simply type

    >> rref(A)
    ans =
         1     0     0     0     1
         0     1     0     0     2
         0     0     1     0     3
         0     0     0     1     4

Notice that this MATLAB command did the work of Examples 3.2 and 3.3. The solution to the system can be read off the result of the rref command above.

It is important to realize that floating point rounding errors as discussed in Section 2.4.6 can lead to errors in solutions to linear systems computed by MATLAB and other computational tools. At worst, these errors will lead to
MATLAB finding “solutions” to problems that do not have exact solutions. In these cases, solutions will often have very large values and often MATLAB will give a warning about the problem being “ill-conditioned”.

3.2.2 Problems

Problem 3.6 Show that the lower triangular system of equations represented by

\[
\left[\begin{array}{ccc|c}
1 & 0 & 0 & 3 \\
1 & -1 & 0 & 3 \\
2 & 1 & -8 & -4
\end{array}\right]
\]

is also easily solved, by easily solving it! It’s just a matter of convention whether we aim for upper triangular or lower triangular systems in the elimination procedure.

Problem 3.7 The following equations have already been put in upper triangular form. In each case there are infinitely many solutions, depending on one or more parameters. Write down the general expression for the solution in terms of parameters.

\[
\left[\begin{array}{cccc|c}
1 & 2 & 1 & 2 & 1 \\
0 & 0 & 1 & 1 & 4 \\
0 & 0 & 0 & 1 & 2 \\
0 & 0 & 0 & 0 & 0
\end{array}\right]
\qquad
\left[\begin{array}{cccc|c}
1 & 2 & 1 & 2 & 1 \\
0 & 0 & 1 & 1 & 4 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0
\end{array}\right]
\qquad
\left[\begin{array}{cccc|c}
1 & 2 & 1 & 2 & 1 \\
0 & 0 & 1 & 1 & 4
\end{array}\right]
\]

Problem 3.8 Consider the system of equations represented by the augmented matrix

\[
\left[\begin{array}{cccc|c}
1 & 2 & 2 & 2 & 1 \\
1 & 3 & -1 & 3 & -1 \\
1 & 0 & 1 & 1 & 5 \\
0 & 3 & -2 & 2 & -6
\end{array}\right]
\]

Show that this set of equations has infinitely many solutions and find a general parametric representation of the solutions.

Problem 3.9 Consider the system of equations represented by the augmented matrix

\[
\left[\begin{array}{cccc|c}
1 & 2 & 2 & -7 & 20 \\
3 & 6 & -3 & -5 & -15 \\
0 & 6 & 0 & -6 & -10 \\
2 & -8 & -2 & -2 & 30
\end{array}\right]
\]
Put this matrix in upper triangular form and find the solution of the linear system of equations. You can leave your answers in fractions but show the operations you perform in full detail.

Problem 3.10 Solve the following system of equations.

\[
\begin{aligned}
x_1 - 2x_2 + 3x_3 &= 2 \\
2x_1 - 3x_2 + 2x_3 &= 2 \\
3x_1 + 2x_2 - 4x_3 &= 9
\end{aligned}
\]

Problem 3.11 Solve the following system of equations.

\[
\begin{aligned}
2x_1 + x_2 - x_3 &= 6 \\
x_1 - 2x_2 - 2x_3 &= 1 \\
-x_1 + 12x_2 + 8x_3 &= 7
\end{aligned}
\]

Problem 3.12 Solve the following system of equations.

\[
\begin{aligned}
x_1 + 2x_2 + 4x_3 &= 1 \\
x_1 + x_2 + 3x_3 &= 2 \\
2x_1 + 5x_2 + 9x_3 &= 1
\end{aligned}
\]

Problem 3.13 Solve the following system of equations.

\[
\begin{aligned}
x_1 + 2x_2 + 4x_3 &= 1 \\
x_1 + x_2 + 3x_3 &= 2 \\
2x_1 + 5x_2 + 9x_3 &= 3
\end{aligned}
\]

Problem 3.14 Solve the following system of equations.

\[
\begin{aligned}
3x_1 + x_2 - x_3 + 2x_4 &= 7 \\
2x_1 - 2x_2 + 5x_3 - 7x_4 &= 1 \\
-4x_1 - 4x_2 + 7x_3 - 11x_4 &= -13
\end{aligned}
\]

Problem 3.15 For what values of $a$, $b$, $c$, $d$, $\alpha$ and $\beta$ does the system of equations

\[
\begin{aligned}
ax_1 + bx_2 &= \alpha \\
cx_1 + dx_2 &= \beta
\end{aligned}
\]

have a unique solution?

Problem 3.16 Consider the system of equations represented by the augmented matrix

\[
\left[\begin{array}{ccc|c}
1 & 2 & 0 & 7 \\
4 & 8 & 6 & 10 \\
-4 & -8 & 10 & 81
\end{array}\right]
\]

How many solutions does this linear system of equations have?

Problem 3.17 (Matlab) Consider the following system of equations

\[
\begin{aligned}
x_1 + 2x_2 + 4x_3 &= 7 \\
4x_1 + x_2 + 3x_3 &= 2 \\
5x_2 + 9x_3 &= a
\end{aligned}
\]

Write a script in Matlab that generates the augmented matrix and solves the system with the rref command for various values of $a$. Specifically, how does the solution for $x_1$ vary when $a$ ranges from 1 to 10 (equally spaced)?
3.3 Homogeneous Equations

If the coefficients on the right sides of a system of equations are all zero, the system is said to be homogeneous. In other words, a homogeneous system is a system of equations of the form

\[
\begin{aligned}
b_{1,1}x_1 + b_{1,2}x_2 + \cdots + b_{1,n}x_n &= 0 \\
b_{2,1}x_1 + b_{2,2}x_2 + \cdots + b_{2,n}x_n &= 0 \\
&\;\;\vdots \\
b_{m,1}x_1 + b_{m,2}x_2 + \cdots + b_{m,n}x_n &= 0
\end{aligned}
\]

Given a system of equations, the associated homogeneous system is the homogeneous system of equations you get by setting all the right sides to zero.

Geometrically, homogeneous systems describe points, lines and planes that pass through the origin. In fact $\mathbf{x} = \mathbf{0}$, i.e., $x_1 = 0, x_2 = 0, \ldots, x_n = 0$ is always a solution to a homogeneous system of equations.

When are there other (nonzero) solutions to the above homogeneous system? We have $n$ unknowns and $m$ equations. When we perform the Gaussian reduction, the right-hand sides of the equations will stay zero so the augmented matrix will generally have the form

\[
\left[\begin{array}{cccccccc|c}
1 & * & * & * & \cdots & \cdots & \cdots & * & 0 \\
0 & 1 & * & * & \cdots & \cdots & \cdots & * & 0 \\
0 & 0 & 0 & 1 & * & \cdots & \cdots & * & 0 \\
\cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots \\
0 & 0 & 0 & \cdots & 0 & 1 & * & * & 0 \\
0 & 0 & 0 & \cdots & \cdots & \cdots & 1 & * & 0 \\
0 & 0 & 0 & \cdots & 0 & 0 & 0 & 1 & 0 \\
\cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots & \cdots \\
0 & 0 & 0 & \cdots & 0 & 0 & 0 & 0 & 0
\end{array}\right]
\]

The last several rows may be identically zero.

In the last section we saw that there are solutions depending on parameters if the number of variables is greater than the rank of the matrix. Thus, if $n$ (the number of unknowns) is bigger than the number of non-zero rows in the above row-reduced matrix, then there exists a non-zero solution. Otherwise only the trivial solution $x_1 = 0, x_2 = 0, \ldots, x_n = 0$ is present. We illustrate the idea with some examples below.

Example 3.6 Consider a homogeneous system

\[
\begin{aligned}
3x_1 + 6x_2 + x_3 &= 0 \\
6x_1 + 2x_2 + 2x_3 &= 0 \\
x_1 + x_2 + 3x_3 &= 0
\end{aligned}
\]

The augmented matrix can be reduced by row operations to the form (check!)

\[
\left[\begin{array}{ccc|c}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0
\end{array}\right],
\]
which implies $x_1 = x_2 = x_3 = 0$. And, in agreement with our above statement, the number of variables (3) is not more than the number of non-zero rows (also 3).

Example 3.7 Consider another homogeneous system:

\[
\begin{aligned}
-x_1 + 2x_2 + 4x_3 &= 0 \\
2x_1 - 4x_2 - 8x_3 &= 0 \\
x_1 - x_2 + 3x_3 &= 0
\end{aligned}
\]

Its augmented matrix is

\[
\left[\begin{array}{ccc|c}
-1 & 2 & 4 & 0 \\
2 & -4 & -8 & 0 \\
1 & -1 & 3 & 0
\end{array}\right]
\rightarrow
\left[\begin{array}{ccc|c}
-1 & 2 & 4 & 0 \\
0 & 0 & 0 & 0 \\
0 & 1 & 7 & 0
\end{array}\right]
\rightarrow
\left[\begin{array}{ccc|c}
1 & 0 & 10 & 0 \\
0 & 1 & 7 & 0 \\
0 & 0 & 0 & 0
\end{array}\right],
\]

and the number of nonzero rows is 2, which is less than the number of unknowns, 3. Hence by the above statement there must be a nonzero solution. We find $x_1 = -10x_3$, $x_2 = -7x_3$, with no requirement on $x_3$. Hence $x_3$ is any number $t$, and we obtain infinitely many nonzero solutions

\[
x_1 = -10t, \quad x_2 = -7t, \quad x_3 = t, \quad t \in (-\infty, \infty),
\]

one for each value of $t$.

In a similar manner, if for some homogeneous system with 4 variables the augmented matrix has only 2 nonzero rows, then the general solution has $4 - 2 = 2$ free (undefined) variables on which the other two depend.

3.3.1 Properties of solutions of homogeneous systems.

1. A homogeneous system has either one zero-solution ($x_1 = \cdots = x_n = 0$) or infinitely many solutions that depend on parameters.
2. If $(x_1, \ldots, x_n)$ and $(y_1, \ldots, y_n)$ are solutions to a given homogeneous system, $(x_1 + y_1, \ldots, x_n + y_n)$ is also a solution. (Solutions are additive.)
3. If $(x_1, \ldots, x_n)$ is a solution to a given homogeneous system, $(ax_1, \ldots, ax_n)$ is also a solution, for any number $a$. (Solutions are scalable.)

The first statement follows from our previous discussion; the other two are easy to verify, using the initial homogeneous system.

3.3.2 Connection of solutions to homogeneous and inhomogeneous systems.

The importance of homogeneous equations comes from the following fact. If $\mathbf{x} = [x_1, x_2, \ldots, x_n]$ and $\mathbf{y} = [y_1, y_2, \ldots, y_n]$ are two solutions to a (not necessarily
homogeneous) system of equations,

\[
\begin{aligned}
b_{1,1}x_1 + b_{1,2}x_2 + \cdots + b_{1,n}x_n &= c_1 \\
b_{2,1}x_1 + b_{2,2}x_2 + \cdots + b_{2,n}x_n &= c_2 \\
&\;\;\vdots \\
b_{m,1}x_1 + b_{m,2}x_2 + \cdots + b_{m,n}x_n &= c_m
\end{aligned}
\]

then the difference $\mathbf{x} - \mathbf{y} = [x_1 - y_1, x_2 - y_2, \ldots, x_n - y_n]$ solves the associated homogeneous system. This is a simple calculation

\[
\begin{aligned}
b_{1,1}(x_1 - y_1) + b_{1,2}(x_2 - y_2) + \cdots + b_{1,n}(x_n - y_n) &= c_1 - c_1 = 0 \\
b_{2,1}(x_1 - y_1) + b_{2,2}(x_2 - y_2) + \cdots + b_{2,n}(x_n - y_n) &= c_2 - c_2 = 0 \\
&\;\;\vdots \\
b_{m,1}(x_1 - y_1) + b_{m,2}(x_2 - y_2) + \cdots + b_{m,n}(x_n - y_n) &= c_m - c_m = 0
\end{aligned}
\]

To see the implications of this let us suppose that $\mathbf{x} = \mathbf{q}$ is any particular solution to a (non-homogeneous) system of equations. Then if $\mathbf{y}$ is any other solution, $\mathbf{y} - \mathbf{q} = \mathbf{z}$ is a solution of the corresponding homogeneous system. So $\mathbf{y} = \mathbf{q} + \mathbf{z}$. In other words any solution can be written as $\mathbf{q}$ plus some solution of the corresponding homogeneous system. Going the other way, if $\mathbf{z}$ is any solution of the corresponding homogeneous system, then $\mathbf{q} + \mathbf{z}$ solves the original system. This can be seen by plugging $\mathbf{q} + \mathbf{z}$ into the equation. So the structure of the set of solutions is

\[
\mathbf{x} = \mathbf{q} + (\text{solution to homogeneous system})
\]

As you run through all solutions to the homogeneous system on the right, $\mathbf{x}$ runs through all solutions of the original system. Notice that it doesn’t matter which $\mathbf{q}$ you choose as the starting point. This is completely analogous to the parametric form for a line, where the base point can be any point on the line.

If we have applied the process of Gaussian elimination to the original system, and concluded that the general solution has $k$ parameters, we will end up with a general solution of the form

\[
\mathbf{q} + s_1\mathbf{a}_1 + \cdots + s_k\mathbf{a}_k.
\]

Notice that $\mathbf{q}$ is a particular solution (corresponding to all parameters equal to zero) and $s_1\mathbf{a}_1 + \cdots + s_k\mathbf{a}_k$ is the general solution to the corresponding homogeneous system.
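As a quick numerical check of this structure, the following sketch revisits Example 3.4, where the general solution was found to be $[1, 0, 2, 3] + s[-3, 1, 0, 0]$. The variable names (A, c, q, a) and the sample values of $s$ are choices made for this illustration only.

```matlab
% Check x = q + s*a on Example 3.4 (illustrative sketch).
A = [ 1  3  2 -2;
      1  3  4 -2;
     -2 -6 -4  5;
     -1 -3  2  1];      % unaugmented coefficient matrix
c = [-1; 3; 5; 6];      % right sides
q = [ 1; 0; 2; 3];      % particular solution (parameter s = 0)
a = [-3; 1; 0; 0];      % direction found in Example 3.4

disp((A*q - c)')        % q solves the original system
disp((A*a)')            % a solves the associated homogeneous system
for s = [-2 0.5 7]      % a few arbitrary parameter values
    x = q + s*a;
    fprintf('s = %4.1f, largest residual = %g\n', s, max(abs(A*x - c)));
end
```

Both displayed vectors should be zero, and so should every residual, whichever values of $s$ are tried.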
These considerations have practical importance if you have to solve a bunch of systems, all with the same coefficients on the left side, but with different coefficients on the right. In this situation, you could first find the general solution to the corresponding homogeneous system of equations. Then to find the general solution to one of the systems, you would only need to find a single particular solution, and then add the general solution to the homogeneous system to obtain all solutions. The only trouble with this is that it might not really be any easier to find a single particular solution than it is to find all solutions.
Example 3.8 Find the general solution of the system of equations

\[
\left[\begin{array}{cccc|c}
1 & 1 & 0 & 0 & 1 \\
-1 & -1 & 1 & 2 & 1 \\
3 & 3 & -1 & -2 & 1
\end{array}\right]
\]

in the form $\mathbf{x} = \mathbf{q} + s_1\mathbf{a}_1 + s_2\mathbf{a}_2$. Verify that $\mathbf{a}_1$ and $\mathbf{a}_2$ solve the corresponding homogeneous equation.

The matrix

\[
\left[\begin{array}{cccc|c}
1 & 1 & 0 & 0 & 1 \\
-1 & -1 & 1 & 2 & 1 \\
3 & 3 & -1 & -2 & 1
\end{array}\right]
\]

reduces to

\[
\left[\begin{array}{cccc|c}
1 & 1 & 0 & 0 & 1 \\
0 & 0 & 1 & 2 & 2 \\
0 & 0 & 0 & 0 & 0
\end{array}\right]
\]

so the solutions are $x_1 = 1 - s_1$, $x_2 = s_1$, $x_3 = 2 - 2s_2$, $x_4 = s_2$. This can be written

\[
\mathbf{x} =
\begin{bmatrix} 1 \\ 0 \\ 2 \\ 0 \end{bmatrix}
+ s_1 \begin{bmatrix} -1 \\ 1 \\ 0 \\ 0 \end{bmatrix}
+ s_2 \begin{bmatrix} 0 \\ 0 \\ -2 \\ 1 \end{bmatrix}
\]

It’s easy to check that $\mathbf{a}_1 = [-1, 1, 0, 0]$ and $\mathbf{a}_2 = [0, 0, -2, 1]$ solve the corresponding homogeneous system.

3.3.3 Problems

Problem 3.18 Find the general solution of the system of equations

\[
\left[\begin{array}{cccc|c}
1 & 0 & 1 & 0 & 10 \\
-1 & 1 & 1 & 1 & 4 \\
0 & 1 & 2 & 1 & 14
\end{array}\right]
\]

in the form $\mathbf{x} = \mathbf{q} + s_1\mathbf{a}_1 + s_2\mathbf{a}_2$, where $\mathbf{a}_1$ and $\mathbf{a}_2$ solve the corresponding homogeneous system of equations.

Problem 3.19 Consider the system of equations

\[
\left[\begin{array}{cccc|c}
1 & 1 & 0 & 0 & 4 \\
-1 & -1 & 1 & 2 & -1 \\
3 & 3 & -1 & -2 & 9
\end{array}\right]
\]

Verify that $[4, 0, 3, 0]$ is a solution and write down the general solution.

Problem 3.20 Consider the system of equations given by the augmented matrix

\[
\left[\begin{array}{ccc|c}
1 & -3 & 4 & 6 \\
1 & 9 & -10 & 10 \\
0 & 6 & -7 & 2
\end{array}\right]
\]

Put this matrix in reduced row echelon form and comment on the number of solutions of this system of equations.
3.4 Geometric Applications

Now we will apply Gaussian elimination to some of the geometry problems we studied in the first part of this course.

Let us start with the question of linear independence. Recall that a collection of vectors $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ is called linearly dependent if we can find coefficients $c_1, c_2, \ldots, c_n$, not all zero, such that

\[
c_1\mathbf{x}_1 + c_2\mathbf{x}_2 + \cdots + c_n\mathbf{x}_n = \mathbf{0}
\]

This is actually a homogeneous system of linear equations for the numbers $c_1, \ldots, c_n$. If $c_1 = c_2 = \cdots = c_n = 0$ is the only solution, then the vectors are linearly independent. Otherwise, they are linearly dependent. To decide, we must set up the matrix for the system of equations and perform a row reduction to decide if there is a unique solution or not. In setting up the equations, it is convenient to treat the $\mathbf{x}_i$’s as column vectors.

Example 3.9 Decide if

\[
\mathbf{x}_1 = \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix}, \quad
\mathbf{x}_2 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \quad
\mathbf{x}_3 = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}
\]

are linearly independent. The equation $c_1\mathbf{x}_1 + c_2\mathbf{x}_2 + c_3\mathbf{x}_3 = \mathbf{0}$ can be written

\[
\begin{aligned}
c_1 + c_2 + c_3 &= 0 \\
2c_1 + c_2 + 2c_3 &= 0 \\
0c_1 + c_2 + c_3 &= 0
\end{aligned}
\]

The matrix for this system of equations is

\[
\begin{bmatrix}
1 & 1 & 1 \\
2 & 1 & 2 \\
0 & 1 & 1
\end{bmatrix}
\]

Since this is a homogeneous system, we don’t have to write the augmented part of the matrix. Performing a row reduction yields

\[
\begin{bmatrix}
1 & 1 & 1 \\
0 & 1 & 0 \\
0 & 0 & 1
\end{bmatrix}
\]

Since the number of non-zero rows is the same as the number of variables (three) there are no non-zero solutions. Therefore the vectors are linearly independent.

The row reduction in Example 3.9 also shows that any vector $\mathbf{y}$ in $\mathbb{R}^3$ can be written as a linear combination of $\mathbf{x}_1$, $\mathbf{x}_2$ and $\mathbf{x}_3$. Writing $\mathbf{y}$ as a linear combination of $\mathbf{x}_1$, $\mathbf{x}_2$ and $\mathbf{x}_3$ means finding coefficients $c_1$, $c_2$ and $c_3$ such that
$c_1\mathbf{x}_1 + c_2\mathbf{x}_2 + c_3\mathbf{x}_3 = \mathbf{y}$. This is a (non-homogeneous) system of linear equations with augmented matrix

\[
\left[\begin{array}{ccc|c}
1 & 1 & 1 & y_1 \\
2 & 1 & 2 & y_2 \\
0 & 1 & 1 & y_3
\end{array}\right]
\]

Using the same Gaussian elimination steps as above, this matrix reduces to

\[
\left[\begin{array}{ccc|c}
1 & 1 & 1 & * \\
0 & 1 & 0 & * \\
0 & 0 & 1 & *
\end{array}\right]
\]

where the $*$’s are some numbers. This system has a (unique) solution.

Example 3.10 Here is another geometric example. Do the planes whose equations are given by $x_1 + x_2 + x_3 = 1$, $2x_1 + x_2 + 2x_3 = 1$ and $x_2 = 1$ intersect in a single point? To answer this, we note that the intersection of the three planes is given by the set of points that satisfy all three equations. In other words they satisfy the system of equations whose augmented matrix is

\[
\left[\begin{array}{ccc|c}
1 & 1 & 1 & 1 \\
2 & 1 & 2 & 1 \\
0 & 1 & 0 & 1
\end{array}\right]
\]

A row reduction yields

\[
\left[\begin{array}{ccc|c}
1 & 1 & 1 & 1 \\
0 & -1 & 0 & -1 \\
0 & 0 & 0 & 0
\end{array}\right]
\]

Thus solutions are given by

\[
\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}
+ s \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}
\]

This is the parametric equation of a line. Thus the three planes intersect in a line, not a point.

3.4.1 Problems

Problem 3.21 Are the following vectors linearly dependent or independent?

\[
\mathbf{x}_1 = \begin{bmatrix} 1 \\ 2 \\ 0 \\ 2 \end{bmatrix}, \quad
\mathbf{x}_2 = \begin{bmatrix} 1 \\ 1 \\ -1 \\ 1 \end{bmatrix}, \quad
\mathbf{x}_3 = \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix}
\]

Can every vector in $\mathbb{R}^4$ be written as a linear combination of these vectors? How about the vector

\[
\mathbf{y}_1 = \begin{bmatrix} 2 \\ 4 \\ -3 \\ 4 \end{bmatrix}?
\]
Problem 3.22 Consider the following vectors $\mathbf{a}_1$, $\mathbf{a}_2$ and $\mathbf{a}_3$ such that:

\[
\mathbf{a}_1 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \quad
\mathbf{a}_2 = \begin{bmatrix} 3 \\ 1 \end{bmatrix}, \quad
\mathbf{a}_3 = \begin{bmatrix} -3 \\ 4 \end{bmatrix}
\]

Are they linearly independent? Can the vector $\mathbf{y} = \begin{bmatrix} -15 \\ 5 \end{bmatrix}$ be written as a linear combination of $\mathbf{a}_1$ and $\mathbf{a}_2$?

Problem 3.23 Consider the following 4 dimensional vectors $\mathbf{a}_1$, $\mathbf{a}_2$, and $\mathbf{a}_3$ such that

\[
\mathbf{a}_1 = \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \quad
\mathbf{a}_2 = \begin{bmatrix} 0 \\ 0 \\ 4 \\ -3 \end{bmatrix}, \quad
\mathbf{a}_3 = \begin{bmatrix} 10 \\ 0 \\ -5 \\ 0 \end{bmatrix}
\]

Are these linearly independent? Can the vector $\mathbf{y}$ below be written as a linear combination of the above three vectors?

Problem 3.24 (Matlab) Consider the following 3 vectors $\mathbf{a}_1$, $\mathbf{a}_2$, and $\mathbf{a}_3$:

\[
\mathbf{a}_1 = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \quad
\mathbf{a}_2 = \begin{bmatrix} -1 \\ 2 \\ -1 \end{bmatrix}, \quad
\mathbf{a}_3 = \begin{bmatrix} 4 \\ 1 \\ -1 \end{bmatrix}
\]

The Matlab command plot3([a,0], [b,0], [c,0]) draws a line between $(0, 0, 0)$ and a point with coordinates $(a, b, c)$. Write a script that draws the three vectors (use the command hold on to be able to overlap them). Are the vectors linearly independent?

3.5 Resistor Networks

3.5.1 Elements of Basic Circuits

Electrical current, often denoted with the variable $I$, is a measure of charge flow with MKS units of Amperes or Amps (Coulombs per second, where a Coulomb is a measure of electrical charge). The voltage $V$ is an electrical potential measured in Volts (Joules per Coulomb). Thus $IV$ has the units of power (J/s or Watts).

A resistor is a simple electrical component that obeys Ohm’s law, that the current through it is proportional to the voltage drop across it. The constant of proportionality is called the resistance $R$ in Ohms (abbreviated $\Omega$, V/A). The current $I$ through a resistor across which there is a voltage drop $V$ satisfies

\[
V = IR.
\]

The current goes from high to low potential through the resistor.

The resistor networks considered in these notes are circuits with three types of components:
1. Resistors
2. Voltage sources
3. Current sources

Figure 3.3: Elements in resistor networks.

The notation we will use for these elements in circuit schematics is shown in Figure 3.3. Later (in Chapter 5) we will see that such a network represents a network with additional elements (inductors and capacitors) at a given instant in time. At a given time, an inductor acts as a current source and a capacitor acts as a voltage source. The current through the capacitor at that instant determines the rate of change of voltage across it, and the voltage across the inductor determines the rate of change of current through it.

In the network, the resistances of all resistors will be given. The voltage across all voltage sources and the currents through all current sources will be given. There are two basic questions to answer about a resistor network with these components.

basic problem: Find the currents through each resistor and each power source and also the voltage drops across each current source. This problem is a linear system of equations and so serves as an example of the mathematical techniques learnt in this chapter.

fundamental problem: An important sub-problem is to find the currents through every power source and the voltage drops across each current source. These quantities can be written in terms of the given voltages and current sources directly (eliminating the terms involving resistor currents). Later, solving this problem will tell us how to write a differential equation for the currents through inductors and the voltages across capacitors.

There are two fundamental laws governing the behaviour of circuits that can be used to set up equations that can be solved to answer the questions above. They are Kirchhoff's laws:

1. The sum of voltage drops around any closed loop in the network must be zero.
2. The sum of currents entering a node must be zero.

3.5.2 Two Simple Examples Made Complicated

Consider the simple network with one current source and one resistor shown in Figure 3.4 (left). Clearly, the current through the resistor must be $I$ and the voltage drop across the resistor must be $IR$ using Ohm's law (use the signs in the diagram for the direction of the drop).

Example 3.11 Consider the same example, introducing two nodes into the network as shown in Figure 3.4 (right). We now have three unknowns, $V_1$, $V_2$ and $I_2$. Find a linear system for these unknowns and then solve the system. We have made this simple example more complicated but we will learn something as we work through it. Note that by specifying voltages at nodes (which will determine voltage drops across components) we will always satisfy Kirchhoff's first law. Let us write down every other law that applies to this diagram:
\[
\begin{array}{ll}
I - I_2 = 0, & \text{Kirchhoff's second law at node 2} \\
I_2 - I = 0, & \text{Kirchhoff's second law at node 1} \\
V_2 - V_1 = I_2 R, & \text{Ohm's law over the resistor}
\end{array}
\]
If you didn't look too closely, you might be happy thinking these are three equations for the three unknowns $I_2$, $V_1$ and $V_2$. However, rewriting gives
\[
\begin{array}{rcl}
I_2 & = & I \\
I_2 & = & I \\
V_2 - V_1 - I_2 R & = & 0
\end{array}
\]
In augmented matrix form we can write the system and do Gaussian elimination:
\[
\left[\begin{array}{ccc|c} 1 & 0 & 0 & I \\ 1 & 0 & 0 & I \\ -R & -1 & 1 & 0 \end{array}\right]
\sim
\left[\begin{array}{ccc|c} 1 & 0 & 0 & I \\ 0 & -1 & 1 & RI \\ 0 & 0 & 0 & 0 \end{array}\right]
\]
In the augmented matrix above, the unknowns are ordered $I_2$, $V_1$ and then $V_2$. The solutions are $I_2 = I$ (expected), $V_2 = s$ and $V_1 = s - RI$ where $s$ is a parameter that can take any value. This seems much more complicated than the intuitive solution at the beginning of this section. However, the conclusions are the same: the current through the resistor is $I$ and the voltage drop across the resistor is $V_2 - V_1 = s - (s - RI) = RI$.
The arbitrary constant in the voltage occurs here because no reference voltage has been specified (no point in the circuit has been grounded).
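The elimination in Example 3.11 can be reproduced numerically. The following is only a sketch: the values I = 2 and R = 3 are assumed for illustration (the example keeps both symbolic), and the variable names Ared and r are hypothetical. It uses the built-in rref and rank functions.

```matlab
% Example 3.11 with assumed sample values (not taken from the notes).
I = 2;   % source current in amperes (assumed)
R = 3;   % resistance in ohms (assumed)

% Unknowns ordered I2, V1, V2; the last column is the right-hand side.
A = [ 1   0  0  I;
      1   0  0  I;
     -R  -1  1  0];

Ared = rref(A);        % fully reduced form of the augmented matrix
disp(Ared)

% Rank of the coefficient part: 2 pivots for 3 unknowns,
% so one unknown (V2 = s) stays free.
r = rank(A(:, 1:3));
```

The second row of the reduced matrix relates V1 and V2 only through their difference, which is the numerical counterpart of the free parameter s in the example.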
Figure 3.4: The simple resistor network considered in Example 3.11.

Example 3.12 Consider the circuit in Figure 3.5. Learning from the last example, we have set a reference voltage at the lower left corner of the circuit, and then the voltage at the upper left corner is known. The current $I_1$ is the branch current from the $V_2$ node to the $V_1$ node. Form a linear system matching currents and voltages across resistors to Ohm's law, and matching branch currents at the two nodes. Solve the linear system.

There are 5 unknowns $V_1$, $V_2$, $I_1$, $I_2$ and $I_3$ in the circuit as shown in the figure (the augmented matrices below will be written with the unknowns in this order). Ohm's law on the four resistors gives the following linear equations (in order of small to large resistance)
\[
\begin{array}{rcl}
12 - V_1 & = & I_1 \\
V_1 - V_2 & = & 2 I_2 \\
V_1 - V_2 & = & 3 I_3 \\
V_2 & = & 4 I_1
\end{array}
\]
and matching the currents at the two nodes gives
\[
\begin{array}{rcl}
I_1 & = & I_2 + I_3 \\
I_2 + I_3 & = & I_1
\end{array}
\]
This gives six equations in five unknowns! (Maybe you already see why this happened to us.) Writing the six equations above in an augmented matrix gives
\[
\left[\begin{array}{ccccc|c}
-1 & 0 & -1 & 0 & 0 & -12 \\
1 & -1 & 0 & -2 & 0 & 0 \\
1 & -1 & 0 & 0 & -3 & 0 \\
0 & 1 & -4 & 0 & 0 & 0 \\
0 & 0 & 1 & -1 & -1 & 0 \\
0 & 0 & -1 & 1 & 1 & 0
\end{array}\right]
\sim
\left[\begin{array}{ccccc|c}
1 & 0 & 0 & 0 & 0 & 2184/217 \\
0 & 1 & 0 & 0 & 0 & 1680/217 \\
0 & 0 & 1 & 0 & 0 & 420/217 \\
0 & 0 & 0 & 1 & 0 & 252/217 \\
0 & 0 & 0 & 0 & 1 & 168/217 \\
0 & 0 & 0 & 0 & 0 & 0
\end{array}\right]
\]
On the right above is the result of Gaussian elimination to reduced row echelon form. The solutions for $V_1$, $V_2$, $I_1$, $I_2$ and $I_3$ can be read off the last column of the augmented matrix after reduction ($V_1 = 2184/217$ etc.). Notice that the "extra" equation became the bottom row (all zeros) in the reduced form, which
indicates that there was redundant information given in the linear system. If you go back to the expressions for conservation of current at the two nodes above, it is easy to see that these two equations carry the same information.

Figure 3.5: The resistor network considered in Example 3.12.

As the two previous examples show, one has to be a bit careful picking the unknowns and equations in a circuit to get a unique solution without introducing redundant equations. There are several ways to do this and when solving small circuits by hand the linear system can be made much simpler to solve if you pick the "right" technique. In the next section we will describe the "loop current" technique which always leads to a solvable linear system with no extra (redundant) equations.

3.5.3 Loop Currents

We want to be able to see any resistor network and write down equations that will solve it uniquely, with no redundant equations like in the previous example and no non-uniqueness (like that coming from the lack of a reference potential). This can always be done using the following variables: loop currents, that is currents in every elementary loop of the network, and voltage drops across any current sources. This technique is described in more detail below.

Consider a circuit that can be drawn on a piece of paper with no branches overlapping (a so-called planar network). The branches in the circuit divide the diagram into smaller areas. The set of branches around each of these small areas is called an elementary loop. By assigning a loop current to each elementary closed loop of the circuit, the second of Kirchhoff's Circuit Laws is satisfied automatically because in a closed loop the current entering any one point is equal to the current travelling away from that point. Consider Figure 3.7 (left).
There are three elementary loops and loop currents $i_1$, $i_2$ and $i_3$ associated with each of them. Loop currents sum when they overlap in a branch. For example, the current downwards through the 3 Ω resistor in Figure 3.7 is $i_2 - i_3$. Be careful of signs as you sum loop currents. In the example above, $i_2$ is downwards through the 3 Ω resistor but $i_3$ is upwards, hence it appears with a negative sign in our expression.

In a circuit, it is convenient to take loop currents and voltage drops across current sources as the unknowns. The first step in solving any electric network is to identify the number of elementary loops in the network. If there are $m$ independent loops present, then variables $i_1, i_2, \ldots, i_m$ must be introduced to represent the loop currents of each. If there are $n$ current sources, then the variables $v_1, v_2, \ldots, v_n$ must be introduced to represent the voltage drop across each source. Together there are $n + m$ unknowns. We can apply Kirchhoff's voltage law to each of the $m$ loops and obtain $m$ linear equations for the unknowns. The current through each current source must match the loop currents through it. This gives $n$ more linear equations for a total of $n + m$ linear equations for the $n + m$ unknowns.

Example 3.13 Solve the resistor network in Figure 3.6. Note that this is the same circuit as Example 3.12 but here the loop current method will be used. The unknowns are the loop currents $i_1$ and $i_2$ (there are no current sources). Remember that the loop currents add in shared components. For example, the current downwards in the 2 Ω resistor is $i_1 - i_2$. Using loop currents, Kirchhoff's second law is always satisfied. The equations needed to solve for the loop currents are obtained by summing voltage drops around each elementary loop:
\[
\begin{array}{ll}
i_1 + 2(i_1 - i_2) + 4 i_1 - 12 = 0, & \text{voltage drops going around loop 1} \\
3 i_2 + 2(i_2 - i_1) = 0, & \text{voltage drops going around loop 2.}
\end{array}
\]
Collecting terms:
\[
\begin{array}{rcl}
7 i_1 - 2 i_2 & = & 12 \\
-2 i_1 + 5 i_2 & = & 0
\end{array}
\]
which can be solved to give $i_2 = 24/31$ and $i_1 = 420/217$ (that is, $60/31$). With these values of $i_1$ and $i_2$ the current through each resistor and the power source can be determined, solving the problem.

There are easier ways to solve this particular small problem (the easiest is probably to use combinations of the series and parallel resistor laws). However, the loop current rule works for networks of arbitrarily large size and leads to systems of equations with a relatively small number of unknowns. On tests and exams, it is expected that students will be able to apply the idea of loop currents.

Example 3.14 Solve the network shown in Figure 3.7. There are three independent loop currents which are labelled $i_1$, $i_2$ and $i_3$. There is a single current
source of 4A with voltage drop $v$ across it (minus to plus in the direction of the current). There are also two 10V voltage sources.

Figure 3.6: The resistor network for Example 3.13.

Remember that the current in an electrical branch shared by two loop currents is equal to the (signed) sum of the two loop currents, i.e. the current moving to the left through the 5 Ω resistor of Figure 3.7 is $i_1 - i_3$ and thus the voltage drop is $5(i_1 - i_3)$. Be careful also of the sign of voltage drops. Moving around loop 1 in the circuit in Figure 3.7 clockwise, when the current source is crossed, there is a voltage increase of $v$, so this would be a voltage drop of $-v$ in the expression for Kirchhoff's first law for this loop written below. We sum the voltage drops around loop 1 beginning at the current source and moving clockwise (the same direction as $i_1$) to obtain
\[
-v + 2 i_1 + 5(i_1 - i_3) + 2(i_1 - i_2) = 0
\]
which can be simplified to
\[
9 i_1 - 2 i_2 - 5 i_3 - v = 0
\]
The equations for the voltage drops around loops 2 and 3 are derived similarly:
\[
\begin{array}{rcl}
-2 i_1 + 5 i_2 - 3 i_3 & = & -10 \\
-5 i_1 - 3 i_2 + 8 i_3 & = & 10
\end{array}
\]
The final linear equation comes from matching the loop currents to the current source:
\[
i_1 = 4
\]
The four linear equations above can be solved for the four unknowns $i_1$, $i_2$, $i_3$, $v$. The solution can be found using MATLAB (the details are in the computer lab #4 guide):
\[
i_1 = 4\,\text{A}, \quad i_2 \approx 2.3871\,\text{A}, \quad i_3 \approx 4.6452\,\text{A}, \quad v = 8\,\text{V}
\]
Now that the loop currents are determined, the branch currents can be written down. For example, the current through the 5 Ω resistor moving to the right is $i_3 - i_1 \approx 0.6452$A. The right hand panel of Figure 3.7 displays all six branch currents.
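The four equations of Example 3.14 can be checked with a few lines of MATLAB. This is a minimal sketch and not necessarily the procedure of the lab guide: it assumes the unknowns are ordered i1, i2, i3, v, and the names A, b, x and i5 are chosen here for illustration.

```matlab
% Example 3.14: unknowns ordered i1, i2, i3, v (ordering assumed here).
A = [ 9 -2 -5 -1;    % voltage drops around loop 1
     -2  5 -3  0;    % voltage drops around loop 2
     -5 -3  8  0;    % voltage drops around loop 3
      1  0  0  0];   % current source: i1 = 4
b = [0; -10; 10; 4];

x = A \ b;           % solve the square linear system
fprintf('i1 = %.4f A, i2 = %.4f A, i3 = %.4f A, v = %.4f V\n', x);

% Branch current through the 5 ohm resistor, moving to the right.
i5 = x(3) - x(1);
```

The backslash operator is used instead of forming an inverse, which is the usual choice for solving a single square system in MATLAB.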
Figure 3.7: The left panel displays the schematic circuit from Example 3.14 with two 10V voltage sources, four resistors and a current source of 4A. The loop currents $i_1$, $i_2$, $i_3$ represent the current in each of the independent closed loops. The right panel is the solution to the electric network on the left panel with branch currents shown.

Example 3.15 Solve the resistor network in Figure 3.8. In this case, solve both the basic problem when $V = 9$ and $I = 1$, and then the fundamental problem for arbitrary $V$ and $I$.

The unknowns for the problem are $i_1$, $i_2$, $i_3$ and $E$. One equation in the system of unknowns comes from the fact that the loop current variables must match the current source: $i_3 = -I$. Note that this equation is so simple we will no longer consider $i_3$ as a variable but replace $i_3$ by the known value $-I$ in the equations below. Voltage drops across the three loops give:
\[
\begin{array}{rcl}
i_1 + 2(i_1 - i_2) + 5 i_1 - V & = & 0 \\
3(i_2 - i_3) + 2(i_2 - i_1) & = & 0 \\
5 i_3 + E + 3(i_3 - i_2) & = & 0
\end{array}
\]
Since $I$ and $V$ and $i_3$ (by the discussion above) are known quantities, we move them to the right hand side of the linear equations for $i_1$, $i_2$ and $E$ which are written below
\[
\begin{array}{rcl}
8 i_1 - 2 i_2 & = & V \\
-2 i_1 + 5 i_2 & = & -3I \\
-3 i_2 + E & = & 8I.
\end{array}
\qquad (3.1)
\]
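System (3.1) is small enough to explore numerically. The sketch below is an illustration under stated assumptions, not the method of the notes: it orders the unknowns as i1, i2, E, solves the basic problem for V = 9 and I = 1, and then treats the fundamental problem by linearity, splitting the right-hand side into a part multiplying V and a part multiplying I. The names cV and cI are hypothetical.

```matlab
% System (3.1): unknowns ordered i1, i2, E (ordering assumed here).
A = [ 8 -2  0;
     -2  5  0;
      0 -3  1];

% Basic problem with the given values V = 9 and I = 1.
V = 9;  I = 1;
x = A \ [V; -3*I; 8*I];

% Fundamental problem: the right-hand side is V*[1;0;0] + I*[0;-3;8],
% so by linearity the solution is V*cV + I*cI for any V and I.
cV = A \ [1; 0; 0];    % coefficients multiplying V in i1, i2, E
cI = A \ [0; -3; 8];   % coefficients multiplying I in i1, i2, E
```

The third entries of cV and cI give the dependence of E on V and I, and the first entries give the dependence of i1, without any symbolic computation.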
Figure 3.8: The resistor network for Example 3.15.

With $V = 9$ and $I = 1$ this is solved using Gaussian elimination to give the solution $E = 7\tfrac{1}{2}$, $i_2 = -1/6$ and $i_1 = 13/12$. Note that the negative value for $i_2$ means that this loop current physically goes in the opposite direction to that in the Figure.

For the fundamental problem, we consider (3.1) for arbitrary values of $V$ and $I$. We write the system as an augmented matrix and do Gaussian elimination with symbolic terms in the right hand sides:
\[
\left[\begin{array}{ccc|c} 8 & -2 & 0 & V \\ -2 & 5 & 0 & -3I \\ 0 & -3 & 1 & 8I \end{array}\right]
\sim
\left[\begin{array}{ccc|c} 1 & -1/4 & 0 & V/8 \\ 0 & 1 & 0 & -\tfrac{2}{3} I + V/18 \\ 0 & 0 & 1 & 6I + V/6 \end{array}\right]
\]
so $E = 6I + V/6$ (the voltage across the current source in terms of the given voltage and currents of sources) and (after some algebra) $i_1 = \tfrac{10}{72} V - \tfrac{1}{6} I$ (the current through the power source). An alternate approach to solving the fundamental problem is given in the next section.

3.5.4 Alternate Presentation of Resistor Networks

Linear systems from resistor networks were presented in a previous version of the notes in a different way. That previous presentation is reproduced here beginning in the next paragraph. This alternate explanation may be helpful to some students.

Consider the circuit shown in Figure 3.9. We won't be able to solve this circuit until we have studied differential equations in the last part of this course. However, we can make some progress using what we know already.

There are three types of components: resistors, inductors (coils) and capacitors. Associated with each component is the current $I$ flowing through that component, and the voltage drop $V$ across that component. If there are $n$ different components in a circuit, then there are $2n$ variables (currents and voltages) to determine. In the circuit above there are 8. Of course, these variables are not all independent. They satisfy two types of linear relations: algebraic and differential.
We won't touch the differential relations for now, but we can consider the algebraic relations.
Figure 3.9: A resistor network.

The first algebraic relation relates the current and voltage drop across a resistor. If $R$ is the resistance and $I$ and $V$ are the current and voltage drop respectively, then $V = IR$. In our example, this gives two equations
\[
\begin{array}{rcl}
V_1 & = & I_1 R_1 \\
V_2 & = & I_2 R_2
\end{array}
\]
The other two algebraic relations are Kirchhoff's laws. The first of these states that the total voltage drop across any loop in the circuit is zero. For the two loops in the example circuit, this gives the equations
\[
\begin{array}{rcl}
V_4 - V_2 & = & 0 \\
V_1 + V_3 + V_2 & = & 0
\end{array}
\]
Notice we have to take the direction of the arrows into account. The second Kirchhoff law states that current cannot accumulate at a node. At each node, the current flowing in must equal the current flowing out. In the example circuit there are three nodes, giving the equations
\[
\begin{array}{rcl}
I_4 + I_2 - I_1 & = & 0 \\
I_1 - I_3 & = & 0 \\
I_3 - I_2 - I_4 & = & 0
\end{array}
\]
We now want to pick a few of the variables, and solve for all the rest in terms of these. In a small circuit like the example, this can be done "by hand." For example, it's pretty obvious that $I_1 = I_3$ and $V_2 = V_4$ so one could eliminate two variables right off the bat. However, it is also useful to have a systematic way of doing it that will work for any circuit (but probably will require a computer for anything but the simplest circuit).

As a rule of thumb, you can pick the voltages across the capacitors and the currents through the inductors as basic variables and solve for the rest in terms of these. In other words, we want $I_3$ and $V_4$ to be parameters when we solve
the system of equations. To accomplish this we will choose the order of the variables with $I_3$ and $V_4$ at the end of the list. With this in mind we choose the order $I_1$, $I_2$, $I_4$, $V_1$, $V_2$, $V_3$, $I_3$, $V_4$. Then the equations become
\[
\begin{array}{rcl}
R_1 I_1 - V_1 & = & 0 \\
R_2 I_2 - V_2 & = & 0 \\
-V_2 + V_4 & = & 0 \\
V_1 + V_2 + V_3 & = & 0 \\
-I_1 + I_2 + I_4 & = & 0 \\
I_1 - I_3 & = & 0 \\
-I_2 - I_4 + I_3 & = & 0
\end{array}
\]
The matrix for this system is (since it is a homogeneous system of equations, we don't have to bother writing the augmented part)
\[
\begin{bmatrix}
R_1 & 0 & 0 & -1 & 0 & 0 & 0 & 0 \\
0 & R_2 & 0 & 0 & -1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & -1 & 0 & 0 & 1 \\
0 & 0 & 0 & 1 & 1 & 1 & 0 & 0 \\
-1 & 1 & 1 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & -1 & 0 \\
0 & -1 & -1 & 0 & 0 & 0 & 1 & 0
\end{bmatrix}
\]
Here is the reduced form of this matrix.
\[
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 & -1 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & -\frac{1}{R_2} \\
0 & 0 & 1 & 0 & 0 & 0 & -1 & \frac{1}{R_2} \\
0 & 0 & 0 & 1 & 0 & 0 & -R_1 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & -1 \\
0 & 0 & 0 & 0 & 0 & 1 & R_1 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
Thus
\[
\begin{array}{rcl}
I_1 & = & I_3 \\
I_2 & = & \frac{1}{R_2} V_4 \\
I_4 & = & I_3 - \frac{1}{R_2} V_4 \\
V_1 & = & R_1 I_3 \\
V_2 & = & V_4 \\
V_3 & = & -R_1 I_3 - V_4
\end{array}
\]
So we have succeeded in expressing all the variables in terms of $I_3$ and $V_4$. We therefore need only determine these to solve the circuit completely.
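The reduction above can be reproduced numerically once the resistances are given values. The following sketch assumes R1 = 2 and R2 = 4 purely for illustration (the notes keep both symbolic) and uses the built-in rref function; the names A and Ared are hypothetical.

```matlab
% Assumed sample resistances (the notes keep R1 and R2 symbolic).
R1 = 2;  R2 = 4;

% Columns ordered I1, I2, I4, V1, V2, V3, I3, V4, as in the text.
A = [ R1  0   0  -1   0  0   0  0;   % Ohm's law for R1
      0   R2  0   0  -1  0   0  0;   % Ohm's law for R2
      0   0   0   0  -1  0   0  1;   % loop: V4 - V2 = 0
      0   0   0   1   1  1   0  0;   % loop: V1 + V2 + V3 = 0
     -1   1   1   0   0  0   0  0;   % node: I4 + I2 - I1 = 0
      1   0   0   0   0  0  -1  0;   % node: I1 - I3 = 0
      0  -1  -1   0   0  0   1  0];  % node: I3 - I2 - I4 = 0

Ared = rref(A);   % the last two columns belong to the parameters I3 and V4
disp(Ared)
```

With these sample values the entries -1/R2, 1/R2, -R1 and R1 of the reduced matrix appear as -0.25, 0.25, -2 and 2, and the last row is zero because one of the node equations is redundant.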
3.5.5 Problems

Figure 3.10: Circuit diagrams for Problems 3.25 (left) and 3.26 (right).

Problem 3.25 Find the currents and voltages in each component of the circuit shown in Figure 3.10 (left).

Problem 3.26 The resistances of the resistors in the circuit shown in Figure 3.10 (right) are $R_1 = 4\,\Omega$, $R_2 = 1\,\Omega$, $R_3 = R_4 = 2\,\Omega$. Find the current $I_2$ that flows through resistor $R_2$ if the voltages across the batteries are $E_1 = 5$V and $E_2 = 3$V. In which direction does $I_2$ flow? Solve the problem as a system of linear equations for the loop currents $i_1$ and $i_2$ shown in the diagram.

Problem 3.27 Consider the resistor network:

Given $R_1 = 3\,[\Omega]$, $R_2 = 1\,[\Omega]$, $R_3 = 4\,[\Omega]$, $V = 26\,[V]$ and $I = 2\,[A]$, answer the following questions:
a) What is the voltage drop across $R_3$?
b) What is the current flow through $R_2$?
c) What is the voltage drop across $R_1$?

Problem 3.28 Consider the following resistor network:
Suppose that $R_1 = 4\,[\Omega]$, $R_2 = 1\,[\Omega]$, $R_3 = 2\,[\Omega]$ and that the current flow through $R_3$ is $1.5\,[A]$. Solve the resistor network and answer the following questions:
a) What is the voltage drop across $R_3$?
b) What is the current flow through $R_2$?
c) What is the voltage drop through $V$?

Problem 3.29 Consider the following resistor network:

Suppose that $R_1 = 4\,[\Omega]$, $R_2 = 2\,[\Omega]$, $R_3 = 10\,[\Omega]$ and that $V = 60\,[V]$. Solve the resistor network and answer the following questions:
a) What is the voltage drop through $R_2$?
b) What is the current flow through $R_1$?
c) What is the current flow through $R_3$?

Problem 3.30 Find the loop currents $i_1$, $i_2$, $i_3$ in the following network:

where $R_1 = 1\,[\Omega]$, $R_2 = 3\,[\Omega]$, $R_3 = 5\,[\Omega]$, $R_4 = 2\,[\Omega]$, $E_1 = 10\,[V]$ and $E_2 = 4\,[V]$.
Problem 3.31 Consider the following network:

where $R_1 = 1\,[\Omega]$, $R_2 = 2\,[\Omega]$, $R_3 = [\Omega]$, $R_4 = [\Omega]$, $V = 25\,[V]$ and $I = 3\,[A]$.
a) Set up and solve the linear system for the loop currents $i_1$, $i_2$, $i_3$ and the voltage $E$ across the current source.
b) What is the voltage drop through $R_2$?
c) What is the current flow through $V$?
d) What is the voltage drop through $I$?

Problem 3.32 Consider the resistor network:

with $R_1 = 2\,[\Omega]$, $R_2 = 5\,[\Omega]$, $R_3 = 3\,[\Omega]$, $V = 10\,[V]$, $I_2 = 3\,[A]$. Suppose that the voltage drop through $I_2$ is $E_2 = 5\,[V]$.
a) What is the current flow through $R_2$?
b) What is the voltage drop through $I_1$?
c) What is the current flow through $I_1$?

Problem 3.33 If a circuit contains only resistors, then we can solve it completely using the ideas of this section. Write down the linear equations satisfied by the currents in the circuit shown in Figure 3.11. In this diagram, the component on the far left is a voltage source (battery). The voltage across the voltage source is always $E$.
Figure 3.11: The circuit from Problem 3.33.

3.6 Additional Topics

3.6.1 Quadratic Functions

Let us begin by recalling how we would find the minimum of a quadratic function in one variable, namely a parabola given by $f(x) = ax^2 + bx + c$ as shown in Figure 3.12. We simply find the value of $x$ for which the derivative is zero, that is, we solve $f'(x) = 0$. Notice that since $f$ is quadratic, this is a linear equation
\[
2ax + b = 0
\]
which is easily solved for $x = -b/2a$ (provided $a \neq 0$). So the minimum value is $f(-b/2a) = -b^2/(4a) + c$. Of course, if $a$ is negative, then the parabola points downwards, and we have found the maximum value, not the minimum value.

Figure 3.12: The minimization of a quadratic function in one variable.

A quadratic function of two variables $x_1$ and $x_2$ is a function of the form
\[
f(x_1, x_2) = a x_1^2 + 2b x_1 x_2 + c x_2^2 + d x_1 + e x_2 + f.
\]
(The 2 in front of $b$ is just for convenience.) For what values of $x_1$ and $x_2$ is $f(x_1, x_2)$ the smallest? Just like with the parabola in one variable, there may be no such values. It could be that $f$ has a maximum instead, or that $f$ has what is called a saddle point. However, if $f$ does have a minimum, the procedure
described below is guaranteed to find it. (If $f$ has a maximum or saddle point, the procedure will find these points instead.)

The idea behind finding the minimum is simple. Suppose that $x_1$ and $x_2$ are the values for which $f(x_1, x_2)$ is smallest. Then the function $g(s) = f(x_1 + s, x_2)$ must have a minimum at $s = 0$. So $g'(0) = 0$. But
\[
\begin{array}{rcl}
g'(s) & = & \dfrac{d}{ds} f(x_1 + s, x_2) \\[1ex]
& = & \dfrac{d}{ds} \left( a(x_1 + s)^2 + 2b(x_1 + s)x_2 + c x_2^2 + d(x_1 + s) + e x_2 + f \right) \\[1ex]
& = & 2a(x_1 + s) + 2b x_2 + d
\end{array}
\]
so that the condition is
\[
g'(0) = 2a x_1 + 2b x_2 + d = 0.
\]
Notice that this expression can be obtained by holding $x_2$ fixed and differentiating with respect to $x_1$. It is called the partial derivative of $f$ with respect to $x_1$ and is denoted $\frac{\partial f}{\partial x_1}$.

The same argument can be applied to $h(s) = f(x_1, x_2 + s)$ (or $\frac{\partial f}{\partial x_2}$). This yields
\[
h'(0) = \frac{\partial f(x_1, x_2)}{\partial x_2} = 2b x_1 + 2c x_2 + e = 0.
\]
Therefore we conclude that the pair of values $x_1$ and $x_2$ at which $f$ achieves its minimum satisfy the system of linear equations
\[
\begin{array}{rcl}
2a x_1 + 2b x_2 & = & -d \\
2b x_1 + 2c x_2 & = & -e
\end{array}
\]
This is a 2 by 2 system with augmented matrix
\[
\left[\begin{array}{cc|c} 2a & 2b & -d \\ 2b & 2c & -e \end{array}\right]
\]
This is easily generalized to $n$ variables. In this case the quadratic function is given by
\[
f(x_1, x_2, \ldots, x_n) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{i,j} x_i x_j + \sum_{i=1}^{n} b_i x_i + c
\]
To see this is the same, let us expand out the first term when $n = 2$. Then
\[
\begin{array}{rcl}
\displaystyle\sum_{i=1}^{n} \sum_{j=1}^{n} a_{i,j} x_i x_j & = & a_{1,1} x_1 x_1 + a_{1,2} x_1 x_2 + a_{2,1} x_2 x_1 + a_{2,2} x_2 x_2 \\
& = & a_{1,1} x_1^2 + (a_{1,2} + a_{2,1}) x_1 x_2 + a_{2,2} x_2^2
\end{array}
\]
So this is just the same as before with $a_{1,1} = a$, $a_{1,2} + a_{2,1} = 2b$ and $a_{2,2} = c$. Notice that we might as well assume that $a_{i,j} = a_{j,i}$, since replacing both $a_{i,j}$ and $a_{j,i}$ with $(a_{i,j} + a_{j,i})/2$ doesn't change $f$.
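The two-variable case derived above can be tried out numerically. The sketch below uses assumed sample coefficients a = 2, b = 1, c = 3, d = -4, e = -6 (none of them come from the notes), writes the constant term as f0 to avoid a clash with the function name, and solves the 2 by 2 system with the backslash operator. It then compares the value at the critical point with a few nearby values as a rough check that the point is a minimum for this particular choice.

```matlab
% Assumed sample coefficients of
% f(x1,x2) = a*x1^2 + 2*b*x1*x2 + c*x2^2 + d*x1 + e*x2 + f0.
a = 2; b = 1; c = 3; d = -4; e = -6; f0 = 0;
f = @(x1, x2) a*x1.^2 + 2*b*x1.*x2 + c*x2.^2 + d*x1 + e*x2 + f0;

% Setting both partial derivatives to zero gives a 2 by 2 linear system.
M    = [2*a 2*b; 2*b 2*c];
rhs  = [-d; -e];
xmin = M \ rhs;

% Compare f at the critical point with f at a few nearby points.
fmin   = f(xmin(1), xmin(2));
nearby = [f(xmin(1)+0.1, xmin(2)), f(xmin(1), xmin(2)-0.1), ...
          f(xmin(1)-0.1, xmin(2)+0.1)];
```

Sampling nearby points is only a sanity check, not a proof; for other coefficients the same linear system may locate a maximum or a saddle point, as noted in the text.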
Figure 3.13: Fitting a line through data.

If this function $f$ has a minimum we can find it by generalizing the procedure above. In other words, we try to find values of $x_1, \ldots, x_n$ for which $\partial f/\partial x_1 = \partial f/\partial x_2 = \cdots = \partial f/\partial x_n = 0$. This leads to a system of $n$ linear equations whose associated augmented matrix is
\[
\left[\begin{array}{cccc|c}
2a_{1,1} & 2a_{1,2} & \ldots & 2a_{1,n} & -b_1 \\
2a_{2,1} & 2a_{2,2} & \ldots & 2a_{2,n} & -b_2 \\
\vdots & \vdots & & \vdots & \vdots \\
2a_{n,1} & 2a_{n,2} & \ldots & 2a_{n,n} & -b_n
\end{array}\right]
\]

3.6.2 Least squares fit

As a first application let us consider the problem of finding the "best" straight line going through a collection of data points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$. (Careful! The $x_i$'s are not the unknowns in this problem, but rather the known fixed data points, together with the $y_i$'s.) Consider Figure 3.13. Which straight line fits best? There is no one answer. One can measure how good the fit of a straight line is in various ways. However, the following way of measuring the fit results in a problem that is easy to solve.

Each line is given by an equation $y = ax + b$. So the variables in this problem are $a$ and $b$. We want to find the values of $a$ and $b$ that give the best fitting line. The vertical distance between the point $(x_i, y_i)$ and the line is given by $|y_i - a x_i - b|$. We will take as a measure of the fit the square of this quantity, added up over all the data points. So
\[
\begin{array}{rcl}
f(a, b) & = & \displaystyle\sum_i (y_i - a x_i - b)^2 \\
& = & \displaystyle\sum_i \left( y_i^2 + x_i^2 a^2 + b^2 - 2 x_i y_i a - 2 y_i b + 2 x_i a b \right) \\
& = & \displaystyle\Big(\sum_i x_i^2\Big) a^2 + 2 \Big(\sum_i x_i\Big) ab + n b^2 - 2 \Big(\sum_i x_i y_i\Big) a - 2 \Big(\sum_i y_i\Big) b + \sum_i y_i^2
\end{array}
\]
Here we used that $\left(\sum_i 1\right) = n$, the number of points. Therefore the linear
equations we must solve for $a$ and $b$ are
\[
\left[\begin{array}{cc|c}
2 \sum_i x_i^2 & 2 \sum_i x_i & 2 \sum_i x_i y_i \\
2 \sum_i x_i & 2n & 2 \sum_i y_i
\end{array}\right]
\]
We could solve these equations numerically in each particular case, but since it's just a 2 by 2 system we can also write down the answer explicitly. In fact, the solution to
\[
\left[\begin{array}{cc|c} A & B & E \\ C & D & F \end{array}\right]
\]
is
\[
\begin{bmatrix} \dfrac{DE - BF}{AD - BC} \\[2ex] \dfrac{AF - CE}{AD - BC} \end{bmatrix},
\]
provided $AD - BC \neq 0$, as you may check directly, or derive using a sequence of row transformations. So in this case
\[
\begin{array}{rcl}
a & = & \dfrac{n \left(\sum x_i y_i\right) - \left(\sum x_i\right)\left(\sum y_i\right)}{n \left(\sum x_i^2\right) - \left(\sum x_i\right)^2} \\[2ex]
b & = & \dfrac{\left(\sum x_i^2\right)\left(\sum y_i\right) - \left(\sum x_i\right)\left(\sum x_i y_i\right)}{n \left(\sum x_i^2\right) - \left(\sum x_i\right)^2}
\end{array}
\]
Example 3.16 Suppose we want to find the best straight line through the points (1, 1), (2, 2), (2, 3), (3, 3) and (3, 4). Calculate
\[
\begin{array}{rclcl}
\sum 1 & = & n & = & 5 \\
\sum x_i & = & 1 + 2 + 2 + 3 + 3 & = & 11 \\
\sum y_i & = & 1 + 2 + 3 + 3 + 4 & = & 13 \\
\sum x_i^2 & = & 1 + 4 + 4 + 9 + 9 & = & 27 \\
\sum y_i^2 & = & 1 + 4 + 9 + 9 + 16 & = & 39 \\
\sum x_i y_i & = & 1 + 4 + 6 + 9 + 12 & = & 32
\end{array}
\]
so
\[
a = (5 \cdot 32 - 11 \cdot 13)/(5 \cdot 27 - 11^2) = 17/14 = 1.214\ldots
\]
and
\[
b = (27 \cdot 13 - 11 \cdot 32)/(5 \cdot 27 - 11^2) = -1/14 = -0.0714\ldots
\]
The result is shown in Figure 3.14.

3.6.3 Equilibrium configuration of hanging weights and springs

Consider the problem of $n$ vertically hanging weights connected by springs. What is the equilibrium configuration? We can solve this problem by calculating the total potential energy of the system. The equilibrium configuration minimizes the total potential energy. A diagram of the setup is shown in Figure 3.15. Our goal is to compute the numbers $x_1, \ldots, x_n$. In the diagram $n = 3$.
Figure 3.14: The solution in Example 3.16.

Figure 3.15: Equilibrium configuration of springs.

There are two sources of potential energy. One is the potential energy stored in the spring. This is equal to $ks^2/2$, where $k$ is the spring constant that measures the stiffness of the spring, and $s$ is the amount that the spring has been stretched from its natural length. In our problem, suppose that the spring constant of the $i$th spring is $k_i$ and its natural length is $l_i$. Then the potential energy stored in the $i$th spring is $k_i (x_i - x_{i-1} - l_i)^2/2$. To make this formula work out correctly for the first spring we set $x_0 = 0$.

The other source of potential energy is gravity. The gravitational potential energy of the $i$th weight is $-m_i g x_i$. The reason for the minus sign is that we are measuring distances downward.

Thus the total potential energy in the system for $n$ weights is the function
\[
f(x_1, x_2, \ldots, x_n) = \sum_{i=1}^{n} \left( \frac{k_i}{2} (x_i - x_{i-1} - l_i)^2 - m_i g x_i \right).
\]
When $n = 3$ this becomes
\[
f(x_1, x_2, x_3) = \frac{k_1}{2}(x_1 - l_1)^2 + \frac{k_2}{2}(x_2 - x_1 - l_2)^2 + \frac{k_3}{2}(x_3 - x_2 - l_3)^2 - m_1 g x_1 - m_2 g x_2 - m_3 g x_3
\]
This is a quadratic function, so we know how to find the minimum. The equations are obtained by taking partial derivatives: To get the first equation we
hold $x_2$ and $x_3$ fixed and differentiate with respect to $x_1$ and so on. Thus the equations are
\[
\begin{array}{rcl}
k_1 (x_1 - l_1) - k_2 (x_2 - x_1 - l_2) - m_1 g & = & 0 \\
k_2 (x_2 - x_1 - l_2) - k_3 (x_3 - x_2 - l_3) - m_2 g & = & 0 \\
k_3 (x_3 - x_2 - l_3) - m_3 g & = & 0
\end{array}
\]
The augmented matrix for this system is
\[
\left[\begin{array}{ccc|c}
k_1 + k_2 & -k_2 & 0 & m_1 g + k_1 l_1 - k_2 l_2 \\
-k_2 & k_2 + k_3 & -k_3 & m_2 g + k_2 l_2 - k_3 l_3 \\
0 & -k_3 & k_3 & m_3 g + k_3 l_3
\end{array}\right]
\]
Example 3.17 Suppose that the spring constants are $k_1 = 1$, $k_2 = 2$ and $k_3 = 1$. The masses are all equal to 1, $g = 10$ and the natural length of the springs is 1 for all springs (in appropriate units). Find the equilibrium configuration. We must solve
\[
\left[\begin{array}{ccc|c} 3 & -2 & 0 & 9 \\ -2 & 3 & -1 & 11 \\ 0 & -1 & 1 & 11 \end{array}\right]
\]
Gaussian elimination gives
\[
\left[\begin{array}{ccc|c} 3 & -2 & 0 & 9 \\ 0 & -1 & 1 & 11 \\ 0 & 0 & 2 & 106 \end{array}\right]
\]
which can be solved to give $x_1 = 31$, $x_2 = 42$, $x_3 = 53$.

3.6.4 Problems

Problem 3.34 Find the "best" straight line going through the points (1, 1), (2, 1), (2, 3), (3, 4), (3, 5) and (4, 4).

Problem 3.35 Consider the problem of finding the parabola $y = ax^2 + bx + c$ that best fits the $n$ data points $(x_1, y_1), \ldots, (x_n, y_n)$. Derive the system of three linear equations which determine $a$, $b$ and $c$. (You need not solve them!)

Problem 3.36 Write down the augmented matrix for a system of $n$ weights and springs.

Problem 3.37 Write down the system of equations you would have to solve if there are 5 identical springs with $k_i = 1$ and $l_i = 1$ and five weights with $m_1 = 1$, $m_2 = 2$, $m_3 = 3$, $m_4 = 4$, and $m_5 = 5$.
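The two worked examples of this section, Example 3.16 (least squares line) and Example 3.17 (three weights and springs), can be checked numerically. The following is a minimal sketch using only the data stated in those examples; the variable names (M, rhs, ab, K, r, xeq) are chosen here for illustration, and the common factor 2 in the least squares equations has been cancelled.

```matlab
% Example 3.16: least squares line y = a*x + b through five points.
x = [1 2 2 3 3];
y = [1 2 3 3 4];
n = numel(x);
M   = [sum(x.^2) sum(x); sum(x) n];   % coefficient matrix (factor 2 cancelled)
rhs = [sum(x.*y); sum(y)];
ab  = M \ rhs;                        % ab(1) is a, ab(2) is b

% Example 3.17: three weights and springs.
k = [1 2 1];  l = [1 1 1];  m = [1 1 1];  g = 10;
K = [k(1)+k(2) -k(2)       0;
     -k(2)      k(2)+k(3) -k(3);
      0        -k(3)       k(3)];
r = [m(1)*g + k(1)*l(1) - k(2)*l(2);
     m(2)*g + k(2)*l(2) - k(3)*l(3);
     m(3)*g + k(3)*l(3)];
xeq = K \ r;                          % equilibrium positions x1, x2, x3
```

Building the matrices from the vectors k, l and m rather than typing the numbers directly keeps the sketch close to the general augmented matrix given in the text.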
3.7 Solutions to Chapter Problems

Solution 3.1 The augmented matrix is
\[
A = \left[\begin{array}{ccc|c} 1 & -2 & 3 & 6 \\ 4 & -5 & -6 & 7 \\ 8 & 9 & 10 & 11 \end{array}\right]
\]

Solution 3.2 Here is the sequence of systems of equations you would get:
\[
\begin{array}{rcl}
x_1 + x_2 + x_3 & = & 6 \\
x_1 - x_2 + x_3 & = & 0 \\
2x_1 + x_2 - 8x_3 & = & -11
\end{array}
\]
\[
\begin{array}{rcl}
x_1 + x_2 + x_3 & = & 6 \\
-2x_2 & = & -6 \\
2x_1 + x_2 - 8x_3 & = & -11
\end{array}
\]
\[
\begin{array}{rcl}
x_1 + x_2 + x_3 & = & 6 \\
-2x_2 & = & -6 \\
-x_2 - 10x_3 & = & -23
\end{array}
\]
\[
\begin{array}{rcl}
x_1 + x_2 + x_3 & = & 6 \\
x_2 & = & 3 \\
-x_2 - 10x_3 & = & -23
\end{array}
\]
\[
\begin{array}{rcl}
x_1 + x_2 + x_3 & = & 6 \\
x_2 & = & 3 \\
-10x_3 & = & -20
\end{array}
\]
So $x_3 = 2$, $x_2 = 3$ and $x_1 = 1$.

Solution 3.3

    A = [1, 1, 1, 6; 1, -1, 1, 0; 2, 1, -8, -11]
    A(2,:) = A(2,:) - A(1,:)
    A(3,:) = A(3,:) - 2*A(1,:)
    A(2,:) = -A(2,:)/2
    A(3,:) = A(3,:) + A(2,:)
    A(3,:) = -A(3,:)/10

We get the same solution as in the pen-and-paper version in problem 3.2.

Solution 3.4 Perform the sequence of operations to finally end up with the system
\[
\begin{array}{rcl}
2x_3 & = & 4 \\
3x_1 - 3x_2 + 3x_3 & = & 30 \\
9x_1 + 2x_3 & = & 26.
\end{array}
\]
Hence, $x_1 = 22/9$, $x_2 = -50/9$, and $x_3 = 2$.
Solution 3.5 We want to solve the system
\[
\begin{array}{rcl}
2x_1 + x_2 & = & 5 \\
3x_1 + 5x_2 & = & -10.
\end{array}
\]
We shall use the method of substitution. From the first equation, we solve for $x_2$:
\[
x_2 = 5 - 2x_1
\]
We then substitute it into the second equation:
\[
3x_1 + 5(5 - 2x_1) = -10.
\]
We now have a decoupled equation, and the solution is $x_1 = 5$, $x_2 = -5$.

Solution 3.6 The first equation reads $x_1 = 3$. The second reads $x_1 - x_2 = 3$, or $3 - x_2 = 3$, or $x_2 = 0$. The third reads $2x_1 + x_2 - 8x_3 = -4$, or $6 + 0 - 8x_3 = -4$, or $x_3 = 5/4$.

Solution 3.7 The last equation gives $x_4 = 2$ and the second last one $x_3 = 2$. Then we have to introduce a parameter $x_2 = s$ and then we find that $x_1 = -5 - 2s$. Thus
\[
x = \begin{bmatrix} -5 \\ 0 \\ 2 \\ 2 \end{bmatrix} + s \begin{bmatrix} -2 \\ 1 \\ 0 \\ 0 \end{bmatrix}
\]
In the second system, we have to introduce a parameter right off the bat. So $x_4 = s_1$ and $x_3 = 4 - s_1$. Moving up one row, we have to introduce another parameter $x_2 = s_2$ and then $x_1 = 1 - 2s_2 - (4 - s_1) - 2s_1 = -3 - s_1 - 2s_2$ so
\[
x = \begin{bmatrix} -3 \\ 0 \\ 4 \\ 0 \end{bmatrix} + s_1 \begin{bmatrix} -1 \\ 0 \\ -1 \\ 1 \end{bmatrix} + s_2 \begin{bmatrix} -2 \\ 1 \\ 0 \\ 0 \end{bmatrix}
\]
The third system is just the same. The extra rows of zeros have no effect.

Solution 3.8 Perform the following sequence of row operations:
i) (2, :) = (2, :) − (1, :)
ii) (3, :) = (3, :) − (1, :)
iii) (3, :) = (3, :) + 2(2, :)
iv) (4, :) = (4, :) − 3(2, :)
v) (4, :) = (4, :) + (3, :),
in order to get the row echelon form
\[
\left[\begin{array}{cccc|c}
1 & 2 & 2 & 2 & 1 \\
0 & 1 & -3 & 1 & -2 \\
0 & 0 & -7 & 1 & 0 \\
0 & 0 & 0 & 0 & 0
\end{array}\right].
\]
Thus, the rank of the augmented matrix, which is the same as the rank of the unaugmented matrix, is $r = 3$, while the number of unknowns is $n = 4$. Therefore, the system has infinitely many solutions, which can be represented as the general solution $x = q + sa$, where $s$ is any real number, $q = (q_1, q_2, q_3, q_4)$ is any particular solution of the original system, and $a = (a_1, a_2, a_3, a_4)$ is any non-zero solution of the corresponding homogeneous system. From the row echelon form we can see that $q$ has to satisfy $-7q_3 + q_4 = 0$. Let $q_4 = 7$, then $q_3 = 1$. For the other two $q$ values we have
\[
\begin{aligned} q_2 &= -2 + 3q_3 - q_4 = -6 \\ q_1 &= 1 - 2q_2 - 2q_3 - 2q_4 = -3. \end{aligned}
\]
Hence, we have that $q = (-3, -6, 1, 7)$. Now, to find $a$, the homogeneous row echelon form is
\[
\left[\begin{array}{cccc|c} 1 & 2 & 2 & 2 & 0 \\ 0 & 1 & -3 & 1 & 0 \\ 0 & 0 & -7 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right].
\]
Thus, $a$ has to satisfy $-7a_3 + a_4 = 0$. Take $a_4 = 7$, then $a_3 = 1$. From the second row we have that $a_2 = 3a_3 - a_4 = -4$, and from the first row we get that $a_1 = -2a_2 - 2a_3 - 2a_4 = -8$. So $a = (-8, -4, 1, 7)$, and a general form of the solution is
\[
x = \begin{bmatrix} -3 \\ -6 \\ 1 \\ 7 \end{bmatrix} + s \begin{bmatrix} -8 \\ -4 \\ 1 \\ 7 \end{bmatrix}.
\]

Solution 3.9 Perform the following sequence of row operations

i) (2, :) = (2, :) − 3(1, :)
ii) (4, :) = (4, :) − 2(1, :)
iii) (4, :) = (4, :) + 2(3, :)
iv) (3, :) ↔ (2, :)
v) (4, :) ↔ (3, :)
vi) (4, :) = (4, :) − (3/2)(3, :),
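to transform the augmented matrix into the reduced row echelon form
\[
\left[\begin{array}{cccc|c} 1 & 2 & 2 & -7 & 20 \\ 0 & 6 & 0 & -6 & -10 \\ 0 & 0 & -6 & 0 & -30 \\ 0 & 0 & 0 & 16 & -30 \end{array}\right].
\]
We can now solve the system starting from the bottom:
\[
\begin{aligned}
x_4 &= -\frac{30}{16} = -\frac{15}{8} \\
x_3 &= \frac{-30}{-6} = 5 \\
x_2 &= \frac{-10 + 6x_4}{6} = -\frac{85}{24} \\
x_1 &= 20 - 2x_2 - 2x_3 + 7x_4 = \frac{95}{24}
\end{aligned}
\]

Solution 3.10 The matrix
\[
\left[\begin{array}{ccc|c} 1 & -2 & 3 & 2 \\ 2 & -3 & 2 & 2 \\ 3 & 2 & -4 & 9 \end{array}\right]
\]
reduces to
\[
\left[\begin{array}{ccc|c} 1 & -2 & 3 & 2 \\ 0 & 1 & -4 & -2 \\ 0 & 0 & 19 & 19 \end{array}\right]
\]
which has as solution $x_1 = 3$, $x_2 = 2$, $x_3 = 1$.

Solution 3.11 The matrix
\[
\left[\begin{array}{ccc|c} 2 & 1 & -1 & 6 \\ 1 & -2 & -2 & 1 \\ -1 & 12 & 8 & 7 \end{array}\right]
\]
reduces to
\[
\left[\begin{array}{ccc|c} 2 & 1 & -1 & 6 \\ 0 & -5/2 & -3/2 & -2 \\ 0 & 0 & 0 & 0 \end{array}\right]
\]
which has as solution $x_1 = 13/5 + 4s/5$, $x_2 = 4/5 - 3s/5$, $x_3 = s$.

Solution 3.12 The matrix
\[
\left[\begin{array}{ccc|c} 1 & 2 & 4 & 1 \\ 1 & 1 & 3 & 2 \\ 2 & 5 & 9 & 1 \end{array}\right]
\]
reduces to
\[
\left[\begin{array}{ccc|c} 1 & 2 & 4 & 1 \\ 0 & -1 & -1 & 1 \\ 0 & 0 & 0 & 0 \end{array}\right]
\]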
Thus, setting $x_3 = s$, we obtain $x_2 = -1 - s$ and $x_1 = 3 - 2s$.

Solution 3.13 The matrix
\[
\left[\begin{array}{ccc|c} 1 & 2 & 4 & 1 \\ 1 & 1 & 3 & 2 \\ 2 & 5 & 9 & 3 \end{array}\right]
\]
reduces to
\[
\left[\begin{array}{ccc|c} 1 & 2 & 4 & 1 \\ 0 & -1 & -1 & 1 \\ 0 & 0 & 0 & 2 \end{array}\right]
\]
which has no solutions.

Solution 3.14 The matrix
\[
\left[\begin{array}{cccc|c} 3 & 1 & -1 & 2 & 7 \\ 2 & -2 & 5 & -7 & 1 \\ -4 & -4 & 7 & -11 & -13 \end{array}\right]
\]
reduces to
\[
\left[\begin{array}{cccc|c} 3 & 1 & -1 & 2 & 7 \\ 0 & -8/3 & 17/3 & -25/3 & -11/3 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right]
\]
which has as solution $x_1 = 15/8 - 3s_1/8 + 3s_2/8$, $x_2 = 11/8 + 17s_1/8 - 25s_2/8$, $x_3 = s_1$, $x_4 = s_2$.

Solution 3.15 Probably the easiest way to do this problem is to think geometrically. This system of equations describes the intersection of two lines. The lines will intersect in a single point if they are not parallel. This will happen exactly when the two vectors $[a, b]$ and $[c, d]$ are not parallel. Recall that this can be tested using the determinant. So the equations have a unique solution exactly when
\[
\det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc \neq 0.
\]

Solution 3.16 Perform the following sequence of row operations:

i) (2, :) = (2, :) − 4(1, :)
ii) (3, :) = (3, :) + 4(1, :)
iii) (3, :) = (3, :) + (5/3)(2, :)

to get the reduced row echelon matrix
\[
\left[\begin{array}{ccc|c} 1 & 2 & 0 & 7 \\ 0 & 0 & 6 & -18 \\ 0 & 0 & 0 & 139 \end{array}\right].
\]
The last equation says $0 = 139$, which cannot happen, therefore this linear system has no solutions.

Solution 3.17 The script is the following:
    for a = 1:10
        A = [1 2 4 7; 4 1 3 2; 0 5 9 a];
        rref(A)
    end

As the value of a changes linearly, the solution $x_1$ (and, for that matter, $x_2$ and $x_3$ too) changes linearly with a.

Solution 3.18 Perform the following sequence of row operations:

i) (2, :) = (2, :) + (1, :)
ii) (3, :) = (3, :) − (2, :)

to get the reduced row echelon matrix
\[
\left[\begin{array}{cccc|c} 1 & 0 & 1 & 0 & 10 \\ 0 & 1 & 2 & 1 & 14 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right].
\]
Thus the row echelon form for the corresponding homogeneous system is
\[
\left[\begin{array}{cccc|c} 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 2 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{array}\right].
\]
Let $q$ be any solution to the original system. For example, check that $q = (9, 11, 1, 1)$ solves the system. Now, let $a_1$ and $a_2$ be any two linearly independent vectors which solve the corresponding homogeneous system. For example, check that $a_1 = (1, 1, -1, 1)$ and $a_2 = (1, -1, -1, 3)$ are two linearly independent vectors which solve the homogeneous system. Thus, one representation for the solutions of the system is
\[
x = \begin{bmatrix} 9 \\ 11 \\ 1 \\ 1 \end{bmatrix} + s_1 \begin{bmatrix} 1 \\ 1 \\ -1 \\ 1 \end{bmatrix} + s_2 \begin{bmatrix} 1 \\ -1 \\ -1 \\ 3 \end{bmatrix}.
\]

Solution 3.19 The general solution is $[4, 0, 3, 0] + s_1[-1, 1, 0, 0] + s_2[0, 0, -2, 1]$.

Solution 3.20 Perform the following sequence of row operations:

i) (2, :) = (2, :) − (1, :)
ii) (3, :) = (3, :) − (1/2)(2, :)
iii) (2, :) = (1/12)(2, :)
iv) (3, :) = (1/4)(3, :)
v) (2, :) = (2, :) + (1/3)(3, :)
vi) (1, :) = (1, :) − 6(3, :)
vii) (1, :) = (1, :) + 3(2, :)

to get the reduced row echelon matrix
\[
\left[\begin{array}{ccc|c} 1 & 0 & \tfrac{1}{2} & 0 \\ 0 & 1 & -\tfrac{7}{6} & 0 \\ 0 & 0 & 0 & 1 \end{array}\right].
\]
Here the rank of the unaugmented matrix ($r_u = 2$) is less than the rank of the augmented matrix ($r_a = 3$), therefore the linear system has no solutions.

Solution 3.21 To decide whether the vectors are linearly independent we must decide whether the homogeneous system of equations represented by the matrix
\[
\begin{bmatrix} 1 & 1 & 1 \\ 2 & 1 & 0 \\ 0 & -1 & 1 \\ 2 & 1 & 0 \end{bmatrix}
\]
has a non-zero solution. Row reduction yields
\[
\begin{bmatrix} 1 & 1 & 1 \\ 0 & -1 & -2 \\ 0 & 0 & 3 \\ 0 & 0 & 0 \end{bmatrix}.
\]
This shows that the zero solution is unique and therefore the vectors are independent. It can't happen that three vectors span a four dimensional space. To test whether $y_1$ is a linear combination of the $x_i$'s we try to solve the equation with augmented matrix
\[
\left[\begin{array}{ccc|c} 1 & 1 & 1 & 2 \\ 2 & 1 & 0 & 4 \\ 0 & -1 & 1 & -3 \\ 2 & 1 & 0 & 4 \end{array}\right].
\]
Row reduction gives
\[
\left[\begin{array}{ccc|c} 1 & 1 & 1 & 2 \\ 0 & -1 & -2 & 0 \\ 0 & 0 & 3 & -3 \\ 0 & 0 & 0 & 0 \end{array}\right].
\]
This system does have a solution, therefore the vector $y_1$ is a linear combination of the $x_i$'s.

Solution 3.22 Three vectors in the plane cannot be linearly independent. There are two ways to see that $y$ is a linear combination of $a_1$ and $a_2$:

• $a_1$ and $a_2$ are linearly independent because
\[
\det \begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix} = -5 \neq 0.
\]
Therefore $\{a_1, a_2\}$ form a basis of $\mathbb{R}^2$, and every vector is a linear combination of $a_1$ and $a_2$.

• Solve the system $x_1 a_1 + x_2 a_2 = y$, i.e.
\[
x_1 \begin{bmatrix} 1 \\ 2 \end{bmatrix} + x_2 \begin{bmatrix} 3 \\ 1 \end{bmatrix} = \begin{bmatrix} -15 \\ 5 \end{bmatrix}.
\]
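This system yields the augmented matrix
\[
\left[\begin{array}{cc|c} 1 & 3 & -15 \\ 2 & 1 & 5 \end{array}\right]
\rightarrow (2,:) = (2,:) - 2(1,:) \rightarrow
\left[\begin{array}{cc|c} 1 & 3 & -15 \\ 0 & -5 & 35 \end{array}\right]
\rightarrow (2,:) = -\tfrac{1}{5}(2,:) \rightarrow
\left[\begin{array}{cc|c} 1 & 3 & -15 \\ 0 & 1 & -7 \end{array}\right],
\]
which gives $x_2 = -7$ and $x_1 = 6$.

Solution 3.23 The vectors $a_1$, $a_2$, and $a_3$ are linearly independent if the only solution to the system $x_1 a_1 + x_2 a_2 + x_3 a_3 = 0$ is $x_1 = x_2 = x_3 = 0$. So,
\[
\left[\begin{array}{ccc|c} 1 & 0 & 10 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 4 & -5 & 0 \\ 0 & -3 & 0 & 0 \end{array}\right]
\rightarrow
\begin{array}{l} (2,:) = (2,:) - (1,:) \\ (4,:) = -\tfrac{1}{3}(4,:) \end{array}
\rightarrow
\left[\begin{array}{ccc|c} 1 & 0 & 10 & 0 \\ 0 & 0 & -10 & 0 \\ 0 & 4 & -5 & 0 \\ 0 & 1 & 0 & 0 \end{array}\right]
\]
\[
\rightarrow
\begin{array}{l} (2,:) = -\tfrac{1}{10}(2,:) \\ (3,:) \leftrightarrow (4,:) \end{array}
\rightarrow
\left[\begin{array}{ccc|c} 1 & 0 & 10 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 4 & -5 & 0 \end{array}\right]
\rightarrow
\begin{array}{l} (4,:) = (4,:) - 4(3,:) + 5(2,:) \\ (2,:) \leftrightarrow (3,:) \end{array}
\rightarrow
\left[\begin{array}{ccc|c} 1 & 0 & 10 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]
\]
The system has a unique solution $x_1 = x_2 = x_3 = 0$, hence the three vectors are linearly independent. To find if $y$ can be written as a linear combination of the three vectors, we look for a solution of the system
\[
\left[\begin{array}{ccc|c} 1 & 0 & 10 & 11 \\ 1 & 0 & 0 & 1 \\ 0 & 4 & -5 & -1 \\ 0 & -3 & 0 & 10 \end{array}\right]
\rightarrow
\begin{array}{l} (2,:) = (2,:) - (1,:) \\ (3,:) = (3,:) + (4,:) \end{array}
\rightarrow
\left[\begin{array}{ccc|c} 1 & 0 & 10 & 11 \\ 0 & 0 & -10 & -10 \\ 0 & 1 & -5 & 9 \\ 0 & -3 & 0 & 10 \end{array}\right]
\]
\[
\rightarrow
\begin{array}{l} (2,:) = -\tfrac{1}{10}(2,:) \\ (4,:) = (4,:) + 3(3,:) \end{array}
\rightarrow
\left[\begin{array}{ccc|c} 1 & 0 & 10 & 11 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & -5 & 9 \\ 0 & 0 & -15 & 37 \end{array}\right]
\]
The second row of the reduced matrix implies that $x_3 = 1$, while the fourth row yields $x_3 = -\frac{37}{15}$. This contradiction means that the system has no solution, and $y$ is therefore not a linear combination of $a_1$, $a_2$, and $a_3$.

Solution 3.24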
    plot3([1,0],[2,0],[3,0])
    hold on
    plot3([-1,0],[2,0],[1,0])
    plot3([4,0],[1,0],[-1,0])

In the figure window, if you go to Tools → Rotate 3D, you will see right away that the three vectors do not share the same plane, therefore they must be linearly independent.

Solution 3.25 We apply Kirchhoff's junction rule
\[
I_1 + I_2 - I_3 = 0
\]
and Kirchhoff's loop rules, moving around first the left loop and then the right loop in a clockwise direction:
\[
\begin{aligned} -2I_1 + 5I_2 + 6 &= 0 \\ -5 - 6I_3 - 5I_2 &= 0 \end{aligned}
\]
Writing these three equations in an augmented matrix with the unknowns ordered $I_1$, $I_2$ and $I_3$ gives
\[
\left[\begin{array}{ccc|c} 1 & 1 & -1 & 0 \\ -2 & 5 & 0 & -6 \\ 0 & -5 & -6 & 5 \end{array}\right]
\]
which is solved to give $I_1 = 41/52$, $I_2 = -23/26$ and $I_3 = -5/52$ with units of Amperes. The voltage drops are $I_1 R_1 = 41/26$, $I_2 R_2 = 115/26$ and $I_3 R_3 = 15/26$ across the three resistors, with drops in the direction of positive currents. Note that this problem is more easily solved using two loop currents (try it and make sure you get the same solution).

Solution 3.26 We write voltage drops around the two loops in the direction of the loop currents:
\[
\begin{aligned} 4i_1 + (i_1 - i_2) + 5 &= 0 \quad\text{or}\quad 5i_1 - i_2 = -5 \\ (i_2 - i_1) + 6i_2 - 3 &= 0 \quad\text{or}\quad -i_1 + 7i_2 = 3 \end{aligned}
\]
which is solved to give $i_2 = 5/17$ and $i_1 = -16/17$. Thus, $I_2 = i_2 - i_1 = 21/17$. Since it is positive, this current flows to the right.

Solution 3.27 We use loop currents to solve the network depicted in Figure 3.16. Kirchhoff's second law gives that $i_2 = -I$, and the first law applied to the closed loop on the left gives $R_1 i_1 + R_2(i_1 - i_2) = V$. As $R_1 = 3\,[\Omega]$, $R_2 = 1\,[\Omega]$, $R_3 = 4\,[\Omega]$, $V = 26\,[V]$, and $I = 2\,[A]$, we get
\[
i_2 = -2\,[A] \tag{3.2}
\]
\[
3i_1 + i_1 + 2 = 26 \tag{3.3}
\]
Hence $i_1 = 6\,[A]$, and from the sign we see that $i_2$ flows in the direction opposite to the one initially supposed.

a) Voltage drop $= R_3 i_2 = -8\,[V]$. The minus sign indicates that the potential is higher at the left of $R_3$ than on the right.
b) The flow through $R_2$ is $i_1 - i_2 = 8\,[A]$.
c) Voltage drop $= R_1 i_1 = 18\,[V]$.

Figure 3.16: Problem 3.27.

Solution 3.28 Again, we use loop currents to solve the problem depicted in Figure 3.17.
\[
\begin{aligned} R_1 i_1 + R_2(i_1 - i_2) &= V \\ R_3 i_2 + R_2(i_2 - i_1) &= 0, \end{aligned}
\]
with $R_1 = 4\,[\Omega]$, $R_2 = 1\,[\Omega]$, $R_3 = 2\,[\Omega]$. The flow through $R_3$ is $i_2$, which is $1.5\,[A]$. The system becomes
\[
\begin{aligned} 4i_1 + (i_1 - 1.5) &= V \\ 3 + (1.5 - i_1) &= 0 \end{aligned}
\]
that is,
\[
\begin{aligned} 5i_1 - V &= \tfrac{3}{2} \\ i_1 &= \tfrac{9}{2} \end{aligned}
\]
Therefore $V = 21\,[V]$, and $i_1 = 4.5\,[A]$.

Figure 3.17: Problem 3.28.
a) Voltage drop $= R_3 i_2 = 3\,[V]$.
b) Current flow $= i_1 - i_2 = 3\,[A]$.
c) Voltage drop $= V = 21\,[V]$.

Solution 3.29 As before:
\[
\begin{aligned} R_1(i_1 - i_2) &= V \\ R_2 i_2 + R_3 i_2 + R_1(i_2 - i_1) &= 0 \end{aligned}
\]
with $R_1 = 4\,[\Omega]$, $R_2 = 2\,[\Omega]$, $R_3 = 10\,[\Omega]$, $V = 60\,[V]$. Hence, the system becomes
\[
\begin{aligned} 4(i_1 - i_2) &= 60 \\ 2i_2 + 10i_2 + 4(i_2 - i_1) &= 0 \end{aligned}
\]
that is,
\[
\begin{aligned} i_1 - i_2 &= 15 \\ 4i_2 - i_1 &= 0. \end{aligned}
\]
Solving the system we get $i_1 = 20\,[A]$, $i_2 = 5\,[A]$.

a) Voltage drop $= R_2 i_2 = 10\,[V]$.
b) Current flow through $R_1 = i_1 - i_2 = 15\,[A]$.
c) Current flow through $R_3 = i_2 = 5\,[A]$.

Figure 3.18: Problem 3.29.

Solution 3.30 Using Kirchhoff's laws we get the following equations:
\[
\begin{cases} R_1 i_1 + R_2(i_1 - i_2) = E_1 \\ R_3 i_2 + R_2(i_2 - i_1) = -E_2 \\ R_4 i_3 = E_2 \end{cases}
\]
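Plugging in the values of $R_1$, $R_2$, $R_3$, $R_4$, $E_1$, and $E_2$, we get
\[
\begin{cases} i_1 + 3(i_1 - i_2) = 10 \\ 5i_2 + 3(i_2 - i_1) = -4 \\ 2i_3 = 4 \end{cases}
\]
We have that $i_3 = 2$. To find $i_1$ and $i_2$ we solve the augmented system
\[
\left[\begin{array}{cc|c} 4 & -3 & 10 \\ -3 & 8 & -4 \end{array}\right]
\rightarrow (1,:) = (1,:) + (2,:) \rightarrow
\left[\begin{array}{cc|c} 1 & 5 & 6 \\ -3 & 8 & -4 \end{array}\right]
\rightarrow (2,:) = (2,:) + 3(1,:) \rightarrow
\left[\begin{array}{cc|c} 1 & 5 & 6 \\ 0 & 23 & 14 \end{array}\right].
\]
So $i_1 = \frac{68}{23}\,[A]$, $i_2 = \frac{14}{23}\,[A]$, $i_3 = 2\,[A]$.

Solution 3.31 From Kirchhoff's first law:
\[
I + i_3 = i_2
\]
From Kirchhoff's second law:
\[
\begin{aligned} R_1 i_1 + R_2(i_1 - i_2) &= V \\ E + R_2(i_2 - i_1) &= 0 \\ R_4 i_3 + R_3 i_3 &= E \end{aligned}
\]
Substituting in the actual values, we get
\[
\begin{cases} i_1 + (i_1 - i_2) = 25 \\ i_2 - i_3 = 3 \\ E + 2(i_2 - i_1) = 0 \\ 2i_3 = E \end{cases}
\]
The system, with respect to $i_1$, $i_2$, $i_3$, and $E$, yields the following augmented matrix:
\[
\left[\begin{array}{cccc|c} 2 & -1 & 0 & 0 & 25 \\ 0 & 1 & -1 & 0 & 3 \\ -2 & 2 & 0 & 1 & 0 \\ 0 & 0 & 2 & -1 & 0 \end{array}\right]
\rightarrow (3,:) = (3,:) + (1,:) \rightarrow
\left[\begin{array}{cccc|c} 2 & -1 & 0 & 0 & 25 \\ 0 & 1 & -1 & 0 & 3 \\ 0 & 1 & 0 & 1 & 0 \\ 0 & 0 & 2 & -1 & 0 \end{array}\right]
\]
\[
\rightarrow (3,:) = (3,:) - (2,:) \rightarrow
\left[\begin{array}{cccc|c} 2 & -1 & 0 & 0 & 25 \\ 0 & 1 & -1 & 0 & 3 \\ 0 & 0 & 1 & 1 & -3 \\ 0 & 0 & 2 & -1 & 0 \end{array}\right]
\rightarrow (4,:) = (4,:) - 2(3,:) \rightarrow
\left[\begin{array}{cccc|c} 2 & -1 & 0 & 0 & 25 \\ 0 & 1 & -1 & 0 & 3 \\ 0 & 0 & 1 & 1 & -3 \\ 0 & 0 & 0 & -3 & 6 \end{array}\right]
\]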
a) Then $i_1 = \frac{27}{2}\,[A]$, $i_2 = 2\,[A]$, $i_3 = -1\,[A]$, $E = -2\,[V]$.
b) Voltage drop: $R_2(i_1 - i_2) = 23\,[V]$.
c) Current flow through $V$: $i_1 = \frac{27}{2}\,[A]$.
d) Voltage drop through $I$: $E = -2\,[V]$.

Solution 3.32 From Kirchhoff's second law:
\[
\begin{aligned} I_1 &= i_1 \\ I_2 &= -i_2 \end{aligned}
\]
From Kirchhoff's first law:
\[
\begin{aligned} R_1 i_1 + R_2(i_1 - i_2) &= E_1 \\ V + R_3 i_2 + E_2 + R_2(i_2 - i_1) &= 0, \end{aligned}
\]
which becomes
\[
\begin{cases} i_1 = I_1 \\ i_2 = -3 \\ 7i_1 - 5i_2 = E_1 \\ 10 + 8i_2 - 5i_1 + 5 = 0 \end{cases}
\]
We seek $i_1$ and $E_1$:
\[
\begin{aligned} 7i_1 - E_1 &= -15 \\ -5i_1 &= 9. \end{aligned}
\]
Therefore we have that $i_1 = -\frac{9}{5}\,[A]$, $E_1 = \frac{12}{5}\,[V]$.

Figure 3.19: Problem 3.32.

a) Current flow through $R_2$: $i_1 - i_2 = \frac{6}{5}\,[A]$.
b) Voltage drop through $I_1$: $E_1 = \frac{12}{5}\,[V]$.
c) Current flow through $I_1$: $i_1 = -\frac{9}{5}\,[A]$.

Solution 3.33 Note that this solution follows the alternate description of resistor networks in the notes. Using the identities $V_i = I_i R_i$, the voltage equations for the loops can be written
\[
\begin{aligned}
I_1 R_1 + I_2 R_2 + E &= 0 \\
I_3 R_3 + I_4 R_4 - I_2 R_2 &= 0 \\
&\;\;\vdots \\
I_{2n-1} R_{2n-1} + I_{2n} R_{2n} - I_{2n-2} R_{2n-2} &= 0
\end{aligned}
\]
The current equations for the nodes are
\[
\begin{aligned}
I_1 - I_3 - I_2 &= 0 \\
I_3 - I_5 - I_4 &= 0 \\
&\;\;\vdots \\
I_{2n-1} - I_{2n} &= 0
\end{aligned}
\]

Solution 3.34 We have $n = 6$, $\sum x_i = 15$, $\sum x_i^2 = 43$, $\sum y_i = 18$, $\sum x_i y_i = 52$, so $a = 14/11 \approx 1.27$ and $b = -2/11 \approx -0.18$.

Solution 3.35 We would want to minimize the quadratic function
\[
\begin{aligned}
f(a, b, c) &= \sum_i (a x_i^2 + b x_i + c - y_i)^2 \\
&= \sum_i \left( a^2 x_i^4 + b^2 x_i^2 + c^2 + y_i^2 + 2ab x_i^3 + 2ac x_i^2 - 2a x_i^2 y_i + 2bc x_i - 2b x_i y_i - 2c y_i \right) \\
&= \Big(\sum x_i^4\Big) a^2 + \Big(\sum x_i^2\Big) b^2 + n c^2 + 2\Big(\sum x_i^3\Big) ab + 2\Big(\sum x_i^2\Big) ac + 2\Big(\sum x_i\Big) bc \\
&\quad - 2\Big(\sum x_i^2 y_i\Big) a - 2\Big(\sum x_i y_i\Big) b - 2\Big(\sum y_i\Big) c + \sum y_i^2
\end{aligned}
\]
The corresponding system of equations, with the unknowns ordered $a$, $b$, $c$, is
\[
\left[\begin{array}{ccc|c}
\sum x_i^4 & \sum x_i^3 & \sum x_i^2 & \sum x_i^2 y_i \\
\sum x_i^3 & \sum x_i^2 & \sum x_i & \sum x_i y_i \\
\sum x_i^2 & \sum x_i & n & \sum y_i
\end{array}\right]
\]
(I've divided each equation by two.)
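The system in Solution 3.35 can be tried out numerically. The sketch below builds the coefficient matrix and right-hand side from sample data and solves for $a$, $b$, $c$. The data points are hypothetical (chosen so that $y = x^2 + 1$ exactly) and the names `M`, `r`, `abc` are illustrative, not from the notes. The built-in `polyfit` is used only as a cross-check.

```matlab
% Hypothetical sample data, chosen so that y = x.^2 + 1 exactly
x = [0 1 2 3 4];
y = [1 2 5 10 17];
n = length(x);

% Coefficient matrix and right-hand side of the system in Solution 3.35
M = [sum(x.^4) sum(x.^3) sum(x.^2);
     sum(x.^3) sum(x.^2) sum(x);
     sum(x.^2) sum(x)    n];
r = [sum(x.^2 .* y); sum(x .* y); sum(y)];

abc = M \ r            % column [a; b; c] of the fitted parabola
p = polyfit(x, y, 2)   % built-in quadratic fit, row [a b c], for comparison
```

Because the sample points lie exactly on a parabola, both results should reproduce its coefficients up to rounding.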
Solution 3.36 The matrix is
\[
\left[\begin{array}{cccccccc|c}
k_1 + k_2 & -k_2 & 0 & 0 & \cdots & 0 & 0 & 0 & m_1 g + k_1 l_1 - k_2 l_2 \\
-k_2 & k_2 + k_3 & -k_3 & 0 & \cdots & 0 & 0 & 0 & m_2 g + k_2 l_2 - k_3 l_3 \\
0 & -k_3 & k_3 + k_4 & -k_4 & \cdots & 0 & 0 & 0 & m_3 g + k_3 l_3 - k_4 l_4 \\
0 & 0 & -k_4 & k_4 + k_5 & \cdots & 0 & 0 & 0 & m_4 g + k_4 l_4 - k_5 l_5 \\
0 & 0 & 0 & -k_5 & \cdots & 0 & 0 & 0 & m_5 g + k_5 l_5 - k_6 l_6 \\
\vdots & \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & -k_{n-1} & k_{n-1} + k_n & -k_n & m_{n-1} g + k_{n-1} l_{n-1} - k_n l_n \\
0 & 0 & 0 & 0 & \cdots & 0 & -k_n & k_n & m_n g + k_n l_n
\end{array}\right]
\]

Solution 3.37 The system of equations would be
\[
\left[\begin{array}{ccccc|c}
2 & -1 & 0 & 0 & 0 & g \\
-1 & 2 & -1 & 0 & 0 & 2g \\
0 & -1 & 2 & -1 & 0 & 3g \\
0 & 0 & -1 & 2 & -1 & 4g \\
0 & 0 & 0 & -1 & 1 & 5g + 1
\end{array}\right]
\]
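As an illustration, the tridiagonal system of Solution 3.37 can be assembled with `diag` and solved numerically. This is only a sketch: the value `g = 9.8` is an assumption made here (the solution leaves $g$ symbolic), and the names `K`, `b`, `x` are illustrative.

```matlab
g = 9.8;    % assumed numerical value of g, not given in the solution
n = 5;

% 2 on the diagonal, -1 on the first super- and sub-diagonals
K = 2*eye(n) - diag(ones(n-1, 1), 1) - diag(ones(n-1, 1), -1);
K(n, n) = 1;            % the last diagonal entry is 1 in Solution 3.37

% right-hand side g, 2g, 3g, 4g, 5g + 1
b = g*(1:n)';
b(n) = b(n) + 1;

x = K \ b               % the five unknowns of the system
```

The same `diag` construction extends to the general matrix of Solution 3.36 once values for the $k_i$, $l_i$ and $m_i$ are chosen.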
Chapter 4

Matrices and Determinants

4.1 Matrix operations

A matrix is a rectangular array of numbers. Here is an example of an $m \times n$ matrix.
\[
A = \begin{bmatrix}
a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\
a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\
\vdots & \vdots & & \vdots \\
a_{m,1} & a_{m,2} & \cdots & a_{m,n}
\end{bmatrix}
\]
This is sometimes abbreviated $A = [a_{i,j}]$. An $m \times 1$ matrix is called a column vector and a $1 \times n$ matrix is called a row vector. (The convention is that $m \times n$ means $m$ rows and $n$ columns.)

Addition and scalar multiplication are defined for matrices exactly as for vectors. If $s$ is a number,
\[
s \begin{bmatrix}
a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\
a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\
\vdots & \vdots & & \vdots \\
a_{m,1} & a_{m,2} & \cdots & a_{m,n}
\end{bmatrix}
=
\begin{bmatrix}
s a_{1,1} & s a_{1,2} & \cdots & s a_{1,n} \\
s a_{2,1} & s a_{2,2} & \cdots & s a_{2,n} \\
\vdots & \vdots & & \vdots \\
s a_{m,1} & s a_{m,2} & \cdots & s a_{m,n}
\end{bmatrix},
\]
and
\[
\begin{bmatrix}
a_{1,1} & \cdots & a_{1,n} \\
a_{2,1} & \cdots & a_{2,n} \\
\vdots & & \vdots \\
a_{m,1} & \cdots & a_{m,n}
\end{bmatrix}
+
\begin{bmatrix}
b_{1,1} & \cdots & b_{1,n} \\
b_{2,1} & \cdots & b_{2,n} \\
\vdots & & \vdots \\
b_{m,1} & \cdots & b_{m,n}
\end{bmatrix}
=
\begin{bmatrix}
a_{1,1} + b_{1,1} & \cdots & a_{1,n} + b_{1,n} \\
a_{2,1} + b_{2,1} & \cdots & a_{2,n} + b_{2,n} \\
\vdots & & \vdots \\
a_{m,1} + b_{m,1} & \cdots & a_{m,n} + b_{m,n}
\end{bmatrix}
\]
The product of an $m \times n$ matrix $A = [a_{i,j}]$ with an $n \times p$ matrix $B = [b_{i,j}]$ is an $m \times p$ matrix $C = [c_{i,j}]$ whose entries are defined by
\[
c_{i,j} = \sum_{k=1}^{n} a_{i,k} b_{k,j}.
\]
An easy way to remember this is to chop the matrix $A$ into $m$ row vectors of length $n$ and to chop $B$ into $p$ column vectors also of length $n$, as in the following diagram. The $i,j$th entry of the product is then the dot product $A_i \cdot B_j$. This is shown schematically in Figure 4.1.

Figure 4.1: Schematic of matrix multiplication as the inner product of rows and columns of the product matrices.

It is important to notice that the matrix product $AB$ only makes sense if the number of columns of $A$ equals the number of rows of $B$. So $A^2 = AA$ only makes sense for a square matrix.

Here is an example
\[
\begin{bmatrix} 1 & 0 & 1 & 2 \\ 1 & 1 & 1 & 4 \\ 0 & 0 & 1 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 2 \\ 3 & 4 \\ 5 & 6 \\ 7 & 8 \end{bmatrix}
=
\begin{bmatrix}
1 \times 1 + 0 \times 3 + 1 \times 5 + 2 \times 7 & 1 \times 2 + 0 \times 4 + 1 \times 6 + 2 \times 8 \\
1 \times 1 + 1 \times 3 + 1 \times 5 + 4 \times 7 & 1 \times 2 + 1 \times 4 + 1 \times 6 + 4 \times 8 \\
0 \times 1 + 0 \times 3 + 1 \times 5 + 1 \times 7 & 0 \times 2 + 0 \times 4 + 1 \times 6 + 1 \times 8
\end{bmatrix}
=
\begin{bmatrix} 20 & 24 \\ 37 & 44 \\ 12 & 14 \end{bmatrix}
\]
Notice that if $A$ is an $m \times n$ matrix, and
\[
x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_n \end{bmatrix}
\quad\text{and}\quad
b = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \\ \vdots \\ b_m \end{bmatrix},
\]
then the equation $Ax = b$ is a short way of writing the system of linear equations corresponding to the augmented matrix $[A|b]$.
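The row-times-column rule can be sketched in MATLAB with two nested loops, using the matrices from the example above. The loop is written out only to mirror the definition; in practice one would simply type `A*B`. The variable names are illustrative.

```matlab
A = [1 0 1 2; 1 1 1 4; 0 0 1 1];   % 3 x 4
B = [1 2; 3 4; 5 6; 7 8];          % 4 x 2
[m, n] = size(A);
p = size(B, 2);

C = zeros(m, p);
for i = 1:m
    for j = 1:p
        % entry (i,j) is the dot product of row i of A with column j of B
        C(i, j) = sum(A(i, :) .* B(:, j)');
    end
end
C                  % the 3 x 2 product computed entry by entry
isequal(C, A*B)    % compares with the built-in product
```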
We will see shortly why matrix multiplication is defined the way it is. For now, you should be aware of some important properties that don't hold for matrix multiplication, even though they are true for multiplication of numbers. First of all, in general, $AB$ is not equal to $BA$, even when both products are defined and have the same size. For example, if
\[
A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}
\quad\text{and}\quad
B = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}
\]
then
\[
AB = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\quad\text{but}\quad
BA = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}.
\]
This example also shows that two non-zero matrices can be multiplied together to give the zero matrix.

Here is a list of properties that do hold for matrix multiplication.

1. $A + B = B + A$
2. $A + (B + C) = (A + B) + C$
3. $s(A + B) = sA + sB$
4. $(s + t)A = sA + tA$
5. $(st)A = s(tA)$
6. $1A = A$
7. $A + 0 = A$ (here $0$ is the matrix with all entries zero)
8. $A - A = A + (-1)A = 0$
9. $A(B + C) = AB + AC$
10. $(A + B)C = AC + BC$
11. $A(BC) = (AB)C$
12. $s(AB) = (sA)B = A(sB)$
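The non-commuting example and two of the listed properties can be checked directly in MATLAB. In the sketch below, `A` and `B` are the matrices from the example above, while `P`, `Q`, `R` are hypothetical integer matrices picked only for a spot check (a check on one example is of course not a proof).

```matlab
A = [0 1; 0 0];
B = [1 0; 0 0];
AB = A*B      % the zero matrix
BA = B*A      % not the zero matrix, so AB and BA differ

% spot check of properties 9 and 11 on hypothetical matrices
P = [1 2; 3 4];
Q = [0 1; 1 0];
R = [2 0; 1 3];
isequal(P*(Q + R), P*Q + P*R)   % distributive law
isequal(P*(Q*R), (P*Q)*R)       % associative law
```

Integer entries are used so that the comparisons with `isequal` are exact; with general floating point entries the two sides may differ by rounding.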
4.1.1 MATLAB

Multiplication of matrices can be done using the * operator just as for multiplication of scalars. An error results if the matrices are not of compatible size. Powers of matrices can be found using the ^ operator, as for scalars. The MATLAB command

    A^4

produces the same result as

    A*A*A*A

where A is a previously defined, square matrix. Using these commands might be helpful in working out the details of Problem 4.4 below, although you should work out the first few matrix powers by hand for practice. Taking high powers of a matrix will also be helpful in understanding the long time behaviour of random walks described in Section 4.3.

4.1.2 Problems

Problem 4.1 Define
\[
A = \begin{bmatrix} 1 & 2 & 3 \\ 1 & 2 & 1 \end{bmatrix}
\quad
B = \begin{bmatrix} -1 & 2 \\ -3 & 1 \\ -2 & 1 \end{bmatrix}
\quad
C = \begin{bmatrix} 2 & -2 & 0 \end{bmatrix}
\quad
D = \begin{bmatrix} 2 \\ -11 \\ 2 \end{bmatrix}
\]
Compute all products of two of these (i.e., $AB$, $AC$, etc.) that are defined.

Problem 4.2 Consider the following matrices:
\[
A = \begin{bmatrix} 3 & 0 \\ -1 & 2 \\ 1 & 1 \end{bmatrix};
\quad
B = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix};
\quad
C = \begin{bmatrix} 1 & 4 & 2 \\ 3 & 1 & 5 \end{bmatrix}.
\]
Compute all the possible products between them.

Problem 4.3 Compute $A^2 = AA$ and $A^3 = AAA$ for
\[
A = \begin{bmatrix} 0 & a & b \\ 0 & 0 & c \\ 0 & 0 & 0 \end{bmatrix}
\]
and
\[
A = \begin{bmatrix} 1 & 0 & a \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]

Problem 4.4 Let
\[
A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}
\]
(a) Find $A^2$, $A^3$ and $A^4$.
(b) Find $A^k$ for all positive integers $k$.
(c) Find $e^{tA}$ (part of the problem is to invent a reasonable definition!)
(d) Find a square root of $A$ (i.e., a matrix $B$ with $B^2 = A$).
(e) Find all square roots of $A$.

Problem 4.5 Compute $A^k$ for $k = 2, 3, 4$ when
\[
A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\]

4.2 Linear Transformations and Matrices

4.2.1 Linear Transformations

Recall that a function $f$ is a rule that takes an input value $x$ and produces an output value $y = f(x)$. Functions are sometimes called transformations or maps (since they transform, or map, the input value to the output value). In calculus, you have mostly dealt with functions whose input values and output values are real numbers. However, it is also useful to consider functions whose input values and output values are vectors.

We have already encountered this idea when we discussed quadratic functions. A quadratic function such as $f(x_1, x_2, x_3) = x_1^2 + x_2^2 + x_3^2$ can be considered as a transformation (or map) whose input is the vector $x = [x_1, x_2, x_3]$ and whose output is the number $y = x_1^2 + x_2^2 + x_3^2$. In this case we could write $f(x_1, x_2, x_3)$ as $f(x)$.

An example of a transformation whose inputs and outputs are both vectors in two dimensions is rotation by some angle, say 45°. If $x$ is the input vector, then the output vector $R(x)$ is the vector you get by rotating $x$ by 45° in the counter-clockwise direction as shown in Figure 4.2 (left).

A transformation $T$ is called linear if for any two input vectors $x$ and $y$ and any two numbers $s$ and $t$,
\[
T(sx + ty) = sT(x) + tT(y) \tag{4.1}
\]
This condition is saying that when we scalar multiply and add two vectors, it doesn't matter whether we (i) do scalar multiplication and addition first and then apply a linear transformation, or (ii) do a linear transformation first and then do scalar multiplication and addition.
In both cases we get the same answer. The linearity condition (4.1) is equivalent to the following two conditions:

(i) For any two vectors $x$ and $y$, $T(x + y) = T(x) + T(y)$.
(ii) For any vector $x$ and any scalar $s$, $T(sx) = sT(x)$.

Notice that the quadratic function $f$ above is not a linear transformation, since
\[
f(2x) = (2x_1)^2 + (2x_2)^2 + (2x_3)^2 = 4(x_1^2 + x_2^2 + x_3^2) = 4f(x).
\]
So $f(2x)$ is not equal to $2f(x)$, as would need to be true if $f$ were linear.

However, rotation by 45° is a linear transformation. The picture in Figure 4.2 (right) demonstrates that condition (i) holds.

Figure 4.2: Rotation of a vector in 2D (left), graphical evidence that property (i) of linear transformations holds for rotation in 2D (right).

The most important example of a linear transformation is multiplication by a matrix. If we regard vectors as column vectors, then multiplying an $n$ dimensional vector $x$ with an $m \times n$ matrix $A$ results in an $m$ dimensional vector $y = Ax$. The linearity property (4.1) is a consequence of properties 9 and 12 of matrix multiplication listed in Section 4.1. We will see that in fact every linear transformation is of this form.

4.2.2 Rotations in two dimensions

Let us obtain a formula for the transformation that rotates a vector in two dimensions counterclockwise by $\theta$ degrees. Let $x$ be an arbitrary vector. Denote by $\mathrm{Rot}_\theta x$ the vector obtained by rotating $x$ counterclockwise by $\theta$ degrees. If the angle between $x$ and the $x$ axis is $\phi$, then the components of $x$ can be written $x = [x_1, x_2]$ with $x_1 = \|x\| \cos(\phi)$ and $x_2 = \|x\| \sin(\phi)$. This is shown graphically in Figure 4.3 (right). To obtain the vector that has been rotated by $\theta$ degrees, we simply need to add $\theta$ to $\phi$ in this representation. Thus $y = \mathrm{Rot}_\theta x = [y_1, y_2]$, where $y_1 = \|x\| \cos(\phi + \theta)$ and $y_2 = \|x\| \sin(\phi + \theta)$.

To simplify this we can use the addition formulae for sin and cos. Recall that
\[
\begin{aligned}
\cos(a + b) &= \cos(a)\cos(b) - \sin(a)\sin(b) \\
\sin(a + b) &= \cos(a)\sin(b) + \sin(a)\cos(b)
\end{aligned}
\]
Thus
\[
y_1 = \|x\| \cos(\phi + \theta) = \|x\| \left( \cos(\phi)\cos(\theta) - \sin(\phi)\sin(\theta) \right) = \cos(\theta) x_1 - \sin(\theta) x_2
\]
and
\[
y_2 = \|x\| \sin(\phi + \theta) = \|x\| \left( \sin(\phi)\cos(\theta) + \cos(\phi)\sin(\theta) \right) = \sin(\theta) x_1 + \cos(\theta) x_2
\]
Notice now that this can be written as a matrix product:
\[
\begin{bmatrix} y_1 \\ y_2 \end{bmatrix}
=
\begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
\]
The matrix
\[
\begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix},
\]
also denoted $\mathrm{Rot}_\theta$, is called a rotation matrix. What this formula is saying is that the linear transformation of rotation by $\theta$ degrees is the same as the linear transformation of multiplication by the matrix. In other words, if we want to know the co-ordinates of the vector obtained by rotating $x$ by $\theta$ degrees, we simply calculate $\mathrm{Rot}_\theta x$.

Figure 4.3: Graphical representation of a projection (left). Details of components in a rotation by an angle $\theta$ (right).

4.2.3 Projections in two dimensions

Now we consider the transformation which projects a vector $x$ in the direction of another vector $a$ as shown in Figure 4.3 (left). We already have a formula for this transformation. In the special case that $a$ has unit length, the formula is
\[
\mathrm{Proj}_a x = (x \cdot a) a.
\]
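It follows from the properties of the dot product that
\[
\begin{aligned}
\mathrm{Proj}_a(sx + ty) &= ((sx + ty) \cdot a) a \\
&= (sx \cdot a + ty \cdot a) a \\
&= s((x \cdot a) a) + t((y \cdot a) a) \\
&= s\,\mathrm{Proj}_a x + t\,\mathrm{Proj}_a y
\end{aligned}
\]
Thus $\mathrm{Proj}_a$ is a linear transformation. Let us now see that $\mathrm{Proj}_a$ is also given by multiplication by a matrix. If $a = [a_1, a_2]$, then
\[
\begin{aligned}
\mathrm{Proj}_a x &= \begin{bmatrix} (x_1 a_1 + x_2 a_2) a_1 \\ (x_1 a_1 + x_2 a_2) a_2 \end{bmatrix} \\
&= \begin{bmatrix} a_1^2 x_1 + a_1 a_2 x_2 \\ a_2 a_1 x_1 + a_2^2 x_2 \end{bmatrix} \\
&= \begin{bmatrix} a_1^2 & a_1 a_2 \\ a_2 a_1 & a_2^2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
\end{aligned}
\]
If $a$ is the unit vector making an angle of $\theta$ with the $x$ axis, then $a_1 = \cos(\theta)$ and $a_2 = \sin(\theta)$. Using half angle formulae, we have
\[
\begin{aligned}
a_1^2 &= \cos^2(\theta) = \frac{1 + \cos(2\theta)}{2} \\
a_2^2 &= \sin^2(\theta) = \frac{1 - \cos(2\theta)}{2} \\
a_1 a_2 &= \cos(\theta)\sin(\theta) = \frac{\sin(2\theta)}{2}
\end{aligned}
\]
Thus the matrix which when multiplied by $x$ produces the projection of $x$ onto the line making an angle of $\theta$ with the $x$ axis is given by
\[
\mathrm{Proj}_\theta = \frac{1}{2} \begin{bmatrix} 1 + \cos(2\theta) & \sin(2\theta) \\ \sin(2\theta) & 1 - \cos(2\theta) \end{bmatrix}
\]

4.2.4 Reflections in two dimensions

A third example of a geometric linear transformation is reflection across a line. The following figure illustrates reflection across a line making an angle $\theta$ with the $x$ axis. Let $\mathrm{Ref}_\theta x$ denote the reflected vector as shown in Figure 4.4.

We can obtain the matrix for reflection from the following observation. The vector with tail at $x$ and head at $\mathrm{Proj}_\theta x$ is $\mathrm{Proj}_\theta x - x$. If we add twice this vector to $x$, we arrive at $\mathrm{Ref}_\theta x$. Therefore
\[
\mathrm{Ref}_\theta x = x + 2(\mathrm{Proj}_\theta x - x) = 2\,\mathrm{Proj}_\theta x - x
\]
Now if $I = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$, then $Ix = x$ for any vector $x$, since
\[
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
= \begin{bmatrix} 1x_1 + 0x_2 \\ 0x_1 + 1x_2 \end{bmatrix}
= \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
\]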
$I$ is called the identity matrix.

Now we can write
\[
\mathrm{Ref}_\theta x = 2\,\mathrm{Proj}_\theta x - Ix = (2\,\mathrm{Proj}_\theta - I)x.
\]
This means that the matrix for reflections is $2\,\mathrm{Proj}_\theta - I$. Explicitly
\[
\begin{aligned}
\mathrm{Ref}_\theta &= \begin{bmatrix} 1 + \cos(2\theta) & \sin(2\theta) \\ \sin(2\theta) & 1 - \cos(2\theta) \end{bmatrix} - \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \\
&= \begin{bmatrix} \cos(2\theta) & \sin(2\theta) \\ \sin(2\theta) & -\cos(2\theta) \end{bmatrix}
\end{aligned}
\]

Figure 4.4: Graphical representation of a reflection.

4.2.5 Every linear transformation is multiplication by a matrix

We have just seen three examples of linear transformations whose action on a vector is given by multiplication by a matrix. Now we will see that for any linear transformation $T(x)$ there is a matrix $T$ such that $T(x)$ is the matrix product $Tx$.

To illustrate this suppose that $T$ is a linear transformation that takes three dimensional vectors as input. Let $e_1$, $e_2$ and $e_3$ be the standard basis vectors in three dimensions, that is
\[
e_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \quad
e_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \quad
e_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
\]
Then any vector can be written
\[
x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
= x_1 \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + x_2 \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + x_3 \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
= x_1 e_1 + x_2 e_2 + x_3 e_3
\]
Now, using the linearity property of the linear transformation $T$, we obtain
\[
T(x) = T(x_1 e_1 + x_2 e_2 + x_3 e_3) = x_1 T(e_1) + x_2 T(e_2) + x_3 T(e_3)
\]
Now take the three vectors $T(e_1)$, $T(e_2)$ and $T(e_3)$ and put them in the columns of a matrix which we'll also call $T$. Then
\[
Tx = \begin{bmatrix} T(e_1) & T(e_2) & T(e_3) \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
= T(e_1) x_1 + T(e_2) x_2 + T(e_3) x_3 = T(x)
\]
In other words, the action of the transformation $T$ on a vector $x$ is the same as multiplying $x$ by the matrix $T = \begin{bmatrix} T(e_1) & T(e_2) & T(e_3) \end{bmatrix}$.

The same idea works in any dimension. To find the matrix of a linear transformation $T(x)$ we take the standard basis vectors $e_1, e_2, \ldots, e_n$ (where $e_k$ has zeros everywhere except for a 1 in the $k$th spot) and calculate the action of the linear transformation on each one. We then take the transformed vectors $T(e_1), T(e_2), \ldots, T(e_n)$ and put them into the columns of a matrix $T = \begin{bmatrix} T(e_1) & T(e_2) & \cdots & T(e_n) \end{bmatrix}$. This matrix $T$ then reproduces the action of the linear transformation, that is, $T(x) = Tx$.

To see how this works in practice, let's recalculate the matrix for rotations in two dimensions. Under a rotation angle of $\theta$, the vector $e_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ gets transformed to $T(e_1) = \begin{bmatrix} \cos(\theta) \\ \sin(\theta) \end{bmatrix}$ while the vector $e_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$ gets transformed to $T(e_2) = \begin{bmatrix} -\sin(\theta) \\ \cos(\theta) \end{bmatrix}$. This is shown graphically in Figure 4.5.

Figure 4.5: Derivation of the matrix representing 2D rotation by rotating coordinate directions.

According to our prescription, we must now put these two vectors into the columns of a matrix. This gives the matrix
\[
T = \begin{bmatrix} \cos(\theta) & -\sin(\theta) \\ \sin(\theta) & \cos(\theta) \end{bmatrix}
\]
which is exactly the same as $\mathrm{Rot}_\theta$.
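The prescription above (put the images of the standard basis vectors into the columns) can be sketched in MATLAB for the rotation example. The angle `theta` and the test vector `x` are arbitrary choices made here. Note that MATLAB's `cos` and `sin` expect radians. The last lines also form the projection and reflection matrices from Sections 4.2.3 and 4.2.4 and check two facts that follow from their geometric meaning: projecting twice is the same as projecting once, and reflecting twice returns the original vector.

```matlab
theta = pi/6;          % hypothetical angle, in radians

% images of the standard basis vectors e1, e2 under rotation by theta
Te1 = [cos(theta); sin(theta)];
Te2 = [-sin(theta); cos(theta)];
T = [Te1 Te2];         % columns are T(e1) and T(e2)

x = [2; -1];           % arbitrary test vector
y = T*x;
norm(y - (x(1)*Te1 + x(2)*Te2))   % T(x) = x1*T(e1) + x2*T(e2), zero up to rounding
norm(y) - norm(x)                 % a rotation preserves length, zero up to rounding

% projection onto, and reflection across, the line at angle theta
Proj = 0.5*[1 + cos(2*theta), sin(2*theta); sin(2*theta), 1 - cos(2*theta)];
Ref = 2*Proj - eye(2);
norm(Proj*Proj - Proj)            % zero up to rounding
norm(Ref*Ref - eye(2))            % zero up to rounding
```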
4.2.6 Composition of linear transformations and matrix product

Suppose we apply one linear transformation $T$ and then follow it by another linear transformation $S$. For example, think of first rotating a vector and then reflecting it across a line. Then $S(T(x))$ is again a linear transformation, since
\[
S(T(sx + ty)) = S(sT(x) + tT(y)) = sS(T(x)) + tS(T(y)).
\]
What is the matrix for the composition $S(T(x))$? We know that there are matrices $S$ and $T$ that reproduce the action of $S(x)$ and $T(x)$. So $T(x)$ is the matrix product $Tx$ and $S(T(x))$ is the matrix product $S(Tx)$ (here the parentheses just indicate in which order we are doing the matrix product). But matrix multiplication is associative. So $S(Tx) = (ST)x$. In other words the matrix for the composition of $S(x)$ and $T(x)$ is simply the matrix product of the corresponding matrices $S$ and $T$.

For example, to compute the matrix for the transformation of rotation by 45° followed by reflection about the line making an angle of 30° with the $x$ axis we simply compute the product
\[
\begin{aligned}
\mathrm{Ref}_{30^\circ} \mathrm{Rot}_{45^\circ}
&= \begin{bmatrix} \cos(60^\circ) & \sin(60^\circ) \\ \sin(60^\circ) & -\cos(60^\circ) \end{bmatrix}
\begin{bmatrix} \cos(45^\circ) & -\sin(45^\circ) \\ \sin(45^\circ) & \cos(45^\circ) \end{bmatrix} \\
&= \begin{bmatrix} \frac{1}{2} & \frac{\sqrt{3}}{2} \\ \frac{\sqrt{3}}{2} & -\frac{1}{2} \end{bmatrix}
\begin{bmatrix} \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \end{bmatrix} \\
&= \begin{bmatrix} \frac{\sqrt{2} + \sqrt{6}}{4} & \frac{-\sqrt{2} + \sqrt{6}}{4} \\ \frac{\sqrt{6} - \sqrt{2}}{4} & \frac{-\sqrt{6} - \sqrt{2}}{4} \end{bmatrix}
\end{aligned}
\]

4.2.7 Problems

Problem 4.6 Let $a$ be a fixed nonzero vector. Show that the transformation $T(x) = x + a$ is not a linear transformation.

Problem 4.7 Let $a$ be a fixed vector. Show that the transformation $T(x) = a \cdot x$ is a linear transformation (whose output values are numbers).

Problem 4.8 Find the matrices which project on the lines
(a) $x_1 = x_2$
(b) $3x_1 + 4x_2 = 0$

Problem 4.9 Find the matrices that reflect about the lines
(a) $x_1 = x_2$
(b) $3x_1 + 4x_2 = 0$
Problem 4.10 Find the matrices which rotate about the origin in two dimensions by
(a) $\pi/4$
(b) $\pi/2$
(c) $\pi$

Problem 4.11 Find the matrix which first reflects about the line making an angle of $\phi$ with the $x$ axis, and then reflects about the line making an angle of $\theta$ with the $x$ axis. Give another geometric interpretation of this matrix.

Problem 4.12 Let $f : \mathbb{R}^2 \to \mathbb{R}^2$ be the linear transformation that reflects points across the line $x = 2y$.

1. What is the image of $(1, 10)$, that is, the vector $f\left(\begin{bmatrix} 1 \\ 10 \end{bmatrix}\right)$?
2. Write down the matrix of $f$.

Problem 4.13 Let $g : \mathbb{R}^2 \to \mathbb{R}^2$ be the linear transformation that first reflects points across the line $x = -y$ and then rotates points $\pi/2$ radians. Write down the matrix of $g$.

Problem 4.14 Find the matrix that rotates about the $z$ axis by an angle of $\theta$ in three dimensions.

Problem 4.15 Let $T : \mathbb{R}^4 \to \mathbb{R}^3$ be the map defined by
\[
T(x_1, x_2, x_3, x_4) = (x_1 + 4x_2 + 5x_3,\; 3x_1 - 2x_2 + x_3 - x_4,\; -x_1 - x_3 + x_4)
\]
Show that $T$ is a linear transformation.

Problem 4.16 Let $T : \mathbb{R}^2 \to \mathbb{R}^3$ be given by $T(x) = Ax$, where
\[
A = \begin{bmatrix} 1 & 2 \\ 0 & 1 \\ 1 & 1 \end{bmatrix}.
\]
Determine whether each given vector is in the range of $T$. Recall that the range of $T$ is every vector that can "come out" of $T$, that is, every vector in $\mathbb{R}^3$ that can be written as $T(x)$ for some $x$ in $\mathbb{R}^2$. Note that answering these questions is the same as determining if a linear system has a solution.

1. $\begin{bmatrix} 1 \\ 4 \\ 2 \end{bmatrix}$.
2. $\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$.

Problem 4.17 Suppose $T : R^3 \to R^2$ is a linear transformation such that
\[
T\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \quad
T\begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} = \begin{bmatrix} 2 \\ 0 \end{bmatrix}, \quad
T\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} -1 \\ 5 \end{bmatrix}.
\]
1. Find $T\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$.
2. Write down the matrix of $T$.

Problem 4.18 Let $g : R^3 \to R^2$ and $h : R^2 \to R^4$ be linear transformations given by
\[
g\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} x_1 - x_3 + x_2 \\ x_3 - 2x_2 \end{bmatrix}, \qquad
h\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} x_1 \\ -x_1 \\ x_2 \\ -x_2 \end{bmatrix}.
\]
The questions below concern $h \circ g$, the composition of $h$ and $g$ defined by $h \circ g(x) = h(g(x))$ for $x \in R^3$. Note that this composition is defined because the output of $g$ and the input of $h$ have the same dimension, two.
1. Find $h \circ g\begin{bmatrix} 1 \\ 0 \\ -5 \end{bmatrix}$.
2. Find the matrices of $g$ and $h$.
3. Find the matrix of $h \circ g$.

4.3 Application: random walks

Consider a system with three states, labelled 1, 2 and 3 in Figure 4.6. To make the problem more vivid, one can imagine these as being actual locations. A random walker starts off at some location, say location 1 at time 0. Then at a sequence of times, $1, 2, \ldots, n, \ldots$, the walker either stays where he is, or moves to one of the other locations. The next location is chosen randomly, but according to the transition probabilities $p_{i,j}$. These are numbers between 0 and 1 that measure how likely it is that, starting from location $j$, the walker
will move to location $i$. If $p_{i,j} = 0$, then there is no chance that the walker will move from $j$ to $i$, and if $p_{i,j} = 1$, then the walker will move for sure from $j$ to $i$. Since the walker must either move from $j$ to another site or stay put, the sum of these probabilities must equal one:
\[ p_{1,j} + p_{2,j} + p_{3,j} = \sum_{i=1}^{3} p_{i,j} = 1 \]

Figure 4.6: Graphical description of a random walk with three states.

At each time $n$ there is a vector
\[ x_n = \begin{bmatrix} x_{n,1} \\ x_{n,2} \\ x_{n,3} \end{bmatrix} \]
that gives the probabilities that the walker is in location 1, 2 or 3 at time $n$. Let us compute the vector $x_{n+1}$, given $x_n$. To start, we must compute $x_{n+1,1}$, the probability that the walker is at location 1 at time $n + 1$. There are three ways the walker can end up at location 1. The walker might have been at location 1 at time $n$ and have stayed there. The probability of this is $p_{1,1} x_{n,1}$. Or he might have been at location 2 and have moved to 1. The probability of this is $p_{1,2} x_{n,2}$. Finally, he might have been at location 3 and have moved to 1. The probability of this is $p_{1,3} x_{n,3}$. Thus the total probability that the walker is at location 1 at time $n + 1$ is the sum
\[ x_{n+1,1} = p_{1,1} x_{n,1} + p_{1,2} x_{n,2} + p_{1,3} x_{n,3} \]
Similarly, for all $i$,
\[ x_{n+1,i} = p_{i,1} x_{n,1} + p_{i,2} x_{n,2} + p_{i,3} x_{n,3} \]
But this is exactly the formula for matrix multiplication. So
\[ x_{n+1} = P x_n \]
where $P$ is the matrix with entries $[p_{ij}]$.
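The update rule $x_{n+1} = P x_n$ can be tried out on a small case. The following is a minimal sketch in Python using exact fractions; the matrix `P` and the helper name `step` are illustrative assumptions, not taken from these notes.

```python
from fractions import Fraction as F

# Hypothetical transition matrix (not from the notes). P[i][j] is the
# probability of moving from location j+1 to location i+1, so every
# column sums to one.
P = [
    [F(1, 2), F(1, 3), F(0)],
    [F(1, 4), F(1, 3), F(1, 2)],
    [F(1, 4), F(1, 3), F(1, 2)],
]

def step(P, x):
    """Return P x, the probability vector one time step after x."""
    return [sum(P[i][j] * x[j] for j in range(len(x))) for i in range(len(P))]

x0 = [F(1), F(0), F(0)]   # the walker starts in location 1
x1 = step(P, x0)          # this is just the first column of P
x2 = step(P, x1)
print(x1)
print(x2)
```

Each entry of `step(P, x)` is the sum $p_{i,1} x_1 + p_{i,2} x_2 + p_{i,3} x_3$ from the formula above, written as a loop over $j$.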
Now suppose the initial probabilities are given by some vector $x_0$. For example, if the walker starts off at location 1, then
\[ x_0 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \]
Then after one time step, we have
\[ x_1 = P x_0 \]
Then,
\[ x_2 = P x_1 = P^2 x_0 \]
and so on, so after $n$ time steps
\[ x_n = P^n x_0 \]
Notice that the sum $x_{n,1} + x_{n,2} + x_{n,3} = \sum_{i=1}^{3} x_{n,i}$ should equal one, since the total probability of being in one of the three locations must add up to one. If the initial vector has this property, then it is preserved for all later times. To see this, suppose that $\sum_{j=1}^{3} x_j = 1$. Then, since $\sum_{i=1}^{3} p_{ij} = 1$,
\[
\sum_{i=1}^{3} (Px)_i = \sum_{i=1}^{3} \sum_{j=1}^{3} p_{ij} x_j
= \sum_{j=1}^{3} \left( \sum_{i=1}^{3} p_{ij} \right) x_j
= \sum_{j=1}^{3} x_j = 1
\]
In other words, the vector for the next time step also has components summing to one. Of course, one can generalize this to a system with an arbitrary number of states or locations. Later in the course we will see how to use eigenvalues and eigenvectors to efficiently compute the limit as $n$ tends to infinity of this expression. A specific but somewhat nerdy example is given below.

Example 4.1 Ydnew the sorcerer and his apprentice, Xavier, have a magical duel as part of a circus act. They take turns casting spells, which don't always work, but when they do the opponent is knocked out. Xavier always goes first in the duel but his spells only work 1/3 of the time. Ydnew's spells work 1/2 of the time. When they practise, they find that each wins half the time. However, in performances, the duel is limited to three attempts by Xavier. After that, if there has been no knock-out, Ydnew is declared the winner.
(a) Describe the duel as a random walk.
(b) Write the matrix for the random walk.
(c) Use the matrix to analyze the probability that Ydnew will win the duel when Xavier's attempts are limited and also when they are unlimited.

There are four possible states:
(1) No winner yet, Xavier's turn
(2) No winner yet, Ydnew's turn
(3) Xavier has won.
(4) Ydnew has won.
These are shown with transition probabilities in Figure 4.7. The transition matrix with the node ordering we have chosen is
\[ P = \begin{bmatrix} 0 & 1/2 & 0 & 0 \\ 2/3 & 0 & 0 & 0 \\ 1/3 & 0 & 1 & 0 \\ 0 & 1/2 & 0 & 1 \end{bmatrix} \]
The duel begins in state (1), that is Xavier's turn, no winner yet:
\[ x^{(0)} = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} \]
The circus duel ends after 3 attempts by Xavier, that is after 5 transitions in the state diagram (five rounds XYXYX). Using MATLAB we compute
\[ x^{(5)} = P^5 x^{(0)} \approx \begin{bmatrix} 0 \\ 0.0741 \\ 0.4815 \\ 0.4444 \end{bmatrix}. \]
This means that Ydnew will win with probability 0.5185, the sum of the second and fourth components in $x^{(5)}$ (he can win by a knock-out or if Xavier has not defeated him after his third attempt). To investigate what happens in the unlimited turn version of the duel they use when they practise, you can compute $x^{(n)} = P^n x^{(0)}$ for $n$ increasingly large. This gives strong numerical evidence that
\[ \lim_{n \to \infty} x^{(n)} = \begin{bmatrix} 0 \\ 0 \\ 1/2 \\ 1/2 \end{bmatrix}. \]
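The notes carry out this computation in MATLAB. As a hedged alternative, the same numbers can be reproduced with a short Python sketch that uses exact fractions; the helper names `step` and `walk` and the choice of 60 steps for the "large $n$" case are assumptions made for illustration.

```python
from fractions import Fraction as F

# Transition matrix of Example 4.1. Column j holds the probabilities of
# leaving state j, with the states ordered as in the example.
P = [
    [F(0),    F(1, 2), F(0), F(0)],
    [F(2, 3), F(0),    F(0), F(0)],
    [F(1, 3), F(0),    F(1), F(0)],
    [F(0),    F(1, 2), F(0), F(1)],
]

def step(P, x):
    """One time step: return P x."""
    return [sum(p * xj for p, xj in zip(row, x)) for row in P]

def walk(P, x0, n):
    """Apply n time steps to x0, i.e. return P^n x0."""
    x = list(x0)
    for _ in range(n):
        x = step(P, x)
    return x

x0 = [F(1), F(0), F(0), F(0)]    # Xavier's turn, no winner yet
x5 = walk(P, x0, 5)              # the circus duel: five transitions
ydnew_circus = x5[1] + x5[3]     # second plus fourth component
x60 = walk(P, x0, 60)            # a stand-in for "n large"

print([float(v) for v in x5], float(ydnew_circus))
print([float(v) for v in x60])
```

Working with fractions gives $x^{(5)} = (0, 2/27, 13/27, 12/27)^T$ exactly, which rounds to the decimal values quoted above.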
That is, in the unlimited duel they each win half the time, as they observed.

Figure 4.7: Diagram of the sorcerers' duel described in Example 4.1.

This unlimited case can be analyzed rigorously. The techniques used are not part of the course, but it is done below for completeness. Xavier has a chance to win on any odd turn. On turn 1 he can win if his spell works (1/3 chance). On turn 3 Xavier can win if his spell missed on turn 1, Ydnew missed on turn 2 but Xavier succeeds on turn 3 ($\frac{2}{3} \cdot \frac{1}{2} \cdot \frac{1}{3} = \frac{1}{9}$ chance). On turn 5 he can win if there were failures in turns 1-4 but success on turn 5, with a chance of $\frac{2}{3} \cdot \frac{1}{2} \cdot \frac{2}{3} \cdot \frac{1}{2} \cdot \frac{1}{3} = \frac{1}{27}$. Notice the pattern that each successive chance to win goes down by a factor of 3. The total chance of success is the sum of the chances to win at every opportunity:
\[
\frac{1}{3} + \frac{1}{9} + \frac{1}{27} + \cdots
= \frac{1}{3}\left(1 + \frac{1}{3} + \frac{1}{9} + \cdots\right)
= \frac{1}{3} \sum_{i=0}^{\infty} \left(\frac{1}{3}\right)^i
= \frac{1}{3} \cdot \frac{1}{1 - 1/3}
= \frac{1}{2}
\]
as predicted from the MATLAB numerical results. To get the second last term above, the expression for the sum of a geometric series was used.

Example 4.2 Investigate a modified version of the sorcerers' duel described in Example 4.1 in which both sorcerers are given a shield which can protect them from one successful spell cast on them. Here, there are 10 possible states:
(1) No winner yet, Xavier's turn, both have shields still.
(2) No winner yet, Xavier's turn, Xavier has lost his shield, but Ydnew still has his.
(3) No winner yet, Xavier's turn, Xavier still has his shield, but Ydnew has lost his.
(4) No winner yet, Xavier's turn, both have lost their shields.
(5-8) Same as (1)-(4) above, but Ydnew's turn.
(9) Xavier has won.
(10) Ydnew has won.
Using the ordering of states above, the transition matrix is
\[
P = \begin{bmatrix}
0 & 0 & 0 & 0 & 1/2 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1/2 & 1/2 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1/2 & 1/2 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1/2 & 0 & 0 & 0 \\
2/3 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 2/3 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1/3 & 0 & 2/3 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1/3 & 0 & 2/3 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1/3 & 1/3 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1/2 & 0 & 1/2 & 0 & 1
\end{bmatrix}
\]
This matrix can be entered in MATLAB (remember the techniques for entering sparse matrices, that is matrices with mostly zero entries, learnt in your computer lab #2). Starting with $x^{(0)} = e_1$ we can compute $x^{(5)} = P^5 x^{(0)}$ numerically in MATLAB. Adding components 5-8 (Ydnew wins by default) and component 10 (Ydnew wins by knock-out) gives approximately 0.7778, the chance that Ydnew will win the circus duel. The unlimited duel can be investigated numerically, giving
\[ \lim_{n \to \infty} P^n x^{(0)} \approx (0, 0, 0, 0, 0, 0, 0, 0, 0.3750, 0.6250)^T \]
So in an unlimited match, Ydnew will win about 62.5% of the time.

4.3.1 Problems

Problem 4.19 Consider a random walk with 3 states, where the probability of staying in the same location is zero. Suppose
• the probability of moving from location 1 to location 2 is 1/2
• the probability of moving from location 2 to location 1 is 1/3
• the probability of moving from location 3 to location 1 is 1/4
Write down the matrix $P$. What is the probability that a walker starting in location 1 is in location 2 after two time steps?

Problem 4.20 Consider a random walk with 3 states, where the probability of staying in the same location is zero. Suppose
• the probability of moving from location 1 to location 2 is 1/3, that is $p_{2,1} = \frac{1}{3}$.
• the probability of moving from location 2 to location 1 is 2/5, that is $p_{1,2} = \frac{2}{5}$.
• the probability of moving from location 3 to location 1 is 1/4, that is $p_{1,3} = \frac{1}{4}$.
1. Write down the matrix $P$.
2. What is the probability that a walker starting in location 1 is in location 2 after three time steps?
3. What is the probability that a walker starting in location 3 is in location 1 after two time steps?

Problem 4.21 Consider a random walk with 4 states, where all the probabilities $p_{i,j}$ are equal to 1/4.
1. Compute $P$ and $P^n$, for every positive integer $n$.
2. Compute the probability location vectors $x_n$ ($x_n = P^n x_0$) in each of the following cases:
\[ x_0 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \quad \text{or} \quad x_0 = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}. \]

Problem 4.22 Suppose that the matrix $P$ given below is the matrix of a random walk:
\[ P = \begin{bmatrix} \frac{1}{3} & \frac{1}{4} & \frac{2}{5} \\ 0 & \frac{1}{2} & \frac{1}{5} \\ \frac{2}{3} & \frac{1}{4} & \frac{2}{5} \end{bmatrix}. \]
1. What is the probability that a walker starting in location 2 is in location 1 after one time step?
2. What is the probability that a walker starting in location 2 is in location 1 after two time steps?
3. If $x_0 = \begin{bmatrix} 1/3 \\ 1/3 \\ 2/3 \end{bmatrix}$, what is the probability that the walker is in location 3 after two time steps?
Problem 4.23 Consider a random walk with 3 states, where all the probabilities $p_{i,j}$ are equal to 1/3. What are $P$ and $P^n$? Compute the probabilities $P^n x_0$ when $x_0 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$ (i.e. the walker starts in location 1), $x_0 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$ (i.e. the walker starts in location 2), and $x_0 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$ (i.e. the walker starts in location 3).

Figure 4.8: Diagram of the random walk in Problem 4.24.

Problem 4.24 Consider a random walk with three states with transition probabilities shown in Figure 4.8.
1. Suppose the system starts in state 3. What is the probability that it is in state 2 after 2 steps?
2. Given that
\[ P^k \to \begin{bmatrix} 0.25 & 0.25 & 0.25 \\ 0.375 & 0.375 & 0.375 \\ 0.375 & 0.375 & 0.375 \end{bmatrix} \]
as $k$ tends to infinity, what are the probabilities of the system being in each state after the system has been running for a long time? Show that these probabilities do not depend on the initial state.

4.4 The Transpose

If $A$ is an $m \times n$ matrix, then its transpose $A^T$ is the matrix obtained by flipping $A$ about its diagonal. So the columns of $A^T$ are the rows of $A$ (in the same order) and vice versa. For example, if
\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \]
then
\[ A^T = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix} \]
Another way of saying this is that the $i,j$th entry of $A^T$ is the same as the $j,i$th entry of $A$, that is,
\[ a^T_{i,j} = a_{j,i}. \]
There are two important formulae to remember for the transpose of a matrix. The first gives a relation between the transpose and the dot product. If $A$ is an $m \times n$ matrix, then for every $x \in R^n$ and $y \in R^m$ we have
\[ y \cdot (Ax) = (A^T y) \cdot x \tag{4.2} \]
The proof of this formula is a simple calculation.
\[
\begin{aligned}
y \cdot (Ax) &= \sum_{i=1}^{m} y_i \left( \sum_{j=1}^{n} a_{i,j} x_j \right) \\
&= \sum_{i=1}^{m} \sum_{j=1}^{n} y_i a_{i,j} x_j \\
&= \sum_{j=1}^{n} \sum_{i=1}^{m} a^T_{j,i} y_i x_j \\
&= \sum_{j=1}^{n} \left( \sum_{i=1}^{m} a^T_{j,i} y_i \right) x_j \\
&= (A^T y) \cdot x
\end{aligned}
\]
In fact, the formula (4.2) could be used to define the transpose. Given $A$, there is exactly one matrix $A^T$ for which (4.2) is true for every $x$ and $y$, and this matrix is the transpose.

The second important formula relates the transpose of a product of matrices to the transposes of each one. For two matrices $A$ and $B$ such that $AB$ is defined the formula reads
\[ (AB)^T = B^T A^T \tag{4.3} \]
Notice that the order of the factors is reversed on the right side. To see why (4.3) is true, notice that on the one hand
\[ y \cdot (ABx) = ((AB)^T y) \cdot x \]
while on the other hand
\[ y \cdot (ABx) = y \cdot (A(Bx)) = (A^T y) \cdot (Bx) = (B^T A^T y) \cdot x \]
Thus $((AB)^T y) \cdot x = (B^T A^T y) \cdot x$ for every $x$ and $y$. This can only be true if (4.3) holds.
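Formulas (4.2) and (4.3) can be spot-checked numerically. The sketch below, in plain Python, uses the $2 \times 3$ matrix $A$ from the example above; the matrix `B`, the vectors `x` and `y`, and all helper names are hypothetical choices made for illustration. A check on one set of numbers is of course not a proof.

```python
def transpose(A):
    """Rows of the result are the columns of A."""
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """Matrix product A B for matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def matvec(A, x):
    """Matrix-vector product A x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

A = [[1, 2, 3], [4, 5, 6]]       # the 2 x 3 matrix from the text
B = [[1, 0], [2, -1], [0, 3]]    # hypothetical 3 x 2 matrix, so A B is defined
x = [1, -2, 4]                   # hypothetical vector in R^3
y = [3, 5]                       # hypothetical vector in R^2

# Formula (4.2): y . (A x) = (A^T y) . x
lhs = dot(y, matvec(A, x))
rhs = dot(matvec(transpose(A), y), x)

# Formula (4.3): (A B)^T = B^T A^T
left_43 = transpose(matmul(A, B))
right_43 = matmul(transpose(B), transpose(A))

print(lhs, rhs)
print(left_43, right_43)
```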
4.4.1 MATLAB

The MATLAB operator ' can be used to take the transpose of a matrix.

4.4.2 Problems

Problem 4.25 Verify formula (4.2) for $x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$, $y = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}$ and
\[ A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \]

Problem 4.26 What is $(A^T)^T$?

Problem 4.27 Verify (4.3) for
\[ A = \begin{bmatrix} 1 & 2 \\ 3 & 1 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \]

Problem 4.28 Show that if $A$ and $B$ are both $m \times n$ matrices such that $y \cdot (Ax) = y \cdot (Bx)$ for every $y \in R^m$ and every $x \in R^n$, then $A = B$.

Problem 4.29 Show that if you think of (column) vectors in $R^n$ as $n \times 1$ matrices then
\[ x \cdot y = x^T y \]
Now use this formula and (4.3) to derive (4.2).

4.5 Matrix Inverses

To solve the (scalar) equation
\[ ax = b \]
for $x$ we simply multiply both sides by $a^{-1} = \frac{1}{a}$. Then, since $a^{-1} a = 1$, we find
\[ x = a^{-1} b. \]
Of course, if $a = 0$ this doesn't work, since we cannot divide by zero. In fact, if $a = 0$ the equation $ax = b$ either has no solutions, if $b \neq 0$, or infinitely many solutions (every value of $x$), if $b = 0$.

We have seen that a system of linear equations can be rewritten
\[ Ax = b \]
where $A$ is a known matrix, $x$ is the unknown vector to be solved for, and $b$ is a known vector. Suppose we could find an inverse matrix $B$ (analogous to
$a^{-1}$) with the property that $BA = I$ (recall that $I$ denotes the identity matrix). Then we could matrix multiply both sides of the equation by $B$, yielding
\[ BAx = Bb \]
But $BAx = Ix = x$, so $x = Bb$. Thus there is a unique solution and we have a formula for it.

Just as in the numerical case, where $a$ could be zero, we can't expect to find an inverse matrix in all cases. After all, we know that there are linear systems of equations with no solutions and with infinitely many solutions. In these situations there can be no inverse matrix.

When considering matrix inverses, we will always assume that we are dealing with square (i.e., $n \times n$) matrices.

Definition 4.1 If $A$ is an $n \times n$ matrix, then $B$ is called the inverse of $A$, and denoted $B = A^{-1}$, if
\[ BA = I \]
where $I$ is the $n \times n$ identity matrix (with each diagonal entry equal to 1 and all other entries 0).

Here is an example. Suppose
\[ A = \begin{bmatrix} 2 & 1 \\ 5 & 3 \end{bmatrix} \]
then the inverse matrix is
\[ B = A^{-1} = \begin{bmatrix} 3 & -1 \\ -5 & 2 \end{bmatrix} \]
since
\[
\begin{bmatrix} 3 & -1 \\ -5 & 2 \end{bmatrix} \begin{bmatrix} 2 & 1 \\ 5 & 3 \end{bmatrix}
= \begin{bmatrix} 6 - 5 & 3 - 3 \\ -10 + 10 & -5 + 6 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]
This means that to solve the linear equation
\[
\begin{aligned}
2x_1 + x_2 &= 2 \\
5x_1 + 3x_2 &= 4
\end{aligned}
\]
we write it as a matrix equation
\[ \begin{bmatrix} 2 & 1 \\ 5 & 3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 2 \\ 4 \end{bmatrix} \]
and then multiply both sides by the inverse to obtain
\[ \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 3 & -1 \\ -5 & 2 \end{bmatrix} \begin{bmatrix} 2 \\ 4 \end{bmatrix} = \begin{bmatrix} 2 \\ -2 \end{bmatrix} \]
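The arithmetic in this example is easy to confirm by machine. The following plain Python sketch checks that $BA = I$ and that $x = Bb$ really does satisfy the original system; the helper names `matmul` and `matvec` are assumptions made for illustration.

```python
def matmul(A, B):
    """Matrix product A B for matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def matvec(A, x):
    """Matrix-vector product A x."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

A = [[2, 1], [5, 3]]      # coefficient matrix of the system
B = [[3, -1], [-5, 2]]    # the claimed inverse of A
I = [[1, 0], [0, 1]]
b = [2, 4]                # right-hand side of the system

BA = matmul(B, A)         # should be the identity, by Definition 4.1
x = matvec(B, b)          # the solution x = B b
check = matvec(A, x)      # substituting back should reproduce b

print(BA, x, check)
```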
Figure 4.9: Diagram of the reduced form of an invertible matrix.

Here is an example of a matrix that doesn't have an inverse. Let
\[ A = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}. \]
To see that $A$ doesn't have an inverse, notice that the homogeneous equation $Ax = 0$ has a non-zero solution $x = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$. If $A$ had an inverse $B$, then we could multiply both sides of the equation $Ax = 0$ by $B$ to obtain $x = B0 = 0$. But this is false. Therefore there cannot be an inverse for $A$.

Clearly, having an inverse is somehow connected to whether or not there are any non-zero solutions of the homogeneous equation $Ax = 0$. Recall that $Ax = 0$ has only the zero solution precisely when $Ax = b$ has a unique solution for any $b$.

Let $A$ be an $n \times n$ matrix. The following conditions are equivalent:
(1) $A$ is invertible.
(2) The equation $Ax = b$ always has a unique solution.
(3) The equation $Ax = 0$ has as the only solution $x = 0$.
(4) The rank of $A$ is $n$.
(5) The reduced form of $A$ is as shown in Figure 4.9.

We already know that the conditions (2), (3), (4) and (5) are all equivalent. We also have just seen that if $A$ is invertible with inverse $B$, then the solution of $Ax = b$ is $x = Bb$, so it exists, and since we have a formula for it, it is unique.

So we just have to show that if $Ax = b$ always has a unique solution, then $A$ has an inverse. Consider the transformation that takes a vector $b$ to the unique solution $x$ of $Ax = b$, i.e., $Tb = x$. It is easy to check that this is a linear transformation, since if $Tb_1 = x_1$, i.e., $Ax_1 = b_1$, and $Tb_2 = x_2$, i.e., $Ax_2 = b_2$, then $A(t_1 x_1 + t_2 x_2) = t_1 Ax_1 + t_2 Ax_2 = t_1 b_1 + t_2 b_2$, so that $T(t_1 b_1 + t_2 b_2) = t_1 x_1 + t_2 x_2 = t_1 Tb_1 + t_2 Tb_2$. Since $T$ is a linear transformation,
it is given by some matrix $B$, and since $T(Ax) = x$, we must have $BAx = x$, which implies that $BA$ is the identity matrix.

Going back to our first example, notice that not only is $BA = I$, but $AB = I$ too, since
\[
\begin{bmatrix} 2 & 1 \\ 5 & 3 \end{bmatrix} \begin{bmatrix} 3 & -1 \\ -5 & 2 \end{bmatrix}
= \begin{bmatrix} 6 - 5 & -2 + 2 \\ 15 - 15 & -5 + 6 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]
For a general choice of $A$ and $B$, $BA$ need not be equal to $AB$. But if $B$ is the inverse of $A$, then it is always true that $AB = BA = I$.

To see this, suppose that $A$ is invertible with $BA = I$ but we don't know yet whether $AB = I$. So what we need to show is that for every vector $x$, $ABx = x$. First notice that if $A$ is invertible, then any vector $x$ can be written in the form $Ay$ for some $y$, since this is just the same as saying that the equation $Ay = x$ has a solution $y$. Thus $ABx = ABAy = AIy = Ay = x$.

4.5.1 Computing the inverse

How can we compute the inverse of an $n \times n$ matrix $A$? Suppose that $A$ is invertible and $B$ is the inverse. Then $AB = I$. We can rewrite this equation as follows. Think of the columns of $B$ as being column vectors so that
\[ B = \begin{bmatrix} b_1 & b_2 & \cdots & b_n \end{bmatrix} \]
Then the rules of matrix multiplication imply that
\[ AB = \begin{bmatrix} Ab_1 & Ab_2 & \cdots & Ab_n \end{bmatrix} \]
Now the identity matrix can also be written as a matrix of column vectors. In this case the $k$th column is simply the column vector with zeros everywhere except for a 1 in the $k$th place, in other words the vector $e_k$. Thus
\[ I = \begin{bmatrix} e_1 & e_2 & \cdots & e_n \end{bmatrix} \]
So if $AB = I$ then the $n$ equations
\[
\begin{aligned}
Ab_1 &= e_1 \\
Ab_2 &= e_2 \\
&\;\;\vdots \\
Ab_n &= e_n
\end{aligned}
\]
hold. If we solve each of these equations for $b_1, b_2, \ldots, b_n$, then we have found the inverse $B$.
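The column-by-column idea can be written out directly. The Python sketch below solves each equation $Ab_k = e_k$ by row reduction of an augmented matrix, using exact fractions, and assembles the solutions as the columns of $B$. The function names `solve` and `inverse` are hypothetical, and the routine assumes $A$ is square.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A x = b by row-reducing the augmented matrix [A | b].

    Assumes A is square; raises ValueError if A is not invertible.
    """
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(b[i])]
         for i, row in enumerate(A)]
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero pivot.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is not invertible")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]          # scale pivot row to get a 1
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n] for row in M]

def inverse(A):
    """Build the inverse one column at a time from A b_k = e_k."""
    n = len(A)
    cols = []
    for k in range(n):
        e_k = [1 if i == k else 0 for i in range(n)]
        cols.append(solve(A, e_k))                # cols[k] is the vector b_k
    # Arrange the vectors b_k as the columns of B.
    return [[cols[k][i] for k in range(n)] for i in range(n)]

B = inverse([[2, 1], [5, 3]])
print(B)
```

Each call to `solve` repeats the same row operations on the left side of the augmented matrix, since those operations depend only on $A$.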
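Here is a simple example. Suppose we want to find the inverse for
\[ A = \begin{bmatrix} 2 & 1 \\ 5 & 3 \end{bmatrix}. \]
According to our discussion, we must solve $Ab_1 = e_1$ and $Ab_2 = e_2$. The augmented matrix for $Ab_1 = e_1$ is
\[ \left[\begin{array}{cc|c} 2 & 1 & 1 \\ 5 & 3 & 0 \end{array}\right] \]
We now perform a sequence of row operations. First divide the first row by 2. This gives
\[ \left[\begin{array}{cc|c} 1 & 1/2 & 1/2 \\ 5 & 3 & 0 \end{array}\right]. \]
Now subtract 5 times the first row from the second row. This gives
\[ \left[\begin{array}{cc|c} 1 & 1/2 & 1/2 \\ 0 & 1/2 & -5/2 \end{array}\right]. \]
Now subtract the second row from the first row. This gives
\[ \left[\begin{array}{cc|c} 1 & 0 & 3 \\ 0 & 1/2 & -5/2 \end{array}\right]. \]
Finally, multiply the second row by 2. This gives
\[ \left[\begin{array}{cc|c} 1 & 0 & 3 \\ 0 & 1 & -5 \end{array}\right]. \]
Therefore $b_1 = \begin{bmatrix} 3 \\ -5 \end{bmatrix}$. The augmented matrix for $Ab_2 = e_2$ is
\[ \left[\begin{array}{cc|c} 2 & 1 & 0 \\ 5 & 3 & 1 \end{array}\right] \]
We now perform a sequence of row operations. First divide the first row by 2. This gives
\[ \left[\begin{array}{cc|c} 1 & 1/2 & 0 \\ 5 & 3 & 1 \end{array}\right]. \]
Now subtract 5 times the first row from the second row. This gives
\[ \left[\begin{array}{cc|c} 1 & 1/2 & 0 \\ 0 & 1/2 & 1 \end{array}\right]. \]
Now subtract the second row from the first row. This gives
\[ \left[\begin{array}{cc|c} 1 & 0 & -1 \\ 0 & 1/2 & 1 \end{array}\right]. \]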
Finally, multiply the second row by 2. This gives
\[ \left[\begin{array}{cc|c} 1 & 0 & -1 \\ 0 & 1 & 2 \end{array}\right]. \]
Therefore $b_2 = \begin{bmatrix} -1 \\ 2 \end{bmatrix}$. So
\[ B = \begin{bmatrix} b_1 & b_2 \end{bmatrix} = \begin{bmatrix} 3 & -1 \\ -5 & 2 \end{bmatrix} \]
Notice that we performed exactly the same sequence of row operations in finding $b_1$ and $b_2$. This is because the row operations only depend on the left side of the augmented matrix, in other words, the matrix $A$. If we used this procedure to find the inverse of an $n \times n$ matrix, we would end up doing exactly the same row operations $n$ times. Clearly this is a big waste of effort! We can save a lot of work by solving all the equations at the same time. To do this we make a super-augmented matrix with both right sides.
\[ \left[\begin{array}{cc|cc} 2 & 1 & 1 & 0 \\ 5 & 3 & 0 & 1 \end{array}\right] \]
Now we only have to go through the sequence of row operations once, keeping track of both right sides simultaneously. Going through the same sequence, we obtain
\[ \left[\begin{array}{cc|cc} 1 & 1/2 & 1/2 & 0 \\ 5 & 3 & 0 & 1 \end{array}\right]. \]
\[ \left[\begin{array}{cc|cc} 1 & 1/2 & 1/2 & 0 \\ 0 & 1/2 & -5/2 & 1 \end{array}\right]. \]
\[ \left[\begin{array}{cc|cc} 1 & 0 & 3 & -1 \\ 0 & 1/2 & -5/2 & 1 \end{array}\right]. \]
\[ \left[\begin{array}{cc|cc} 1 & 0 & 3 & -1 \\ 0 & 1 & -5 & 2 \end{array}\right]. \]
Notice that the vectors $b_1$ and $b_2$ are automatically arranged as columns on the right side, so the matrix on the right is the inverse $B$.

The same procedure works for any size of square matrix. To find the inverse of $A$, form the super-augmented matrix $[A|I]$. Then do a sequence of row operations to reduce $A$ to the identity. If the resulting matrix is $[I|B]$ then $B$ is the inverse matrix.

What happens if $A$ doesn't have an inverse? In this case it will be impossible to reduce $A$ to the identity matrix, since the rank of $A$ is less than $n$. So the procedure will fail, as it must.

As another example, let us now compute the inverse of an arbitrary invertible $2 \times 2$ matrix
\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}. \]
We will see that $A$ is invertible precisely when its determinant $\Delta = ad - bc$ is non-zero. So let's assume this is the case, and do the computation. To start with, let's assume that $ac \neq 0$. Then neither $a$ nor $c$ is zero. Here is the sequence of row transformations.
\[ \left[\begin{array}{cc|cc} a & b & 1 & 0 \\ c & d & 0 & 1 \end{array}\right] \]
\[ \left[\begin{array}{cc|cc} ac & bc & c & 0 \\ ac & ad & 0 & a \end{array}\right] \quad \begin{array}{l} c\,(1) \\ a\,(2) \end{array} \]
Notice that multiplication by $a$ and by $c$ would not be legal row transformations if either $a$ or $c$ were zero.
\[ \left[\begin{array}{cc|cc} ac & bc & c & 0 \\ 0 & ad - bc & -c & a \end{array}\right] \quad \begin{array}{l} \\ (2) - (1) \end{array} \]
\[ \left[\begin{array}{cc|cc} 1 & b/a & 1/a & 0 \\ 0 & 1 & -c/\Delta & a/\Delta \end{array}\right] \quad \begin{array}{l} (1/ac)\,(1) \\ (1/\Delta)\,(2) \end{array} \]
\[ \left[\begin{array}{cc|cc} 1 & 0 & d/\Delta & -b/\Delta \\ 0 & 1 & -c/\Delta & a/\Delta \end{array}\right] \quad \begin{array}{l} (1) - (b/a)\,(2) \\ \\ \end{array} \]
Thus the inverse matrix is
\[ A^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}. \]
This was derived under the additional assumption that $ac \neq 0$. However one can check directly that the same formula works, so long as $\Delta = ad - bc \neq 0$.

4.5.2 Inverses of Products

If both $A$ and $B$ are invertible, then so is $AB$. The inverse of $AB$ is given by $B^{-1}A^{-1}$. To check this, simply compute
\[ ABB^{-1}A^{-1} = AIA^{-1} = AA^{-1} = I. \]
If one of $A$ or $B$ is not invertible then $AB$ is not invertible. To see this recall that a matrix $C$ is not invertible exactly when there is a non-zero solution $x$ to $Cx = 0$. If $B$ is not invertible, then there is a non-zero vector $x$ with $Bx = 0$. Then $ABx = A0 = 0$, so $AB$ is not invertible either. If $B$ is invertible, but $A$ is not, then there is a non-zero $x$ with $Ax = 0$. Let $y = B^{-1}x$. Since $B^{-1}$ is invertible, $y$ cannot be zero. We have $ABy = ABB^{-1}x = Ax = 0$, so $AB$ is not invertible.
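Both results of this part, the $2 \times 2$ inverse formula and the rule $(AB)^{-1} = B^{-1}A^{-1}$, can be checked on small examples. The Python sketch below is one way to do so; the function name `inv2`, the matrix `B`, and the matrix `swap` are hypothetical choices made for illustration.

```python
from fractions import Fraction

def inv2(M):
    """Inverse of a 2 x 2 matrix via (1 / (ad - bc)) [[d, -b], [-c, a]]."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("determinant is zero, so the matrix has no inverse")
    det = Fraction(det)
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul(A, B):
    """Matrix product A B for matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

A = [[2, 1], [5, 3]]        # the matrix inverted earlier in this section
B = [[1, 1], [1, 2]]        # hypothetical second invertible matrix

inv_A = inv2(A)
inv_AB = inv2(matmul(A, B))
right_order = matmul(inv2(B), inv_A)    # B^{-1} A^{-1}
wrong_order = matmul(inv_A, inv2(B))    # A^{-1} B^{-1}, in general different

# The formula also applies when a = 0, a case the derivation excluded.
swap = [[0, 1], [1, 0]]
inv_swap = inv2(swap)

print(inv_A, inv_AB, right_order, wrong_order, inv_swap)
```

Comparing `right_order` and `wrong_order` with `inv_AB` shows why the order of the factors matters.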
4.5.3 MATLAB

If A is an invertible $n \times n$ matrix then the MATLAB command inv(A) will return its inverse. A system of linear equations corresponding to $Ax = b$, where A is an invertible $n \times n$ matrix and b is a column vector with $n$ components, can be solved with the command

x = A\b

Note that if A is not square, or A is square but not invertible, the corresponding system cannot be solved in this way. In this case, use the MATLAB command rref on the augmented matrix of the system as described in Chapter 3.

4.5.4 Problems

Problem 4.30 Which of the following matrices are invertible?
(a) $\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$
(b) $\begin{bmatrix} 1 & 2 & 3 \\ 0 & 3 & 4 \\ 0 & 1 & 1 \end{bmatrix}$
(c) $\begin{bmatrix} 1 & 2 & 3 \\ 0 & 3 & 4 \\ 0 & 1 & 2 \end{bmatrix}$

Problem 4.31 Find the inverse for $\begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix}$

Problem 4.32 Determine which of these matrices are invertible, and find the inverse for the invertible ones.
(a) $\begin{bmatrix} 2 & 3 & -1 \\ 1 & 2 & 3 \\ -1 & -1 & 4 \end{bmatrix}$
(b) $\begin{bmatrix} 1 & -1 & 1 \\ -1 & 2 & -1 \\ 2 & -1 & 1 \end{bmatrix}$
(c) $\begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 1 & 4 & 9 \end{bmatrix}$
(d) $\begin{bmatrix} 2 & 1 & 4 \\ 3 & 2 & 5 \\ 0 & -1 & 1 \end{bmatrix}$
(e) $\begin{bmatrix} 1 & 0 & a \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$
(f) $\begin{bmatrix} 1 & a & b \\ 0 & 1 & c \\ 0 & 0 & 1 \end{bmatrix}$

Problem 4.33 The following matrices are invertible. Find their inverses.
\[
A = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 2 & 3 \\ 5 & 5 & 1 \end{bmatrix}, \qquad
B = \begin{bmatrix} 1 & 2 & -3 & 1 \\ -1 & 3 & -3 & -2 \\ 2 & 0 & 1 & 5 \\ 3 & 1 & -2 & 5 \end{bmatrix}.
\]

Problem 4.34 Consider the system
\[
\begin{aligned}
x + 5z &= 6 \\
x - 2y + 3z &= 14 \\
2x + y - 3z &= -2
\end{aligned}
\]
1. Write the above system in matrix form, in other words find a matrix $A$ such that $Ax = \begin{bmatrix} 6 \\ 14 \\ -2 \end{bmatrix}$.
2. Find the inverse of $A$, if possible.
3. How many solutions does the system have? Write them all down.

4.6 Return to Resistor Networks

Let us revisit the resistor networks we considered in section 3.5. These networks consist of resistors and voltage and current sources. There we introduced the fundamental problem, which is to write the current through the voltage sources and the voltage across the current sources in terms of arbitrary values of the sources. Recall example 3.15, which considered the circuit shown again in figure 4.10. The sources are $V$ and $I$. The fundamental problem is to determine $E$ and $J$ (the current through the voltage source) in terms of $V$ and $I$. The solution found in section 3.5.3 was
\[
\begin{aligned}
E &= 6I + V/6 \\
J &= -\frac{1}{6} I + \frac{10}{72} V
\end{aligned}
\tag{4.4}
\]
This solution can be given in matrix-vector form as
\[ \begin{bmatrix} E \\ J \end{bmatrix} = \begin{bmatrix} 6 & \frac{1}{6} \\ -\frac{1}{6} & \frac{10}{72} \end{bmatrix} \begin{bmatrix} I \\ V \end{bmatrix} \tag{4.5} \]
where we will call the $2 \times 2$ matrix in the equation above $F$. It can be shown that for any circuit with $n$ sources, the solution of the fundamental problem can be written as multiplication by an $n \times n$ matrix. We will show below how $F$ can be constructed in a systematic way.

Figure 4.10: The circuit considered in Example 3.15.

Look back to section 3.5.3 to see how we arrived at (4.4). There were three parts to the process:
1. We wrote the linear system for the loop currents and the voltage drop across the current source. The equations for the system had the sources $I$ and $V$ in the right hand sides.
2. We solved the system symbolically in terms of the right hand side involving $V$ and $I$.
3. We identified the elements of the solution that solved the fundamental problem.

We will now describe these three steps by matrix multiplication and do the multiplication as an alternate way to determine the matrix $F$ in (4.5). Recall that the unknowns for the circuit are the three loop currents and $E$, the voltage across the current source. Define the intermediate unknown column vector
\[ x = (i_1, i_2, i_3, E)^T. \]
Proceeding as in example 3.15 we find equations for $x$ by considering the voltage drops around each elementary loop (the first three equations) and matching the current through the current source (the last equation below):
\[
\begin{aligned}
8i_1 - 2i_2 &= V \\
-2i_1 + 5i_2 - 3i_3 &= 0 \\
-3i_2 + 8i_3 + E &= 0 \\
i_3 &= -I
\end{aligned}
\tag{4.6}
\]
Here, we did not eliminate $i_3$ from the equations as was done in section 3.5.3. The system (4.6) can be written
\[ F_2 x = F_1 \begin{bmatrix} I \\ V \end{bmatrix} \tag{4.7} \]
where
\[
F_2 = \begin{bmatrix} 8 & -2 & 0 & 0 \\ -2 & 5 & -3 & 0 \\ 0 & -3 & 8 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\quad \text{and} \quad
F_1 = \begin{bmatrix} 0 & 1 \\ 0 & 0 \\ 0 & 0 \\ -1 & 0 \end{bmatrix}
\]
Since $F_2$ is invertible (always true for our approach using loop currents and current source voltages as variables) we can proceed from (4.7) to
\[ x = F_2^{-1} F_1 \begin{bmatrix} I \\ V \end{bmatrix} \tag{4.8} \]
Now to solve the fundamental problem of the circuit we want $J = i_1$, the current through the voltage source, and $E$, the voltage across the current source. We can write
\[ \begin{bmatrix} E \\ J \end{bmatrix} = F_3 x \tag{4.9} \]
where
\[ F_3 = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{bmatrix}. \]
Combining (4.8) and (4.9) we obtain
\[ \begin{bmatrix} E \\ J \end{bmatrix} = F_3 F_2^{-1} F_1 \begin{bmatrix} I \\ V \end{bmatrix} \tag{4.10} \]
Comparing (4.10) to (4.5) we see that
\[ F = F_3 F_2^{-1} F_1 \]
where $F$ is the matrix representing the fundamental solution. Computation does indeed show that $F_3 F_2^{-1} F_1$ equals $F$ with the matrices defined above.

The solution of the fundamental problem for any circuit can be written as the product of three matrices in the same process described above. We illustrate this with a more complex circuit below.

Example 4.3 Find the matrix for the fundamental solution of the circuit shown in Figure 4.11. Here there are four sources $I_1$, $I_2$, $V_1$ and $V_2$, so the fundamental problem will be written
\[ \begin{bmatrix} E_1 \\ E_2 \\ J_1 \\ J_2 \end{bmatrix} = F \begin{bmatrix} I_1 \\ I_2 \\ V_1 \\ V_2 \end{bmatrix} \tag{4.11} \]
where $F$ is a $4 \times 4$ matrix to be determined. The intermediate unknowns in the circuit will be
\[ x = (i_1, i_2, i_3, i_4, E_1, E_2)^T \]
as shown in the figure.

Figure 4.11: The circuit considered in Example 4.3. All resistors are 1Ω.

Equations for these unknowns are found using the loop current method:
\[
\begin{aligned}
3i_1 - i_2 - i_4 - E_2 &= 0 \\
-i_1 + 3i_2 - i_3 + E_1 &= 0 \\
-i_2 + 3i_3 - i_4 &= V_1 - V_2 \\
-i_1 - i_3 + 3i_4 + E_2 &= V_1 \\
i_2 &= -I_1 \\
-i_1 + i_4 &= -I_2
\end{aligned}
\tag{4.12}
\]
where the first four equations above come from matching voltage drops around the four elementary loops and the last two equations come from matching the loop currents to the current sources. Note that (4.12) can be written
\[ F_2 x = F_1 \begin{bmatrix} I_1 \\ I_2 \\ V_1 \\ V_2 \end{bmatrix} \tag{4.13} \]
where
\[
F_2 = \begin{bmatrix}
3 & -1 & 0 & -1 & 0 & -1 \\
-1 & 3 & -1 & 0 & 1 & 0 \\
0 & -1 & 3 & -1 & 0 & 0 \\
-1 & 0 & -1 & 3 & 0 & 1 \\
0 & 1 & 0 & 0 & 0 & 0 \\
-1 & 0 & 0 & 1 & 0 & 0
\end{bmatrix}
\quad \text{and} \quad
F_1 = \begin{bmatrix}
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 1 & -1 \\
0 & 0 & 1 & 0 \\
-1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0
\end{bmatrix}
\]
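By the same step that led from (4.7) to (4.8), the intermediate unknowns are $x = F_2^{-1} F_1 s$, where $s$ is the vector of sources. The Python sketch below computes the $6 \times 4$ matrix $X = F_2^{-1} F_1$ by row-reducing $[F_2 | F_1]$ with exact fractions, rather than forming $F_2^{-1}$ explicitly. The function name `solve_matrix` is hypothetical, and this is only one possible way to automate the step.

```python
from fractions import Fraction

# F2 and F1 as defined for (4.13).
F2 = [
    [ 3, -1,  0, -1, 0, -1],
    [-1,  3, -1,  0, 1,  0],
    [ 0, -1,  3, -1, 0,  0],
    [-1,  0, -1,  3, 0,  1],
    [ 0,  1,  0,  0, 0,  0],
    [-1,  0,  0,  1, 0,  0],
]
F1 = [
    [ 0,  0, 0,  0],
    [ 0,  0, 0,  0],
    [ 0,  0, 1, -1],
    [ 0,  0, 1,  0],
    [-1,  0, 0,  0],
    [ 0, -1, 0,  0],
]

def solve_matrix(A, B):
    """Return X with A X = B by row-reducing the augmented matrix [A | B].

    Assumes A is square; raises ValueError if A is not invertible.
    """
    n = len(A)
    M = [[Fraction(v) for v in A[i] + B[i]] for i in range(n)]
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero pivot.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is not invertible")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [row[n:] for row in M]

X = solve_matrix(F2, F1)   # column j gives x when source j is 1, the others 0

# The last two entries of x are E1 and E2, so rows 5 and 6 of X express
# E1 and E2 in terms of (I1, I2, V1, V2).
for row in X[4:]:
    print([round(float(v), 4) for v in row])
```

Since $x = (i_1, i_2, i_3, i_4, E_1, E_2)^T$, the first column of `X` lists the loop currents and source voltages that result from $I_1 = 1$ with all other sources set to zero.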
Now $J_2 = i_4 - i_3$ and $J_1 = i_3$ so we can write
\[ \begin{bmatrix} E_1 \\ E_2 \\ J_1 \\ J_2 \end{bmatrix} = F_3 x \tag{4.14} \]
where
\[
F_3 = \begin{bmatrix}
0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & -1 & 1 & 0 & 0
\end{bmatrix}.
\]
As above we can combine (4.13) and (4.14) to obtain
\[ \begin{bmatrix} E_1 \\ E_2 \\ J_1 \\ J_2 \end{bmatrix} = F_3 F_2^{-1} F_1 \begin{bmatrix} I_1 \\ I_2 \\ V_1 \\ V_2 \end{bmatrix}. \]
Comparing the equation above to the desired form (4.11) we see that the fundamental matrix $F$ can be computed as
\[ F = F_3 F_2^{-1} F_1. \]
A MATLAB computation with the matrices defined above gives
\[
F \approx \begin{bmatrix}
2.1818 & 0.2727 & 0.8182 & -0.4545 \\
0.2727 & 1.9091 & 0.7273 & -0.1818 \\
-0.4545 & -0.1818 & 0.4545 & -0.3636 \\
0.0909 & -0.3636 & -0.0909 & 0.2727
\end{bmatrix}
\]
This technique can be used to find the fundamental solution of circuits even if they are very large. You can imagine that the process of setting up the matrices $F_1$, $F_2$ and $F_3$, computing an approximation of $F_2^{-1}$ and multiplying the matrices together can all be automated and done computationally.

In Chapter 5, fundamental matrices for circuits will be used to investigate the transient behaviour of circuits with capacitors and inductors as well as resistors.

4.7 Determinants

4.7.1 Definition of Determinants

We have already encountered determinants for $2 \times 2$ and $3 \times 3$ matrices. For $2 \times 2$ matrices
\[ \det \begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{bmatrix} = a_{1,1} a_{2,2} - a_{1,2} a_{2,1}. \]
For $3 \times 3$ matrices we can define the determinant by expanding along the top row:
\[
\det \begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} \\ a_{2,1} & a_{2,2} & a_{2,3} \\ a_{3,1} & a_{3,2} & a_{3,3} \end{bmatrix}
= a_{1,1} \det \begin{bmatrix} a_{2,2} & a_{2,3} \\ a_{3,2} & a_{3,3} \end{bmatrix}
- a_{1,2} \det \begin{bmatrix} a_{2,1} & a_{2,3} \\ a_{3,1} & a_{3,3} \end{bmatrix}
+ a_{1,3} \det \begin{bmatrix} a_{2,1} & a_{2,2} \\ a_{3,1} & a_{3,2} \end{bmatrix}
\]
If we multiply out the $2 \times 2$ determinants in this definition we arrive at the expression
\[
\det \begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} \\ a_{2,1} & a_{2,2} & a_{2,3} \\ a_{3,1} & a_{3,2} & a_{3,3} \end{bmatrix}
= a_{1,1} a_{2,2} a_{3,3} - a_{1,1} a_{2,3} a_{3,2} + a_{1,2} a_{2,3} a_{3,1} - a_{1,2} a_{2,1} a_{3,3} + a_{1,3} a_{2,1} a_{3,2} - a_{1,3} a_{2,2} a_{3,1}
\]
We now make a similar definition for an $n \times n$ matrix. Let $A$ be an $n \times n$ matrix. Define $M_{i,j}$ to be the $(n-1) \times (n-1)$ matrix obtained by crossing out the $i$th row and the $j$th column. So, for example, if
\[ A = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 9 & 0 & 1 & 2 \\ 3 & 4 & 5 & 6 \end{bmatrix} \]
then
\[
M_{1,2} = \begin{bmatrix} \times & \times & \times & \times \\ 5 & \times & 7 & 8 \\ 9 & \times & 1 & 2 \\ 3 & \times & 5 & 6 \end{bmatrix}
= \begin{bmatrix} 5 & 7 & 8 \\ 9 & 1 & 2 \\ 3 & 5 & 6 \end{bmatrix}
\]
We now define the determinant of an $n \times n$ matrix $A$ to be
\[
\det(A) = a_{1,1} \det(M_{1,1}) - a_{1,2} \det(M_{1,2}) + \cdots \pm a_{1,n} \det(M_{1,n})
= \sum_{j=1}^{n} (-1)^{j+1} a_{1,j} \det(M_{1,j}).
\]
Of course, this formula still contains determinants on the right hand side. However, they are determinants of $(n-1) \times (n-1)$ matrices. If we apply this definition to those determinants we get a more complicated formula involving $(n-2) \times (n-2)$ matrices, and so on, until we arrive at an extremely long expression (with $n!$ terms) involving only numbers.

Calculating an expression with $n!$ terms is completely impossible, even with the fastest computers, when $n$ gets reasonably large. For example $100! \approx 10^{158}$. Yet, your computer at home can compute the determinant of a $100 \times 100$ matrix in less than a second. The secret, of course, is to compute the determinant in a different way. We start by computing the determinant of triangular matrices.

4.7.2 Determinants of Triangular matrices

Recall that triangular matrices are matrices whose entries above or below the diagonal are all zero. For $2 \times 2$ matrices
\[ \det \begin{bmatrix} a_{1,1} & a_{1,2} \\ 0 & a_{2,2} \end{bmatrix} = a_{1,1} a_{2,2} - a_{1,2} \cdot 0 = a_{1,1} a_{2,2} \]
and
\[
\det\begin{bmatrix} a_{1,1} & 0 \\ a_{2,1} & a_{2,2} \end{bmatrix} = a_{1,1}a_{2,2} - 0 \cdot a_{2,1} = a_{1,1}a_{2,2}
\]
so the determinant is the product of the diagonal elements. For $3 \times 3$ matrices
\[
\det\begin{bmatrix} a_{1,1} & 0 & 0 \\ a_{2,1} & a_{2,2} & 0 \\ a_{3,1} & a_{3,2} & a_{3,3} \end{bmatrix}
= a_{1,1}\det\begin{bmatrix} a_{2,2} & 0 \\ a_{3,2} & a_{3,3} \end{bmatrix} - 0 + 0
= a_{1,1}a_{2,2}a_{3,3}
\]
A similar expansion shows that the determinant of an $n \times n$ lower triangular matrix is the product of the diagonal elements. For upper triangular matrices we have
\[
\det\begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} \\ 0 & a_{2,2} & a_{2,3} \\ 0 & 0 & a_{3,3} \end{bmatrix}
= a_{1,1}\det\begin{bmatrix} a_{2,2} & a_{2,3} \\ 0 & a_{3,3} \end{bmatrix}
- a_{1,2}\det\begin{bmatrix} 0 & a_{2,3} \\ 0 & a_{3,3} \end{bmatrix}
+ a_{1,3}\det\begin{bmatrix} 0 & a_{2,2} \\ 0 & 0 \end{bmatrix}
\]
Since we already know that the determinant of a $2 \times 2$ triangular matrix is the product of the diagonals, we can see easily that the last two terms in this expression are zero. Thus we get
\[
\det\begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} \\ 0 & a_{2,2} & a_{2,3} \\ 0 & 0 & a_{3,3} \end{bmatrix}
= a_{1,1}\det\begin{bmatrix} a_{2,2} & a_{2,3} \\ 0 & a_{3,3} \end{bmatrix}
= a_{1,1}a_{2,2}a_{3,3}
\]
Once we know that the determinant of a $3 \times 3$ upper triangular matrix is the product of the diagonal elements, we can do a similar calculation to the one above to conclude that the determinant of a $4 \times 4$ upper triangular matrix is the product of the diagonal elements, and so on. Thus, the determinant of any (upper or lower) triangular $n \times n$ matrix is the product of the diagonal elements.

We know that an arbitrary $n \times n$ matrix can be reduced to an upper (or lower) triangular matrix by a sequence of row operations. This is the key to computing the determinant efficiently. We need to determine how the determinant of a matrix changes when we do an elementary row operation on it.

4.7.3 Summary of determinant calculation rules

We summarize the results on determinant calculation shown in more detail in the additional topics sections 4.8.6 to 4.8.12 below. The first three rules below show how a determinant changes when row operations are applied to it. This can be used to simplify determinant calculation as shown in the next section. In the statements below, $A$ is a square matrix.

1. If $B$ is obtained from $A$ by multiplying one row of $A$ by the constant $c$ then $\det(B) = c\det(A)$.
2. If $B$ is obtained from $A$ by switching two rows of $A$ then $\det(B) = -\det(A)$.
3. If $B$ is obtained from $A$ by adding a multiple of one row to another then $\det(B) = \det(A)$.
4. $\det(A) = 0$ if and only if $A$ is not invertible.
5. For all square matrices $B$ of the same size as $A$, $\det(AB) = \det(A)\det(B)$.
6. $\det(A^T) = \det(A)$.

4.7.4 Calculation of determinant using row operations

We can now use elementary row operations to compute the determinant of
\[
\begin{bmatrix} 1 & 2 & 3 \\ 1 & 2 & 1 \\ 2 & 3 & 0 \end{bmatrix}
\]
The sequence of row operations that transforms this matrix into an upper triangular one is (R2)-(R1), (R3)-2(R1), exchange (R2) and (R3). The determinant doesn't change under the first two transformations, and changes sign under the third. Thus
\[
\det\begin{bmatrix} 1 & 2 & 3 \\ 1 & 2 & 1 \\ 2 & 3 & 0 \end{bmatrix}
= \det\begin{bmatrix} 1 & 2 & 3 \\ 0 & 0 & -2 \\ 2 & 3 & 0 \end{bmatrix}
= \det\begin{bmatrix} 1 & 2 & 3 \\ 0 & 0 & -2 \\ 0 & -1 & -6 \end{bmatrix}
= -\det\begin{bmatrix} 1 & 2 & 3 \\ 0 & -1 & -6 \\ 0 & 0 & -2 \end{bmatrix}
= -(1)(-1)(-2) = -2
\]

4.7.5 More expansion formulae

We can use the properties of the determinant to derive alternative expansion formulae. Recall that we defined the determinant to be
\[
\det(A) = \sum_{j=1}^{n} (-1)^{j+1} a_{1,j}\det(M_{1,j}).
\]
In other words, we expanded along the top row. Now let's see that we can expand along other rows as well. Let $A$ be the original matrix with rows $a_1 =$
$[a_{1,1}, a_{1,2}, \ldots, a_{1,n}]$, $\ldots$, $a_n = [a_{n,1}, a_{n,2}, \ldots, a_{n,n}]$. For example, if $A$ is a $5 \times 5$ matrix then
\[
A = \begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \\ a_5 \end{bmatrix}
\]
Suppose we want to expand along the fourth row. Let $A'$ be the matrix where the fourth row of $A$ has been moved to the first row, with all other rows still in the same order, i.e.,
\[
A' = \begin{bmatrix} a_4 \\ a_1 \\ a_2 \\ a_3 \\ a_5 \end{bmatrix}
\]
How is the determinant of $A'$ related to the determinant of $A$? We can change $A$ to $A'$ by a series of row flips as follows:
\[
A = \begin{bmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \\ a_5 \end{bmatrix},\quad
\begin{bmatrix} a_1 \\ a_2 \\ a_4 \\ a_3 \\ a_5 \end{bmatrix},\quad
\begin{bmatrix} a_1 \\ a_4 \\ a_2 \\ a_3 \\ a_5 \end{bmatrix},\quad
\begin{bmatrix} a_4 \\ a_1 \\ a_2 \\ a_3 \\ a_5 \end{bmatrix} = A'
\]
We have performed 3 flips, so
\[
\det(A') = (-1)^3\det(A) = -\det(A).
\]
In general, to move the $i$th row to the top in this way, we must perform $i-1$ flips, so $\det(A') = (-1)^{i-1}\det(A)$.

Notice that $A'$ is a matrix with the properties
(1) $a'_{1,j} = a_{i,j}$, since we have moved the $i$th row to the top
(2) $M'_{1,j} = M_{i,j}$, since we haven't changed the order of the other rows.

Therefore
\[
\begin{aligned}
\det(A) &= (-1)^{i-1}\det(A') \\
&= (-1)^{i-1}\sum_{j=1}^{n} (-1)^{j+1} a'_{1,j}\det(M'_{1,j}) \\
&= (-1)^{i-1}\sum_{j=1}^{n} (-1)^{j+1} a_{i,j}\det(M_{i,j}) \\
&= \sum_{j=1}^{n} (-1)^{i+j} a_{i,j}\det(M_{i,j})
\end{aligned}
\]
This is the formula for expansion along the $i$th row.
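The expansion along the $i$th row can be checked numerically. The following is a minimal MATLAB sketch, not part of the original notes: the $3 \times 3$ matrix and the choice $i = 2$ are illustrative assumptions, and the built-in `det` is applied only to the $2 \times 2$ minors before comparing with `det(A)`.

```matlab
% Expansion of det(A) along row i, using the minors M_{i,j}.
% The matrix A and the row index i are illustrative choices.
A = [1 2 3; 1 3 1; 1 2 1];
n = size(A, 1);
i = 2;                                        % row to expand along
d = 0;
for j = 1:n
    M = A([1:i-1, i+1:n], [1:j-1, j+1:n]);    % cross out row i and column j
    d = d + (-1)^(i+j) * A(i,j) * det(M);     % signed term of the expansion
end
disp(d)         % value from the row expansion
disp(det(A))    % built-in determinant, for comparison
```

Changing `i` to any other row index should give the same value up to rounding, which is the content of the formula above.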
As an example let's compute the determinant of a $3 \times 3$ matrix by expanding along the second row.
\[
\begin{aligned}
\det\begin{bmatrix} 1 & 2 & 3 \\ 1 & 3 & 1 \\ 1 & 2 & 1 \end{bmatrix}
&= -\det\begin{bmatrix} 2 & 3 \\ 2 & 1 \end{bmatrix} + 3\det\begin{bmatrix} 1 & 3 \\ 1 & 1 \end{bmatrix} - \det\begin{bmatrix} 1 & 2 \\ 1 & 2 \end{bmatrix} \\
&= -2 + 6 + 3 - 9 - 2 + 2 = -2
\end{aligned}
\]
The formula for expanding along the $i$th row is handy if the matrix happens to have a row with many zeros.

Using the fact that $\det(A) = \det(A^T)$ we can also write down expansion formulae along columns, since the columns of $A$ are the rows of $A^T$. We end up with the formula
\[
\det(A) = \sum_{i=1}^{n} (-1)^{i+j} a_{i,j}\det(M_{i,j})
\]
As an example let's compute the determinant of a $3 \times 3$ matrix by expanding along the second column.
\[
\begin{aligned}
\det\begin{bmatrix} 1 & 2 & 3 \\ 1 & 3 & 1 \\ 1 & 2 & 1 \end{bmatrix}
&= -2\det\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} + 3\det\begin{bmatrix} 1 & 3 \\ 1 & 1 \end{bmatrix} - 2\det\begin{bmatrix} 1 & 3 \\ 1 & 1 \end{bmatrix} \\
&= -2 + 2 + 3 - 9 - 2 + 6 = -2
\end{aligned}
\]
The formula for expanding along the $j$th column is handy if the matrix happens to have a column with many zeros.

4.7.6 MATLAB

The MATLAB command det(A) can be used to determine the determinant of a square matrix as introduced in Chapter 3.

4.7.7 Problems

Problem 4.35 Find the determinant of
\[
\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 4 & 8 \\ 1 & 3 & 9 & 27 \\ 1 & 4 & 16 & 64 \end{bmatrix}
\]

Problem 4.36 Find the determinant of
\[
\begin{bmatrix} 1 & -1 & 1 & -1 \\ 1 & 2 & 4 & 8 \\ 1 & -2 & 4 & -8 \\ 1 & 1 & 1 & 1 \end{bmatrix}
\]
Problem 4.37 Consider the matrix
\[
A = \begin{bmatrix} 2 & 0 & 2 & 4 \\ 0 & 0 & 3 & 2 \\ 2 & 2 & 4 & 4 \\ 3 & 0 & 6 & 2 \end{bmatrix}.
\]
1. Find the determinant of $A$ by expanding on the first row.
2. Find the determinant of $A$ by expanding on the row or column of your choice. This expansion should be simpler than the one used in part 1.

Problem 4.38 Compute
\[
\det\begin{bmatrix} 1 & 0 & 1 \\ 1 & 2 & 3 \\ 3 & 0 & 1 \end{bmatrix}
\]
by expanding along the second row, and by expanding along the third column.

Problem 4.39 Find the determinant of $A$ given above by using row operations to put $A$ into echelon form.

Problem 4.40 We know that the determinant of the $n \times n$ matrix
\[
\begin{bmatrix} a_1 & & & \\ & a_2 & & \\ & & \ddots & \\ & & & a_n \end{bmatrix}
\]
is $a_1 a_2 \cdots a_n$ if the blank entries represent 0. What is the equivalent formula for the determinant of the $n \times n$ matrix
\[
\begin{bmatrix} & & & a_n \\ & & \iddots & \\ & a_2 & & \\ a_1 & & & \end{bmatrix}?
\]
Justify your answer.

Problem 4.41 Find all values of $\lambda$ for which the matrix
\[
\begin{bmatrix} 2-\lambda & 1 & 0 \\ -1 & -\lambda & 1 \\ 1 & 3 & 1-\lambda \end{bmatrix}
\]
is not invertible.

Problem 4.42 Determine whether each of the following statements is true or false. If true, explain briefly why it is true without showing specific examples. If false, explain briefly why it is false or give a simple counterexample, and give a correct statement by making small changes to the underlined text.
1. The determinant of a square matrix is the product of the entries on its main diagonal.
2. For every matrix $A$, we have $\det(AA^T) = \det(A^TA)$.
3. If $A$ and $B$ are both invertible $n \times n$ matrices, then $\det(B^{-1}AB) = \det(BAB^{-1}) = \det(A)$.
4. If an $n \times n$ matrix $A$ is multiplied by a scalar $c$, the determinant of the resulting matrix is $c \cdot \det(A)$.
5. If the column vectors of an $n \times n$ matrix are linearly independent, then the row vectors of the same matrix are also linearly independent.

4.8 Additional Topics

4.8.1 Application: General Least Squares

Let us restate our results from Chapter 3 on minimization of quadratic functions using matrix notation. A quadratic function of
\[
x = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}
\]
in $\mathbb{R}^n$ can be written in matrix form as
\[
f(x) = x \cdot Ax + b \cdot x + c
\]
where $A$ is an $n \times n$ matrix, $b \in \mathbb{R}^n$ and $c$ is a number. The vector containing the partial derivatives can be computed to be
\[
\begin{bmatrix} \partial f/\partial x_1 \\ \vdots \\ \partial f/\partial x_n \end{bmatrix} = (A + A^T)x + b
\]
Recall that we made the assumption that $a_{ij} = a_{ji}$ when we considered this problem before. This property can be stated in compact form as $A = A^T$. If this is true then $(A + A^T) = 2A$ so
\[
\begin{bmatrix} \partial f/\partial x_1 \\ \vdots \\ \partial f/\partial x_n \end{bmatrix} = 2Ax + b
\]
To find the minimum value of $f$ (if it exists) we need to find the value of $x$ for which the vector above is zero. In other words, $x$ solves the equation
\[
2Ax = -b.
\]
This is the same equation that we derived before.
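As a small numerical illustration of the equation $2Ax = -b$, here is a hedged MATLAB sketch. The symmetric matrix `A`, the vector `b` and the constant `c` are made-up values, chosen so that a minimum exists; they do not come from the notes.

```matlab
% Minimize f(x) = x.Ax + b.x + c for a symmetric A by solving 2Ax = -b.
% A, b and c are illustrative values (A is symmetric positive definite).
A = [2 1; 1 3];
b = [-4; 2];
c = 5;
x = (2*A) \ (-b);                 % candidate minimizer
f = @(v) v'*A*v + b'*v + c;       % the quadratic function itself
grad = (A + A')*x + b;            % vector of partial derivatives at x
disp(x')
disp(norm(grad))                  % numerically zero at the minimizer
```

Evaluating `f` at nearby points such as `x + [0.1; 0]` gives larger values, which is consistent with `x` being the minimizer for this choice of `A`.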
4.8.2 Least squares solutions

Let's take another look at the situation where a system of linear equations, which we now can write $Bx = c$, has no solution. Typically this will be the case if there are more equations than variables, that is, $B$ is a matrix with more rows than columns. In this case there is no value of $x$ that makes the left side equal the right side. However, we may try to find the value of $x$ for which the left side $Bx$ is closest to the right side $c$.

One way to go about this is to try to minimize the distance between the left and right sides. It is more convenient to minimize the square of the distance. This quantity can be written
\[
\begin{aligned}
\|Bx - c\|^2 &= (Bx - c) \cdot (Bx - c) \\
&= (Bx) \cdot (Bx) - (Bx) \cdot c - c \cdot (Bx) + c \cdot c \\
&= x \cdot (B^TBx) - 2(B^Tc) \cdot x + c \cdot c
\end{aligned}
\]
This is a quadratic function, written in matrix form. We want to use the formula of the previous section with $A = B^TB$, $b = -2B^Tc$ and constant term $c \cdot c$. Before we can do so, we must verify that $A = B^TB$ satisfies $A^T = A$. This is true because
\[
(B^TB)^T = B^T(B^T)^T = B^TB
\]
Thus the formula of the previous section implies that the minimum occurs at the value of $x$ that solves the linear equation
\[
B^TBx = B^Tc
\]
Here we have cancelled a factor of 2 on each side.

Now let's derive the same result in another way. Think of all the values of $Bx$, as $x$ ranges through all possible values in $\mathbb{R}^n$, as forming a (high dimensional) plane in $\mathbb{R}^m$. Our goal is to find the value of $x$ so that the corresponding value of $Bx$ on the plane is closest to $c$. Using the analogy to the geometric picture in three dimensions, we see that the minimum will occur when $Bx - c$ is orthogonal to the plane. This means that the dot product of $Bx - c$ with every vector in the plane, that is, every vector of the form $By$, should be zero. Thus we have
\[
(By) \cdot (Bx - c) = 0
\]
for every $y \in \mathbb{R}^n$. This is the same as
\[
y \cdot (B^T(Bx - c)) = y \cdot (B^TBx - B^Tc) = 0
\]
for every $y \in \mathbb{R}^n$. This can happen only if
\[
B^TBx = B^Tc
\]
which is the same result we obtained before.
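The equation $B^TBx = B^Tc$ can be tried out on a small overdetermined system. The sketch below is an illustration only: the $3 \times 2$ matrix `B` and the vector `c` are made-up data, not taken from the problems in these notes. It solves the system above and checks that the residual $Bx - c$ is orthogonal to the columns of $B$.

```matlab
% Least squares solution of an overdetermined system Bx = c.
% B and c are illustrative data (three equations, two unknowns).
B = [1 1; 1 2; 1 3];
c = [1; 2; 2];
x = (B'*B) \ (B'*c);      % solve B'B x = B'c
r = B*x - c;              % residual: not zero, since Bx = c has no solution
disp(x')
disp(norm(B'*r))          % numerically zero: r is orthogonal to every By
disp(norm(x - B\c))       % backslash returns the same least squares solution
```

For a matrix with more rows than columns, MATLAB's backslash operator computes a least squares solution directly, so the last line is a consistency check rather than a different method.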
4.8.3 Problems

Problem 4.43 Find the least squares solution to
\[
\begin{aligned}
x_1 + x_2 &= 1 \\
x_1 &= 1 \\
x_1 + x_2 &= 0
\end{aligned}
\]
Compare $Bx$ and $b$.

Problem 4.44 Refer back to the least squares fit example, where we tried to find the best straight line going through a collection of points $(x_i, y_i)$. Another way of formulating this problem is this. The line $y = ax + b$ passes through the point $(x_i, y_i)$ if
\[
ax_i + b = y_i \tag{4.15}
\]
So, saying that the straight line passes through all $n$ points is the same as saying that $a$ and $b$ solve the system of $n$ linear equations given by (4.15) for $i = 1, \ldots, n$. Of course, unless the points all actually lie on the same line, this system of equations has no solutions. Show that the least squares solution to this problem is the same as we obtained before. (You may simplify the problem by assuming there are only three points $(x_1, y_1)$, $(x_2, y_2)$ and $(x_3, y_3)$.)

4.8.4 Elementary matrices

Recall that there are three row operations that are used in Gaussian elimination: (1) multiplying a row by a non-zero number, (2) adding a multiple of one row to another row and (3) exchanging two rows.

It turns out that each elementary row operation can be implemented by left multiplication by a matrix. In other words, for each elementary row operation there is a matrix $Q$ such that $QA$ is what you get by doing that row operation to the matrix $A$.

Here is an example. Suppose
\[
A = \begin{bmatrix} 1 & 0 & 2 & 1 \\ 2 & 0 & 0 & 1 \\ 1 & 2 & 3 & 4 \end{bmatrix}
\]
and suppose that the row operation is multiplying the first row by 2. Then the matrix you get by doing that row operation to the matrix $A$ is
\[
A' = \begin{bmatrix} 2 & 0 & 4 & 2 \\ 2 & 0 & 0 & 1 \\ 1 & 2 & 3 & 4 \end{bmatrix}
\]
In this case the matrix $Q$ turns out to be
\[
Q = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]
Since
\[
\begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 2 & 1 \\ 2 & 0 & 0 & 1 \\ 1 & 2 & 3 & 4 \end{bmatrix}
= \begin{bmatrix} 2 & 0 & 4 & 2 \\ 2 & 0 & 0 & 1 \\ 1 & 2 & 3 & 4 \end{bmatrix}
\]
i.e., $QA = A'$.

Now suppose that the elementary row operation is subtracting twice the first row from the second row. Then the matrix you get by doing that row operation to the matrix $A$ is
\[
A' = \begin{bmatrix} 1 & 0 & 2 & 1 \\ 0 & 0 & -4 & -1 \\ 1 & 2 & 3 & 4 \end{bmatrix}
\]
In this case the matrix $Q$ turns out to be
\[
Q = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]
Since
\[
\begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 2 & 1 \\ 2 & 0 & 0 & 1 \\ 1 & 2 & 3 & 4 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 2 & 1 \\ 0 & 0 & -4 & -1 \\ 1 & 2 & 3 & 4 \end{bmatrix}
\]
i.e., again, $QA = A'$.

Finally, suppose that the elementary row operation is exchanging the second and the third rows. Then the matrix you get by doing that row operation to the matrix $A$ is
\[
A' = \begin{bmatrix} 1 & 0 & 2 & 1 \\ 1 & 2 & 3 & 4 \\ 2 & 0 & 0 & 1 \end{bmatrix}
\]
In this case the matrix $Q$ turns out to be
\[
Q = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}
\]
Since
\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 2 & 1 \\ 2 & 0 & 0 & 1 \\ 1 & 2 & 3 & 4 \end{bmatrix}
= \begin{bmatrix} 1 & 0 & 2 & 1 \\ 1 & 2 & 3 & 4 \\ 2 & 0 & 0 & 1 \end{bmatrix}
\]
i.e., again, $QA = A'$.

How can we find the matrices $Q$ (called elementary matrices)? Here is the procedure. Start with the identity matrix $I$ and do the row transformation to it. The resulting matrix $Q$ is the matrix that implements that row transformation by multiplication from the left. Notice that this is true in the examples above.
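The procedure just described (apply the row operation to $I$ to get $Q$) is easy to try in MATLAB. The sketch below uses the matrix $A$ of the three examples; the variable names `Q1`, `Q2`, `Q3` and `A1`, `A2`, `A3` are labels introduced here for illustration.

```matlab
% Build each elementary matrix by applying the row operation to eye(3),
% then check that left multiplication reproduces the row operation on A.
A = [1 0 2 1; 2 0 0 1; 1 2 3 4];

Q1 = eye(3);  Q1(1,:) = 2*Q1(1,:);              % multiply row 1 by 2
Q2 = eye(3);  Q2(2,:) = Q2(2,:) - 2*Q2(1,:);    % (R2) - 2(R1)
Q3 = eye(3);  Q3([2 3],:) = Q3([3 2],:);        % exchange rows 2 and 3

A1 = A;  A1(1,:) = 2*A1(1,:);                   % same operations done on A
A2 = A;  A2(2,:) = A2(2,:) - 2*A2(1,:);
A3 = A;  A3([2 3],:) = A3([3 2],:);

disp(isequal(Q1*A, A1))
disp(isequal(Q2*A, A2))
disp(isequal(Q3*A, A3))
```

All entries here are small integers, so the comparisons with `isequal` are exact; with non-integer data a tolerance-based comparison would be the safer choice.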
In the first example, the row transformation was multiplying the first row by 2. If you multiply the first row of
\[
I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]
by two you get
\[
Q = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.
\]
In the second example, the row transformation was subtracting twice the first row from the second row. If you subtract twice the first row from the second row of $I$ you get
\[
Q = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}.
\]
In the third example, the row transformation was exchanging the second and third rows. If you exchange the second and third rows of $I$, you get
\[
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}.
\]
Elementary matrices are useful in theoretical studies of the Gaussian elimination process. We will use them briefly when studying determinants.

Suppose $A$ is a matrix, and $R$ is its reduced form. Then we can obtain $R$ from $A$ via a sequence of elementary row operations. Suppose that the corresponding elementary matrices are $Q_1, Q_2, \ldots, Q_k$. Then, starting with $A$, the matrix after the first elementary row operation is $Q_1A$, then after the second elementary row operation is $Q_2Q_1A$, and so on, until we have
\[
Q_kQ_{k-1} \cdots Q_2Q_1A = R.
\]
Now let us apply the inverse matrices, starting with $Q_k^{-1}$. This gives
\[
Q_k^{-1}Q_kQ_{k-1} \cdots Q_2Q_1A = Q_{k-1} \cdots Q_2Q_1A = Q_k^{-1}R.
\]
Continuing in this way we see that
\[
A = Q_1^{-1}Q_2^{-1} \cdots Q_k^{-1}R
\]
In the special case that $A$ is an $n \times n$ invertible matrix, $A$ can be reduced to the identity matrix. In other words, we can take $R = I$. In this case $A$ can be written as a product of elementary matrices.
\[
A = Q_1^{-1}Q_2^{-1} \cdots Q_k^{-1}I = Q_1^{-1}Q_2^{-1} \cdots Q_k^{-1}
\]
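Notice that in this case
\[
A^{-1} = Q_kQ_{k-1} \cdots Q_2Q_1.
\]
As an example, let us write the matrix
\[
A = \begin{bmatrix} 2 & 1 \\ 5 & 3 \end{bmatrix}
\]
as a product of elementary matrices. The sequence of row transformations that reduce $A$ to the identity are:
1) (1/2)(R1)
2) (R2)-5(R1)
3) (R1)-(R2)
4) 2(R2)

The corresponding elementary matrices and their inverses are
\[
\begin{aligned}
Q_1 &= \begin{bmatrix} 1/2 & 0 \\ 0 & 1 \end{bmatrix} & Q_1^{-1} &= \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix} \\
Q_2 &= \begin{bmatrix} 1 & 0 \\ -5 & 1 \end{bmatrix} & Q_2^{-1} &= \begin{bmatrix} 1 & 0 \\ 5 & 1 \end{bmatrix} \\
Q_3 &= \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix} & Q_3^{-1} &= \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \\
Q_4 &= \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} & Q_4^{-1} &= \begin{bmatrix} 1 & 0 \\ 0 & 1/2 \end{bmatrix}
\end{aligned}
\]
Therefore
\[
A = Q_1^{-1}Q_2^{-1}Q_3^{-1}Q_4^{-1}
\]
or
\[
\begin{bmatrix} 2 & 1 \\ 5 & 3 \end{bmatrix}
= \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 5 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 \\ 0 & 1/2 \end{bmatrix}
\]

4.8.5 Problems

Problem 4.45 Each elementary matrix is invertible, and the inverse is also an elementary matrix. Find the inverses of the three examples of elementary matrices above. Notice that the inverse elementary matrix is the matrix for the row transformation that undoes the original row transformation.

Problem 4.46 Write the matrix
\[
\begin{bmatrix} 2 & 3 & -1 \\ 1 & 2 & 3 \\ -1 & -1 & 1 \end{bmatrix}
\]
as a product of elementary matrices.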
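The $2 \times 2$ factorization worked out in Section 4.8.4 can be checked numerically before attempting the problems. This is a minimal sketch using the matrices $Q_1, \ldots, Q_4$ from that example; the names `R`, `P` and `Ainv` are introduced here for illustration.

```matlab
% Check the factorization of A = [2 1; 5 3] into elementary matrices.
A  = [2 1; 5 3];
Q1 = [1/2 0; 0 1];      % (1/2)(R1)
Q2 = [1 0; -5 1];       % (R2) - 5(R1)
Q3 = [1 -1; 0 1];       % (R1) - (R2)
Q4 = [1 0; 0 2];        % 2(R2)

R    = Q4*Q3*Q2*Q1*A;                        % reduced form: the identity
P    = inv(Q1)*inv(Q2)*inv(Q3)*inv(Q4);      % product of the inverses: A
Ainv = Q4*Q3*Q2*Q1;                          % the inverse of A

disp(R)
disp(P)
disp(Ainv)
```

Note the order of the factors: the row operations are applied from right to left in `R`, while the inverses appear in the original order in `P`.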
4.8.6 Exchanging two rows changes the sign of the determinant

We start with the elementary row operation of exchanging two rows. For $2 \times 2$ determinants,
\[
\det\begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc,
\]
while
\[
\det\begin{bmatrix} c & d \\ a & b \end{bmatrix} = cb - da = -(ad - bc),
\]
so exchanging two rows changes the sign of the determinant.

We can do a similar calculation for $3 \times 3$ matrices. It's a bit messier, but still manageable. Again, we find that exchanging two rows changes the sign of the determinant.

How about the $n \times n$ case? We will assume that we have already proved the result for the $(n-1) \times (n-1)$ case, and show how we can use this to show the result for an $n \times n$ matrix. Thus knowing the result for $2 \times 2$ matrices implies it for $3 \times 3$, which in turn implies it for $4 \times 4$ matrices, and so on. We consider three cases, depending on which rows we are exchanging. Suppose $A$ is the original matrix and $A'$ is the matrix with two rows exchanged.

(1) Exchanging two rows other than the first row: In this case, if we cross out the first row and any column from $A'$ we obtain $M'_{1,j}$, which is the same as the matrix $M_{1,j}$ (corresponding to $A$) except with two of its rows exchanged. Since the size of $M_{1,j}$ is $n-1$ we know that $\det(M'_{1,j}) = -\det(M_{1,j})$ so
\[
\begin{aligned}
\det(A') &= \sum_{j=1}^{n} (-1)^{j+1} a_{1,j}\det(M'_{1,j}) \\
&= -\sum_{j=1}^{n} (-1)^{j+1} a_{1,j}\det(M_{1,j}) \\
&= -\det(A)
\end{aligned}
\]
(2) Exchanging the first and second row. To see that this changes the sign of the determinant we have to expand the expansion. The following is a bit sketchy. I'll probably skip it in class, but give the argument here for completeness. If we expand $M_{1,j}$ we get
\[
\det(M_{1,j}) = \sum_{k=1}^{j-1} (-1)^{k+1} a_{2,k}\det(M_{1,2,j,k}) + \sum_{k=j+1}^{n} (-1)^{k} a_{2,k}\det(M_{1,2,j,k})
\]
where $M_{1,2,j,k}$ is the matrix obtained from $A$ by deleting the first and second rows, and the $j$th and $k$th columns. Inserting this into the expansion for $A$ gives
\[
\det(A) = \sum_{j=1}^{n}\sum_{k=1}^{j-1} (-1)^{j+k} a_{1,j}a_{2,k}\det(M_{1,2,j,k}) - \sum_{j=1}^{n}\sum_{k=j+1}^{n} (-1)^{j+k} a_{1,j}a_{2,k}\det(M_{1,2,j,k})
\]
The sum splits into two parts. Flipping the first two rows of $A$ just exchanges the two sums. In other words $S - R$ becomes $R - S$ which is $-(S - R)$. So exchanging the first two rows also changes the sign of the determinant.

(3) Exchanging the first row with the $k$th row. We can effect this exchange by first exchanging the $k$th and the second row, then exchanging the first and the second row, then exchanging the $k$th and the second row again. Each flip changes the determinant by a minus sign, and since there are three flips, the overall change is by a minus sign.

Thus we can say that for any $n \times n$ matrix, exchanging two rows changes the sign of the determinant.

One immediate consequence of this fact is that a matrix with two rows the same has determinant zero. This is because if we exchange the two rows the determinant changes by a minus sign, but the matrix doesn't change. Thus $\det(A) = -\det(A)$ which is only possible if $\det(A) = 0$.

4.8.7 The determinant is linear in each row separately

To say that the determinant is linear in the $j$th row means that if we write a matrix as a matrix of row vectors,
\[
A = \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_j \\ \vdots \\ a_n \end{bmatrix}
\]
then
\[
\det\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ sb + tc \\ \vdots \\ a_n \end{bmatrix}
= s\det\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ b \\ \vdots \\ a_n \end{bmatrix}
+ t\det\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ c \\ \vdots \\ a_n \end{bmatrix}
\]
It is easy to see from the expansion formula that the determinant is linear in the first row. For a $3 \times 3$ example we have
\[
\begin{aligned}
\det&\begin{bmatrix} sb_1 + tc_1 & sb_2 + tc_2 & sb_3 + tc_3 \\ a_{2,1} & a_{2,2} & a_{2,3} \\ a_{3,1} & a_{3,2} & a_{3,3} \end{bmatrix} \\
&= (sb_1 + tc_1)\det(M_{1,1}) - (sb_2 + tc_2)\det(M_{1,2}) + (sb_3 + tc_3)\det(M_{1,3}) \\
&= s\bigl(b_1\det(M_{1,1}) - b_2\det(M_{1,2}) + b_3\det(M_{1,3})\bigr) + t\bigl(c_1\det(M_{1,1}) - c_2\det(M_{1,2}) + c_3\det(M_{1,3})\bigr) \\
&= s\det\begin{bmatrix} b_1 & b_2 & b_3 \\ a_{2,1} & a_{2,2} & a_{2,3} \\ a_{3,1} & a_{3,2} & a_{3,3} \end{bmatrix}
+ t\det\begin{bmatrix} c_1 & c_2 & c_3 \\ a_{2,1} & a_{2,2} & a_{2,3} \\ a_{3,1} & a_{3,2} & a_{3,3} \end{bmatrix}
\end{aligned}
\]
A similar calculation can be done for any $n \times n$ matrix to show linearity in the first row. To show linearity in some other row, we first swap that row and the first row, then use linearity in the first row, and then swap back again. So
\[
\begin{aligned}
\det\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ sb + tc \\ \vdots \\ a_n \end{bmatrix}
&= -\det\begin{bmatrix} sb + tc \\ a_2 \\ \vdots \\ a_1 \\ \vdots \\ a_n \end{bmatrix} \\
&= -s\det\begin{bmatrix} b \\ a_2 \\ \vdots \\ a_1 \\ \vdots \\ a_n \end{bmatrix}
- t\det\begin{bmatrix} c \\ a_2 \\ \vdots \\ a_1 \\ \vdots \\ a_n \end{bmatrix} \\
&= s\det\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ b \\ \vdots \\ a_n \end{bmatrix}
+ t\det\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ c \\ \vdots \\ a_n \end{bmatrix}
\end{aligned}
\]
Notice that linearity in each row separately does not mean that $\det(A + B) = \det(A) + \det(B)$.

Note that multiplying a row by a constant multiplies the determinant by the constant. This is a special case of linearity.

4.8.8 Adding a multiple of one row to another doesn't change the determinant

Now we will see that the most often used row operation, adding a multiple of one row to another, doesn't change the determinant at all. Let $A$ be an $n \times n$
matrix. Write $A$ as a matrix of rows.
\[
A = \begin{bmatrix} a_1 \\ \vdots \\ a_i \\ \vdots \\ a_j \\ \vdots \\ a_n \end{bmatrix}
\]
Adding $s$ times the $i$th row to the $j$th row yields
\[
A' = \begin{bmatrix} a_1 \\ \vdots \\ a_i \\ \vdots \\ a_j + sa_i \\ \vdots \\ a_n \end{bmatrix}
\]
So
\[
\det(A') = \det\begin{bmatrix} a_1 \\ \vdots \\ a_i \\ \vdots \\ a_j + sa_i \\ \vdots \\ a_n \end{bmatrix}
= \det\begin{bmatrix} a_1 \\ \vdots \\ a_i \\ \vdots \\ a_j \\ \vdots \\ a_n \end{bmatrix}
+ s\det\begin{bmatrix} a_1 \\ \vdots \\ a_i \\ \vdots \\ a_i \\ \vdots \\ a_n \end{bmatrix}
= \det(A) + 0
\]
Here we used linearity in a row and the fact that the determinant of a matrix with two rows the same is zero.

4.8.9 The determinant of QA

To begin, we compute the determinants of the elementary matrices. Recall that if $A'$ is the matrix obtained from $A$ by an elementary row operation, then
(1) $\det(A') = -\det(A)$ if the row operation is swapping two rows
(2) $\det(A') = s\det(A)$ if the row operation is multiplying a row by $s$
(3) $\det(A') = \det(A)$ if the row operation is adding a multiple of one row to another
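The rules just recalled are all that is needed to compute a determinant by row reduction. The following MATLAB sketch is an illustration under stated assumptions, not the algorithm used inside MATLAB's `det`: it uses only row swaps (each one flips a sign counter `sgn`) and additions of multiples of rows (which change nothing), then multiplies the diagonal of the resulting triangular matrix. The test matrix is the one from Section 4.7.4; the pivoting choice (largest entry in the column) is an assumption made here for numerical safety, so the intermediate matrices differ from the hand computation in the text.

```matlab
% Determinant by row reduction, tracking the effect of each row operation.
A = [1 2 3; 1 2 1; 2 3 0];
U = A;
n = size(U, 1);
sgn = 1;                                   % flips with every row exchange
for k = 1:n-1
    [~, p] = max(abs(U(k:n, k)));          % pivot: largest entry in column k
    p = p + k - 1;                         % convert to a row index of U
    if U(p, k) == 0
        continue                           % column is already zero below row k
    end
    if p ~= k
        U([k p], :) = U([p k], :);         % exchange rows: sign changes
        sgn = -sgn;
    end
    for r = k+1:n
        U(r, :) = U(r, :) - (U(r, k) / U(k, k)) * U(k, :);   % no change in det
    end
end
d = sgn * prod(diag(U));                   % triangular: product of the diagonal
disp(d)
disp(det(A))
```

The work grows roughly like $n^3$ rather than $n!$, which is why this approach remains practical for large matrices.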
Recall that the elementary matrices are obtained from the identity matrix $I$ by an elementary row operation. So we can take $A = I$ and $A' = Q$ in the formulae above to obtain
(1) $\det(Q) = -\det(I) = -1$ if the row operation is swapping two rows
(2) $\det(Q) = s\det(I) = s$ if the row operation is multiplying a row by $s$
(3) $\det(Q) = \det(I) = 1$ if the row operation is adding a multiple of one row to another

Going back to the first set of formulae, we have that in each case $A' = QA$. In each case the factor in front of $\det(A)$ is exactly $\det(Q)$. So we see that in each case
\[
\det(QA) = \det(Q)\det(A).
\]
This formula can be generalized. If $Q_1, Q_2, \ldots, Q_k$ are elementary matrices then
\[
\det(Q_1Q_2Q_3 \cdots Q_kA) = \det(Q_1)\det(Q_2Q_3 \cdots Q_kA) = \det(Q_1)\det(Q_2)\det(Q_3 \cdots Q_kA)
\]
and so on, so we arrive at the formula
\[
\det(Q_1Q_2Q_3 \cdots Q_kA) = \det(Q_1)\det(Q_2) \cdots \det(Q_k)\det(A).
\]

4.8.10 The determinant of A is zero exactly when A is not invertible

Recall that if $R$ denotes the reduced form of $A$, obtained by performing the sequence of row reductions corresponding to $Q_1, Q_2, \ldots, Q_k$, then
\[
A = Q_1^{-1}Q_2^{-1} \cdots Q_k^{-1}R
\]
Each $Q_i^{-1}$ is an elementary matrix, therefore
\[
\det(A) = \det(Q_1^{-1})\det(Q_2^{-1}) \cdots \det(Q_k^{-1})\det(R)
\]
If $A$ is not invertible, then $R$ has a row of zeros along the bottom. Thus $R$ is an upper triangular matrix with at least one zero on the diagonal. The determinant of $R$ is the product of the diagonal elements so $\det(R) = 0$. Thus $\det(A) = 0$ too.

If $A$ is invertible, then we can reduce $A$ to the identity matrix. In other words, we can take $R = I$. Then $\det(R) = 1$. Each $\det(Q_i^{-1})$ is non-zero too, so $\det(A) \neq 0$.

4.8.11 The product formula: det(AB) = det(A) det(B)

If either $A$ or $B$ is non-invertible, then $AB$ is non-invertible too. Thus $\det(AB) = 0$ and one of $\det(A)$ or $\det(B)$ is zero, so $\det(A)\det(B) = 0$ too. Thus $\det(AB) = \det(A)\det(B)$.

If both $A$ and $B$ are invertible, then
\[
A = Q_1^{-1}Q_2^{-1} \cdots Q_k^{-1}
\]
so
\[
\det(A) = \det(Q_1^{-1})\det(Q_2^{-1}) \cdots \det(Q_k^{-1})
\]
and
\[
B = \tilde{Q}_1^{-1}\tilde{Q}_2^{-1} \cdots \tilde{Q}_j^{-1}
\]
so
\[
\det(B) = \det(\tilde{Q}_1^{-1})\det(\tilde{Q}_2^{-1}) \cdots \det(\tilde{Q}_j^{-1})
\]
Therefore
\[
AB = Q_1^{-1}Q_2^{-1} \cdots Q_k^{-1}\tilde{Q}_1^{-1}\tilde{Q}_2^{-1} \cdots \tilde{Q}_j^{-1}
\]
so
\[
\det(AB) = \det(Q_1^{-1})\det(Q_2^{-1}) \cdots \det(Q_k^{-1})\det(\tilde{Q}_1^{-1})\det(\tilde{Q}_2^{-1}) \cdots \det(\tilde{Q}_j^{-1}) = \det(A)\det(B)
\]

4.8.12 The determinant of the transpose

Recall that the transpose $A^T$ of a matrix $A$ is the matrix you get when you flip $A$ about its diagonal. If $A$ is an $n \times n$ matrix, so is $A^T$ and we can ask what the relationship between the determinants of these two matrices is. It turns out that they are the same.
\[
\det(A^T) = \det(A).
\]
If $A$ is an upper or lower triangular matrix, this follows from the fact that the determinant of a triangular matrix is the product of the diagonal entries. If $A$ is an arbitrary $n \times n$ matrix then the formula follows from two facts.

(1) The transpose of a product of two matrices is given by $(AB)^T = B^TA^T$. This implies that $(A_1A_2 \cdots A_n)^T = A_n^T \cdots A_2^TA_1^T$.

(2) For an elementary matrix $Q$ we have $\det(Q^T) = \det(Q)$.

If you accept these two facts, then we may write
\[
A = Q_1^{-1}Q_2^{-1} \cdots Q_k^{-1}R
\]
where $R$ is upper triangular. Thus
\[
A^T = R^T(Q_k^{-1})^T \cdots (Q_2^{-1})^T(Q_1^{-1})^T
\]
so
\[
\begin{aligned}
\det(A^T) &= \det(R^T)\det((Q_k^{-1})^T) \cdots \det((Q_1^{-1})^T) \\
&= \det(R)\det(Q_k^{-1}) \cdots \det(Q_1^{-1}) \\
&= \det(Q_1^{-1}) \cdots \det(Q_k^{-1})\det(R) \\
&= \det(A)
\end{aligned}
\]
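The product and transpose formulas, and the warning that the determinant is not additive, can be illustrated with a quick MATLAB check. The two $3 \times 3$ matrices below are illustrative choices made for this sketch; a numerical check of this kind supports the formulas but of course does not prove them.

```matlab
% Numerical illustration of det(AB) = det(A)det(B) and det(A') = det(A).
% A and B are illustrative matrices with det(A) = -2 and det(B) = 5.
A = [1 2 3; 1 2 1; 2 3 0];
B = [2 0 1; 1 1 0; 0 3 1];

dA  = det(A);
dB  = det(B);
dAB = det(A*B);
dAt = det(A');

disp([dAB, dA*dB])         % the two values agree up to rounding
disp([dAt, dA])            % transpose has the same determinant
disp([det(A+B), dA + dB])  % these differ: det is not additive
```

Because `det` works in floating point, the comparisons should be made with a small tolerance rather than with `==`.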
4.8.13 An impractical formula for the inverse

We can use the expansion formulae of the previous section to obtain a formula for the inverse of a matrix $A$. This formula is really only practical for $3 \times 3$ matrices, since for larger matrices, the work involved in computing the determinants appearing is prohibitive.

We begin with the expansion formula
\[
\det(A) = \sum_{j=1}^{n} (-1)^{i+j} a_{i,j}\det(M_{i,j})
\]
If $A$ is invertible, then $\det(A) \neq 0$ so we can divide by it to obtain
\[
1 = \sum_{j=1}^{n} a_{i,j}\frac{(-1)^{i+j}\det(M_{i,j})}{\det(A)}
\]
Now suppose we take the matrix $A$ and replace the $i$th row by the $k$th row for some $k \neq i$. The resulting matrix $A'$ has two rows the same, so its determinant is zero. Its expansion is the same as that for $A$, except that $a_{i,j}$ is replaced by $a_{k,j}$. Thus, if $k \neq i$
\[
0 = \sum_{j=1}^{n} (-1)^{i+j} a_{k,j}\det(M_{i,j})
\]
Dividing by $\det(A)$ yields
\[
0 = \sum_{j=1}^{n} a_{k,j}\frac{(-1)^{i+j}\det(M_{i,j})}{\det(A)}
\]
Now let $B$ be the matrix with entries
\[
b_{i,j} = \frac{(-1)^{i+j}\det(M_{j,i})}{\det(A)}
\]
This turns out to be the inverse $A^{-1}$.

It gets a bit confusing with all the indices, but let's think about what we need to show. The $k,i$th entry of the product $AB$ is given by
\[
\begin{aligned}
(AB)_{k,i} &= \sum_{j=1}^{n} a_{k,j}b_{j,i} \\
&= \sum_{j=1}^{n} a_{k,j}\frac{(-1)^{i+j}\det(M_{i,j})}{\det(A)}
\end{aligned}
\]
According to the formulae above, this sum is equal to 1 if $k = i$ and equal to 0 if $k \neq i$. In other words, $AB$ is the identity matrix $I$. This shows that $B = A^{-1}$. Remember that
4.8.14 Cramer's rule, an impractical way to solve systems

Given an impractical way to compute the inverse, we can derive an impractical formula for the solution of a system of $n$ equations in $n$ unknowns, i.e., a matrix equation
\[
Ax = b
\]
The solution $x$ is equal to $A^{-1}b$. So if
\[
x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}
\quad\text{and}\quad
b = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix},
\]
then using the formula of the previous section for the inverse, we get
\[
x_i = \sum_{j=1}^{n} (A^{-1})_{i,j}\,b_j = \sum_{j=1}^{n} \frac{(-1)^{i+j}\det(M_{j,i})}{\det(A)}\,b_j
\]
but this is exactly $1/\det(A)$ times the formula for expanding, along the $i$th column, the determinant of the matrix obtained from $A$ by replacing the $i$th column with $b$. Thus
\[
x_i = \frac{\det(\text{matrix obtained from } A \text{ by replacing the } i\text{th column with } b)}{\det(A)}
\]

4.8.15 Problems

Problem 4.47 Find the inverse of
\[
\begin{bmatrix} 1 & 0 & 1 \\ 1 & 2 & 3 \\ 3 & 0 & 1 \end{bmatrix}
\]
using the "impractical" formula.

Problem 4.48 Solve the equation
\[
\begin{bmatrix} 1 & 0 & 1 \\ 1 & 2 & 3 \\ 3 & 0 & 1 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
= \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}
\]
using Cramer's rule.

4.9 Solutions to Chapter Problems

Solution 4.1
\[
AB = \begin{bmatrix} -13 & 7 \\ -9 & 5 \end{bmatrix}
\qquad
BA = \begin{bmatrix} 1 & 2 & -1 \\ -2 & -4 & -8 \\ -1 & -2 & -5 \end{bmatrix}
\qquad
AD = \begin{bmatrix} -14 \\ -18 \end{bmatrix}
\]
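\[
CB = \begin{bmatrix} 4 & 2 \end{bmatrix}
\qquad
CD = \begin{bmatrix} 26 \end{bmatrix}
\qquad
DC = \begin{bmatrix} 4 & -4 & 0 \\ -22 & 22 & 0 \\ 4 & -4 & 0 \end{bmatrix}
\]
$AC$, $CA$, $DA$, $BC$, $BD$ and $DB$ are not defined.

Solution 4.2
\[
A = \begin{bmatrix} 3 & 0 \\ -1 & 2 \\ 1 & 1 \end{bmatrix};\quad
B = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix};\quad
C = \begin{bmatrix} 1 & 4 & 2 \\ 3 & 1 & 5 \end{bmatrix};
\]
The matrix $A$ is $3 \times 2$, the matrix $B$ is $2 \times 2$, and the matrix $C$ is $2 \times 3$. For a product of two matrices to be possible, the number of columns in the left matrix has to be equal to the number of rows in the right matrix. Hence there are 5 possible products:

1. $A \cdot B = \begin{bmatrix} 0 & 3 \\ 0 & -1 \\ 0 & 1 \end{bmatrix}$
2. $A \cdot C = \begin{bmatrix} 3 & 12 & 6 \\ 5 & -2 & 8 \\ 4 & 5 & 7 \end{bmatrix}$
3. $B \cdot B = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$
4. $B \cdot C = \begin{bmatrix} 3 & 1 & 5 \\ 0 & 0 & 0 \end{bmatrix}$
5. $C \cdot A = \begin{bmatrix} 1 & 10 \\ 13 & 7 \end{bmatrix}$

Solution 4.3 If
\[
A = \begin{bmatrix} 0 & a & b \\ 0 & 0 & c \\ 0 & 0 & 0 \end{bmatrix}
\]
then
\[
A^2 = \begin{bmatrix} 0 & 0 & ac \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\qquad
A^3 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}
\]
If
\[
A = \begin{bmatrix} 1 & 0 & a \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]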
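then
\[
A^2 = \begin{bmatrix} 1 & 0 & 2a \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\qquad
A^3 = \begin{bmatrix} 1 & 0 & 3a \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]

Solution 4.4
a,b)
\[
A^k = \begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix}
\]
c) Use the power series formula
\[
e^x = 1 + x + \frac{x^2}{2} + \frac{x^3}{3!} + \cdots
\]
and substitute in the matrix $tA$ for each occurrence of $x$. Substitute the identity matrix for 1. This gives
\[
\begin{aligned}
e^{tA} &= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + t\begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} + \frac{t^2}{2}\begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix} + \cdots \\
&= \begin{bmatrix} 1 + t + \frac{t^2}{2} + \frac{t^3}{3!} + \cdots & t + \frac{2}{2}t^2 + \frac{3}{3!}t^3 + \cdots \\ 0 & 1 + t + \frac{t^2}{2} + \frac{t^3}{3!} + \cdots \end{bmatrix} \\
&= \begin{bmatrix} 1 + t + \frac{t^2}{2} + \frac{t^3}{3!} + \cdots & t + t^2 + \frac{t^3}{2!} + \cdots \\ 0 & 1 + t + \frac{t^2}{2} + \frac{t^3}{3!} + \cdots \end{bmatrix} \\
&= \begin{bmatrix} e^t & te^t \\ 0 & e^t \end{bmatrix}
\end{aligned}
\]
d,e) We are looking for all matrices that satisfy $B^2 = A$. Let $B = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$. Then $B^2 = \begin{bmatrix} a^2 + bc & ab + bd \\ ca + dc & bc + d^2 \end{bmatrix}$ so we need to satisfy the equations
\[
\begin{aligned}
a^2 + bc &= 1 \\
ab + bd &= 1 \\
ca + dc &= 0 \\
bc + d^2 &= 1
\end{aligned}
\]
The third equation says $c(a + d) = 0$ so either $c = 0$ or $a + d = 0$. But $a + d = 0$ would contradict the second equation, so we must have $c = 0$. So
\[
\begin{aligned}
a^2 &= 1 \\
b(a + d) &= 1 \\
d^2 &= 1
\end{aligned}
\]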
So $a = \pm 1$ and $d = \pm 1$. If $a = 1$ then to satisfy the second equation we must have $d = 1$ and $b = 1/2$. If $a = -1$ then to satisfy the second equation we must have $d = -1$ and $b = -1/2$. Thus the two square roots are
\[
\begin{bmatrix} 1 & 1/2 \\ 0 & 1 \end{bmatrix}
\quad\text{and}\quad
\begin{bmatrix} -1 & -1/2 \\ 0 & -1 \end{bmatrix}.
\]
In general, a matrix may have more than two square roots.

Solution 4.5
\[
A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\qquad
A^2 = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\qquad
A^3 = \begin{bmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\]
\[
A^4 = A^5 = \cdots = 0
\]

Solution 4.6 $T(x + y) = x + y + a$ whereas $T(x) + T(y) = x + a + y + a = x + y + 2a$. Since these are not equal, $T$ is not linear.

Solution 4.7 It follows from the properties of the dot product that $a \cdot (sx + ty) = sa \cdot x + ta \cdot y$.

Solution 4.8 One way to do these problems is to determine the angle $\theta$ that the lines make with the $x$ axis, and then substitute into the formula. So, for part (a), we have $\theta = \pi/4$ ($45^\circ$). Thus the projection matrix is
\[
\frac{1}{2}\begin{bmatrix} 1 + \cos(2\theta) & \sin(2\theta) \\ \sin(2\theta) & 1 - \cos(2\theta) \end{bmatrix}
= \frac{1}{2}\begin{bmatrix} 1 + \cos(\pi/2) & \sin(\pi/2) \\ \sin(\pi/2) & 1 - \cos(\pi/2) \end{bmatrix}
= \frac{1}{2}\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}
\]
Another way to do this problem is to go back to the derivation of the projection matrix, and redo the formula for the matrix for projection in the direction of $a$ when $a$ is not necessarily a unit vector. This gives
\[
\frac{1}{a_1^2 + a_2^2}\begin{bmatrix} a_1^2 & a_1a_2 \\ a_2a_1 & a_2^2 \end{bmatrix}.
\]
For part (a) the vector $a = [1, 1]$ so this formula gives the same answer. For part (b) the vector $a = [4, -3]$ so the projection matrix is
\[
\frac{1}{25}\begin{bmatrix} 16 & -12 \\ -12 & 9 \end{bmatrix}.
\]
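The projection formula for a direction vector that is not a unit vector is easy to evaluate in MATLAB. A minimal sketch for the vector $a = [4, -3]$ of part (b) follows; writing the matrix as the outer product `a*a'` divided by `a'*a` is an equivalent form of the formula above, and the names `P` and `Rf` are introduced here for illustration.

```matlab
% Projection onto the line through the origin in direction a,
% where a need not be a unit vector.
a  = [4; -3];
P  = (a*a') / (a'*a);      % entries a_i a_j / (a_1^2 + a_2^2)
Rf = 2*P - eye(2);         % reflection across the same line
disp(P)
disp(norm(P*P - P))        % projecting twice is the same as projecting once
disp(norm(Rf*Rf - eye(2))) % reflecting twice gives the identity
```

The reflection matrix `Rf` uses the relation $2P - I$ between projections and reflections.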
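Solution 4.9 Since we have already computed the corresponding projection matrices, we can use the formula $2P - I$ to get the reflection matrices. This gives

(a)
\[
\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}
\]
(b)
\[
\begin{bmatrix} 7/25 & -24/25 \\ -24/25 & -7/25 \end{bmatrix}
\]

Solution 4.10 Here we can just substitute into the formula for the rotation matrix. This gives

(a)
\[
\begin{bmatrix} 1/\sqrt{2} & -1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix}
\]
(b)
\[
\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}
\]
(c)
\[
\begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}
\]

Solution 4.11 We need to compute the matrix product
\[
\begin{bmatrix} \cos(2\theta) & \sin(2\theta) \\ \sin(2\theta) & -\cos(2\theta) \end{bmatrix}
\begin{bmatrix} \cos(2\phi) & \sin(2\phi) \\ \sin(2\phi) & -\cos(2\phi) \end{bmatrix}
\]
This equals
\[
\begin{bmatrix} \cos(2\theta)\cos(2\phi) + \sin(2\theta)\sin(2\phi) & \cos(2\theta)\sin(2\phi) - \sin(2\theta)\cos(2\phi) \\ \sin(2\theta)\cos(2\phi) - \cos(2\theta)\sin(2\phi) & \sin(2\theta)\sin(2\phi) + \cos(2\theta)\cos(2\phi) \end{bmatrix}
\]
Using the addition formulae for cos and sin this can be rewritten
\[
\begin{bmatrix} \cos(2\theta - 2\phi) & -\sin(2\theta - 2\phi) \\ \sin(2\theta - 2\phi) & \cos(2\theta - 2\phi) \end{bmatrix}
\]
This is the matrix for rotation by $2\theta - 2\phi$.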
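The identity in Solution 4.11 can be spot-checked numerically. In the MATLAB sketch below the angles `theta` and `phi` are arbitrary illustrative values, and `Ref` and `Rot` are helper names introduced here for the reflection and rotation matrices.

```matlab
% Two reflections compose to a rotation by 2*theta - 2*phi.
theta = 0.7;                 % illustrative angles, in radians
phi   = 0.2;
Ref = @(t) [cos(2*t) sin(2*t); sin(2*t) -cos(2*t)];   % reflection matrix
Rot = @(t) [cos(t) -sin(t); sin(t) cos(t)];           % rotation matrix
D = Ref(theta)*Ref(phi) - Rot(2*theta - 2*phi);
disp(norm(D))                % numerically zero
```

Swapping the order of the two reflections gives the rotation by $2\phi - 2\theta$ instead, so the product depends on the order of the factors.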
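  • 175. [Figure 4.12: Problem 4.12.]

Solution 4.12 We first find the angle $\theta$ between the line $x = 2y$ and the x-axis. Note that (using Pythagoras) $\cos\theta = \frac{2}{\sqrt{5}}$, $\sin\theta = \frac{1}{\sqrt{5}}$. Now,
\[ f\begin{bmatrix} 1 \\ 10 \end{bmatrix} = \begin{bmatrix} \cos(2\theta) & \sin(2\theta) \\ \sin(2\theta) & -\cos(2\theta) \end{bmatrix} \begin{bmatrix} 1 \\ 10 \end{bmatrix}. \]
So, we need to calculate $\cos(2\theta)$ and $\sin(2\theta)$. We have $\cos(2\theta) = \cos^2\theta - \sin^2\theta = \frac{4}{5} - \frac{1}{5} = \frac{3}{5}$, and $\sin(2\theta) = 2\sin\theta\cos\theta = 2 \cdot \frac{1}{\sqrt{5}} \cdot \frac{2}{\sqrt{5}} = \frac{4}{5}$. So,
\[ f\begin{bmatrix} 1 \\ 10 \end{bmatrix} = \begin{bmatrix} 3/5 & 4/5 \\ 4/5 & -3/5 \end{bmatrix} \begin{bmatrix} 1 \\ 10 \end{bmatrix} = \frac{1}{5}\begin{bmatrix} 3 & 4 \\ 4 & -3 \end{bmatrix} \begin{bmatrix} 1 \\ 10 \end{bmatrix} = \frac{1}{5}\begin{bmatrix} 43 \\ -26 \end{bmatrix}. \]
Therefore, the matrix of $f$ is
\[ \frac{1}{5}\begin{bmatrix} 3 & 4 \\ 4 & -3 \end{bmatrix}. \]

Solution 4.13 Notice that $g$ is in fact the composition of $\mathrm{Ref}_\theta$ and $R_{\pi/2}$, that is: $g = R_{\pi/2} \circ \mathrm{Ref}_\theta$ where $\theta$ is the angle between the line $x = -y$ and the x-axis (so, $\theta = \pi/2 + \pi/4 = 3\pi/4$). The matrix of $g$ is given by the product of the matrices $R_{\pi/2}$ and $\mathrm{Ref}_{3\pi/4}$. Note that
\[ [g] = [R_{\pi/2}][\mathrm{Ref}_{3\pi/4}] \]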
  • 176.
\[ = \begin{bmatrix} \cos \pi/2 & -\sin \pi/2 \\ \sin \pi/2 & \cos \pi/2 \end{bmatrix} \begin{bmatrix} \cos 6\pi/4 & \sin 6\pi/4 \\ \sin 6\pi/4 & -\cos 6\pi/4 \end{bmatrix} = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 0 & -1 \\ -1 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \]

Solution 4.14 Under this rotation, the vector $e_1 = [1, 0, 0]$ gets transformed to $[\cos(\theta), \sin(\theta), 0]$, $e_2 = [0, 1, 0]$ gets transformed to $[-\sin(\theta), \cos(\theta), 0]$, and $e_3 = [0, 0, 1]$ is transformed to itself (i.e. it doesn't change). Putting this in the columns of a matrix yields
\[ \begin{bmatrix} \cos(\theta) & -\sin(\theta) & 0 \\ \sin(\theta) & \cos(\theta) & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

Solution 4.15 Let $x = (x_1, x_2, x_3, x_4)$, $y = (y_1, y_2, y_3, y_4)$.

1.
\begin{align*}
T(x + y) &= T(x_1 + y_1, x_2 + y_2, x_3 + y_3, x_4 + y_4) \\
&= \big((x_1 + y_1) + 4(x_2 + y_2) + 5(x_3 + y_3),\; 3(x_1 + y_1) - 2(x_2 + y_2) + (x_3 + y_3) - (x_4 + y_4),\; -(x_1 + y_1) - (x_3 + y_3) + (x_4 + y_4)\big) \\
&= (x_1 + 4x_2 + 5x_3 + y_1 + 4y_2 + 5y_3,\; 3x_1 - 2x_2 + x_3 - x_4 + 3y_1 - 2y_2 + y_3 - y_4,\; -x_1 - x_3 + x_4 - y_1 - y_3 + y_4) \\
&= (x_1 + 4x_2 + 5x_3,\; 3x_1 - 2x_2 + x_3 - x_4,\; -x_1 - x_3 + x_4) + (y_1 + 4y_2 + 5y_3,\; 3y_1 - 2y_2 + y_3 - y_4,\; -y_1 - y_3 + y_4) \\
&= T(x) + T(y).
\end{align*}
This holds for any $x, y \in \mathbb{R}^4$.

2.
\begin{align*}
T(cx) &= T(cx_1, cx_2, cx_3, cx_4) \\
&= (cx_1 + 4cx_2 + 5cx_3,\; 3cx_1 - 2cx_2 + cx_3 - cx_4,\; -cx_1 - cx_3 + cx_4) \\
&= (c(x_1 + 4x_2 + 5x_3),\; c(3x_1 - 2x_2 + x_3 - x_4),\; c(-x_1 - x_3 + x_4)) \\
&= c(x_1 + 4x_2 + 5x_3,\; 3x_1 - 2x_2 + x_3 - x_4,\; -x_1 - x_3 + x_4) \\
&= cT(x).
\end{align*}
This holds for any $x \in \mathbb{R}^4$, and any $c \in \mathbb{R}$. It is clear then that $T$ is linear.

Solution 4.16 First recall the definition: a vector $y = (y_1, y_2, y_3)$ is said to be in the range of $T$ if there exists a vector $x = (x_1, x_2)$ such that $T(x) = y$.
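  • 177. a) We should be looking for $x = (x_1, x_2)$, if any, such that $T(x) = (1, 4, 2)$. But
\[ \begin{bmatrix} 1 \\ 4 \\ 2 \end{bmatrix} = T\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = A\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 0 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} x_1 + 2x_2 \\ x_2 \\ x_1 + x_2 \end{bmatrix}. \]
We get the following system of equations:
\[ \begin{cases} x_1 + 2x_2 = 1 \\ x_2 = 4 \\ x_1 + x_2 = 2 \end{cases} \;\Rightarrow\; \begin{cases} x_1 + 8 = 1 \\ x_1 + 4 = 2 \end{cases} \;\Rightarrow\; \begin{cases} x_1 = -7 \\ x_1 = -2 \end{cases} \]
So, the system does not have any solutions. In other words, there is no $x = (x_1, x_2)$ that satisfies the system above. So there is no $x$ such that $T(x) = (1, 4, 2)$. Hence, $(1, 4, 2)$ is not in the range of $T$.

b) We are looking for $x = (x_1, x_2)$, if any, such that $T(x) = (1, 1, 1)$. So
\[ \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = T\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = A\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 0 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} x_1 + 2x_2 \\ x_2 \\ x_1 + x_2 \end{bmatrix}. \]
We get the following system of equations:
\[ \begin{cases} x_1 + 2x_2 = 1 \\ x_2 = 1 \\ x_1 + x_2 = 1 \end{cases} \;\Rightarrow\; \begin{cases} x_1 + 2 = 1 \\ x_1 + 1 = 1 \end{cases} \;\Rightarrow\; \begin{cases} x_1 = -1 \\ x_1 = 0 \end{cases} \]
So, as in the previous case, the system does not have any solutions; in other words, $(1, 1, 1)$ is not in the range of $T$.

Solution 4.17
a) Note that
\[ \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = 3\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} - 2\begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} + 3\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}. \]
So,
\[ T\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = 3T\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} - 2T\begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} + 3T\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = 3\begin{bmatrix} 1 \\ 1 \end{bmatrix} - 2\begin{bmatrix} 2 \\ 0 \end{bmatrix} + 3\begin{bmatrix} -1 \\ 5 \end{bmatrix} = \begin{bmatrix} -4 \\ 18 \end{bmatrix} \]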
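  • 178. b) In order to write the matrix of $T$, we need to find $T(0, 1, 0)$. Note that
\[ \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} - \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}. \]
So,
\[ T\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = T\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} - T\begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} - \begin{bmatrix} 2 \\ 0 \end{bmatrix} = \begin{bmatrix} -1 \\ 1 \end{bmatrix}. \]
Now, the matrix of $T$ is given by
\[ [T] = \begin{bmatrix} T\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} & T\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} & T\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \end{bmatrix} = \begin{bmatrix} 1 & -1 & -1 \\ 1 & 1 & 5 \end{bmatrix} \]

Solution 4.18
a)
\begin{align*}
(h \circ g)\begin{bmatrix} 1 \\ 0 \\ -5 \end{bmatrix} &= h\left(g\begin{bmatrix} 1 \\ 0 \\ -5 \end{bmatrix}\right) = h\begin{bmatrix} 1 + 5 + 0 \\ -5 - 2 \times 0 \end{bmatrix} \tag{4.16} \\
&= h\begin{bmatrix} 6 \\ -5 \end{bmatrix} = \begin{bmatrix} 6 \\ -6 \\ -5 \\ 5 \end{bmatrix}. \tag{4.17}
\end{align*}
b) We first find the matrix of $g$. To do so, we need to know $g(1, 0, 0)$, $g(0, 1, 0)$, and $g(0, 0, 1)$. We have that
\[ g\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad g\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ -2 \end{bmatrix}, \quad g\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} -1 \\ 1 \end{bmatrix}. \]
So, the matrix of $g$ is
\[ [g] = \begin{bmatrix} 1 & 1 & -1 \\ 0 & -2 & 1 \end{bmatrix}. \]
Now we find the matrix of $h$. As with $g$, we need to find $h(1, 0)$, and $h(0, 1)$. We have that
\[ h\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \\ 0 \\ 0 \end{bmatrix}, \quad h\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 1 \\ -1 \end{bmatrix}. \]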
  • 179. So, the matrix of $h$ is
\[ [h] = \begin{bmatrix} 1 & 0 \\ -1 & 0 \\ 0 & 1 \\ 0 & -1 \end{bmatrix}. \]
c) The matrix of $h \circ g$ is given by
\[ [h \circ g] = [h]_{4 \times 2}\,[g]_{2 \times 3} = \begin{bmatrix} 1 & 0 \\ -1 & 0 \\ 0 & 1 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} 1 & 1 & -1 \\ 0 & -2 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 1 & -1 \\ -1 & -1 & 1 \\ 0 & -2 & 1 \\ 0 & 2 & -1 \end{bmatrix}_{4 \times 3} \]
We can confirm that $h \circ g : \mathbb{R}^3 \to \mathbb{R}^4$, so the matrix of $h \circ g$ has 4 rows and 3 columns, as expected.

Solution 4.19 We can determine the transition matrix, since we know each $p_{i,i} = 0$, and the sum over each column is one. We are given that $p_{2,1} = 1/2$. Then, since $p_{1,1} + p_{2,1} + p_{3,1} = 1$, we have $0 + 1/2 + p_{3,1} = 1$, so $p_{3,1} = 1/2$. Similarly $p_{1,2} = 1/3$, $p_{3,2} = 2/3$, and $p_{1,3} = 1/4$, $p_{2,3} = 3/4$. Thus
\[ P = \begin{bmatrix} 0 & 1/3 & 1/4 \\ 1/2 & 0 & 3/4 \\ 1/2 & 2/3 & 0 \end{bmatrix} \]
If we start out with the probability vector $x_0 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$ (i.e., the walker is in location 1) then after two time steps the probability vector is $P^2 x_0$, i.e.,
\[ x_2 = \begin{bmatrix} 7/24 & 1/6 & 1/4 \\ 3/8 & 2/3 & 1/8 \\ 1/3 & 1/6 & 5/8 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 7/24 \\ 3/8 \\ 1/3 \end{bmatrix} \]
So the probability that the walker is in position 2 after two time steps is 3/8.

Solution 4.20
a) Since $P_{i,i} = 0$, and the sum of entries in each column is 1, the matrix $P$ is
\[ P = \begin{bmatrix} 0 & 2/5 & 1/4 \\ 1/3 & 0 & 3/4 \\ 2/3 & 3/5 & 0 \end{bmatrix}. \]
b) The random walker starts at location 1. So $x_0 = (1, 0, 0)^T$. The positions after three time steps will be
\[ x_1 = P x_0 = \begin{bmatrix} 0 \\ 1/3 \\ 2/3 \end{bmatrix}, \quad x_2 = P x_1 = \begin{bmatrix} 3/10 \\ 1/2 \\ 1/5 \end{bmatrix}, \quad x_3 = P x_2 = \begin{bmatrix} 1/4 \\ 1/4 \\ 1/2 \end{bmatrix}, \]
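  • 180. So, $x_{3,2}$, that is, the probability that the random walker is in location 2 after 3 steps, will be $\frac{1}{4} = 0.25$.

c)
\[ x_0 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}, \quad x_1 = P x_0 = \begin{bmatrix} 1/4 \\ 3/4 \\ 0 \end{bmatrix}, \quad x_2 = P x_1 = \frac{1}{60}\begin{bmatrix} 18 \\ 5 \\ 37 \end{bmatrix}, \]
So, $x_{2,1}$, the probability that the random walker is in location 1 after 2 steps, will be $\frac{18}{60} = 0.3$.

Solution 4.21
a)
\begin{align*}
P &= \begin{bmatrix} 1/4 & 1/4 & 1/4 & 1/4 \\ 1/4 & 1/4 & 1/4 & 1/4 \\ 1/4 & 1/4 & 1/4 & 1/4 \\ 1/4 & 1/4 & 1/4 & 1/4 \end{bmatrix} = \frac{1}{4}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix} \tag{4.18} \\
P^2 &= \frac{1}{4} \cdot \frac{1}{4}\begin{bmatrix} 4 & 4 & 4 & 4 \\ 4 & 4 & 4 & 4 \\ 4 & 4 & 4 & 4 \\ 4 & 4 & 4 & 4 \end{bmatrix} = \frac{4}{4 \cdot 4}\begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 \end{bmatrix}. \tag{4.19}
\end{align*}
So, $P^2 = P$, thus $P^3 = P \cdot P^2 = P \cdot P = P$. Hence, $P^n = P$ for every $n \geq 1$.

b) Let
\[ x_0 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}. \]
Then
\[ x_1 = P x_0 = \frac{1}{4}\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}. \]
In general, $x_n = P^n x_0$. But $P^n = P$ for every $n \geq 1$. So,
\[ x_n = P^n x_0 = P x_0 = x_1 = \frac{1}{4}\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}. \]
Now, let
\[ x_0 = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}, \quad x_1 = P x_0 = \frac{1}{4}\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}. \]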
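  • 181. Again
\[ x_n = P^n x_0 = P x_0 = \frac{1}{4}\begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \end{bmatrix}. \]

Solution 4.22
\[ P = \begin{bmatrix} 1/3 & 1/4 & 2/5 \\ 0 & 1/2 & 1/5 \\ 2/3 & 1/4 & 2/5 \end{bmatrix} \]
a) We are given
\[ x_0 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}. \]
Then
\[ x_1 = P x_0 = \begin{bmatrix} 1/4 \\ 1/2 \\ 1/4 \end{bmatrix}. \]
So, $x_{1,1} = \frac{1}{4}$ is the probability that the random walker is in location 1 after one time step.

b) Again
\[ x_0 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}. \]
Then
\[ x_1 = P x_0 = \begin{bmatrix} 1/4 \\ 1/2 \\ 1/4 \end{bmatrix}, \quad x_2 = P x_1 = \begin{bmatrix} 37/120 \\ 3/10 \\ 47/120 \end{bmatrix} \;\Rightarrow\; x_{2,1} = \frac{37}{120}. \]
c)
\[ x_0 = \begin{bmatrix} 1/3 \\ 1/3 \\ 1/3 \end{bmatrix}. \]
We have
\[ x_1 = P x_0 = \begin{bmatrix} 59/180 \\ 7/30 \\ 79/180 \end{bmatrix}, \quad x_2 = P x_1 = \begin{bmatrix} 1853/5400 \\ 46/225 \\ 2443/5400 \end{bmatrix} \]
Therefore, $x_{2,3} = \frac{2443}{5400}$.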
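The repeated products $x_{k+1} = P x_k$ in these random walk solutions are easy to get wrong by hand, so here is a short Python sketch that reproduces Solution 4.22 with exact fractions. The helper name `step` is an illustrative choice; the matrix and starting vectors are the ones from the solution.

```python
from fractions import Fraction as F

def step(P, x):
    """One time step of the random walk: returns the vector P x."""
    return [sum(P[i][j] * x[j] for j in range(len(x))) for i in range(len(P))]

# Transition matrix of Solution 4.22 (each column sums to one).
P = [[F(1, 3), F(1, 4), F(2, 5)],
     [F(0),    F(1, 2), F(1, 5)],
     [F(2, 3), F(1, 4), F(2, 5)]]

# Parts a) and b): the walker starts in location 2.
x0 = [F(0), F(1), F(0)]
x1 = step(P, x0)
x2 = step(P, x1)

# Part c): the walker starts in each location with probability 1/3.
u2 = step(P, step(P, [F(1, 3)] * 3))

print(x1, x2, u2)
```

Because every column of $P$ sums to one, each iterate should again sum to one, which gives a quick sanity check on any hand computation.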
  • 182. Solution 4.23 In this situation the transition matrix is
\[ P = \frac{1}{3}\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} \]
so $P = P^2 = P^3 = \cdots = P^n$. In this case, if we start out with any probability vector $x_0 = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$ with $x_1 + x_2 + x_3 = 1$, then
\[ P x_0 = \begin{bmatrix} 1/3 \\ 1/3 \\ 1/3 \end{bmatrix} = P^n x_0 \]
for each $n$.

Solution 4.24
(a) The transition matrix is
\[ P = \begin{bmatrix} 1/2 & 0 & 1/3 \\ 1/4 & 1/2 & 1/3 \\ 1/4 & 1/2 & 1/3 \end{bmatrix} \]
The system starts in state 3 so $x^{(0)} = e_3$. After one step, the probability vector is
\[ x^{(1)} = P x^{(0)} = \begin{bmatrix} 1/2 & 0 & 1/3 \\ 1/4 & 1/2 & 1/3 \\ 1/4 & 1/2 & 1/3 \end{bmatrix} \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1/3 \\ 1/3 \\ 1/3 \end{bmatrix} \]
That is, there is an equal probability of being in any state. After another step,
\[ x^{(2)} = P x^{(1)} = \begin{bmatrix} 1/2 & 0 & 1/3 \\ 1/4 & 1/2 & 1/3 \\ 1/4 & 1/2 & 1/3 \end{bmatrix} \begin{bmatrix} 1/3 \\ 1/3 \\ 1/3 \end{bmatrix} = \begin{bmatrix} 5/18 \\ 13/36 \\ 13/36 \end{bmatrix} \]
So the probability that the system is in state 2 is 13/36.

(b) Suppose that the initial probabilities are given by the vector
\[ x^{(0)} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}. \]
Then after $k$ steps, the probabilities will be $x^{(k)} = P^k x^{(0)}$. After many steps (using the limiting behaviour of $P^k$ stated in the question):
\[ x^{(k)} \approx \begin{bmatrix} 0.25 & 0.25 & 0.25 \\ 0.375 & 0.375 & 0.375 \\ 0.375 & 0.375 & 0.375 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \]
  • 183.
\[ = \begin{bmatrix} 0.25x_1 + 0.25x_2 + 0.25x_3 \\ 0.375x_1 + 0.375x_2 + 0.375x_3 \\ 0.375x_1 + 0.375x_2 + 0.375x_3 \end{bmatrix} = (x_1 + x_2 + x_3)\begin{bmatrix} 0.25 \\ 0.375 \\ 0.375 \end{bmatrix} = \begin{bmatrix} 0.25 \\ 0.375 \\ 0.375 \end{bmatrix} \]
where the last equality follows because the initial probabilities must sum to one. Note that the state tends to this probability distribution no matter what the initial state is (or initial probabilities are). This is known as the steady state probability vector of this random walk.

Solution 4.25
\[ Ax = \begin{bmatrix} x_1 + 2x_2 + 3x_3 \\ 4x_1 + 5x_2 + 6x_3 \end{bmatrix} \]
so $y \cdot Ax = y_1 x_1 + 2y_1 x_2 + 3y_1 x_3 + 4y_2 x_1 + 5y_2 x_2 + 6y_2 x_3$. On the other hand
\[ A^T y = \begin{bmatrix} y_1 + 4y_2 \\ 2y_1 + 5y_2 \\ 3y_1 + 6y_2 \end{bmatrix} \]
so $(A^T y) \cdot x = y_1 x_1 + 4y_2 x_1 + 2y_1 x_2 + 5y_2 x_2 + 3y_1 x_3 + 6y_2 x_3$. These expressions are equal.

Solution 4.26 $(A^T)^T = A$.

Solution 4.27 $AB = \begin{bmatrix} 9 & 12 & 15 \\ 7 & 11 & 15 \end{bmatrix}$ so $(AB)^T = \begin{bmatrix} 9 & 7 \\ 12 & 11 \\ 15 & 15 \end{bmatrix}$. On the other hand
\[ B^T A^T = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix} \begin{bmatrix} 1 & 3 \\ 2 & 1 \end{bmatrix} = \begin{bmatrix} 9 & 7 \\ 12 & 11 \\ 15 & 15 \end{bmatrix}. \]

Solution 4.28 If $y = e_i$ and $x = e_j$ (vectors that are all zeros except in one spot) then $y \cdot (Ax) = e_i \cdot (A e_j)$ is the matrix entry $a_{ij}$. So we can conclude that all the matrix entries of $A$ are the same as those for $B$. This means the matrices must be the same.

Solution 4.29 $x \cdot (Ay) = x^T A y = (A^T x)^T y = (A^T x) \cdot y$

Solution 4.30 To determine whether these matrices are invertible we reduce them using Gaussian elimination. This gives
(a) $\begin{bmatrix} 1 & 2 \\ 0 & -2 \end{bmatrix}$
  • 184. (b) $\begin{bmatrix} 1 & 2 & 3 \\ 0 & 3 & 4 \\ 0 & 0 & -1/3 \end{bmatrix}$
(c) $\begin{bmatrix} 1 & 2 & 3 \\ 0 & 3 & 4 \\ 0 & 0 & 2/3 \end{bmatrix}$
Since these all have rank equal to their size, they are all invertible.

Solution 4.31 The inverse is $\begin{bmatrix} -5 & 2 \\ 3 & -1 \end{bmatrix}$

Solution 4.32
(a) This matrix reduces to $\begin{bmatrix} 2 & 3 & -1 \\ 0 & 1/2 & 7/2 \\ 0 & 0 & 0 \end{bmatrix}$ and so is not invertible
(b) The inverse is $\begin{bmatrix} -1 & 0 & 1 \\ 1 & 1 & 0 \\ 3 & 1 & -1 \end{bmatrix}$
(c) $\begin{bmatrix} 3 & -5/2 & 1/2 \\ -3 & 4 & -1 \\ 1 & -3/2 & 1/2 \end{bmatrix}$
(d) $\begin{bmatrix} -7 & 5 & 3 \\ 3 & -2 & -2 \\ 3 & -2 & -1 \end{bmatrix}$
(e) $\begin{bmatrix} 1 & 0 & -a \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$
(f) $\begin{bmatrix} 1 & -a & ac - b \\ 0 & 1 & -c \\ 0 & 0 & 1 \end{bmatrix}$

Solution 4.33 In order to find the inverse of $A$, we form the augmented matrix $[A|I]$. Then we apply elementary row operations to reduce $A$ to its reduced row echelon form. If $A$ is invertible, this method transforms $[A|I]$ into $[I|B]$, where $B$ is the inverse of $A$. Now, let
\[ A = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 2 & 3 \\ 5 & 5 & 1 \end{bmatrix} \]
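  • 185. and form the augmented matrix:
\[ \left[\begin{array}{ccc|ccc} 1 & 1 & 1 & 1 & 0 & 0 \\ 0 & 2 & 3 & 0 & 1 & 0 \\ 5 & 5 & 1 & 0 & 0 & 1 \end{array}\right] \]
The reduced row echelon form of the augmented matrix is
\[ \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 13/8 & -1/2 & -1/8 \\ 0 & 1 & 0 & -15/8 & 1/2 & 3/8 \\ 0 & 0 & 1 & 5/4 & 0 & -1/4 \end{array}\right] \]
Therefore, the inverse of $A$ is
\[ A^{-1} = \begin{bmatrix} 13/8 & -1/2 & -1/8 \\ -15/8 & 1/2 & 3/8 \\ 5/4 & 0 & -1/4 \end{bmatrix}. \]
Let
\[ B = \begin{bmatrix} 1 & 2 & -3 & 1 \\ -1 & 3 & -3 & -2 \\ 2 & 0 & 1 & 5 \\ 3 & 1 & -2 & 5 \end{bmatrix} \]
Applying the same method, we observe that $B$ is not invertible because
\[ \mathrm{rref}(B) = \begin{bmatrix} 1 & 0 & 0 & 2 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}. \]
Note that the rank of $B$ is 3, which is the number of non-zero rows in $\mathrm{rref}(B)$.

Solution 4.34
a) Let
\[ A = \begin{bmatrix} 1 & 0 & 5 \\ 1 & -2 & 3 \\ 2 & 1 & -3 \end{bmatrix} \]
be the coefficient matrix. Then
\[ Ax = \begin{bmatrix} 6 \\ 14 \\ -2 \end{bmatrix} \]
b)
\[ A^{-1} = \begin{bmatrix} 3/28 & 5/28 & 5/14 \\ 9/28 & -13/28 & 1/14 \\ 5/28 & -1/28 & -1/14 \end{bmatrix} = \frac{1}{28}\begin{bmatrix} 3 & 5 & 10 \\ 9 & -13 & 2 \\ 5 & -1 & -2 \end{bmatrix} \]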
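  • 186. c) In order to solve the system $Ax = b$, we multiply both sides by $A^{-1}$:
\[ A^{-1}Ax = A^{-1}b = A^{-1}\begin{bmatrix} 6 \\ 14 \\ -2 \end{bmatrix} = \frac{1}{7}\begin{bmatrix} 17 \\ -33 \\ 5 \end{bmatrix} \;\Rightarrow\; Ix = x = \frac{1}{7}\begin{bmatrix} 17 \\ -33 \\ 5 \end{bmatrix}. \]
So, the system has the unique solution
\[ x = \frac{1}{7}\begin{bmatrix} 17 \\ -33 \\ 5 \end{bmatrix} \]

Solution 4.35 We reduce the matrix as follows:
\[ \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 4 & 8 \\ 1 & 3 & 9 & 27 \\ 1 & 4 & 16 & 64 \end{bmatrix} \]
\[ \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 3 & 7 \\ 0 & 2 & 8 & 26 \\ 0 & 3 & 15 & 63 \end{bmatrix} \begin{matrix} \\ (R2) - (R1) \\ (R3) - (R1) \\ (R4) - (R1) \end{matrix} \]
\[ \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 3 & 7 \\ 0 & 0 & 2 & 12 \\ 0 & 0 & 6 & 42 \end{bmatrix} \begin{matrix} \\ \\ (R3) - 2(R2) \\ (R4) - 3(R2) \end{matrix} \]
\[ \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 3 & 7 \\ 0 & 0 & 2 & 12 \\ 0 & 0 & 0 & 6 \end{bmatrix} \begin{matrix} \\ \\ \\ (R4) - 3(R3) \end{matrix} \]
None of these operations affect the determinant. So the determinant of the original matrix is the same as the determinant of the reduced upper triangular matrix. This determinant is the product of the diagonal elements, which equals 12.

Solution 4.36 We reduce the matrix as follows:
\[ \begin{bmatrix} 1 & -1 & 1 & -1 \\ 1 & 2 & 4 & 8 \\ 1 & -2 & 4 & -8 \\ 1 & 1 & 1 & 1 \end{bmatrix} \]
\[ \begin{bmatrix} 1 & -1 & 1 & -1 \\ 0 & 3 & 3 & 9 \\ 0 & -1 & 3 & -7 \\ 0 & 2 & 0 & 2 \end{bmatrix} \begin{matrix} \\ (R2) - (R1) \\ (R3) - (R1) \\ (R4) - (R1) \end{matrix} \]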
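  • 187.
\[ \begin{bmatrix} 1 & -1 & 1 & -1 \\ 0 & 1 & 1 & 3 \\ 0 & -1 & 3 & -7 \\ 0 & 2 & 0 & 2 \end{bmatrix} \begin{matrix} \\ (1/3)(R2) \\ \\ \\ \end{matrix} \]
\[ \begin{bmatrix} 1 & -1 & 1 & -1 \\ 0 & 1 & 1 & 3 \\ 0 & 0 & 4 & -4 \\ 0 & 0 & -2 & -4 \end{bmatrix} \begin{matrix} \\ \\ (R3) + (R2) \\ (R4) - 2(R2) \end{matrix} \]
\[ \begin{bmatrix} 1 & -1 & 1 & -1 \\ 0 & 1 & 1 & 3 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & -2 & -4 \end{bmatrix} \begin{matrix} \\ \\ (1/4)(R3) \\ \\ \end{matrix} \]
\[ \begin{bmatrix} 1 & -1 & 1 & -1 \\ 0 & 1 & 1 & 3 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & -6 \end{bmatrix} \begin{matrix} \\ \\ \\ (R4) + 2(R3) \end{matrix} \]
The two operations that changed the determinant were multiplying the second row by 1/3 and multiplying the third row by 1/4. Thus the determinant of the reduced upper triangular matrix is $(1/3)(1/4)$ times the determinant of the original matrix. Hence the determinant of the original matrix is $3 \times 4 \times (-6) = -72$.

Solution 4.37
a)
\begin{align*}
\det A &= \det\begin{bmatrix} 2 & 0 & 2 & 4 \\ 0 & 0 & 3 & 2 \\ 2 & 2 & 4 & 4 \\ 3 & 0 & 6 & 2 \end{bmatrix} \\
&= 2\det\begin{bmatrix} 0 & 3 & 2 \\ 2 & 4 & 4 \\ 0 & 6 & 2 \end{bmatrix} - 0 + 2\det\begin{bmatrix} 0 & 0 & 2 \\ 2 & 2 & 4 \\ 3 & 0 & 2 \end{bmatrix} - 4\det\begin{bmatrix} 0 & 0 & 3 \\ 2 & 2 & 4 \\ 3 & 0 & 6 \end{bmatrix} \\
&= 2\left(0 - 3\det\begin{bmatrix} 2 & 4 \\ 0 & 2 \end{bmatrix} + 2\det\begin{bmatrix} 2 & 4 \\ 0 & 6 \end{bmatrix}\right) + 2(2)\det\begin{bmatrix} 2 & 2 \\ 3 & 0 \end{bmatrix} - 4(3)\det\begin{bmatrix} 2 & 2 \\ 3 & 0 \end{bmatrix} \\
&= 2(0 - 3(4 - 0) + 2(12 - 0)) + 4(0 - 6) - 12(0 - 6) \\
&= 2(12) - 24 + 72 = 72.
\end{align*}
b)
\[ \det A = \det\begin{bmatrix} 2 & 0 & 2 & 4 \\ 0 & 0 & 3 & 2 \\ 2 & 2 & 4 & 4 \\ 3 & 0 & 6 & 2 \end{bmatrix} \]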
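  • 188.
\begin{align*}
&= -0 + 0 - 2\det\begin{bmatrix} 2 & 2 & 4 \\ 0 & 3 & 2 \\ 3 & 6 & 2 \end{bmatrix} + 0 \\
&= -2\left(-0 + 3\det\begin{bmatrix} 2 & 4 \\ 3 & 2 \end{bmatrix} - 2\det\begin{bmatrix} 2 & 2 \\ 3 & 6 \end{bmatrix}\right) \\
&= -2(3(4 - 12) - 2(12 - 6)) = -2(-24 - 12) = 72
\end{align*}

Solution 4.38 Expanding along the second row,
\begin{align*}
\det\begin{bmatrix} 1 & 0 & 1 \\ 1 & 2 & 3 \\ 3 & 0 & 1 \end{bmatrix} &= -1\det\begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix} + 2\det\begin{bmatrix} 1 & 1 \\ 3 & 1 \end{bmatrix} - 3\det\begin{bmatrix} 1 & 0 \\ 3 & 0 \end{bmatrix} \\
&= -1 \times 0 + 2 \times (1 - 3) - 3 \times 0 = -4
\end{align*}
and expanding along the third column,
\begin{align*}
\det\begin{bmatrix} 1 & 0 & 1 \\ 1 & 2 & 3 \\ 3 & 0 & 1 \end{bmatrix} &= 1\det\begin{bmatrix} 1 & 2 \\ 3 & 0 \end{bmatrix} - 3\det\begin{bmatrix} 1 & 0 \\ 3 & 0 \end{bmatrix} + 1\det\begin{bmatrix} 1 & 0 \\ 1 & 2 \end{bmatrix} \\
&= 1 \times (0 - 6) - 3 \times 0 + 1 \times (2 - 0) = -4
\end{align*}

Solution 4.39
\[ D = \det\begin{bmatrix} 2 & 0 & 2 & 4 \\ 0 & 0 & 3 & 2 \\ 2 & 2 & 4 & 4 \\ 3 & 0 & 6 & 2 \end{bmatrix} \quad\to\quad (1,:) = \tfrac{1}{2}(1,:) \]
\[ \tfrac{1}{2}D = \det\begin{bmatrix} 1 & 0 & 1 & 2 \\ 0 & 0 & 3 & 2 \\ 2 & 2 & 4 & 4 \\ 3 & 0 & 6 & 2 \end{bmatrix} \quad\to\quad \begin{matrix} (3,:) = (3,:) - 2(1,:) \\ (4,:) = (4,:) - 3(1,:) \end{matrix} \]
\[ \tfrac{1}{2}D = \det\begin{bmatrix} 1 & 0 & 1 & 2 \\ 0 & 0 & 3 & 2 \\ 0 & 2 & 2 & 0 \\ 0 & 0 & 3 & -4 \end{bmatrix} \quad\to\quad (4,:) = (4,:) - (2,:) \]
\[ \tfrac{1}{2}D = \det\begin{bmatrix} 1 & 0 & 1 & 2 \\ 0 & 0 & 3 & 2 \\ 0 & 2 & 2 & 0 \\ 0 & 0 & 0 & -6 \end{bmatrix} \quad\to\quad (2,:) \leftrightarrow (3,:) \]
\[ -\tfrac{1}{2}D = \det\begin{bmatrix} 1 & 0 & 1 & 2 \\ 0 & 2 & 2 & 0 \\ 0 & 0 & 3 & 2 \\ 0 & 0 & 0 & -6 \end{bmatrix} \]
Therefore, $-\tfrac{1}{2}D = (1)(2)(3)(-6) = -36$, so $D = 72$.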
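The bookkeeping in Solutions 4.35 to 4.39 (row subtractions leave the determinant unchanged, a row swap flips its sign, and the pivots multiply together) can be sketched in a few lines of Python. This is an illustrative sketch, not the notes' own code; the function name `det_by_elimination` is an assumption, and it accumulates the pivots directly instead of rescaling rows as the hand solutions do.

```python
from fractions import Fraction

def det_by_elimination(rows):
    """Determinant via reduction to upper triangular form.

    Subtracting a multiple of one row from another leaves the determinant
    unchanged; each row interchange multiplies it by -1. The result is the
    signed product of the pivots.
    """
    A = [[Fraction(v) for v in row] for row in rows]
    n = len(A)
    det = Fraction(1)
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero entry in this column.
        pivot = next((r for r in range(col, n) if A[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot in this column: the matrix is singular
        if pivot != col:
            A[col], A[pivot] = A[pivot], A[col]
            det = -det  # a row interchange changes the sign
        det *= A[col][col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return det

D = det_by_elimination([[2, 0, 2, 4], [0, 0, 3, 2], [2, 2, 4, 4], [3, 0, 6, 2]])
V1 = det_by_elimination([[1, 1, 1, 1], [1, 2, 4, 8], [1, 3, 9, 27], [1, 4, 16, 64]])
V2 = det_by_elimination([[1, -1, 1, -1], [1, 2, 4, 8], [1, -2, 4, -8], [1, 1, 1, 1]])
print(D, V1, V2)
```

The three matrices are those of Solutions 4.39, 4.35 and 4.36 respectively.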
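  • 189. Solution 4.40 If $n$ is even, then $\frac{n}{2}$ row interchanges are required to reduce the matrix to a diagonal matrix. Therefore, the determinant is $(-1)^{n/2} a_1 a_2 \cdots a_n$. If $n$ is odd, then $\frac{n-1}{2}$ row interchanges are required to reduce the matrix to a diagonal matrix (note that the centre row stays). Therefore, the determinant is $(-1)^{\frac{n-1}{2}} a_1 a_2 \cdots a_n$. In other words:
\[ \det\begin{bmatrix} & & & a_n \\ & & \iddots & \\ & a_2 & & \\ a_1 & & & \end{bmatrix} = (-1)^k \det\begin{bmatrix} a_1 & & & \\ & a_2 & & \\ & & \ddots & \\ & & & a_n \end{bmatrix} = (-1)^k a_1 a_2 \cdots a_n, \]
where
\[ k = \begin{cases} \frac{n}{2} & n \text{ even} \\ \frac{n-1}{2} & n \text{ odd} \end{cases} \]
Alternatively, we can expand the determinant on the first row:
\begin{align*}
\det\begin{bmatrix} & & & a_n \\ & & \iddots & \\ & a_2 & & \\ a_1 & & & \end{bmatrix} &= (-1)^{1+n} a_n \det\begin{bmatrix} & & & a_{n-1} \\ & & \iddots & \\ & a_2 & & \\ a_1 & & & \end{bmatrix} \\
&= (-1)^{1+n} a_n (-1)^{1+(n-1)} a_{n-1} \det\begin{bmatrix} & & & a_{n-2} \\ & & \iddots & \\ & a_2 & & \\ a_1 & & & \end{bmatrix} \\
&= \cdots = (-1)^{1+n} a_n (-1)^{1+(n-1)} a_{n-1} \cdots (-1)^{1+1} a_1 \\
&= (-1)^{(1+n)+n+\cdots+2}\, a_1 a_2 \cdots a_n \\
&= (-1)^{\frac{(n+1)(n+2)}{2} - 1}\, a_1 a_2 \cdots a_n
\end{align*}

Solution 4.41 A matrix is not invertible if and only if its determinant is equal to zero. Therefore, all we need to do is find the determinant of the matrix, and determine the values of $\lambda$ that make the determinant zero.
\begin{align*}
\det\begin{bmatrix} 2 - \lambda & 1 & 0 \\ -1 & -\lambda & 1 \\ 1 & 3 & 1 - \lambda \end{bmatrix} &= (2 - \lambda)\det\begin{bmatrix} -\lambda & 1 \\ 3 & 1 - \lambda \end{bmatrix} - \det\begin{bmatrix} -1 & 1 \\ 1 & 1 - \lambda \end{bmatrix} + 0 \\
&= (2 - \lambda)(-\lambda(1 - \lambda) - 3) - (-(1 - \lambda) - 1) \\
&= (2 - \lambda)(\lambda^2 - \lambda - 3) - (\lambda - 2) \\
&= (2 - \lambda)(\lambda^2 - \lambda - 2) \\
&= (2 - \lambda)(\lambda - 2)(\lambda + 1) = 0 \;\Rightarrow\; \lambda = 2, -1
\end{align*}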
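The factorisation found in Solution 4.41 can be cross-checked by evaluating the determinant directly at a few values of $\lambda$. The sketch below is in Python with hypothetical helper names `det3` and `char_det`; it uses the cofactor expansion along the first row.

```python
def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def char_det(lam):
    """Determinant of the matrix in Solution 4.41 for a given value of lambda."""
    return det3([[2 - lam, 1, 0],
                 [-1, -lam, 1],
                 [1, 3, 1 - lam]])

# The determinant should vanish exactly at lambda = 2 and lambda = -1.
for lam in (2, -1, 0, 5):
    print(lam, char_det(lam), (2 - lam) * (lam - 2) * (lam + 1))
```

Agreement at more than three integer values is enough to confirm that the two cubic polynomials coincide.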
  • 190. Solution 4.42
a) False. Counterexample:
\[ \det\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = 0 \neq 1 \]
The correct statement should be: The determinant of a triangular (or diagonal) matrix is the product of the entries of its main diagonal.

b) False. Counterexample: $A = [1 \;\; 2] \in \mathbb{R}^{1 \times 2}$. Then
\begin{align*}
\det(AA^T) &= \det\left([1 \;\; 2]\begin{bmatrix} 1 \\ 2 \end{bmatrix}\right) = \det([5]) = 5 \\
\det(A^T A) &= \det\left(\begin{bmatrix} 1 \\ 2 \end{bmatrix}[1 \;\; 2]\right) = \det\begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix} = 0
\end{align*}
Therefore, $\det(AA^T) \neq \det(A^T A)$. The correct statement should be: For every square matrix $A$, we have $\det(AA^T) = \det(A^T A)$.

c) True, since
\begin{align*}
\det(B^{-1}AB) &= \det(B^{-1})\det(A)\det(B) = \frac{\det(A)\det(B)}{\det(B)} = \det(A) \\
\det(BAB^{-1}) &= \det(B)\det(A)\det(B^{-1}) = \frac{\det(B)\det(A)}{\det(B)} = \det(A).
\end{align*}
d) False. Counterexample: Let
\[ A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]
Then
\[ \det(A) = 1, \qquad \det(2A) = \det\begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix} = 4 \;\Rightarrow\; \det(2A) \neq 2\det(A). \]
  • 191. The correct statement should be: If an $n \times n$ matrix $A$ is multiplied by a scalar $c$, the determinant of the resulting matrix is $\det(cA) = c^n \cdot \det(A)$.

e) True. Suppose the column vectors of $A \in \mathbb{R}^{n \times n}$ are linearly independent. Then $Ax = 0$ has the unique solution $x = 0$. The row vectors of $A^T$ will also be linearly independent, hence $A^T x = 0$ also has the unique solution $x = 0$. Furthermore, since the columns are linearly independent, $\det(A) \neq 0 \Rightarrow \det(A^T) = \det(A) \neq 0$.

Solution 4.43 $B = \begin{bmatrix} 1 & 1 \\ 1 & 0 \\ 1 & 1 \end{bmatrix}$ and $b = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$ so $B^T B = \begin{bmatrix} 3 & 2 \\ 2 & 2 \end{bmatrix}$ and $B^T b = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$. The equation $B^T B x = B^T b$ has solution $x = \begin{bmatrix} 1 \\ -1/2 \end{bmatrix}$. For this vector $x$ we have $Bx = \begin{bmatrix} 1/2 \\ 1 \\ 1/2 \end{bmatrix}$.

Solution 4.44 We obtain $B = \begin{bmatrix} x_1 & 1 \\ x_2 & 1 \\ x_3 & 1 \end{bmatrix}$ and $b = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}$ so that $B^T B = \begin{bmatrix} \sum x_i^2 & \sum x_i \\ \sum x_i & n \end{bmatrix}$ (where $n = 3$ in this example) and $B^T b = \begin{bmatrix} \sum x_i y_i \\ \sum y_i \end{bmatrix}$. Thus we end up with the same equations as before.

Solution 4.45 The inverse operation to multiplying the first row by two is multiplying the first row by 1/2. Therefore the inverse elementary matrix to $\begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$ is $\begin{bmatrix} 1/2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$.

The inverse operation to subtracting twice the first row from the second row is adding twice the first row to the second row. Therefore the inverse elementary matrix to $\begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$ is $\begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$.

The inverse operation to exchanging the last two rows is exchanging them again. Therefore the inverse elementary matrix to $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}$ is the same matrix $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}$.
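  • 192. In each case one can check directly that the inverse matrices when multiplied by the original matrices give the identity matrix.

Solution 4.46 We can reduce $A$ to the identity with the following row operations: $(R2) - (1/2)(R1)$, $(R3) + (1/2)(R1)$, $(R3) - (R2)$, $(R1) - 6(R2)$, $(R2) - (5/4)(R3)$, $(R1) + 7(R3)$, $(1/2)(R1)$, $2(R2)$, $(1/2)(R3)$. So
\begin{align*}
&\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1/2 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1/2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 7 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -5/4 \\ 0 & 0 & 1 \end{bmatrix} \\
&\quad \times \begin{bmatrix} 1 & -6 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1/2 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ -1/2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} A = I
\end{align*}
so
\begin{align*}
A = {} &\begin{bmatrix} 1 & 0 & 0 \\ 1/2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1/2 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 6 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 5/4 \\ 0 & 0 & 1 \end{bmatrix} \\
&\times \begin{bmatrix} 1 & 0 & -7 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1/2 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}
\end{align*}

Solution 4.47 The determinant is $-4$ by problem 4.38, and
\begin{align*}
M_{1,1} &= \begin{vmatrix} 2 & 3 \\ 0 & 1 \end{vmatrix} & M_{1,2} &= \begin{vmatrix} 1 & 3 \\ 3 & 1 \end{vmatrix} & M_{1,3} &= \begin{vmatrix} 1 & 2 \\ 3 & 0 \end{vmatrix} \\
M_{2,1} &= \begin{vmatrix} 0 & 1 \\ 0 & 1 \end{vmatrix} & M_{2,2} &= \begin{vmatrix} 1 & 1 \\ 3 & 1 \end{vmatrix} & M_{2,3} &= \begin{vmatrix} 1 & 0 \\ 3 & 0 \end{vmatrix} \\
M_{3,1} &= \begin{vmatrix} 0 & 1 \\ 2 & 3 \end{vmatrix} & M_{3,2} &= \begin{vmatrix} 1 & 1 \\ 1 & 3 \end{vmatrix} & M_{3,3} &= \begin{vmatrix} 1 & 0 \\ 1 & 2 \end{vmatrix}
\end{align*}
Thus the inverse is
\[ \frac{1}{-4}\begin{bmatrix} 2 & -0 & -2 \\ -(-8) & -2 & -2 \\ -6 & -0 & 2 \end{bmatrix} = \begin{bmatrix} -\frac{1}{2} & 0 & \frac{1}{2} \\ -2 & \frac{1}{2} & \frac{1}{2} \\ \frac{3}{2} & 0 & -\frac{1}{2} \end{bmatrix} \]

Solution 4.48 According to Cramer's rule
\begin{align*}
x_1 &= \frac{1}{-4}\det\begin{bmatrix} 1 & 0 & 1 \\ 0 & 2 & 3 \\ 1 & 0 & 1 \end{bmatrix} = 0 \\
x_2 &= \frac{1}{-4}\det\begin{bmatrix} 1 & 1 & 1 \\ 1 & 0 & 3 \\ 3 & 1 & 1 \end{bmatrix} = -\frac{3}{2} \\
x_3 &= \frac{1}{-4}\det\begin{bmatrix} 1 & 0 & 1 \\ 1 & 2 & 0 \\ 3 & 0 & 1 \end{bmatrix} = 1
\end{align*}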
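Cramer's rule as used in Solution 4.48 can be written out as a short Python sketch. The coefficient matrix is the one from problem 4.38; the right-hand side $(1, 0, 1)$ is read off from the columns substituted into the determinants above, and the function names `det3` and `cramer3` are illustrative assumptions.

```python
from fractions import Fraction

def det3(M):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def cramer3(A, b):
    """Solve the 3x3 system A x = b by Cramer's rule (requires det A != 0)."""
    d = det3(A)
    if d == 0:
        raise ValueError("Cramer's rule needs an invertible matrix")
    solution = []
    for k in range(3):
        # Replace column k of A by the right-hand side b.
        A_k = [[b[i] if j == k else A[i][j] for j in range(3)] for i in range(3)]
        solution.append(Fraction(det3(A_k), d))
    return solution

A = [[1, 0, 1], [1, 2, 3], [3, 0, 1]]
b = [1, 0, 1]
x = cramer3(A, b)
print(x)
```

Cramer's rule is convenient for small systems like this one; for larger systems Gaussian elimination takes far less work.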
  • 193. Chapter 5 Complex numbers

5.1 Complex arithmetic

Complex numbers can be thought of as points on the xy plane. The point $(x, y)$, thought of as a complex number, is written $x + iy$ (or $x + jy$ if you are an electrical engineer). The $i$ stands for an “imaginary” quantity such that $i^2 = -1$. If $z = x + iy$ then $x$ is called the real part of $z$ and $y$ is called the imaginary part of $z$.

Complex numbers are added just as if they were vectors in two dimensions. If $z = x + iy$ and $w = s + it$, then
\[ z + w = (x + iy) + (s + it) = (x + s) + i(y + t) \]
To multiply two complex numbers, just remember that $i^2 = -1$. So if $z = x + iy$ and $w = s + it$, then
\[ zw = (x + iy)(s + it) = xs + i^2 yt + iys + ixt = (xs - yt) + i(xt + ys) \]
The modulus of a complex number, denoted $|z|$, is simply the length of the corresponding vector in two dimensions. If $z = x + iy$,
\[ |z| = |x + iy| = \sqrt{x^2 + y^2} \]
An important property is
\[ |zw| = |z||w| \]
just like for real numbers.

The complex conjugate of a complex number $z$, denoted $\bar{z}$, is the reflection of $z$ across the x axis (also called the real axis). Thus $\overline{x + iy} = x - iy$. Thus the complex conjugate is obtained by changing all the $i$'s to $-i$'s. We have
\[ \overline{zw} = \bar{z}\bar{w} \]
and
\[ z\bar{z} = |z|^2 \]
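These rules can be tried out with Python's built-in complex type, which writes the imaginary unit as `j` (the electrical engineering convention mentioned above). This is a small illustrative sketch; the sample values of `z` and `w` are arbitrary.

```python
import math

z = 1 + 3j
w = 2 - 1j

# Multiplication uses i^2 = -1: (1 + 3i)(2 - i) = 5 + 5i.
product = z * w

# The modulus is multiplicative: |zw| = |z||w|.
modulus_ok = math.isclose(abs(product), abs(z) * abs(w))

# The conjugate of a product is the product of the conjugates.
conjugate_ok = product.conjugate() == z.conjugate() * w.conjugate()

# z times its conjugate is the real number |z|^2 (here 1 + 9 = 10).
z_times_conj = z * z.conjugate()

print(product, modulus_ok, conjugate_ok, z_times_conj)
```

The identity $z\bar{z} = |z|^2$ checked on the last line is the one used to turn a complex denominator into a real number.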
  • 194. The identity $z\bar{z} = |z|^2$ is useful for simplifying fractions of complex numbers by turning the denominator into a real number, since
\[ \frac{z}{w} = \frac{z\bar{w}}{|w|^2} \]
For example, to simplify $(1 + i)/(1 - i)$ we can write
\[ \frac{1 + i}{1 - i} = \frac{(1 + i)^2}{(1 - i)(1 + i)} = \frac{1 - 1 + 2i}{2} = i \]
A complex number $z$ is real (i.e. the $y$ part in $x + iy$ is zero) whenever $\bar{z} = z$. We also have the following formulas for the real and imaginary part. If $z = x + iy$ then $x = (z + \bar{z})/2$ and $y = (z - \bar{z})/(2i)$.

Complex numbers are indispensable in many practical calculations. We will discuss complex exponentials when we talk about differential equations. The reason why we are interested in them now is the following fact:

Theorem 5.1 If we use complex numbers, every polynomial can be completely factored. In other words, given a polynomial $\lambda^n + a_{n-1}\lambda^{n-1} + \cdots + a_1\lambda + a_0$, there exist (possibly complex) numbers $r_1, r_2, \ldots, r_n$ such that
\[ \lambda^n + a_{n-1}\lambda^{n-1} + \cdots + a_1\lambda + a_0 = (\lambda - r_1)(\lambda - r_2) \cdots (\lambda - r_n) \]
The numbers $r_i$ are the values of $\lambda$ for which the polynomial is zero.

So for example the polynomial $\lambda^2 + 1$ has no real roots, since there is no real number $\lambda$ for which it is zero. However there are two complex roots, $\pm i$, and
\[ \lambda^2 + 1 = (\lambda + i)(\lambda - i) \]
Of course, actually finding the roots of a high degree polynomial is difficult. Here are some points to keep in mind.

You can always find the roots of a quadratic polynomial using the quadratic formula. In other words, the roots of $a\lambda^2 + b\lambda + c$ are
\[ \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} \]
If the quantity inside the square root is negative, then the roots are complex. So, for example the roots of $\lambda^2 + \lambda + 1$ are
\[ \frac{-1 \pm \sqrt{1^2 - 4}}{2} = \frac{-1 \pm \sqrt{-3}}{2} = \frac{-1 \pm \sqrt{-1}\sqrt{3}}{2} = -\frac{1}{2} \pm i\frac{\sqrt{3}}{2}. \]

5.2 Complex exponential

We begin by considering the differential equation
\[ y'(t) = y(t) \]
  • 195. In other words we are looking for a function whose derivative is equal to the function. The exponential is such a function, so $y(t) = e^t$ is a solution to this differential equation. So is $y(t) = Ce^t$, where $C$ is a constant.

Now consider the equation
\[ y'(t) = ay(t) \]
where $a$ is a real number. Then, using the chain rule, we see that $y(t) = Ce^{at}$ is a solution for any choice of constant $C$. Notice that the constant $C$ is $y(0)$, the value of the solution at time zero. If we insist that the solution at time $t = 0$ take on a particular value
\[ y(0) = y_0 \]
then this forces the constant $C$ to be $y_0$.

Now consider the equation
\[ y'(t) = iy(t) \]
A solution to this equation is given by
\[ y(t) = \cos(t) + i\sin(t) \]
To check this, just differentiate.
\[ y'(t) = -\sin(t) + i\cos(t) = i(\cos(t) + i\sin(t)) = iy(t) \]
So it is natural to define the exponential, $e^{it}$, of a purely imaginary number $it$ to be
\[ e^{it} = \cos(t) + i\sin(t) \]
The complex exponential satisfies the familiar rule $e^{i(s+t)} = e^{is}e^{it}$ since by the addition formulas for sine and cosine
\begin{align*}
e^{i(s+t)} &= \cos(s + t) + i\sin(s + t) \\
&= \cos(s)\cos(t) - \sin(s)\sin(t) + i(\sin(s)\cos(t) + \cos(s)\sin(t)) \\
&= (\cos(s) + i\sin(s))(\cos(t) + i\sin(t)) \\
&= e^{is}e^{it}
\end{align*}
Now it is easy to check that solutions to
\[ y'(t) = iby(t) \]
are given by $y(t) = Ce^{ibt}$, where $C$ is an arbitrary constant. Since we are dealing with complex numbers, we allow $C$ to be complex too.

The exponential of a number that has both a real and imaginary part is defined in the natural way.
\[ e^{a+ib} = e^a e^{ib} = e^a(\cos(b) + i\sin(b)) \]
  • 196. and it is easy to check that the solution to the differential equation
\[ y'(t) = \lambda y(t) = (a + ib)y(t) \]
is given by $y(t) = Ce^{\lambda t} = Ce^{(a+ib)t}$. As before, if we insist that the solution at time $t = 0$ take on a particular value
\[ y(0) = y_0, \]
then this forces the constant $C$ to be $y_0$.

5.3 Polar representation of a complex number

Notice that the number $e^{i\theta} = \cos(\theta) + i\sin(\theta)$ lies on the unit circle on the complex plane, at the point making an angle of $\theta$ (radians) with the x axis. If we multiply $e^{i\theta}$ by a real number $r$, then we obtain the complex number $re^{i\theta}$ whose polar co-ordinates are $r$ and $\theta$. This is shown in Figure 5.1.

[Figure 5.1: Polar representation of a complex number.]

Notice that $r$ is exactly the length (sometimes called the modulus) of the complex number $re^{i\theta}$. The angle $\theta$ is called the argument. This representation of complex numbers makes the definition of complex multiplication more transparent. We have
\[ r_1 e^{i\theta_1}\, r_2 e^{i\theta_2} = r_1 r_2 e^{i(\theta_1 + \theta_2)} \]
In other words, when we multiply two complex numbers, the moduli get multiplied and the arguments get added.

5.4 MATLAB

Complex numbers are handled naturally by MATLAB. Specifying a complex number is done with the syntax z = 1+3i. The commands real(z) and imag(z) return the real and imaginary parts of a complex number z. The commands abs(z) and conj(z) return the length and conjugate of z. Typing sqrt(-1) returns
  • 197.
    ans = 0 + 1.0000i

5.5 Problems

Problem 5.1 Show that $|zw| = |z||w|$ for complex numbers $z$ and $w$.

Problem 5.2 Show that $\overline{zw} = \bar{z}\bar{w}$ for complex numbers $z$ and $w$.

Problem 5.3 Show that $z\bar{z} = |z|^2$ for every complex number $z$.

Problem 5.4 Simplify the following expressions to the form $x + iy$.
1. $i(2 - 3i)(-2 + i)$
2. $\dfrac{5}{(1 - i)(2 - i)(3 - i)}$
3. $(-1 + i)^{50}$

Problem 5.5 Prove that
1. $\left|\dfrac{z_1}{z_2}\right| = \dfrac{|z_1|}{|z_2|}$;
2. $\overline{(z^n)} = (\bar{z})^n$ for all $n \in \mathbb{N}$;
3. $\dfrac{1}{z} = \bar{z}$ if $|z| = 1$;
4. $z$ is either real or pure imaginary if $(\bar{z})^2 = z^2$.

Problem 5.6 There are three values of $z$ which give $z^3 = -i$. Find all three values in the form $x + iy$.

5.6 Solutions to Chapter Problems

Solution 5.1 If $z = x + iy$ and $w = s + it$ then $zw = xs - yt + i(xt + ys)$ so $|zw|^2 = (xs - yt)^2 + (xt + ys)^2 = x^2s^2 + y^2t^2 - 2xyst + x^2t^2 + y^2s^2 + 2xyst = (x^2 + y^2)(s^2 + t^2) = |z|^2|w|^2$.

Solution 5.2 If $z = x + iy$ and $w = s + it$ then $zw = xs - yt + i(xt + ys)$ so $\overline{zw} = xs - yt - i(xt + ys)$. On the other hand $\bar{z}\bar{w} = (x - iy)(s - it) = xs - yt - i(xt + ys)$.

Solution 5.3 If $z = x + iy$ then $z\bar{z} = (x + iy)(x - iy) = x^2 + y^2 = |z|^2$.

Solution 5.4
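  • 198. a)
\[ i(2 - 3i)(-2 + i) = (3 + 2i)(-2 + i) = -6 - 4i + 3i + 2i^2 = -8 - i \]
b)
\begin{align*}
\frac{5}{(1 - i)(2 - i)(3 - i)} &= \frac{5(1 + i)(2 + i)(3 + i)}{(1 - i)(2 - i)(3 - i)(1 + i)(2 + i)(3 + i)} \\
&= \frac{5(1 + 3i)(3 + i)}{2 \cdot 5 \cdot 10} \\
&= \frac{5(10i)}{100} = i/2
\end{align*}
c)
\begin{align*}
(-1 + i)^{50} &= \left(\sqrt{2}\,e^{i\frac{3\pi}{4}}\right)^{50} \\
&= 2^{25} e^{i\frac{75\pi}{2}} \\
&= 2^{25} e^{i\left(\frac{3\pi}{2} + 36\pi\right)} \\
&= 2^{25}(-i) = -2^{25} i
\end{align*}

Solution 5.5
a) Let $z_1 = |z_1|e^{i\theta_1}$, $z_2 = |z_2|e^{i\theta_2}$. Then
\[ \left|\frac{z_1}{z_2}\right| = \left|\frac{|z_1|e^{i\theta_1}}{|z_2|e^{i\theta_2}}\right| = \left|\frac{|z_1|}{|z_2|}e^{i(\theta_1 - \theta_2)}\right| = \frac{|z_1|}{|z_2|} \]
Alternatively,
\[ \left|\frac{z_1}{z_2}\right|^2 = \frac{z_1}{z_2}\,\overline{\left(\frac{z_1}{z_2}\right)} = \frac{z_1}{z_2}\,\frac{\bar{z}_1}{\bar{z}_2} = \frac{z_1\bar{z}_1}{z_2\bar{z}_2} = \frac{|z_1|^2}{|z_2|^2} \]
Since $\left|\frac{z_1}{z_2}\right| \geq 0$, $|z_1| \geq 0$ and $|z_2| > 0$, we have that
\[ \left|\frac{z_1}{z_2}\right| = \frac{|z_1|}{|z_2|} \]
b) Let $z = re^{i\theta}$. Then
\[ \overline{z^n} = \overline{r^n e^{in\theta}} = r^n e^{-in\theta} = \left(re^{-i\theta}\right)^n = (\bar{z})^n \]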
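  • 199. c) If $|z| = 1$, then $z = e^{i\theta}$. So
\[ \frac{1}{z} = \frac{1}{e^{i\theta}} = e^{-i\theta} = \bar{z} \]
d) Let $z = x + iy$. Then
\begin{align*}
(\bar{z})^2 &= (x - iy)^2 = x^2 - y^2 - 2ixy \\
z^2 &= (x + iy)^2 = x^2 - y^2 + 2ixy.
\end{align*}
If $(\bar{z})^2 = z^2$, then $x^2 - y^2 - 2ixy = x^2 - y^2 + 2ixy$, therefore the condition is that $-2ixy = 2ixy$, that is, $xy = 0$. Hence, $(\bar{z})^2 = z^2$ only when $z$ is either real or pure imaginary.

Solution 5.6 If $z^3 = -i$, then
\[ z = (-i)^{1/3} = \left(e^{i\left(\frac{3\pi}{2} + 2k\pi\right)}\right)^{1/3} \]
\begin{align*}
k = 0 &\;\Rightarrow\; \left(e^{i\frac{3\pi}{2}}\right)^{1/3} = e^{i\frac{\pi}{2}} = i \\
k = 1 &\;\Rightarrow\; \left(e^{i\frac{7\pi}{2}}\right)^{1/3} = e^{i\frac{7\pi}{6}} = \cos\frac{7\pi}{6} + i\sin\frac{7\pi}{6} = -\frac{\sqrt{3}}{2} - \frac{1}{2}i \\
k = 2 &\;\Rightarrow\; \left(e^{i\frac{11\pi}{2}}\right)^{1/3} = e^{i\frac{11\pi}{6}} = \cos\frac{11\pi}{6} + i\sin\frac{11\pi}{6} = \frac{\sqrt{3}}{2} - \frac{1}{2}i
\end{align*}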
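The three cube roots in Solution 5.6 can be reproduced numerically from the polar form. The following Python sketch uses the standard `cmath` module; the function name `cube_roots_of_minus_i` is an illustrative choice.

```python
import cmath
import math

def cube_roots_of_minus_i():
    """The three solutions of z^3 = -i, from -i = exp(i(3*pi/2 + 2*k*pi)), k = 0, 1, 2."""
    return [cmath.exp(1j * (3 * math.pi / 2 + 2 * math.pi * k) / 3) for k in range(3)]

roots = cube_roots_of_minus_i()
for k, r in enumerate(roots):
    # Each root cubed should give -i up to rounding error.
    print(k, r, r ** 3)
```

Taking $k = 3$ would give the angle $\pi/2 + 2\pi$, which is the $k = 0$ root again, so there are exactly three distinct values.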
  • 200. Chapter 6 Eigen-analysis

6.1 Eigenvalues and eigenvectors

Definition 6.1 Let $A$ be an $n \times n$ matrix. A number $\lambda$ and a vector $x$ are called an eigenvalue eigenvector pair if
(1) $x \neq 0$
(2) $Ax = \lambda x$
In other words, the action of $A$ on the vector $x$ is to stretch or shrink it by an amount $\lambda$ without changing its direction. We say $\lambda$ is an eigenvalue of $A$ if there exists a vector $x$ so that $\lambda$ and $x$ are an eigenvalue eigenvector pair. Notice that we do not allow $x = 0$. If we did, any number $\lambda$ would be an eigenvalue. However we do allow $\lambda = 0$. Saying that 0 is an eigenvalue of $A$ means that there is a non-zero solution $x$ (the eigenvector) of $Ax = 0x = 0$. So we see that 0 is an eigenvalue of $A$ precisely when $A$ is not invertible.

Let's look at some examples. Consider first the matrix of reflection about a line making an angle of $\theta$ with the x axis shown in Figure 6.1. Let $x$ be any vector that lies along the line. Then the reflection doesn't affect $x$. This means that $Rx = x$. In other words, $x$ is an eigenvector with eigenvalue 1. On the other hand, suppose that $y$ is a vector at right angles to the line. Then the reflection flips $y$ into minus itself. So $Ry = -y$. In other words, $y$ is an eigenvector with eigenvalue $-1$. If we take any other vector and reflect it, we don't end up with a vector that lies on the same line as the original vector. Thus there are no further eigenvectors or eigenvalues.

An important point to notice is that the eigenvector is not uniquely determined. The vector $x$ could be any vector along the line, and $y$ could be any vector orthogonal to the line. In fact, if we go back to the original definition of eigenvalue and eigenvector we can see that if $\lambda$ and $x$ are an eigenvalue eigenvector pair, then so are $\lambda$ and $sx$ for any non-zero number $s$, since $sx \neq 0$ and $Asx = sAx = s\lambda x = \lambda sx$. So the important thing about an eigenvector is its
  • 201. direction, not its length. However, there is no such ambiguity in the definition of the eigenvalue. The reflection matrix has exactly two eigenvalues: 1 and $-1$.

[Figure 6.1: Eigenvalues and eigenvectors of a 2D reflection.]

In some sense, the reflection matrix $R$ illustrates the most satisfactory situation. $R$ is a $2 \times 2$ matrix with two distinct eigenvalues. The corresponding eigenvectors $x$ and $y$ are linearly independent (in fact they are orthogonal) and form a basis for two dimensional space. It will be important in applications to determine whether or not there exists a basis of eigenvectors of a given matrix. In this example, $x$ and $y$ are a basis of eigenvectors of $R$.

As our next example, consider the identity matrix $I$. Since the identity matrix doesn't change a vector, we have $Ix = x$ for any vector $x$. Thus any non-zero vector $x$ is an eigenvector of $I$ with eigenvalue 1. This example shows that a given eigenvalue may have many eigenvectors associated with it. However, in this example, there still exists a basis of eigenvectors: any basis at all is a basis of eigenvectors of $I$.

Next we will consider the rotation matrix . . . and run into trouble. Suppose $R$ is the matrix of rotation by $\pi/4$ (i.e. $45^\circ$). Then $Rx$ is never in the same direction as $x$, since $R$ changes the direction of $x$ by $\pi/4$. So $R$ has no eigenvalues and no eigenvectors. This unfortunate state of affairs will cause us to make a considerable detour into the theory of complex numbers. It turns out that if we work with complex numbers rather than real numbers, then the rotation matrix has eigenvalues too.

6.1.1 Computing the eigenvalues and eigenvectors

We now consider the problem of finding all eigenvalue eigenvector pairs for a given $n \times n$ matrix $A$. To start, suppose someone tells you that a particular value $\lambda$ is an eigenvalue of $A$. How can you find the corresponding eigenvector $x$?
This amounts to solving the equation $Ax = \lambda x$ for $x$. This can be rewritten $(A - \lambda I)x = 0$, where $I$ denotes the identity matrix. In other words $x$ is a non-zero solution to a homogeneous equation. It can be found by Gaussian elimination.
For example, suppose you know that $4$ is an eigenvalue of
$$\begin{bmatrix} 3 & -6 & -7 \\ 1 & 8 & 5 \\ -1 & -2 & 1 \end{bmatrix}.$$
To find the corresponding eigenvector, we must solve
$$\left( \begin{bmatrix} 3 & -6 & -7 \\ 1 & 8 & 5 \\ -1 & -2 & 1 \end{bmatrix} - 4 \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \right) \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.$$
This can be written
$$\begin{bmatrix} -1 & -6 & -7 \\ 1 & 4 & 5 \\ -1 & -2 & -3 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}.$$
To solve this we reduce the matrix. This yields
$$\begin{bmatrix} -1 & -6 & -7 \\ 0 & -2 & -2 \\ 0 & 0 & 0 \end{bmatrix}$$
The fact that the rank of this matrix is less than $3$ confirms that $4$ is indeed an eigenvalue. If the rank of the matrix were $3$ then the only solution to the equation would be $0$, which is not a valid eigenvector. Taking $x_3 = s$ as a parameter, we find that $x_2 = -s$ and $x_1 = -s$. Thus
$$x = s \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix}$$
is an eigenvector (for any non-zero choice of $s$). In particular, we could take $s = 1$. Then
$$x = \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix}.$$
When doing calculations by hand, it makes sense to choose the scalar multiple $s$ so that it simplifies the form of the eigenvector (clears common denominators in the components, for example). Eigenvectors computed in MATLAB are normalized (scaled so that they have length 1).

Now that we have a method for finding the eigenvectors once we know the eigenvalues, the natural question is: Is there a way to determine the eigenvalues without knowing the eigenvectors? This is where determinants come in. The number $\lambda$ is an eigenvalue if there is some non-zero solution $x$ to the equation $(A - \lambda I)x = 0$. In other words, $\lambda$ is an eigenvalue if the matrix $(A - \lambda I)$ is not invertible. This happens precisely when $\det(A - \lambda I) = 0$.
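The hand calculation above can be cross-checked numerically. The following MATLAB lines are a minimal sketch, not part of the original notes; the variable names (`M`, `r`, `residual`, `d`, `n`) are illustrative, while `rank`, `det`, `null` and `norm` are standard MATLAB functions.

```matlab
% Matrix from the worked example; 4 is the claimed eigenvalue.
A = [3 -6 -7; 1 8 5; -1 -2 1];
lambda = 4;
M = A - lambda*eye(3);

r = rank(M);                      % rank below 3 means non-zero solutions exist
x = [-1; -1; 1];                  % eigenvector found by hand (s = 1)
residual = norm(A*x - lambda*x);  % zero when x is an eigenvector for lambda
d = det(M);                       % zero (up to rounding) when lambda is an eigenvalue
n = null(M);                      % unit-length basis vector(s) for the solutions of M*x = 0
```

Here `null(M)` should return a single column that is a scalar multiple of the hand-computed `x`, scaled to length 1, which matches the remark that MATLAB normalizes eigenvectors.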
The condition $\det(A - \lambda I) = 0$ gives us a method for finding the eigenvalues. Compute $\det(A - \lambda I)$. This will be a polynomial in $\lambda$. The eigenvalues will be exactly the values of $\lambda$ that make this polynomial zero, i.e., the roots of the polynomial.

So here is the algorithm for finding the eigenvalues and eigenvectors:

(1) Compute $\det(A - \lambda I)$ and find the values of $\lambda$ for which it is zero. These are the eigenvalues.
(2) For each eigenvalue, find the non-zero solutions to $(A - \lambda I)x = 0$. These are the eigenvectors.

I should mention that this is actually only a practical way to find eigenvalues when the matrix is small. Finding eigenvalues of large matrices is an important problem and many efficient methods have been developed for use on computers.

Example 6.1 Find the eigenvalues and eigenvectors of
$$A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}.$$
First we compute
$$\det(A - \lambda I) = (2 - \lambda)(2 - \lambda) - 1 = \lambda^2 - 4\lambda + 3$$
We can find the roots of this polynomial using the quadratic formula or by factoring it by inspection. We get $\lambda^2 - 4\lambda + 3 = (\lambda - 1)(\lambda - 3)$, so the eigenvalues are $1$ and $3$. Now we find the eigenvector for $\lambda = 1$. We must solve $(A - I)x = 0$. The matrix for this homogeneous system of equations is
$$\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}.$$
Reducing this matrix yields
$$\begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}$$
so an eigenvector is
$$\begin{bmatrix} 1 \\ -1 \end{bmatrix}$$
Next we find the eigenvector for $\lambda = 3$. We must solve $(A - 3I)x = 0$. The matrix for this homogeneous system of equations is
$$\begin{bmatrix} -1 & 1 \\ 1 & -1 \end{bmatrix}$$
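Reducing this matrix yields
$$\begin{bmatrix} -1 & 1 \\ 0 & 0 \end{bmatrix}$$
so an eigenvector is
$$\begin{bmatrix} 1 \\ 1 \end{bmatrix}.$$

Example 6.2 Let us find the eigenvalues and eigenvectors of
$$A = \begin{bmatrix} 3 & -6 & -7 \\ 1 & 8 & 5 \\ -1 & -2 & 1 \end{bmatrix}.$$
First we compute, expanding along the first row,
$$\begin{aligned}
\det(A - \lambda I) &= \det \begin{bmatrix} 3 - \lambda & -6 & -7 \\ 1 & 8 - \lambda & 5 \\ -1 & -2 & 1 - \lambda \end{bmatrix} \\
&= (3 - \lambda)\big((8 - \lambda)(1 - \lambda) + 10\big) + 6\big((1 - \lambda) + 5\big) - 7\big({-2} + (8 - \lambda)\big) \\
&= -\lambda^3 + 12\lambda^2 - 44\lambda + 48
\end{aligned}$$
It is not always easy to find the zeros of a polynomial of degree 3. However, if we already know one solution, we can find the other two. Sometimes, one can find one solution by guessing. In this case we already know that $4$ is a solution (since this is the same matrix that appeared in the example in the last section). We can check this:
$$-64 + 12 \times 16 - 44 \times 4 + 48 = 0$$
This means that $-\lambda^3 + 12\lambda^2 - 44\lambda + 48$ can be factored as $-\lambda^3 + 12\lambda^2 - 44\lambda + 48 = (\lambda - 4)q(\lambda)$, where $q(\lambda)$ is a second degree polynomial. To find $q(\lambda)$ we can use long division of polynomials.
$$\begin{array}{r|rrrr}
 & & -\lambda^2 & +8\lambda & -12 \\ \hline
\lambda - 4 & -\lambda^3 & +12\lambda^2 & -44\lambda & +48 \\
 & -\lambda^3 & +4\lambda^2 & & \\
 & & 8\lambda^2 & -44\lambda & \\
 & & 8\lambda^2 & -32\lambda & \\
 & & & -12\lambda & +48 \\
 & & & -12\lambda & +48 \\
 & & & & 0
\end{array}$$
This yields $q(\lambda) = -\lambda^2 + 8\lambda - 12$. This can be factored using the quadratic formula (or by inspection) as $q(\lambda) = -(\lambda - 2)(\lambda - 6)$. So we conclude
$$-\lambda^3 + 12\lambda^2 - 44\lambda + 48 = -(\lambda - 4)(\lambda - 2)(\lambda - 6)$$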
and the eigenvalues are $2$, $4$ and $6$.

Now we find the eigenvector for $\lambda = 2$. We must solve $(A - 2I)x = 0$. The matrix for this homogeneous system of equations is
$$\begin{bmatrix} 1 & -6 & -7 \\ 1 & 6 & 5 \\ -1 & -2 & -1 \end{bmatrix}$$
Reducing this matrix yields
$$\begin{bmatrix} 1 & -6 & -7 \\ 0 & -8 & -8 \\ 0 & 0 & 0 \end{bmatrix}$$
so an eigenvector is
$$\begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}$$
Next we find the eigenvector for $\lambda = 4$. We must solve $(A - 4I)x = 0$. The matrix for this homogeneous system of equations is
$$\begin{bmatrix} -1 & -6 & -7 \\ 1 & 4 & 5 \\ -1 & -2 & -3 \end{bmatrix}$$
Reducing this matrix yields
$$\begin{bmatrix} -1 & -6 & -7 \\ 0 & -2 & -2 \\ 0 & 0 & 0 \end{bmatrix}$$
so an eigenvector is
$$\begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix}$$
Finally we find the eigenvector for $\lambda = 6$. We must solve $(A - 6I)x = 0$. The matrix for this homogeneous system of equations is
$$\begin{bmatrix} -3 & -6 & -7 \\ 1 & 2 & 5 \\ -1 & -2 & -5 \end{bmatrix}$$
Reducing this matrix yields
$$\begin{bmatrix} -3 & -6 & -7 \\ 0 & 0 & 8 \\ 0 & 0 & 0 \end{bmatrix}$$
so an eigenvector is
$$\begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix}$$
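The results of Example 6.2 can be checked in MATLAB. This is a sketch added for illustration, not the notes' own code: `poly` and `roots` are standard MATLAB functions, and the names `p`, `lam`, `X`, `L`, `err` and `indep` are assumptions of this example. Note that `poly(A)` returns the coefficients of $\det(\lambda I - A)$, which for a $3 \times 3$ matrix is $-\det(A - \lambda I)$; the roots are the same.

```matlab
A = [3 -6 -7; 1 8 5; -1 -2 1];

p   = poly(A);            % coefficients of det(lambda*I - A), highest power first
lam = sort(roots(p));     % approximately [2; 4; 6]

% Hand-computed eigenvectors for lambda = 2, 4, 6, stored as columns.
X = [ 1 -1 -2;
     -1 -1  1;
      1  1  0];
L = diag([2 4 6]);

err   = norm(A*X - X*L);  % zero when each column is an eigenvector for its eigenvalue
indep = rank(X);          % 3 means the three eigenvectors form a basis
```

Writing the three eigenvector equations as the single matrix equation $AX = XL$ is only a compact way of checking all three pairs at once.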
Example 6.3 (Repeated Eigenvalues) Find the eigenvalues and eigenvectors of
$$A = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & -1 & 1 \end{bmatrix}.$$
Consider
$$\det(A - \lambda I) = \det \begin{bmatrix} 1 - \lambda & 1 & 0 \\ 0 & 2 - \lambda & 0 \\ 0 & -1 & 1 - \lambda \end{bmatrix}.$$
In this case it makes sense to expand along the last column. This yields
$$\det(A - \lambda I) = 0 - 0 + (1 - \lambda)(1 - \lambda)(2 - \lambda) = (1 - \lambda)^2 (2 - \lambda)$$
This is already factored, so the zeros are $\lambda = 1$ and $\lambda = 2$. Notice that the factor $(1 - \lambda)$ occurs to the second power. In this situation there are fewer distinct eigenvalues than we expect. Let's compute the eigenvectors. To find the eigenvectors for $\lambda = 1$ we must solve the homogeneous equation with matrix $A - I$, i.e.,
$$\begin{bmatrix} 0 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & -1 & 0 \end{bmatrix}$$
This reduces to
$$\begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$
and we find that there are two parameters in the solution. The set of solutions in parametric form is
$$s \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + t \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$$
We can find two linearly independent solutions by setting $s = 1$, $t = 0$ and $s = 0$, $t = 1$. This gives
$$\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$$
To find the eigenvectors for $\lambda = 2$ we must solve the homogeneous equation with matrix $A - 2I$, i.e.,
$$\begin{bmatrix} -1 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & -1 & -1 \end{bmatrix}$$
This reduces to
$$\begin{bmatrix} -1 & 1 & 0 \\ 0 & -1 & -1 \\ 0 & 0 & 0 \end{bmatrix}$$
and we find that the set of solutions in parametric form is
$$s \begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}$$
Setting $s = 1$ gives the eigenvector
$$\begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}$$
In this $3 \times 3$ example, even though there are only two distinct eigenvalues, $1$ and $2$, there are still three independent eigenvectors (i.e., a basis), because the eigenvalue $1$ has two independent eigenvectors associated to it.

Example 6.4 (Repeated Eigenvalues with "Missing" Eigenvectors) Find the eigenvalues and eigenvectors of
$$A = \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}.$$
Here
$$\det(A - \lambda I) = (\lambda - 2)^2$$
so there is only one eigenvalue, $\lambda = 2$. To find the eigenvectors, we must solve the homogeneous system with matrix
$$\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}.$$
The solutions are
$$s \begin{bmatrix} 1 \\ 0 \end{bmatrix}$$
so there is only one eigenvector direction. So here is a matrix that does not have a basis of eigenvectors. Matrices like this, that have too few eigenvectors, will not be studied further in this course, but they do occur in applications.

6.1.2 Complex eigenvalues and eigenvectors

Since eigenvalues are found as roots of polynomials, we now see that they can be complex even when the matrix has only real entries. A discussion of complex eigenvalues and eigenvectors is given below.

Example 6.5 Let's consider the matrix of rotation by $\pi/2$ (clockwise). This is the matrix
$$A = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}.$$
We compute
$$\det(A - \lambda I) = \det \begin{bmatrix} -\lambda & 1 \\ -1 & -\lambda \end{bmatrix} = \lambda^2 + 1$$
The roots are $\pm i$ so the eigenvalues are $i$ and $-i$. Now we compute the eigenvector corresponding to the eigenvalue $i$. We must solve the homogeneous equation with matrix
$$\begin{bmatrix} -i & 1 \\ -1 & -i \end{bmatrix}$$
Notice that we will have to do complex arithmetic to achieve this, since the matrix now has complex entries. To reduce this matrix we have to add $i$ times the first row to the second row. This gives
$$\begin{bmatrix} -i & 1 \\ -1 - i^2 & -i + i \end{bmatrix} = \begin{bmatrix} -i & 1 \\ 0 & 0 \end{bmatrix}$$
So if we let $x_2 = s$, then $-i x_1 + s = 0$, or $x_1 = -is$. So the solution is
$$s \begin{bmatrix} -i \\ 1 \end{bmatrix}$$
and we may choose $s = 1$. Let's check that this is really an eigenvector:
$$\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \begin{bmatrix} -i \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ i \end{bmatrix} = i \begin{bmatrix} -i \\ 1 \end{bmatrix}.$$
To find the other eigenvector we can use a trick. Suppose that the original matrix $A$ has only real entries. This will always be the case in our examples. Suppose that $A$ has a complex eigenvalue eigenvector pair $\lambda$ and $x$. Then $Ax = \lambda x$. Taking the complex conjugate of this equation, we obtain $\bar{A}\bar{x} = \bar{\lambda}\bar{x}$. (Here conjugating a matrix or a vector just means conjugating each entry.) Now, since $A$ has real entries, $\bar{A} = A$. Hence $A\bar{x} = \bar{\lambda}\bar{x}$. In other words $\bar{\lambda}$ is an eigenvalue with eigenvector $\bar{x}$. In the present example, we already know that $\bar{i} = -i$ is an eigenvalue. But now we don't have to compute the eigenvector that goes along with it. It is simply the conjugate of the one we already computed. So the eigenvector corresponding to $-i$ is
$$\begin{bmatrix} i \\ 1 \end{bmatrix}$$
The eigenvalues of $A$ are the zeros or roots of the polynomial $\det(A - \lambda I)$. If we use complex numbers then $\det(A - \lambda I)$ can be completely factored, i.e.,
$$\det(A - \lambda I) = \pm(\lambda - \lambda_1)(\lambda - \lambda_2) \cdots (\lambda - \lambda_n)$$
Finding the roots may be difficult. However for $2 \times 2$ matrices we may use the quadratic formula. If all the roots are distinct (i.e., $\lambda_i \neq \lambda_j$ for $i \neq j$) then the corresponding eigenvectors $x_1, x_2, \ldots, x_n$ are linearly independent (I didn't show you why this is true, so I'm just asking you to believe it!) and therefore form a basis.
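The conjugation trick of Example 6.5 is easy to confirm numerically. The lines below are an illustrative sketch (variable names `r1`, `r2`, `xc`, `lam` are assumptions of this example); `1i` is MATLAB's imaginary unit and `conj` takes the entrywise complex conjugate.

```matlab
A = [0 1; -1 0];             % matrix from Example 6.5 (real entries)

x  = [-1i; 1];               % eigenvector found by hand for lambda = i
r1 = norm(A*x - 1i*x);       % zero: (i, x) is an eigenvalue eigenvector pair

xc = conj(x);                % conjugate vector [i; 1]
r2 = norm(A*xc - (-1i)*xc);  % zero: (-i, conj(x)) is also a pair

lam = eig(A);                % the two eigenvalues, +i and -i in some order
```

One practical caution when working with complex vectors in MATLAB: the apostrophe operator `x'` is the conjugate transpose, so `x.'` is the one to use when a plain transpose is intended.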
If there are repeated roots, then there are fewer than $n$ distinct eigenvalues. In this situation, it might happen that there are not enough eigenvectors to
form a basis. However it also might happen that there is more than one independent eigenvector associated to a given eigenvalue, so that in the end there are enough eigenvectors to form a basis. Compare Examples 6.3 and 6.4 from earlier, where we saw that either situation can occur. Unfortunately, the only way we have to find out is to try to compute them all.

6.1.3 MATLAB

When called with two output arguments on a square matrix A, the command eig returns both the eigenvalues and the eigenvectors of A. To use this command enter the following in MATLAB:

    >> [P,D] = eig(A)

What will be returned is a matrix P with the normalized (unit length) eigenvectors of A in its columns, and a matrix D with the eigenvalues of A along its diagonal. The eigenvalue corresponding to the $i$th column of P is found in the $(i, i)$ position of D. Using the eig command above will return complex eigenvalues and eigenvectors when present.

Example 6.6 Consider the matrix
$$A = \begin{bmatrix} 1 & 4 & 5 \\ 6 & 3 & 9 \\ 2 & 7 & 8 \end{bmatrix} \tag{6.1}$$
We can enter A into MATLAB and find its eigenvectors and eigenvalues with the following commands:

    >> A=[1 4 5; 6 3 9; 2 7 8];
    >> [P,D] = eig(A)
    P =
       -0.3919   -0.5895    0.2238
       -0.6401   -0.5446   -0.8511
       -0.6609    0.5966    0.4750
    D =
       15.9657         0         0
             0   -0.3653         0
             0         0   -3.6004

These results tell us that A has eigenvectors $\{v_1, v_2, v_3\}$ and corresponding eigenvalues $\{\lambda_1, \lambda_2, \lambda_3\}$ as follows:
$$\{v_1, v_2, v_3\} \approx \left\{ \begin{bmatrix} -0.3919 \\ -0.6401 \\ -0.6609 \end{bmatrix}, \begin{bmatrix} -0.5895 \\ -0.5446 \\ 0.5966 \end{bmatrix}, \begin{bmatrix} 0.2238 \\ -0.8511 \\ 0.4750 \end{bmatrix} \right\}$$
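$$\{\lambda_1, \lambda_2, \lambda_3\} \approx \{15.9657, -0.3653, -3.6004\}$$

6.1.4 Problems

Problem 6.1 Show that $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$ are eigenvectors for the matrix $\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$. What are the corresponding eigenvalues?

Problem 6.2 Suppose $P$ is a projection matrix. What are the eigenvalues and eigenvectors of $P$?

Problem 6.3 Find the eigenvalues and eigenvectors for
$$\text{a) } \begin{bmatrix} 0 & 3 \\ 3 & 0 \end{bmatrix} \quad \text{b) } \begin{bmatrix} -2 & -8 \\ 4 & 10 \end{bmatrix} \quad \text{c) } \begin{bmatrix} 29 & -10 \\ 105 & -36 \end{bmatrix} \quad \text{d) } \begin{bmatrix} -9 & -14 \\ 7 & 12 \end{bmatrix}$$

Problem 6.4 Find the eigenvalues and the corresponding eigenvectors of the matrix
$$\begin{bmatrix} 2 & 3 \\ 2 & 1 \end{bmatrix}.$$

Problem 6.5 Find the eigenvalues and eigenvectors for
$$\text{a) } \begin{bmatrix} 0 & -1 & 1 \\ 1 & 0 & 2 \\ 2 & 0 & 2 \end{bmatrix} \quad \text{b) } \begin{bmatrix} 1 & 1 & 1 \\ 1 & 0 & -2 \\ 1 & -1 & 1 \end{bmatrix} \quad \text{c) } \begin{bmatrix} 7 & -9 & -15 \\ 0 & 4 & 0 \\ 3 & -9 & -11 \end{bmatrix} \quad \text{d) } \begin{bmatrix} 31 & -100 & 70 \\ 18 & -59 & 42 \\ 12 & -40 & 29 \end{bmatrix}$$

Problem 6.6 Let $P$ be a $2 \times 2$ transition probability matrix of the form
$$\begin{bmatrix} p_{11} & 1 - p_{22} \\ 1 - p_{11} & p_{22} \end{bmatrix}.$$
Prove that one of the eigenvalues of $P$ must be $1$, and the other one must be in the interval $[-1, 1]$. (Hint: Let $c = p_{11} + p_{22}$.)

Problem 6.7 Find the eigenvalues and the corresponding eigenvectors of the following matrix.
$$A = \begin{bmatrix} 0 & 1 & -1 \\ 5 & 0 & 1 \\ 0 & 1 & -1 \end{bmatrix}$$

Problem 6.8 Find the eigenvalues and the corresponding eigenvectors of the following matrix.
$$A = \begin{bmatrix} 2 & 0 & 1 \\ 0 & 2 & 1 \\ 1 & 0 & 2 \end{bmatrix}$$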
Problem 6.9 Is there a rank two matrix $M$ such that the vectors $\vec{v}_{\mu_1} = [1, 2, 3]^T$ and $\vec{v}_{\mu_2} = [3, 2, 1]^T$ are eigenvectors of $M$, both corresponding to the same eigenvalue $\mu_1 = \mu_2 = -1$? If your answer is yes then find such a matrix, and if your answer is no, then justify your answer.

Problem 6.10 Find the eigenvalues and the corresponding eigenvectors of the matrix
$$\begin{bmatrix} 2 & 3 \\ -2 & -1 \end{bmatrix}.$$

Problem 6.11 Given a $2 \times 2$ matrix
$$A = \begin{bmatrix} a & i \\ i & b \end{bmatrix}$$
a) Find values for $a$ and $b$ such that $A^2 = A$.
b) Find values for $a$ and $b$ such that $A^3 = A$ and $A^4 = A$.
c) Find values for $a$ and $b$ such that $A^8 = A$ and $A^9 = A$.

Problem 6.12 As in the previous question, we are given a $2 \times 2$ matrix
$$A = \begin{bmatrix} a & i \\ i & b \end{bmatrix}$$
a) Find values for $a$ and $b$ such that the two eigenvalues of $A$ are $\mu_1 = 2 + i$ and $\mu_2 = 2 - i$.
b) Find the eigenvectors of $A$ in part a) (where the two eigenvalues are $2 \pm i$).

6.2 Eigenanalysis simplifies matrix powers

In the previous section we learnt how to find eigenvalues and eigenvectors of a matrix, including the case when they are complex. There are two main uses of this eigenanalysis: efficiently computing powers of a matrix (studied in this section) and solving systems of differential equations (considered in Section 6.3 below). Recall that in the random walk application in Section 4.3 we were interested in high powers of a matrix, specifically
$$\lim_{n \to \infty} P^n x^{(0)}$$
where $P$ is the matrix of transition probabilities and $x^{(0)}$ is the column vector of initial probabilities. We will explore the use of eigenanalysis to simplify our understanding of these kinds of problems in two examples below.
Example 6.7 We consider again the sorcerers' duel in Example 4.1. We will consider what happens if the duel is allowed to continue without limit until there is a winner. Rather than compute
$$\lim_{n \to \infty} P^n x^{(0)}$$
numerically in MATLAB as was done in Section 4.3 we will use an eigenanalysis of $P$ to understand this limit. The transition matrix for this problem is
$$P = \begin{bmatrix} 0 & 1/2 & 0 & 0 \\ 2/3 & 0 & 0 & 0 \\ 1/3 & 0 & 1 & 0 \\ 0 & 1/2 & 0 & 1 \end{bmatrix}$$
with initial state $x^{(0)} = (1, 0, 0, 0)^T$. The eigenanalysis of $P$ is summarized below:
$$\begin{aligned}
\lambda_1 &= 1, & k_1 &= (0, 0, 1, 0)^T \\
\lambda_2 &= 1, & k_2 &= (0, 0, 0, 1)^T \\
\lambda_3 &= \tfrac{1}{\sqrt{3}}, & k_3 &= \left(1 - \sqrt{3},\ \tfrac{2}{\sqrt{3}}(1 - \sqrt{3}),\ \tfrac{1}{\sqrt{3}},\ 1\right)^T \\
\lambda_4 &= -\tfrac{1}{\sqrt{3}}, & k_4 &= \left(1 + \sqrt{3},\ -\tfrac{2}{\sqrt{3}}(1 + \sqrt{3}),\ -\tfrac{1}{\sqrt{3}},\ 1\right)^T
\end{aligned}$$
Note that $\lambda = 1$ is a repeated eigenvalue but there is still a basis of eigenvectors (the set of eigenvectors associated with $\lambda = 1$ is two-dimensional). We can write
$$x^{(0)} = c_1 k_1 + c_2 k_2 + c_3 k_3 + c_4 k_4 \tag{6.2}$$
for some uniquely determined coefficients $c_1$, $c_2$, $c_3$ and $c_4$. Equation (6.2) can be written in matrix-vector form
$$T c = x^{(0)} \tag{6.3}$$
where $c = (c_1, c_2, c_3, c_4)^T$ and $T$ is the $4 \times 4$ matrix with eigenvectors $k_1$, $k_2$, $k_3$ and $k_4$ in its columns. Solving (6.3) (I used MATLAB) gives $c_1 = 1/2$, $c_2 = 1/2$, $c_3 \approx -0.6830$ and $c_4 \approx 0.1830$. With these values of $c$, (6.2) is a representation of $x^{(0)}$ as a linear combination of eigenvectors of $P$. This makes working out later states $x^{(n)}$ easy, as shown below.
$$\begin{aligned}
x^{(1)} = P x^{(0)} &= P(c_1 k_1 + c_2 k_2 + c_3 k_3 + c_4 k_4) \\
&= c_1 \lambda_1 k_1 + c_2 \lambda_2 k_2 + c_3 \lambda_3 k_3 + c_4 \lambda_4 k_4 \\
&= c_1 k_1 + c_2 k_2 + \tfrac{1}{\sqrt{3}} c_3 k_3 - \tfrac{1}{\sqrt{3}} c_4 k_4
\end{aligned}$$
where in the middle line we have remembered that the $k_i$ vectors are eigenvectors, so $P k_i = \lambda_i k_i$ for $i = 1, 2, 3, 4$. Similarly
$$x^{(2)} = P x^{(1)} = P^2 x^{(0)} = c_1 k_1 + c_2 k_2 + \tfrac{1}{3} c_3 k_3 + \tfrac{1}{3} c_4 k_4$$
and
$$x^{(n)} = P^n x^{(0)} = c_1 k_1 + c_2 k_2 + \left(\tfrac{1}{\sqrt{3}}\right)^n c_3 k_3 + \left(-\tfrac{1}{\sqrt{3}}\right)^n c_4 k_4 \tag{6.4}$$
Formula (6.4) gives the state at any time $n$ without much computational work. In addition, since $|\pm 1/\sqrt{3}| < 1$, the last two terms tend to zero and it is easy to see from this formula that
$$\lim_{n \to \infty} x^{(n)} = c_1 k_1 + c_2 k_2 = (0, 0, 1/2, 1/2)^T$$
as found numerically in Example 4.1.

Example 6.8 The weather in Vancouver is either good, average or bad on any given day. If the weather is good on any day, there is a 60% chance the weather will be good, 30% chance average, and 10% bad on the next day. If the weather is average, then on the next day there is a 40% chance of good, 30% of average and 30% of bad weather. If the weather is bad then on the next day there is a 40% chance of good, 50% of average and 10% of bad. If the weather is good today, what will the weather be like a long time from now? We number the states

1) good
2) average
3) bad

The corresponding transition matrix is
$$P = \frac{1}{10} \begin{bmatrix} 6 & 4 & 4 \\ 3 & 3 & 5 \\ 1 & 3 & 1 \end{bmatrix}$$
The initial state is $x^{(0)} = [1, 0, 0]^T$. The eigenanalysis of $P$ is summarized below:
$$\begin{aligned}
\lambda_1 &= 1, & k_1 &= [1/2, 1/3, 1/6]^T \\
\lambda_2 &= 0.2, & k_2 &= [-2, 1, 1]^T \\
\lambda_3 &= -0.2, & k_3 &= [0, -1, 1]^T
\end{aligned}$$
As above, we put the eigenvectors into the columns of a matrix $T$ and solve $Tc = (1, 0, 0)^T$ to get $c = (1, -1/4, 1/12)^T$. The right hand side $(1, 0, 0)^T$ of the system above corresponds to the initial state $x^{(0)}$ of good weather. As before, this gives us the representation of the initial state as a linear combination of the eigenvectors:
$$x^{(0)} = k_1 - \frac{1}{4} k_2 + \frac{1}{12} k_3.$$
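The coefficients in this representation can be reproduced with MATLAB's backslash operator, which solves the linear system $Tc = x^{(0)}$. The following is a minimal sketch under the assumption that the eigenvectors are entered by hand as listed above; the names `T`, `x0`, `c` and `check` are illustrative.

```matlab
P  = [6 4 4; 3 3 5; 1 3 1] / 10;   % weather transition matrix of Example 6.8

k1 = [1/2; 1/3; 1/6];              % eigenvector for lambda = 1
k2 = [-2; 1; 1];                   % eigenvector for lambda = 0.2
k3 = [0; -1; 1];                   % eigenvector for lambda = -0.2
T  = [k1 k2 k3];                   % eigenvectors in the columns of T

x0 = [1; 0; 0];                    % good weather today
c  = T \ x0;                       % coefficients, expected (1, -1/4, 1/12)

% Sanity check that the columns of T really are eigenvectors of P.
check = norm(P*T - T*diag([1 0.2 -0.2]));
```

Using `T \ x0` rather than `inv(T)*x0` is the usual MATLAB idiom for solving a linear system.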
Again, we see that multiplying by $P$ in this representation leads to an easy formula since the $k_i$ are eigenvectors:
$$x^{(n)} = P^n x^{(0)} = k_1 - \frac{1}{4}(0.2)^n k_2 + \frac{1}{12}(-0.2)^n k_3$$
Note that after a long time ($n \to \infty$) the second and third terms above tend to zero, so
$$\lim_{n \to \infty} x^{(n)} = k_1 = (1/2, 1/3, 1/6)^T$$
so a long time after the first nice day, the weather will have a 1/2 chance of being good, 1/3 average and 1/6 bad.

Let us consider the example above a bit more closely. Intuitively, the weather after a long time should not depend on what it was like the day you started. In fact, we can show that
$$\lim_{n \to \infty} x^{(n)} = k_1 \tag{6.5}$$
for any starting probability $x^{(0)}$ (for which the entries must be non-negative and sum to one). To show this, we have to show that writing
$$x^{(0)} = c_1 k_1 + c_2 k_2 + c_3 k_3 \tag{6.6}$$
always gives $c_1 = 1$ no matter what $x^{(0)}$ is. Note that the entries of $k_2$ and $k_3$ sum to zero and the entries of $k_1$ and $x^{(0)}$ sum to one. So by summing the entries of (6.6) we see that $c_1 = 1$, which shows that (6.5) is true for any starting probability $x^{(0)}$ as our intuition predicted.

The probability $k_1$ that all initial states tend to is called an equilibrium probability. In some cases, described in the theorem below, you can guarantee the existence of an equilibrium probability.

Theorem 6.1 If $P$ is a transition matrix (non-negative entries with all columns summing to one) that in addition has all positive entries then $P$ has an eigenvalue $1$ with a single eigenvector $k_1$ that can be chosen to be a probability vector. All other eigenvalues $\lambda$ satisfy $|\lambda| < 1$ with eigenvectors with components that sum to zero. Thus,
$$\lim_{n \to \infty} x^{(n)} = k_1$$
for any $x^{(0)}$. That is, $k_1$ is an equilibrium probability.

Note that in Example 6.8 the transition matrix satisfied the conditions of the theorem and had an equilibrium probability.
The transition matrix of Example 6.7 did not satisfy the conditions of the theorem and does not have an equilibrium probability (depending on the initial state, the limit can give different probabilities for each sorcerer winning the duel).

We can summarize the process used to analyze these examples: write a vector $x$ (the initial probability in our examples) as a linear combination of eigenvectors of a matrix $A$ (the transition matrix in our examples), and then $A^n x$ is easily written as a linear combination of the same eigenvectors. This process can be summarized in matrix-vector notation as diagonalization, discussed in more detail in Section 6.5.1.
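The process just summarized can be sketched in a few lines of MATLAB for the weather matrix of Example 6.8. This is an illustration rather than the notes' own code: the choice `n = 10` and all variable names are assumptions, and the eigenvalues and eigenvectors are typed in from the example instead of being computed with `eig`.

```matlab
P   = [6 4 4; 3 3 5; 1 3 1] / 10;     % transition matrix of Example 6.8
T   = [1/2 -2  0;
       1/3  1 -1;
       1/6  1  1];                    % eigenvectors k1, k2, k3 as columns
lam = [1; 0.2; -0.2];                 % matching eigenvalues
x0  = [1; 0; 0];                      % initial probability vector

c = T \ x0;                           % x0 = c1*k1 + c2*k2 + c3*k3
n = 10;

xn_eig    = T * (lam.^n .* c);        % sum over i of c_i * lambda_i^n * k_i
xn_direct = P^n * x0;                 % repeated multiplication, for comparison
gap       = norm(xn_eig - xn_direct); % should be at rounding level

x_inf = c(1) * T(:,1);                % as n grows only the lambda = 1 term survives
```

The eigenvector form costs one linear solve plus scalar powers for any `n`, while the direct form needs the matrix power `P^n`; for this small matrix either is cheap, so the comparison is only meant to confirm the formula.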
6.2.1 Problems

Problem 6.13 Find the eigenvalues and corresponding eigenvectors of the stochastic matrix $P$ below. Use the eigenvectors and eigenvalues to describe
$$\lim_{n \to \infty} P^n x^{(0)},$$
where $x^{(0)} = [1, 0, 0]^T$, and
$$P = \begin{bmatrix} 0 & \tfrac{1}{4} & \tfrac{1}{2} \\ \tfrac{1}{2} & \tfrac{1}{2} & \tfrac{1}{2} \\ \tfrac{1}{2} & \tfrac{1}{4} & 0 \end{bmatrix}$$

Problem 6.14 What is the necessary condition on $a$ and $b$ for $P$ to have an equilibrium probability? Find the equilibrium probability vector (if it exists) for $a = b = 1/4$.
$$P = \begin{bmatrix} \tfrac{1}{4} & \tfrac{1}{3} & \tfrac{1}{2} \\ \tfrac{1}{2} & \tfrac{1}{3} & b \\ \tfrac{1}{4} & \tfrac{1}{3} & a \end{bmatrix}$$

6.3 Systems of linear differential equations

Consider the system of differential equations
$$\begin{aligned}
y_1'(t) &= a_{1,1} y_1(t) + a_{1,2} y_2(t) \\
y_2'(t) &= a_{2,1} y_1(t) + a_{2,2} y_2(t)
\end{aligned} \tag{6.7}$$
This system of equations describes a situation where we have two quantities $y_1$ and $y_2$, where the rate of change of each one of the quantities depends on the values of both. We can rewrite this as a matrix equation. Let $y(t)$ be the vector
$$y(t) = \begin{bmatrix} y_1(t) \\ y_2(t) \end{bmatrix},$$
and define the derivative of a vector to be the vector of derivatives, i.e.,
$$y'(t) = \begin{bmatrix} y_1'(t) \\ y_2'(t) \end{bmatrix}.$$
Define $A$ to be the matrix
$$A = \begin{bmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{bmatrix}.$$
Then the system of equations (6.7) can be rewritten
$$y'(t) = Ay(t).$$
A general system of linear differential equations has this form, except $y(t)$ is an $n$-dimensional vector and $A$ is an $n \times n$ matrix. How can we find solutions to such
a system of equations? Taking a hint from the scalar case, we can try to find solutions of the form
$$y(t) = e^{\lambda t} x$$
where $x$ is a fixed vector (not depending on $t$). With this definition
$$y'(t) = \lambda e^{\lambda t} x$$
so that $y' = Ay$ whenever
$$\lambda e^{\lambda t} x = A e^{\lambda t} x = e^{\lambda t} Ax$$
Dividing by $e^{\lambda t}$, this condition becomes
$$\lambda x = Ax.$$
In other words, $y(t) = e^{\lambda t} x$ is a (non-zero) solution exactly when $\lambda$ and $x$ are an eigenvalue eigenvector pair for $A$. So we can find as many solutions as we have eigenvalue eigenvector pairs.

To proceed we first notice that if $y_1(t)$ and $y_2(t)$ are two solutions to the equation $y' = Ay$, then a linear combination $c_1 y_1(t) + c_2 y_2(t)$ is also a solution, since
$$\begin{aligned}
\frac{d}{dt}\big(c_1 y_1(t) + c_2 y_2(t)\big) &= c_1 y_1'(t) + c_2 y_2'(t) \\
&= c_1 A y_1(t) + c_2 A y_2(t) \\
&= A\big(c_1 y_1(t) + c_2 y_2(t)\big)
\end{aligned}$$
Notice that we are assuming that the constants $c_1$ and $c_2$ do not depend on $t$. Similarly, if $y_1(t), y_2(t), \ldots, y_n(t)$ are $n$ solutions then $c_1 y_1(t) + c_2 y_2(t) + \cdots + c_n y_n(t)$ is a solution for any choice of constants $c_1, c_2, \ldots, c_n$.

Now suppose that $A$ is an $n \times n$ matrix. Suppose that $\lambda_1, \lambda_2, \ldots, \lambda_k$ are its eigenvalues with eigenvectors $x_1, x_2, \ldots, x_k$. Then we have that for any choice of constants $c_1, c_2, \ldots, c_k$,
$$y(t) = c_1 e^{\lambda_1 t} x_1 + c_2 e^{\lambda_2 t} x_2 + \cdots + c_k e^{\lambda_k t} x_k \tag{6.8}$$
is a solution. Have we found all solutions? In other words, could there be a solution of the equation that is not of this form, or is every solution of the form (6.8) for some choice of $c_1, c_2, \ldots, c_k$? There is a theorem in differential equations that says that given an initial condition $y_0$ there is one and only one solution of $y' = Ay$ satisfying $y(0) = y_0$. So our theoretical question above is equivalent to the following quite practical question. Given an initial vector $y_0$, does there exist a solution $y(t)$ of the form (6.8) whose value at zero is the given initial condition, i.e., $y(0) = y_0$?
This will be true if, given any vector $y_0$, one can find $c_1, c_2, \ldots, c_k$ so that
$$y(0) = c_1 x_1 + c_2 x_2 + \cdots + c_k x_k = y_0$$
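This is exactly the condition that the eigenvectors form a basis. It turns out that in the "bad" cases where there are not enough eigenvectors of $A$ to form a basis, there are solutions that don't have the form (6.8).

Now suppose that there are $n$ eigenvectors that do form a basis. How can we actually find the numbers $c_1, c_2, \ldots, c_n$ such that $c_1 x_1 + c_2 x_2 + \cdots + c_n x_n = y_0$? Just notice that this is a system of linear equations
$$\begin{bmatrix} x_1 & x_2 & \cdots & x_n \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix} = y_0$$
(the matrix has the eigenvectors in its columns) so you know what to do.

Example 6.9 Find the general solution to the system of equations
$$\begin{aligned}
y_1'(t) &= 2y_1(t) + y_2(t) \\
y_2'(t) &= y_1(t) + 2y_2(t)
\end{aligned}$$
This is equivalent to the matrix equation
$$y'(t) = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} y(t)$$
The matrix $\begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$ has eigenvalues and eigenvectors $\lambda_1 = 1$, $x_1 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$ and $\lambda_2 = 3$, $x_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$. The eigenvectors $x_1$ and $x_2$ form a basis, so the general solution is
$$y(t) = c_1 e^{\lambda_1 t} x_1 + c_2 e^{\lambda_2 t} x_2 = c_1 e^{t} \begin{bmatrix} 1 \\ -1 \end{bmatrix} + c_2 e^{3t} \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

Example 6.10 Continue Example 6.9 above and find the solution satisfying the initial condition
$$y(0) = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$$
We have to find constants $c_1$ and $c_2$ so that
$$c_1 \begin{bmatrix} 1 \\ -1 \end{bmatrix} + c_2 \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$$
This is the same as solving
$$\begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$$
The solution is
$$\begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} 1/2 \\ 3/2 \end{bmatrix}$$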
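Examples 6.9 and 6.10 can be verified numerically. The sketch below is illustrative only: the anonymous functions `y` and `dy`, the test time `t = 0.7`, and the comparison against `expm` are assumptions of this example. `expm` is MATLAB's matrix exponential, which these notes do not use; it is brought in here purely as an independent check, relying on the standard fact that $y(t) = e^{At} y_0$ solves $y' = Ay$, $y(0) = y_0$.

```matlab
A   = [2 1; 1 2];
X   = [1 1; -1 1];            % eigenvectors x1, x2 in the columns
lam = [1; 3];                 % matching eigenvalues
y0  = [2; 1];                 % initial condition of Example 6.10

c = X \ y0;                   % expected (1/2, 3/2)

% y(t) = c1*exp(t)*x1 + c2*exp(3t)*x2 and its derivative, for scalar t.
y  = @(t) X * (c .* exp(lam*t));
dy = @(t) X * (c .* lam .* exp(lam*t));

t = 0.7;                                 % arbitrary test time
ode_gap = norm(dy(t) - A*y(t));          % y' = A*y should hold
ic_gap  = norm(y(0) - y0);               % initial condition should hold
ref_gap = norm(y(t) - expm(A*t)*y0);     % independent check via the matrix exponential
```

All three gaps should be at rounding level, confirming both the general solution and the constants $c_1 = 1/2$, $c_2 = 3/2$.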
Example 6.11 Now let's do an example where the eigenvalues are complex. Consider the equation
$$y'(t) = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} y(t).$$
Find the general solution of this differential equation. The matrix $\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$ has eigenvalues and eigenvectors $\lambda_1 = i$, $x_1 = \begin{bmatrix} -i \\ 1 \end{bmatrix}$ and the complex conjugates $\lambda_2 = -i$, $x_2 = \begin{bmatrix} i \\ 1 \end{bmatrix}$. The eigenvectors $x_1$ and $x_2$ form a basis, so the general solution is
$$y(t) = c_1 e^{\lambda_1 t} x_1 + c_2 e^{\lambda_2 t} x_2 = c_1 e^{it} \begin{bmatrix} -i \\ 1 \end{bmatrix} + c_2 e^{-it} \begin{bmatrix} i \\ 1 \end{bmatrix}$$
In most applications, the solutions that we are interested in are real. The solution above looks decidedly complex! Remember, however, that the constants $c_1$ and $c_2$ can be complex too. Perhaps for special choices of $c_1$ and $c_2$ the solution will turn out to be real. This is in fact always true when the original matrix is real. In this case the complex eigenvalues and eigenvectors occur in conjugate pairs. So if $y_1(t) = e^{\lambda t} x$ is a solution, then so is $\bar{y}_1(t) = e^{\bar{\lambda} t} \bar{x}$. So if we choose $c_1 = a/2$ and $c_2 = a/2$ for a real number $a$, then
$$c_1 e^{\lambda t} x + c_2 e^{\bar{\lambda} t} \bar{x} = \frac{a}{2}\left(e^{\lambda t} x + e^{\bar{\lambda} t} \bar{x}\right) = a \operatorname{Re}\left(e^{\lambda t} x\right)$$
(here $\operatorname{Re}$ stands for the real part; we used that for a complex number $z$, $z + \bar{z} = 2\operatorname{Re} z$). Similarly, if we choose $c_1 = a/(2i)$ and $c_2 = -a/(2i)$, then
$$c_1 e^{\lambda t} x + c_2 e^{\bar{\lambda} t} \bar{x} = \frac{a}{2i}\left(e^{\lambda t} x - e^{\bar{\lambda} t} \bar{x}\right) = a \operatorname{Im}\left(e^{\lambda t} x\right)$$
where $\operatorname{Im}$ denotes the imaginary part of the argument (here we used $z - \bar{z} = 2i \operatorname{Im} z$). The upshot is that the real and imaginary parts of a solution are also solutions. It's sometimes easier to just start with one of the complex solutions and find its real and imaginary parts. This gives us two real solutions to work with. Notice that it doesn't matter which one of the complex solutions we pick. Because they are conjugate, their real parts are the same, and their imaginary parts differ by only a minus sign.

Example 6.12 Continuing Example 6.11, we have
$$y_1(t) = e^{it} \begin{bmatrix} -i \\ 1 \end{bmatrix}$$
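$$\begin{aligned}
&= \begin{bmatrix} -i e^{it} \\ e^{it} \end{bmatrix} \\
&= \begin{bmatrix} -i(\cos(t) + i \sin(t)) \\ \cos(t) + i \sin(t) \end{bmatrix} \\
&= \begin{bmatrix} -i \cos(t) + \sin(t) \\ \cos(t) + i \sin(t) \end{bmatrix} \\
&= \begin{bmatrix} \sin(t) \\ \cos(t) \end{bmatrix} + i \begin{bmatrix} -\cos(t) \\ \sin(t) \end{bmatrix}
\end{aligned}$$
The real part and imaginary part are
$$\begin{bmatrix} \sin(t) \\ \cos(t) \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} -\cos(t) \\ \sin(t) \end{bmatrix}$$
One can check directly that these are solutions to the original equation. The general solution can also be written
$$a_1 \begin{bmatrix} \sin(t) \\ \cos(t) \end{bmatrix} + a_2 \begin{bmatrix} -\cos(t) \\ \sin(t) \end{bmatrix}$$
The advantage of this way of writing the solution is that if we choose $a_1$ and $a_2$ to be real the solution is real too.

Example 6.13 Continuing Example 6.12, now suppose we want to satisfy an initial condition. Let's find the solution $y(t)$ of the equation that satisfies
$$y(0) = \begin{bmatrix} 2 \\ -2 \end{bmatrix}.$$
There are two ways to proceed. We could use the complex form of the general solution. Then we must find $c_1$ and $c_2$ such that
$$c_1 \begin{bmatrix} -i \\ 1 \end{bmatrix} + c_2 \begin{bmatrix} i \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ -2 \end{bmatrix}$$
This amounts to solving
$$\begin{bmatrix} -i & i \\ 1 & 1 \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} 2 \\ -2 \end{bmatrix}$$
The solution is
$$\begin{aligned}
\begin{bmatrix} c_1 \\ c_2 \end{bmatrix} &= \begin{bmatrix} -i & i \\ 1 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 2 \\ -2 \end{bmatrix} \\
&= \frac{1}{-2i} \begin{bmatrix} 1 & -i \\ -1 & -i \end{bmatrix} \begin{bmatrix} 2 \\ -2 \end{bmatrix} \\
&= \begin{bmatrix} -1 + i \\ -1 - i \end{bmatrix}
\end{aligned}$$
So $c_1 = -1 + i$ and $c_2 = -1 - i$. (As a check, $-ic_1 + ic_2 = 2$ and $c_1 + c_2 = -2$; note also that $c_2 = \bar{c}_1$, as it must be for the solution to be real.) If we plug these into the expression for the general solution we get the right answer. However, there is still a fair amount of complex arithmetic needed to show explicitly that the solution is real. It's easier to start with the real solutions. In this approach we must find $a_1$ and $a_2$ so that
$$a_1 \begin{bmatrix} \sin(0) \\ \cos(0) \end{bmatrix} + a_2 \begin{bmatrix} -\cos(0) \\ \sin(0) \end{bmatrix} = a_1 \begin{bmatrix} 0 \\ 1 \end{bmatrix} + a_2 \begin{bmatrix} -1 \\ 0 \end{bmatrix} = \begin{bmatrix} 2 \\ -2 \end{bmatrix}$$
Thus $a_1 = a_2 = -2$ so the solution is
$$-2 \begin{bmatrix} \sin(t) \\ \cos(t) \end{bmatrix} - 2 \begin{bmatrix} -\cos(t) \\ \sin(t) \end{bmatrix} = \begin{bmatrix} -2\sin(t) + 2\cos(t) \\ -2\cos(t) - 2\sin(t) \end{bmatrix}$$

Example 6.14 Now let's do an example where the eigenvalues are complex, and have both a real and imaginary part. Let's solve
$$y'(t) = \begin{bmatrix} -1 & 1 \\ -1 & -1 \end{bmatrix} y(t)$$
with initial condition
$$y(0) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$
The first step is to find the eigenvalues and eigenvectors. I'll omit the computations. The result is $\lambda_1 = -1 + i$ with eigenvector $x_1 = \begin{bmatrix} 1 \\ i \end{bmatrix}$ and the complex conjugate $\lambda_2 = -1 - i$ with eigenvector $x_2 = \begin{bmatrix} 1 \\ -i \end{bmatrix}$. Thus a solution is
$$y_1(t) = e^{(-1+i)t} \begin{bmatrix} 1 \\ i \end{bmatrix}$$
To find real solutions we calculate the real and imaginary parts of this.
$$\begin{aligned}
y_1(t) &= \begin{bmatrix} e^{(-1+i)t} \\ i e^{(-1+i)t} \end{bmatrix} \\
&= \begin{bmatrix} e^{-t} e^{it} \\ i e^{-t} e^{it} \end{bmatrix} \\
&= \begin{bmatrix} e^{-t}(\cos(t) + i \sin(t)) \\ i e^{-t}(\cos(t) + i \sin(t)) \end{bmatrix} \\
&= \begin{bmatrix} e^{-t} \cos(t) \\ -e^{-t} \sin(t) \end{bmatrix} + i \begin{bmatrix} e^{-t} \sin(t) \\ e^{-t} \cos(t) \end{bmatrix}
\end{aligned}$$
So the general solution can be written
$$a_1 \begin{bmatrix} e^{-t} \cos(t) \\ -e^{-t} \sin(t) \end{bmatrix} + a_2 \begin{bmatrix} e^{-t} \sin(t) \\ e^{-t} \cos(t) \end{bmatrix}$$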
To satisfy the initial condition, we need
$$a_1 \begin{bmatrix} 1 \\ 0 \end{bmatrix} + a_2 \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$
so that $a_1 = 1$ and $a_2 = 1$. Thus the solution is
$$y(t) = e^{-t} \begin{bmatrix} \cos(t) + \sin(t) \\ -\sin(t) + \cos(t) \end{bmatrix}$$

6.3.1 Problems

Problem 6.15 Find the general solution to $y' = Ay$ when $A = \begin{bmatrix} -2 & -8 \\ 4 & 10 \end{bmatrix}$. (Hint: This matrix appeared in Problem 6.3.) Find the solution satisfying the initial condition $y(0) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$.

Problem 6.16 Find the general solution to $y' = Ay$ when $A = \begin{bmatrix} 1 & -2 \\ 2 & 1 \end{bmatrix}$. Find both the complex form and the real form. Find the solution satisfying the initial condition $y(0) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$.

Problem 6.17 Find the general solution to $y' = Ay$ when $A = \begin{bmatrix} 6 & 0 & 13 \\ 5 & 1 & 13 \\ -2 & 0 & -4 \end{bmatrix}$. Find both the complex form and the real form. Find the solution satisfying the initial condition $y(0) = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$.

Problem 6.18 Find the general solution of the following system of differential equations:
$$y'(t) = \begin{bmatrix} 1 & 1 \\ 5 & 1 \end{bmatrix} y(t)$$

Problem 6.19 Find the solution of the following system of differential equations that satisfies the initial condition $y(0) = [0, 1]^T$.
$$y'(t) = \begin{bmatrix} 3 & -1 \\ 1 & -1 \end{bmatrix} y(t)$$

Problem 6.20 Find the real form of the solution of the following system of differential equations that satisfies the initial condition $y(0) = [2, -1]^T$.
$$y'(t) = \begin{bmatrix} -3 & -1 \\ 7 & 1 \end{bmatrix} y(t)$$

Problem 6.21 Is it true that every $3 \times 3$ matrix with real entries always has at least one real eigenvalue? Why?
6.4 LCR circuits

6.4.1 Capacitors and inductors

We introduce new elements into our study of electrical circuits: capacitors and inductors. At an instant in time $t$, a capacitor acts as a voltage source with voltage $V(t)$. At a given instant in time an inductor acts as a current source with current $I(t)$. These circuit elements are shown in Figure 6.2.

[Figure 6.2: Diagrams of capacitors and inductors.]

The voltage across the capacitor changes at a rate proportional to the current $i$ through it (with direction as in Figure 6.2):
$$\frac{dV}{dt} = -\frac{i}{C} \tag{6.9}$$
where the constant $C$ is the capacitance of the capacitor, with MKS units of Farads. Note that a capacitor with large capacitance can provide more current for the same rate of drop in voltage. The current through an inductor changes at a rate proportional to the voltage $v$ across it (with direction as in Figure 6.2):
$$\frac{dI}{dt} = -\frac{v}{L} \tag{6.10}$$
where the constant $L$ is the inductance, with MKS units of Henrys.

We consider here, at a high level, the behaviour in time of circuits with many capacitors, inductors and resistors. At a given instant in time, capacitors can be treated as voltage sources and inductors as current sources. To determine how these sources change in time, the currents through the capacitors (6.9) and the voltages across the inductors (6.10) are needed. However, determining these from the sources is the fundamental circuit problem considered in Section 3.5.3 and Section 4.6. Thus, a system of differential equations can be derived for circuit networks with capacitors and inductors by solving the fundamental problem for the circuit and scaling the result by the capacitance and inductance of the elements. Some specific examples are given below.
6.4.2 Differential equations for LCR circuits

Example 6.15 The simplest example of an LRC circuit is the simple series circuit shown in Figure 6.3. We will derive a system of differential equations for $V(t)$ and $I(t)$ in the circuit and determine what combination of values for the components $L$, $R$, and $C$ lead to oscillations.

Figure 6.3: A simple LRC circuit.

The equation is simple enough that the solution to the fundamental problem can just be written down. The current through the capacitor is $I$ and the voltage $v$ across the inductor is $IR - V$. Using the relationships (6.9, 6.10) we write
$$\frac{dV}{dt} = -I/C$$
$$\frac{dI}{dt} = -(IR - V)/L = V/L - IR/L$$
or in matrix vector form
$$x' = Ax$$
where $x = (V, I)^T$ and
$$A = \begin{bmatrix} 0 & -1/C \\ 1/L & -R/L \end{bmatrix}$$
The eigenvalues $\lambda$ of $A$ satisfy
$$\det(A - \lambda I) = \lambda^2 + \frac{R}{L}\lambda + \frac{1}{LC} = 0$$
with solutions
$$\lambda = \frac{1}{2L}\left(-R \pm \sqrt{R^2 - 4L/C}\right).$$
Considering this expression carefully you can see that if the $\lambda$ values are real they are negative. If they are complex, the real part is $-R/(2L)$. Thus solutions always decay exponentially. This is to be expected in a circuit with no external power. For the circuit to have oscillations, the eigenvalues must be complex, so
$$R^2 - 4L/C < 0$$
which is commonly rewritten as
$$\frac{R}{2}\sqrt{\frac{C}{L}} < 1.$$

Example 6.16 Consider the circuit shown in Figure 6.4. The capacitor has capacitance $C = 1$ Farad and the inductor inductance $L = 1$ Henry. The capacitor is charged up to 12V and then the switch is closed (so $E(0) = 12$ and $I(0) = 0$).

Figure 6.4: Circuit diagram for Example 6.16.

(a) Derive a system of differential equations for $I(t)$ and $E(t)$.
(b) We expect that $E(t)$ and $I(t) \to 0$ as $t \to \infty$ (no external power). Will there be oscillations in $E$ and $I$ in time?

Remember that if $i(t)$ is the current upward through the capacitor then
$$\frac{dE}{dt} = -\frac{i}{C} = -i \qquad (6.11)$$
since $C = 1$. If $e(t)$ is the voltage across the inductor as shown then
$$\frac{dI}{dt} = -\frac{e}{L} = -e \qquad (6.12)$$
since $L = 1$. It is still necessary to work out $i(t)$ and $e(t)$ in terms of $E(t)$ and $I(t)$. This is the fundamental problem for the circuit. We solved this problem in Example 3.13 in Section 3.6 and considered it from a different perspective in the additional notes to Chapter 4. We found that
$$i = \frac{5}{36}E - \frac{1}{6}I$$
$$e = \frac{1}{6}E + 6I$$
Inserting these into (6.11) and (6.12) gives the desired system for $E$ and $I$:
$$\frac{dE}{dt} = -\frac{5}{36}E + \frac{1}{6}I$$
$$\frac{dI}{dt} = -\frac{1}{6}E - 6I$$
or in vector form $x' = Ax$ where
$$x = \begin{bmatrix} E \\ I \end{bmatrix} \quad \text{and} \quad A = \begin{bmatrix} -5/36 & 1/6 \\ -1/6 & -6 \end{bmatrix}$$
An eigenanalysis of $A$ gives $\lambda_1 \approx -0.1436$ and $\lambda_2 \approx -5.9953$. Since these are not complex, the circuit does not exhibit oscillations.

6.4.3 Alternate description of LCR circuits

LCR circuits were presented differently in a previous version of the notes. That approach is reproduced below starting in the next paragraph. It follows the discussion in Section 3.5.4.

We now return to the circuit that we discussed previously, shown in Figure 6.5.

Figure 6.5: Circuit diagram.

Recall that we chose as basic variables $I_3$ and $V_4$ and solved for all the other variables in terms of these. The result was
$$I_1 = I_3$$
$$I_2 = \frac{1}{R_2}V_4$$
$$I_4 = I_3 - \frac{1}{R_2}V_4$$
$$V_1 = R_1 I_3$$
$$V_2 = V_4$$
$$V_3 = -R_1 I_3 - V_4$$
Now we can complete the job and determine $I_3$ and $V_4$. We have to take into account now that the currents and voltages are functions of time. The relations
between currents and voltages across capacitors and inductors involve the time derivatives. If $I$ and $V$ are the current flowing through and the voltage across a capacitor with capacitance $C$, then
$$\frac{dV}{dt} = \frac{1}{C}I$$
If $I$ and $V$ are the current flowing through and the voltage across an inductor with inductance $L$, then
$$\frac{dI}{dt} = \frac{1}{L}V$$
Notice that we have chosen as basic the variables that get differentiated. Using these relations for $I_3$ and $V_4$ yields
$$\frac{dI_3}{dt} = \frac{1}{L}V_3$$
$$\frac{dV_4}{dt} = \frac{1}{C}I_4$$
Now we re-express everything in terms of $I_3$ and $V_4$ using the equations we derived previously.
$$\frac{dI_3}{dt} = \frac{1}{L}(-R_1 I_3 - V_4) = \frac{-R_1}{L}I_3 - \frac{1}{L}V_4$$
$$\frac{dV_4}{dt} = \frac{1}{C}\left(I_3 - \frac{1}{R_2}V_4\right) = \frac{1}{C}I_3 - \frac{1}{R_2 C}V_4$$
This can be written as
$$\frac{d}{dt}\begin{bmatrix} I_3 \\ V_4 \end{bmatrix} = \begin{bmatrix} -\frac{R_1}{L} & -\frac{1}{L} \\ \frac{1}{C} & -\frac{1}{R_2 C} \end{bmatrix} \begin{bmatrix} I_3 \\ V_4 \end{bmatrix}$$
Let's try to determine for what values of $R_1$, $L$, $C$ and $R_2$ the circuit exhibits oscillations. Recall that the solution will involve sines and cosines whenever the matrix has complex eigenvalues. The characteristic polynomial is $\det(A - \lambda I) = \lambda^2 + b\lambda + c$, where
$$b = \frac{R_1}{L} + \frac{1}{R_2 C}$$
and
$$c = \frac{R_1}{R_2 LC} + \frac{1}{LC}.$$
The eigenvalues are the roots of this polynomial, given by $(-b \pm \sqrt{b^2 - 4c})/2$. These will be complex if $b^2 < 4c$, i.e., if
$$\left(\frac{R_1}{L} + \frac{1}{R_2 C}\right)^2 < 4\left(\frac{R_1}{R_2 LC} + \frac{1}{LC}\right)$$
Notice that this can be achieved by decreasing $R_1$ and increasing $R_2$.
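The oscillation criterion $b^2 < 4c$ is easy to try out numerically. The following is a minimal Python sketch, not part of the original notes; the function names `circuit_coefficients`, `circuit_eigenvalues` and `oscillates` are chosen here only for illustration.

```python
import cmath

def circuit_coefficients(R1, R2, L, C):
    """Coefficients b and c of the polynomial lambda^2 + b*lambda + c."""
    b = R1 / L + 1.0 / (R2 * C)
    c = R1 / (R2 * L * C) + 1.0 / (L * C)
    return b, c

def circuit_eigenvalues(R1, R2, L, C):
    """Roots (-b +/- sqrt(b^2 - 4c)) / 2, allowing complex values."""
    b, c = circuit_coefficients(R1, R2, L, C)
    root = cmath.sqrt(b * b - 4.0 * c)
    return (-b + root) / 2.0, (-b - root) / 2.0

def oscillates(R1, R2, L, C):
    """True when the eigenvalues are complex, i.e. b^2 < 4c."""
    b, c = circuit_coefficients(R1, R2, L, C)
    return b * b < 4.0 * c

# All components equal to 1: b = 2, c = 2, so b^2 = 4 < 8 = 4c.
lam_plus, lam_minus = circuit_eigenvalues(1.0, 1.0, 1.0, 1.0)
print(oscillates(1.0, 1.0, 1.0, 1.0), lam_plus, lam_minus)

# A large R1 gives b = 11, c = 11, so b^2 = 121 > 44 and no oscillation.
print(oscillates(10.0, 1.0, 1.0, 1.0))
```

With every component equal to 1 the eigenvalues come out as $-1 \pm i$, so the circuit oscillates while decaying; raising $R_1$ to 10 makes both eigenvalues real.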
6.4.4 Problems

Problem 6.22 In the circuit in Figure 6.5, suppose that $R_1 = R_2 = 1$ ohm, $C = 1$ farad and $L = 1$ henry. If the initial current through the inductor is $I_3(0) = 1$ ampere and the initial voltage across the capacitor is $V_4(0) = 1$ volt, find $I_3(t)$ and $V_4(t)$ for all later times. What is $V_1(t)$?

Problem 6.23 Consider the circuit with diagram in Figure 6.6. Write down the system of equations satisfied by $I$ and $V$. For what values of $L$, $C$ and $R$ does the circuit exhibit oscillations? Suppose that $R = 1$ ohm, $C = 1$ farad and $L = 1$ henry. If the initial current through the inductor is $I(0) = 1$ ampere and the initial voltage across the capacitor is $V(0) = 1$ volt, find $I(t)$ and $V(t)$ for all later times.

Figure 6.6: Circuit diagram for Problem 6.23.

6.5 Additional Topics

6.5.1 Diagonalization

Diagonal matrices (that is, matrices that have zero entries except on the diagonal) are extremely easy to work with. For a start, the eigenvalues of a diagonal matrix are exactly the diagonal entries. If
$$D = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}$$
then $\det(D - \lambda I) = (\lambda_1 - \lambda)(\lambda_2 - \lambda) \cdots (\lambda_n - \lambda)$, which is zero precisely when $\lambda$ equals one of $\lambda_1, \lambda_2, \ldots, \lambda_n$. The corresponding eigenvectors are just the standard basis vectors $e_1, e_2, \ldots, e_n$.
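It is also easy to compute powers of a diagonal matrix. We simply obtain
$$D^k = \begin{bmatrix} \lambda_1^k & 0 & \cdots & 0 \\ 0 & \lambda_2^k & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n^k \end{bmatrix}$$
This formula makes it easy to compute the matrix exponential of $D$. Recall that the matrix $e^{tD}$ is defined to be the matrix power series
$$e^{tD} = I + tD + \frac{t^2}{2}D^2 + \frac{t^3}{3!}D^3 + \cdots$$
Using the formula above we find that
$$e^{tD} = \begin{bmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{bmatrix} + \begin{bmatrix} t\lambda_1 & 0 & \cdots & 0 \\ 0 & t\lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & t\lambda_n \end{bmatrix} + \begin{bmatrix} \frac{t^2\lambda_1^2}{2} & 0 & \cdots & 0 \\ 0 & \frac{t^2\lambda_2^2}{2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \frac{t^2\lambda_n^2}{2} \end{bmatrix} + \cdots$$
$$= \begin{bmatrix} 1 + t\lambda_1 + \frac{t^2\lambda_1^2}{2} + \cdots & 0 & \cdots & 0 \\ 0 & 1 + t\lambda_2 + \frac{t^2\lambda_2^2}{2} + \cdots & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 + t\lambda_n + \frac{t^2\lambda_n^2}{2} + \cdots \end{bmatrix}$$
$$= \begin{bmatrix} e^{t\lambda_1} & 0 & \cdots & 0 \\ 0 & e^{t\lambda_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & e^{t\lambda_n} \end{bmatrix}$$
Things are not quite so simple for an arbitrary $n \times n$ matrix $A$. However, if $A$ has a basis of eigenvectors then it turns out that there exists an invertible matrix $B$ such that $AB = BD$, where $D$ is the diagonal matrix whose diagonal elements are the eigenvalues of $A$. Multiplying by $B^{-1}$ from the right or from the left gives
$$A = BDB^{-1}, \qquad D = B^{-1}AB$$
In fact, $B$ is simply the matrix whose columns are the eigenvectors of $A$. In other words, if $x_1, x_2, \ldots, x_n$ are the eigenvectors for $A$ then $B = [x_1 \,|\, x_2 \,|\, \cdots \,|\, x_n]$. To see this notice that
$$AB = A[x_1 \,|\, x_2 \,|\, \cdots \,|\, x_n] = [Ax_1 \,|\, Ax_2 \,|\, \cdots \,|\, Ax_n] = [\lambda_1 x_1 \,|\, \lambda_2 x_2 \,|\, \cdots \,|\, \lambda_n x_n]$$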
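$$= [x_1 \,|\, x_2 \,|\, \cdots \,|\, x_n] \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} = BD$$
The assumption that $A$ has a basis of eigenvectors implies that the matrix $B$ is invertible.

Using the representation $A = BDB^{-1}$ it is easy to calculate powers of $A$. We have
$$A^2 = BDB^{-1}BDB^{-1} = BDIDB^{-1} = BD^2B^{-1}$$
and similarly
$$A^k = BD^kB^{-1}$$
Therefore we can now also sum the power series for the exponential and obtain
$$e^{tA} = Be^{tD}B^{-1}$$

6.5.2 Computing high powers of a matrix

Recall that when we were discussing the random walk problem, we ended up with the problem of computing the limit for large $n$ of $P^n x_0$, where $P$ is the matrix of transition probabilities. We can now solve this problem using diagonalization. Let's do an example. Suppose that
$$P = \begin{bmatrix} \frac{1}{3} & \frac{1}{4} \\ \frac{2}{3} & \frac{3}{4} \end{bmatrix}$$
We wish to compute $P^n$ for large $n$. We begin by diagonalizing $P$. This involves finding the eigenvalues and eigenvectors. I won't give the details of this computation. The results are $\lambda_1 = 1$, $x_1 = \begin{bmatrix} 1 \\ 8/3 \end{bmatrix}$ and $\lambda_2 = 1/12$, $x_2 = \begin{bmatrix} -1 \\ 1 \end{bmatrix}$. So
$$P = \begin{bmatrix} 1 & -1 \\ \frac{8}{3} & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & \frac{1}{12} \end{bmatrix} \begin{bmatrix} 1 & -1 \\ \frac{8}{3} & 1 \end{bmatrix}^{-1}$$
and
$$P^n = \begin{bmatrix} 1 & -1 \\ \frac{8}{3} & 1 \end{bmatrix} \begin{bmatrix} 1^n & 0 \\ 0 & \left(\frac{1}{12}\right)^n \end{bmatrix} \begin{bmatrix} 1 & -1 \\ \frac{8}{3} & 1 \end{bmatrix}^{-1}$$
But $1^n = 1$ for all $n$ and $\left(\frac{1}{12}\right)^n \to 0$ as $n \to \infty$, since $\frac{1}{12} < 1$. So
$$\lim_{n \to \infty} P^n = \begin{bmatrix} 1 & -1 \\ \frac{8}{3} & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & -1 \\ \frac{8}{3} & 1 \end{bmatrix}^{-1} = \begin{bmatrix} 1 & -1 \\ \frac{8}{3} & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} \frac{3}{11} & \frac{3}{11} \\ -\frac{8}{11} & \frac{3}{11} \end{bmatrix} = \begin{bmatrix} \frac{3}{11} & \frac{3}{11} \\ \frac{8}{11} & \frac{8}{11} \end{bmatrix}$$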
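The limit above can be checked by multiplying $P$ by itself directly. The sketch below is an illustration added here, not part of the original notes. It uses exact rational arithmetic from Python's standard `fractions` module; the helper names `matmul` and `matpow` are hypothetical.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(X, n):
    """X to the power n (n >= 1) by repeated multiplication."""
    result = X
    for _ in range(n - 1):
        result = matmul(result, X)
    return result

P = [[F(1, 3), F(1, 4)],
     [F(2, 3), F(3, 4)]]
limit = [[F(3, 11), F(3, 11)],
         [F(8, 11), F(8, 11)]]

P20 = matpow(P, 20)
# Largest entrywise gap between P^20 and the claimed limit.
gap = max(abs(P20[i][j] - limit[i][j]) for i in range(2) for j in range(2))
print(float(gap))
```

The gap shrinks by a factor of 12 with each extra power of $P$, as expected from the second eigenvalue $1/12$.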
6.5.3 Another formula for the determinant

If $A$ has a basis of eigenvectors, then we can get another formula for the determinant. Using the multiplicative property of the determinant, we have
$$\det(A) = \det(BDB^{-1}) = \det(B)\det(D)\det(B)^{-1} = \det(D).$$
But $\det(D)$ is just the product of the diagonal elements, i.e., the eigenvalues. Thus the determinant of $A$ is the product of its eigenvalues:
$$\det(A) = \lambda_1 \lambda_2 \cdots \lambda_n.$$
Actually, this is true even if $A$ doesn't have a basis of eigenvectors and isn't diagonalizable.

6.5.4 The matrix exponential and differential equations

The matrix exponential $e^{tA}$ can be used to solve the differential equation
$$y'(t) = Ay(t)$$
with initial condition
$$y(0) = x_0$$
To see this notice that $e^{tA}$ satisfies the differential equation $\frac{d}{dt}e^{tA} = Ae^{tA}$. This follows from the power series representation
$$e^{tA} = I + tA + \frac{t^2}{2}A^2 + \frac{t^3}{3!}A^3 + \cdots$$
since
$$\frac{d}{dt}e^{tA} = A + \frac{2t}{2}A^2 + \frac{3t^2}{3!}A^3 + \cdots = A + tA^2 + \frac{t^2}{2!}A^3 + \cdots = A\left(I + tA + \frac{t^2}{2}A^2 + \cdots\right) = Ae^{tA}$$
Also
$$e^{0A} = I$$
These two facts imply that $y(t) = e^{tA}x_0$ is the solution to our differential equation and initial condition, since $y'(t) = \frac{d}{dt}e^{tA}x_0 = Ae^{tA}x_0 = Ay(t)$ and $y(0) = e^{0A}x_0 = Ix_0 = x_0$.

The matrix exponential is a nice theoretical construction. However, to actually compute the matrix exponential using diagonalization involves just the same ingredients (computing the eigenvalues and eigenvectors) as our original solution. In fact it is more work.
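To see the power series and the diagonalization formula $e^{tA} = Be^{tD}B^{-1}$ agree on a concrete case, here is a small Python sketch, added for illustration only. It assumes the matrix $A$ of Problem 6.15, whose eigenvalues 2 and 6 and eigenvectors $(2,-1)$ and $(1,-1)$ are worked out in Solution 6.3 b). The truncation at 60 terms and the helper names are choices made here, not part of the notes.

```python
import math

def matmul(X, Y):
    """Product of two square matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def exp_series(A, t, terms=60):
    """Truncated power series I + tA + (tA)^2/2! + ... for e^{tA}."""
    n = len(A)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]          # holds (tA)^k / k!
    for k in range(1, terms):
        term = [[t * v / k for v in row] for row in matmul(term, A)]
        total = [[total[i][j] + term[i][j] for j in range(n)]
                 for i in range(n)]
    return total

A = [[-2.0, -8.0], [4.0, 10.0]]
B = [[2.0, 1.0], [-1.0, -1.0]]       # columns are the eigenvectors
B_inv = [[1.0, 1.0], [-1.0, -2.0]]   # inverse of B (det B = -1)

def exp_diagonalized(t):
    """B e^{tD} B^{-1} with D = diag(2, 6)."""
    etD = [[math.exp(2.0 * t), 0.0], [0.0, math.exp(6.0 * t)]]
    return matmul(matmul(B, etD), B_inv)

t = 0.5
series = exp_series(A, t)
closed = exp_diagonalized(t)
max_diff = max(abs(series[i][j] - closed[i][j])
               for i in range(2) for j in range(2))
print(max_diff)
```

The two results differ only by floating-point rounding, which illustrates that the series definition and the diagonalization formula describe the same matrix.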
However, there is one situation where the matrix exponential gives us something new. This is the situation where $A$ does not have a basis of eigenvectors. The power series definition of the matrix exponential still makes sense, and we can compute it in certain special cases. Consider the matrix $A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$. This matrix does not have a basis of eigenvectors, so it cannot be diagonalized. However, in a homework problem, you showed that $e^{tA} = \begin{bmatrix} e^t & te^t \\ 0 & e^t \end{bmatrix}$. Thus the solution to
$$y'(t) = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} y(t)$$
with initial condition
$$y(0) = \begin{bmatrix} 2 \\ 1 \end{bmatrix}$$
is
$$y(t) = e^{tA}\begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} e^t & te^t \\ 0 & e^t \end{bmatrix}\begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 2e^t + te^t \\ e^t \end{bmatrix}$$
Notice that this solution involves a power of $t$ in addition to exponentials.

6.5.5 Converting higher order equations into first order systems

So far we have only considered first order differential equations. In other words, the equations have only involved first derivatives $y'(t)$ and not higher derivatives like $y''(t)$. However, higher order equations, especially second order equations, occur often in practical problems. In this section we will show that a higher order linear differential equation can be converted into an equivalent first order system.

Suppose we want to solve the equation
$$y''(t) + ay'(t) + by(t) = 0$$
with initial conditions
$$y(0) = y_0, \qquad y'(0) = y_0'$$
Define the new functions $z_1(t)$ and $z_2(t)$ by
$$z_1(t) = y(t)$$
$$z_2(t) = y'(t)$$
Then
$$z_1'(t) = y'(t) = z_2(t)$$
$$z_2'(t) = y''(t) = -ay'(t) - by(t) = -az_2(t) - bz_1(t)$$
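and
$$z_1(0) = y_0, \qquad z_2(0) = y_0'$$
In other words the vector $\begin{bmatrix} z_1(t) \\ z_2(t) \end{bmatrix}$ satisfies the equation
$$\frac{d}{dt}\begin{bmatrix} z_1(t) \\ z_2(t) \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -b & -a \end{bmatrix}\begin{bmatrix} z_1(t) \\ z_2(t) \end{bmatrix}$$
with initial condition
$$\begin{bmatrix} z_1(0) \\ z_2(0) \end{bmatrix} = \begin{bmatrix} y_0 \\ y_0' \end{bmatrix}.$$

Example 6.17 Suppose we want to solve the second order equation
$$y'' + 4y' + y = 0$$
with initial conditions
$$y(0) = 1, \qquad y'(0) = 0$$
If we let $z_1(t) = y(t)$ and $z_2(t) = y'(t)$ then
$$\frac{d}{dt}\begin{bmatrix} z_1(t) \\ z_2(t) \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ -1 & -4 \end{bmatrix}\begin{bmatrix} z_1(t) \\ z_2(t) \end{bmatrix}$$
with initial condition
$$\begin{bmatrix} z_1(0) \\ z_2(0) \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}.$$
To solve this we first find the eigenvalues and eigenvectors. They are $\lambda_1 = -2 + \sqrt{3}$, $x_1 = \begin{bmatrix} 1 \\ -2 + \sqrt{3} \end{bmatrix}$ and $\lambda_2 = -2 - \sqrt{3}$, $x_2 = \begin{bmatrix} 1 \\ -2 - \sqrt{3} \end{bmatrix}$. So the general solution is
$$c_1 e^{(-2+\sqrt{3})t}\begin{bmatrix} 1 \\ -2 + \sqrt{3} \end{bmatrix} + c_2 e^{(-2-\sqrt{3})t}\begin{bmatrix} 1 \\ -2 - \sqrt{3} \end{bmatrix}$$
To satisfy the initial condition, we need
$$c_1 \begin{bmatrix} 1 \\ -2 + \sqrt{3} \end{bmatrix} + c_2 \begin{bmatrix} 1 \\ -2 - \sqrt{3} \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$$
The solution is
$$\begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} \sqrt{3}/3 + 1/2 \\ -\sqrt{3}/3 + 1/2 \end{bmatrix}$$
Thus
$$\begin{bmatrix} z_1(t) \\ z_2(t) \end{bmatrix} = (\sqrt{3}/3 + 1/2)e^{(-2+\sqrt{3})t}\begin{bmatrix} 1 \\ -2 + \sqrt{3} \end{bmatrix} + (-\sqrt{3}/3 + 1/2)e^{(-2-\sqrt{3})t}\begin{bmatrix} 1 \\ -2 - \sqrt{3} \end{bmatrix}$$
and so
$$y(t) = z_1(t) = (\sqrt{3}/3 + 1/2)e^{(-2+\sqrt{3})t} + (-\sqrt{3}/3 + 1/2)e^{(-2-\sqrt{3})t}$$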
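A quick numerical check of Example 6.17 can catch sign slips in the coefficients. The Python sketch below is an addition for illustration. It evaluates $y$, $y'$ and $y''$ from the closed form above and confirms the initial conditions and the differential equation; the function names `y` and `residual` are chosen here.

```python
import math

SQRT3 = math.sqrt(3.0)
lam1, lam2 = -2.0 + SQRT3, -2.0 - SQRT3        # eigenvalues
c1, c2 = SQRT3 / 3.0 + 0.5, -SQRT3 / 3.0 + 0.5  # coefficients from the example

def y(t, order=0):
    """The solution (order=0) or its first/second derivative (order=1, 2)."""
    return (c1 * lam1 ** order * math.exp(lam1 * t)
            + c2 * lam2 ** order * math.exp(lam2 * t))

def residual(t):
    """Left-hand side y'' + 4y' + y, which should vanish for all t."""
    return y(t, 2) + 4.0 * y(t, 1) + y(t, 0)

print(y(0.0), y(0.0, 1))   # initial position and velocity
print(max(abs(residual(t / 10.0)) for t in range(0, 50)))
```

The initial values come out as 1 and 0 up to rounding, and the residual stays at rounding level over the sampled times.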
Actually, to solve the equation $y''(t) + ay'(t) + by(t) = 0$ it is not really necessary to turn it into a first order system. We can simply try to find solutions of the form $y(t) = e^{\lambda t}$. If we plug this into the equation we get $(\lambda^2 + a\lambda + b)e^{\lambda t}$, which is zero if $\lambda$ is a root of $\lambda^2 + a\lambda + b$. This polynomial has two roots, which yields two solutions.

Still, the procedure of turning a higher order equation into a first order system is important. This is because on a computer it is much easier to solve a first order system than a high order equation. If the coefficients $a$ and $b$ are functions of $t$, then exact solutions (like exponentials) usually can't be found. However, one can still turn the equation into a first order system $y'(t) = A(t)y(t)$, where the matrix now depends on $t$, and solve this on a computer.

6.5.6 Springs and weights

To begin, let's consider the situation where we have a single weight hanging on a spring, shown in Figure 6.7.

Figure 6.7: A simple mass-spring system.

We want to determine how the weight moves in time. To do this we calculate the forces acting on the weight and use Newton's law of motion $F = ma$.

One force acting on the weight is the force of gravity. This acts in the positive $x$ direction (i.e., downward) and has magnitude $mg$. The other force is due to the spring. Its magnitude is $k(x - l)$ in the negative $x$ direction. The acceleration is the second derivative $x''(t)$. Thus the total force is $F = mg - k(x(t) - l)$ and $ma = mx''(t)$. Newton's law reads
$$mx''(t) = mg - k(x - l) = -kx + mg + lk$$
This is not quite in the form we can handle, due to the term $mg + lk$ on the right. What we must do is first find the equilibrium solution. In a previous lecture we found the equilibrium position by minimizing the potential energy. There is another, equivalent, way. That is to find the value of $x$ for which the total force is zero. In other words
$$-kx_{eq} + mg + lk = 0$$
or
$$x_{eq} = (mg + lk)/k$$
Notice that the total force can be written
$$-kx + mg + lk = -k(x - x_{eq})$$
Now let $y(t) = x(t) - x_{eq}$ be the displacement from the equilibrium point. Notice that $y'(t) = x'(t)$ and $y''(t) = x''(t)$, since $x_{eq}$ is a constant. So the equation for $y(t)$ is
$$my''(t) = -ky(t)$$
or
$$y''(t) + \frac{k}{m}y(t) = 0$$
We could turn this into a first order system. However, it is easier to try solutions of the form $e^{\lambda t}$. Substituting this into the equation yields
$$(\lambda^2 + k/m)e^{\lambda t} = 0$$
so we require that $\lambda^2 + k/m = 0$, or $\lambda = \pm i\sqrt{k/m}$. Thus, solutions are $e^{i\sqrt{k/m}\,t}$ and $e^{-i\sqrt{k/m}\,t}$. To obtain real solutions, we can take the real and imaginary parts. This gives as solutions $\sin(\sqrt{k/m}\,t)$ and $\cos(\sqrt{k/m}\,t)$, and the general solution is
$$c_1 \sin(\sqrt{k/m}\,t) + c_2 \cos(\sqrt{k/m}\,t)$$
We can make the equation a little more interesting by adding friction. A frictional force is proportional to the velocity and acts in the direction opposite to the motion. Thus the equation with friction reads
$$y''(t) + \beta y'(t) + \frac{k}{m}y(t) = 0$$
This can be solved by turning it into a first order system, or directly, using trial solutions of the form $e^{\lambda t}$ as above.

Now we turn to the problem with several weights and springs shown in Figure 6.8. In this problem matrices play an essential role. We begin by computing the forces acting on each weight. Let us start with the first weight. Gravity is pulling down, and the springs above and below are pushing or pulling with a force proportional to their extensions. Thus the total force on the first weight is $m_1 g - k_1(x_1 - l_1) + k_2(x_2 - x_1 - l_2)$. To get the signs right on the spring forces, think of what happens when one of the $x_i$'s gets large. For example, when $x_1$ gets large, the first spring stretches and pulls up, so the sign of the force should be negative for large $x_1$.
So Newton's equation for the first weight is
$$m_1 x_1''(t) = m_1 g - k_1(x_1 - l_1) + k_2(x_2 - x_1 - l_2) = -(k_1 + k_2)x_1 + k_2 x_2 + m_1 g + k_1 l_1 - k_2 l_2$$
or
$$x_1''(t) = -\frac{k_1 + k_2}{m_1}x_1 + \frac{k_2}{m_1}x_2 + g + \frac{k_1 l_1 - k_2 l_2}{m_1}$$
Figure 6.8: A more complicated mass-spring system.

Similarly the equations for the second and third weights are
$$x_2''(t) = \frac{k_2}{m_2}x_1 - \frac{k_2 + k_3}{m_2}x_2 + \frac{k_3}{m_2}x_3 + g + \frac{k_2 l_2 - k_3 l_3}{m_2}$$
$$x_3''(t) = \frac{k_3}{m_3}x_2 - \frac{k_3}{m_3}x_3 + g + \frac{k_3 l_3}{m_3}$$
This can be written as a second order matrix equation
$$x''(t) = Kx(t) + b$$
where
$$x(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \\ x_3(t) \end{bmatrix},$$
$$K = \begin{bmatrix} -\frac{k_1 + k_2}{m_1} & \frac{k_2}{m_1} & 0 \\ \frac{k_2}{m_2} & -\frac{k_2 + k_3}{m_2} & \frac{k_3}{m_2} \\ 0 & \frac{k_3}{m_3} & -\frac{k_3}{m_3} \end{bmatrix}$$
and
$$b = \begin{bmatrix} g + \frac{k_1 l_1 - k_2 l_2}{m_1} \\ g + \frac{k_2 l_2 - k_3 l_3}{m_2} \\ g + \frac{k_3 l_3}{m_3} \end{bmatrix}.$$
With this notation, the equilibrium solution is the value of $x$ that makes all the forces zero. That is,
$$Kx_{eq} + b = 0$$
or,
$$x_{eq} = -K^{-1}b$$
As in the case of one weight the force side of the equation can now be written as
$$Kx + b = K(x + K^{-1}b) = K(x - x_{eq})$$
so if we define
$$y(t) = x(t) - x_{eq},$$
the equation for $y(t)$ is
$$y''(t) = Ky(t)$$
To solve this second order $3 \times 3$ system, we could turn it into a first order $6 \times 6$ system. However, just as in the case of a single higher order equation we can proceed directly. We try trial solutions of the form $e^{\kappa t}y$. Substituting this into the equation, we see that the equation is satisfied if
$$\kappa^2 y = Ky$$
in other words, $\kappa^2$ is an eigenvalue of $K$ with eigenvector $y$, or $\kappa$ is one of the two square roots of an eigenvalue.

So, if $K$ has eigenvalues $\lambda_1$, $\lambda_2$ and $\lambda_3$ with eigenvectors $y_1$, $y_2$ and $y_3$, then six solutions of the equation are given by $e^{\sqrt{\lambda_1}t}y_1$, $e^{-\sqrt{\lambda_1}t}y_1$, $e^{\sqrt{\lambda_2}t}y_2$, $e^{-\sqrt{\lambda_2}t}y_2$, $e^{\sqrt{\lambda_3}t}y_3$ and $e^{-\sqrt{\lambda_3}t}y_3$. If some of the $\lambda_i$'s are negative, then these solutions are complex exponentials, and we may take their real and imaginary parts to get real solutions. The general solution is a linear combination of these, and the coefficients in the linear combination may be chosen to satisfy an initial condition.

To make this clear we will do an example. Suppose that all the masses $m_i$, lengths $l_i$ and spring constants $k_i$ are equal to 1. Then
$$K = \begin{bmatrix} -2 & 1 & 0 \\ 1 & -2 & 1 \\ 0 & 1 & -1 \end{bmatrix}$$
Suppose that the initial position of the weights is $x_1 = 30$, $x_2 = 60$ and $x_3 = 70$, and that the initial velocities are $x_1' = 1$ and $x_2' = x_3' = 0$. We will determine the positions of the weights for all subsequent times. The numbers in this problem don't turn out particularly nicely, so I'll just give them to 3 significant figures.

The first step is to find the eigenvalues and eigenvectors of $K$.
They are given by
$$\lambda_1 = -0.198 \qquad \lambda_2 = -1.55 \qquad \lambda_3 = -3.25$$
$$x_1 = \begin{bmatrix} 0.445 \\ 0.802 \\ 1.00 \end{bmatrix} \qquad x_2 = \begin{bmatrix} -1.25 \\ -0.555 \\ 1.00 \end{bmatrix} \qquad x_3 = \begin{bmatrix} 1.80 \\ -2.25 \\ 1.00 \end{bmatrix}$$
Let $\mu_1 = \sqrt{0.198} = 0.445$, $\mu_2 = \sqrt{1.55} = 1.25$ and $\mu_3 = \sqrt{3.25} = 1.80$. Then if $y(t) = x(t) - x_{eq}$, the general solution for $y(t)$ is
$$y(t) = (c_1 e^{i\mu_1 t} + d_1 e^{-i\mu_1 t})x_1 + (c_2 e^{i\mu_2 t} + d_2 e^{-i\mu_2 t})x_2 + (c_3 e^{i\mu_3 t} + d_3 e^{-i\mu_3 t})x_3$$
where $c_1, d_1, c_2, d_2, c_3, d_3$ are arbitrary constants. Taking real and imaginary parts, we can also write the general solution as
$$y(t) = (a_1\cos(\mu_1 t) + b_1\sin(\mu_1 t))x_1 + (a_2\cos(\mu_2 t) + b_2\sin(\mu_2 t))x_2 + (a_3\cos(\mu_3 t) + b_3\sin(\mu_3 t))x_3$$
where $a_1, b_1, a_2, b_2, a_3, b_3$ are arbitrary constants. Notice that we can find the general solution for $y(t) = x(t) - x_{eq}$ without knowing $x_{eq}$. However, since the initial conditions were given in terms of $x$ and not $y$, we now have to find $x_{eq}$ to be able to convert initial conditions for $x$ to initial conditions for $y$. If we work in units where $g = 10$ then
$$b = \begin{bmatrix} 10 \\ 10 \\ 11 \end{bmatrix}$$
and
$$x_{eq} = -K^{-1}b = \begin{bmatrix} 31 \\ 52 \\ 63 \end{bmatrix}$$
so
$$y(0) = x(0) - x_{eq} = \begin{bmatrix} 30 \\ 60 \\ 70 \end{bmatrix} - \begin{bmatrix} 31 \\ 52 \\ 63 \end{bmatrix} = \begin{bmatrix} -1 \\ 8 \\ 7 \end{bmatrix}$$
Also
$$y'(0) = x'(0) = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$$
So to satisfy the first initial condition, since $\cos(0) = 1$ and $\sin(0) = 0$, we need that
$$y(0) = a_1 x_1 + a_2 x_2 + a_3 x_3 = \begin{bmatrix} -1 \\ 8 \\ 7 \end{bmatrix}.$$
Explicitly, we need to solve
$$[x_1 \,|\, x_2 \,|\, x_3]\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} = \begin{bmatrix} -1 \\ 8 \\ 7 \end{bmatrix},$$
or
$$\begin{bmatrix} 0.445 & -1.25 & 1.80 \\ 0.802 & -0.555 & -2.25 \\ 1.00 & 1.00 & 1.00 \end{bmatrix}\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} = \begin{bmatrix} -1 \\ 8 \\ 7 \end{bmatrix}$$
This is not a pretty sight, but I can punch the numbers into my computer and find that
$$\begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} = \begin{bmatrix} 7.04 \\ 1.33 \\ -1.37 \end{bmatrix}$$
To satisfy the second initial condition, we differentiate the expression for the general solution of $y(t)$ and set $t = 0$. This gives
$$\mu_1 b_1 x_1 + \mu_2 b_2 x_2 + \mu_3 b_3 x_3 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$$
Solving this numerically gives
$$\begin{bmatrix} \mu_1 b_1 \\ \mu_2 b_2 \\ \mu_3 b_3 \end{bmatrix} = \begin{bmatrix} 0.242 \\ -0.435 \\ 0.194 \end{bmatrix}$$
Finally, we divide by the $\mu_i$'s to give
$$\begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} = \begin{bmatrix} 0.543 \\ -0.348 \\ 0.108 \end{bmatrix}$$
(the last entry is $0.194/1.80 \approx 0.108$). Now we have completely determined all the constants, so the solution is complete.

6.5.7 Problems

Problem 6.24 Consider the second order equation
$$y'' - 5y' + 6y = 0$$
with initial conditions $y(0) = 1$, $y'(0) = 0$. Solve this by turning it into a $2 \times 2$ system. Then solve it directly by using trial solutions $e^{\lambda t}$.

Problem 6.25 Consider the second order equation
$$y'' + y' + y = 0$$
with initial conditions $y(0) = 1$, $y'(0) = 0$. Solve this by turning it into a $2 \times 2$ system. Then solve it directly by using trial solutions $e^{\lambda t}$.

Problem 6.26 How can you turn a third order equation
$$y''' + ay'' + by' + cy = 0$$
into an equivalent $3 \times 3$ system of equations?

Problem 6.27 Suppose $K$ is a $3 \times 3$ matrix with eigenvalues and eigenvectors given by
$$\lambda_1 = -1 \qquad \lambda_2 = -4 \qquad \lambda_3 = -9$$
$$x_1 = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} \qquad x_2 = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} \qquad x_3 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$$
Find the solution of $y''(t) = Ky(t)$ satisfying
$$y(0) = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}$$
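$$y'(0) = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}$$

Problem 6.28 Consider a system of two hanging weights and springs. Suppose that all the masses, spring constants and spring lengths are equal to one, and that $g = 10$. Find the positions $x_1(t)$ and $x_2(t)$ for all times if $x_1(0) = 20$, $x_2(0) = 30$, $x_1'(0) = 1$, $x_2'(0) = 1$.

6.6 Solutions to Chapter Problems

Solution 6.1 We compute $\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 2 \end{bmatrix} = 2\begin{bmatrix} 1 \\ 1 \end{bmatrix}$, so $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ is an eigenvector with eigenvalue 2. Also $\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} 1 \\ -1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} = 0\begin{bmatrix} 1 \\ -1 \end{bmatrix}$, so $\begin{bmatrix} 1 \\ -1 \end{bmatrix}$ is an eigenvector with eigenvalue 0.

Solution 6.2 If $P$ projects onto some line, then a vector $x$ lying on that line doesn't get changed by $P$, so $Px = x$ and $x$ is an eigenvector with eigenvalue 1. On the other hand, if $x$ is perpendicular to the line, then $Px = 0 = 0x$, so $x$ is an eigenvector with eigenvalue 0.

Solution 6.3
a) $\det(A - \lambda I) = \lambda^2 - 9 = (\lambda - 3)(\lambda + 3)$ so the eigenvalues are $\lambda_1 = 3$ and $\lambda_2 = -3$. To find the eigenvector for $\lambda_1 = 3$ we must solve the homogeneous equation with matrix $\begin{bmatrix} -3 & 3 \\ 3 & -3 \end{bmatrix}$. The matrix reduces to $\begin{bmatrix} -3 & 3 \\ 0 & 0 \end{bmatrix}$ and the eigenvector is $x_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$. To find the eigenvector for $\lambda_2 = -3$ we must solve the homogeneous equation with matrix $\begin{bmatrix} 3 & 3 \\ 3 & 3 \end{bmatrix}$. The matrix reduces to $\begin{bmatrix} 3 & 3 \\ 0 & 0 \end{bmatrix}$ and the eigenvector is $x_2 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$.
b) $\det(A - \lambda I) = \lambda^2 - 8\lambda + 12 = (\lambda - 2)(\lambda - 6)$ so the eigenvalues are $\lambda_1 = 2$ and $\lambda_2 = 6$. To find the eigenvector for $\lambda_1 = 2$ we must solve the homogeneous equation with matrix $\begin{bmatrix} -4 & -8 \\ 4 & 8 \end{bmatrix}$. The matrix reduces to $\begin{bmatrix} -4 & -8 \\ 0 & 0 \end{bmatrix}$ and the eigenvector is $x_1 = \begin{bmatrix} 2 \\ -1 \end{bmatrix}$. To find the eigenvector for $\lambda_2 = 6$ we must solve the homogeneous equation with matrix $\begin{bmatrix} -8 & -8 \\ 4 & 4 \end{bmatrix}$. The matrix reduces to $\begin{bmatrix} -8 & -8 \\ 0 & 0 \end{bmatrix}$ and the eigenvector is $x_2 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$.
c) $\det(A - \lambda I) = \lambda^2 + 7\lambda + 6 = (\lambda + 6)(\lambda + 1)$ so the eigenvalues are $\lambda_1 = -6$ and $\lambda_2 = -1$. To find the eigenvector for $\lambda_1 = -6$ we must solve the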
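homogeneous equation with matrix $\begin{bmatrix} 35 & -10 \\ 105 & -30 \end{bmatrix}$. The matrix reduces to $\begin{bmatrix} 35 & -10 \\ 0 & 0 \end{bmatrix}$ and the eigenvector is $x_1 = \begin{bmatrix} 2 \\ 7 \end{bmatrix}$. To find the eigenvector for $\lambda_2 = -1$ we must solve the homogeneous equation with matrix $\begin{bmatrix} 30 & -10 \\ 105 & -35 \end{bmatrix}$. The matrix reduces to $\begin{bmatrix} 30 & -10 \\ 0 & 0 \end{bmatrix}$ and the eigenvector is $x_2 = \begin{bmatrix} 1 \\ 3 \end{bmatrix}$.
d) $\det(A - \lambda I) = \lambda^2 - 3\lambda - 10 = (\lambda + 2)(\lambda - 5)$ so the eigenvalues are $\lambda_1 = -2$ and $\lambda_2 = 5$. To find the eigenvector for $\lambda_1 = -2$ we must solve the homogeneous equation with matrix $\begin{bmatrix} -7 & -14 \\ 7 & 14 \end{bmatrix}$. The matrix reduces to $\begin{bmatrix} -7 & -14 \\ 0 & 0 \end{bmatrix}$ and the eigenvector is $x_1 = \begin{bmatrix} 2 \\ -1 \end{bmatrix}$. To find the eigenvector for $\lambda_2 = 5$ we must solve the homogeneous equation with matrix $\begin{bmatrix} -14 & -14 \\ 7 & 7 \end{bmatrix}$. The matrix reduces to $\begin{bmatrix} -14 & -14 \\ 0 & 0 \end{bmatrix}$ and the eigenvector is $x_2 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$.

Solution 6.4
$$\det\begin{bmatrix} 2 - \lambda & 3 \\ 2 & 1 - \lambda \end{bmatrix} = (2 - \lambda)(1 - \lambda) - 6 = 2 - 3\lambda + \lambda^2 - 6 = \lambda^2 - 3\lambda - 4 = (\lambda - 4)(\lambda + 1) = 0 \;\Rightarrow\; \lambda = -1, 4.$$
The eigenvector for $\lambda = -1$:
$$\begin{bmatrix} 3 & 3 \\ 2 & 2 \end{bmatrix}\begin{bmatrix} k_1 \\ k_2 \end{bmatrix} = 0 \;\Rightarrow\; \begin{aligned} 3k_1 + 3k_2 &= 0 \\ 2k_1 + 2k_2 &= 0 \end{aligned} \;\Rightarrow\; k_1 = -k_2$$
If we set $k_2 = 1$ for convenience, we have that the first eigenpair is
$$\lambda_1 = -1, \qquad v_1 = \begin{bmatrix} -1 \\ 1 \end{bmatrix}.$$
Now, the eigenvector for $\lambda = 4$:
$$\begin{bmatrix} -2 & 3 \\ 2 & -3 \end{bmatrix}\begin{bmatrix} k_1 \\ k_2 \end{bmatrix} = 0 \;\Rightarrow\; \begin{aligned} -2k_1 + 3k_2 &= 0 \\ 2k_1 - 3k_2 &= 0 \end{aligned} \;\Rightarrow\; k_1 = \frac{3}{2}k_2$$
If we set $k_2 = 1$ for convenience, we have that the second eigenpair is
$$\lambda_2 = 4, \qquad v_2 = \begin{bmatrix} 3/2 \\ 1 \end{bmatrix}.$$

Solution 6.5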
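a) $\det(A - \lambda I) = -\lambda^3 + 2\lambda^2 + \lambda - 2 = -(\lambda - 1)(\lambda - 2)(\lambda + 1)$ so the eigenvalues are $\lambda_1 = 1$, $\lambda_2 = 2$ and $\lambda_3 = -1$. To find the eigenvector for $\lambda_1 = 1$ we must solve the homogeneous equation with matrix $\begin{bmatrix} -1 & -1 & 1 \\ 1 & -1 & 2 \\ 2 & 0 & 1 \end{bmatrix}$. The matrix reduces to $\begin{bmatrix} -1 & -1 & 1 \\ 0 & -2 & 3 \\ 0 & 0 & 0 \end{bmatrix}$ and the eigenvector is $x_1 = \begin{bmatrix} 1 \\ -3 \\ -2 \end{bmatrix}$. To find the eigenvector for $\lambda_2 = 2$ we must solve the homogeneous equation with matrix $\begin{bmatrix} -2 & -1 & 1 \\ 1 & -2 & 2 \\ 2 & 0 & 0 \end{bmatrix}$. The matrix reduces to $\begin{bmatrix} -2 & -1 & 1 \\ 0 & -1 & 1 \\ 0 & 0 & 0 \end{bmatrix}$ and the eigenvector is $x_2 = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}$. To find the eigenvector for $\lambda_3 = -1$ we must solve the homogeneous equation with matrix $\begin{bmatrix} 1 & -1 & 1 \\ 1 & 1 & 2 \\ 2 & 0 & 3 \end{bmatrix}$. The matrix reduces to $\begin{bmatrix} 1 & -1 & 1 \\ 0 & 2 & 1 \\ 0 & 0 & 0 \end{bmatrix}$ and the eigenvector is $x_3 = \begin{bmatrix} 3 \\ 1 \\ -2 \end{bmatrix}$.

b) $\det(A - \lambda I) = -\lambda^3 + 2\lambda^2 + 3\lambda - 6 = -(\lambda - 2)(\lambda - \sqrt{3})(\lambda + \sqrt{3})$ so the eigenvalues are $\lambda_1 = 2$, $\lambda_2 = \sqrt{3}$ and $\lambda_3 = -\sqrt{3}$. To find the eigenvector for $\lambda_1 = 2$ we must solve the homogeneous equation with matrix $\begin{bmatrix} -1 & 1 & 1 \\ 1 & -2 & -2 \\ 1 & -1 & -1 \end{bmatrix}$. The matrix reduces to $\begin{bmatrix} -1 & 1 & 1 \\ 0 & -1 & -1 \\ 0 & 0 & 0 \end{bmatrix}$ and the eigenvector is $x_1 = \begin{bmatrix} 0 \\ -1 \\ 1 \end{bmatrix}$. To find the eigenvector for $\lambda_2 = \sqrt{3}$ we must solve the homogeneous equation with matrix $\begin{bmatrix} 1 - \sqrt{3} & 1 & 1 \\ 1 & -\sqrt{3} & -2 \\ 1 & -1 & 1 - \sqrt{3} \end{bmatrix}$. The matrix reduces to $\begin{bmatrix} 1 - \sqrt{3} & 1 & 1 \\ 0 & 2 - \sqrt{3} & -3 + 2\sqrt{3} \\ 0 & 0 & 0 \end{bmatrix}$ and the eigenvector is $x_2 = \begin{bmatrix} -1 \\ -\sqrt{3} \\ 1 \end{bmatrix}$. To find the eigenvector for $\lambda_3 = -\sqrt{3}$ we must solve the homogeneous equation with matrix $\begin{bmatrix} 1 + \sqrt{3} & 1 & 1 \\ 1 & \sqrt{3} & -2 \\ 1 & -1 & 1 + \sqrt{3} \end{bmatrix}$. The matrix reduces to $\begin{bmatrix} 1 + \sqrt{3} & 1 & 1 \\ 0 & 2 + \sqrt{3} & -3 - 2\sqrt{3} \\ 0 & 0 & 0 \end{bmatrix}$ and the eigenvector is $x_3 = \begin{bmatrix} -1 \\ \sqrt{3} \\ 1 \end{bmatrix}$.

c) $\det(A - \lambda I) = -\lambda^3 + 48\lambda - 128 = -(\lambda + 8)(\lambda - 4)^2$ so the eigenvalues are $\lambda_1 = -8$ and $\lambda_2 = 4$. To find the eigenvector for $\lambda_1 = -8$ we must solve the homogeneous equation with matrix $\begin{bmatrix} 15 & -9 & -15 \\ 0 & 12 & 0 \\ 3 & -9 & -3 \end{bmatrix}$. The matrix reduces to $\begin{bmatrix} 15 & -9 & -15 \\ 0 & 12 & 0 \\ 0 & 0 & 0 \end{bmatrix}$ and the eigenvector is $x_1 = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$. To find the eigenvector(s) for $\lambda_2 = 4$ we must solve the homogeneous equation with matrix $\begin{bmatrix} 3 & -9 & -15 \\ 0 & 0 & 0 \\ 3 & -9 & -15 \end{bmatrix}$. The matrix reduces to $\begin{bmatrix} 3 & -9 & -15 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$. Thus there are two eigenvectors corresponding to this eigenvalue, for example $x_2 = \begin{bmatrix} 3 \\ 1 \\ 0 \end{bmatrix}$ and $x_3 = \begin{bmatrix} 5 \\ 0 \\ 1 \end{bmatrix}$.

d) $\det(A - \lambda I) = -\lambda^3 + \lambda^2 + \lambda - 1 = -(\lambda + 1)(\lambda - 1)^2$ so the eigenvalues are $\lambda_1 = -1$ and $\lambda_2 = 1$. To find the eigenvector for $\lambda_1 = -1$ we must solve the homogeneous equation with matrix $\begin{bmatrix} 32 & -100 & 70 \\ 18 & -58 & 42 \\ 12 & -40 & 30 \end{bmatrix}$. The matrix reduces to $\begin{bmatrix} 32 & -100 & 70 \\ 0 & -14 & 21 \\ 0 & 0 & 0 \end{bmatrix}$ and the eigenvector is $x_1 = \begin{bmatrix} 5 \\ 3 \\ 2 \end{bmatrix}$. To find the eigenvector for $\lambda_2 = 1$ we must solve the homogeneous equation with matrix $\begin{bmatrix} 30 & -100 & 70 \\ 18 & -60 & 42 \\ 12 & -40 & 28 \end{bmatrix}$. The matrix reduces to $\begin{bmatrix} 30 & -100 & 70 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$. Thus there are two eigenvectors corresponding to this eigenvalue, for example $x_2 = \begin{bmatrix} 7 \\ 0 \\ -3 \end{bmatrix}$ and $x_3 = \begin{bmatrix} 0 \\ 7 \\ 10 \end{bmatrix}$.

Solution 6.6
$$\det\begin{bmatrix} p_{11} - \lambda & 1 - p_{22} \\ 1 - p_{11} & p_{22} - \lambda \end{bmatrix} = (p_{11} - \lambda)(p_{22} - \lambda) - (1 - p_{11})(1 - p_{22})$$
$$= p_{11}p_{22} - (p_{11} + p_{22})\lambda + \lambda^2 - (1 - (p_{11} + p_{22}) + p_{11}p_{22})$$
$$= \lambda^2 - (p_{11} + p_{22})\lambda + (p_{11} + p_{22}) - 1 = 0$$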
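Let $c = p_{11} + p_{22}$, then
$$\lambda^2 - c\lambda + c - 1 = 0$$
$$\lambda = \frac{c \pm \sqrt{c^2 - 4(c - 1)}}{2} = \frac{c \pm \sqrt{c^2 - 4c + 4}}{2} = \frac{c \pm (c - 2)}{2}$$
Since $p_{11}$ and $p_{22}$ are in the interval $[0, 1]$, we have that $c \in [0, 2]$.
The first eigenvalue is $\lambda_1 = \frac{c + c - 2}{2} = c - 1$, therefore $\lambda_1 \in [-1, 1]$.
The second eigenvalue is $\lambda_2 = \frac{c - (c - 2)}{2} = 1$.

Solution 6.7
$$\det(A - \lambda I) = \det\begin{bmatrix} -\lambda & 1 & -1 \\ 5 & -\lambda & 1 \\ 0 & 1 & -1 - \lambda \end{bmatrix} = -\lambda\det\begin{bmatrix} -\lambda & 1 \\ 1 & -1 - \lambda \end{bmatrix} - 5\det\begin{bmatrix} 1 & -1 \\ 1 & -1 - \lambda \end{bmatrix}$$
$$= -\lambda(\lambda + \lambda^2 - 1) - 5(-\lambda) = -\lambda^3 - \lambda^2 + \lambda + 5\lambda = 0$$
$$\Rightarrow\; \lambda(\lambda^2 + \lambda - 6) = \lambda(\lambda - 2)(\lambda + 3) = 0 \;\Rightarrow\; \lambda = 0, 2, -3.$$
To find the eigenvector corresponding to $\lambda_1 = 0$, we find the RREF of the extended matrix $(A - \lambda_1 I, 0)$, keeping in mind that one of the variables is a free variable.
$$(A - \lambda_1 I, 0) = \left[\begin{array}{ccc|c} 0 & 1 & -1 & 0 \\ 5 & 0 & 1 & 0 \\ 0 & 1 & -1 & 0 \end{array}\right] \to \left[\begin{array}{ccc|c} 1 & 0 & \frac{1}{5} & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]; \qquad v_{\lambda_1} = \alpha\begin{bmatrix} -1 \\ 5 \\ 5 \end{bmatrix}.$$
Now, for the eigenvector corresponding to $\lambda_2 = 2$:
$$(A - \lambda_2 I, 0) = \left[\begin{array}{ccc|c} -2 & 1 & -1 & 0 \\ 5 & -2 & 1 & 0 \\ 0 & 1 & -3 & 0 \end{array}\right] \to \left[\begin{array}{ccc|c} 1 & 0 & -1 & 0 \\ 0 & 1 & -3 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]; \qquad v_{\lambda_2} = \alpha\begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix}.$$
And for the eigenvector corresponding to $\lambda_3 = -3$:
$$(A - \lambda_3 I, 0) = \left[\begin{array}{ccc|c} 3 & 1 & -1 & 0 \\ 5 & 3 & 1 & 0 \\ 0 & 1 & 2 & 0 \end{array}\right] \to \left[\begin{array}{ccc|c} 1 & 0 & -1 & 0 \\ 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]; \qquad v_{\lambda_3} = \alpha\begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix}.$$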
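Eigenpairs found by hand are cheap to verify: multiply $A$ by each eigenvector and compare with $\lambda$ times that vector. The Python sketch below is an addition for illustration. It assumes the matrix $A$ read off from the determinant in Solution 6.7 and takes $\alpha = 1$; the helper names are hypothetical.

```python
def matvec(A, v):
    """Matrix-vector product for a matrix stored as nested lists."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def is_eigenpair(A, lam, v):
    """True when A v equals lam * v exactly (integer arithmetic)."""
    return matvec(A, v) == [lam * x for x in v]

# Matrix of Solution 6.7, read off from det(A - lambda I).
A = [[0, 1, -1],
     [5, 0, 1],
     [0, 1, -1]]

pairs = [(0, [-1, 5, 5]),
         (2, [1, 3, 1]),
         (-3, [1, -2, 1])]

for lam, v in pairs:
    print(lam, v, is_eigenpair(A, lam, v))
```

Because every entry is an integer, the comparison is exact and no tolerance is needed.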
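Solution 6.8
$$\det(A - \lambda I) = \det\begin{bmatrix} 2 - \lambda & 0 & 1 \\ 0 & 2 - \lambda & 1 \\ 1 & 0 & 2 - \lambda \end{bmatrix} = (2 - \lambda)\det\begin{bmatrix} 2 - \lambda & 1 \\ 0 & 2 - \lambda \end{bmatrix} + \det\begin{bmatrix} 0 & 1 \\ 2 - \lambda & 1 \end{bmatrix}$$
$$= (2 - \lambda)^3 - (2 - \lambda) = (2 - \lambda)\left((2 - \lambda)^2 - 1\right) = (2 - \lambda)(\lambda^2 - 4\lambda + 3) = 0$$
$$\Rightarrow\; (\lambda - 2)(\lambda - 1)(\lambda - 3) = 0 \;\Rightarrow\; \lambda = 1, 2, 3.$$
To find the eigenvector corresponding to $\lambda_1 = 1$, we again find the RREF of the extended matrix $(A - \lambda_1 I, 0)$, keeping in mind that one of the variables is a free variable.
$$(A - \lambda_1 I, 0) = \left[\begin{array}{ccc|c} 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 \\ 1 & 0 & 1 & 0 \end{array}\right] \to \left[\begin{array}{ccc|c} 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]; \qquad v_{\lambda_1} = \alpha\begin{bmatrix} 1 \\ 1 \\ -1 \end{bmatrix}.$$
Now, for the eigenvector corresponding to $\lambda_2 = 2$:
$$(A - \lambda_2 I, 0) = \left[\begin{array}{ccc|c} 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 \end{array}\right] \to \left[\begin{array}{ccc|c} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]; \qquad v_{\lambda_2} = \alpha\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}.$$
And for the eigenvector corresponding to $\lambda_3 = 3$:
$$(A - \lambda_3 I, 0) = \left[\begin{array}{ccc|c} -1 & 0 & 1 & 0 \\ 0 & -1 & 1 & 0 \\ 1 & 0 & -1 & 0 \end{array}\right] \to \left[\begin{array}{ccc|c} 1 & 0 & -1 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]; \qquad v_{\lambda_3} = \alpha\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}.$$

Solution 6.9 The answer is yes. The two eigenvectors and eigenvalues define six equations. There are nine unknowns (the entries of $M$); however, there is an extra condition: the rank of $M$ is two. There are several solutions; here we present one. Let's try
$$M = \begin{bmatrix} a_{11} & 0 & a_{13} \\ a_{21} & 0 & a_{23} \\ a_{31} & 0 & a_{33} \end{bmatrix}.$$
We have here six unknowns and six equations, so there is a chance to find a solution. If the two non-zero column vectors are independent, then we are done: $M$ has rank 2.
$$\begin{bmatrix} a_{11} & 0 & a_{13} \\ a_{21} & 0 & a_{23} \\ a_{31} & 0 & a_{33} \end{bmatrix}\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = \begin{bmatrix} -1 \\ -2 \\ -3 \end{bmatrix}, \quad \text{and} \quad \begin{bmatrix} a_{11} & 0 & a_{13} \\ a_{21} & 0 & a_{23} \\ a_{31} & 0 & a_{33} \end{bmatrix}\begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix} = \begin{bmatrix} -3 \\ -2 \\ -1 \end{bmatrix}$$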
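$$a_{11} + 3a_{13} = -1, \quad \text{and} \quad 3a_{11} + a_{13} = -3$$
From this we have $a_{13} = 0$, and $a_{11} = -1$.
$$a_{21} + 3a_{23} = -2, \quad \text{and} \quad 3a_{21} + a_{23} = -2$$
From this we have that $a_{21} = a_{23} = -\frac{1}{2}$.
$$a_{31} + 3a_{33} = -3, \quad \text{and} \quad 3a_{31} + a_{33} = -1$$
This is similar to the first pair: $a_{33} = -1$, and $a_{31} = 0$. The resulting matrix $M$ is
$$M = \begin{bmatrix} -1 & 0 & 0 \\ -\frac{1}{2} & 0 & -\frac{1}{2} \\ 0 & 0 & -1 \end{bmatrix}.$$
$M$ has rank 2, and
$$\begin{bmatrix} -1 & 0 & 0 \\ -\frac{1}{2} & 0 & -\frac{1}{2} \\ 0 & 0 & -1 \end{bmatrix}\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = \begin{bmatrix} -1 \\ -2 \\ -3 \end{bmatrix}, \quad \text{and} \quad \begin{bmatrix} -1 & 0 & 0 \\ -\frac{1}{2} & 0 & -\frac{1}{2} \\ 0 & 0 & -1 \end{bmatrix}\begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix} = \begin{bmatrix} -3 \\ -2 \\ -1 \end{bmatrix}$$

Solution 6.10
$$\det\begin{bmatrix} 2 - \lambda & 3 \\ -2 & -1 - \lambda \end{bmatrix} = (2 - \lambda)(-1 - \lambda) + 6 = -2 - 2\lambda + \lambda + \lambda^2 + 6$$
$$= \lambda^2 - \lambda + 4 = \left(\lambda - \frac{1}{2} + i\frac{\sqrt{15}}{2}\right)\left(\lambda - \frac{1}{2} - i\frac{\sqrt{15}}{2}\right) = 0$$
$$\Rightarrow\; \lambda = \frac{1}{2} \pm i\frac{\sqrt{15}}{2}$$

Solution 6.11
a) Find values for $a$ and $b$ such that $A^2 = A$:
$$\begin{bmatrix} a & i \\ i & b \end{bmatrix}\begin{bmatrix} a & i \\ i & b \end{bmatrix} = \begin{bmatrix} a & i \\ i & b \end{bmatrix} \;\Rightarrow\; \begin{aligned} a^2 - 1 &= a, & ai + bi &= i, \\ ai + bi &= i, & -1 + b^2 &= b \end{aligned}$$
Hence, $a^2 - a - 1 = 0$, and $a + b = 1$. Upon solving the system we get
$$a = \frac{1 \pm \sqrt{5}}{2}, \qquad b = \frac{1 \mp \sqrt{5}}{2}$$
b) If $A^3 = A$, then we can try
$$A^{-1}A^3 = A^{-1}A \;\Rightarrow\; A^2 = I$$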
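\[
\begin{bmatrix} a & i \\ i & b \end{bmatrix}\begin{bmatrix} a & i \\ i & b \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\;\Rightarrow\;
\begin{array}{ll} a^2 - 1 = 1, & ai + bi = 0, \\ ai + bi = 0, & -1 + b^2 = 1. \end{array}
\]
Hence, $a^2 = 2$ and $a = -b$. We get
\[ a = \pm\sqrt{2}, \qquad b = \mp\sqrt{2}, \]
\[
\begin{bmatrix} \sqrt{2} & i \\ i & -\sqrt{2} \end{bmatrix}^2 = I \;\Rightarrow\; A^3 = A; \qquad A^4 = I \neq A.
\]
c) From the previous solution, we want $A^8 = I \neq A$. If $A^4 = I$, then $A^4 \cdot A^4 = A^8 = I$, hence $A^9 = A$.

Solution 6.12 a)
\[ A = \begin{bmatrix} a & i \\ i & b \end{bmatrix} \]
The two eigenvalues are $2 + i$ and $2 - i$. The characteristic polynomial is
\[
\begin{aligned}
(a - \lambda)(b - \lambda) + 1 &= 0 \\
\lambda^2 - (a + b)\lambda + ab + 1 &= 0 \\
\lambda_{1,2} &= \frac{a + b \pm \sqrt{(a + b)^2 - 4ab - 4}}{2} \\
\lambda_{1,2} &= \frac{a + b}{2} \pm \sqrt{\left(\frac{a - b}{2}\right)^2 - 1}
\end{aligned}
\]
$\lambda_{1,2}$ should be $2 \pm i$. This is only possible if $\frac{a-b}{2} = 0$, hence $a = b$, and $\frac{a+b}{2} = 2$, that is, $a + b = 4$. Therefore, $a = b = 2$, and the matrix is
\[ A = \begin{bmatrix} 2 & i \\ i & 2 \end{bmatrix}. \]
b) Now, to find the eigenvectors, we know that the eigenvalues are
\[ \lambda_{1,2} = 2 + i,\; 2 - i. \]
For the first eigenvector we have
\[ (A - \lambda_1 I)v_1 = \begin{bmatrix} -i & i \\ i & -i \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 0. \]
Hence $x_1 = x_2$, and $v_1 = \alpha(1, 1)^T$. For the second eigenvector we have
\[ (A - \lambda_2 I)v_2 = \begin{bmatrix} i & i \\ i & i \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = 0. \]
Hence $x_1 = -x_2$, and $v_2 = \alpha(1, -1)^T$.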
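The eigenpairs found in Solution 6.12 can be cross-checked numerically. The following is a minimal Python sketch using only built-in complex numbers; the helper `matvec` and the variable names are illustrative choices, not part of the notes. It verifies that $Av = \lambda v$ for both pairs.

```python
# Check that v1 = (1, 1) and v2 = (1, -1) are eigenvectors of
# A = [[2, i], [i, 2]] with eigenvalues 2 + i and 2 - i.

def matvec(A, v):
    """Multiply a square matrix (list of rows) by a vector (list)."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

A = [[2, 1j], [1j, 2]]
eigenpairs = [(2 + 1j, [1, 1]), (2 - 1j, [1, -1])]

# residuals[k] holds A v - lambda v for the k-th pair; every entry should be 0.
residuals = [
    [Av_i - lam * v_i for Av_i, v_i in zip(matvec(A, v), v)]
    for lam, v in eigenpairs
]

for (lam, v), r in zip(eigenpairs, residuals):
    print(f"lambda = {lam}, v = {v}, residual = {r}")
```

Any nonzero multiple $\alpha v$ passes the same check, which is why the eigenvectors are only determined up to the factor $\alpha$.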
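Solution 6.13
\[
P = \begin{bmatrix} 0 & \frac{1}{4} & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{4} & 0 \end{bmatrix}
\]
The characteristic polynomial is $\lambda^3 - \frac{1}{2}\lambda^2 - \frac{1}{2}\lambda$. From this, the eigenvalues are found to be $\lambda_{1,2,3} = 0,\, 1,\, -\frac{1}{2}$.

To find the eigenvector associated with $\lambda_1$, we find the reduced row-echelon form of the matrix $(P - \lambda_1 I)$:
\[
\left[\begin{array}{ccc|c} 0 & \frac{1}{4} & \frac{1}{2} & 0 \\ \frac{1}{2} & \frac{1}{2} & \frac{1}{2} & 0 \\ \frac{1}{2} & \frac{1}{4} & 0 & 0 \end{array}\right]
\rightarrow \left[\begin{array}{ccc|c} 1 & 1 & 1 & 0 \\ 0 & 1 & 2 & 0 \\ 2 & 1 & 0 & 0 \end{array}\right]
\rightarrow \left[\begin{array}{ccc|c} 1 & 0 & -1 & 0 \\ 0 & 1 & 2 & 0 \\ 0 & -1 & -2 & 0 \end{array}\right]
\rightarrow \left[\begin{array}{ccc|c} 1 & 0 & -1 & 0 \\ 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]
\rightarrow v_1 = \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix}
\]
To find the eigenvector associated with $\lambda_2$, we find the RREF of the matrix $(P - \lambda_2 I)$:
\[
\left[\begin{array}{ccc|c} -1 & \frac{1}{4} & \frac{1}{2} & 0 \\ \frac{1}{2} & -\frac{1}{2} & \frac{1}{2} & 0 \\ \frac{1}{2} & \frac{1}{4} & -1 & 0 \end{array}\right]
\rightarrow \left[\begin{array}{ccc|c} 1 & -\frac{1}{4} & -\frac{1}{2} & 0 \\ 0 & -\frac{3}{4} & \frac{3}{2} & 0 \\ 0 & \frac{3}{4} & -\frac{3}{2} & 0 \end{array}\right]
\rightarrow \left[\begin{array}{ccc|c} 1 & 0 & -1 & 0 \\ 0 & 1 & -2 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]
\rightarrow v_2 = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}
\]
Finally, to find the eigenvector associated with $\lambda_3$, we find the RREF of the matrix $(P - \lambda_3 I)$:
\[
\left[\begin{array}{ccc|c} \frac{1}{2} & \frac{1}{4} & \frac{1}{2} & 0 \\ \frac{1}{2} & 1 & \frac{1}{2} & 0 \\ \frac{1}{2} & \frac{1}{4} & \frac{1}{2} & 0 \end{array}\right]
\rightarrow \left[\begin{array}{ccc|c} 1 & \frac{1}{2} & 1 & 0 \\ 0 & \frac{3}{4} & 0 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]
\rightarrow \left[\begin{array}{ccc|c} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]
\rightarrow v_3 = \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}
\]
We can express $x(0)$ as a linear combination of the eigenvectors:
\[
x(0) = \frac{1}{4}\begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix} + \frac{1}{4}\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} - \frac{1}{2}\begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}.
\]
From the notes, we know that for $n > 0$, $x(n) = P^n x(0)$. Each eigenvector component is multiplied by $\lambda^n$, so the contributions of the eigenvalues $0$ and $-\frac{1}{2}$ disappear for large $n$, and we have that
\[
\lim_{n\to\infty} x(n) = \begin{bmatrix} \frac{1}{4} \\ \frac{1}{2} \\ \frac{1}{4} \end{bmatrix}.
\]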
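The limit in Solution 6.13 can also be observed by iterating $x(n+1) = P\,x(n)$ directly. The sketch below is a minimal illustration in Python, using exact rational arithmetic from the standard library; the names `step`, `history`, and `limit` are hypothetical helpers introduced here, and the choice of 20 steps is arbitrary.

```python
from fractions import Fraction as F

# Transition matrix P from Solution 6.13 (each column sums to 1).
P = [
    [F(0), F(1, 4), F(1, 2)],
    [F(1, 2), F(1, 2), F(1, 2)],
    [F(1, 2), F(1, 4), F(0)],
]

def step(P, x):
    """One step of the chain: return the product P x."""
    return [sum(p * xj for p, xj in zip(row, x)) for row in P]

x = [F(1), F(0), F(0)]          # x(0)
history = [x]
for n in range(1, 21):          # compute x(1), ..., x(20)
    x = step(P, x)
    history.append(x)

limit = [F(1, 4), F(1, 2), F(1, 4)]
for n in (1, 2, 5, 20):
    err = max(abs(a - b) for a, b in zip(history[n], limit))
    print(n, [str(c) for c in history[n]], "max error:", float(err))
```

The error halves at every step, which matches the decay of the $(-\frac{1}{2})^n$ term in the eigenvector expansion.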
Solution 6.14
\[
P = \begin{bmatrix} \frac{1}{4} & \frac{1}{3} & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{3} & b \\ \frac{1}{4} & \frac{1}{3} & a \end{bmatrix}
\]
We know that $a \geq 0$, $b \geq 0$, and that $a + b = 1/2$. If $a > 0$ and $b > 0$, by Theorem 6.1 we know that the existence of an equilibrium probability is guaranteed. There are two cases left to be analyzed:

i) $a = 0$ and $b = 1/2$
ii) $a = 1/2$ and $b = 0$

In both cases one entry of the matrix is zero, so we can't apply Theorem 6.1 directly. On the other hand, $P^2$ for the two cases is

i)
\[
P^2 = \begin{bmatrix} \frac{17}{48} & \frac{13}{36} & \frac{7}{24} \\ \frac{5}{12} & \frac{4}{9} & \frac{5}{12} \\ \frac{11}{48} & \frac{7}{36} & \frac{7}{24} \end{bmatrix}
\]
ii)
\[
P^2 = \begin{bmatrix} \frac{17}{48} & \frac{13}{36} & \frac{3}{8} \\ \frac{7}{24} & \frac{5}{18} & \frac{1}{4} \\ \frac{17}{48} & \frac{13}{36} & \frac{3}{8} \end{bmatrix}
\]
with all positive entries in both cases, so there is always an equilibrium probability.

Now, for the case $a = b = 1/4$ we have to find the aforementioned equilibrium probability vector. From Theorem 6.1, the vector will be the eigenvector corresponding to the eigenvalue $\lambda = 1$, with all its entries summing to one. We find the eigenvector by row reducing the extended matrix $(P - \lambda I, 0)$:
\[
(P - \lambda I, 0) = \left[\begin{array}{ccc|c} -\frac{3}{4} & \frac{1}{3} & \frac{1}{2} & 0 \\ \frac{1}{2} & -\frac{2}{3} & \frac{1}{4} & 0 \\ \frac{1}{4} & \frac{1}{3} & -\frac{3}{4} & 0 \end{array}\right]
\rightarrow \left[\begin{array}{ccc|c} 4 & 0 & -5 & 0 \\ 0 & 16 & -21 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]
\rightarrow v = \alpha\begin{bmatrix} 20 \\ 21 \\ 16 \end{bmatrix}
\]
Since the entries of $v$ have to sum to one, we get that the equilibrium probability vector is
\[
\frac{1}{57}\begin{bmatrix} 20 \\ 21 \\ 16 \end{bmatrix}.
\]
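Both claims in Solution 6.14 are easy to confirm with exact arithmetic. The following Python sketch (standard library only) is one way to do it; `make_P`, `matmul`, and `matvec` are hypothetical helper names introduced for this illustration. It checks that $P^2$ has strictly positive entries in the two boundary cases and that $(20, 21, 16)/57$ is fixed by $P$ when $a = b = 1/4$.

```python
from fractions import Fraction as F

def make_P(a, b):
    """Transition matrix of Solution 6.14 with third column (1/2, b, a)."""
    return [
        [F(1, 4), F(1, 3), F(1, 2)],
        [F(1, 2), F(1, 3), b],
        [F(1, 4), F(1, 3), a],
    ]

def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(A, v):
    """Product of a matrix and a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# Boundary cases: P has a zero entry, but P^2 should be strictly positive.
squares_positive = []
for a, b in [(F(0), F(1, 2)), (F(1, 2), F(0))]:
    P = make_P(a, b)
    P2 = matmul(P, P)
    squares_positive.append(all(entry > 0 for row in P2 for entry in row))

# Case a = b = 1/4: candidate equilibrium vector (20, 21, 16)/57.
P = make_P(F(1, 4), F(1, 4))
v = [F(20, 57), F(21, 57), F(16, 57)]
Pv = matvec(P, v)

print("P^2 positive in both boundary cases:", squares_positive)
print("P v == v:", Pv == v, " sum of entries:", sum(v))
```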
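Solution 6.15 The eigenvalues and eigenvectors are $\lambda_1 = 2$, $x_1 = \begin{bmatrix} -2 \\ 1 \end{bmatrix}$ and $\lambda_2 = 6$, $x_2 = \begin{bmatrix} -1 \\ 1 \end{bmatrix}$. Thus the general solution is
\[ c_1 e^{2t}\begin{bmatrix} -2 \\ 1 \end{bmatrix} + c_2 e^{6t}\begin{bmatrix} -1 \\ 1 \end{bmatrix}. \]
To satisfy the initial condition we need
\[ c_1\begin{bmatrix} -2 \\ 1 \end{bmatrix} + c_2\begin{bmatrix} -1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]
or
\[ \begin{bmatrix} -2 & -1 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]
which has solution
\[ \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} = \begin{bmatrix} -2 \\ 3 \end{bmatrix} \]
so the solution is
\[ -2e^{2t}\begin{bmatrix} -2 \\ 1 \end{bmatrix} + 3e^{6t}\begin{bmatrix} -1 \\ 1 \end{bmatrix}. \]

Solution 6.16 The eigenvalues and eigenvectors are $\lambda_1 = 1 + 2i$, $x_1 = \begin{bmatrix} i \\ 1 \end{bmatrix}$ and $\lambda_2 = 1 - 2i$, $x_2 = \begin{bmatrix} -i \\ 1 \end{bmatrix}$. Thus the general solution in complex form is
\[ c_1 e^{(1+2i)t}\begin{bmatrix} i \\ 1 \end{bmatrix} + c_2 e^{(1-2i)t}\begin{bmatrix} -i \\ 1 \end{bmatrix}. \]
To find two real solutions we take the real and imaginary parts of
\[
e^{(1+2i)t}\begin{bmatrix} i \\ 1 \end{bmatrix} = \begin{bmatrix} ie^{(1+2i)t} \\ e^{(1+2i)t} \end{bmatrix}
= e^t\begin{bmatrix} i(\cos(2t) + i\sin(2t)) \\ \cos(2t) + i\sin(2t) \end{bmatrix}
= e^t\begin{bmatrix} -\sin(2t) + i\cos(2t) \\ \cos(2t) + i\sin(2t) \end{bmatrix}.
\]
Thus the general solution in real form is
\[ a_1 e^t\begin{bmatrix} -\sin(2t) \\ \cos(2t) \end{bmatrix} + a_2 e^t\begin{bmatrix} \cos(2t) \\ \sin(2t) \end{bmatrix}. \]
To satisfy the initial condition we need
\[ a_1\begin{bmatrix} 0 \\ 1 \end{bmatrix} + a_2\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]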
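or
\[ \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]
which has solution
\[ \begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]
so the solution is
\[ e^t\begin{bmatrix} -\sin(2t) + \cos(2t) \\ \cos(2t) + \sin(2t) \end{bmatrix}. \]

Solution 6.17 The eigenvalues and eigenvectors are $\lambda_1 = 1$, $x_1 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$, $\lambda_2 = 1 + i$, $x_2 = \begin{bmatrix} -5-i \\ -5-i \\ 2 \end{bmatrix}$ and $\lambda_3 = 1 - i$, $x_3 = \begin{bmatrix} -5+i \\ -5+i \\ 2 \end{bmatrix}$. Thus the general solution in complex form is
\[
c_1 e^t\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + c_2 e^{(1+i)t}\begin{bmatrix} -5-i \\ -5-i \\ 2 \end{bmatrix} + c_3 e^{(1-i)t}\begin{bmatrix} -5+i \\ -5+i \\ 2 \end{bmatrix}.
\]
The two complex solutions are conjugates of each other. To find real solutions we must find the real and imaginary parts of one of them. These are given by
\[
e^t\begin{bmatrix} -5\cos(t) + \sin(t) \\ -5\cos(t) + \sin(t) \\ 2\cos(t) \end{bmatrix}
\quad\text{and}\quad
e^t\begin{bmatrix} -5\sin(t) - \cos(t) \\ -5\sin(t) - \cos(t) \\ 2\sin(t) \end{bmatrix}.
\]
Thus the general solution in real form is given by
\[
a_1 e^t\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + a_2 e^t\begin{bmatrix} -5\cos(t) + \sin(t) \\ -5\cos(t) + \sin(t) \\ 2\cos(t) \end{bmatrix} + a_3 e^t\begin{bmatrix} -5\sin(t) - \cos(t) \\ -5\sin(t) - \cos(t) \\ 2\sin(t) \end{bmatrix}.
\]
To solve the initial value problem, we must solve
\[
a_1\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + a_2\begin{bmatrix} -5 \\ -5 \\ 2 \end{bmatrix} + a_3\begin{bmatrix} -1 \\ -1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}.
\]
This means solving the system of equations with augmented matrix
\[
\left[\begin{array}{ccc|c} 0 & -5 & -1 & 1 \\ 1 & -5 & -1 & 1 \\ 0 & 2 & 0 & 1 \end{array}\right].
\]
Reducing this matrix yields
\[
\left[\begin{array}{ccc|c} 1 & -5 & -1 & 1 \\ 0 & -5 & -1 & 1 \\ 0 & 0 & -2 & 7 \end{array}\right]
\]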
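which yields the solution $a_1 = 0$, $a_2 = 1/2$ and $a_3 = -7/2$. So the final answer is
\[
\frac{1}{2}e^t\begin{bmatrix} -5\cos(t) + \sin(t) \\ -5\cos(t) + \sin(t) \\ 2\cos(t) \end{bmatrix} - \frac{7}{2}e^t\begin{bmatrix} -5\sin(t) - \cos(t) \\ -5\sin(t) - \cos(t) \\ 2\sin(t) \end{bmatrix}
= e^t\begin{bmatrix} \cos(t) + 18\sin(t) \\ \cos(t) + 18\sin(t) \\ \cos(t) - 7\sin(t) \end{bmatrix}.
\]

Solution 6.18
\[ y'(t) = \begin{bmatrix} 1 & 1 \\ 5 & 1 \end{bmatrix} y(t) \]
The characteristic polynomial is $(1 - \lambda)^2 - 5 = 0$, hence $\lambda_{1,2} = 1 \pm \sqrt{5}$. We need to find the eigenvectors:
\[
\begin{bmatrix} -\sqrt{5} & 1 \\ 5 & -\sqrt{5} \end{bmatrix} \rightarrow v_1 = \begin{bmatrix} 1 \\ \sqrt{5} \end{bmatrix};
\qquad
\begin{bmatrix} \sqrt{5} & 1 \\ 5 & \sqrt{5} \end{bmatrix} \rightarrow v_2 = \begin{bmatrix} 1 \\ -\sqrt{5} \end{bmatrix}.
\]
The general solution is given by
\[
y(t) = c_1 e^{\lambda_1 t} v_1 + c_2 e^{\lambda_2 t} v_2
= c_1 e^{(1+\sqrt{5})t}\begin{bmatrix} 1 \\ \sqrt{5} \end{bmatrix} + c_2 e^{(1-\sqrt{5})t}\begin{bmatrix} 1 \\ -\sqrt{5} \end{bmatrix}.
\]
Or, equivalently,
\[
\begin{aligned}
y_1(t) &= c_1 e^{(1+\sqrt{5})t} + c_2 e^{(1-\sqrt{5})t} \\
y_2(t) &= c_1\sqrt{5}\, e^{(1+\sqrt{5})t} - c_2\sqrt{5}\, e^{(1-\sqrt{5})t}
\end{aligned}
\]

Solution 6.19
\[ y'(t) = \begin{bmatrix} 3 & -1 \\ 1 & -1 \end{bmatrix} y(t) \]
The characteristic polynomial is $\lambda^2 - 2\lambda - 2 = 0$, hence $\lambda_{1,2} = 1 \pm \sqrt{3}$. We need to find the eigenvectors:
\[
\begin{bmatrix} 2-\sqrt{3} & -1 \\ 1 & -2-\sqrt{3} \end{bmatrix} \rightarrow v_1 = \begin{bmatrix} 2+\sqrt{3} \\ 1 \end{bmatrix}
\]
\[
\begin{bmatrix} 2+\sqrt{3} & -1 \\ 1 & -2+\sqrt{3} \end{bmatrix} \rightarrow v_2 = \begin{bmatrix} 2-\sqrt{3} \\ 1 \end{bmatrix}
\]
The general solution is given by
\[
y(t) = c_1 e^{(1+\sqrt{3})t}\begin{bmatrix} 2+\sqrt{3} \\ 1 \end{bmatrix} + c_2 e^{(1-\sqrt{3})t}\begin{bmatrix} 2-\sqrt{3} \\ 1 \end{bmatrix}.
\]
The initial condition states that $y(0) = [0, 1]^T$. Hence
\[
\begin{aligned}
c_1(2 + \sqrt{3}) + c_2(2 - \sqrt{3}) &= 0 \\
c_1 + c_2 &= 1
\end{aligned}
\]
\[ c_1 = \frac{1}{2} - \frac{1}{\sqrt{3}}, \qquad c_2 = \frac{1}{2} + \frac{1}{\sqrt{3}}. \]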
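Or, equivalently,
\[
\begin{aligned}
y_1(t) &= -\frac{1}{2\sqrt{3}}\, e^{(1+\sqrt{3})t} + \frac{1}{2\sqrt{3}}\, e^{(1-\sqrt{3})t} \\
y_2(t) &= \left(\frac{1}{2} - \frac{1}{\sqrt{3}}\right) e^{(1+\sqrt{3})t} + \left(\frac{1}{2} + \frac{1}{\sqrt{3}}\right) e^{(1-\sqrt{3})t}
\end{aligned}
\]

Solution 6.20
\[ y'(t) = \begin{bmatrix} -3 & -1 \\ 7 & 1 \end{bmatrix} y(t) \]
The characteristic polynomial is $\lambda^2 + 2\lambda + 4 = 0$, hence $\lambda_{1,2} = -1 \pm i\sqrt{3}$. Finding the eigenvectors:
\[
\begin{bmatrix} -2 - i\sqrt{3} & -1 \\ 7 & 2 - i\sqrt{3} \end{bmatrix} \rightarrow v_1 = \begin{bmatrix} -2 + i\sqrt{3} \\ 7 \end{bmatrix}
\]
The function below is a solution. We will separate the real and imaginary parts to get two independent solutions:
\[
\begin{aligned}
e^{(-1+i\sqrt{3})t}\begin{bmatrix} -2 + i\sqrt{3} \\ 7 \end{bmatrix}
&= \frac{1}{e^t}\, e^{i\sqrt{3}t}\begin{bmatrix} -2 + i\sqrt{3} \\ 7 \end{bmatrix} \\
&= \frac{1}{e^t}\left(\cos(\sqrt{3}t) + i\sin(\sqrt{3}t)\right)\begin{bmatrix} -2 + i\sqrt{3} \\ 7 \end{bmatrix} \\
&= \frac{1}{e^t}\begin{bmatrix} -2\cos(\sqrt{3}t) - \sqrt{3}\sin(\sqrt{3}t) \\ 7\cos(\sqrt{3}t) \end{bmatrix}
+ \frac{i}{e^t}\begin{bmatrix} \sqrt{3}\cos(\sqrt{3}t) - 2\sin(\sqrt{3}t) \\ 7\sin(\sqrt{3}t) \end{bmatrix}
\end{aligned}
\]
The general solution is
\[
y(t) = \frac{c_1}{e^t}\begin{bmatrix} -2\cos(\sqrt{3}t) - \sqrt{3}\sin(\sqrt{3}t) \\ 7\cos(\sqrt{3}t) \end{bmatrix}
+ \frac{c_2}{e^t}\begin{bmatrix} \sqrt{3}\cos(\sqrt{3}t) - 2\sin(\sqrt{3}t) \\ 7\sin(\sqrt{3}t) \end{bmatrix}.
\]
The initial condition states that $y(0) = [2, -1]^T$. Hence
\[
\begin{aligned}
-2c_1 + \sqrt{3}\,c_2 &= 2 \\
7c_1 &= -1
\end{aligned}
\]
\[ c_1 = -\frac{1}{7}, \qquad c_2 = \frac{12}{7\sqrt{3}}. \]

Solution 6.21 If $A$ is a $3 \times 3$ matrix with real entries, then $\det(A - \lambda I)$ is a polynomial of the form $-\lambda^3 + a\lambda^2 + b\lambda + c$. If you plot the graph of this polynomial, it tends to negative infinity for large positive $\lambda$ and to positive infinity for large negative $\lambda$. Thus there must be a place where the graph crosses the horizontal axis, that is, a real zero. This means that there is always at least one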
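real eigenvalue. The other two eigenvalues can be either both real, or complex conjugates (as in the previous problem).

Solution 6.22 We have to solve the equation
\[ x'(t) = \begin{bmatrix} -1 & -1 \\ 1 & -1 \end{bmatrix} x(t) \]
with
\[ x(0) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}. \]
The eigenvalues and eigenvectors of the matrix are $\lambda_1 = -1 + i$, $x_1 = \begin{bmatrix} 1 \\ -i \end{bmatrix}$, and $\lambda_2 = -1 - i$, $x_2 = \begin{bmatrix} 1 \\ i \end{bmatrix}$. Real solutions are the real and imaginary parts of
\[ e^{(-1+i)t}\begin{bmatrix} 1 \\ -i \end{bmatrix}, \]
which are
\[ e^{-t}\begin{bmatrix} \cos(t) \\ \sin(t) \end{bmatrix} \quad\text{and}\quad e^{-t}\begin{bmatrix} \sin(t) \\ -\cos(t) \end{bmatrix}. \]
Thus the general solution is
\[ x(t) = a_1 e^{-t}\begin{bmatrix} \cos(t) \\ \sin(t) \end{bmatrix} + a_2 e^{-t}\begin{bmatrix} \sin(t) \\ -\cos(t) \end{bmatrix}. \]
To satisfy the initial conditions, we need
\[ a_1\begin{bmatrix} 1 \\ 0 \end{bmatrix} + a_2\begin{bmatrix} 0 \\ -1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]
so $a_1 = 1$, $a_2 = -1$ and the solution is
\[ e^{-t}\begin{bmatrix} \cos(t) - \sin(t) \\ \sin(t) + \cos(t) \end{bmatrix}. \]

Solution 6.23 The current is the same through each component. The voltage across the resistor is $IR$, so if $V_L$ is the voltage across the inductor, then $V_L + V + IR = 0$, so $V_L = -V - IR$. The equations are therefore
\[
\begin{aligned}
I' &= \frac{1}{L}V_L = -\frac{R}{L}I - \frac{1}{L}V \\
V' &= \frac{1}{C}I
\end{aligned}
\]
or
\[
\begin{bmatrix} I \\ V \end{bmatrix}' = \begin{bmatrix} -\frac{R}{L} & -\frac{1}{L} \\ \frac{1}{C} & 0 \end{bmatrix}\begin{bmatrix} I \\ V \end{bmatrix}.
\]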
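The characteristic polynomial (i.e., $\det(A - \lambda I)$) is $\lambda^2 + (R/L)\lambda + 1/(LC)$, so the eigenvalues are $\lambda = \left(-R/L \pm \sqrt{R^2/L^2 - 4/(LC)}\right)/2$. Oscillations occur when the eigenvalues are complex, that is, if $R^2/L^2 < 4/(LC)$ or $R < 2\sqrt{L/C}$. If we set all the values to 1 the equation becomes
\[
\begin{bmatrix} I \\ V \end{bmatrix}' = \begin{bmatrix} -1 & -1 \\ 1 & 0 \end{bmatrix}\begin{bmatrix} I \\ V \end{bmatrix}
\]
and the initial conditions are
\[ \begin{bmatrix} I(0) \\ V(0) \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}. \]
The eigenvalues and eigenvectors are $\lambda_1 = (-1 + i\sqrt{3})/2$, $x_1 = \begin{bmatrix} (-1 + i\sqrt{3})/2 \\ 1 \end{bmatrix}$, and $\lambda_2 = (-1 - i\sqrt{3})/2$, $x_2 = \begin{bmatrix} (-1 - i\sqrt{3})/2 \\ 1 \end{bmatrix}$. Real solutions are given by the real and imaginary parts of
\[
\begin{aligned}
e^{(-1+i\sqrt{3})t/2}\begin{bmatrix} (-1 + i\sqrt{3})/2 \\ 1 \end{bmatrix}
&= e^{-t/2}\begin{bmatrix} (-1/2 + i\sqrt{3}/2)\left(\cos(\sqrt{3}t/2) + i\sin(\sqrt{3}t/2)\right) \\ \cos(\sqrt{3}t/2) + i\sin(\sqrt{3}t/2) \end{bmatrix} \\
&= e^{-t/2}\begin{bmatrix} -\cos(\sqrt{3}t/2)/2 - \sqrt{3}\sin(\sqrt{3}t/2)/2 + i\left(-\sin(\sqrt{3}t/2)/2 + \sqrt{3}\cos(\sqrt{3}t/2)/2\right) \\ \cos(\sqrt{3}t/2) + i\sin(\sqrt{3}t/2) \end{bmatrix}
\end{aligned}
\]
So the general solution is
\[
a_1 e^{-t/2}\begin{bmatrix} -\cos(\sqrt{3}t/2)/2 - \sqrt{3}\sin(\sqrt{3}t/2)/2 \\ \cos(\sqrt{3}t/2) \end{bmatrix}
+ a_2 e^{-t/2}\begin{bmatrix} -\sin(\sqrt{3}t/2)/2 + \sqrt{3}\cos(\sqrt{3}t/2)/2 \\ \sin(\sqrt{3}t/2) \end{bmatrix}.
\]
To satisfy the initial conditions, we evaluate at $t = 0$ and need
\[
a_1\begin{bmatrix} -1/2 \\ 1 \end{bmatrix} + a_2\begin{bmatrix} \sqrt{3}/2 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}.
\]
The second row gives $a_1 = 1$, and then the first row gives $a_2 = \sqrt{3}$.

Solution 6.24 The equivalent $2 \times 2$ system is
\[ z' = \begin{bmatrix} 0 & 1 \\ -6 & 5 \end{bmatrix} z \]
with
\[ z(0) = \begin{bmatrix} 1 \\ 0 \end{bmatrix}. \]
The eigenvalues and eigenvectors are $\lambda_1 = 2$, $x_1 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$, $\lambda_2 = 3$, $x_2 = \begin{bmatrix} 1 \\ 3 \end{bmatrix}$. The general solution is
\[ c_1 e^{2t}\begin{bmatrix} 1 \\ 2 \end{bmatrix} + c_2 e^{3t}\begin{bmatrix} 1 \\ 3 \end{bmatrix}. \]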
The initial conditions are satisfied if
\[ c_1\begin{bmatrix} 1 \\ 2 \end{bmatrix} + c_2\begin{bmatrix} 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \]
which has solution $c_1 = 3$, $c_2 = -2$. So $y(t) = z_1(t) = 3e^{2t} - 2e^{3t}$. To solve this directly, plug in $e^{\lambda t}$. This solves the equation if $\lambda^2 - 5\lambda + 6 = 0$, that is, $\lambda = 2$ or $3$. So the general solution is $y(t) = c_1 e^{2t} + c_2 e^{3t}$. Satisfying the initial conditions $y(0) = 1$ and $y'(0) = 0$ leads to $c_1 = 3$ and $c_2 = -2$ as above.

Solution 6.25 The equivalent $2 \times 2$ system is
\[ z' = \begin{bmatrix} 0 & 1 \\ -1 & -1 \end{bmatrix} z \]
with
\[ z(0) = \begin{bmatrix} 1 \\ 0 \end{bmatrix}. \]
The eigenvalues and eigenvectors are $\lambda_1 = -1/2 + i\sqrt{3}/2$, $x_1 = \begin{bmatrix} 1 \\ -1/2 + i\sqrt{3}/2 \end{bmatrix}$, and $\lambda_2 = -1/2 - i\sqrt{3}/2$, $x_2 = \begin{bmatrix} 1 \\ -1/2 - i\sqrt{3}/2 \end{bmatrix}$. Real solutions are the real and imaginary parts of
\[ e^{(-1/2 + i\sqrt{3}/2)t}\begin{bmatrix} 1 \\ -1/2 + i\sqrt{3}/2 \end{bmatrix} \]
which leads to the general solution
\[
a_1 e^{-t/2}\begin{bmatrix} \cos \\ -(1/2)\cos - (\sqrt{3}/2)\sin \end{bmatrix}
+ a_2 e^{-t/2}\begin{bmatrix} \sin \\ (\sqrt{3}/2)\cos - (1/2)\sin \end{bmatrix}
\]
(the sines and cosines are evaluated at $\sqrt{3}t/2$). Initial conditions require
\[ a_1\begin{bmatrix} 1 \\ -(1/2) \end{bmatrix} + a_2\begin{bmatrix} 0 \\ \sqrt{3}/2 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \]
which has solution $a_1 = 1$ and $a_2 = \sqrt{3}/3$. Thus the solution is
\[ y(t) = z_1(t) = e^{-t/2}\left(\cos(\sqrt{3}t/2) + (\sqrt{3}/3)\sin(\sqrt{3}t/2)\right). \]
If we try to solve directly by substituting $e^{\lambda t}$, we get $\lambda = -1/2 \pm i\sqrt{3}/2$. The real and imaginary parts of $e^{\lambda t}$ are $e^{-t/2}\cos(\sqrt{3}t/2)$ and $e^{-t/2}\sin(\sqrt{3}t/2)$, so we obtain the general solution $y(t) = a_1 e^{-t/2}\cos(\sqrt{3}t/2) + a_2 e^{-t/2}\sin(\sqrt{3}t/2)$. Choosing $a_1$ and $a_2$ to satisfy the initial conditions yields the same answer as above.

Solution 6.26 Set $z_1(t) = y(t)$, $z_2(t) = y'(t)$ and $z_3(t) = y''(t)$. Then $z = \begin{bmatrix} z_1 \\ z_2 \\ z_3 \end{bmatrix}$ solves the equation
\[ z' = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ -c & -b & -a \end{bmatrix} z. \]
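The reduction in Solution 6.26 can be illustrated with a small check. Reading off the last row of the matrix, the underlying scalar equation is $y''' = -c\,y - b\,y' - a\,y''$. The Python sketch below (standard library only) builds that matrix and confirms that, for each root $r$ of $r^3 + ar^2 + br + c = 0$, the vector $(1, r, r^2)$ is an eigenvector with eigenvalue $r$. The function name `companion` and the sample coefficients are assumptions chosen for illustration; they do not come from the problem.

```python
def companion(a, b, c):
    """Matrix of the first-order system for y''' + a y'' + b y' + c y = 0."""
    return [[0, 1, 0],
            [0, 0, 1],
            [-c, -b, -a]]

def matvec(A, v):
    """Product of a matrix (list of rows) and a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in A]

# Hypothetical example: (r - 1)(r - 2)(r - 3) = r^3 - 6 r^2 + 11 r - 6,
# so a = -6, b = 11, c = -6 and the roots are 1, 2, 3.
A = companion(-6, 11, -6)
roots = [1, 2, 3]

checks = []
for r in roots:
    v = [1, r, r * r]          # (y, y', y'') at t = 0 for y = e^{rt}
    checks.append(matvec(A, v) == [r * x for x in v])

print(checks)  # [True, True, True]
```

This is the same mechanism used in Solutions 6.24 and 6.25, where substituting $e^{\lambda t}$ into the scalar equation gives the same values of $\lambda$ as the eigenvalues of the $2 \times 2$ system.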
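Solution 6.27 The general solution is
\[
(a_1\cos(t) + b_1\sin(t))\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} + (a_2\cos(2t) + b_2\sin(2t))\begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} + (a_3\cos(3t) + b_3\sin(3t))\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}.
\]
To satisfy the initial conditions, we need
\[
a_1\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} + a_2\begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} + a_3\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}
\]
which has solution $a_1 = 1$, $a_2 = 0$ and $a_3 = 2$, and
\[
b_1\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} + 2b_2\begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} + 3b_3\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}
\]
which has solution $b_1 = 1/2$, $2b_2 = -1/2$ and $3b_3 = 1$, so that $b_1 = 1/2$, $b_2 = -1/4$ and $b_3 = 1/3$.

Solution 6.28 Going through the analysis with two springs and weights, we end up with
\[ K = \begin{bmatrix} -2 & 1 \\ 1 & -1 \end{bmatrix} \]
and
\[ b = \begin{bmatrix} 10 \\ 11 \end{bmatrix}. \]
Then
\[ x_{eq} = K^{-1}b = \begin{bmatrix} 21 \\ 31 \end{bmatrix} \]
so if we define $y = x - x_{eq}$, we obtain initial conditions
\[ y(0) = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]
and
\[ y'(0) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}. \]
The eigenvalues of $K$ are $\lambda_1 = (-3 + \sqrt{5})/2$ and $\lambda_2 = (-3 - \sqrt{5})/2$ with eigenvectors $x_1 = \begin{bmatrix} (-1 + \sqrt{5})/2 \\ 1 \end{bmatrix}$ and $x_2 = \begin{bmatrix} (-1 - \sqrt{5})/2 \\ 1 \end{bmatrix}$. If we set $\mu_1 = \sqrt{-\lambda_1}$ and $\mu_2 = \sqrt{-\lambda_2}$, then the general solution is
\[ y(t) = (a_1\cos(\mu_1 t) + b_1\sin(\mu_1 t))x_1 + (a_2\cos(\mu_2 t) + b_2\sin(\mu_2 t))x_2. \]
The initial conditions are satisfied if
\[ a_1 x_1 + a_2 x_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]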
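and
\[ \mu_1 b_1 x_1 + \mu_2 b_2 x_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}. \]
These can be solved by inverting $2 \times 2$ matrices. The answers are
\[
\begin{bmatrix} a_1 \\ a_2 \end{bmatrix} = \frac{1}{\sqrt{5}}\begin{bmatrix} 3/2 + \sqrt{5}/2 \\ -3/2 + \sqrt{5}/2 \end{bmatrix}
\]
and
\[
\begin{bmatrix} b_1 \\ b_2 \end{bmatrix} = \frac{1}{\sqrt{5}}\begin{bmatrix} (3/2 + \sqrt{5}/2)/\mu_1 \\ (-3/2 + \sqrt{5}/2)/\mu_2 \end{bmatrix}.
\]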
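The last step of Solution 6.28 ("inverting $2 \times 2$ matrices") can be carried out numerically as a sanity check. The following is a minimal Python sketch using floating-point arithmetic and Cramer's rule; the helper names `matvec` and `solve_coeffs` are hypothetical, and $K$, the eigenpairs, and the right-hand side $(1, 1)^T$ are taken from the solution as stated.

```python
import math

s5 = math.sqrt(5.0)
K = [[-2.0, 1.0], [1.0, -1.0]]
lam = [(-3 + s5) / 2, (-3 - s5) / 2]
# X[k] is the eigenvector x_{k+1} stored as (first entry, second entry).
X = [[(-1 + s5) / 2, 1.0], [(-1 - s5) / 2, 1.0]]

def matvec(A, v):
    """Product of a matrix (list of rows) and a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# Largest entry of K x_k - lambda_k x_k (should vanish up to rounding).
eig_residual = max(
    abs(Kx - l * x)
    for l, v in zip(lam, X)
    for Kx, x in zip(matvec(K, v), v)
)

def solve_coeffs(rhs):
    """Solve c1*x1 + c2*x2 = rhs for (c1, c2) by Cramer's rule."""
    (p, q), (r, s) = X                 # x1 = (p, q), x2 = (r, s)
    det = p * s - r * q
    c1 = (rhs[0] * s - r * rhs[1]) / det
    c2 = (p * rhs[1] - rhs[0] * q) / det
    return c1, c2

a1, a2 = solve_coeffs([1.0, 1.0])      # position condition y(0)
mu = [math.sqrt(-l) for l in lam]      # frequencies mu_k = sqrt(-lambda_k)
m1, m2 = solve_coeffs([1.0, 1.0])      # these are mu_k * b_k, from y'(0)
b1, b2 = m1 / mu[0], m2 / mu[1]

# Reconstruct y(0) from the coefficients as a consistency check.
y0 = [a1 * X[0][i] + a2 * X[1][i] for i in range(2)]
print("a =", (a1, a2), " b =", (b1, b2), " y(0) =", y0)
```

Because both initial vectors equal $(1, 1)^T$, the velocity coefficients are simply $b_k = a_k/\mu_k$.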