Mathematical Methods
in Quantum Mechanics
With Applications
to Schrödinger Operators


Gerald Teschl

Note: The AMS has granted permission to post this online edition!
This version is for personal online use only! If you like this book and want
to support the idea of online versions, please consider buying this book:
    http://www.ams.org/bookstore-getitem?item=gsm-99




Graduate Studies
in Mathematics
Volume 99




            American Mathematical Society
            Providence, Rhode Island
Editorial Board
                David Cox (Chair)         Steven G. Krantz
                Rafe Mazzeo               Martin Scharlemann

2000 Mathematics Subject Classification. 81-01, 81Qxx, 46-01, 34Bxx, 47B25
Abstract. This book provides a self-contained introduction to mathematical methods in quan-
tum mechanics (spectral theory) with applications to Schrödinger operators. The first part cov-
ers mathematical foundations of quantum mechanics from self-adjointness, the spectral theorem,
quantum dynamics (including Stone’s and the RAGE theorem) to perturbation theory for self-
adjoint operators.
     The second part starts with a detailed study of the free Schrödinger operator as well as the
position, momentum, and angular momentum operators. Then we develop Weyl–Titchmarsh the-
ory for Sturm–Liouville operators and apply it to spherically symmetric problems, in particular
to the hydrogen atom. Next we investigate self-adjointness of atomic Schrödinger operators and
their essential spectrum, in particular the HVZ theorem. Finally we have a look at scattering
theory and prove asymptotic completeness in the short range case.
    For additional information and updates on this book, visit:
                            http://www.ams.org/bookpages/gsm-99/

Typeset by LaTeX and Makeindex. Version: February 17, 2009.




Library of Congress Cataloging-in-Publication Data
Teschl, Gerald, 1970–
    Mathematical methods in quantum mechanics : with applications to Schrödinger operators
/ Gerald Teschl.
      p. cm. — (Graduate Studies in Mathematics ; v. 99)
    Includes bibliographical references and index.
    ISBN 978-0-8218-4660-5 (alk. paper)

    1. Schrödinger operators. 2. Quantum theory—Mathematics. I. Title.
QC174.17.S3T47 2009                                                                 2008045437
515’.724–dc22



Copying and reprinting. Individual readers of this publication, and nonprofit libraries acting
for them, are permitted to make fair use of the material, such as to copy a chapter for use
in teaching or research. Permission is granted to quote brief passages from this publication in
reviews, provided the customary acknowledgement of the source is given.
      Republication, systematic copying, or multiple reproduction of any material in this pub-
lication (including abstracts) is permitted only under license from the American Mathematical
Society. Requests for such permissions should be addressed to the Assistant to the Publisher,
American Mathematical Society, P.O. Box 6248, Providence, Rhode Island 02940-6248. Requests
can also be made by e-mail to reprint-permission@ams.org.
                © 2009 by the American Mathematical Society. All rights reserved.
                     The American Mathematical Society retains all rights
                     except those granted to the United States Government.
To Susanne, Simon, and Jakob
Contents


Preface                                                  xi

Part 0. Preliminaries

Chapter 0. A first look at Banach and Hilbert spaces       3
  §0.1. Warm up: Metric and topological spaces            3
  §0.2. The Banach space of continuous functions         12
  §0.3. The geometry of Hilbert spaces                   16
  §0.4. Completeness                                     22
  §0.5. Bounded operators                                22
  §0.6. Lebesgue    Lp   spaces                          25
  §0.7. Appendix: The uniform boundedness principle      32

Part 1. Mathematical Foundations of Quantum Mechanics

Chapter 1. Hilbert spaces                                37
  §1.1. Hilbert spaces                                   37
  §1.2. Orthonormal bases                                39
  §1.3. The projection theorem and the Riesz lemma       43
  §1.4. Orthogonal sums and tensor products              45
  §1.5. The   C∗   algebra of bounded linear operators   47
  §1.6. Weak and strong convergence                      49
  §1.7. Appendix: The Stone–Weierstraß theorem           51

Chapter 2. Self-adjointness and spectrum                 55



   §2.1.   Some quantum mechanics                                      55
   §2.2.   Self-adjoint operators                                      58
   §2.3.   Quadratic forms and the Friedrichs extension                67
   §2.4.   Resolvents and spectra                                      73
   §2.5.   Orthogonal sums of operators                                79
   §2.6.   Self-adjoint extensions                                     81
   §2.7.   Appendix: Absolutely continuous functions                   84
Chapter    3. The spectral theorem                                      87
  §3.1.    The spectral theorem                                         87
  §3.2.    More on Borel measures                                       99
  §3.3.    Spectral types                                              104
  §3.4.    Appendix: The Herglotz theorem                              107
Chapter    4. Applications of the spectral theorem                     111
  §4.1.    Integral formulas                                           111
  §4.2.    Commuting operators                                         115
  §4.3.    The min-max theorem                                         117
  §4.4.    Estimating eigenspaces                                      119
  §4.5.    Tensor products of operators                                120
Chapter    5. Quantum dynamics                                         123
  §5.1.    The time evolution and Stone’s theorem                      123
  §5.2.    The RAGE theorem                                            126
  §5.3.    The Trotter product formula                                 131
Chapter    6. Perturbation theory for self-adjoint operators           133
  §6.1.    Relatively bounded operators and the Kato–Rellich theorem   133
  §6.2.    More on compact operators                                   136
  §6.3.    Hilbert–Schmidt and trace class operators                   139
  §6.4.    Relatively compact operators and Weyl’s theorem             145
  §6.5.    Relatively form bounded operators and the KLMN theorem      149
  §6.6.    Strong and norm resolvent convergence                       153

Part 2. Schrödinger Operators
Chapter 7. The free Schrödinger operator                               161
  §7.1. The Fourier transform                                          161
  §7.2. The free Schrödinger operator                                  167


  §7.3. The time evolution in the free case                      169
  §7.4. The resolvent and Green’s function                       171
Chapter   8. Algebraic methods                                   173
  §8.1.   Position and momentum                                  173
  §8.2.   Angular momentum                                       175
  §8.3.   The harmonic oscillator                                178
  §8.4.   Abstract commutation                                   179
Chapter   9. One-dimensional Schrödinger operators               181
  §9.1.   Sturm–Liouville operators                              181
  §9.2.   Weyl’s limit circle, limit point alternative           187
  §9.3.   Spectral transformations I                             195
  §9.4.   Inverse spectral theory                                202
  §9.5.   Absolutely continuous spectrum                         206
  §9.6.   Spectral transformations II                            209
  §9.7.   The spectra of one-dimensional Schrödinger operators   214
Chapter 10. One-particle Schrödinger operators                   221
  §10.1. Self-adjointness and spectrum                           221
  §10.2. The hydrogen atom                                       222
  §10.3. Angular momentum                                        225
  §10.4. The eigenvalues of the hydrogen atom                    229
  §10.5. Nondegeneracy of the ground state                       235
Chapter 11. Atomic Schrödinger operators                         239
  §11.1. Self-adjointness                                        239
  §11.2. The HVZ theorem                                         242
Chapter 12. Scattering theory                                    247
  §12.1. Abstract theory                                         247
  §12.2. Incoming and outgoing states                            250
  §12.3. Schrödinger operators with short range potentials       253

Part 3. Appendix
Appendix A. Almost everything about Lebesgue integration         259
  §A.1. Borel measures in a nut shell                            259
  §A.2. Extending a premeasure to a measure                      263
  §A.3. Measurable functions                                     268


    §A.4. The Lebesgue integral               270
    §A.5. Product measures                    275
    §A.6. Vague convergence of measures       278
    §A.7. Decomposition of measures           280
    §A.8. Derivatives of measures             282
Bibliographical notes                         289
Bibliography                                  293
Glossary of notation                          297
Index                                         301
Preface


Overview

    The present text was written for my course Schrödinger Operators held
at the University of Vienna in winter 1999, summer 2002, summer 2005,
and winter 2007. It gives a brief but rather self-contained introduction
to the mathematical methods of quantum mechanics with a view towards
applications to Schrödinger operators. The applications presented are highly
selective and many important and interesting items are not touched upon.
    Part 1 is a stripped down introduction to spectral theory of unbounded
operators where I try to introduce only those topics which are needed for
the applications later on. This has the advantage that you will (hopefully)
not get drowned in results which are never used again before you get to
the applications. In particular, I am not trying to present an encyclopedic
reference. Nevertheless I still feel that the first part should provide a solid
background covering many important results which are usually taken for
granted in more advanced books and research papers.
    My approach is built around the spectral theorem as the central object.
Hence I try to get to it as quickly as possible. Moreover, I do not take the
detour over bounded operators but I go straight for the unbounded case.
In addition, existence of spectral measures is established via the Herglotz
theorem rather than the Riesz representation theorem since this approach
paves the way for an investigation of spectral types via boundary values of
the resolvent as the spectral parameter approaches the real line.






    Part 2 starts with the free Schrödinger equation and computes the
free resolvent and time evolution. In addition, I discuss position, momen-
tum, and angular momentum operators via algebraic methods. This is
usually found in any physics textbook on quantum mechanics, with the
only difference that I include some technical details which are typically
not found there. Then there is an introduction to one-dimensional mod-
els (Sturm–Liouville operators) including generalized eigenfunction expan-
sions (Weyl–Titchmarsh theory) and subordinacy theory from Gilbert and
Pearson. These results are applied to compute the spectrum of the hy-
drogen atom, where again I try to provide some mathematical details not
found in physics textbooks. Further topics are nondegeneracy of the ground
state, spectra of atoms (the HVZ theorem), and scattering theory (the Enß
method).

Prerequisites

    I assume some previous experience with Hilbert spaces and bounded
linear operators which should be covered in any basic course on functional
analysis. However, while this assumption is reasonable for mathematics
students, it might not always be for physics students. For this reason there
is a preliminary chapter reviewing all necessary results (including proofs).
In addition, there is an appendix (again with proofs) providing all necessary
results from measure theory.

Literature

    The present book is highly influenced by the four volumes of Reed and
Simon [40]–[43] (see also [14]) and by the book by Weidmann [60] (an
extended version of which has recently appeared in two volumes [62], [63],
however, only in German). Other books with a similar scope are for example
[14], [15], [21], [23], [39], [48], and [55]. For those who want to know more
about the physical aspects, I can recommend the classical book by Thirring
[58] and the visual guides by Thaller [56], [57]. Further information can be
found in the bibliographical notes at the end.

Reader’s guide

    There is some intentional overlap between Chapter 0, Chapter 1, and
Chapter 2. Hence, provided you have the necessary background, you can
start reading in Chapter 1 or even Chapter 2. Chapters 2 and 3 are key


chapters and you should study them in detail (except for Section 2.6 which
can be skipped on first reading). Chapter 4 should give you an idea of how
the spectral theorem is used. You should have a look at (e.g.) the first
section and you can come back to the remaining ones as needed. Chapter 5
contains two key results from quantum dynamics: Stone’s theorem and the
RAGE theorem. In particular the RAGE theorem shows the connections
between long time behavior and spectral types. Finally, Chapter 6 is again
of central importance and should be studied in detail.
    The chapters in the second part are mostly independent of each other
except for Chapter 7, which is a prerequisite for all others except for Chap-
ter 9.
    If you are interested in one-dimensional models (Sturm–Liouville equa-
tions), Chapter 9 is all you need.
    If you are interested in atoms, read Chapter 7, Chapter 10, and Chap-
ter 11. In particular, you can skip the separation of variables (Sections 10.3
and 10.4, which require Chapter 9) method for computing the eigenvalues of
the hydrogen atom, if you are happy with the fact that there are countably
many which accumulate at the bottom of the continuous spectrum.
    If you are interested in scattering theory, read Chapter 7, the first two
sections of Chapter 10, and Chapter 12. Chapter 5 is one of the key prereq-
uisites in this case.

Updates

   The AMS is hosting a web page for this book at

                http://www.ams.org/bookpages/gsm-99/

where updates, corrections, and other material may be found, including a
link to material on my own web site:

      http://www.mat.univie.ac.at/~gerald/ftp/book-schroe/


Acknowledgments

    I would like to thank Volker Enß for making his lecture notes [18] avail-
able to me. Many colleagues and students have made useful suggestions and
pointed out mistakes in earlier drafts of this book, in particular: Kerstin
Ammann, Jörg Arnberger, Chris Davis, Fritz Gesztesy, Maria Hoffmann-
Ostenhof, Zhenyou Huang, Helge Krüger, Katrin Grunert, Wang Lanning,
Daniel Lenz, Christine Pfeuffer, Roland Möws, Arnold L. Neidhardt, Harald


Rindler, Johannes Temme, Karl Unterkofler, Joachim Weidmann, and Rudi
Weikard.
   If you also find an error or if you have comments or suggestions
(no matter how small), please let me know.
    I have been supported by the Austrian Science Fund (FWF) during much
of this writing, most recently under grant Y330.


                                                          Gerald Teschl

Vienna, Austria
January 2009




Gerald Teschl
Fakultät für Mathematik
Nordbergstraße 15
Universität Wien
1090 Wien, Austria

E-mail: Gerald.Teschl@univie.ac.at
URL: http://www.mat.univie.ac.at/~gerald/
Part 0

Preliminaries
Chapter 0




A first look at Banach
and Hilbert spaces


I assume that the reader has some basic familiarity with measure theory and func-
tional analysis. For convenience, some facts needed from Banach and Lp spaces are
reviewed in this chapter. A crash course in measure theory can be found in the
Appendix A. If you feel comfortable with terms like Lebesgue Lp spaces, Banach
space, or bounded linear operator, you can skip this entire chapter. However, you
might want to at least browse through it to refresh your memory.


0.1. Warm up: Metric and topological spaces
Before we begin, I want to recall some basic facts from metric and topological
spaces. I presume that you are familiar with these topics from your calculus
course. As a general reference I can warmly recommend Kelley’s classical
book [26].
   A metric space is a space X together with a distance function d :
X × X → R such that
      (i) d(x, y) ≥ 0,
      (ii) d(x, y) = 0 if and only if x = y,
     (iii) d(x, y) = d(y, x),
     (iv) d(x, z) ≤ d(x, y) + d(y, z) (triangle inequality).
    If (ii) does not hold, d is called a semi-metric. Moreover, it is straight-
forward to see the inverse triangle inequality (Problem 0.1)

                          |d(x, y) − d(z, y)| ≤ d(x, z).                   (0.1)



Example. Euclidean space Rn together with d(x, y) = (Σ_{k=1}^n (xk − yk)²)^{1/2}
is a metric space and so is Cn together with d(x, y) = (Σ_{k=1}^n |xk − yk|²)^{1/2}.
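
As a quick illustration (added here, not part of the original text; it assumes NumPy), the following Python sketch checks the triangle inequality and the inverse triangle inequality (0.1) numerically for the Euclidean metric:

    import numpy as np

    def d(x, y):
        # Euclidean distance (sum_k |x_k - y_k|^2)^(1/2); works for R^n and C^n
        return np.sqrt(np.sum(np.abs(x - y) ** 2))

    rng = np.random.default_rng(0)
    x, y, z = (rng.standard_normal(5) for _ in range(3))
    assert d(x, z) <= d(x, y) + d(y, z) + 1e-12        # triangle inequality (iv)
    assert abs(d(x, y) - d(z, y)) <= d(x, z) + 1e-12   # inverse triangle inequality (0.1)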

    The set
                             Br (x) = {y ∈ X|d(x, y) < r}                     (0.2)
is called an open ball around x with radius r > 0. A point x of some set
U is called an interior point of U if U contains some ball around x. If x is
an interior point of U , then U is also called a neighborhood of x. A point
x is called a limit point of U if (Br (x) \ {x}) ∩ U ≠ ∅ for every ball around
x. Note that a limit point x need not lie in U , but U must contain points
arbitrarily close to x.
Example. Consider R with the usual metric and let U = (−1, 1). Then
every point x ∈ U is an interior point of U . The points ±1 are limit points
of U .

   A set consisting only of interior points is called open. The family of
open sets O satisfies the properties
       (i) ∅, X ∈ O,
       (ii) O1 , O2 ∈ O implies O1 ∩ O2 ∈ O,
      (iii) {Oα } ⊆ O implies ⋃α Oα ∈ O.
That is, O is closed under finite intersections and arbitrary unions.
    In general, a space X together with a family of sets O, the open sets,
satisfying (i)–(iii) is called a topological space. The notions of interior
point, limit point, and neighborhood carry over to topological spaces if we
replace open ball by open set.
    There are usually different choices for the topology. Two usually not
very interesting examples are the trivial topology O = {∅, X} and the
discrete topology O = P(X) (the powerset of X). Given two topologies
O1 and O2 on X, O1 is called weaker (or coarser) than O2 if and only if
O1 ⊆ O2 .
Example. Note that different metrics can give rise to the same topology.
For example, we can equip Rn (or Cn ) with the Euclidean distance d(x, y)
as before or we could also use
                         d̃(x, y) = Σ_{k=1}^n |xk − yk|.                       (0.3)

Then
            (1/√n) Σ_{k=1}^n |xk| ≤ (Σ_{k=1}^n |xk|²)^{1/2} ≤ Σ_{k=1}^n |xk|   (0.4)


shows B_{r/√n}(x) ⊆ B̃_r(x) ⊆ B_r(x), where B, B̃ are the balls computed using d,
d̃, respectively.
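
The two bounds in (0.4) are easy to test numerically; the following small sketch (an added illustration, not from the book; it assumes NumPy) checks them on random vectors:

    import numpy as np

    rng = np.random.default_rng(1)
    n = 7
    for _ in range(100):
        x, y = rng.standard_normal(n), rng.standard_normal(n)
        d  = np.sqrt(np.sum((x - y) ** 2))   # Euclidean metric
        dt = np.sum(np.abs(x - y))           # the metric d~ from (0.3)
        assert dt / np.sqrt(n) <= d + 1e-12  # left inequality in (0.4)
        assert d <= dt + 1e-12               # right inequality in (0.4)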

Example. We can always replace a metric d by the bounded metric
                          d̃(x, y) = d(x, y) / (1 + d(x, y))                   (0.5)
without changing the topology.

     Every subspace Y of a topological space X becomes a topological space
of its own if we call O ⊆ Y open if there is some open set Õ ⊆ X such that
O = Õ ∩ Y (induced topology).
Example. The set (0, 1] ⊆ R is not open in the topology of X = R, but it is
open in the induced topology when considered as a subset of Y = [−1, 1].

    A family of open sets B ⊆ O is called a base for the topology if for each
x and each neighborhood U (x), there is some set O ∈ B with x ∈ O ⊆ U (x).
Since an open set O is a neighborhood of every one of its points, it can be
written as O = ⋃_{Õ∈B, Õ⊆O} Õ and we have

Lemma 0.1. If B ⊆ O is a base for the topology, then every open set can
be written as a union of elements from B.

   If there exists a countable base, then X is called second countable.
Example. By construction the open balls B1/n (x) are a base for the topol-
ogy in a metric space. In the case of Rn (or Cn ) it even suffices to take balls
with rational center and hence Rn (and Cn ) is second countable.

    A topological space is called a Hausdorff space if for two different
points there are always two disjoint neighborhoods.
Example. Any metric space is a Hausdorff space: Given two different
points x and y, the balls Bd/2 (x) and Bd/2 (y), where d = d(x, y) > 0, are
disjoint neighborhoods (a semi-metric space will not be Hausdorff).

   The complement of an open set is called a closed set. It follows from
de Morgan’s rules that the family of closed sets C satisfies
      (i) ∅, X ∈ C,
     (ii) C1 , C2 ∈ C implies C1 ∪ C2 ∈ C,
      (iii) {Cα } ⊆ C implies ⋂α Cα ∈ C.
That is, closed sets are closed under finite unions and arbitrary intersections.
   The smallest closed set containing a given set U is called the closure
                         Ū = ⋂_{C∈C, U⊆C} C,                                  (0.6)


and the largest open set contained in a given set U is called the interior
                         U◦ = ⋃_{O∈O, O⊆U} O.                                  (0.7)

    We can define interior and limit points as before by replacing the word
ball by open set. Then it is straightforward to check
Lemma 0.2. Let X be a topological space. Then the interior of U is the
set of all interior points of U and the closure of U is the union of U with
all limit points of U .

    A sequence (xn)_{n=1}^∞ ⊆ X is said to converge to some point x ∈ X if
d(x, xn ) → 0. We write limn→∞ xn = x as usual in this case. Clearly the
limit is unique if it exists (this is not true for a semi-metric).
   Every convergent sequence is a Cauchy sequence; that is, for every
ε > 0 there is some N ∈ N such that
                        d(xn , xm ) ≤ ε,    n, m ≥ N.                     (0.8)
If the converse is also true, that is, if every Cauchy sequence has a limit,
then X is called complete.
Example. Both Rn and Cn are complete metric spaces.

   A point x is clearly a limit point of U if and only if there is some sequence
xn ∈ U \ {x} converging to x. Hence
Lemma 0.3. A closed subset of a complete metric space is again a complete
metric space.

    Note that convergence can also be equivalently formulated in terms of
topological terms: A sequence xn converges to x if and only if for every
neighborhood U of x there is some N ∈ N such that xn ∈ U for n ≥ N . In
a Hausdorff space the limit is unique.
    A set U is called dense if its closure is all of X, that is, if Ū = X. A
metric space is called separable if it contains a countable dense set. Note
that X is separable if and only if it is second countable as a topological
space.
Lemma 0.4. Let X be a separable metric space. Every subset of X is again
separable.

Proof. Let A = {xn }n∈N be a dense set in X. The only problem is that
A ∩ Y might contain no elements at all. However, some elements of A must
be at least arbitrarily close: Let J ⊆ N2 be the set of all pairs (n, m) for
which B1/m (xn ) ∩ Y ≠ ∅ and choose some yn,m ∈ B1/m (xn ) ∩ Y for all
(n, m) ∈ J. Then B = {yn,m }(n,m)∈J ⊆ Y is countable. To see that B is


dense, choose y ∈ Y . Then there is some sequence xnk with d(xnk , y) < 1/k.
Hence (nk , k) ∈ J and d(ynk ,k , y) ≤ d(ynk ,k , xnk ) + d(xnk , y) ≤ 2/k → 0.

    A function between metric spaces X and Y is called continuous at a
point x ∈ X if for every ε > 0 we can find a δ > 0 such that
                   dY (f (x), f (y)) ≤ ε       if   dX (x, y) < δ.           (0.9)
If f is continuous at every point, it is called continuous.
Lemma 0.5. Let X, Y be metric spaces and f : X → Y . The following are
equivalent:
       (i) f is continuous at x (i.e., (0.9) holds).
     (ii) f (xn ) → f (x) whenever xn → x.
     (iii) For every neighborhood V of f (x), f −1 (V ) is a neighborhood of x.

Proof. (i) ⇒ (ii) is obvious. (ii) ⇒ (iii): If (iii) does not hold, there is
a neighborhood V of f(x) such that Bδ(x) ⊄ f⁻¹(V) for every δ. Hence
we can choose a sequence xn ∈ B1/n(x) such that f(xn) ∉ V. Thus
xn → x but f(xn) ↛ f(x). (iii) ⇒ (i): Choose V = Bε(f(x)) and observe
that by (iii), Bδ(x) ⊆ f⁻¹(V) for some δ.

    The last item implies that f is continuous if and only if the inverse image
of every open (closed) set is again open (closed).
    Note: In a topological space, (iii) is used as the definition for continuity.
However, in general (ii) and (iii) will no longer be equivalent unless one uses
generalized sequences, so-called nets, where the index set N is replaced by
arbitrary directed sets.
   The support of a function f : X → Cn is the closure of all points x for
which f (x) does not vanish; that is,
                         supp(f) = \overline{\{x ∈ X | f(x) ≠ 0\}}.           (0.10)

   If X and Y are metric spaces, then X × Y together with
                d((x1 , y1 ), (x2 , y2 )) = dX (x1 , x2 ) + dY (y1 , y2 )   (0.11)
is a metric space. A sequence (xn , yn ) converges to (x, y) if and only if
xn → x and yn → y. In particular, the projections onto the first (x, y) → x,
respectively, onto the second (x, y) → y, coordinate are continuous.
   In particular, by the inverse triangle inequality (0.1),
                  |d(xn , yn ) − d(x, y)| ≤ d(xn , x) + d(yn , y),          (0.12)
we see that d : X × X → R is continuous.


Example. If we consider R × R, we do not get the Euclidean distance of
R2 unless we modify (0.11) as follows:
              d̃((x1, y1), (x2, y2)) = (dX(x1, x2)² + dY(y1, y2)²)^{1/2}.       (0.13)
As noted in our previous example, the topology (and thus also conver-
gence/continuity) is independent of this choice.

    If X and Y are just topological spaces, the product topology is defined
by calling O ⊆ X × Y open if for every point (x, y) ∈ O there are open
neighborhoods U of x and V of y such that U × V ⊆ O. In the case of
metric spaces this clearly agrees with the topology defined via the product
metric (0.11).
    A cover of a set Y ⊆ X is a family of sets {Uα } such that Y ⊆ ⋃α Uα .
A cover is called open if all Uα are open. Any subset of {Uα } which still
covers Y is called a subcover.
Lemma 0.6 (Lindelöf). If X is second countable, then every open cover
has a countable subcover.

Proof. Let {Uα } be an open cover for Y and let B be a countable base.
Since every Uα can be written as a union of elements from B, the set of all
B ∈ B which satisfy B ⊆ Uα for some α form a countable open cover for Y .
Moreover, for every Bn in this set we can find an αn such that Bn ⊆ Uαn .
By construction {Uαn } is a countable subcover.

   A subset K ⊂ X is called compact if every open cover has a finite
subcover.
Lemma 0.7. A topological space is compact if and only if it has the finite
intersection property: The intersection of a family of closed sets is empty
if and only if the intersection of some finite subfamily is empty.

Proof. By taking complements, to every family of open sets there is a cor-
responding family of closed sets and vice versa. Moreover, the open sets
are a cover if and only if the corresponding closed sets have empty intersec-
tion.

    A subset K ⊂ X is called sequentially compact if every sequence has
a convergent subsequence.
Lemma 0.8. Let X be a topological space.
      (i) The continuous image of a compact set is compact.
     (ii) Every closed subset of a compact set is compact.
     (iii) If X is Hausdorff, any compact set is closed.


     (iv) The product of finitely many compact sets is compact.
      (v) A compact set is also sequentially compact.

Proof. (i) Observe that if {Oα } is an open cover for f (Y ), then {f −1 (Oα )}
is one for Y .
  (ii) Let {Oα } be an open cover for the closed subset Y . Then {Oα } ∪
{X \ Y} is an open cover for X.
    (iii) Let Y ⊆ X be compact. We show that X \ Y is open. Fix x ∈ X \ Y
(if Y = X, there is nothing to do). By the definition of Hausdorff, for
every y ∈ Y there are disjoint neighborhoods V (y) of y and Uy (x) of x. By
compactness of Y , there are y1 , . . . , yn such that the V (yj ) cover Y . But
then U(x) = ⋂_{j=1}^n U_{yj}(x) is a neighborhood of x which does not intersect
Y.
    (iv) Let {Oα } be an open cover for X × Y . For every (x, y) ∈ X × Y
there is some α(x, y) such that (x, y) ∈ Oα(x,y) . By definition of the product
topology there is some open rectangle U (x, y) × V (x, y) ⊆ Oα(x,y) . Hence for
fixed x, {V (x, y)}y∈Y is an open cover of Y . Hence there are finitely many
points yk(x) such that the V(x, yk(x)) cover Y. Set U(x) = ⋂_k U(x, yk(x)).
Since finite intersections of open sets are open, {U (x)}x∈X is an open cover
and there are finitely many points xj such that the U (xj ) cover X. By
construction, the U (xj ) × V (xj , yk (xj )) ⊆ Oα(xj ,yk (xj )) cover X × Y .
    (v) Let xn be a sequence which has no convergent subsequence. Then
K = {xn } has no limit points and is hence compact by (ii). For every n
there is a ball Bεn (xn ) which contains only finitely many elements of K.
However, finitely many suffice to cover K, a contradiction.

    In a metric space compact and sequentially compact are equivalent.
Lemma 0.9. Let X be a metric space. Then a subset is compact if and only
if it is sequentially compact.

Proof. By item (v) of the previous lemma it suffices to show that X is
compact if it is sequentially compact.
    First of all note that every cover of open balls with fixed radius ε > 0
has a finite subcover since if this were false we could construct a sequence
xn ∈ X \ ⋃_{m=1}^{n−1} Bε(xm) such that d(xn, xm) > ε for m < n.
   In particular, we are done if we can show that for every open cover
{Oα } there is some ε > 0 such that for every x we have Bε (x) ⊆ Oα for
some α = α(x). Indeed, choosing {xk}_{k=1}^n such that Bε(xk) is a cover, we
have that Oα(xk) is a cover as well.
    So it remains to show that there is such an ε. If there were none, for
every ε > 0 there must be an x such that Bε(x) ⊄ Oα for every α. Choose
ε = 1/n and pick a corresponding xn. Since X is sequentially compact, it is no
restriction to assume xn converges (after maybe passing to a subsequence).
Let x = lim xn. Then x lies in some Oα and hence Bε(x) ⊆ Oα for some ε > 0.
But choosing n so large that 1/n < ε/2 and d(xn, x) < ε/2, we have
B1/n(xn) ⊆ Bε(x) ⊆ Oα, contradicting our assumption.

     Please also recall the Heine–Borel theorem:
Theorem 0.10 (Heine–Borel). In Rn (or Cn ) a set is compact if and only
if it is bounded and closed.

Proof. By Lemma 0.8 (ii) and (iii) it suffices to show that a closed interval
I ⊆ R is compact. Moreover, by Lemma 0.9 it suffices to show that
every sequence in I = [a, b] has a convergent subsequence. Let xn be our
sequence and divide I = [a, (a+b)/2] ∪ [(a+b)/2, b]. Then at least one of these two
intervals, call it I1, contains infinitely many elements of our sequence. Let
y1 = xn1 be the first one. Subdivide I1 and pick y2 = xn2, with n2 > n1 as
before. Proceeding like this, we obtain a Cauchy sequence yn (note that by
construction In+1 ⊆ In and hence |yn − ym| ≤ (b−a)/2ⁿ for m ≥ n).

   A topological space is called locally compact if every point has a com-
pact neighborhood.
Example. Rn is locally compact.

     The distance between a point x ∈ X and a subset Y ⊆ X is
                         dist(x, Y) = inf_{y∈Y} d(x, y).                      (0.14)

Note that x is a limit point of Y if and only if dist(x, Y ) = 0.
Lemma 0.11. Let X be a metric space. Then
                      | dist(x, Y ) − dist(z, Y )| ≤ d(x, z).            (0.15)
In particular, x → dist(x, Y ) is continuous.

Proof. Taking the infimum in the triangle inequality d(x, y) ≤ d(x, z) +
d(z, y) shows dist(x, Y ) ≤ d(x, z)+dist(z, Y ). Hence dist(x, Y )−dist(z, Y ) ≤
d(x, z). Interchanging x and z shows dist(z, Y ) − dist(x, Y ) ≤ d(x, z).
Lemma 0.12 (Urysohn). Suppose C1 and C2 are disjoint closed subsets of
a metric space X. Then there is a continuous function f : X → [0, 1] such
that f is one on C1 and zero on C2 .
   If X is locally compact and C1 is compact, one can choose f with compact
support.


Proof. To prove the first claim, set f(x) = dist(x, C2) / (dist(x, C1) + dist(x, C2)). For the
second claim, observe that there is an open set O such that Ō is compact
and C1 ⊂ O ⊂ Ō ⊂ X \ C2. In fact, for every x ∈ C1, there is a ball Bε(x) such
that B̄ε(x) is compact and B̄ε(x) ⊂ X \ C2. Since C1 is compact, finitely
many of them cover C1 and we can choose the union of those balls to be O.
Now replace C2 by X \ Ō.
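
To make the first step of the proof concrete, here is a small Python sketch (an added illustration, not from the book; the finite point sets and the use of NumPy are assumptions made for the demonstration) of the function f built from dist:

    import numpy as np

    def dist(x, Y):
        # dist(x, Y) = inf_{y in Y} d(x, y); here Y is a finite set of points
        return min(np.linalg.norm(x - y) for y in Y)

    def urysohn(x, C1, C2):
        # f(x) = dist(x, C2) / (dist(x, C1) + dist(x, C2)): one on C1, zero on C2
        d1, d2 = dist(x, C1), dist(x, C2)
        return d2 / (d1 + d2)

    C1 = [np.array([0.0, 0.0])]
    C2 = [np.array([3.0, 0.0]), np.array([0.0, 3.0])]
    print(urysohn(C1[0], C1, C2))                  # 1.0
    print(urysohn(C2[0], C1, C2))                  # 0.0
    print(urysohn(np.array([1.0, 1.0]), C1, C2))   # a value strictly between 0 and 1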

     Note that Urysohn’s lemma implies that a metric space is normal; that
is, for any two disjoint closed sets C1 and C2 , there are disjoint open sets
O1 and O2 such that Cj ⊆ Oj , j = 1, 2. In fact, choose f as in Urysohn’s
lemma and set O1 = f⁻¹((1/2, 1]), respectively, O2 = f⁻¹([0, 1/2)).
Lemma 0.13. Let X be a locally compact metric space. Suppose K is
a compact set and {Oj}_{j=1}^n an open cover. Then there is a partition of
unity for K subordinate to this cover; that is, there are continuous functions
hj : X → [0, 1] such that hj has compact support contained in Oj and
                              Σ_{j=1}^n hj(x) ≤ 1                              (0.16)
with equality for x ∈ K.

Proof. For every x ∈ K there is some ε and some j such that Bε (x) ⊆ Oj .
By compactness of K, finitely many of these balls cover K. Let Kj be the
union of those balls which lie inside Oj . By Urysohn’s lemma there are
functions gj : X → [0, 1] such that gj = 1 on Kj and gj = 0 on X \ Oj. Now
set
                         hj = gj ∏_{k=1}^{j−1} (1 − gk).                       (0.17)
Then hj : X → [0, 1] has compact support contained in Oj and
                   Σ_{j=1}^n hj(x) = 1 − ∏_{j=1}^n (1 − gj(x))                 (0.18)
shows that the sum is one for x ∈ K, since x ∈ Kj for some j implies
gj (x) = 1 and causes the product to vanish.
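
The telescoping identity behind (0.18) can also be checked numerically; the following sketch (an added illustration, not from the book, assuming NumPy) evaluates both sides at a single point x, represented by the values g1(x), ..., gn(x):

    import numpy as np

    rng = np.random.default_rng(2)
    g = rng.uniform(0.0, 1.0, size=6)                        # values g_j(x) in [0, 1]

    h = [g[j] * np.prod(1 - g[:j]) for j in range(len(g))]   # h_j(x) as in (0.17)
    assert abs(sum(h) - (1 - np.prod(1 - g))) < 1e-12        # the identity (0.18)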
Problem 0.1. Show that |d(x, y) − d(z, y)| ≤ d(x, z).
Problem 0.2. Show the quadrangle inequality |d(x, y) − d(x′, y′)| ≤
d(x, x′) + d(y, y′).
Problem 0.3. Let X be some space together with a sequence of distance
functions dn , n ∈ N. Show that
                    d(x, y) = Σ_{n=1}^∞ (1/2ⁿ) dn(x, y) / (1 + dn(x, y))


is again a distance function.

Problem 0.4. Show that the closure satisfies \overline{\overline{U}} = \overline{U}.
Problem 0.5. Let U ⊆ V be subsets of a metric space X. Show that if U
is dense in V and V is dense in X, then U is dense in X.
Problem 0.6. Show that any open set O ⊆ R can be written as a countable
union of disjoint intervals. (Hint: Let {Iα } be the set of all maximal subin-
tervals of O; that is, Iα ⊆ O and there is no other subinterval of O which
contains Iα . Then this is a cover of disjoint intervals which has a countable
subcover.)

0.2. The Banach space of continuous functions
Now let us have a first look at Banach spaces by investigating the set of
continuous functions C(I) on a compact interval I = [a, b] ⊂ R. Since we
want to handle complex models, we will always consider complex-valued
functions!
   One way of declaring a distance, well-known from calculus, is the max-
imum norm:
                    ‖f(x) − g(x)‖∞ = max_{x∈I} |f(x) − g(x)|.                  (0.19)
It is not hard to see that with this definition C(I) becomes a normed linear
space:
    A normed linear space X is a vector space X over C (or R) with a
real-valued function (the norm) ‖·‖ such that
       • ‖f‖ ≥ 0 for all f ∈ X and ‖f‖ = 0 if and only if f = 0,
       • ‖α f‖ = |α| ‖f‖ for all α ∈ C and f ∈ X, and
       • ‖f + g‖ ≤ ‖f‖ + ‖g‖ for all f, g ∈ X (triangle inequality).
    From the triangle inequality we also get the inverse triangle inequal-
ity (Problem 0.7)
                           | ‖f‖ − ‖g‖ | ≤ ‖f − g‖.                            (0.20)
   Once we have a norm, we have a distance d(f, g) = ‖f − g‖ and hence we
know when a sequence of vectors fn converges to a vector f . We will write
fn → f or limn→∞ fn = f , as usual, in this case. Moreover, a mapping
F : X → Y between two normed spaces is called continuous if fn → f
implies F (fn ) → F (f ). In fact, it is not hard to see that the norm, vector
addition, and multiplication by scalars are continuous (Problem 0.8).
   In addition to the concept of convergence we have also the concept of
a Cauchy sequence and hence the concept of completeness: A normed


space is called complete if every Cauchy sequence has a limit. A complete
normed space is called a Banach space.
Example. The space ℓ¹(N) of all sequences a = (aj)_{j=1}^∞ for which the norm

                         ‖a‖₁ = Σ_{j=1}^∞ |aj|                                 (0.21)

is finite is a Banach space.
    To show this, we need to verify three things: (i) ℓ¹(N) is a vector space
that is closed under addition and scalar multiplication, (ii) ‖·‖₁ satisfies the
three requirements for a norm, and (iii) ℓ¹(N) is complete.
    First of all observe

         Σ_{j=1}^k |aj + bj| ≤ Σ_{j=1}^k |aj| + Σ_{j=1}^k |bj| ≤ ‖a‖₁ + ‖b‖₁   (0.22)

for any finite k. Letting k → ∞, we conclude that ℓ¹(N) is closed under
addition and that the triangle inequality holds. That ℓ¹(N) is closed under
scalar multiplication and the two other properties of a norm are straight-
forward. It remains to show that ℓ¹(N) is complete. Let a^n = (a^n_j)_{j=1}^∞ be
a Cauchy sequence; that is, for given ε > 0 we can find an Nε such that
‖a^m − a^n‖₁ ≤ ε for m, n ≥ Nε. This implies in particular |a^m_j − a^n_j| ≤ ε for
any fixed j. Thus a^n_j is a Cauchy sequence for fixed j and by completeness
of C has a limit: lim_{n→∞} a^n_j = aj. Now consider

                         Σ_{j=1}^k |a^m_j − a^n_j| ≤ ε                         (0.23)

and take m → ∞:

                         Σ_{j=1}^k |aj − a^n_j| ≤ ε.                           (0.24)

Since this holds for any finite k, we even have ‖a − a^n‖₁ ≤ ε. Hence (a − a^n) ∈
ℓ¹(N) and since a^n ∈ ℓ¹(N), we finally conclude a = a^n + (a − a^n) ∈ ℓ¹(N).

Example. The space ℓ∞(N) of all bounded sequences a = (aj)_{j=1}^∞ together
with the norm

                         ‖a‖∞ = sup_{j∈N} |aj|                                 (0.25)

is a Banach space (Problem 0.10).

    Now what about convergence in the space C(I)? A sequence of functions
fn (x) converges to f if and only if
            lim_{n→∞} ‖f − fn‖ = lim_{n→∞} sup_{x∈I} |fn(x) − f(x)| = 0.       (0.26)


That is, in the language of real analysis, fn converges uniformly to f . Now
let us look at the case where fn is only a Cauchy sequence. Then fn (x) is
clearly a Cauchy sequence of real numbers for any fixed x ∈ I. In particular,
by completeness of C, there is a limit f (x) for each x. Thus we get a limiting
function f (x). Moreover, letting m → ∞ in
                 |fm (x) − fn (x)| ≤ ε           ∀m, n > Nε , x ∈ I,    (0.27)
we see
                   |f (x) − fn (x)| ≤ ε    ∀n > Nε , x ∈ I;              (0.28)
that is, fn (x) converges uniformly to f (x). However, up to this point we
do not know whether it is in our vector space C(I) or not, that is, whether
it is continuous or not. Fortunately, there is a well-known result from real
analysis which tells us that the uniform limit of continuous functions is again
continuous. Hence f (x) ∈ C(I) and thus every Cauchy sequence in C(I)
converges. Or, in other words
Theorem 0.14. C(I) with the maximum norm is a Banach space.
    Next we want to know if there is a countable basis for C(I). We will call
a set of vectors {un } ⊂ X linearly independent if every finite subset is and
we will call a countable set of linearly independent vectors {un}_{n=1}^N ⊂ X
a Schauder basis if every element f ∈ X can be uniquely written as a
countable linear combination of the basis elements:

                    f = Σ_{n=1}^N cn un,        cn = cn(f) ∈ C,                (0.29)
where the sum has to be understood as a limit if N = ∞. In this case the
span span{un } (the set of all finite linear combinations) of {un } is dense in
X. A set whose span is dense is called total and if we have a countable total
set, we also have a countable dense set (consider only linear combinations
with rational coefficients — show this). A normed linear space containing a
countable dense set is called separable.
Example. The Banach space ℓ¹(N) is separable. In fact, the set of vectors
δ^n, with δ^n_n = 1 and δ^n_m = 0 for n ≠ m, is total: Let a = (aj)_{j=1}^∞ ∈ ℓ¹(N) be
given and set a^n = Σ_{j=1}^n aj δ^j. Then

                         ‖a − a^n‖₁ = Σ_{j=n+1}^∞ |aj| → 0                     (0.30)

since a^n_j = aj for 1 ≤ j ≤ n and a^n_j = 0 for j > n.
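
The tail estimate (0.30) is easy to observe numerically. A small sketch (an added illustration, not from the book; it assumes NumPy and truncates the sequence aj = 1/j² to finitely many terms):

    import numpy as np

    a = np.array([1.0 / j**2 for j in range(1, 2001)])        # stands in for a summable sequence
    for n in (10, 100, 1000):
        a_n = np.concatenate([a[:n], np.zeros(a.size - n)])   # the truncation sum_{j<=n} a_j delta^j
        print(n, np.sum(np.abs(a - a_n)))                     # the l^1 tail, tending to 0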

     Luckily this is also the case for C(I):
Theorem 0.15 (Weierstraß). Let I be a compact interval. Then the set of
polynomials is dense in C(I).


Proof. Let f(x) ∈ C(I) be given. By considering f(x) − f(a) − (f(b) − f(a))(x − a)/(b − a)
it is no loss to assume that f vanishes at the boundary points. Moreover,
without restriction we only consider I = [−1/2, 1/2] (why?).
    Now the claim follows from the lemma below using

                         un(x) = (1/In) (1 − x²)ⁿ,

where

    In = ∫_{−1}^{1} (1 − x²)ⁿ dx = (n/(n+1)) ∫_{−1}^{1} (1 − x)^{n−1} (1 + x)^{n+1} dx
       = ··· = (n!/((n+1)···(2n+1))) 2^{2n+1} = n!/((1/2)(1/2 + 1)···(1/2 + n))
       = √π Γ(1+n)/Γ(3/2+n) = √(π/n) (1 + O(1/n)).

In the last step we have used Γ(1/2) = √π [1, (6.1.8)] and the asymptotics
follow from Stirling’s formula [1, (6.1.37)].

Lemma 0.16 (Smoothing). Let un (x) be a sequence of nonnegative contin-
uous functions on [−1, 1] such that

          ∫_{|x|≤1} un(x) dx = 1    and    ∫_{δ≤|x|≤1} un(x) dx → 0,   δ > 0.   (0.31)

(In other words, un has mass one and concentrates near x = 0 as n → ∞.)
    Then for every f ∈ C[−1/2, 1/2] which vanishes at the endpoints, f(−1/2) =
f(1/2) = 0, we have that

                    fn(x) = ∫_{−1/2}^{1/2} un(x − y) f(y) dy                    (0.32)
                                            −1/2

converges uniformly to f (x).

Proof. Since f is uniformly continuous, for given ε we can find a δ (indepen-
dent of x) such that |f(x) − f(y)| ≤ ε whenever |x − y| ≤ δ. Moreover, we can
choose n such that ∫_{δ≤|y|≤1} un(y) dy ≤ ε. Now abbreviate M = max{1, ‖f‖∞}
and note

  |f(x) − ∫_{−1/2}^{1/2} un(x − y) f(x) dy| = |f(x)| |1 − ∫_{−1/2}^{1/2} un(x − y) dy| ≤ M ε.

In fact, either the distance of x to one of the boundary points ±1/2 is smaller
than δ and hence |f(x)| ≤ ε or otherwise the difference between one and the
integral is smaller than ε.
    Using this, we have

  |fn(x) − f(x)| ≤ ∫_{−1/2}^{1/2} un(x − y) |f(y) − f(x)| dy + M ε
                 ≤ ∫_{|y|≤1/2, |x−y|≤δ} un(x − y) |f(y) − f(x)| dy
                    + ∫_{|y|≤1/2, |x−y|≥δ} un(x − y) |f(y) − f(x)| dy + M ε
                 ≤ ε + 2M ε + M ε = (1 + 3M) ε,                                 (0.33)

which proves the claim.

   Note that fn will be as smooth as un , hence the title smoothing lemma.
The same idea is used to approximate noncontinuous functions by smooth
ones (of course the convergence will no longer be uniform in this case).
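
For a concrete impression of the kernel, the following Python sketch (an added illustration, not from the book; the test function cos(πx), the grid sizes, and the use of NumPy are assumptions) approximates the convolution (0.32) with un(x) = (1 − x²)ⁿ/In by Riemann sums and prints the maximal deviation from f, which decreases as n grows:

    import numpy as np

    n = 60
    y = np.linspace(-0.5, 0.5, 4001)
    dy = y[1] - y[0]
    f = np.cos(np.pi * y)                          # continuous, vanishes at +-1/2

    t = np.linspace(-1.0, 1.0, 8001)
    I_n = np.sum((1 - t**2) ** n) * (t[1] - t[0])  # Riemann sum for I_n
    u = lambda s: (1 - s**2) ** n / I_n            # the kernel u_n on [-1, 1]

    x = np.linspace(-0.5, 0.5, 201)
    fn = np.array([np.sum(u(xi - y) * f) * dy for xi in x])   # f_n(x) from (0.32)
    print(np.max(np.abs(fn - np.cos(np.pi * x))))             # sup norm of f_n - f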
Corollary 0.17. C(I) is separable.

However, ℓ∞(N) is not separable (Problem 0.11)!
Problem 0.7. Show that | ‖f‖ − ‖g‖ | ≤ ‖f − g‖.
Problem 0.8. Let X be a Banach space. Show that the norm, vector ad-
dition, and multiplication by scalars are continuous. That is, if fn → f ,
gn → g, and αn → α, then fn → f , fn + gn → f + g, and αn gn → αg.
Problem 0.9. Let X be a Banach space. Show that Σ_{j=1}^∞ ‖fj‖ < ∞ implies
that
                    Σ_{j=1}^∞ fj = lim_{n→∞} Σ_{j=1}^n fj

exists. The series is called absolutely convergent in this case.
Problem 0.10. Show that ℓ∞(N) is a Banach space.
Problem 0.11. Show that ℓ∞(N) is not separable. (Hint: Consider se-
quences which take only the values zero and one. How many are there? What
is the distance between two such sequences?)

0.3. The geometry of Hilbert spaces
So it looks like C(I) has all the properties we want. However, there is
still one thing missing: How should we define orthogonality in C(I)? In
Euclidean space, two vectors are called orthogonal if their scalar product
vanishes, so we would need a scalar product:


     Suppose H is a vector space. A map ⟨·, ·⟩ : H × H → C is called a
sesquilinear form if it is conjugate linear in the first argument and linear
in the second; that is,

        ⟨α1 f1 + α2 f2, g⟩ = α1* ⟨f1, g⟩ + α2* ⟨f2, g⟩,
                                                            α1, α2 ∈ C,        (0.34)
        ⟨f, α1 g1 + α2 g2⟩ = α1 ⟨f, g1⟩ + α2 ⟨f, g2⟩,
where ‘∗’ denotes complex conjugation. A sesquilinear form satisfying the
requirements
      (i) ⟨f, f⟩ > 0 for f ≠ 0                (positive definite),
     (ii) ⟨f, g⟩ = ⟨g, f⟩*         (symmetry)
is called an inner product or scalar product. Associated with every
scalar product is a norm
                              ‖f‖ = √⟨f, f⟩.                                   (0.35)
The pair (H, ⟨·, ·⟩) is called an inner product space. If H is complete, it
is called a Hilbert space.
Example. Clearly Cn with the usual scalar product
                         ⟨a, b⟩ = Σ_{j=1}^n aj* bj                             (0.36)

is a (finite dimensional) Hilbert space.

Example. A somewhat more interesting example is the Hilbert space ℓ²(N),
that is, the set of all sequences

                    (aj)_{j=1}^∞  with  Σ_{j=1}^∞ |aj|² < ∞                    (0.37)

with scalar product

                         ⟨a, b⟩ = Σ_{j=1}^∞ aj* bj.                            (0.38)
(Show that this is in fact a separable Hilbert space — Problem 0.13.)

   Of course I still owe you a proof for the claim that √⟨f, f⟩ is indeed a
norm. Only the triangle inequality is nontrivial, which will follow from the
Cauchy–Schwarz inequality below.
     A vector f ∈ H is called normalized or a unit vector if ‖f‖ = 1.
Two vectors f, g ∈ H are called orthogonal or perpendicular (f ⊥ g) if
⟨f, g⟩ = 0 and parallel if one is a multiple of the other.
   If f and g are orthogonal, we have the Pythagorean theorem:
                    ‖f + g‖² = ‖f‖² + ‖g‖²,           f ⊥ g,                   (0.39)
which is one line of computation.


    Suppose u is a unit vector. Then the projection of f in the direction of
u is given by
                              f∥ = ⟨u, f⟩ u                                    (0.40)
and f⊥ defined via
                              f⊥ = f − ⟨u, f⟩ u                                (0.41)
is perpendicular to u since ⟨u, f⊥⟩ = ⟨u, f − ⟨u, f⟩ u⟩ = ⟨u, f⟩ − ⟨u, f⟩ ⟨u, u⟩ =
0.

[Figure: decomposition of the vector f into the part f∥ parallel to the unit vector u and the part f⊥ perpendicular to u.]

Taking any other vector parallel to u, it is easy to see
        ‖f − αu‖² = ‖f⊥ + (f∥ − αu)‖² = ‖f⊥‖² + |⟨u, f⟩ − α|²                  (0.42)
and hence f∥ = ⟨u, f⟩ u is the unique vector parallel to u which is closest to
f.
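
A quick numerical illustration of (0.40)–(0.42) (added here, not from the book; it assumes NumPy, whose vdot is conjugate-linear in the first argument, matching the convention of this section):

    import numpy as np

    rng = np.random.default_rng(3)
    f = rng.standard_normal(4) + 1j * rng.standard_normal(4)
    u = rng.standard_normal(4) + 1j * rng.standard_normal(4)
    u = u / np.linalg.norm(u)                 # a unit vector

    inner = np.vdot                           # <a, b>, conjugate-linear in the first argument
    f_par = inner(u, f) * u                   # (0.40)
    f_perp = f - f_par                        # (0.41)
    assert abs(inner(u, f_perp)) < 1e-12      # f_perp is perpendicular to u

    # (0.42): among all multiples of u, f_par is closest to f
    alphas = inner(u, f) + np.linspace(-2, 2, 201)
    dists = [np.linalg.norm(f - a * u) for a in alphas]
    assert np.argmin(dists) == 100            # the minimum sits at alpha = <u, f>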
    As a first consequence we obtain the Cauchy–Schwarz–Bunjakowski
inequality:
Theorem 0.18 (Cauchy–Schwarz–Bunjakowski). Let H0 be an inner prod-
uct space. Then for every f, g ∈ H0 we have
                              |⟨f, g⟩| ≤ ‖f‖ ‖g‖                               (0.43)
with equality if and only if f and g are parallel.

Proof. It suffices to prove the case ‖g‖ = 1. But then the claim follows
from ‖f‖² = |⟨g, f⟩|² + ‖f⊥‖².

     Note that the Cauchy–Schwarz inequality entails that the scalar product
is continuous in both variables; that is, if fn → f and gn → g, we have
⟨fn, gn⟩ → ⟨f, g⟩.
    As another consequence we infer that the map ‖·‖ is indeed a norm. In
fact,
        ‖f + g‖² = ‖f‖² + ⟨f, g⟩ + ⟨g, f⟩ + ‖g‖² ≤ (‖f‖ + ‖g‖)².               (0.44)

    But let us return to C(I). Can we find a scalar product which has the
maximum norm as associated norm? Unfortunately the answer is no! The
reason is that the maximum norm does not satisfy the parallelogram law
(Problem 0.17).


Theorem 0.19 (Jordan–von Neumann). A norm is associated with a scalar
product if and only if the parallelogram law

                ‖f + g‖² + ‖f − g‖² = 2‖f‖² + 2‖g‖²                                      (0.45)

holds.
    In this case the scalar product can be recovered from its norm by virtue
of the polarization identity

        ⟨f, g⟩ = ¼ ( ‖f + g‖² − ‖f − g‖² + i‖f − ig‖² − i‖f + ig‖² ).                    (0.46)
Proof. If an inner product space is given, verification of the parallelogram
law and the polarization identity is straightforward (Problem 0.14).
    To show the converse, we define

        s(f, g) = ¼ ( ‖f + g‖² − ‖f − g‖² + i‖f − ig‖² − i‖f + ig‖² ).

Then s(f, f) = ‖f‖² and s(f, g) = s(g, f)* are straightforward to check.
Moreover, another straightforward computation using the parallelogram law
shows
                        s(f, g) + s(f, h) = 2 s(f, (g + h)/2).

Now choosing h = 0 (and using s(f, 0) = 0) shows s(f, g) = 2 s(f, g/2) and
thus s(f, g) + s(f, h) = s(f, g + h). Furthermore, by induction we infer
(m/2ⁿ) s(f, g) = s(f, (m/2ⁿ) g); that is, α s(f, g) = s(f, αg) for every positive rational
α. By continuity (check this!) this holds for all α > 0, and s(f, −g) =
−s(f, g), respectively s(f, ig) = i s(f, g), finishes the proof.

    Note that the parallelogram law and the polarization identity even hold
for sesquilinear forms (Problem 0.14).
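    As a quick numerical check of the polarization identity (0.46) (a minimal
sketch, assuming NumPy; the random vectors in C⁵ are arbitrary test data), the
scalar product can indeed be recovered from the norm alone:

    import numpy as np

    rng = np.random.default_rng(1)
    f = rng.normal(size=5) + 1j * rng.normal(size=5)
    g = rng.normal(size=5) + 1j * rng.normal(size=5)
    n = np.linalg.norm

    # polarization identity (0.46); <.,..> is conjugate linear in the first argument
    s = 0.25 * (n(f + g)**2 - n(f - g)**2 + 1j * n(f - 1j*g)**2 - 1j * n(f + 1j*g)**2)
    print(s, np.vdot(f, g))                       # the two numbers agree up to rounding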
    But how do we define a scalar product on C(I)? One possibility is

                ⟨f, g⟩ = ∫_a^b f*(x) g(x) dx.                                            (0.47)

The corresponding inner product space is denoted by L²_cont(I). Note that
we have
                ‖f‖ ≤ √(b − a) ‖f‖∞                                                      (0.48)

and hence the maximum norm is stronger than the L²_cont norm.
    Suppose we have two norms ‖.‖1 and ‖.‖2 on a space X. Then ‖.‖2 is
said to be stronger than ‖.‖1 if there is a constant m > 0 such that

                        ‖f‖1 ≤ m ‖f‖2.                                                   (0.49)

It is straightforward to check the following.


Lemma 0.20. If ‖.‖2 is stronger than ‖.‖1, then any ‖.‖2 Cauchy sequence
is also a ‖.‖1 Cauchy sequence.

   Hence if a function F : X → Y is continuous in (X, ‖.‖1), it is also
continuous in (X, ‖.‖2) and if a set is dense in (X, ‖.‖2), it is also dense in
(X, ‖.‖1).
    In particular, L²_cont is separable. But is it also complete? Unfortunately
the answer is no:
Example. Take I = [0, 2] and define

                        ⎧ 0,               0 ≤ x ≤ 1 − 1/n,
              fn(x) =   ⎨ 1 + n(x − 1),    1 − 1/n ≤ x ≤ 1,                              (0.50)
                        ⎩ 1,               1 ≤ x ≤ 2.

Then fn(x) is a Cauchy sequence in L²_cont, but there is no limit in L²_cont!
Clearly the limit should be the step function which is 0 for 0 ≤ x < 1 and 1
for 1 ≤ x ≤ 2, but this step function is discontinuous (Problem 0.18)!
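    A crude numerical illustration of this example (a sketch, assuming NumPy;
the grid-based Riemann sum is an arbitrary stand-in for the integral): the L²
distances between the fn shrink, while the maximum-norm distance to the
discontinuous would-be limit stays of order one.

    import numpy as np

    x = np.linspace(0, 2, 20001)                     # grid on I = [0, 2]
    dx = x[1] - x[0]

    def f(n):
        return np.clip(1 + n * (x - 1), 0.0, 1.0)    # the piecewise linear f_n from (0.50)

    def l2(u):
        return np.sqrt(np.sum(np.abs(u)**2) * dx)    # Riemann-sum approximation of the L2 norm

    for n, m in [(10, 20), (100, 200), (1000, 2000)]:
        print(n, m, l2(f(n) - f(m)))                 # tends to 0: the sequence is Cauchy in L2

    step = (x >= 1).astype(float)                    # the discontinuous step function
    print(l2(f(1000) - step), np.max(np.abs(f(1000) - step)))
    # small in L2, but of order one in the maximum norm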

     This shows that in infinite dimensional spaces different norms will give
rise to different convergent sequences! In fact, the key to solving problems in
infinite dimensional spaces is often finding the right norm! This is something
which cannot happen in the finite dimensional case.
Theorem 0.21. If X is a finite dimensional space, then all norms are equiv-
alent. That is, for any two given norms ‖.‖1 and ‖.‖2, there are constants
m1 and m2 such that

                        (1/m2) ‖f‖1 ≤ ‖f‖2 ≤ m1 ‖f‖1.                                    (0.51)
Proof. Clearly we can choose a basis uj, 1 ≤ j ≤ n, and assume that ‖.‖2
is the usual Euclidean norm, ‖∑_j αj uj‖2² = ∑_j |αj|². Let f = ∑_j αj uj.
Then by the triangle and Cauchy–Schwarz inequalities

                ‖f‖1 ≤ ∑_j |αj| ‖uj‖1 ≤ √(∑_j ‖uj‖1²) ‖f‖2

and we can choose m2 = √(∑_j ‖uj‖1²).

    In particular, if fn is convergent with respect to ‖.‖2, it is also convergent
with respect to ‖.‖1. Thus ‖.‖1 is continuous with respect to ‖.‖2 and attains
its minimum m > 0 on the unit sphere (which is compact by the Heine–Borel
theorem). Now choose m1 = 1/m.

Problem 0.12. Show that the norm in a Hilbert space satisfies ‖f + g‖ =
‖f‖ + ‖g‖ if and only if f = αg, α ≥ 0, or g = 0.


Problem 0.13. Show that ℓ²(N) is a separable Hilbert space.

Problem 0.14. Suppose Q is a vector space. Let s(f, g) be a sesquilinear
form on Q and q(f ) = s(f, f ) the associated quadratic form. Prove the
parallelogram law

                      q(f + g) + q(f − g) = 2q(f ) + 2q(g)               (0.52)

and the polarization identity
      s(f, g) = ¼ ( q(f + g) − q(f − g) + i q(f − ig) − i q(f + ig) ).                   (0.53)
    Conversely, show that any quadratic form q(f ) : Q → R satisfying
q(αf ) = |α|2 q(f ) and the parallelogram law gives rise to a sesquilinear form
via the polarization identity.

Problem 0.15. A sesquilinear form is called bounded if

                        ‖s‖ =    sup    |s(f, g)|
                               ‖f‖=‖g‖=1

is finite. Similarly, the associated quadratic form q is bounded if

                        ‖q‖ = sup |q(f)|
                              ‖f‖=1

is finite. Show
                        ‖q‖ ≤ ‖s‖ ≤ 2 ‖q‖.
(Hint: Use the parallelogram law and the polarization identity from the pre-
vious problem.)

Problem 0.16. Suppose Q is a vector space. Let s(f, g) be a sesquilinear
form on Q and q(f ) = s(f, f ) the associated quadratic form. Show that the
Cauchy–Schwarz inequality

                            |s(f, g)| ≤ q(f )1/2 q(g)1/2                 (0.54)

holds if q(f ) ≥ 0.
    (Hint: Consider 0 ≤ q(f + αg) = q(f) + 2 Re(α s(f, g)) + |α|² q(g) and
choose α = t s(f, g)*/|s(f, g)| with t ∈ R.)

Problem 0.17. Show that the maximum norm (on C[0, 1]) does not satisfy
the parallelogram law.

Problem 0.18. Prove the claims made about fn , defined in (0.50), in the
last example.


0.4. Completeness
Since L²_cont is not complete, how can we obtain a Hilbert space from it?
Well, the answer is simple: take the completion.
    If X is an (incomplete) normed space, consider the set X̃ of all Cauchy
sequences. Call two Cauchy sequences equivalent if their difference con-
verges to zero and denote by X̄ the set of all equivalence classes. It is easy
to see that X̄ (and X̃) inherit the vector space structure from X. Moreover,

Lemma 0.22. If xn is a Cauchy sequence, then ‖xn‖ converges.

    Consequently the norm of a Cauchy sequence (xn)_{n=1}^∞ can be defined by
‖(xn)_{n=1}^∞‖ = lim_{n→∞} ‖xn‖ and is independent of the equivalence class (show
this!). Thus X̄ is a normed space (X̃ is not! Why?).
                 ¯
Theorem 0.23. X̄ is a Banach space containing X as a dense subspace if
we identify x ∈ X with the equivalence class of all sequences converging to
x.

Proof. (Outline) It remains to show that X̄ is complete. Let ξn = [(x_{n,j})_{j=1}^∞]
be a Cauchy sequence in X̄. Then it is not hard to see that ξ = [(x_{j,j})_{j=1}^∞]
is its limit.
                                       ¯
    Let me remark that the completion X̄ is unique. More precisely, any
other complete space which contains X as a dense subset is isomorphic to
X̄. This can for example be seen by showing that the identity map on X
has a unique extension to X̄ (compare Theorem 0.26 below).
    In particular it is no restriction to assume that a normed linear space
or an inner product space is complete. However, in the important case
of L²_cont it is somewhat inconvenient to work with equivalence classes of
Cauchy sequences and hence we will give a different characterization using
the Lebesgue integral later.

0.5. Bounded operators
A linear map A between two normed spaces X and Y will be called a (lin-
ear) operator
                         A : D(A) ⊆ X → Y.                       (0.55)
The linear subspace D(A) on which A is defined is called the domain of A
and is usually required to be dense. The kernel
                       Ker(A) = {f ∈ D(A)|Af = 0}                      (0.56)
and range
                    Ran(A) = {Af |f ∈ D(A)} = AD(A)                    (0.57)


are defined as usual. The operator A is called bounded if the operator
norm
                        ‖A‖ =  sup  ‖Af‖_Y                                               (0.58)
                              ‖f‖_X=1

is finite.
   The set of all bounded linear operators from X to Y is denoted by
L(X, Y ). If X = Y , we write L(X, X) = L(X).
Theorem 0.24. The space L(X, Y ) together with the operator norm (0.58)
is a normed space. It is a Banach space if Y is.

Proof. That (0.58) is indeed a norm is straightforward. If Y is complete and
An is a Cauchy sequence of operators, then An f converges to an element
g for every f. Define a new operator A via Af = g. By continuity of
the vector operations, A is linear and by continuity of the norm, ‖Af‖ =
lim_{n→∞} ‖An f‖ ≤ (lim_{n→∞} ‖An‖) ‖f‖, it is bounded. Furthermore, given
ε > 0 there is some N such that ‖An − Am‖ ≤ ε for n, m ≥ N and thus
‖An f − Am f‖ ≤ ε‖f‖. Taking the limit m → ∞, we see ‖An f − Af‖ ≤ ε‖f‖;
that is, An → A.

    By construction, a bounded operator is Lipschitz continuous,

                        ‖Af‖_Y ≤ ‖A‖ ‖f‖_X,                                              (0.59)

and hence continuous. The converse is also true:
Theorem 0.25. An operator A is bounded if and only if it is continuous.

Proof. Suppose A is continuous but not bounded. Then there is a sequence
of unit vectors un such that ‖Aun‖ ≥ n. Then fn = (1/n) un converges to 0 but
‖Afn‖ ≥ 1 does not converge to 0.

    Moreover, if A is bounded and densely defined, it is no restriction to
assume that it is defined on all of X.
Theorem 0.26 (B.L.T. theorem). Let A ∈ L(X, Y ) and let Y be a Banach
space. If D(A) is dense, there is a unique (continuous) extension of A to X
which has the same norm.

Proof. Since a bounded operator maps Cauchy sequences to Cauchy se-
quences, this extension can only be given by

                Af = lim_{n→∞} Afn,        fn ∈ D(A),    f ∈ X.

To show that this definition is independent of the sequence fn → f, let
gn → f be a second sequence and observe

                ‖Afn − Agn‖ = ‖A(fn − gn)‖ ≤ ‖A‖ ‖fn − gn‖ → 0.


From continuity of vector addition and scalar multiplication it follows that
our extension is linear. Finally, from continuity of the norm we conclude
that the norm does not increase.

    An operator in L(X, C) is called a bounded linear functional and the
space X* = L(X, C) is called the dual space of X. A sequence fn is said to
converge weakly, fn ⇀ f, if ℓ(fn) → ℓ(f) for every ℓ ∈ X*.
     The Banach space of bounded linear operators L(X) even has a multi-
plication given by composition. Clearly this multiplication satisfies

    (A + B)C = AC + BC,    A(B + C) = AB + AC,    A, B, C ∈ L(X)                         (0.60)

and

    (AB)C = A(BC),    α(AB) = (αA)B = A(αB),    α ∈ C.                                   (0.61)

Moreover, it is easy to see that we have

                        ‖AB‖ ≤ ‖A‖ ‖B‖.                                                  (0.62)

However, note that our multiplication is not commutative (unless X is one-
dimensional). We even have an identity, the identity operator I, satisfying
‖I‖ = 1.
    A Banach space together with a multiplication satisfying the above re-
quirements is called a Banach algebra. In particular, note that (0.62)
ensures that multiplication is continuous (Problem 0.22).

Problem 0.19. Show that the integral operator

                        (Kf)(x) = ∫_0^1 K(x, y) f(y) dy,

where K(x, y) ∈ C([0, 1] × [0, 1]), defined on D(K) = C[0, 1], is a bounded
operator both in X = C[0, 1] (max norm) and X = L²_cont(0, 1).
Problem 0.20. Show that the differential operator A = d/dx defined on
D(A) = C¹[0, 1] ⊂ C[0, 1] is an unbounded operator.

Problem 0.21. Show that ‖AB‖ ≤ ‖A‖ ‖B‖ for every A, B ∈ L(X).

Problem 0.22. Show that the multiplication in a Banach algebra X is con-
tinuous: xn → x and yn → y imply xn yn → xy.

Problem 0.23. Let

                        f(z) = ∑_{j=0}^∞ fj z^j,        |z| < R,


be a convergent power series with convergence radius R > 0. Suppose A is
a bounded operator with ‖A‖ < R. Show that

                        f(A) = ∑_{j=0}^∞ fj A^j

exists and defines a bounded linear operator (cf. Problem 0.9).

0.6. Lebesgue Lp spaces
For this section some basic facts about the Lebesgue integral are required.
The necessary background can be found in Appendix A. To begin with,
Sections A.1, A.3, and A.4 will be sufficient.
   We fix some σ-finite measure space (X, Σ, µ) and denote by Lp(X, dµ),
1 ≤ p, the set of all complex-valued measurable functions for which

                        ‖f‖_p = ( ∫_X |f|^p dµ )^{1/p}                                   (0.63)

is finite. First of all note that Lp(X, dµ) is a linear space, since |f + g|^p ≤
2^p max(|f|, |g|)^p ≤ 2^p max(|f|^p, |g|^p) ≤ 2^p (|f|^p + |g|^p). Of course our hope
is that Lp(X, dµ) is a Banach space. However, there is a small technical
problem (recall that a property is said to hold almost everywhere if the set
where it fails to hold is contained in a set of measure zero):
Lemma 0.27. Let f be measurable. Then

                        ∫_X |f|^p dµ = 0                                                 (0.64)

if and only if f(x) = 0 almost everywhere with respect to µ.

Proof. Observe that we have A = {x | f(x) ≠ 0} = ∪_n An, where An =
{x | |f(x)| > 1/n}. If ∫|f|^p dµ = 0, we must have µ(An) = 0 for every n and
hence µ(A) = lim_{n→∞} µ(An) = 0. The converse is obvious.

    Note that the proof also shows that if f is not 0 almost everywhere,
there is an ε > 0 such that µ({x | |f(x)| ≥ ε}) > 0.
Example. Let λ be the Lebesgue measure on R. Then the characteristic
function of the rationals χQ is zero a.e. (with respect to λ).
    Let Θ be the Dirac measure centered at 0. Then f (x) = 0 a.e. (with
respect to Θ) if and only if f (0) = 0.

     Thus ‖f‖_p = 0 only implies f(x) = 0 for almost every x, but not for all!
Hence ‖.‖_p is not a norm on Lp(X, dµ). The way out of this misery is to
identify functions which are equal almost everywhere: Let

               N(X, dµ) = {f | f(x) = 0 µ-almost everywhere}.                            (0.65)


Then N (X, dµ) is a linear subspace of Lp (X, dµ) and we can consider the
quotient space
                        Lp (X, dµ) = Lp (X, dµ)/N (X, dµ).              (0.66)
If dµ is the Lebesgue measure on X ⊆ Rⁿ, we simply write Lp(X). Observe
that ‖f‖_p is well-defined on Lp(X, dµ).
    Even though the elements of Lp (X, dµ) are, strictly speaking, equiva-
lence classes of functions, we will still call them functions for notational
convenience. However, note that for f ∈ Lp (X, dµ) the value f (x) is not
well-defined (unless there is a continuous representative and different con-
tinuous functions are in different equivalence classes, e.g., in the case of
Lebesgue measure).
    With this modification we are back in business since Lp (X, dµ) turns
out to be a Banach space. We will show this in the following sections.
     But before that let us also define L∞(X, dµ). It should be the set of
bounded measurable functions B(X) together with the sup norm. The only
problem is that if we want to identify functions equal almost everywhere, the
supremum is no longer independent of the equivalence class. The solution
is the essential supremum

                ‖f‖∞ = inf{C | µ({x | |f(x)| > C}) = 0}.                                 (0.67)

That is, C is an essential bound if |f(x)| ≤ C almost everywhere and the
essential supremum is the infimum over all essential bounds.
Example. If λ is the Lebesgue measure, then the essential sup of χQ with
respect to λ is 0. If Θ is the Dirac measure centered at 0, then the essential
sup of χQ with respect to Θ is 1 (since χQ (0) = 1, and x = 0 is the only
point which counts for Θ).

     As before we set

                        L∞(X, dµ) = B(X)/N(X, dµ)                                        (0.68)

and observe that ‖f‖∞ is independent of the equivalence class.
     If you wonder where the ∞ comes from, have a look at Problem 0.24.
    As a preparation for proving that Lp is a Banach space, we will need
Hölder's inequality, which plays a central role in the theory of Lp spaces.
In particular, it will imply Minkowski's inequality, which is just the triangle
inequality for Lp.

Theorem 0.28 (Hölder's inequality). Let p and q be dual indices; that is,

                                1/p + 1/q = 1                                            (0.69)


with 1 ≤ p ≤ ∞. If f ∈ Lp(X, dµ) and g ∈ Lq(X, dµ), then f g ∈ L¹(X, dµ)
and
                        ‖f g‖_1 ≤ ‖f‖_p ‖g‖_q.                                           (0.70)

Proof. The case p = 1, q = ∞ (respectively p = ∞, q = 1) follows directly
from the properties of the integral and hence it remains to consider 1 <
p, q < ∞.
    First of all it is no restriction to assume ‖f‖_p = ‖g‖_q = 1. Then, using
the elementary inequality (Problem 0.25)

                a^{1/p} b^{1/q} ≤ (1/p) a + (1/q) b,        a, b ≥ 0,                    (0.71)

with a = |f|^p and b = |g|^q and integrating over X gives

                ∫_X |f g| dµ ≤ (1/p) ∫_X |f|^p dµ + (1/q) ∫_X |g|^q dµ = 1

and finishes the proof.
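    For a quick sanity check of (0.70) (a minimal sketch, assuming NumPy; here
µ is counting measure on finitely many points and the dual pair p = 3, q = 3/2
is an arbitrary choice, not from the text):

    import numpy as np

    rng = np.random.default_rng(2)
    p, q = 3.0, 1.5                                  # dual indices: 1/3 + 2/3 = 1
    f = rng.normal(size=1000)
    g = rng.normal(size=1000)

    lhs = np.sum(np.abs(f * g))                      # ||f g||_1 for counting measure
    rhs = np.sum(np.abs(f)**p)**(1/p) * np.sum(np.abs(g)**q)**(1/q)
    print(lhs <= rhs, lhs, rhs)                      # Hoelder: lhs never exceeds rhs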

    As a consequence we also get

Theorem 0.29 (Minkowski's inequality). Let f, g ∈ Lp(X, dµ). Then

                        ‖f + g‖_p ≤ ‖f‖_p + ‖g‖_p.                                       (0.72)

Proof. Since the cases p = 1, ∞ are straightforward, we only consider 1 <
p < ∞. Using |f + g|^p ≤ |f| |f + g|^{p−1} + |g| |f + g|^{p−1}, we obtain from
Hölder's inequality (note (p − 1)q = p)

        ‖f + g‖_p^p ≤ ‖f‖_p ‖(f + g)^{p−1}‖_q + ‖g‖_p ‖(f + g)^{p−1}‖_q
                    = (‖f‖_p + ‖g‖_p) ‖f + g‖_p^{p−1}.

    This shows that Lp(X, dµ) is a normed linear space. Finally it remains
to show that Lp(X, dµ) is complete.

Theorem 0.30. The space Lp(X, dµ) is a Banach space.

Proof. Suppose fn is a Cauchy sequence. It suffices to show that some
subsequence converges (show this). Hence we can drop some terms such
that
                        ‖f_{n+1} − fn‖_p ≤ 1/2ⁿ.

Now consider gn = fn − f_{n−1} (set f0 = 0). Then

                        G(x) = ∑_{k=1}^∞ |gk(x)|


is in Lp. This follows from

                ‖ ∑_{k=1}^n |gk| ‖_p ≤ ∑_{k=1}^n ‖gk(x)‖_p ≤ ‖f1‖_p + 1

using the monotone convergence theorem. In particular, G(x) < ∞ almost
everywhere and the sum

                        ∑_{n=1}^∞ gn(x) = lim_{n→∞} fn(x)

is absolutely convergent for those x. Now let f(x) be this limit. Since
|f(x) − fn(x)|^p converges to zero almost everywhere and |f(x) − fn(x)|^p ≤
2^p G(x)^p ∈ L¹, dominated convergence shows ‖f − fn‖_p → 0.

      In particular, in the proof of the last theorem we have seen:

Corollary 0.31. If ‖fn − f‖_p → 0, then there is a subsequence which con-
verges pointwise almost everywhere.

   Note that the statement is not true in general without passing to a
subsequence (Problem 0.28).
    Using Hölder's inequality, we can also identify a class of bounded oper-
ators in Lp.

Lemma 0.32 (Schur criterion). Consider Lp(X, dµ) and Lp(Y, dν) and let
1/p + 1/q = 1. Suppose that K(x, y) is measurable and there are measurable
functions K1(x, y), K2(x, y) such that |K(x, y)| ≤ K1(x, y) K2(x, y) and

        ‖K1(x, .)‖_{Lq(Y,dν)} ≤ C1,        ‖K2(. , y)‖_{Lp(X,dµ)} ≤ C2                   (0.73)

for µ-almost every x, respectively, for ν-almost every y. Then the operator
K : Lp(Y, dν) → Lp(X, dµ) defined by

                        (Kf)(x) = ∫_Y K(x, y) f(y) dν(y)                                 (0.74)

for µ-almost every x is bounded with ‖K‖ ≤ C1 C2.

Proof. Choose f ∈ Lp(Y, dν). By Fubini's theorem, ∫_Y |K(x, y) f(y)| dν(y)
is measurable and by Hölder's inequality we have

        ∫_Y |K(x, y) f(y)| dν(y) ≤ ∫_Y K1(x, y) K2(x, y) |f(y)| dν(y)

                ≤ ( ∫_Y K1(x, y)^q dν(y) )^{1/q} ( ∫_Y |K2(x, y) f(y)|^p dν(y) )^{1/p}

                ≤ C1 ( ∫_Y |K2(x, y) f(y)|^p dν(y) )^{1/p}

(if K2(x, .) f(.) ∉ Lp(Y, dν), the inequality is trivially true). Now take this
inequality to the p'th power and integrate with respect to x using Fubini:

   ∫_X ( ∫_Y |K(x, y) f(y)| dν(y) )^p dµ(x) ≤ C1^p ∫_X ∫_Y |K2(x, y) f(y)|^p dν(y) dµ(x)

        = C1^p ∫_Y ∫_X |K2(x, y) f(y)|^p dµ(x) dν(y) ≤ C1^p C2^p ‖f‖_p^p.

Hence ∫_Y |K(x, y) f(y)| dν(y) ∈ Lp(X, dµ) and in particular it is finite for
µ-almost every x. Thus K(x, .) f(.) is ν-integrable for µ-almost every x and
∫_Y K(x, y) f(y) dν(y) is measurable.
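    The bound ‖K‖ ≤ C1C2 can be probed on a discretized kernel. The sketch
below (assuming NumPy; the exponential kernel, the uniform grid measure on
[0, 1] standing in for both µ and ν, and p = q = 2 are arbitrary toy choices)
takes K1 = K2 = √|K| and checks the estimate for a random f:

    import numpy as np

    n = 400
    x = (np.arange(n) + 0.5) / n                     # grid points in [0, 1]
    w = 1.0 / n                                      # weight of the grid measure
    p, q = 2.0, 2.0                                  # dual pair, 1/p + 1/q = 1

    K = np.exp(-np.abs(x[:, None] - x[None, :]))     # toy kernel K(x, y)
    K1 = K2 = np.sqrt(K)                             # |K| <= K1 * K2

    lp = lambda v: (np.sum(np.abs(v)**p) * w)**(1/p)
    lq = lambda v: (np.sum(np.abs(v)**q) * w)**(1/q)
    C1 = max(lq(K1[i, :]) for i in range(n))         # sup_x ||K1(x, .)||_q, cf. (0.73)
    C2 = max(lp(K2[:, j]) for j in range(n))         # sup_y ||K2(., y)||_p

    f = np.random.default_rng(3).normal(size=n)
    Kf = K @ f * w                                   # (Kf)(x) = sum_y K(x, y) f(y) dnu(y)
    print(lp(Kf) <= C1 * C2 * lp(f), lp(Kf), C1 * C2 * lp(f))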

      It even turns out that Lp is separable.

Lemma 0.33. Suppose X is a second countable topological space (i.e., it
has a countable basis) and µ is a regular Borel measure. Then Lp(X, dµ),
1 ≤ p < ∞, is separable.

Proof. The set of all characteristic functions χ_A(x) with A ∈ Σ and µ(A) <
∞ is total by construction of the integral. Now our strategy is as follows:
Using outer regularity, we can restrict A to open sets and using the existence
of a countable base, we can restrict A to open sets from this base.
    Fix A. By outer regularity, there is a decreasing sequence of open sets
On such that µ(On) → µ(A). Since µ(A) < ∞, it is no restriction to assume
µ(On) < ∞, and thus µ(On \ A) = µ(On) − µ(A) → 0. Now dominated
convergence implies ‖χ_A − χ_{On}‖_p → 0. Thus the set of all characteristic
functions χ_O(x) with O open and µ(O) < ∞ is total. Finally let B be a
countable basis for the topology. Then, every open set O can be written as
O = ∪_{j=1}^∞ Oj with Oj ∈ B. Moreover, by considering the set B̃ of all finite
unions of elements from B, it is no restriction to assume ∪_{j=1}^n Oj ∈ B̃. Hence
there is an increasing sequence Õn ↗ O with Õn ∈ B̃. By monotone con-
vergence, ‖χ_O − χ_{Õn}‖_p → 0 and hence the set of all characteristic functions
χ_Õ with Õ ∈ B̃ is total.

    To finish this chapter, let us show that continuous functions are dense
in Lp.

Theorem 0.34. Let X be a locally compact metric space and let µ be a
σ-finite regular Borel measure. Then the set C_c(X) of continuous functions
with compact support is dense in Lp(X, dµ), 1 ≤ p < ∞.

Proof. As in the previous proof the set of all characteristic functions χ_K(x)
with K compact is total (using inner regularity). Hence it suffices to show
that χ_K(x) can be approximated by continuous functions. By outer regu-
larity there is an open set O ⊃ K such that µ(O \ K) ≤ ε. By Urysohn's


lemma (Lemma 0.12) there is a continuous function fε which is 1 on K and
0 outside O. Since

                ∫_X |χ_K − fε|^p dµ = ∫_{O\K} |fε|^p dµ ≤ µ(O \ K) ≤ ε,

we have ‖fε − χ_K‖_p → 0 and we are done.

   If X is some subset of Rⁿ, we can do even better. A nonnegative function
u ∈ C_c^∞(Rⁿ) is called a mollifier if

                        ∫_{Rⁿ} u(x) dx = 1.                                              (0.75)

The standard mollifier is u(x) = exp(1/(|x|² − 1)) for |x| < 1 and u(x) = 0
otherwise.
    If we scale a mollifier according to uk(x) = kⁿ u(k x) such that its mass is
preserved (‖uk‖_1 = 1) and it concentrates more and more around the origin,

                (Figure: the rescaled mollifier uk peaking near the origin.)

we have the following result (Problem 0.29):
Lemma 0.35. Let u be a mollifier in Rⁿ and set uk(x) = kⁿ u(k x). Then
for any (uniformly) continuous function f : Rⁿ → C we have that

                        fk(x) = ∫_{Rⁿ} uk(x − y) f(y) dy                                 (0.76)

is in C^∞(Rⁿ) and converges to f (uniformly).
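    A one-dimensional numerical sketch of Lemma 0.35 (assuming NumPy; the
kink function |x|, the grid, and the values of k are arbitrary choices, not from
the text): convolving with uk smooths the kink and the uniform distance to f
shrinks as k grows.

    import numpy as np

    def mollifier(x):
        # the standard mollifier exp(1/(|x|^2 - 1)) for |x| < 1 and 0 otherwise
        v = np.zeros_like(x)
        inside = np.abs(x) < 1
        v[inside] = np.exp(1.0 / (x[inside]**2 - 1.0))
        return v

    x = np.linspace(-3, 3, 6001)
    dx = x[1] - x[0]
    f = np.abs(x)                                    # continuous but not smooth at 0

    for k in (2, 8, 32):
        uk = k * mollifier(k * x)                    # u_k(x) = k u(kx)
        uk /= uk.sum() * dx                          # normalize the discrete mass to 1, cf. (0.75)
        fk = np.convolve(f, uk, mode='same') * dx    # f_k = u_k * f, cf. (0.76)
        interior = np.abs(x) <= 2                    # avoid boundary artifacts of the finite grid
        print(k, np.max(np.abs(fk - f)[interior]))   # uniform error decreases with k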

     Now we are ready to prove
Theorem 0.36. If X ⊆ Rⁿ and µ is a regular Borel measure, then the set
C_c^∞(X) of all smooth functions with compact support is dense in Lp(X, dµ),
1 ≤ p < ∞.

Proof. By our previous result it suffices to show that any continuous func-
tion f(x) with compact support can be approximated by smooth ones. By
setting f(x) = 0 for x ∉ X, it is no restriction to assume X = Rⁿ. Now
choose a mollifier u and observe that fk has compact support (since f
has). Moreover, since f has compact support, it is uniformly continuous
and fk → f uniformly. But this implies fk → f in Lp.

    We say that f ∈ Lp_loc(X) if f ∈ Lp(K) for any compact subset K ⊂ X.

Lemma 0.37. Suppose f ∈ L¹_loc(Rⁿ). Then

                ∫_{Rⁿ} ϕ(x) f(x) dx = 0,        ∀ϕ ∈ C_c^∞(Rⁿ),                          (0.77)

if and only if f(x) = 0 (a.e.).

Proof. First of all we claim that for any bounded function g with compact
support K, there is a sequence of functions ϕn ∈ C_c^∞(Rⁿ) with support in
K which converges pointwise to g such that ‖ϕn‖∞ ≤ ‖g‖∞.
    To see this, take a sequence of continuous functions ϕn with support
in K which converges to g in L¹. To make sure that ‖ϕn‖∞ ≤ ‖g‖∞, just
set it equal to ‖g‖∞ whenever ϕn > ‖g‖∞ and equal to −‖g‖∞ whenever
ϕn < −‖g‖∞ (show that the resulting sequence still converges). Finally use
(0.76) to make ϕn smooth (note that this operation does not change the
range) and extract a pointwise convergent subsequence.
    Now let K be some compact set and choose g = sign(f) χ_K. Then

                ∫_K |f| dx = ∫_K f sign(f) dx = lim_{n→∞} ∫_K f ϕn dx = 0,

which shows f = 0 for a.e. x ∈ K. Since K is arbitrary, we are done.
Problem 0.24. Suppose µ(X) < ∞. Show that

                        lim_{p→∞} ‖f‖_p = ‖f‖∞

for any bounded measurable function.
Problem 0.25. Prove (0.71). (Hint: Take logarithms on both sides.)
Problem 0.26. Show the following generalization of Hölder's inequality:

                        ‖f g‖_r ≤ ‖f‖_p ‖g‖_q,                                           (0.78)

where 1/p + 1/q = 1/r.

Problem 0.27 (Lyapunov inequality). Let 0 < θ < 1. Show that if f ∈
L^{p1} ∩ L^{p2}, then f ∈ L^p and

                        ‖f‖_p ≤ ‖f‖_{p1}^θ ‖f‖_{p2}^{1−θ},                               (0.79)

where 1/p = θ/p1 + (1 − θ)/p2.


Problem 0.28. Find a sequence fn which converges to 0 in Lp([0, 1], dx) but
for which fn(x) → 0 for a.e. x ∈ [0, 1] does not hold. (Hint: Every n ∈ N can
be uniquely written as n = 2^m + k with 0 ≤ m and 0 ≤ k < 2^m. Now consider
the characteristic functions of the intervals I_{m,k} = [k 2^{−m}, (k + 1) 2^{−m}].)
Problem 0.29. Prove Lemma 0.35. (Hint: To show that fk is smooth, use
Problems A.7 and A.8.)
Problem 0.30. Construct a function f ∈ Lp(0, 1) which has a singularity at
every rational number in [0, 1]. (Hint: Start with the function f0(x) = |x|^{−α}
which has a single pole at 0. Then fj(x) = f0(x − xj) has a pole at xj.)

0.7. Appendix: The uniform boundedness principle
Recall that the interior of a set is the largest open subset (that is, the union
of all open subsets). A set is called nowhere dense if its closure has empty
interior. The key to several important theorems about Banach spaces is the
observation that a Banach space cannot be the countable union of nowhere
dense sets.
Theorem 0.38 (Baire category theorem). Let X be a complete metric space.
Then X cannot be the countable union of nowhere dense sets.

Proof. Suppose X = ∪_{n=1}^∞ Xn. We can assume that the sets Xn are closed
and none of them contains a ball; that is, X \ Xn is open and nonempty for
every n. We will construct a Cauchy sequence xn which stays away from all
Xn.
    Since X \ X1 is open and nonempty, there is a closed ball B_{r1}(x1) ⊆
X \ X1. Reducing r1 a little, we can even assume B_{r1}(x1) ⊆ X \ X1. More-
over, since X2 cannot contain B_{r1}(x1), there is some x2 ∈ B_{r1}(x1) that is
not in X2. Since B_{r1}(x1) ∩ (X \ X2) is open, there is a closed ball B_{r2}(x2) ⊆
B_{r1}(x1) ∩ (X \ X2). Proceeding by induction, we obtain a sequence of balls
such that
                B_{rn}(xn) ⊆ B_{r_{n−1}}(x_{n−1}) ∩ (X \ Xn).

Now observe that in every step we can choose rn as small as we please; hence
without loss of generality rn → 0. Since by construction xn ∈ B_{rN}(xN) for
n ≥ N, we conclude that xn is Cauchy and converges to some point x ∈ X.
But x ∈ B_{rn}(xn) ⊆ X \ Xn for every n, contradicting our assumption that
the Xn cover X.

    (Sets which can be written as the countable union of nowhere dense sets
are said to be of first category. All other sets are second category. Hence
we have the name category theorem.)


   In other words, if Xn ⊆ X is a sequence of closed subsets which cover
X, at least one Xn contains a ball of radius ε > 0.
   Now we come to the first important consequence, the uniform bound-
edness principle.
Theorem 0.39 (Banach–Steinhaus). Let X be a Banach space and Y some
normed linear space. Let {Aα} ⊆ L(X, Y) be a family of bounded operators.
Suppose ‖Aα x‖ ≤ C(x) is bounded for fixed x ∈ X. Then ‖Aα‖ ≤ C is
uniformly bounded.

Proof. Let

        Xn = {x | ‖Aα x‖ ≤ n for all α} = ∩_α {x | ‖Aα x‖ ≤ n}.

Then ∪_n Xn = X by assumption. Moreover, by continuity of Aα and the
norm, each Xn is an intersection of closed sets and hence closed. By Baire's
theorem at least one contains a ball of positive radius: B_ε(x0) ⊂ Xn. Now
observe

        ‖Aα y‖ ≤ ‖Aα(y + x0)‖ + ‖Aα x0‖ ≤ n + ‖Aα x0‖

for ‖y‖ < ε. Setting y = ε x/‖x‖, we obtain

                ‖Aα x‖ ≤ ((n + C(x0))/ε) ‖x‖

for any x.
Part 1

Mathematical
Foundations of
Quantum Mechanics
Chapter 1




Hilbert spaces


The phase space in classical mechanics is the Euclidean space R2n (for the n
position and n momentum coordinates). In quantum mechanics the phase
space is always a Hilbert space H. Hence the geometry of Hilbert spaces
stands at the outset of our investigations.


1.1. Hilbert spaces
Suppose H is a vector space. A map ., .. : H×H → C is called a sesquilinear
form if it is conjugate linear in the first argument and linear in the second.
A positive definite sesquilinear form is called an inner product or scalar
product. Associated with every scalar product is a norm

                               ψ =        ψ, ψ .                        (1.1)

The triangle inequality follows from the Cauchy–Schwarz–Bunjakowski
inequality:
                        |⟨ψ, ϕ⟩| ≤ ‖ψ‖ ‖ϕ‖                                               (1.2)

with equality if and only if ψ and ϕ are parallel.
    If H is complete with respect to the above norm, it is called a Hilbert
space. It is no restriction to assume that H is complete since one can easily
replace it by its completion.
Example. The space L²(M, dµ) is a Hilbert space with scalar product given
by
                        ⟨f, g⟩ = ∫_M f(x)* g(x) dµ(x).                                   (1.3)




Similarly, the set of all square summable sequences ℓ²(N) is a Hilbert space
with scalar product
                        ⟨f, g⟩ = ∑_{j∈N} fj* gj.                                         (1.4)

(Note that the second example is a special case of the first one; take M = R
and µ a sum of Dirac measures.)

    A vector ψ ∈ H is called normalized or a unit vector if ‖ψ‖ = 1.
Two vectors ψ, ϕ ∈ H are called orthogonal or perpendicular (ψ ⊥ ϕ) if
⟨ψ, ϕ⟩ = 0 and parallel if one is a multiple of the other.
    If ψ and ϕ are orthogonal, we have the Pythagorean theorem:

                ‖ψ + ϕ‖² = ‖ψ‖² + ‖ϕ‖²,        ψ ⊥ ϕ,                                    (1.5)

which is one line of computation.
    Suppose ϕ is a unit vector. Then the projection of ψ in the direction of
ϕ is given by
                                ψ∥ = ⟨ϕ, ψ⟩ ϕ                                            (1.6)
and ψ⊥ defined via
                                ψ⊥ = ψ − ⟨ϕ, ψ⟩ ϕ                                        (1.7)
is perpendicular to ϕ.
    These results can also be generalized to more than one vector. A set of
vectors {ϕj} is called an orthonormal set (ONS) if ⟨ϕj, ϕk⟩ = 0 for j ≠ k
and ⟨ϕj, ϕj⟩ = 1.

Lemma 1.1. Suppose {ϕj}_{j=0}^n is an orthonormal set. Then every ψ ∈ H
can be written as

                ψ = ψ∥ + ψ⊥,        ψ∥ = ∑_{j=0}^n ⟨ϕj, ψ⟩ ϕj,                           (1.8)

where ψ∥ and ψ⊥ are orthogonal. Moreover, ⟨ϕj, ψ⊥⟩ = 0 for all 0 ≤ j ≤ n.
In particular,
                ‖ψ‖² = ∑_{j=0}^n |⟨ϕj, ψ⟩|² + ‖ψ⊥‖².                                     (1.9)

Moreover, every ψ̂ in the span of {ϕj}_{j=0}^n satisfies

                                ‖ψ − ψ̂‖ ≥ ‖ψ⊥‖                                          (1.10)

with equality holding if and only if ψ̂ = ψ∥. In other words, ψ∥ is uniquely
characterized as the vector in the span of {ϕj}_{j=0}^n closest to ψ.


Proof. A straightforward calculation shows ⟨ϕj, ψ − ψ∥⟩ = 0 and hence ψ∥
and ψ⊥ = ψ − ψ∥ are orthogonal. The formula for the norm follows by
applying (1.5) iteratively.
    Now, fix a vector

                                ψ̂ = ∑_{j=0}^n cj ϕj

in the span of {ϕj}_{j=0}^n. Then one computes

        ‖ψ − ψ̂‖² = ‖ψ∥ + ψ⊥ − ψ̂‖² = ‖ψ⊥‖² + ‖ψ∥ − ψ̂‖²

                  = ‖ψ⊥‖² + ∑_{j=0}^n |cj − ⟨ϕj, ψ⟩|²

from which the last claim follows.

    From (1.9) we obtain Bessel's inequality

                ∑_{j=0}^n |⟨ϕj, ψ⟩|² ≤ ‖ψ‖²                                              (1.11)

with equality holding if and only if ψ lies in the span of {ϕj}_{j=0}^n.
    Recall that a scalar product can be recovered from its norm by virtue of
the polarization identity

        ⟨ϕ, ψ⟩ = ¼ ( ‖ϕ + ψ‖² − ‖ϕ − ψ‖² + i‖ϕ − iψ‖² − i‖ϕ + iψ‖² ).                    (1.12)
    A bijective linear operator U ∈ L(H1, H2) is called unitary if U preserves
scalar products:

                ⟨Uϕ, Uψ⟩_2 = ⟨ϕ, ψ⟩_1,        ϕ, ψ ∈ H1.                                 (1.13)

By the polarization identity this is the case if and only if U preserves norms:
‖Uψ‖_2 = ‖ψ‖_1 for all ψ ∈ H1. The two Hilbert spaces H1 and H2 are called
unitarily equivalent in this case.
Problem 1.1. The operator

        S : ℓ²(N) → ℓ²(N),        (a1, a2, a3, . . .) → (0, a1, a2, . . .)

satisfies ‖Sa‖ = ‖a‖. Is it unitary?

1.2. Orthonormal bases
Of course, since we cannot assume H to be a finite dimensional vector space,
we need to generalize Lemma 1.1 to arbitrary orthonormal sets {ϕj }j∈J .


We start by assuming that J is countable. Then Bessel's inequality (1.11)
shows that
                        ∑_{j∈J} |⟨ϕj, ψ⟩|²                                               (1.14)

converges absolutely. Moreover, for any finite subset K ⊂ J we have

                ‖ ∑_{j∈K} ⟨ϕj, ψ⟩ ϕj ‖² = ∑_{j∈K} |⟨ϕj, ψ⟩|²                             (1.15)

by the Pythagorean theorem and thus ∑_{j∈J} ⟨ϕj, ψ⟩ ϕj is Cauchy if and only
if ∑_{j∈J} |⟨ϕj, ψ⟩|² is. Now let J be arbitrary. Again, Bessel's inequality
shows that for any given ε > 0 there are at most finitely many j for which
|⟨ϕj, ψ⟩| ≥ ε. Hence there are at most countably many j for which |⟨ϕj, ψ⟩| >
0. Thus it follows that
                        ∑_{j∈J} |⟨ϕj, ψ⟩|²                                               (1.16)

is well-defined and so is
                        ∑_{j∈J} ⟨ϕj, ψ⟩ ϕj.                                              (1.17)

In particular, by continuity of the scalar product we see that Lemma 1.1
holds for arbitrary orthonormal sets without modifications.
Theorem 1.2. Suppose {ϕj}_{j∈J} is an orthonormal set. Then every ψ ∈ H
can be written as

                ψ = ψ∥ + ψ⊥,        ψ∥ = ∑_{j∈J} ⟨ϕj, ψ⟩ ϕj,                             (1.18)

where ψ∥ and ψ⊥ are orthogonal. Moreover, ⟨ϕj, ψ⊥⟩ = 0 for all j ∈ J. In
particular,
                ‖ψ‖² = ∑_{j∈J} |⟨ϕj, ψ⟩|² + ‖ψ⊥‖².                                       (1.19)

Moreover, every ψ̂ in the span of {ϕj}_{j∈J} satisfies

                                ‖ψ − ψ̂‖ ≥ ‖ψ⊥‖                                          (1.20)

with equality holding if and only if ψ̂ = ψ∥. In other words, ψ∥ is uniquely
characterized as the vector in the span of {ϕj}_{j∈J} closest to ψ.

    Note that from Bessel's inequality (which of course still holds) it follows
that the map ψ → ψ∥ is continuous.
   An orthonormal set which is not a proper subset of any other orthonor-
mal set is called an orthonormal basis (ONB) due to the following result:
Theorem 1.3. For an orthonormal set {ϕj }j∈J the following conditions are
equivalent:


      (i) {ϕj}_{j∈J} is a maximal orthonormal set.
     (ii) For every vector ψ ∈ H we have

                        ψ = ∑_{j∈J} ⟨ϕj, ψ⟩ ϕj.                                          (1.21)

    (iii) For every vector ψ ∈ H we have

                        ‖ψ‖² = ∑_{j∈J} |⟨ϕj, ψ⟩|².                                       (1.22)

    (iv) ⟨ϕj, ψ⟩ = 0 for all j ∈ J implies ψ = 0.

Proof. We will use the notation from Theorem 1.2.
(i) ⇒ (ii): If ψ⊥ ≠ 0, then we can normalize ψ⊥ to obtain a unit vector ψ̃⊥
which is orthogonal to all vectors ϕj. But then {ϕj}_{j∈J} ∪ {ψ̃⊥} would be a
larger orthonormal set, contradicting the maximality of {ϕj}_{j∈J}.
(ii) ⇒ (iii): This follows since (ii) implies ψ⊥ = 0.
(iii) ⇒ (iv): If ⟨ψ, ϕj⟩ = 0 for all j ∈ J, we conclude ‖ψ‖² = 0 and hence
ψ = 0.
(iv) ⇒ (i): If {ϕj}_{j∈J} were not maximal, there would be a unit vector ϕ
such that {ϕj}_{j∈J} ∪ {ϕ} is a larger orthonormal set. But ⟨ϕj, ϕ⟩ = 0 for all
j ∈ J implies ϕ = 0 by (iv), a contradiction.

    Since ψ → ψ∥ is continuous, it suffices to check conditions (ii) and (iii)
on a dense set.
Example. The set of functions

                ϕn(x) = (1/√(2π)) e^{i n x},        n ∈ Z,                               (1.23)

forms an orthonormal basis for H = L²(0, 2π). The corresponding orthogo-
nal expansion is just the ordinary Fourier series (Problem 1.20).
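    For a concrete function this can be checked numerically. The sketch below
(assuming NumPy; the test function, the quadrature grid, and the truncation at
|n| ≤ 50 are arbitrary choices) approximates the coefficients ⟨ϕn, f⟩ and compares
∑|⟨ϕn, f⟩|² with ‖f‖², cf. (1.22):

    import numpy as np

    x = np.linspace(0, 2*np.pi, 100001)
    dx = x[1] - x[0]
    f = x * (2*np.pi - x)                            # a test function in L^2(0, 2*pi)

    def inner(u, v):
        return np.sum(np.conj(u) * v) * dx           # <u, v>, conjugate linear in the first slot

    coeffs = [inner(np.exp(1j*n*x) / np.sqrt(2*np.pi), f) for n in range(-50, 51)]
    print(inner(f, f).real, sum(abs(c)**2 for c in coeffs))   # nearly equal; equal in the limit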

    A Hilbert space is separable if and only if there is a countable orthonor-
mal basis. In fact, if H is separable, then there exists a countable total set
{ψj}_{j=0}^N. Here N ∈ N if H is finite dimensional and N = ∞ otherwise. After
throwing away some vectors, we can assume that ψ_{n+1} cannot be expressed
as a linear combination of the vectors ψ0, . . . , ψn. Now we can construct
an orthonormal basis as follows: We begin by normalizing ψ0,

                                ϕ0 = ψ0 / ‖ψ0‖.                                          (1.24)

Next we take ψ1 and remove the component parallel to ϕ0 and normalize
again:
                ϕ1 = (ψ1 − ⟨ϕ0, ψ1⟩ ϕ0) / ‖ψ1 − ⟨ϕ0, ψ1⟩ ϕ0‖.                            (1.25)


Proceeding like this, we define recursively

        ϕn = (ψn − ∑_{j=0}^{n−1} ⟨ϕj, ψn⟩ ϕj) / ‖ψn − ∑_{j=0}^{n−1} ⟨ϕj, ψn⟩ ϕj‖.        (1.26)

This procedure is known as Gram–Schmidt orthogonalization. Hence
we obtain an orthonormal set {ϕj}_{j=0}^N such that span{ϕj}_{j=0}^n = span{ψj}_{j=0}^n
for any finite n and thus also for N (if N = ∞). Since {ψj}_{j=0}^N is total, so
is {ϕj}_{j=0}^N. Now suppose there is some ψ = ψ∥ + ψ⊥ ∈ H for which ψ⊥ ≠ 0.
Since {ϕj}_{j=0}^N is total, we can find a ψ̂ in its span such that ‖ψ − ψ̂‖ < ‖ψ⊥‖,
contradicting (1.20). Hence we infer that {ϕj}_{j=0}^N is an orthonormal basis.
Theorem 1.4. Every separable Hilbert space has a countable orthonormal
basis.

Example. In L²(−1, 1) we can orthogonalize the polynomials fn(x) = xⁿ.
The resulting polynomials are up to a normalization equal to the Legendre
polynomials

        P0(x) = 1,    P1(x) = x,    P2(x) = (3x² − 1)/2,    . . .                        (1.27)

(which are normalized such that Pn(1) = 1).
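    The recursion (1.26) is easy to run numerically. The sketch below (assuming
NumPy; the grid-based quadrature is a crude stand-in for the L²(−1, 1) inner
product) orthogonalizes the monomials xⁿ and rescales so that the value at x = 1
is 1, reproducing the Legendre polynomials (1.27):

    import numpy as np

    x = np.linspace(-1, 1, 20001)
    dx = x[1] - x[0]
    inner = lambda u, v: np.sum(u * v) * dx          # real L^2(-1, 1) inner product (quadrature)

    ons = []                                         # the orthonormal vectors phi_0, phi_1, ...
    for n in range(4):
        psi = x**n                                   # psi_n(x) = x^n
        phi = psi - sum(inner(p, psi) * p for p in ons)   # remove components along earlier phi_j
        ons.append(phi / np.sqrt(inner(phi, phi)))   # normalize, cf. (1.26)

    for n, phi in enumerate(ons):
        P = phi / phi[-1]                            # rescale so that P_n(1) = 1
        print(n, P[len(x)//2])                       # P_n(0): 1, 0, -0.5, 0, as for Legendre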

    In fact, if there is one countable basis, then it follows that any other basis
is countable as well.
Theorem 1.5. If H is separable, then every orthonormal basis is countable.

Proof. We know that there is at least one countable orthonormal basis
{ϕj}_{j∈J}. Now let {φk}_{k∈K} be a second basis and consider the set Kj =
{k ∈ K | ⟨φk, ϕj⟩ ≠ 0}. Since these are the expansion coefficients of ϕj with
respect to {φk}_{k∈K}, this set is countable. Hence the set K̃ = ∪_{j∈J} Kj is
countable as well. But k ∈ K \ K̃ implies φk = 0 and hence K̃ = K.

          We will assume all Hilbert spaces to be separable.

In particular, it can be shown that L²(M, dµ) is separable. Moreover, it
turns out that, up to unitary equivalence, there is only one (separable)
infinite dimensional Hilbert space:
    Let H be an infinite dimensional Hilbert space and let {ϕj}_{j∈N} be any
orthonormal basis. Then the map U : H → ℓ²(N), ψ → (⟨ϕj, ψ⟩)_{j∈N} is unitary
(by Theorem 1.3 (iii)). In particular,

Theorem 1.6. Any separable infinite dimensional Hilbert space is unitarily
equivalent to ℓ²(N).


    Let me remark that if H is not separable, there still exists an orthonor-
mal basis. However, the proof requires Zorn’s lemma: The collection of
all orthonormal sets in H can be partially ordered by inclusion. Moreover,
any linearly ordered chain has an upper bound (the union of all sets in the
chain). Hence Zorn’s lemma implies the existence of a maximal element,
that is, an orthonormal basis.
Problem 1.2. Let {ϕj} be some orthonormal basis. Show that a bounded
linear operator A is uniquely determined by its matrix elements Ajk =
⟨ϕj, Aϕk⟩ with respect to this basis.
Problem 1.3. Show that L(H) is not separable if H is infinite dimensional.

1.3. The projection theorem and the Riesz lemma
Let M ⊆ H be a subset. Then M⊥ = {ψ | ⟨ϕ, ψ⟩ = 0, ∀ϕ ∈ M} is called
the orthogonal complement of M. By continuity of the scalar prod-
uct it follows that M⊥ is a closed linear subspace and by linearity that
(span(M))⊥ = M⊥. For example we have H⊥ = {0} since any vector in H⊥
must be in particular orthogonal to all vectors in some orthonormal basis.
Theorem 1.7 (Projection theorem). Let M be a closed linear subspace of a
Hilbert space H. Then every ψ ∈ H can be uniquely written as ψ = ψ∥ + ψ⊥
with ψ∥ ∈ M and ψ⊥ ∈ M⊥. One writes
                               M ⊕ M⊥ = H                              (1.28)
in this situation.

Proof. Since M is closed, it is a Hilbert space and has an orthonormal basis
{ϕj }j∈J . Hence the result follows from Theorem 1.2.

    In other words, to every ψ ∈ H we can assign a unique vector ψ∥ which
is the vector in M closest to ψ. The rest, ψ − ψ∥, lies in M⊥. The operator
P_M ψ = ψ∥ is called the orthogonal projection corresponding to M. Note
that we have

        P_M² = P_M    and    ⟨P_M ψ, ϕ⟩ = ⟨ψ, P_M ϕ⟩                                     (1.29)

since ⟨P_M ψ, ϕ⟩ = ⟨ψ∥, ϕ∥⟩ = ⟨ψ, P_M ϕ⟩. Clearly we have P_{M⊥} ψ = ψ −
P_M ψ = ψ⊥. Furthermore, (1.29) uniquely characterizes orthogonal projec-
tions (Problem 1.6).
    Moreover, we see that the vectors in a closed subspace M are precisely
those which are orthogonal to all vectors in M ⊥ ; that is, M ⊥⊥ = M . If M
is an arbitrary subset, we have at least
                             M ⊥⊥ = span(M ).                          (1.30)


Note that by H⊥ = {0} we see that M ⊥ = {0} if and only if M is total.
   Finally we turn to linear functionals, that is, to operators ℓ : H →
C. By the Cauchy–Schwarz inequality we know that ℓ_ϕ : ψ → ⟨ϕ, ψ⟩ is a
bounded linear functional (with norm ‖ϕ‖). It turns out that in a Hilbert
space every bounded linear functional can be written in this way.
Theorem 1.8 (Riesz lemma). Suppose ℓ is a bounded linear functional on a
Hilbert space H. Then there is a unique vector ϕ ∈ H such that ℓ(ψ) = ⟨ϕ, ψ⟩
for all ψ ∈ H. In other words, a Hilbert space is equivalent to its own dual
space H* = H.

Proof. If ℓ ≡ 0, we can choose ϕ = 0. Otherwise Ker(ℓ) = {ψ | ℓ(ψ) = 0}
is a proper subspace and we can find a unit vector ϕ̃ ∈ Ker(ℓ)⊥. For every
ψ ∈ H we have ℓ(ψ)ϕ̃ − ℓ(ϕ̃)ψ ∈ Ker(ℓ) and hence

        0 = ⟨ϕ̃, ℓ(ψ)ϕ̃ − ℓ(ϕ̃)ψ⟩ = ℓ(ψ) − ℓ(ϕ̃) ⟨ϕ̃, ψ⟩.

In other words, we can choose ϕ = ℓ(ϕ̃)* ϕ̃. To see uniqueness, let ϕ1, ϕ2 be
two such vectors. Then ⟨ϕ1 − ϕ2, ψ⟩ = ⟨ϕ1, ψ⟩ − ⟨ϕ2, ψ⟩ = ℓ(ψ) − ℓ(ψ) = 0
for any ψ ∈ H, which shows ϕ1 − ϕ2 ∈ H⊥ = {0}.

     The following easy consequence is left as an exercise.
Corollary 1.9. Suppose s is a bounded sesquilinear form; that is,

                        |s(ψ, ϕ)| ≤ C ‖ψ‖ ‖ϕ‖.                                           (1.31)

Then there is a unique bounded operator A such that

                        s(ψ, ϕ) = ⟨Aψ, ϕ⟩.                                               (1.32)

Moreover, ‖A‖ ≤ C.

    Note that by the polarization identity (Problem 0.14), A is already
uniquely determined by its quadratic form q_A(ψ) = ⟨ψ, Aψ⟩.
Problem 1.4. Suppose U : H → H is unitary and M ⊆ H. Show that
U M ⊥ = (U M )⊥ .
Problem 1.5. Show that an orthogonal projection P_M ≠ 0 has norm one.
Problem 1.6. Suppose P ∈ L(H) satisfies

                P² = P    and    ⟨Pψ, ϕ⟩ = ⟨ψ, Pϕ⟩

and set M = Ran(P). Show
        • Pψ = ψ for ψ ∈ M and M is closed,
        • ϕ ∈ M⊥ implies Pϕ ∈ M⊥ and thus Pϕ = 0,
and conclude P = P_M.


Problem 1.7. Let P1, P2 be two orthogonal projections. Show that P1 ≤ P2
(that is, ⟨ψ, P1ψ⟩ ≤ ⟨ψ, P2ψ⟩) if and only if Ran(P1) ⊆ Ran(P2). Show in
this case that the two projections commute (that is, P1P2 = P2P1) and that
P2 − P1 is also a projection. (Hints: ‖Pj ψ‖ = ‖ψ‖ if and only if Pj ψ = ψ
and Ran(P1) ⊆ Ran(P2) if and only if P2P1 = P1.)
                                                                                  1
Problem 1.8. Show P : L2 (R) → L2 (R), f (x) →                                    2 (f (x)   + f (−x)) is a
projection. Compute its range and kernel.

Problem 1.9. Prove Corollary 1.9.

Problem 1.10. Consider the sesquilinear form

                B(f, g) = ∫_0^1 ( ∫_0^x f(t)* dt ) ( ∫_0^x g(t) dt ) dx

in L²(0, 1). Show that it is bounded and find the corresponding operator A.
(Hint: Partial integration.)


1.4. Orthogonal sums and tensor products
Given two Hilbert spaces H1 and H2 , we define their orthogonal sum
H1 ⊕ H2 to be the set of all pairs (ψ1 , ψ2 ) ∈ H1 × H2 together with the scalar
product
                  (ϕ1 , ϕ2 ), (ψ1 , ψ2 ) = ϕ1 , ψ1               1   + ϕ2 , ψ2 2 .                   (1.33)
It is left as an exercise to verify that H1 ⊕ H2 is again a Hilbert space.
Moreover, H1 can be identified with {(ψ1 , 0)|ψ1 ∈ H1 } and we can regard
H1 as a subspace of H1 ⊕ H2 , and similarly for H2 . It is also customary to
write ψ1 + ψ2 instead of (ψ1 , ψ2 ).
   More generally, let Hj , j ∈ N, be a countable collection of Hilbert spaces
and define
    ⊕_{j=1}^∞ Hj = { ⊕_{j=1}^∞ ψj | ψj ∈ Hj, Σ_{j=1}^∞ ‖ψj‖²_j < ∞ },    (1.34)
which becomes a Hilbert space with the scalar product
    ⟨ ⊕_{j=1}^∞ ϕj, ⊕_{j=1}^∞ ψj ⟩ = Σ_{j=1}^∞ ⟨ϕj, ψj⟩_j.    (1.35)

Example. ⊕_{j=1}^∞ C = ℓ²(N).
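
A quick numerical illustration of (1.33) (not from the book; it assumes NumPy): representing an element of H1 ⊕ H2 by concatenation, the scalar product of the sum is the sum of the two scalar products.

    import numpy as np

    rng = np.random.default_rng(0)
    cvec = lambda n: rng.standard_normal(n) + 1j * rng.standard_normal(n)
    phi1, psi1 = cvec(3), cvec(3)     # elements of H1 = C^3
    phi2, psi2 = cvec(5), cvec(5)     # elements of H2 = C^5

    phi = np.concatenate([phi1, phi2])    # (phi1, phi2) in H1 (+) H2
    psi = np.concatenate([psi1, psi2])

    # np.vdot conjugates its first argument, matching a scalar product antilinear in the first slot.
    assert np.isclose(np.vdot(phi, psi), np.vdot(phi1, psi1) + np.vdot(phi2, psi2))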
    Similarly, if H and H̃ are two Hilbert spaces, we define their tensor
product as follows: The elements should be products ψ ⊗ ψ̃ of elements ψ ∈ H
and ψ̃ ∈ H̃. Hence we start with the set of all finite linear combinations of
elements of H × H̃:
    F(H, H̃) = { Σ_{j=1}^n αj (ψj, ψ̃j) | (ψj, ψ̃j) ∈ H × H̃, αj ∈ C }.    (1.36)
Since we want (ψ1 + ψ2) ⊗ ψ̃ = ψ1 ⊗ ψ̃ + ψ2 ⊗ ψ̃, ψ ⊗ (ψ̃1 + ψ̃2) = ψ ⊗ ψ̃1 + ψ ⊗ ψ̃2,
and (αψ) ⊗ ψ̃ = ψ ⊗ (αψ̃), we consider F(H, H̃)/N(H, H̃), where
    N(H, H̃) = span{ Σ_{j,k=1}^n αj βk (ψj, ψ̃k) − ( Σ_{j=1}^n αj ψj, Σ_{k=1}^n βk ψ̃k ) }    (1.37)
and write ψ ⊗ ψ̃ for the equivalence class of (ψ, ψ̃).
     Next we define
    ⟨ψ ⊗ ψ̃, φ ⊗ φ̃⟩ = ⟨ψ, φ⟩ ⟨ψ̃, φ̃⟩,    (1.38)
which extends to a sesquilinear form on F(H, H̃)/N(H, H̃). To show that we
obtain a scalar product, we need to ensure positivity. Let ψ = Σ_i αi ψi ⊗ ψ̃i ≠ 0
and pick orthonormal bases ϕj, ϕ̃k for span{ψi}, span{ψ̃i}, respectively. Then
    ψ = Σ_{j,k} αjk ϕj ⊗ ϕ̃k,    αjk = Σ_i αi ⟨ϕj, ψi⟩ ⟨ϕ̃k, ψ̃i⟩    (1.39)
and we compute
    ⟨ψ, ψ⟩ = Σ_{j,k} |αjk|² > 0.    (1.40)
The completion of F(H, H̃)/N(H, H̃) with respect to the induced norm is
called the tensor product H ⊗ H̃ of H and H̃.
                                           ˜
                                                    ˜
Lemma 1.10. If ϕj, ϕ̃k are orthonormal bases for H, H̃, respectively, then
ϕj ⊗ ϕ̃k is an orthonormal basis for H ⊗ H̃.

Proof. That ϕj ⊗ ϕ̃k is an orthonormal set is immediate from (1.38). More-
over, since span{ϕj}, span{ϕ̃k} are dense in H, H̃, respectively, it is easy to
see that ϕj ⊗ ϕ̃k is dense in F(H, H̃)/N(H, H̃). But the latter is dense in
H ⊗ H̃. □

Example. We have H ⊗ Cn = Hn .
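
For finite-dimensional spaces the tensor product can be realized by the Kronecker product. A short sketch (not from the book; it assumes NumPy) checking (1.38) and Lemma 1.10 for H = C², H̃ = C³:

    import numpy as np

    rng = np.random.default_rng(1)
    cvec = lambda n: rng.standard_normal(n) + 1j * rng.standard_normal(n)
    psi, phi = cvec(2), cvec(2)       # elements of H = C^2
    psit, phit = cvec(3), cvec(3)     # elements of H~ = C^3

    # (1.38): <psi (x) psit, phi (x) phit> = <psi, phi> <psit, phit>
    lhs = np.vdot(np.kron(psi, psit), np.kron(phi, phit))
    assert np.isclose(lhs, np.vdot(psi, phi) * np.vdot(psit, phit))

    # Lemma 1.10: Kronecker products of two orthonormal bases give an orthonormal basis of C^6.
    B = np.array([np.kron(e, f) for e in np.eye(2) for f in np.eye(3)])
    assert np.allclose(B @ B.conj().T, np.eye(6))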
Example. Let (M, dµ) and (M̃, dµ̃) be two measure spaces. Then we have
L²(M, dµ) ⊗ L²(M̃, dµ̃) = L²(M × M̃, dµ × dµ̃).
     Clearly we have L²(M, dµ) ⊗ L²(M̃, dµ̃) ⊆ L²(M × M̃, dµ × dµ̃). Now
take an orthonormal basis ϕj ⊗ ϕ̃k for L²(M, dµ) ⊗ L²(M̃, dµ̃) as in our
previous lemma. Then
    ∫_M ∫_{M̃} (ϕj(x) ϕ̃k(y))∗ f(x, y) dµ(x) dµ̃(y) = 0    (1.41)
implies
    ∫_M ϕj(x)∗ fk(x) dµ(x) = 0,    fk(x) = ∫_{M̃} ϕ̃k(y)∗ f(x, y) dµ̃(y),    (1.42)
and hence fk(x) = 0 for µ-a.e. x. But this implies f(x, y) = 0 for µ-a.e. x and
µ̃-a.e. y and thus f = 0. Hence ϕj ⊗ ϕ̃k is a basis for L²(M × M̃, dµ × dµ̃)
and equality follows.

    It is straightforward to extend the tensor product to any finite number
of Hilbert spaces. We even note
    ( ⊕_{j=1}^∞ Hj ) ⊗ H = ⊕_{j=1}^∞ (Hj ⊗ H),    (1.43)
where equality has to be understood in the sense that both spaces are uni-
tarily equivalent by virtue of the identification
    ( ⊕_{j=1}^∞ ψj ) ⊗ ψ = ⊕_{j=1}^∞ ψj ⊗ ψ.    (1.44)

Problem 1.11. Show that ψ ⊗ ψ̃ = 0 if and only if ψ = 0 or ψ̃ = 0.

Problem 1.12. We have ψ ⊗ ψ̃ = φ ⊗ φ̃ ≠ 0 if and only if there is some
α ∈ C\{0} such that ψ = αφ and ψ̃ = α⁻¹φ̃.

Problem 1.13. Show (1.43).

1.5. The C ∗ algebra of bounded linear operators
We start by introducing a conjugation for operators on a Hilbert space H.
Let A ∈ L(H). Then the adjoint operator is defined via
    ⟨ϕ, A∗ψ⟩ = ⟨Aϕ, ψ⟩    (1.45)
(compare Corollary 1.9).

Example. If H = Cⁿ and A = (ajk)_{1≤j,k≤n}, then A∗ = (a∗_{kj})_{1≤j,k≤n}.
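
A quick numerical check (not part of the text; it assumes NumPy): for a random complex matrix the conjugate transpose satisfies (1.45), and the identities of Lemma 1.11 (iv) below hold with the operator norm (largest singular value).

    import numpy as np

    rng = np.random.default_rng(2)
    n = 4
    A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    Astar = A.conj().T
    phi = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    psi = rng.standard_normal(n) + 1j * rng.standard_normal(n)

    assert np.isclose(np.vdot(phi, Astar @ psi), np.vdot(A @ phi, psi))   # (1.45)

    opnorm = lambda M: np.linalg.norm(M, 2)        # operator norm = largest singular value
    assert np.isclose(opnorm(A), opnorm(Astar))                # ||A|| = ||A*||
    assert np.isclose(opnorm(A) ** 2, opnorm(Astar @ A))       # ||A||^2 = ||A* A||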

Lemma 1.11. Let A, B ∈ L(H). Then
      (i) (A + B)∗ = A∗ + B∗,    (αA)∗ = α∗A∗,
     (ii) A∗∗ = A,
    (iii) (AB)∗ = B∗A∗,
    (iv) ‖A‖ = ‖A∗‖ and ‖A‖² = ‖A∗A‖ = ‖AA∗‖.

Proof. (i) and (ii) are obvious. (iii) follows from ⟨ϕ, (AB)ψ⟩ = ⟨A∗ϕ, Bψ⟩ =
⟨B∗A∗ϕ, ψ⟩. (iv) follows from
    ‖A∗‖ = sup_{‖ϕ‖=‖ψ‖=1} |⟨ψ, A∗ϕ⟩| = sup_{‖ϕ‖=‖ψ‖=1} |⟨Aψ, ϕ⟩| = ‖A‖
and
    ‖A∗A‖ = sup_{‖ϕ‖=‖ψ‖=1} |⟨ϕ, A∗Aψ⟩| = sup_{‖ϕ‖=‖ψ‖=1} |⟨Aϕ, Aψ⟩|
          = sup_{‖ϕ‖=1} ‖Aϕ‖² = ‖A‖²,
where we have used ‖ϕ‖ = sup_{‖ψ‖=1} |⟨ψ, ϕ⟩|. □

    As a consequence of ‖A∗‖ = ‖A‖, observe that taking the adjoint is
continuous.
      In general, a Banach algebra A together with an involution

      (a + b)∗ = a∗ + b∗ ,         (αa)∗ = α∗ a∗ ,         a∗∗ = a,        (ab)∗ = b∗ a∗         (1.46)

satisfying
    ‖a‖² = ‖a∗a‖    (1.47)
is called a C∗ algebra. The element a∗ is called the adjoint of a. Note that
‖a∗‖ = ‖a‖ follows from (1.47) and ‖aa∗‖ ≤ ‖a‖ ‖a∗‖.
    Any subalgebra which is also closed under involution is called a ∗-
subalgebra. An ideal is a subspace I ⊆ A such that a ∈ I, b ∈ A imply
ab ∈ I and ba ∈ I. If it is closed under the adjoint map, it is called a ∗-ideal.
Note that if there is an identity e, we have e∗ = e and hence (a−1 )∗ = (a∗ )−1
(show this).
Example. The continuous functions C(I) together with complex conjuga-
tion form a commutative C ∗ algebra.

    An element a ∈ A is called normal if aa∗ = a∗ a, self-adjoint if a = a∗ ,
unitary if aa∗ = a∗ a = I, an (orthogonal) projection if a = a∗ = a2 , and
positive if a = bb∗ for some b ∈ A. Clearly both self-adjoint and unitary
elements are normal.

Problem 1.14. Let A ∈ L(H). Show that A is normal if and only if
    ‖Aψ‖ = ‖A∗ψ‖,    ∀ψ ∈ H.
(Hint: Problem 0.14.)

Problem 1.15. Show that U : H → H is unitary if and only if U −1 = U ∗ .

Problem 1.16. Compute the adjoint of
    S : ℓ²(N) → ℓ²(N),    (a1, a2, a3, . . . ) ↦ (0, a1, a2, . . . ).
1.6. Weak and strong convergence
Sometimes a weaker notion of convergence is useful: We say that ψn con-
verges weakly to ψ and write
    w-lim_{n→∞} ψn = ψ    or    ψn ⇀ ψ    (1.48)
if ⟨ϕ, ψn⟩ → ⟨ϕ, ψ⟩ for every ϕ ∈ H (show that a weak limit is unique).

Example. Let ϕn be an (infinite) orthonormal set. Then ⟨ψ, ϕn⟩ → 0 for
every ψ since these are just the expansion coefficients of ψ. (ϕn does not
converge to 0, since ‖ϕn‖ = 1.)

    Clearly ψn → ψ implies ψn ⇀ ψ and hence this notion of convergence is
indeed weaker. Moreover, the weak limit is unique, since ⟨ϕ, ψn⟩ → ⟨ϕ, ψ⟩
and ⟨ϕ, ψn⟩ → ⟨ϕ, ψ̃⟩ imply ⟨ϕ, (ψ − ψ̃)⟩ = 0. A sequence ψn is called a
weak Cauchy sequence if ⟨ϕ, ψn⟩ is Cauchy for every ϕ ∈ H.

Lemma 1.12. Let H be a Hilbert space.
      (i) ψn ⇀ ψ implies ‖ψ‖ ≤ lim inf ‖ψn‖.
     (ii) Every weak Cauchy sequence ψn is bounded: ‖ψn‖ ≤ C.
    (iii) Every weak Cauchy sequence converges weakly.
    (iv) For a weakly convergent sequence ψn ⇀ ψ we have ψn → ψ if and
         only if lim sup ‖ψn‖ ≤ ‖ψ‖.

Proof. (i) Observe
    ‖ψ‖² = ⟨ψ, ψ⟩ = lim inf ⟨ψ, ψn⟩ ≤ ‖ψ‖ lim inf ‖ψn‖.
(ii) For every ϕ we have that |⟨ϕ, ψn⟩| ≤ C(ϕ) is bounded. Hence by the
uniform boundedness principle we have ‖ψn‖ = ‖⟨ψn, .⟩‖ ≤ C.
(iii) Let ϕm be an orthonormal basis and define cm = lim_{n→∞} ⟨ϕm, ψn⟩.
Then ψ = Σ_m cm ϕm is the desired limit.
(iv) By (i) we have lim ‖ψn‖ = ‖ψ‖ and hence
    ‖ψ − ψn‖² = ‖ψ‖² − 2 Re(⟨ψ, ψn⟩) + ‖ψn‖² → 0.
The converse is straightforward. □

   Clearly an orthonormal basis does not have a norm convergent subse-
quence. Hence the unit ball in an infinite dimensional Hilbert space is never
compact. However, we can at least extract weakly convergent subsequences:

Lemma 1.13. Let H be a Hilbert space. Every bounded sequence ψn has a
weakly convergent subsequence.
Proof. Let ϕk be an orthonormal basis. Then by the usual diagonal se-
quence argument we can find a subsequence ψnm such that ϕk , ψnm con-
verges for all k. Since ψn is bounded, ϕ, ψnm converges for every ϕ ∈ H
and hence ψnm is a weak Cauchy sequence.

    Finally, let me remark that similar concepts can be introduced for oper-
ators. This is of particular importance for the case of unbounded operators,
where convergence in the operator norm makes no sense at all.
     A sequence of operators An is said to converge strongly to A,
    s-lim_{n→∞} An = A    :⇔    Anψ → Aψ    ∀ψ ∈ D(A) ⊆ D(An).    (1.49)
It is said to converge weakly to A,
    w-lim_{n→∞} An = A    :⇔    Anψ ⇀ Aψ    ∀ψ ∈ D(A) ⊆ D(An).    (1.50)

Clearly norm convergence implies strong convergence and strong conver-
gence implies weak convergence.
Example. Consider the operator Sn ∈ L(ℓ²(N)) which shifts a sequence n
places to the left, that is,
    Sn(x1, x2, . . . ) = (xn+1, xn+2, . . . ),    (1.51)
and the operator Sn∗ ∈ L(ℓ²(N)) which shifts a sequence n places to the right
and fills up the first n places with zeros, that is,
    Sn∗(x1, x2, . . . ) = (0, . . . , 0, x1, x2, . . . )    (n zeros in front).    (1.52)
Then Sn converges to zero strongly but not in norm (since ‖Sn‖ = 1) and
Sn∗ converges weakly to zero (since ⟨ϕ, Sn∗ψ⟩ = ⟨Snϕ, ψ⟩) but not strongly
(since ‖Sn∗ψ‖ = ‖ψ‖).
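
A small sketch of this example for finitely supported sequences (not from the book; it assumes NumPy): ‖Snψ‖ drops to zero once n exceeds the support (strong convergence to 0), while ‖Sn∗ψ‖ = ‖ψ‖ for every n, so Sn∗ cannot converge strongly.

    import numpy as np

    def S(n, x):         # shift n places to the left, cf. (1.51)
        return x[n:]

    def S_star(n, x):    # shift n places to the right, filling with zeros, cf. (1.52)
        return np.concatenate([np.zeros(n), x])

    psi = np.array([3.0, 1.0, 4.0, 1.0, 5.0])    # a finitely supported element of l^2(N)
    for n in (1, 3, 10):
        print(n, np.linalg.norm(S(n, psi)), np.linalg.norm(S_star(n, psi)))
    # The first norm is 0 for n >= 5, the second always equals ||psi||, even though
    # <phi, S_n* psi> = <S_n phi, psi> -> 0 for every fixed finitely supported phi.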

    Note that this example also shows that taking adjoints is not continuous
with respect to strong convergence! If An →s A, we only have
    ⟨ϕ, An∗ψ⟩ = ⟨Anϕ, ψ⟩ → ⟨Aϕ, ψ⟩ = ⟨ϕ, A∗ψ⟩    (1.53)
and hence An∗ ⇀ A∗ in general. However, if An and A are normal, we have
    ‖(An − A)∗ψ‖ = ‖(An − A)ψ‖    (1.54)
and hence An∗ →s A∗ in this case. Thus at least for normal operators taking
adjoints is continuous with respect to strong convergence.
Lemma 1.14. Suppose An is a sequence of bounded operators.
      (i) s-lim_{n→∞} An = A implies ‖A‖ ≤ lim inf_{n→∞} ‖An‖.
     (ii) Every strong Cauchy sequence An is bounded: ‖An‖ ≤ C.
    (iii) If Anψ → Aψ for ψ in some dense set and ‖An‖ ≤ C, then
          s-lim_{n→∞} An = A.
The same result holds if strong convergence is replaced by weak convergence.

Proof. (i) follows from
    ‖Aψ‖ = lim ‖Anψ‖ ≤ lim inf ‖An‖
for every ψ with ‖ψ‖ = 1.
(ii) follows as in Lemma 1.12 (ii).
(iii) Just use
    ‖Anψ − Aψ‖ ≤ ‖Anψ − Anϕ‖ + ‖Anϕ − Aϕ‖ + ‖Aϕ − Aψ‖
              ≤ 2C ‖ψ − ϕ‖ + ‖Anϕ − Aϕ‖
and choose ϕ in the dense subspace such that ‖ψ − ϕ‖ ≤ ε/(4C) and n large
such that ‖Anϕ − Aϕ‖ ≤ ε/2.
    The case of weak convergence is left as an exercise. (Hint: (2.14).) □
Problem 1.17. Suppose ψn → ψ and ϕn ⇀ ϕ. Then ⟨ψn, ϕn⟩ → ⟨ψ, ϕ⟩.

Problem 1.18. Let {ϕj}_{j=1}^∞ be some orthonormal basis. Show that ψn ⇀ ψ
if and only if ψn is bounded and ⟨ϕj, ψn⟩ → ⟨ϕj, ψ⟩ for every j. Show that
this is wrong without the boundedness assumption.

Problem 1.19. A subspace M ⊆ H is closed if and only if every weak
Cauchy sequence in M has a limit in M. (Hint: M = M⊥⊥.)

1.7. Appendix: The Stone–Weierstraß theorem
In case of a self-adjoint operator, the spectral theorem will show that the
closed ∗-subalgebra generated by this operator is isomorphic to the C ∗ al-
gebra of continuous functions C(K) over some compact set. Hence it is
important to be able to identify dense sets:
Theorem 1.15 (Stone–Weierstraß, real version). Suppose K is a compact
set and let C(K, R) be the Banach algebra of continuous functions (with the
sup norm).
    If F ⊂ C(K, R) contains the identity 1 and separates points (i.e., for
every x1 ≠ x2 there is some function f ∈ F such that f(x1) ≠ f(x2)), then
the algebra generated by F is dense.

Proof. Denote by A the algebra generated by F. Note that if f ∈ A, we
have |f| ∈ A: By the Weierstraß approximation theorem (Theorem 0.15)
there is a polynomial pn(t) such that | |t| − pn(t) | < 1/n for t ∈ f(K) and
hence pn(f) → |f|.
     In particular, if f, g are in A, we also have
    max{f, g} = ((f + g) + |f − g|)/2,    min{f, g} = ((f + g) − |f − g|)/2
in A.
     Now fix f ∈ C(K, R). We need to find some fε ∈ A with ‖f − fε‖∞ < ε.
    First of all, since A separates points, observe that for given y, z ∈ K
there is a function fy,z ∈ A such that fy,z(y) = f(y) and fy,z(z) = f(z)
(show this). Next, for every y ∈ K there is a neighborhood U(y) such that
    fy,z(x) > f(x) − ε,    x ∈ U(y),
and since K is compact, finitely many, say U(y1), . . . , U(yj), cover K. Then
    fz = max{fy1,z, . . . , fyj,z} ∈ A
and satisfies fz > f − ε by construction. Since fz(z) = f(z) for every z ∈ K,
there is a neighborhood V(z) such that
    fz(x) < f(x) + ε,    x ∈ V(z),
and a corresponding finite cover V(z1), . . . , V(zk). Now
    fε = min{fz1, . . . , fzk} ∈ A
satisfies fε < f + ε. Since f − ε < fzl for every l, also f − ε < fε, and we have
found a required function. □
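
The key step, uniform polynomial approximation of |t|, is easy to see numerically. A brief sketch (not from the book; it assumes NumPy) using Chebyshev interpolation on [−1, 1]; the sup-norm error shrinks as the degree grows, illustrating pn(f) → |f|.

    import numpy as np
    from numpy.polynomial import Chebyshev

    t = np.linspace(-1, 1, 2001)
    for deg in (4, 16, 64):
        p = Chebyshev.interpolate(np.abs, deg, domain=[-1, 1])
        print(deg, np.max(np.abs(np.abs(t) - p(t))))    # sup-norm error on [-1, 1]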
Theorem 1.16 (Stone–Weierstraß). Suppose K is a compact set and let
C(K) be the C ∗ algebra of continuous functions (with the sup norm).
    If F ⊂ C(K) contains the identity 1 and separates points, then the ∗-
subalgebra generated by F is dense.

Proof. Just observe that F̃ = {Re(f), Im(f) | f ∈ F} satisfies the assump-
tion of the real version. Hence any real-valued continuous function can be
approximated by elements from F̃; in particular, this holds for the real and
imaginary parts of any given complex-valued function. □

    Note that the additional requirement of being closed under complex
conjugation is crucial: The functions holomorphic on the unit ball and con-
tinuous on the boundary separate points, but they are not dense (since the
uniform limit of holomorphic functions is again holomorphic).
Corollary 1.17. Suppose K is a compact set and let C(K) be the C ∗ algebra
of continuous functions (with the sup norm).
    If F ⊂ C(K) separates points, then the closure of the ∗-subalgebra gen-
erated by F is either C(K) or {f ∈ C(K)|f (t0 ) = 0} for some t0 ∈ K.
Proof. There are two possibilities: either all f ∈ F vanish at one point
t0 ∈ K (there can be at most one such point since F separates points)
or there is no such point. If there is no such point, we can proceed as in
the proof of the Stone–Weierstraß theorem to show that the identity can
be approximated by elements in A (note that to show |f | ∈ A if f ∈ A,
we do not need the identity, since pn can be chosen to contain no constant
term). If there is such a t0 , the identity is clearly missing from A. However,
adding the identity to A, we get A + C = C(K) and it is easy to see that
A = {f ∈ C(K)|f (t0 ) = 0}.
Problem 1.20. Show that the functions ϕn(x) = (1/√(2π)) e^{inx}, n ∈ Z, form an
orthonormal basis for H = L²(0, 2π).

Problem 1.21. Let k ∈ N and I ⊆ R. Show that the ∗-subalgebra generated
by fz0(t) = 1/(t − z0)^k for one z0 ∈ C is dense in the C∗ algebra C∞(I) of
continuous functions vanishing at infinity
       • for I = R if z0 ∈ C\R and k = 1, 2,
       • for I = [a, ∞) if z0 ∈ (−∞, a) and any k,
       • for I = (−∞, a] ∪ [b, ∞) if z0 ∈ (a, b) and k odd.
(Hint: Add ∞ to R to make it compact.)
Chapter 2




Self-adjointness and
spectrum


2.1. Some quantum mechanics
In quantum mechanics, a single particle living in R3 is described by a
complex-valued function (the wave function)
                          ψ(x, t),        (x, t) ∈ R3 × R,                        (2.1)
where x corresponds to a point in space and t corresponds to time. The
quantity ρt (x) = |ψ(x, t)|2 is interpreted as the probability density of the
particle at the time t. In particular, ψ must be normalized according to

    ∫_{R³} |ψ(x, t)|² d³x = 1,    t ∈ R.    (2.2)

The location x of the particle is a quantity which can be observed (i.e.,
measured) and is hence called observable. Due to our probabilistic inter-
pretation, it is also a random variable whose expectation is given by

    Eψ(x) = ∫_{R³} x |ψ(x, t)|² d³x.    (2.3)

In a real life setting, it will not be possible to measure x directly and one will
only be able to measure certain functions of x. For example, it is possible to
check whether the particle is inside a certain area Ω of space (e.g., inside a
detector). The corresponding observable is the characteristic function χΩ (x)
of this set. In particular, the number

    Eψ(χΩ) = ∫_{R³} χΩ(x) |ψ(x, t)|² d³x = ∫_Ω |ψ(x, t)|² d³x    (2.4)


corresponds to the probability of finding the particle inside Ω ⊆ R3 . An
important point to observe is that, in contradistinction to classical mechan-
ics, the particle is no longer localized at a certain point. In particular,
the mean-square deviation (or variance) ∆ψ (x)2 = Eψ (x2 ) − Eψ (x)2 is
always nonzero.
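
A short one-dimensional sketch (not part of the book; it assumes NumPy): for a normalized Gaussian wave packet the expectation (2.3) sits at the center of the packet and the variance is strictly positive.

    import numpy as np

    x, dx = np.linspace(-20, 20, 4001, retstep=True)
    x0, sigma = 1.5, 0.7
    psi = np.exp(-(x - x0) ** 2 / (4 * sigma ** 2))      # Gaussian wave packet
    psi /= np.sqrt(np.sum(np.abs(psi) ** 2) * dx)        # normalize as in (2.2)

    rho = np.abs(psi) ** 2                               # probability density
    E_x = np.sum(x * rho) * dx                           # expectation, cf. (2.3)
    Var_x = np.sum(x ** 2 * rho) * dx - E_x ** 2         # mean-square deviation
    print(E_x, Var_x)                                    # approximately x0 and sigma^2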
    In general, the configuration space (or phase space) of a quantum
system is a (complex) Hilbert space H and the possible states of this system
are represented by the elements ψ having norm one, ‖ψ‖ = 1.
   An observable a corresponds to a linear operator A in this Hilbert space
and its expectation, if the system is in the state ψ, is given by the real
number
    Eψ(A) = ⟨ψ, Aψ⟩ = ⟨Aψ, ψ⟩,    (2.5)
where ⟨·, ·⟩ denotes the scalar product of H. Similarly, the mean-square
deviation is given by
    ∆ψ(A)² = Eψ(A²) − Eψ(A)² = ‖(A − Eψ(A))ψ‖².    (2.6)
Note that ∆ψ (A) vanishes if and only if ψ is an eigenstate corresponding to
the eigenvalue Eψ (A); that is, Aψ = Eψ (A)ψ.
    From a physical point of view, (2.5) should make sense for any ψ ∈ H.
However, this is not in the cards as our simple example of one particle already
shows. In fact, the reader is invited to find a square integrable function ψ(x)
for which xψ(x) is no longer square integrable. The deeper reason behind
this nuisance is that Eψ (x) can attain arbitrarily large values if the particle
is not confined to a finite domain, which renders the corresponding opera-
tor unbounded. But unbounded operators cannot be defined on the entire
Hilbert space in a natural way by the closed graph theorem (Theorem 2.8
below).
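For instance (an illustration, not spelled out in the text), ψ(x) = (1 + |x|)⁻¹ is square integrable over R since ∫_R (1 + |x|)⁻² dx = 2, while xψ(x) = x/(1 + |x|) tends to ±1 as x → ±∞ and hence fails to be square integrable.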
    Hence, A will only be defined on a subset D(A) ⊆ H called the domain
of A. Since we want A to be defined for at least most states, we require
D(A) to be dense.
    However, it should be noted that there is no general prescription for how
to find the operator corresponding to a given observable.
    Now let us turn to the time evolution of such a quantum mechanical
system. Given an initial state ψ(0) of the system, there should be a unique
ψ(t) representing the state of the system at time t ∈ R. We will write
                                 ψ(t) = U (t)ψ(0).                              (2.7)
Moreover, it follows from physical experiments that superposition of states
holds; that is, U (t)(α1 ψ1 (0) + α2 ψ2 (0)) = α1 ψ1 (t) + α2 ψ2 (t) (|α1 |2 + |α2 |2 =
1). In other words, U (t) should be a linear operator. Moreover, since ψ(t)
is a state (i.e., ‖ψ(t)‖ = 1), we have
    ‖U(t)ψ‖ = ‖ψ‖.    (2.8)
Such operators are called unitary. Next, since we have assumed uniqueness
of solutions to the initial value problem, we must have
                    U (0) = I,     U (t + s) = U (t)U (s).            (2.9)
A family of unitary operators U (t) having this property is called a one-
parameter unitary group. In addition, it is natural to assume that this
group is strongly continuous; that is,
                      lim U (t)ψ = U (t0 )ψ,     ψ ∈ H.              (2.10)
                     t→t0

Each such group has an infinitesimal generator defined by
    Hψ = lim_{t→0} (i/t)(U(t)ψ − ψ),    D(H) = {ψ ∈ H | lim_{t→0} (1/t)(U(t)ψ − ψ) exists}.    (2.11)
This operator is called the Hamiltonian and corresponds to the energy of
the system. If ψ(0) ∈ D(H), then ψ(t) is a solution of the Schr¨dinger
                                                                o
equation (in suitable units)
                              d
                             i  ψ(t) = Hψ(t).                        (2.12)
                             dt
This equation will be the main subject of our course.
   In summary, we have the following axioms of quantum mechanics.

    Axiom 1. The configuration space of a quantum system is a complex
separable Hilbert space H and the possible states of this system are repre-
sented by the elements of H which have norm one.
    Axiom 2. Each observable a corresponds to a linear operator A defined
maximally on a dense subset D(A). Moreover, the operator correspond-
ing to a polynomial Pn(a) = Σ_{j=0}^n αj a^j, αj ∈ R, is Pn(A) = Σ_{j=0}^n αj A^j,
D(Pn(A)) = D(Aⁿ) = {ψ ∈ D(A) | Aψ ∈ D(A^{n−1})} (A⁰ = I).
    Axiom 3. The expectation value for a measurement of a, when the
system is in the state ψ ∈ D(A), is given by (2.5), which must be real for
all ψ ∈ D(A).
    Axiom 4. The time evolution is given by a strongly continuous one-
parameter unitary group U (t). The generator of this group corresponds to
the energy of the system.

   In the following sections we will try to draw some mathematical conse-
quences from these assumptions:
   First we will see that Axioms 2 and 3 imply that observables corre-
spond to self-adjoint operators. Hence these operators play a central role
in quantum mechanics and we will derive some of their basic properties.
Another crucial role is played by the set of all possible expectation values
for the measurement of a, which is connected with the spectrum σ(A) of the
corresponding operator A.
    The problem of defining functions of an observable will lead us to the
spectral theorem (in the next chapter), which generalizes the diagonalization
of symmetric matrices.
     Axiom 4 will be the topic of Chapter 5.


2.2. Self-adjoint operators
Let H be a (complex separable) Hilbert space. A linear operator is a linear
mapping
                                 A : D(A) → H,                          (2.13)
where D(A) is a linear subspace of H, called the domain of A. It is called
bounded if the operator norm
    ‖A‖ = sup_{‖ψ‖=1} ‖Aψ‖ = sup_{‖ϕ‖=‖ψ‖=1} |⟨ψ, Aϕ⟩|    (2.14)
is finite. The second equality follows since equality in |⟨ψ, Aϕ⟩| ≤ ‖ψ‖ ‖Aϕ‖
is attained when Aϕ = zψ for some z ∈ C. If A is bounded, it is no
restriction to assume D(A) = H and we will always do so. The Banach space
of all bounded linear operators is denoted by L(H). Products of (unbounded)
operators are defined naturally; that is, ABψ = A(Bψ) for ψ ∈ D(AB) =
{ψ ∈ D(B)|Bψ ∈ D(A)}.
    The expression ⟨ψ, Aψ⟩ encountered in the previous section is called the
quadratic form,
    qA(ψ) = ⟨ψ, Aψ⟩,    ψ ∈ D(A),    (2.15)
associated to A. An operator can be reconstructed from its quadratic form
via the polarization identity
    ⟨ϕ, Aψ⟩ = ¼ (qA(ϕ + ψ) − qA(ϕ − ψ) + i qA(ϕ − iψ) − i qA(ϕ + iψ)).    (2.16)
A densely defined linear operator A is called symmetric (or hermitian) if
    ⟨ϕ, Aψ⟩ = ⟨Aϕ, ψ⟩,    ψ, ϕ ∈ D(A).    (2.17)
The justification for this definition is provided by the following

Lemma 2.1. A densely defined operator A is symmetric if and only if the
corresponding quadratic form is real-valued.
Proof. Clearly (2.17) implies that Im(qA(ψ)) = 0. Conversely, taking the
imaginary part of the identity
    qA(ψ + iϕ) = qA(ψ) + qA(ϕ) + i(⟨ψ, Aϕ⟩ − ⟨ϕ, Aψ⟩)
shows Re⟨Aϕ, ψ⟩ = Re⟨ϕ, Aψ⟩. Replacing ϕ by iϕ in this last equation
shows Im⟨Aϕ, ψ⟩ = Im⟨ϕ, Aψ⟩ and finishes the proof. □

    In other words, a densely defined operator A is symmetric if and only if
    ⟨ψ, Aψ⟩ = ⟨Aψ, ψ⟩,    ψ ∈ D(A).    (2.18)

    This already narrows the class of admissible operators to the class of
symmetric operators by Axiom 3. Next, let us tackle the issue of the correct
domain.
    By Axiom 2, A should be defined maximally; that is, if Ã is another
symmetric operator such that A ⊆ Ã, then A = Ã. Here we write A ⊆ Ã
if D(A) ⊆ D(Ã) and Aψ = Ãψ for all ψ ∈ D(A). The operator Ã is called
an extension of A in this case. In addition, we write A = Ã if both A ⊆ Ã
and Ã ⊆ A hold.
    The adjoint operator A∗ of a densely defined linear operator A is
defined by
    D(A∗) = {ψ ∈ H | ∃ψ̃ ∈ H : ⟨ψ, Aϕ⟩ = ⟨ψ̃, ϕ⟩, ∀ϕ ∈ D(A)},
    A∗ψ = ψ̃.    (2.19)
The requirement that D(A) be dense implies that A∗ is well-defined. How-
ever, note that D(A∗ ) might not be dense in general. In fact, it might
contain no vectors other than 0.
    Clearly we have (αA)∗ = α∗ A∗ for α ∈ C and (A + B)∗ ⊇ A∗ + B ∗
provided D(A + B) = D(A) ∩ D(B) is dense. However, equality will not
hold in general unless one operator is bounded (Problem 2.2).
   For later use, note that (Problem 2.4)
                              Ker(A∗ ) = Ran(A)⊥ .                     (2.20)

    For symmetric operators we clearly have A ⊆ A∗ . If, in addition, A = A∗
holds, then A is called self-adjoint. Our goal is to show that observables
correspond to self-adjoint operators. This is for example true in the case of
the position operator x, which is a special case of a multiplication operator.
Example. (Multiplication operator) Consider the multiplication operator
    (Af )(x) = A(x)f (x),  D(A) = {f ∈ L2 (Rn , dµ) | Af ∈ L2 (Rn , dµ)}
                                                                      (2.21)
given by multiplication with the measurable function A : Rn → C. First
of all note that D(A) is dense. In fact, consider Ωn = {x ∈ Rⁿ | |A(x)| ≤ n} ↗ Rⁿ.
Then, for every f ∈ L²(Rⁿ, dµ) the function fn = χ_Ωn f ∈ D(A)
converges to f as n → ∞ by dominated convergence.
    Next, let us compute the adjoint of A. Performing a formal computation,
we have for h, f ∈ D(A) that
    ⟨h, Af⟩ = ∫ h(x)∗ A(x) f(x) dµ(x) = ∫ (A(x)∗ h(x))∗ f(x) dµ(x) = ⟨Ãh, f⟩,    (2.22)
where Ã is multiplication by A(x)∗,
    (Ãf)(x) = A(x)∗ f(x),    D(Ã) = {f ∈ L²(Rⁿ, dµ) | Ãf ∈ L²(Rⁿ, dµ)}.    (2.23)
Note D(Ã) = D(A). At first sight this seems to show that the adjoint of
A is Ã. But for our calculation we had to assume h ∈ D(A) and there
might be some functions in D(A∗) which do not satisfy this requirement! In
particular, our calculation only shows Ã ⊆ A∗. To show that equality holds,
we need to work a little harder:
    If h ∈ D(A∗), there is some g ∈ L²(Rⁿ, dµ) such that
    ∫ h(x)∗ A(x) f(x) dµ(x) = ∫ g(x)∗ f(x) dµ(x),    f ∈ D(A),    (2.24)
and thus
    ∫ (h(x) A(x)∗ − g(x))∗ f(x) dµ(x) = 0,    f ∈ D(A).    (2.25)
In particular,
    ∫ χ_Ωn(x) (h(x) A(x)∗ − g(x))∗ f(x) dµ(x) = 0,    f ∈ L²(Rⁿ, dµ),    (2.26)
which shows that χ_Ωn (h(x) A(x)∗ − g(x))∗ ∈ L²(Rⁿ, dµ) vanishes. Since n
is arbitrary, we even have h(x) A(x)∗ = g(x) ∈ L²(Rⁿ, dµ) and thus A∗ is
multiplication by A(x)∗ and D(A∗) = D(A).
    In particular, A is self-adjoint if A is real-valued. In the general case we
have at least ‖Af‖ = ‖A∗f‖ for all f ∈ D(A) = D(A∗). Such operators are
called normal.

     Now note that
                               A⊆B    ⇒     B ∗ ⊆ A∗ ;                    (2.27)
that is, increasing the domain of A implies decreasing the domain of A∗ .
Thus there is no point in trying to extend the domain of a self-adjoint
operator further. In fact, if A is self-adjoint and B is a symmetric extension,
we infer A ⊆ B ⊆ B ∗ ⊆ A∗ = A implying A = B.
Corollary 2.2. Self-adjoint operators are maximal; that is, they do not have
any symmetric extensions.
     Furthermore, if A∗ is densely defined (which is the case if A is symmet-
ric), we can consider A∗∗. From the definition (2.19) it is clear that A ⊆ A∗∗
and thus A∗∗ is an extension of A. This extension is closely related to ex-
tending a linear subspace M to its closure M⊥⊥ (as we will see a bit later) and
thus is called the closure Ā = A∗∗ of A.
    If A is symmetric, we have A ⊆ A∗ and hence Ā = A∗∗ ⊆ A∗; that is,
Ā lies between A and A∗. Moreover, ⟨ψ, A∗ϕ⟩ = ⟨Āψ, ϕ⟩ for all ψ ∈ D(Ā),
ϕ ∈ D(A∗) implies that Ā is symmetric since A∗ϕ = Āϕ for ϕ ∈ D(Ā).
Example. (Differential operator) Take H = L2 (0, 2π).
    (i) Consider the operator
    A0 f = −i (d/dx) f,    D(A0) = {f ∈ C¹[0, 2π] | f(0) = f(2π) = 0}.    (2.28)
That A0 is symmetric can be shown by a simple integration by parts (do
this). Note that the boundary conditions f(0) = f(2π) = 0 are chosen
such that the boundary terms occurring from integration by parts vanish.
However, this will also follow once we have computed A0∗. If g ∈ D(A0∗), we
must have
    ∫₀^{2π} g(x)∗ (−i f′(x)) dx = ∫₀^{2π} g̃(x)∗ f(x) dx    (2.29)
for some g̃ ∈ L²(0, 2π). Integration by parts (cf. (2.116)) shows
    ∫₀^{2π} f′(x) ( g(x) − i ∫₀ˣ g̃(t) dt )∗ dx = 0.    (2.30)
In fact, this formula holds for g̃ ∈ C[0, 2π]. Since the set of continuous
functions is dense, the general case g̃ ∈ L²(0, 2π) follows by approximating
g̃ with continuous functions and taking limits on both sides using dominated
convergence.
    Hence g(x) − i ∫₀ˣ g̃(t) dt ∈ {f′ | f ∈ D(A0)}⊥. But {f′ | f ∈ D(A0)} =
{h ∈ C[0, 2π] | ∫₀^{2π} h(t) dt = 0} (show this) implying g(x) = g(0) + i ∫₀ˣ g̃(t) dt
since {f′ | f ∈ D(A0)} = {h ∈ H | ⟨1, h⟩ = 0} = {1}⊥ and {1}⊥⊥ = span{1}.
Thus g ∈ AC[0, 2π], where
    AC[a, b] = {f ∈ C[a, b] | f(x) = f(a) + ∫ₐˣ g(t) dt, g ∈ L¹(a, b)}    (2.31)
denotes the set of all absolutely continuous functions (see Section 2.7). In
summary, g ∈ D(A0∗) implies g ∈ AC[0, 2π] and A0∗g = g̃ = −i g′. Conversely,
for every g ∈ H¹(0, 2π) = {f ∈ AC[0, 2π] | f′ ∈ L²(0, 2π)}, (2.29) holds with
g̃ = −i g′ and we conclude
    A0∗ f = −i (d/dx) f,    D(A0∗) = H¹(0, 2π).    (2.32)
In particular, A0 is symmetric but not self-adjoint. Since Ā0 = A0∗∗ ⊆ A0∗,
we can use integration by parts to compute
    0 = ⟨g, Ā0 f⟩ − ⟨A0∗ g, f⟩ = i(f(0) g(0)∗ − f(2π) g(2π)∗)    (2.33)
and since the boundary values of g ∈ D(A0∗) can be prescribed arbitrarily,
we must have f(0) = f(2π) = 0. Thus
    Ā0 f = −i (d/dx) f,    D(Ā0) = {f ∈ D(A0∗) | f(0) = f(2π) = 0}.    (2.34)
    (ii) Now let us take
    Af = −i (d/dx) f,    D(A) = {f ∈ C¹[0, 2π] | f(0) = f(2π)},    (2.35)
which is clearly an extension of A0. Thus A∗ ⊆ A0∗ and we compute
    0 = ⟨g, Af⟩ − ⟨A∗g, f⟩ = i f(0)(g(0)∗ − g(2π)∗).    (2.36)
Since this must hold for all f ∈ D(A), we conclude g(0) = g(2π) and
    A∗ f = −i (d/dx) f,    D(A∗) = {f ∈ H¹(0, 2π) | f(0) = f(2π)}.    (2.37)
Similarly, as before, Ā = A∗ and thus Ā is self-adjoint.

    One might suspect that there is no big difference between the two sym-
metric operators A0 and A from the previous example, since they coincide
on a dense set of vectors. However, the opposite is true: For example, the
first operator A0 has no eigenvectors at all (i.e., solutions of the equation
A0ψ = zψ, z ∈ C) whereas the second one has an orthonormal basis of
eigenvectors!
Example. Compute the eigenvectors of A0 and A from the previous exam-
ple.
    (i) By definition, an eigenvector is a (nonzero) solution of A0u = zu,
z ∈ C, that is, a solution of the ordinary differential equation
    −i u′(x) = z u(x)    (2.38)
satisfying the boundary conditions u(0) = u(2π) = 0 (since we must have
u ∈ D(A0 )). The general solution of the differential equation is u(x) =
u(0)eizx and the boundary conditions imply u(x) = 0. Hence there are no
eigenvectors.
    (ii) Now we look for solutions of Au = zu, that is, the same differential
equation as before, but now subject to the boundary condition u(0) = u(2π).
Again the general solution is u(x) = u(0)eizx and the boundary condition
requires u(0) = u(0)e2πiz . Thus there are two possibilities. Either u(0) = 0
(which is of no use for us) or z ∈ Z. In particular, we see that all eigenvectors
are given by
    un(x) = (1/√(2π)) e^{inx},    n ∈ Z,    (2.39)
which are well known to form an orthonormal basis.
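
A numerical sketch (not part of the book; it assumes NumPy): the sampled eigenfunctions un are approximate eigenvectors of a periodic central-difference discretization of −i d/dx, with eigenvalue close to the integer n.

    import numpy as np

    N = 200
    x = 2 * np.pi * np.arange(N) / N
    h = 2 * np.pi / N
    idx = np.arange(N)
    D = np.zeros((N, N))
    D[idx, (idx + 1) % N] = 1 / (2 * h)      # periodic central difference
    D[idx, (idx - 1) % N] = -1 / (2 * h)
    A = -1j * D                              # discretization of -i d/dx (a Hermitian matrix)

    for n in (-2, 0, 1, 3):
        u = np.exp(1j * n * x) / np.sqrt(2 * np.pi)
        # residual ||A u - n u|| / ||u|| equals |sin(n h)/h - n|, which is small
        print(n, np.linalg.norm(A @ u - n * u) / np.linalg.norm(u))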

    We will see a bit later that this is a consequence of self-adjointness of
A. Hence it will be important to know whether a given operator is self-
adjoint or not. Our example shows that symmetry is easy to check (in case
of differential operators it usually boils down to integration by parts), but
computing the adjoint of an operator is a nontrivial job even in simple situ-
ations. However, we will learn soon that self-adjointness is a much stronger
property than symmetry, justifying the additional effort needed to prove it.
    On the other hand, if a given symmetric operator A turns out not to
be self-adjoint, this raises the question of self-adjoint extensions. Two cases
need to be distinguished. If A is self-adjoint, then there is only one self-
adjoint extension (if B is another one, we have A ⊆ B and hence A = B
by Corollary 2.2). In this case A is called essentially self-adjoint and
D(A) is called a core for A. Otherwise there might be more than one self-
adjoint extension or none at all. This situation is more delicate and will be
investigated in Section 2.6.
     Since we have seen that computing A∗ is not always easy, a criterion for
self-adjointness not involving A∗ will be useful.
Lemma 2.3. Let A be symmetric such that Ran(A + z) = Ran(A + z ∗ ) = H
for one z ∈ C. Then A is self-adjoint.
Proof. Let ψ ∈ D(A∗) and A∗ψ = ψ̃. Since Ran(A + z∗) = H, there is a
ϑ ∈ D(A) such that (A + z∗)ϑ = ψ̃ + z∗ψ. Now we compute
    ⟨ψ, (A + z)ϕ⟩ = ⟨ψ̃ + z∗ψ, ϕ⟩ = ⟨(A + z∗)ϑ, ϕ⟩ = ⟨ϑ, (A + z)ϕ⟩,    ϕ ∈ D(A),
and hence ψ = ϑ ∈ D(A) since Ran(A + z) = H. □

    To proceed further, we will need more information on the closure of
an operator. We will use a different approach which avoids the use of the
adjoint operator. We will establish equivalence with our original definition
in Lemma 2.4.
    The simplest way of extending an operator A is to take the closure of its
graph Γ(A) = {(ψ, Aψ) | ψ ∈ D(A)} ⊂ H². That is, if (ψn, Aψn) → (ψ, ψ̃),
we might try to define Aψ = ψ̃. For Aψ to be well-defined, we need that
(ψn, Aψn) → (0, ψ̃) implies ψ̃ = 0. In this case A is called closable and
the unique operator Ā whose graph Γ(Ā) is the closure of Γ(A) is called the
closure of A. Clearly, A is called closed if A = Ā, which is the case if and only if the
graph of A is closed. Equivalently, A is closed if and only if Γ(A) equipped
with the graph norm ‖(ψ, Aψ)‖²_{Γ(A)} = ‖ψ‖² + ‖Aψ‖² is a Hilbert space
(i.e., closed). By construction, Ā is the smallest closed extension of A.
Example. Suppose A is bounded. Then the closure was already computed
in Theorem 0.26. In particular, D(A) = D(A) and a bounded operator is
closed if and only if its domain is closed.

Example. Consider again the differential operator A0 from (2.28) and let
us compute the closure without the use of the adjoint operator.
    Let f ∈ D(Ā0) and let fn ∈ D(A0) be a sequence such that fn → f,
A0 fn → −i g. Then fn′ → g and hence f(x) = ∫₀ˣ g(t) dt. Thus f ∈ AC[0, 2π]
and f(0) = 0. Moreover, f(2π) = lim_{n→∞} ∫₀^{2π} fn′(t) dt = 0. Conversely, any
such f can be approximated by functions in D(A0) (show this).

Example. Consider again the multiplication operator by A(x) in L2 (Rn , dµ)
but now defined on functions with compact support, that is,

                   D(A0 ) = {f ∈ D(A) | supp(f ) is compact}.             (2.40)

Then its closure is given by Ā0 = A. In particular, A0 is essentially self-
adjoint and D(A0) is a core for A.
    To prove Ā0 = A, let some f ∈ D(A) be given and consider fn =
χ_{x| |x|≤n} f. Then fn ∈ D(A0) and fn(x) → f(x) as well as A(x)fn(x) →
A(x)f(x) in L²(Rⁿ, dµ) by dominated convergence. Thus D(A) ⊆ D(Ā0)
and since A is closed, we even get equality.

Example. Consider the multiplication A(x) = x in L²(R) defined on
    D(A0) = {f ∈ D(A) | ∫_R f(x) dx = 0}.    (2.41)
Then A0 is closed. Hence D(A0) is not a core for A.
    To show that A0 is closed, suppose there is a sequence fn(x) → f(x)
such that xfn(x) → g(x). Since A is closed, we necessarily have f ∈ D(A)
and g(x) = xf(x). But then
    0 = lim_{n→∞} ∫_R fn(x) dx = lim_{n→∞} ∫_R (1/(1 + |x|)) (fn(x) + sign(x) x fn(x)) dx
      = ∫_R (1/(1 + |x|)) (f(x) + sign(x) g(x)) dx = ∫_R f(x) dx    (2.42)
which shows f ∈ D(A0).

     Next, let us collect a few important results.
Lemma 2.4. Suppose A is a densely defined operator.
      (i) A∗ is closed.
     (ii) A is closable if and only if D(A∗) is dense and Ā = A∗∗, respec-
          tively, (Ā)∗ = A∗, in this case.
    (iii) If A is injective and Ran(A) is dense, then (A∗)⁻¹ = (A⁻¹)∗. If
          A is closable and Ā is injective, then Ā⁻¹ is the closure of A⁻¹.

Proof. Let us consider the following two unitary operators from H² to itself:
    U(ϕ, ψ) = (ψ, −ϕ),    V(ϕ, ψ) = (ψ, ϕ).
    (i) From
    Γ(A∗) = {(ϕ, ϕ̃) ∈ H² | ⟨ϕ, Aψ⟩ = ⟨ϕ̃, ψ⟩, ∀ψ ∈ D(A)}
          = {(ϕ, ϕ̃) ∈ H² | ⟨(ϕ, ϕ̃), (ψ̃, −ψ)⟩_{H²} = 0, ∀(ψ, ψ̃) ∈ Γ(A)}
          = (U Γ(A))⊥    (2.43)
we conclude that A∗ is closed.
    (ii) Similarly, using U Γ⊥ = (U Γ)⊥ (Problem 1.4), the closure of the graph is
    Γ(A)⊥⊥ = (U Γ(A∗))⊥ = {(ψ, ψ̃) | ⟨ψ, A∗ϕ⟩ − ⟨ψ̃, ϕ⟩ = 0, ∀ϕ ∈ D(A∗)},
and we see that (0, ψ̃) lies in it if and only if ψ̃ ∈ D(A∗)⊥. Hence A is closable
if and only if D(A∗) is dense. In this case, equation (2.43) also shows (Ā)∗ = A∗.
Moreover, replacing A by A∗ in (2.43) and comparing with the last formula
shows A∗∗ = Ā.
    (iii) Next note that (provided A is injective)
    Γ(A⁻¹) = V Γ(A).
Hence if Ran(A) is dense, then Ker(A∗) = Ran(A)⊥ = {0} and
    Γ((A∗)⁻¹) = V Γ(A∗) = V U Γ(A)⊥ = U V Γ(A)⊥ = U (V Γ(A))⊥
shows that (A∗)⁻¹ = (A⁻¹)∗. Similarly, if A is closable and Ā is injective, then
Ā⁻¹ is the closure of A⁻¹, since Γ(Ā⁻¹) = V Γ(Ā), Γ(Ā) is the closure of Γ(A),
and the unitary map V takes it onto the closure of V Γ(A) = Γ(A⁻¹). □


Corollary 2.5. If A is self-adjoint and injective, then A−1 is also self-
adjoint.

Proof. Equation (2.20) in the case A = A∗ implies Ran(A)⊥ = Ker(A) =
{0} and hence (iii) is applicable.
   If A is densely defined and bounded, we clearly have D(A∗ ) = H and by
Corollary 1.9, A∗ ∈ L(H). In particular, since A = A∗∗ , we obtain
Theorem 2.6. We have A ∈ L(H) if and only if A∗ ∈ L(H).

   Now we can also generalize Lemma 2.3 to the case of essential self-adjoint
operators.
Lemma 2.7. A symmetric operator A is essentially self-adjoint if and only
if one of the following conditions holds for one z ∈ C\R:
       • Ran(A + z) = Ran(A + z∗) = H,
       • Ker(A∗ + z) = Ker(A∗ + z∗) = {0}.
If A is nonnegative, that is, ⟨ψ, Aψ⟩ ≥ 0 for all ψ ∈ D(A), we can also
admit z ∈ (−∞, 0).

Proof. First of all note that by (2.20) the two conditions are equivalent.
By taking the closure of A, it is no restriction to assume that A is closed.
Let z = x + iy. From
    ‖(A + z)ψ‖² = ‖(A + x)ψ + iyψ‖² = ‖(A + x)ψ‖² + y²‖ψ‖² ≥ y²‖ψ‖²,    (2.44)
we infer that Ker(A + z) = {0} and hence (A + z)⁻¹ exists. Moreover, setting
ψ = (A + z)⁻¹ϕ (y ≠ 0) shows ‖(A + z)⁻¹‖ ≤ |y|⁻¹. Hence (A + z)⁻¹ is
bounded and closed. Since it is densely defined by assumption, its domain
Ran(A + z) must be equal to H. Replacing z by z∗, we see Ran(A + z∗) = H
and applying Lemma 2.3 shows that A is self-adjoint. Conversely, if A = A∗,
the above calculation shows Ker(A∗ + z) = {0}, which finishes the case
z ∈ C\R.
    The argument for the nonnegative case with z < 0 is similar using
ε‖ψ‖² ≤ ⟨ψ, (A + ε)ψ⟩ ≤ ‖ψ‖ ‖(A + ε)ψ‖, which shows ‖(A + ε)⁻¹‖ ≤ ε⁻¹,
ε > 0. □

    In addition, we can also prove the closed graph theorem which shows
that an unbounded closed operator cannot be defined on the entire Hilbert
space.
Theorem 2.8 (Closed graph). Let H1 and H2 be two Hilbert spaces and
A : H1 → H2 an operator defined on all of H1 . Then A is bounded if and
only if Γ(A) is closed.

Proof. If A is bounded, then it is easy to see that Γ(A) is closed. So let us
assume that Γ(A) is closed. Then A∗ is well-defined and for all unit vectors
ϕ ∈ D(A∗) we have that the linear functional ℓϕ(ψ) = ⟨A∗ϕ, ψ⟩ is pointwise
bounded, that is,
    |ℓϕ(ψ)| = |⟨ϕ, Aψ⟩| ≤ ‖Aψ‖.
Hence by the uniform boundedness principle there is a constant C such that
‖ℓϕ‖ = ‖A∗ϕ‖ ≤ C. That is, A∗ is bounded and so is A = A∗∗. □

    Note that since symmetric operators are closable, they are automatically
closed if they are defined on the entire Hilbert space.
Theorem 2.9 (Hellinger-Toeplitz). A symmetric operator defined on the
entire Hilbert space is bounded.
Problem 2.1 (Jacobi operator). Let a and b be some real-valued sequences
in ℓ∞(Z). Consider the operator
    J fn = an fn+1 + an−1 fn−1 + bn fn,    f ∈ ℓ²(Z).
Show that J is a bounded self-adjoint operator.
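
A finite-section sketch of Problem 2.1 (not from the book; it assumes NumPy): truncating J to finitely many indices gives a real symmetric tridiagonal matrix whose norm is bounded by 2‖a‖∞ + ‖b‖∞.

    import numpy as np

    rng = np.random.default_rng(3)
    N = 50
    a = rng.uniform(-1, 1, N)     # off-diagonal sequence a_n
    b = rng.uniform(-1, 1, N)     # diagonal sequence b_n

    # finite section of (Jf)_n = a_n f_{n+1} + a_{n-1} f_{n-1} + b_n f_n
    J = np.diag(b) + np.diag(a[:-1], 1) + np.diag(a[:-1], -1)
    assert np.allclose(J, J.T)    # symmetric, hence self-adjoint with real spectrum
    print(np.linalg.norm(J, 2), 2 * np.abs(a).max() + np.abs(b).max())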
Problem 2.2. Show that (αA)∗ = α∗ A∗ and (A + B)∗ ⊇ A∗ + B ∗ (where
D(A∗ + B ∗ ) = D(A∗ ) ∩ D(B ∗ )) with equality if one operator is bounded.
Give an example where equality does not hold.
Problem 2.3. Suppose AB is densely defined. Show that (AB)∗ ⊇ B ∗ A∗ .
Moreover, if B is bounded, then (BA)∗ = A∗ B ∗ .
Problem 2.4. Show (2.20).
Problem 2.5. An operator is called normal if ‖Aψ‖ = ‖A∗ψ‖ for all
ψ ∈ D(A) = D(A∗).
   Show that if A is normal, so is A + z for any z ∈ C.
Problem 2.6. Show that normal operators are closed. (Hint: A∗ is closed.)
Problem 2.7. Show that a bounded operator A is normal if and only if
AA∗ = A∗ A.
Problem 2.8. Show that the kernel of a closed operator is closed.
Problem 2.9. Show that if A is closed and B bounded, then AB is closed.

2.3. Quadratic forms and the Friedrichs extension
Finally we want to draw some further consequences of Axiom 2 and show
that observables correspond to self-adjoint operators. Since self-adjoint op-
erators are already maximal, the difficult part remaining is to show that an
observable has at least one self-adjoint extension. There is a good way of
doing this for nonnegative operators and hence we will consider this case
first.
    An operator is called nonnegative (resp. positive) if ⟨ψ, Aψ⟩ ≥ 0 (resp.
> 0 for ψ ≠ 0) for all ψ ∈ D(A). If A is positive, the map (ϕ, ψ) ↦ ⟨ϕ, Aψ⟩
is a scalar product. However, there might be sequences which are Cauchy
with respect to this scalar product but not with respect to our original one.
To avoid this, we introduce the scalar product
    ⟨ϕ, ψ⟩_A = ⟨ϕ, (A + 1)ψ⟩,    A ≥ 0,    (2.45)
defined on D(A), which satisfies ‖ψ‖ ≤ ‖ψ‖_A. Let HA be the completion of
D(A) with respect to the above scalar product. We claim that HA can be
regarded as a subspace of H; that is, D(A) ⊆ HA ⊆ H.
    If (ψn) is a Cauchy sequence in D(A), then it is also Cauchy in H (since
‖ψ‖ ≤ ‖ψ‖_A by assumption) and hence we can identify the limit in HA with
the limit of (ψn) regarded as a sequence in H. For this identification to be
unique, we need to show that if (ψn) ⊂ D(A) is a Cauchy sequence in HA
such that ψn → 0, then ‖ψn‖_A → 0. This follows from
    ‖ψn‖²_A = ⟨ψn, ψn − ψm⟩_A + ⟨ψn, ψm⟩_A
            ≤ ‖ψn‖_A ‖ψn − ψm‖_A + ‖ψn‖ ‖(A + 1)ψm‖    (2.46)
since the right-hand side can be made arbitrarily small choosing m, n large.
    Clearly the quadratic form qA can be extended to every ψ ∈ HA by
setting
    qA(ψ) = ⟨ψ, ψ⟩_A − ‖ψ‖²,    ψ ∈ Q(A) = HA.    (2.47)
The set Q(A) is also called the form domain of A.
Example. (Multiplication operator) Let A be multiplication by A(x) ≥ 0
in L²(Rⁿ, dµ). Then
    Q(A) = D(A^{1/2}) = {f ∈ L²(Rⁿ, dµ) | A^{1/2} f ∈ L²(Rⁿ, dµ)}    (2.48)
and
    qA(f) = ∫_{Rⁿ} A(x) |f(x)|² dµ(x)    (2.49)
(show this).

    Now we come to our extension result. Note that A + 1 is injective and
the best we can hope for is that for a nonnegative extension Ã, the operator
Ã + 1 is a bijection from D(Ã) onto H.

Lemma 2.10. Suppose A is a nonnegative operator. Then there is a non-
negative extension Ã such that Ran(Ã + 1) = H.
Proof. Let us define an operator Ã by
    D(Ã) = {ψ ∈ HA | ∃ψ̃ ∈ H : ⟨ϕ, ψ⟩_A = ⟨ϕ, ψ̃⟩, ∀ϕ ∈ HA},
    Ãψ = ψ̃ − ψ.
Since HA is dense, ψ̃ is well-defined. Moreover, it is straightforward to see
that Ã is a nonnegative extension of A.
    It is also not hard to see that Ran(Ã + 1) = H. Indeed, for any ψ̃ ∈ H,
ϕ ↦ ⟨ψ̃, ϕ⟩ is a bounded linear functional on HA. Hence there is an element
ψ ∈ HA such that ⟨ψ̃, ϕ⟩ = ⟨ψ, ϕ⟩_A for all ϕ ∈ HA. By the definition of Ã,
(Ã + 1)ψ = ψ̃ and hence Ã + 1 is onto. □

Example. Let us take H = L2 (0, π) and consider the operator
    Af = −(d²/dx²) f,    D(A) = {f ∈ C²[0, π] | f(0) = f(π) = 0},    (2.50)
which corresponds to the one-dimensional model of a particle confined to a
box.
    (i) First of all, using integration by parts twice, it is straightforward to
check that A is symmetric:
    ∫₀^π g(x)∗ (−f″(x)) dx = ∫₀^π g′(x)∗ f′(x) dx = ∫₀^π (−g″(x))∗ f(x) dx.    (2.51)
Note that the boundary conditions f(0) = f(π) = 0 are chosen such that
the boundary terms occurring from integration by parts vanish. Moreover,
the same calculation also shows that A is positive:
    ∫₀^π f(x)∗ (−f″(x)) dx = ∫₀^π |f′(x)|² dx > 0,    f ≠ 0.    (2.52)

    (ii) Next let us show HA = {f ∈ H¹(0, π) | f(0) = f(π) = 0}. In fact,
since
    ⟨g, f⟩_A = ∫₀^π ( g′(x)∗ f′(x) + g(x)∗ f(x) ) dx,    (2.53)
we see that fn is Cauchy in HA if and only if both fn and fn′ are Cauchy
in L²(0, π). Thus fn → f and fn′ → g in L²(0, π) and fn(x) = ∫₀ˣ fn′(t) dt
implies f(x) = ∫₀ˣ g(t) dt. Thus f ∈ AC[0, π]. Moreover, f(0) = 0 is obvious
and from 0 = fn(π) = ∫₀^π fn′(t) dt we have f(π) = lim_{n→∞} ∫₀^π fn′(t) dt = 0.
So we have HA ⊆ {f ∈ H¹(0, π) | f(0) = f(π) = 0}. To see the converse,
approximate f′ by smooth functions gn. Using gn − (1/π) ∫₀^π gn(t) dt instead
of gn, it is no restriction to assume ∫₀^π gn(t) dt = 0. Now define fn(x) =
∫₀ˣ gn(t) dt and note fn ∈ D(A) → f.
    (iii) Finally, let us compute the extension Ã. We have f ∈ D(Ã) if for
all g ∈ HA there is an f̃ such that ⟨g, f⟩_A = ⟨g, f̃⟩. That is,
        ∫_0^π g′(x)* f′(x) dx = ∫_0^π g(x)* (f̃(x) − f(x)) dx.        (2.54)
Integration by parts on the right-hand side shows
        ∫_0^π g′(x)* f′(x) dx = −∫_0^π g′(x)* ( ∫_0^x (f̃(t) − f(t)) dt ) dx        (2.55)
or equivalently
        ∫_0^π g′(x)* ( f′(x) + ∫_0^x (f̃(t) − f(t)) dt ) dx = 0.        (2.56)
Now observe {g′ | g ∈ HA} = {h ∈ H | ∫_0^π h(t)dt = 0} = {1}^⊥ and thus
f′(x) + ∫_0^x (f̃(t) − f(t)) dt ∈ {1}^⊥⊥ = span{1}. So we see f ∈ H²(0, π) =
{f ∈ AC[0, π] | f′ ∈ H¹(0, π)} and Ãf = −f″. The converse is easy and
hence
        Ãf = −d²f/dx²,    D(Ã) = {f ∈ H²[0, π] | f(0) = f(π) = 0}.        (2.57)


    Now let us apply this result to operators A corresponding to observables.
Since A will, in general, not satisfy the assumptions of our lemma, we will
consider A² instead, which has a symmetric extension Ã² with Ran(Ã² + 1) =
H. By our requirement for observables, A² is maximally defined and hence
is equal to this extension. In other words, Ran(A² + 1) = H. Moreover, for
any ϕ ∈ H there is a ψ ∈ D(A²) such that
                (A − i)(A + i)ψ = (A + i)(A − i)ψ = ϕ                        (2.58)
and since (A ± i)ψ ∈ D(A), we infer Ran(A ± i) = H. As an immediate
consequence we obtain
Corollary 2.11. Observables correspond to self-adjoint operators.

    But there is another important consequence of the results which is worth-
while mentioning. A symmetric operator is called semi-bounded, respec-
tively, bounded from below, if
                qA(ψ) = ⟨ψ, Aψ⟩ ≥ γ‖ψ‖²,        γ ∈ ℝ.                (2.59)
We will write A ≥ γ for short.
Theorem 2.12 (Friedrichs extension). Let A be a symmetric operator which
is bounded from below by γ. Then there is a self-adjoint extension Ã which
is also bounded from below by γ and which satisfies D(Ã) ⊆ H_{A−γ}.
    Moreover, Ã is the only self-adjoint extension with D(Ã) ⊆ H_{A−γ}.

Proof. If we replace A by A − γ, then existence follows from Lemma 2.10.
To see uniqueness, let Â be another self-adjoint extension with D(Â) ⊆ HA.
Choose ϕ ∈ D(A) and ψ ∈ D(Â). Then
    ⟨ϕ, (Â + 1)ψ⟩ = ⟨(A + 1)ϕ, ψ⟩ = ⟨ψ, (A + 1)ϕ⟩* = (⟨ψ, ϕ⟩_A)* = ⟨ϕ, ψ⟩_A
and by continuity we even get ⟨ϕ, (Â + 1)ψ⟩ = ⟨ϕ, ψ⟩_A for every ϕ ∈ HA.
Hence by the definition of Ã we have ψ ∈ D(Ã) and Ãψ = Âψ; that is,
Â ⊆ Ã. But self-adjoint operators are maximal by Corollary 2.2 and thus
Â = Ã.

    Clearly Q(A) = HA and qA can be defined for semi-bounded operators
as before by using ‖ψ‖²_A = ⟨ψ, (A − γ)ψ⟩ + ‖ψ‖².
   In many physical applications, the converse of this result is also of im-
portance: given a quadratic form q, when is there a corresponding operator
A such that q = qA ?
     So let q : Q → C be a densely defined quadratic form corresponding
to a sesquilinear form s : Q × Q → C; that is, q(ψ) = s(ψ, ψ). As with
a scalar product, s can be recovered from q via the polarization identity
(cf. Problem 0.14). Furthermore, as in Lemma 2.1 one can show that s is
symmetric, s(ϕ, ψ) = s(ψ, ϕ)∗ , if and only if q is real-valued. In this case q
will be called hermitian.
    A hermitian form q is called nonnegative if q(ψ) ≥ 0 and semi-
bounded if q(ψ) ≥ γ‖ψ‖² for some γ ∈ ℝ. As before we can associate
a norm ‖ψ‖²_q = q(ψ) + (1 − γ)‖ψ‖² with any semi-bounded q and look at the
completion Hq of Q with respect to this norm. However, since we are not
assuming that q stems from a semi-bounded operator, we do not know
whether Hq can be regarded as a subspace of H! Hence we will call q clos-
able if for every Cauchy sequence ψn ∈ Q with respect to ‖·‖_q, ψn → 0
implies ‖ψn‖_q → 0. In this case we have Hq ⊆ H and we call the extension
of q to Hq the closure of q. In particular, we will call q closed if Q = Hq.
Example. Let H = L²(0, 1). Then
                q(f) = |f(c)|²,        f ∈ C[0, 1],   c ∈ [0, 1],
is a well-defined nonnegative form. However, let fn(x) = max(0, 1 − n|x − c|).
Then fn is a Cauchy sequence with respect to ‖·‖_q such that fn → 0 but
‖fn‖_q → 1. Hence q is not closable and hence also not associated with a
nonnegative operator. Formally, one can interpret q as the quadratic form
of the multiplication operator with the delta distribution at x = c. Exercise:
Show Hq = H ⊕ C.

    From our previous considerations we already know that the quadratic
form qA of a semi-bounded operator A is closable and its closure is associated
with a self-adjoint operator. It turns out that the converse is also true
(compare also Corollary 1.9 for the case of bounded operators):

Theorem 2.13. To every closed semi-bounded quadratic form q there cor-
responds a unique self-adjoint operator A such that Q = Q(A) and q = qA .
If s is the sesquilinear form corresponding to q, then A is given by
        D(A) = {ψ ∈ Hq | ∃ψ̃ ∈ H : s(ϕ, ψ) = ⟨ϕ, ψ̃⟩, ∀ϕ ∈ Hq},
        Aψ = ψ̃ − (1 − γ)ψ.                                                (2.60)

Proof. Since Hq is dense, ψ̃ and hence A is well-defined. Moreover, replacing
q by q(·) − γ‖·‖² and A by A − γ, it is no restriction to assume γ = 0. As
in the proof of Lemma 2.10 it follows that A is a nonnegative operator,
‖(A + 1)ψ‖² ≥ ‖ψ‖², with Ran(A + 1) = H. In particular, (A + 1)⁻¹ exists and is
bounded. Furthermore, for every ϕj ∈ H we can find ψj ∈ D(A) such that
ϕj = (A + 1)ψj. Finally,
    ⟨(A + 1)⁻¹ϕ1, ϕ2⟩ = ⟨ψ1, (A + 1)ψ2⟩ = s(ψ1, ψ2) = s(ψ2, ψ1)*
                      = ⟨ψ2, (A + 1)ψ1⟩* = ⟨(A + 1)ψ1, ψ2⟩
                      = ⟨ϕ1, (A + 1)⁻¹ϕ2⟩
shows that (A + 1)⁻¹ is self-adjoint and so is A + 1 by Corollary 2.5.

    Any subspace Q̃ ⊆ Q(A) which is dense with respect to ‖·‖_A is called a
form core of A and uniquely determines A.
Example. We have already seen that the operator
        Af = −d²f/dx²,    D(A) = {f ∈ H²[0, π] | f(0) = f(π) = 0}        (2.61)
is associated with the closed form
    qA(f) = ∫_0^π |f′(x)|² dx,    Q(A) = {f ∈ H¹[0, π] | f(0) = f(π) = 0}.  (2.62)
However, this quadratic form even makes sense on the larger form domain
Q = H¹[0, π]. What is the corresponding self-adjoint operator? (See Prob-
lem 2.13.)

    A hermitian form q is called bounded if |q(ψ)| ≤ C‖ψ‖² and we call
                ‖q‖ = sup_{‖ψ‖=1} |q(ψ)|                                (2.63)
the norm of q. In this case the norm ‖·‖_q is equivalent to ‖·‖. Hence
Hq = H and the corresponding operator is bounded by the Hellinger–Toeplitz
theorem (Theorem 2.9). In fact, the operator norm is equal to the norm of
q (see also Problem 0.15):
Lemma 2.14. A semi-bounded form q is bounded if and only if the associ-
ated operator A is. Moreover, in this case
                                ‖q‖ = ‖A‖.                                (2.64)
Proof. Using the polarization identity and the parallelogram law (Prob-
lem 0.14), we infer 2 Re⟨ϕ, Aψ⟩ ≤ (‖ψ‖² + ‖ϕ‖²) sup_{‖ψ‖=1} |⟨ψ, Aψ⟩| and choosing
ϕ = ‖Aψ‖⁻¹Aψ shows ‖A‖ ≤ ‖q‖. The converse is easy.

    As a consequence we see that for symmetric operators we have
                ‖A‖ = sup_{‖ψ‖=1} |⟨ψ, Aψ⟩|                                (2.65)

generalizing (2.14) in this case.
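For matrices, both (2.65) and its failure without symmetry (Problem 2.16) are easy to test numerically. A minimal sketch, assuming NumPy is available; the matrices are ad hoc choices:

```python
# For a Hermitian matrix the operator norm equals sup |<psi, A psi>| over unit
# vectors (both equal the largest |eigenvalue|); for a non-symmetric matrix this fails.
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((6, 6)) + 1j*rng.standard_normal((6, 6))
A = (B + B.conj().T)/2                                 # Hermitian
print(np.linalg.norm(A, 2))                            # operator norm
print(np.max(np.abs(np.linalg.eigvalsh(A))))           # sup of |<psi, A psi>|, same number

N = np.array([[0.0, 1.0],
              [0.0, 0.0]])                             # not symmetric
print(np.linalg.norm(N, 2))                            # 1
# but sup of |<psi, N psi>| over unit psi is only 1/2 (maximize |x||y| subject
# to |x|^2 + |y|^2 = 1), so (2.65) fails without symmetry (cf. Problem 2.16)
```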
Problem 2.10. Let A be invertible. Show A > 0 if and only if A⁻¹ > 0.
Problem 2.11. Let A = −d²/dx², D(A) = {f ∈ H²(0, π) | f(0) = f(π) = 0}
and let ψ(x) = (1/(2√π)) x(π − x). Find the error in the following argument: Since
A is symmetric, we have 1 = ⟨Aψ, Aψ⟩ = ⟨ψ, A²ψ⟩ = 0.
Problem 2.12. Suppose A is a closed operator. Show that A∗ A (with
D(A∗ A) = {ψ ∈ D(A)|Aψ ∈ D(A∗ )}) is self-adjoint. Show Q(A∗ A) =
D(A). (Hint: A∗ A ≥ 0.)
Problem 2.13. Suppose A0 can be written as A0 = S*S. Show that the
Friedrichs extension is given by A = S*S.
    Use this to compute the Friedrichs extension of A = −d²/dx², D(A) = {f ∈
C²(0, π) | f(0) = f(π) = 0}. Compute also the self-adjoint operator SS* and
its form domain.
Problem 2.14. Use the previous problem to compute the Friedrichs exten-
sion A of A0 = −d²/dx², D(A0) = C_c^∞(ℝ). Show that Q(A) = H¹(ℝ) and
D(A) = H²(ℝ). (Hint: Section 2.7.)
Problem 2.15. Let A be self-adjoint. Suppose D ⊆ D(A) is a core. Then
D is also a form core.
Problem 2.16. Show that (2.65) is wrong if A is not symmetric.

2.4. Resolvents and spectra
Let A be a (densely defined) closed operator. The resolvent set of A is
defined by
                   ρ(A) = {z ∈ C|(A − z)−1 ∈ L(H)}.             (2.66)
More precisely, z ∈ ρ(A) if and only if (A − z) : D(A) → H is bijective
and its inverse is bounded. By the closed graph theorem (Theorem 2.8), it
suffices to check that A − z is bijective. The complement of the resolvent
set is called the spectrum
                                σ(A) = ℂ∖ρ(A)                           (2.67)
of A. In particular, z ∈ σ(A) if A − z has a nontrivial kernel. A vector
ψ ∈ Ker(A − z) is called an eigenvector and z is called an eigenvalue in
this case.
     The function
                        RA : ρ(A) → L(H)                                 (2.68)
                              z     → (A − z)−1
is called the resolvent of A. Note the convenient formula
  RA (z)∗ = ((A − z)−1 )∗ = ((A − z)∗ )−1 = (A∗ − z ∗ )−1 = RA∗ (z ∗ ). (2.69)
In particular,
                                 ρ(A∗ ) = ρ(A)∗ .                        (2.70)
Example. (Multiplication operator) Consider again the multiplication op-
erator
    (Af)(x) = A(x)f(x),    D(A) = {f ∈ L²(ℝⁿ, dµ) | Af ∈ L²(ℝⁿ, dµ)},        (2.71)
given by multiplication with the measurable function A : ℝⁿ → ℂ. Clearly
(A − z)⁻¹ is given by the multiplication operator
        (A − z)⁻¹ f(x) = (1/(A(x) − z)) f(x),
        D((A − z)⁻¹) = {f ∈ L²(ℝⁿ, dµ) | (1/(A − z)) f ∈ L²(ℝⁿ, dµ)}        (2.72)
whenever this operator is bounded. But ‖(A − z)⁻¹‖ = ‖1/(A − z)‖_∞ ≤ 1/ε is
equivalent to µ({x | |A(x) − z| < ε}) = 0 and hence
        ρ(A) = {z ∈ ℂ | ∃ε > 0 : µ({x | |A(x) − z| < ε}) = 0}.                (2.73)
The spectrum
        σ(A) = {z ∈ ℂ | ∀ε > 0 : µ({x | |A(x) − z| < ε}) > 0}                (2.74)
is also known as the essential range of A(x). Moreover, z is an eigenvalue
of A if µ(A⁻¹({z})) > 0 and χ_{A⁻¹({z})} is a corresponding eigenfunction in
this case.

Example. (Differential operator) Consider again the differential operator
    Af = −i df/dx,    D(A) = {f ∈ AC[0, 2π] | f′ ∈ L², f(0) = f(2π)}        (2.75)
in L²(0, 2π). We already know that the eigenvalues of A are the integers
and that the corresponding normalized eigenfunctions
                un(x) = (1/√(2π)) e^{inx}                                (2.76)
form an orthonormal basis.
    To compute the resolvent, we must find the solution of the correspond-
ing inhomogeneous equation −i f′(x) − z f(x) = g(x). By the variation of
constants formula the solution is given by (this can also be easily verified
directly)
        f(x) = f(0) e^{izx} + i ∫_0^x e^{iz(x−t)} g(t) dt.                (2.77)
Since f must lie in the domain of A, we must have f(0) = f(2π) which gives
        f(0) = (i/(e^{−2πiz} − 1)) ∫_0^{2π} e^{−izt} g(t) dt,    z ∈ ℂ∖ℤ.   (2.78)
(Since z ∈ ℤ are the eigenvalues, the inverse cannot exist in this case.) Hence
        (A − z)⁻¹ g(x) = ∫_0^{2π} G(z, x, t) g(t) dt,                        (2.79)
where
        G(z, x, t) = e^{iz(x−t)} · { −i/(1 − e^{−2πiz})  for t > x,
                                      i/(1 − e^{2πiz})   for t < x },     z ∈ ℂ∖ℤ.   (2.80)
In particular, σ(A) = ℤ.

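The Green's function can be tested against the eigenfunction expansion: since Aun = n un, the resolvent acts on e^{inx} simply by division by n − z. A minimal numerical sketch (assuming NumPy is available; the data g and the point z are ad hoc choices):

```python
# Apply (A - z)^{-1} to g once via the Green's function (2.79)-(2.80) and once
# via the Fourier coefficients of g, using (A - z)^{-1} e^{inx} = e^{inx}/(n - z).
import numpy as np

z = 0.5 + 0.3j                                  # any z not in the spectrum Z
g = lambda t: np.cos(t) + 0.2*np.sin(3*t)       # Fourier modes n = +-1, +-3

def G(x, t):                                    # Green's function (2.80)
    return np.exp(1j*z*(x - t)) * np.where(t > x,
                   -1j/(1 - np.exp(-2j*np.pi*z)),
                    1j/(1 - np.exp(2j*np.pi*z)))

ts = np.linspace(0, 2*np.pi, 20001)             # quadrature grid
xs = np.linspace(0, 2*np.pi, 401)               # evaluation points
dt = ts[1] - ts[0]
f_green = np.array([np.sum(0.5*(G(x, ts[1:])*g(ts[1:]) + G(x, ts[:-1])*g(ts[:-1])))*dt
                    for x in xs])

# exact resolvent from the Fourier coefficients of g
f_exact = (0.5*np.exp(1j*xs)/(1 - z) + 0.5*np.exp(-1j*xs)/(-1 - z)
           - 0.1j*np.exp(3j*xs)/(3 - z) + 0.1j*np.exp(-3j*xs)/(-3 - z))
print(np.max(np.abs(f_green - f_exact)))        # small; quadrature error only
```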
    If z, z′ ∈ ρ(A), we have the first resolvent formula
    RA(z) − RA(z′) = (z − z′) RA(z) RA(z′) = (z − z′) RA(z′) RA(z).        (2.81)
In fact,
    (A − z)⁻¹ − (z − z′)(A − z)⁻¹(A − z′)⁻¹
            = (A − z)⁻¹(1 − (z − A + A − z′)(A − z′)⁻¹) = (A − z′)⁻¹,        (2.82)
which proves the first equality. The second follows after interchanging z and
z′. Now fix z′ = z0 and use (2.81) recursively to obtain
    RA(z) = Σ_{j=0}^n (z − z0)^j RA(z0)^{j+1} + (z − z0)^{n+1} RA(z0)^{n+1} RA(z).   (2.83)
The sequence of bounded operators
        Rn = Σ_{j=0}^n (z − z0)^j RA(z0)^{j+1}                                (2.84)
converges to a bounded operator if |z − z0| < ‖RA(z0)‖⁻¹ and clearly we
expect z ∈ ρ(A) and Rn → RA(z) in this case. Let R∞ = lim_{n→∞} Rn and
set ϕn = Rn ψ, ϕ = R∞ ψ for some ψ ∈ H. Then a quick calculation shows
        ARn ψ = (A − z0)Rn ψ + z0 ϕn = ψ + (z − z0)ϕ_{n−1} + z0 ϕn.        (2.85)
Hence (ϕn, Aϕn) → (ϕ, ψ + zϕ) shows ϕ ∈ D(A) (since A is closed) and
(A − z)R∞ ψ = ψ. Similarly, for ψ ∈ D(A),
        Rn Aψ = ψ + (z − z0)ϕ_{n−1} + z0 ϕn                                (2.86)
and hence R∞ (A − z)ψ = ψ after taking the limit. Thus R∞ = RA (z) as
anticipated.
    If A is bounded, a similar argument verifies the Neumann series for
the resolvent
        RA(z) = −Σ_{j=0}^{n−1} A^j/z^{j+1} + (1/z^n) A^n RA(z)
              = −Σ_{j=0}^{∞} A^j/z^{j+1},        |z| > ‖A‖.                (2.87)

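The convergence in (2.87) is easy to check numerically. A minimal sketch (assuming NumPy is available; the matrix is an ad hoc choice):

```python
# The Neumann series (2.87) converges to the resolvent once |z| > ||A||.
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
z = 2.0 * np.linalg.norm(A, 2) * np.exp(0.7j)          # |z| = 2 ||A||
R = np.linalg.inv(A - z*np.eye(5))                      # the resolvent (A - z)^{-1}

S = np.zeros((5, 5), dtype=complex)
for j in range(200):                                    # partial sums of (2.87)
    S -= np.linalg.matrix_power(A, j) / z**(j + 1)
print(np.linalg.norm(S - R))                            # ~ machine precision
```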
     In summary we have proved the following:
Theorem 2.15. The resolvent set ρ(A) is open and RA : ρ(A) → L(H) is
holomorphic; that is, it has an absolutely convergent power series expansion
around every point z0 ∈ ρ(A). In addition,
                        ‖RA(z)‖ ≥ dist(z, σ(A))⁻¹                        (2.88)
and if A is bounded, we have {z ∈ ℂ | |z| > ‖A‖} ⊆ ρ(A).

     As a consequence we obtain the useful
Lemma 2.16. We have z ∈ σ(A) if there is a sequence ψn ∈ D(A) such
that ‖ψn‖ = 1 and ‖(A − z)ψn‖ → 0. If z is a boundary point of ρ(A), then
the converse is also true. Such a sequence is called a Weyl sequence.

Proof. Let ψn be a Weyl sequence. Then z ∈ ρ(A) is impossible by 1 =
‖ψn‖ = ‖RA(z)(A − z)ψn‖ ≤ ‖RA(z)‖ ‖(A − z)ψn‖ → 0. Conversely, by
(2.88) there is a sequence zn → z and corresponding vectors ϕn ∈ H such
that ‖RA(zn)ϕn‖ ‖ϕn‖⁻¹ → ∞. Let ψn = RA(zn)ϕn and rescale ϕn such that
‖ψn‖ = 1. Then ϕn → 0 and hence
        ‖(A − z)ψn‖ = ‖ϕn + (zn − z)ψn‖ ≤ ‖ϕn‖ + |z − zn| → 0
shows that ψn is a Weyl sequence.

     Let us also note the following spectral mapping result.
Lemma 2.17. Suppose A is injective. Then
                σ(A⁻¹)∖{0} = (σ(A)∖{0})⁻¹.                                (2.89)
In addition, we have Aψ = zψ if and only if A⁻¹ψ = z⁻¹ψ.

Proof. Suppose z ∈ ρ(A)∖{0}. Then we claim
                R_{A⁻¹}(z⁻¹) = −zARA(z) = −z − z²RA(z).
In fact, the right-hand side is a bounded operator from H → Ran(A) =
D(A−1 ) and
         (A−1 − z −1 )(−zARA (z))ϕ = (−z + A)RA (z)ϕ = ϕ,        ϕ ∈ H.
Conversely, if ψ ∈   D(A−1 )   = Ran(A), we have ψ = Aϕ and hence
          (−zARA (z))(A−1 − z −1 )ψ = ARA (z)((A − z)ϕ) = Aϕ = ψ.
Thus z −1 ∈ ρ(A−1 ). The rest follows after interchanging the roles of A and
A−1 .

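For matrices this spectral mapping is immediate to verify. A minimal sketch (assuming NumPy is available; the matrix is an ad hoc choice):

```python
# Eigenvalues of A^{-1} are the reciprocals of the eigenvalues of A, cf. Lemma 2.17.
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, -0.5]])                            # eigenvalues 2 and -0.5
print(np.sort(1/np.linalg.eigvals(A)))                 # reciprocals: -2.0, 0.5
print(np.sort(np.linalg.eigvals(np.linalg.inv(A))))    # the same numbers
```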
      Next, let us characterize the spectra of self-adjoint operators.
Theorem 2.18. Let A be symmetric. Then A is self-adjoint if and only if
σ(A) ⊆ ℝ and (A − E) ≥ 0, E ∈ ℝ, if and only if σ(A) ⊆ [E, ∞). Moreover,
‖RA(z)‖ ≤ |Im(z)|⁻¹ and, if (A − E) ≥ 0, ‖RA(λ)‖ ≤ |λ − E|⁻¹, λ < E.

Proof. If σ(A) ⊆ ℝ, then Ran(A + z) = H, z ∈ ℂ∖ℝ, and hence A is
self-adjoint by Lemma 2.7. Conversely, if A is self-adjoint (resp. A ≥ E),
then RA(z) exists for z ∈ ℂ∖ℝ (resp. z ∈ ℂ∖[E, ∞)) and satisfies the given
estimates as has been shown in the proof of Lemma 2.7.

      In particular, we obtain (show this!)
Theorem 2.19. Let A be self-adjoint. Then
                inf σ(A) = inf_{ψ∈D(A), ‖ψ‖=1} ⟨ψ, Aψ⟩                        (2.90)
and
                sup σ(A) = sup_{ψ∈D(A), ‖ψ‖=1} ⟨ψ, Aψ⟩.                        (2.91)

      For the eigenvalues and corresponding eigenfunctions we have
Lemma 2.20. Let A be symmetric. Then all eigenvalues are real and eigen-
vectors corresponding to different eigenvalues are orthogonal.

Proof. If Aψj = λj ψj, j = 1, 2, we have
    λ1‖ψ1‖² = ⟨ψ1, λ1ψ1⟩ = ⟨ψ1, Aψ1⟩ = ⟨Aψ1, ψ1⟩ = ⟨λ1ψ1, ψ1⟩ = λ1*‖ψ1‖²
and
                (λ1 − λ2)⟨ψ1, ψ2⟩ = ⟨Aψ1, ψ2⟩ − ⟨ψ1, Aψ2⟩ = 0,
finishing the proof.

    The result does not imply that two linearly independent eigenfunctions
to the same eigenvalue are orthogonal. However, it is no restriction to
assume that they are since we can use Gram–Schmidt to find an orthonormal
basis for Ker(A − λ). If H is finite dimensional, we can always find an
orthonormal basis of eigenvectors. In the infinite dimensional case this is
no longer true in general. However, if there is an orthonormal basis of
eigenvectors, then A is essentially self-adjoint.
Theorem 2.21. Suppose A is a symmetric operator which has an orthonor-
mal basis of eigenfunctions {ϕj }. Then A is essentially self-adjoint. In
particular, it is essentially self-adjoint on span{ϕj }.
Proof. Consider the set of all finite linear combinations ψ = Σ_{j=0}^n cj ϕj
which is dense in H. Then φ = Σ_{j=0}^n (cj/(λj ± i)) ϕj ∈ D(A) and (A ± i)φ = ψ
shows that Ran(A ± i) is dense.

    Similarly, we can characterize the spectra of unitary operators. Recall
that a bijection U is called unitary if ⟨Uψ, Uψ⟩ = ⟨ψ, U*Uψ⟩ = ⟨ψ, ψ⟩. Thus
U is unitary if and only if
                                U* = U⁻¹.                                (2.92)
Theorem 2.22. Let U be unitary. Then σ(U ) ⊆ {z ∈ C| |z| = 1}. All
eigenvalues have modulus one and eigenvectors corresponding to different
eigenvalues are orthogonal.

Proof. Since ‖U‖ ≤ 1, we have σ(U) ⊆ {z ∈ ℂ | |z| ≤ 1}. Moreover, U⁻¹
is also unitary and hence σ(U) ⊆ {z ∈ ℂ | |z| ≥ 1} by Lemma 2.17. If
Uψj = zj ψj, j = 1, 2, we have
                (z1 − z2)⟨ψ1, ψ2⟩ = ⟨U*ψ1, ψ2⟩ − ⟨ψ1, Uψ2⟩ = 0
since Uψ = zψ implies U*ψ = U⁻¹ψ = z⁻¹ψ = z*ψ.
Problem 2.17. Suppose A is closed and B bounded:
        • Show that I + B has a bounded inverse if ‖B‖ < 1.
        • Suppose A has a bounded inverse. Then so does A + B if ‖B‖ <
          ‖A⁻¹‖⁻¹.
Problem 2.18. What is the spectrum of an orthogonal projection?
Problem 2.19. Compute the resolvent of
                Af = f′,      D(A) = {f ∈ H¹[0, 1] | f(0) = 0}
and show that unbounded operators can have empty spectrum.
Problem 2.20. Compute the eigenvalues and eigenvectors of A = −d²/dx²,
D(A) = {f ∈ H²(0, π) | f(0) = f(π) = 0}. Compute the resolvent of A.
Problem 2.21. Find a Weyl sequence for the self-adjoint operator A =
−d²/dx², D(A) = H²(ℝ) for z ∈ (0, ∞). What is σ(A)? (Hint: Cut off the
solutions of −u″(x) = z u(x) outside a finite ball.)
Problem 2.22. Suppose A = \overline{A0}. If ψn ∈ D(A) is a Weyl sequence for
z ∈ σ(A), then there is also one with ψ̃n ∈ D(A0).
Problem 2.23. Suppose A is bounded. Show that the spectra of AA∗ and
A∗ A coincide away from 0 by showing
    R_{AA*}(z) = (1/z)(A R_{A*A}(z) A* − 1),    R_{A*A}(z) = (1/z)(A* R_{AA*}(z) A − 1).
                                                                                (2.93)

2.5. Orthogonal sums of operators
Let Hj , j = 1, 2, be two given Hilbert spaces and let Aj : D(Aj ) → Hj be
two given operators. Setting H = H1 ⊕ H2 , we can define an operator
                 A = A1 ⊕ A2 ,           D(A) = D(A1 ) ⊕ D(A2 )                 (2.94)
by setting A(ψ1 + ψ2 ) = A1 ψ1 + A2 ψ2 for ψj ∈ D(Aj ). Clearly A is closed,
(essentially) self-adjoint, etc., if and only if both A1 and A2 are. The same
considerations apply to countable orthogonal sums. Let H = ⊕_j Hj and set
        A = ⊕_j Aj,        D(A) = {ψ ∈ ⊕_j D(Aj) | Aψ ∈ H}.                (2.95)

Then we have
Theorem 2.23. Suppose Aj are self-adjoint operators on Hj. Then A =
⊕_j Aj is self-adjoint and
        RA(z) = ⊕_j R_{Aj}(z),        z ∈ ρ(A) = ℂ∖σ(A)                        (2.96)
where
        σ(A) = \overline{⋃_j σ(Aj)}                                        (2.97)
(the closure can be omitted if there are only finitely many terms).

Proof. Fix z ∈ ℂ∖ℝ and let ε = |Im(z)|. Then, by Theorem 2.18,
‖R_{Aj}(z)‖ ≤ ε⁻¹ and so R(z) = ⊕_j R_{Aj}(z) is a bounded operator with
‖R(z)‖ ≤ ε⁻¹ (cf. Problem 2.26). It is straightforward to check that R(z)
is in fact the resolvent of A and thus σ(A) ⊆ ℝ. In particular, A is self-
adjoint by Theorem 2.18. To see that σ(A) ⊆ \overline{⋃_j σ(Aj)}, note that the
above argument can be repeated with ε = dist(z, \overline{⋃_j σ(Aj)}) > 0, which will
follow from the spectral theorem (Problem 3.5) to be proven in the next
chapter. Conversely, if z ∈ σ(Aj), there is a corresponding Weyl sequence
ψn ∈ D(Aj) ⊆ D(A) and hence z ∈ σ(A).
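For finitely many finite-dimensional summands, (2.96) and (2.97) amount to familiar facts about block-diagonal matrices. A minimal sketch (assuming NumPy and SciPy are available; the blocks are ad hoc choices):

```python
# For A = A1 (+) A2 the spectrum is the union of the spectra and the resolvent
# is the orthogonal sum of the resolvents.
import numpy as np
from scipy.linalg import block_diag

A1 = np.diag([1.0, 2.0])
A2 = np.array([[0.0, 1.0],
               [1.0, 0.0]])                            # eigenvalues -1, 1
A = block_diag(A1, A2)
print(np.linalg.eigvalsh(A))                           # -1, 1, 1, 2

z = 0.5 + 1.0j
R = np.linalg.inv(A - z*np.eye(4))
R_sum = block_diag(np.linalg.inv(A1 - z*np.eye(2)), np.linalg.inv(A2 - z*np.eye(2)))
print(np.linalg.norm(R - R_sum))                       # 0 up to rounding
```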
    Conversely, given an operator A, it might be useful to write A as an
orthogonal sum and investigate each part separately.
    Let H1 ⊆ H be a closed subspace and let P1 be the corresponding pro-
jector. We say that H1 reduces the operator A if P1A ⊆ AP1. Note that
this is equivalent to P1D(A) ⊆ D(A) and P1Aψ = AP1ψ for ψ ∈ D(A).
Moreover, if we set H2 = H1^⊥, we have H = H1 ⊕ H2 and P2 = 1 − P1 reduces
A as well.

Lemma 2.24. Suppose H = ⊕_j Hj where each Hj reduces A. Then A = ⊕_j Aj,
where
                Aj ψ = Aψ,        D(Aj) = Pj D(A) ⊆ D(A).                (2.98)
If A is closable, then Hj also reduces \overline{A} and
                \overline{A} = ⊕_j \overline{Aj}.                        (2.99)

Proof. As already noted, Pj D(A) ⊆ D(A) and thus every ψ ∈ D(A) can be
written as ψ = Σ_j Pj ψ; that is, D(A) = ⊕_j D(Aj). Moreover, if ψ ∈ D(Aj),
we have Aψ = APj ψ = Pj Aψ ∈ Hj and thus Aj : D(Aj) → Hj which proves
the first claim.
    Now let us turn to the second claim. Suppose ψ ∈ D(\overline{A}). Then there
is a sequence ψn ∈ D(A) such that ψn → ψ and Aψn → ϕ = \overline{A}ψ. Thus
Pj ψn → Pj ψ and APj ψn = Pj Aψn → Pj ϕ which shows Pj ψ ∈ D(\overline{A}) and
Pj \overline{A}ψ = \overline{A}Pj ψ; that is, Hj reduces \overline{A}. Moreover, this argument also
shows Pj D(\overline{A}) ⊆ D(\overline{Aj}) and the converse follows analogously.

     If A is self-adjoint, then H1 reduces A if P1 D(A) ⊆ D(A) and AP1 ψ ∈ H1
for every ψ ∈ D(A). In fact, if ψ ∈ D(A), we can write ψ = ψ1 ⊕ ψ2 , with
P2 = 1 − P1 and ψj = Pj ψ ∈ D(A). Since AP1 ψ = Aψ1 and P1 Aψ =
P1 Aψ1 + P1 Aψ2 = Aψ1 + P1 Aψ2 , we need to show P1 Aψ2 = 0. But this
follows since
                ⟨ϕ, P1 Aψ2⟩ = ⟨AP1 ϕ, ψ2⟩ = 0                        (2.100)
for every ϕ ∈ D(A).

Problem 2.24. Show (⊕_j Aj)* = ⊕_j Aj*.

Problem 2.25. Show that A defined in (2.95) is closed if and only if all Aj
are.

Problem 2.26. Show that for A defined in (2.95), we have ‖A‖ = sup_j ‖Aj‖.
2.6. Self-adjoint extensions
It is safe to skip this entire section on first reading.
    In many physical applications a symmetric operator is given. If this
operator turns out to be essentially self-adjoint, there is a unique self-adjoint
extension and everything is fine. However, if it is not, it is important to find
out if there are self-adjoint extensions at all (for physical problems there
better be) and to classify them.
    In Section 2.2 we saw that A is essentially self-adjoint if Ker(A* − z) =
Ker(A* − z*) = {0} for one z ∈ ℂ∖ℝ. Hence self-adjointness is related to
the dimension of these spaces and one calls the numbers
        d±(A) = dim K±,        K± = Ran(A ± i)^⊥ = Ker(A* ∓ i),        (2.101)
defect indices of A (we have chosen z = i for simplicity; any other z ∈ ℂ∖ℝ
would be as good). If d−(A) = d+(A) = 0, there is one self-adjoint extension
of A, namely \overline{A}. But what happens in the general case? Is there more than
one extension, or maybe none at all? These questions can be answered by
virtue of the Cayley transform
           V = (A − i)(A + i)−1 : Ran(A + i) → Ran(A − i).                (2.102)

Theorem 2.25. The Cayley transform is a bijection from the set of all
symmetric operators A to the set of all isometric operators V (i.e., ‖Vϕ‖ =
‖ϕ‖ for all ϕ ∈ D(V)) for which Ran(1 − V) is dense.

Proof. Since A is symmetric, we have ‖(A ± i)ψ‖² = ‖Aψ‖² + ‖ψ‖² for all
ψ ∈ D(A) by a straightforward computation. Thus for every ϕ = (A + i)ψ ∈
D(V) = Ran(A + i) we have
                ‖Vϕ‖ = ‖(A − i)ψ‖ = ‖(A + i)ψ‖ = ‖ϕ‖.
Next observe
    1 ± V = ((A + i) ± (A − i))(A + i)⁻¹, that is, 1 + V = 2A(A + i)⁻¹ and 1 − V = 2i(A + i)⁻¹,
which shows that Ran(1 − V) = D(A) is dense and
                        A = i(1 + V)(1 − V)⁻¹.
Conversely, let V be given and use the last equation to define A.
    Since V is isometric, we have ⟨(1 ± V)ϕ, (1 ∓ V)ϕ⟩ = ±2i Im⟨Vϕ, ϕ⟩
for all ϕ ∈ D(V) by a straightforward computation. Thus for every ψ =
(1 − V)ϕ ∈ D(A) = Ran(1 − V) we have
    ⟨Aψ, ψ⟩ = −i⟨(1 + V)ϕ, (1 − V)ϕ⟩ = i⟨(1 − V)ϕ, (1 + V)ϕ⟩ = ⟨ψ, Aψ⟩;
that is, A is symmetric. Finally observe
    A ± i = i((1 + V) ± (1 − V))(1 − V)⁻¹, that is, A + i = 2i(1 − V)⁻¹ and A − i = 2iV(1 − V)⁻¹,
which shows that A is the Cayley transform of V and finishes the proof.

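For a self-adjoint matrix the whole correspondence can be seen directly. A minimal sketch (assuming NumPy is available; the Hermitian matrix is an ad hoc choice):

```python
# The Cayley transform (2.102) of a self-adjoint matrix is unitary, and
# A = i(1 + V)(1 - V)^{-1} recovers A.
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((4, 4)) + 1j*rng.standard_normal((4, 4))
A = (B + B.conj().T)/2
I = np.eye(4)

V = (A - 1j*I) @ np.linalg.inv(A + 1j*I)               # Cayley transform
print(np.linalg.norm(V.conj().T @ V - I))              # ~ 1e-15: V is unitary
A_back = 1j*(I + V) @ np.linalg.inv(I - V)             # inverse transform
print(np.linalg.norm(A_back - A))                      # ~ 1e-14
```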
    Thus A is self-adjoint if and only if its Cayley transform V is unitary.
Moreover, finding a self-adjoint extension of A is equivalent to finding a
unitary extension of V and this in turn is equivalent to (taking the closure
and) finding a unitary operator from D(V )⊥ to Ran(V )⊥ . This is possible
if and only if both spaces have the same dimension, that is, if and only if
d+ (A) = d− (A).
Theorem 2.26. A symmetric operator has self-adjoint extensions if and
only if its defect indices are equal.
   In this case let A1 be a self-adjoint extension and V1 its Cayley trans-
form. Then
  D(A1 ) = D(A) + (1 − V1 )K+ = {ψ + ϕ+ − V1 ϕ+ |ψ ∈ D(A), ϕ+ ∈ K+ }
                                                               (2.103)
and
                A1 (ψ + ϕ+ − V1 ϕ+ ) = Aψ + iϕ+ + iV1 ϕ+ .     (2.104)
Moreover,
    (A1 ± i)⁻¹ = (A ± i)⁻¹ ⊕ (∓i/2) Σ_j ⟨ϕ_j^±, ·⟩ (ϕ_j^± − ϕ_j^∓),        (2.105)
where {ϕ_j^+} is an orthonormal basis for K+ and ϕ_j^− = V1 ϕ_j^+.

Proof. From the proof of the previous theorem we know that D(A1) =
Ran(1 − V1) = Ran(1 − V) + (1 − V1)K+ = D(A) + (1 − V1)K+. Moreover,
A1(ψ + ϕ+ − V1ϕ+) = Aψ + i(1 + V1)(1 − V1)⁻¹(1 − V1)ϕ+ = Aψ + i(1 + V1)ϕ+.
    Similarly, Ran(A1 ± i) = Ran(A ± i) ⊕ K± and on K+ we have (A1 + i)⁻¹ =
−(i/2)(1 − V1), respectively, on K− we have (A1 − i)⁻¹ = −(i/2)(V1⁻¹ − 1).


    Note that instead of z = i we could use V(z) = (A + z*)(A + z)⁻¹ for
any z ∈ ℂ∖ℝ. We remark that in this case one can show that the defect
indices are independent of z ∈ ℂ+ = {z ∈ ℂ | Im(z) > 0}.
Example. Recall the operator A = −i d/dx, D(A) = {f ∈ H¹(0, 2π) | f(0) =
f(2π) = 0} with adjoint A* = −i d/dx, D(A*) = H¹(0, 2π).
    Clearly
                        K± = span{e^{∓x}}                                (2.106)
is one-dimensional and hence all unitary maps are of the form
                Vθ e^{2π−x} = e^{iθ} e^{x},        θ ∈ [0, 2π).                (2.107)
The functions in the domain of the corresponding operator Aθ are given by
        fθ(x) = f(x) + α(e^{2π−x} − e^{iθ} e^{x}),        f ∈ D(A), α ∈ ℂ.        (2.108)
In particular, fθ satisfies
        fθ(2π) = e^{iθ̃} fθ(0),        e^{iθ̃} = (1 − e^{iθ} e^{2π})/(e^{2π} − e^{iθ}),        (2.109)
and thus we have
        D(Aθ) = {f ∈ H¹(0, 2π) | f(2π) = e^{iθ̃} f(0)}.                        (2.110)


     Concerning closures, we can combine the fact that a bounded operator
is closed if and only if its domain is closed with item (iii) from Lemma 2.4
to obtain
Lemma 2.27. The following items are equivalent.
       • A is closed.
       • D(V ) = Ran(A + i) is closed.
       • Ran(V ) = Ran(A − i) is closed.
       • V is closed.

    Next, we give a useful criterion for the existence of self-adjoint exten-
sions. A conjugate linear map C : H → H is called a conjugation if it
satisfies C² = I and ⟨Cψ, Cϕ⟩ = ⟨ϕ, ψ⟩. The prototypical example is, of
course, complex conjugation Cψ = ψ*. An operator A is called C-real if
        CD(A) ⊆ D(A),    and ACψ = CAψ,    ψ ∈ D(A).                        (2.111)
Note that in this case CD(A) = D(A), since D(A) = C²D(A) ⊆ CD(A).
Theorem 2.28. Suppose the symmetric operator A is C-real. Then its
defect indices are equal.

Proof. Let {ϕj } be an orthonormal set in Ran(A + i)⊥ . Then {Cϕj } is an
orthonormal set in Ran(A − i)⊥ . Hence {ϕj } is an orthonormal basis for
Ran(A + i)⊥ if and only if {Cϕj } is an orthonormal basis for Ran(A − i)⊥ .
Hence the two spaces have the same dimension.

   Finally, we note the following useful formula for the difference of resol-
vents of self-adjoint extensions.
Lemma 2.29. If Aj, j = 1, 2, are self-adjoint extensions of A and if {ϕj(z)}
is an orthonormal basis for Ker(A* − z), then
    (A1 − z)⁻¹ − (A2 − z)⁻¹ = Σ_{j,k} (α¹_{jk}(z) − α²_{jk}(z)) ⟨ϕj(z*), ·⟩ ϕk(z),   (2.112)
where
                α^l_{jk}(z) = ⟨ϕk(z), (Al − z)⁻¹ ϕj(z*)⟩.                        (2.113)

Proof. First observe that ((A1 − z)⁻¹ − (A2 − z)⁻¹)ϕ is zero for every
ϕ ∈ Ran(A − z). Hence it suffices to consider vectors of the form ϕ =
Σ_j ⟨ϕj(z*), ϕ⟩ ϕj(z*) ∈ Ran(A − z)^⊥ = Ker(A* − z*). Hence we have
        (A1 − z)⁻¹ − (A2 − z)⁻¹ = Σ_j ⟨ϕj(z*), ·⟩ ψj(z),
where
        ψj(z) = ((A1 − z)⁻¹ − (A2 − z)⁻¹) ϕj(z*).
Now computing the adjoint once using ((Al − z)⁻¹)* = (Al − z*)⁻¹ and once
using (Σ_j ⟨ϕj, ·⟩ ψj)* = Σ_j ⟨ψj, ·⟩ ϕj, we obtain
        Σ_j ⟨ϕj(z), ·⟩ ψj(z*) = Σ_j ⟨ψj(z), ·⟩ ϕj(z*).
Evaluating at ϕk(z) implies
        ψk(z) = Σ_j ⟨ψj(z*), ϕk(z*)⟩ ϕj(z) = Σ_j (α¹_{kj}(z) − α²_{kj}(z)) ϕj(z)
and finishes the proof.
Problem 2.27. Compute the defect indices of
                A0 = i d/dx,        D(A0) = C_c^∞((0, ∞)).
Can you give a self-adjoint extension of A0?
Problem 2.28. Let A1 be a self-adjoint extension of A and suppose ϕ ∈
Ker(A∗ − z0 ). Show that ϕ(z) = ϕ + (z − z0 )(A1 − z)−1 ϕ ∈ Ker(A∗ − z).

2.7. Appendix: Absolutely continuous functions
Let (a, b) ⊆ ℝ be some interval. We denote by
  AC(a, b) = {f ∈ C(a, b) | f(x) = f(c) + ∫_c^x g(t)dt, c ∈ (a, b), g ∈ L¹_loc(a, b)}
                                                                        (2.114)
the set of all absolutely continuous functions. That is, f is absolutely
continuous if and only if it can be written as the integral of some locally
integrable function. Note that AC(a, b) is a vector space.
    By Corollary A.36, f(x) = f(c) + ∫_c^x g(t)dt is differentiable a.e. (with re-
spect to Lebesgue measure) and f′(x) = g(x). In particular, g is determined
uniquely a.e.
    If [a, b] is a compact interval, we set
        AC[a, b] = {f ∈ AC(a, b) | g ∈ L¹(a, b)} ⊆ C[a, b].                (2.115)
If f, g ∈ AC[a, b], we have the formula of partial integration (Problem 2.29)
    ∫_a^b f(x) g′(x) dx = f(b)g(b) − f(a)g(a) − ∫_a^b f′(x) g(x) dx        (2.116)
which also implies that the product rule holds for absolutely continuous
functions.
    We set
  H^m(a, b) = {f ∈ L²(a, b) | f^{(j)} ∈ AC(a, b), f^{(j+1)} ∈ L²(a, b), 0 ≤ j ≤ m − 1}.
                                                                        (2.117)
      Then we have
Lemma 2.30. Suppose f ∈ H m (a, b), m ≥ 1. Then f is bounded and
limx↓a f (j) (x), respectively, limx↑b f (j) (x), exists for 0 ≤ j ≤ m − 1. More-
over, the limit is zero if the endpoint is infinite.

Proof. If the endpoint is finite, then f^{(j+1)} is integrable near this endpoint
and hence the claim follows. If the endpoint is infinite, note that
        |f^{(j)}(x)|² = |f^{(j)}(c)|² + 2 ∫_c^x Re(f^{(j)}(t)* f^{(j+1)}(t)) dt
shows that the limit exists (dominated convergence). Since f^{(j)} is square
integrable, the limit must be zero.

    Let me remark that it suffices to check that the function plus the highest
derivative are in L2 ; the lower derivatives are then automatically in L2 . That
is,
  H^m(a, b) = {f ∈ L²(a, b) | f^{(j)} ∈ AC(a, b), 0 ≤ j ≤ m − 1, f^{(m)} ∈ L²(a, b)}.
                                                                        (2.118)
For a finite endpoint this is straightforward. For an infinite endpoint this
can also be shown directly, but it is much easier to use the Fourier transform
(compare Section 7.1).
Problem 2.29. Show (2.116). (Hint: Fubini.)
Problem 2.30. A function u ∈ L¹(0, 1) is called weakly differentiable if for
some v ∈ L¹(0, 1) we have
        ∫_0^1 v(x)ϕ(x)dx = −∫_0^1 u(x)ϕ′(x)dx
for all test functions ϕ ∈ C_c^∞(0, 1). Show that u is weakly differentiable if
and only if u is absolutely continuous and u′ = v in this case. (Hint: You will
need that ∫_0^1 u(t)ϕ′(t)dt = 0 for all ϕ ∈ C_c^∞(0, 1) if and only if u is constant.
To see this choose some ϕ0 ∈ C_c^∞(0, 1) with I(ϕ0) = ∫_0^1 ϕ0(t)dt = 1. Then
invoke Lemma 0.37 and use that every ϕ ∈ C_c^∞(0, 1) can be written as
ϕ(t) = Φ′(t) + I(ϕ)ϕ0(t) with Φ(t) = ∫_0^t ϕ(s)ds − I(ϕ) ∫_0^t ϕ0(s)ds.)
Problem 2.31. Show that H¹(a, b) together with the norm
        ‖f‖²_{2,1} = ∫_a^b |f(t)|² dt + ∫_a^b |f′(t)|² dt
is a Hilbert space.
Problem 2.32. What is the closure of C_0^∞(a, b) in H¹(a, b)? (Hint: Start
with the case where (a, b) is finite.)
Problem 2.33. Show that if f ∈ AC(a, b) and f′ ∈ L^p(a, b), then f is
Hölder continuous:
        |f(x) − f(y)| ≤ ‖f′‖_p |x − y|^{1−1/p}.
Chapter 3




The spectral theorem


The time evolution of a quantum mechanical system is governed by the
Schrödinger equation
        i (d/dt) ψ(t) = Hψ(t).                                                (3.1)
If H = ℂⁿ and H is hence a matrix, this system of ordinary differential
equations is solved by the matrix exponential
        ψ(t) = exp(−itH) ψ(0).                                                (3.2)
This matrix exponential can be defined by a convergent power series
        exp(−itH) = Σ_{n=0}^∞ ((−it)ⁿ/n!) Hⁿ.                                (3.3)
For this approach the boundedness of H is crucial, which might not be the
case for a quantum system. However, the best way to compute the matrix
exponential and to understand the underlying dynamics is to diagonalize H.
But how do we diagonalize a self-adjoint operator? The answer is known as
the spectral theorem.
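In the matrix case both routes are available and agree. A minimal numerical sketch (assuming NumPy and SciPy are available; the Hermitian matrix is an ad hoc choice):

```python
# exp(-itH) computed by a series/Pade method (scipy) agrees with the result of
# diagonalizing H, i.e. exp(-itH) = U diag(exp(-it*lambda_j)) U*.
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(3)
B = rng.standard_normal((4, 4)) + 1j*rng.standard_normal((4, 4))
H = (B + B.conj().T)/2
t = 1.7

E1 = expm(-1j*t*H)                                     # series-based evaluation
lam, U = np.linalg.eigh(H)                             # H = U diag(lam) U*
E2 = U @ np.diag(np.exp(-1j*t*lam)) @ U.conj().T       # diagonalize, then exponentiate
print(np.linalg.norm(E1 - E2))                         # ~ 1e-14
print(np.linalg.norm(E1.conj().T @ E1 - np.eye(4)))    # exp(-itH) is unitary
```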

3.1. The spectral theorem
In this section we want to address the problem of defining functions of a
self-adjoint operator A in a natural way, that is, such that
 (f +g)(A) = f (A)+g(A),     (f g)(A) = f (A)g(A),     (f ∗ )(A) = f (A)∗ . (3.4)
As long as f and g are polynomials, no problems arise. If we want to extend
this definition to a larger class of functions, we will need to perform some
limiting procedure. Hence we could consider convergent power series or
equip the space of polynomials on the spectrum with the sup norm. In both
cases this only works if the operator A is bounded. To overcome this limita-
tion, we will use characteristic functions χΩ (A) instead of powers Aj . Since
χΩ (λ)2 = χΩ (λ), the corresponding operators should be orthogonal projec-
tions. Moreover, we should also have χ_ℝ(A) = I and χ_Ω(A) = Σ_{j=1}^n χ_{Ωj}(A)
for any finite union Ω = ⋃_{j=1}^n Ωj of disjoint sets. The only remaining prob-
lem is of course the definition of χΩ (A). However, we will defer this problem
and begin by developing a functional calculus for a family of characteristic
functions χΩ (A).
   Denote the Borel sigma algebra of R by B. A projection-valued mea-
sure is a map
                     P : B → L(H),       Ω → P (Ω),              (3.5)
from the Borel sets to the set of orthogonal projections, that is, P (Ω)∗ =
P (Ω) and P (Ω)2 = P (Ω), such that the following two conditions hold:
        (i) P (R) = I.
       (ii) If Ω = ⋃_n Ωn with Ωn ∩ Ωm = ∅ for n ≠ m, then Σ_n P(Ωn)ψ =
            P(Ω)ψ for every ψ ∈ H (strong σ-additivity).
    Note that we require strong convergence, Σ_n P(Ωn)ψ = P(Ω)ψ, rather
than norm convergence, Σ_n P(Ωn) = P(Ω). In fact, norm convergence
does not even hold in the simplest case where H = L²(I) and P(Ω) = χΩ
(multiplication operator), since for a multiplication operator the norm is just
the sup norm of the function. Furthermore, it even suffices to require weak
convergence, since w-lim Pn = P for some orthogonal projections implies
s-lim Pn = P by ⟨ψ, Pnψ⟩ = ⟨ψ, Pn²ψ⟩ = ⟨Pnψ, Pnψ⟩ = ‖Pnψ‖² together
with Lemma 1.12 (iv).
Example. Let H = Cn and let A ∈ GL(n) be some symmetric matrix. Let
λ1 , . . . , λm be its (distinct) eigenvalues and let Pj be the projections onto
the corresponding eigenspaces. Then

                PA(Ω) = Σ_{j | λj ∈ Ω} Pj                                (3.6)

is a projection-valued measure.

Example. Let H = L2 (R) and let f be a real-valued measurable function.
Then
                        P(Ω) = χ_{f⁻¹(Ω)}                                (3.7)
is a projection-valued measure (Problem 3.3).

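The first example can be made completely concrete. A minimal numerical sketch (assuming NumPy is available; the symmetric matrix is an ad hoc choice):

```python
# The projection-valued measure (3.6) of a symmetric matrix: P_A(Omega) is the
# sum of the eigenprojections whose eigenvalue lies in Omega.
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 0.0],
              [0.0, 0.0, 5.0]])                        # eigenvalues 1, 3, 5
lam, U = np.linalg.eigh(A)

def P(omega):                                          # omega: a predicate on the reals
    mask = np.array([omega(l) for l in lam])
    cols = U[:, mask]
    return cols @ cols.conj().T

P1 = P(lambda l: l < 4)
P2 = P(lambda l: l > 2)
print(np.linalg.norm(P1 @ P1 - P1))                    # P1 is an orthogonal projection
print(np.linalg.norm(P(lambda l: True) - np.eye(3)))   # P(R) = I
print(np.linalg.norm(P1 @ P2 - P(lambda l: 2 < l < 4)))  # multiplicativity, cf. (3.10) below
```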
      It is straightforward to verify that any projection-valued measure satis-
fies
                P(∅) = 0,        P(ℝ∖Ω) = I − P(Ω),                        (3.8)
and
              P (Ω1 ∪ Ω2 ) + P (Ω1 ∩ Ω2 ) = P (Ω1 ) + P (Ω2 ).             (3.9)
Moreover, we also have
                           P (Ω1 )P (Ω2 ) = P (Ω1 ∩ Ω2 ).                 (3.10)
Indeed, first suppose Ω1 ∩ Ω2 = ∅. Then, taking the square of (3.9), we infer
                         P (Ω1 )P (Ω2 ) + P (Ω2 )P (Ω1 ) = 0.             (3.11)
Multiplying this equation from the right by P (Ω2 ) shows that P (Ω1 )P (Ω2 ) =
−P (Ω2 )P (Ω1 )P (Ω2 ) is self-adjoint and thus P (Ω1 )P (Ω2 ) = P (Ω2 )P (Ω1 ) =
0. For the general case Ω1 ∩ Ω2 ≠ ∅ we now have
  P (Ω1 )P (Ω2 ) = (P (Ω1 − Ω2 ) + P (Ω1 ∩ Ω2 ))(P (Ω2 − Ω1 ) + P (Ω1 ∩ Ω2 ))
                  = P (Ω1 ∩ Ω2 )                                          (3.12)
as stated.
      Moreover, a projection-valued measure is monotone, that is,
                        Ω1 ⊆ Ω2      ⇒        P (Ω1 ) ≤ P (Ω2 ),          (3.13)
in the sense that ⟨ψ, P(Ω1)ψ⟩ ≤ ⟨ψ, P(Ω2)ψ⟩ or equivalently Ran(P(Ω1)) ⊆
Ran(P (Ω2 )) (cf. Problem 1.7). As a useful consequence note that P (Ω2 ) = 0
implies P (Ω1 ) = 0 for every subset Ω1 ⊆ Ω2 .
   To every projection-valued measure there corresponds a resolution of
the identity
                           P (λ) = P ((−∞, λ])                    (3.14)
which has the properties (Problem 3.4):
        (i) P (λ) is an orthogonal projection.
       (ii) P (λ1 ) ≤ P (λ2 ) for λ1 ≤ λ2 .
       (iii) s-limλn ↓λ P (λn ) = P (λ) (strong right continuity).
       (iv) s-limλ→−∞ P (λ) = 0 and s-limλ→+∞ P (λ) = I.
As before, strong right continuity is equivalent to weak right continuity.
    Picking ψ ∈ H, we obtain a finite Borel measure µψ(Ω) = ⟨ψ, P(Ω)ψ⟩ =
‖P(Ω)ψ‖² with µψ(ℝ) = ‖ψ‖² < ∞. The corresponding distribution func-
tion is given by µψ(λ) = ⟨ψ, P(λ)ψ⟩ and since for every distribution function
there is a unique Borel measure (Theorem A.2), for every resolution of the
identity there is a unique projection-valued measure.
    Using the polarization identity (2.16), we also have the complex Borel
measures
  µϕ,ψ(Ω) = ⟨ϕ, P(Ω)ψ⟩ = (1/4)(µ_{ϕ+ψ}(Ω) − µ_{ϕ−ψ}(Ω) + iµ_{ϕ−iψ}(Ω) − iµ_{ϕ+iψ}(Ω)).
                                                                        (3.15)
Note also that, by Cauchy–Schwarz, |µϕ,ψ(Ω)| ≤ ‖ϕ‖ ‖ψ‖.
    Now let us turn to integration with respect to our projection-valued
measure. For any simple function f = Σ_{j=1}^n αj χ_{Ωj} (where Ωj = f⁻¹(αj))
we set
        P(f) ≡ ∫_ℝ f(λ) dP(λ) = Σ_{j=1}^n αj P(Ωj).                        (3.16)
In particular, P(χΩ) = P(Ω). Then ⟨ϕ, P(f)ψ⟩ = Σ_j αj µϕ,ψ(Ωj) shows
        ⟨ϕ, P(f)ψ⟩ = ∫_ℝ f(λ) dµϕ,ψ(λ)                                        (3.17)
and, by linearity of the integral, the operator P is a linear map from the set
of simple functions into the set of bounded linear operators on H. Moreover,
‖P(f)ψ‖² = Σ_j |αj|² µψ(Ωj) (the sets Ωj are disjoint) shows
        ‖P(f)ψ‖² = ∫_ℝ |f(λ)|² dµψ(λ).                                        (3.18)
Equipping the set of simple functions with the sup norm, we infer
        ‖P(f)ψ‖ ≤ ‖f‖_∞ ‖ψ‖,                                                (3.19)
which implies that P has norm one. Since the simple functions are dense
in the Banach space of bounded Borel functions B(R), there is a unique
extension of P to a bounded linear operator P : B(R) → L(H) (whose norm
is one) from the bounded Borel functions on R (with sup norm) to the set
of bounded linear operators on H. In particular, (3.17) and (3.18) remain
true.
    There is some additional structure behind this extension. Recall that
the set L(H) of all bounded linear mappings on H forms a C ∗ algebra. A C ∗
algebra homomorphism φ is a linear map between two C ∗ algebras which
respects both the multiplication and the adjoint; that is, φ(ab) = φ(a)φ(b)
and φ(a∗ ) = φ(a)∗ .
Theorem 3.1. Let P(Ω) be a projection-valued measure on H. Then the
operator
        P : B(ℝ) → L(H),    f ↦ ∫_ℝ f(λ) dP(λ)                                (3.20)
is a C* algebra homomorphism with norm one such that
        ⟨P(g)ϕ, P(f)ψ⟩ = ∫_ℝ g*(λ) f(λ) dµϕ,ψ(λ).                        (3.21)
In addition, if fn(x) → f(x) pointwise and if the sequence sup_{λ∈ℝ} |fn(λ)| is
bounded, then P(fn) → P(f) strongly.

Proof. The properties P (1) = I, P (f ∗ ) = P (f )∗ , and P (f g) = P (f )P (g)
are straightforward for simple functions f . For general f they follow from
continuity. Hence P is a C ∗ algebra homomorphism.
    Equation (3.21) is a consequence of ⟨P(g)ϕ, P(f)ψ⟩ = ⟨ϕ, P(g*f)ψ⟩.
    The last claim follows from the dominated convergence theorem and
(3.18).

    As a consequence of (3.21), observe
        µ_{P(g)ϕ,P(f)ψ}(Ω) = ⟨P(g)ϕ, P(Ω)P(f)ψ⟩ = ∫_Ω g*(λ) f(λ) dµϕ,ψ(λ),   (3.22)
which implies
        dµ_{P(g)ϕ,P(f)ψ} = g* f dµϕ,ψ.                                        (3.23)
Example. Let H = ℂⁿ and A = A* ∈ GL(n), respectively, PA, as in the
previous example. Then
        PA(f) = Σ_{j=1}^m f(λj) Pj.                                        (3.24)
In particular, PA(f) = A for f(λ) = λ.

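Again this is easy to realize in coordinates. A minimal numerical sketch (assuming NumPy is available; the matrix and the functions are ad hoc choices):

```python
# The functional calculus (3.24), P_A(f) = sum_j f(lambda_j) P_j, obtained from
# the eigendecomposition of a symmetric matrix.
import numpy as np

A = np.array([[0.0, 1.0],
              [1.0, 0.0]])                             # eigenvalues -1, 1
lam, U = np.linalg.eigh(A)

def PA(f):
    return U @ np.diag(f(lam)) @ U.conj().T

print(np.linalg.norm(PA(lambda l: l) - A))             # P_A(id) = A
f = lambda l: np.exp(l)
g = lambda l: l**2 + 1
print(np.linalg.norm(PA(lambda l: f(l)*g(l)) - PA(f) @ PA(g)))   # homomorphism, cf. (3.4)
```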
    Next we want to define this operator for unbounded Borel functions.
Since we expect the resulting operator to be unbounded, we need a suitable
domain first. Motivated by (3.18), we set

        Df = {ψ ∈ H | ∫_ℝ |f(λ)|² dµψ(λ) < ∞}.                                (3.25)
This is clearly a linear subspace of H since µ_{αψ}(Ω) = |α|² µψ(Ω) and since
µ_{ϕ+ψ}(Ω) = ‖P(Ω)(ϕ+ψ)‖² ≤ 2(‖P(Ω)ϕ‖² + ‖P(Ω)ψ‖²) = 2(µϕ(Ω) + µψ(Ω))
(by the triangle inequality).
   For every ψ ∈ Df , the sequence of bounded Borel functions
                    fn = χΩn f,        Ωn = {λ| |f (λ)| ≤ n},                        (3.26)
is a Cauchy sequence converging to f in the sense of L2 (R, dµψ ). Hence, by
virtue of (3.18), the vectors ψn = P (fn )ψ form a Cauchy sequence in H and
we can define
        P(f)ψ = lim_{n→∞} P(fn)ψ,        ψ ∈ Df.                        (3.27)
By construction, P (f ) is a linear operator such that (3.18) holds. Since
f ∈ L1 (R, dµψ ) (µψ is finite), (3.17) also remains true at least for ϕ = ψ.
   In addition, Df is dense. Indeed, let Ωn be defined as in (3.26) and
abbreviate ψn = P (Ωn )ψ. Now observe that dµψn = χΩn dµψ and hence
ψn ∈ Df . Moreover, ψn → ψ by (3.18) since χΩn → 1 in L2 (R, dµψ ).
    The operator P(f) has some additional properties. One calls an un-
bounded operator A normal if D(A) = D(A*) and ‖Aψ‖ = ‖A*ψ‖ for all
ψ ∈ D(A). Note that normal operators are closed since the graph norms on
D(A) = D(A*) are identical.
Theorem 3.2. For any Borel function f, the operator
        P(f) ≡ ∫_ℝ f(λ) dP(λ),        D(P(f)) = Df,                        (3.28)
is normal and satisfies
        P(f)* = P(f*).                                                        (3.29)

Proof. Let f be given and define fn, Ωn as above. Since (3.29) holds for
fn by our previous theorem, we get
        ⟨ϕ, P(f)ψ⟩ = ⟨P(f*)ϕ, ψ⟩
for any ϕ, ψ ∈ Df = Df* by continuity. Thus it remains to show that
D(P(f)*) ⊆ Df. If ψ ∈ D(P(f)*), we have ⟨ψ, P(f)ϕ⟩ = ⟨ψ̃, ϕ⟩ for all
ϕ ∈ Df by definition. By construction of P(f) we have P(fn) = P(f)P(Ωn)
and thus
        ⟨P(fn)*ψ, ϕ⟩ = ⟨ψ, P(fn)ϕ⟩ = ⟨ψ, P(f)P(Ωn)ϕ⟩ = ⟨P(Ωn)ψ̃, ϕ⟩
for any ϕ ∈ H shows P(fn)*ψ = P(Ωn)ψ̃. This proves existence of the limit
        lim_{n→∞} ∫_ℝ |fn|² dµψ = lim_{n→∞} ‖P(fn)*ψ‖² = lim_{n→∞} ‖P(Ωn)ψ̃‖² = ‖ψ̃‖²,
which by monotone convergence implies f ∈ L²(ℝ, dµψ); that is, ψ ∈ Df.
    That P(f) is normal follows from (3.18), which implies ‖P(f)ψ‖² =
‖P(f*)ψ‖² = ∫_ℝ |f(λ)|² dµψ.

    These considerations seem to indicate some kind of correspondence be-
tween the operators P(f) in H and f in L²(ℝ, dµψ). Recall that U : H → H̃
is called unitary if it is a bijection which preserves norms ‖Uψ‖ = ‖ψ‖ (and
hence scalar products). The operators A in H and Ã in H̃ are said to be
unitarily equivalent if
        UA = ÃU,        UD(A) = D(Ã).                                        (3.30)
Clearly, A is self-adjoint if and only if Ã is and σ(A) = σ(Ã).
     Now let us return to our original problem and consider the subspace
                    Hψ = {P (g)ψ|g ∈ L2 (R, dµψ )} ⊆ H.                             (3.31)
Note that Hψ is closed since L2 is and ψn = P (gn )ψ converges in H if and
only if gn converges in L2 . It even turns out that we can restrict P (f ) to
Hψ (see Section 2.5).
Lemma 3.3. The subspace Hψ reduces P (f ); that is, Pψ P (f ) ⊆ P (f )Pψ .
Here Pψ is the projection onto Hψ .
Proof. First suppose f is bounded. Any ϕ ∈ H can be decomposed as
ϕ = P(g)ψ + ϕ⊥. Moreover, ⟨P(h)ψ, P(f)ϕ⊥⟩ = ⟨P(f*h)ψ, ϕ⊥⟩ = 0
for every bounded function h implies P(f)ϕ⊥ ∈ Hψ^⊥. Hence Pψ P(f)ϕ =
Pψ P(f)P(g)ψ = P(f)Pψ ϕ which by definition says that Hψ reduces P(f).
    If f is unbounded, we consider fn = f χΩn as before. Then, for every
ϕ ∈ Df, P(fn)Pψ ϕ = Pψ P(fn)ϕ. Letting n → ∞, we have P(Ωn)Pψ ϕ →
Pψ ϕ and P(fn)Pψ ϕ = P(f)P(Ωn)Pψ ϕ → Pψ P(f)ϕ. Finally, closedness of
P(f) implies Pψ ϕ ∈ Df and P(f)Pψ ϕ = Pψ P(f)ϕ.

    In particular we can decompose P(f) = P(f)|_{Hψ} ⊕ P(f)|_{Hψ^⊥}. Note that
        Pψ Df = Df ∩ Hψ = {P(g)ψ | g, fg ∈ L²(ℝ, dµψ)}                        (3.32)
and P(f)P(g)ψ = P(fg)ψ ∈ Hψ in this case.
   By (3.18), the relation
                               Uψ (P (f )ψ) = f                                      (3.33)
defines a unique unitary operator Uψ : Hψ → L2 (R, dµψ ) such that

                              Uψ P (f )   Hψ
                                               = f Uψ ,                              (3.34)

where f is identified with its corresponding multiplication operator. More-
over, if f is unbounded, we have Uψ (Df ∩Hψ ) = D(f ) = {g ∈ L2 (R, dµψ )|f g ∈
L2 (R, dµψ )} (since ϕ = P (f )ψ implies dµϕ = |f |2 dµψ ) and the above equa-
tion still holds.
    The vector ψ is called cyclic if Hψ = H and in this case our picture is
complete. Otherwise we need to extend this approach. A set {ψj }j∈J (J
some index set) is called a set of spectral vectors if ψj = 1 and Hψi ⊥ Hψj
for all i = j. A set of spectral vectors is called a spectral basis if j Hψj =
H. Luckily a spectral basis always exists:

Lemma 3.4. For every projection-valued measure P , there is an (at most
countable) spectral basis {ψn } such that

                                 H=            Hψn                                   (3.35)
                                          n

and a corresponding unitary operator

                    U=        Uψn : H →             L2 (R, dµψn )                    (3.36)
                          n                     n

such that for any Borel function f ,
                      U P (f ) = f U,          U Df = D(f ).                         (3.37)
Proof. It suffices to show that a spectral basis exists. This can be easily
done using a Gram–Schmidt type construction. First of all observe that if
{ψj }j∈J is a spectral set and ψ ⊥ Hψj for all j, we have Hψ ⊥ Hψj for all j.
Indeed, ψ ⊥ Hψj implies P(g)ψ ⊥ Hψj for every bounded function g since
⟨P(g)ψ, P(f)ψj ⟩ = ⟨ψ, P(g∗ f)ψj ⟩ = 0. But P(g)ψ with g bounded is dense
in Hψ , implying Hψ ⊥ Hψj .
    Now start with some total set {ψ̃j }. Normalize ψ̃1 and choose this to
be ψ1 . Move to the first ψ̃j which is not in Hψ1 , project to the orthogonal
complement of Hψ1 and normalize it. Choose the result to be ψ2 . Proceeding
like this, we get a set of spectral vectors {ψj } such that span{ψ̃j } ⊆ ⊕j Hψj .
Hence H = \overline{span{ψ̃j }} ⊆ ⊕j Hψj .
                            j    j



    It is important to observe that the cardinality of a spectral basis is not
well-defined (in contradistinction to the cardinality of an ordinary basis of
the Hilbert space). However, it can be at most equal to the cardinality of
an ordinary basis. In particular, since H is separable, it is at most count-
able. The minimal cardinality of a spectral basis is called the spectral
multiplicity of P . If the spectral multiplicity is one, the spectrum is called
simple.
Example. Let H = C2 and A = (0 0; 0 1) and consider the associated projection-
valued measure PA (Ω) as before. Then ψ1 = (1, 0) and ψ2 = (0, 1) are a
spectral basis. However, ψ = (1, 1) is cyclic and hence the spectrum of A is
simple. If A = (1 0; 0 1), there is no cyclic vector (why?) and hence the spectral
multiplicity is two.
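    This finite-dimensional situation is easy to explore numerically. The following
is a minimal sketch (assuming NumPy is available) which tests cyclicity by checking
whether ψ, Aψ, . . . , A^{n−1}ψ span C^n:

    import numpy as np

    def is_cyclic(A, psi):
        # psi is cyclic iff the Krylov vectors psi, A psi, ..., A^(n-1) psi span C^n
        n = A.shape[0]
        K = np.column_stack([np.linalg.matrix_power(A, k) @ psi for k in range(n)])
        return np.linalg.matrix_rank(K) == n

    A1 = np.diag([0.0, 1.0])   # simple spectrum
    A2 = np.eye(2)             # spectral multiplicity two

    print(is_cyclic(A1, np.array([1.0, 1.0])))  # True: (1, 1) is cyclic
    print(is_cyclic(A1, np.array([1.0, 0.0])))  # False: the orbit stays in one coordinate axis
    print(is_cyclic(A2, np.array([1.0, 1.0])))  # False: no cyclic vector exists for the identity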

    Using this canonical form of projection-valued measures, it is straight-
forward to prove

Lemma 3.5. Let f, g be Borel functions and α, β ∈ C. Then we have

     αP (f ) + βP (g) ⊆ P (αf + βg),    D(αP (f ) + βP (g)) = D|f |+|g|   (3.38)

and
              P (f )P (g) ⊆ P (f g),   D(P (f )P (g)) = Dg ∩ Df g .       (3.39)

    Now observe that to every projection-valued measure P we can assign a
self-adjoint operator A = ∫_R λ dP(λ). The question is whether we can invert
this map. To do this, we consider the resolvent RA (z) = ∫_R (λ − z)⁻¹ dP(λ).
From (3.17) the corresponding quadratic form is given by
              Fψ (z) = ⟨ψ, RA (z)ψ⟩ = ∫_R 1/(λ − z) dµψ (λ),                (3.40)
which is known as the Borel transform of the measure µψ . By
              Im(Fψ (z)) = Im(z) ∫_R 1/|λ − z|² dµψ (λ),                    (3.41)
we infer that Fψ (z) is a holomorphic map from the upper half plane into
itself. Such functions are called Herglotz or Nevanlinna functions (see
Section 3.4). Moreover, the measure µψ can be reconstructed from Fψ (z)
by the Stieltjes inversion formula
        µψ (λ) = lim_{δ↓0} lim_{ε↓0} (1/π) ∫_{−∞}^{λ+δ} Im(Fψ (t + iε))dt.   (3.42)
(The limit with respect to δ is only here to ensure right continuity of µψ (λ).)
Conversely, if Fψ (z) is a Herglotz function satisfying |Fψ (z)| ≤ M/Im(z),
then it is the Borel transform of a unique measure µψ (given by the Stieltjes
inversion formula) satisfying µψ (R) ≤ M .
    So let A be a given self-adjoint operator and consider the expectation of
the resolvent of A,
                          Fψ (z) = ⟨ψ, RA (z)ψ⟩.                            (3.43)
This function is holomorphic for z ∈ ρ(A) and satisfies
          Fψ (z∗ ) = Fψ (z)∗    and    |Fψ (z)| ≤ ‖ψ‖²/Im(z)                (3.44)
(see (2.69) and Theorem 2.18). Moreover, the first resolvent formula (2.81)
shows that it maps the upper half plane to itself:
                    Im(Fψ (z)) = Im(z)‖RA (z)ψ‖²;                           (3.45)
that is, it is a Herglotz function. So by our above remarks, there is a
corresponding measure µψ (λ) given by the Stieltjes inversion formula. It is
called the spectral measure corresponding to ψ.
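    In finite dimensions this correspondence can be made completely explicit: for a
Hermitian matrix the spectral measure µψ consists of point masses |⟨ϕj , ψ⟩|² at the
eigenvalues λj , and its Borel transform coincides with the quadratic form of the
resolvent. A minimal numerical sketch (assuming NumPy):

    import numpy as np

    rng = np.random.default_rng(0)
    B = rng.standard_normal((4, 4))
    A = (B + B.T) / 2                      # a Hermitian (real symmetric) matrix
    psi = rng.standard_normal(4)

    lam, phi = np.linalg.eigh(A)           # eigenvalues and orthonormal eigenvectors
    weights = np.abs(phi.T @ psi) ** 2     # point masses of the spectral measure mu_psi

    def F(z):
        # Borel transform of mu_psi, i.e. <psi, R_A(z) psi>
        return psi @ np.linalg.solve(A - z * np.eye(4), psi)

    z = 0.3 + 0.7j
    print(F(z))
    print(np.sum(weights / (lam - z)))     # agrees with F(z)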
    More generally, by polarization, for each ϕ, ψ ∈ H we can find a corre-
sponding complex measure µϕ,ψ such that
                  ⟨ϕ, RA (z)ψ⟩ = ∫_R 1/(λ − z) dµϕ,ψ (λ).                   (3.46)
The measure µϕ,ψ is conjugate linear in ϕ and linear in ψ. Moreover, a
comparison with our previous considerations begs us to define a family of
operators via the sesquilinear forms
                     sΩ (ϕ, ψ) = ∫_R χΩ (λ)dµϕ,ψ (λ).                        (3.47)
Since the associated quadratic form is nonnegative, qΩ (ψ) = sΩ (ψ, ψ) =
µψ (Ω) ≥ 0, the Cauchy–Schwarz inequality for sesquilinear forms (Prob-
lem 0.16) implies |sΩ (ϕ, ψ)| ≤ qΩ (ϕ)1/2 qΩ (ψ)1/2 = µϕ (Ω)1/2 µψ (Ω)1/2 ≤
µϕ (R)1/2 µψ (R)1/2 ≤ ‖ϕ‖ ‖ψ‖. Hence Corollary 1.9 implies that there is in-
deed a family of nonnegative (0 ≤ ⟨ψ, PA (Ω)ψ⟩ ≤ 1) and hence self-adjoint
operators PA (Ω) such that
                   ⟨ϕ, PA (Ω)ψ⟩ = ∫_R χΩ (λ)dµϕ,ψ (λ).                       (3.48)

Lemma 3.6. The family of operators PA (Ω) forms a projection-valued mea-
sure.

Proof. We first show PA (Ω1 )PA (Ω2 ) = PA (Ω1 ∩ Ω2 ) in two steps. First
observe (using the first resolvent formula (2.81))
   ∫_R 1/(λ − z̃) dµRA (z∗ )ϕ,ψ (λ) = ⟨RA (z∗ )ϕ, RA (z̃)ψ⟩ = ⟨ϕ, RA (z)RA (z̃)ψ⟩
        = 1/(z − z̃) (⟨ϕ, RA (z)ψ⟩ − ⟨ϕ, RA (z̃)ψ⟩)
        = 1/(z − z̃) ∫_R (1/(λ − z) − 1/(λ − z̃)) dµϕ,ψ (λ) = ∫_R 1/(λ − z̃) · dµϕ,ψ (λ)/(λ − z)
implying dµRA (z∗ )ϕ,ψ (λ) = (λ − z)⁻¹ dµϕ,ψ (λ) by Problem 3.21. Secondly we
compute
   ∫_R 1/(λ − z) dµϕ,PA (Ω)ψ (λ) = ⟨ϕ, RA (z)PA (Ω)ψ⟩ = ⟨RA (z∗ )ϕ, PA (Ω)ψ⟩
        = ∫_R χΩ (λ)dµRA (z∗ )ϕ,ψ (λ) = ∫_R 1/(λ − z) χΩ (λ)dµϕ,ψ (λ)

implying dµϕ,PA (Ω)ψ (λ) = χΩ (λ)dµϕ,ψ (λ). Equivalently we have

                      ϕ, PA (Ω1 )PA (Ω2 )ψ = ϕ, PA (Ω1 ∩ Ω2 )ψ

since χΩ1 χΩ2 = χΩ1 ∩Ω2 . In particular, choosing Ω1 = Ω2 , we see that
PA (Ω1 ) is a projector.
   To see PA (R) = I, let ψ ∈ Ker(PA (R)). Then 0 = ψ, PA (R)ψ = µψ (R)
implies ψ, RA (z)ψ = 0 which implies ψ = 0.
     Now let Ω = ∪_{n=1}^∞ Ωn with Ωn ∩ Ωm = ∅ for n ≠ m. Then
        Σ_{j=1}^n ⟨ψ, PA (Ωj )ψ⟩ = Σ_{j=1}^n µψ (Ωj ) → ⟨ψ, PA (Ω)ψ⟩ = µψ (Ω)
by σ-additivity of µψ . Hence PA is weakly σ-additive which implies strong
σ-additivity, as pointed out earlier.

     Now we can prove the spectral theorem for self-adjoint operators.

Theorem 3.7 (Spectral theorem). To every self-adjoint operator A there
corresponds a unique projection-valued measure PA such that
                              A = ∫_R λ dPA (λ).                            (3.49)

Proof. Existence has already been established. Moreover, Lemma 3.5 shows
that PA ((λ − z)⁻¹ ) = RA (z), z ∈ C\R. Since the measures µϕ,ψ are uniquely
determined by the resolvent and the projection-valued measure is uniquely
determined by the measures µϕ,ψ , we are done.

    The quadratic form of A is given by
                          qA (ψ) = ∫_R λ dµψ (λ)                            (3.50)
and can be defined for every ψ in the form domain
           Q(A) = D(|A|^{1/2} ) = {ψ ∈ H | ∫_R |λ|dµψ (λ) < ∞}              (3.51)
(which is larger than the domain D(A) = {ψ ∈ H | ∫_R λ² dµψ (λ) < ∞}). This
extends our previous definition for nonnegative operators.
    Note that if A and Ã are unitarily equivalent as in (3.30), then U RA (z) =
RÃ (z)U and hence
                              dµψ = dµ̃U ψ .                                (3.52)
In particular, we have U PA (f) = PÃ (f)U , U D(PA (f)) = D(PÃ (f)).
    Finally, let us give a characterization of the spectrum of A in terms of
the associated projectors.

Theorem 3.8. The spectrum of A is given by
        σ(A) = {λ ∈ R | PA ((λ − ε, λ + ε)) ≠ 0 for all ε > 0}.             (3.53)
Proof. Let Ωn = (λ0 − 1/n, λ0 + 1/n). Suppose PA (Ωn ) ≠ 0. Then we can find
a ψn ∈ PA (Ωn )H with ‖ψn ‖ = 1. Since
        ‖(A − λ0 )ψn ‖² = ‖(A − λ0 )PA (Ωn )ψn ‖²
                        = ∫_R (λ − λ0 )² χΩn (λ)dµψn (λ) ≤ 1/n²,
we conclude λ0 ∈ σ(A) by Lemma 2.16.
    Conversely, if PA ((λ0 − ε, λ0 + ε)) = 0, set
                fε (λ) = χR\(λ0 −ε,λ0 +ε) (λ)(λ − λ0 )⁻¹.
Then
    (A − λ0 )PA (fε ) = PA ((λ − λ0 )fε (λ)) = PA (R\(λ0 − ε, λ0 + ε)) = I.
Similarly PA (fε )(A − λ0 ) = I|D(A) and hence λ0 ∈ ρ(A).

     In particular, PA ((λ1 , λ2 )) = 0 if and only if (λ1 , λ2 ) ⊆ ρ(A).

Corollary 3.9. We have
                 PA (σ(A)) = I            and        PA (R ∩ ρ(A)) = 0.               (3.54)

Proof. For every λ ∈ R ∩ ρ(A) there is some open interval Iλ with PA (Iλ ) =
0. These intervals form an open cover for R ∩ ρ(A) and there is a countable
subcover Jn . Setting Ωn = Jn \ ∪_{m<n} Jm , we have disjoint Borel sets which
cover R ∩ ρ(A) and satisfy PA (Ωn ) = 0. Finally, strong σ-additivity shows
PA (R ∩ ρ(A))ψ = Σn PA (Ωn )ψ = 0.

     Consequently,
             PA (f) = PA (σ(A))PA (f) = PA (χσ(A) f).                       (3.55)
In other words, PA (f) is not affected by the values of f on R\σ(A)!
    It is clearly more intuitive to write PA (f) = f(A) and we will do so from
now on. This notation is justified by the elementary observation
                PA (Σ_{j=0}^n αj λ^j ) = Σ_{j=0}^n αj A^j .                 (3.56)

Moreover, this also shows that if A is bounded and f (A) can be defined via
a convergent power series, then this agrees with our present definition by
Theorem 3.1.
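    For a bounded A this consistency is easy to check numerically, say for f(λ) = e^λ:
the operator f(A) obtained from the eigendecomposition coincides with the power
series. A minimal sketch (assuming NumPy):

    import numpy as np

    B = np.random.default_rng(1).standard_normal((3, 3))
    A = (B + B.T) / 2                                    # Hermitian

    lam, phi = np.linalg.eigh(A)
    expA_spectral = phi @ np.diag(np.exp(lam)) @ phi.T   # f(A) via the spectral theorem

    expA_series = np.zeros_like(A)
    term = np.eye(3)
    for n in range(1, 30):                               # partial sums of sum_k A^k / k!
        expA_series += term
        term = term @ A / n

    print(np.max(np.abs(expA_spectral - expA_series)))   # of the order of machine precision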

Problem 3.1. Show that a self-adjoint operator P is a projection if and
only if σ(P ) ⊆ {0, 1}.

Problem 3.2. Consider the parity operator Π : L2 (Rn ) → L2 (Rn ),
ψ(x) → ψ(−x). Show that Π is self-adjoint. Compute its spectrum σ(Π)
and the corresponding projection-valued measure PΠ .

Problem 3.3. Show that (3.7) is a projection-valued measure. What is the
corresponding operator?

Problem 3.4. Show that P (λ) defined in (3.14) satisfies properties (i)–(iv)
stated there.

Problem 3.5. Show that for a self-adjoint operator A we have ‖RA (z)‖ =
dist(z, σ(A))⁻¹.

Problem 3.6. Suppose A is self-adjoint and ‖B − z0 ‖ ≤ r. Show that
σ(A + B) ⊆ σ(A) + Br (z0 ), where Br (z0 ) is the ball of radius r around z0 .
(Hint: Problem 2.17.)

Problem 3.7. Show that for a self-adjoint operator A we have ‖ARA (z)‖ ≤
|z|/Im(z). Find some A for which equality is attained.
    Conclude that for every ψ ∈ H we have
                      lim_{z→∞} ‖ARA (z)ψ‖ = 0,                             (3.57)
where the limit is taken in any sector ε|Re(z)| ≤ |Im(z)|, ε > 0.

Problem 3.8. Suppose A is self-adjoint. Show that, if ψ ∈ D(A^n ), then
   RA (z)ψ = − Σ_{j=0}^n A^j ψ/z^{j+1} + O(‖A^n ψ‖/(|z|^n Im(z))),   as z → ∞.   (3.58)
(Hint: Proceed as in (2.87) and use the previous problem.)

Problem 3.9. Let λ0 be an eigenvalue and ψ a corresponding normalized
eigenvector. Compute µψ .

Problem 3.10. Show that λ0 is an eigenvalue if and only if P({λ0 }) ≠ 0.
Show that Ran(P({λ0 })) is the corresponding eigenspace in this case.

Problem 3.11 (Polar decomposition). Let A be a closed operator and
set |A| = √(A∗ A) (recall that, by Problem 2.12, A∗ A is self-adjoint and
Q(A∗ A) = D(A)). Show that
                           ‖|A|ψ‖ = ‖Aψ‖.
Conclude that Ker(A) = Ker(|A|) = Ran(|A|)⊥ and that
            U : ϕ = |A|ψ → Aψ    if ϕ ∈ Ran(|A|),
                ϕ → 0            if ϕ ∈ Ker(|A|)
extends to a well-defined partial isometry; that is, U : Ker(U )⊥ → Ran(U )
is unitary, where Ker(U ) = Ker(A) and Ran(U ) = Ker(A∗ )⊥ .
    In particular, we have the polar decomposition
                              A = U |A|.

Problem 3.12. Compute |A| = √(A∗ A) for the rank one operator A =
⟨ϕ, ·⟩ψ. Compute √(AA∗ ) also.

3.2. More on Borel measures
Section 3.1 showed that in order to understand self-adjoint operators, one
needs to understand multiplication operators on L2 (R, dµ), where dµ is a
finite Borel measure. This is the purpose of the present section.
    The set of all growth points, that is,
           σ(µ) = {λ ∈ R | µ((λ − ε, λ + ε)) > 0 for all ε > 0},            (3.59)
is called the spectrum of µ. The same proof as for Corollary 3.9 shows that
the spectrum σ = σ(µ) is a support for µ; that is, µ(R\σ) = 0.
    In the previous section we have already seen that the Borel transform of
µ,
                      F(z) = ∫_R 1/(λ − z) dµ(λ),                           (3.60)
plays an important role.

Theorem 3.10. The Borel transform of a finite Borel measure is a Herglotz
function. It is holomorphic in C\σ(µ) and satisfies
        F(z∗ ) = F(z)∗ ,    |F(z)| ≤ µ(R)/Im(z),    z ∈ C+ .                (3.61)

Proof. First of all note
      Im(F(z)) = ∫_R Im(1/(λ − z)) dµ(λ) = Im(z) ∫_R dµ(λ)/|λ − z|²,
which shows that F maps C+ to C+ . Moreover, F(z∗ ) = F(z)∗ is obvious
and
        |F(z)| ≤ ∫_R dµ(λ)/|λ − z| ≤ (1/Im(z)) ∫_R dµ(λ)
establishes the bound. Moreover, since µ(R\σ) = 0, we have
                      F(z) = ∫_σ 1/(λ − z) dµ(λ),
which together with the bound
                      1/|λ − z| ≤ 1/dist(z, σ)
allows the application of the dominated convergence theorem to conclude
that F is continuous on C\σ. To show that F is holomorphic in C\σ,
by Morera's theorem, it suffices to check ∫_Γ F(z)dz = 0 for every triangle
Γ ⊂ C\σ. Since (λ − z)⁻¹ is bounded for (λ, z) ∈ σ × Γ, this follows from
∫_Γ (λ − z)⁻¹ dz = 0 by using Fubini: ∫_Γ F(z)dz = ∫_Γ ∫_R (λ − z)⁻¹ dµ(λ) dz =
∫_R ∫_Γ (λ − z)⁻¹ dz dµ(λ) = 0.

    Note that F cannot be holomorphically extended to a larger domain. In
fact, if F is holomorphic in a neighborhood of some λ ∈ R, then F(λ) =
F(λ∗ ) = F(λ)∗ implies Im(F(λ)) = 0 and the Stieltjes inversion formula
(Theorem 3.21) shows that λ ∈ R\σ(µ).
      Associated with this measure is the operator
     Af (λ) = λf (λ),      D(A) = {f ∈ L2 (R, dµ)|λf (λ) ∈ L2 (R, dµ)}.                (3.62)
By Theorem 3.8 the spectrum of A is precisely the spectrum of µ; that is,
                                     σ(A) = σ(µ).                                      (3.63)


Note that 1 ∈ L2 (R, dµ) is a cyclic vector for A and that
                            dµg,f (λ) = g(λ)∗ f (λ)dµ(λ).                 (3.64)

    Now what can we say about the function f (A) (which is precisely the
multiplication operator by f ) of A? We are only interested in the case where
f is real-valued. Introduce the measure
                        (f µ)(Ω) = µ(f⁻¹(Ω)).                               (3.65)
Then
               ∫_R g(λ)d(f µ)(λ) = ∫_R g(f(λ))dµ(λ).                        (3.66)
In fact, it suffices to check this formula for simple functions g, which follows
since χΩ ◦ f = χf −1 (Ω) . In particular, we have
                                Pf (A) (Ω) = χf −1 (Ω) .                  (3.67)

    It is tempting to conjecture that f (A) is unitarily equivalent to multi-
plication by λ in L2 (R, d(f µ)) via the map
                   L2 (R, d(f µ)) → L2 (R, dµ),            g → g ◦ f.     (3.68)
However, this map is only unitary if its range is L2 (R, dµ).
Lemma 3.11. Suppose f is injective. Then
                 U : L2 (R, dµ) → L2 (R, d(f µ)),          g → g ◦ f −1   (3.69)
is a unitary map such that U f (λ) = λ.

Example. Let f (λ) = λ2 . Then (g ◦ f )(λ) = g(λ2 ) and the range of the
above map is given by the symmetric functions. Note that we can still
get a unitary map L2 (R, d(f µ)) ⊕ L2 (R, d(f µ)) → L2 (R, dµ), (g1 , g2 ) →
g1 (λ2 ) + g2 (λ2 )(χ(0,∞) (λ) − χ(0,∞) (−λ)).
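    For a measure with finite support, the push-forward (3.65) and the change of
variables formula (3.66) can be checked directly; masses of points which f maps to
the same value add up. A minimal sketch (assuming NumPy):

    import numpy as np

    # mu: point masses w_j at the points x_j; A is multiplication by lambda in L^2(R, dmu)
    x = np.array([-1.0, 0.5, 2.0])
    w = np.array([0.2, 0.3, 0.5])

    f = lambda t: t ** 2

    fx = f(x)
    support_fmu = np.unique(fx)                       # sigma(f(A)) = sigma(f mu)
    masses_fmu = np.array([w[fx == s].sum() for s in support_fmu])

    print(support_fmu, masses_fmu)
    # (3.66) with g(lambda) = lambda:
    print(np.isclose(np.sum(support_fmu * masses_fmu), np.sum(f(x) * w)))   # True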

Lemma 3.12. Let f be real-valued. The spectrum of f(A) is given by
                          σ(f(A)) = σ(f µ).                                 (3.70)
In particular,
                      σ(f(A)) ⊆ \overline{f(σ(A))},                         (3.71)
where equality holds if f is continuous and the closure can be dropped if, in
addition, σ(A) is bounded (i.e., compact).

Proof. The first formula follows by comparing
      σ(f µ) = {λ ∈ R | µ(f⁻¹((λ − ε, λ + ε))) > 0 for all ε > 0}
with (2.74).


   If f is continuous, f −1 ((f (λ) − ε, f (λ) + ε)) contains an open interval
around λ and hence f (λ) ∈ σ(f (A)) if λ ∈ σ(A). If, in addition, σ(A) is
compact, then f (σ(A)) is compact and hence closed.

   Whether two operators with simple spectrum are unitarily equivalent
can be read off from the corresponding measures:
Lemma 3.13. Let A1 , A2 be self-adjoint operators with simple spectrum and
corresponding spectral measures µ1 and µ2 of cyclic vectors. Then A1 and
A2 are unitarily equivalent if and only if µ1 and µ2 are mutually absolutely
continuous.

Proof. Without restriction we can assume that Aj is multiplication by λ
in L2 (R, dµj ). Let U : L2 (R, dµ1 ) → L2 (R, dµ2 ) be a unitary map such that
U A1 = A2 U . Then we also have U f (A1 ) = f (A2 )U for any bounded Borel
function and hence
                          U f (λ) = U f (λ) · 1 = f (λ)U (1)(λ)
and thus U is multiplication by u(λ) = U (1)(λ). Moreover, since U is
unitary, we have
        µ1 (Ω) = ∫_R |χΩ |² dµ1 = ∫_R |u χΩ |² dµ2 = ∫_Ω |u|² dµ2 ;
that is, dµ1 = |u|² dµ2 . Reversing the roles of A1 and A2 , we obtain dµ2 =
|v|² dµ1 , where v = U⁻¹ 1.
       The converse is left as an exercise (Problem 3.17).

    Next we recall the unique decomposition of µ with respect to Lebesgue
measure,
                               dµ = dµac + dµs ,                       (3.72)
where µac is absolutely continuous with respect to Lebesgue measure (i.e.,
we have µac (B) = 0 for all B with Lebesgue measure zero) and µs is singular
with respect to Lebesgue measure (i.e., µs is supported, µs (R\B) = 0, on
a set B with Lebesgue measure zero). The singular part µs can be further
decomposed into a (singularly) continuous and a pure point part,
                                    dµs = dµsc + dµpp ,                              (3.73)
where µsc is continuous on R and µpp is a step function. Since the measures
dµac , dµsc , and dµpp are mutually singular, they have mutually disjoint
supports Mac , Msc , and Mpp . Note that these sets are not unique. We will
choose them such that Mpp is the set of all jumps of µ(λ) and such that Msc
has Lebesgue measure zero.
       To the sets Mac , Msc , and Mpp correspond projectors P ac = χMac (A),
P sc   = χMsc (A), and P pp = χMpp (A) satisfying P ac + P sc + P pp = I. In


other words, we have a corresponding direct sum decomposition of both our
Hilbert space
        L2 (R, dµ) = L2 (R, dµac ) ⊕ L2 (R, dµsc ) ⊕ L2 (R, dµpp )      (3.74)
and our operator
                       A = (AP ac ) ⊕ (AP sc ) ⊕ (AP pp ).              (3.75)
The corresponding spectra, σac (A) = σ(µac ), σsc (A) = σ(µsc ), and σpp (A) =
σ(µpp ) are called the absolutely continuous, singularly continuous, and pure
point spectrum of A, respectively.
    It is important to observe that σpp (A) is in general not equal to the set
of eigenvalues
                σp (A) = {λ ∈ R | λ is an eigenvalue of A}                  (3.76)
since we only have σpp (A) = \overline{σp (A)}, the closure of the set of eigenvalues.
Example. Let H = ℓ2 (N) and let A be given by Aδn = (1/n)δn , where δn is
the sequence which is 1 at the n'th place and zero otherwise (that is, A is
a diagonal matrix with diagonal elements 1/n). Then σp (A) = {1/n | n ∈ N}
but σ(A) = σpp (A) = σp (A) ∪ {0}. To see this, just observe that δn is the
eigenvector corresponding to the eigenvalue 1/n and for z ∉ σ(A) we have
RA (z)δn = (n/(1 − nz))δn . At z = 0 this formula still gives the inverse of A,
but it is unbounded and hence 0 ∈ σ(A) but 0 ∉ σp (A). Since a continuous
measure cannot live on a single point and hence also not on a countable set,
we have σac (A) = σsc (A) = ∅.
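    Finite truncations of this operator already show the accumulation of eigenvalues
at 0 which puts 0 into σ(A) without making it an eigenvalue. A minimal sketch
(assuming NumPy):

    import numpy as np

    for N in (10, 100, 1000):
        A = np.diag(1.0 / np.arange(1, N + 1))       # truncation of A delta_n = (1/n) delta_n
        ev = np.linalg.eigvalsh(A)
        # the smallest eigenvalue tends to 0, while the norm of the inverse grows like N,
        # reflecting that the inverse of A is unbounded on l^2(N)
        print(N, ev.min(), np.linalg.norm(np.linalg.inv(A), 2))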

Example. An example with purely absolutely continuous spectrum is given
by taking µ to be the Lebesgue measure. An example with purely singularly
continuous spectrum is given by taking µ to be the Cantor measure.

    Finally, we show how the spectrum can be read off from the boundary
values of Im(F) towards the real line. We define the following sets:
            Mac = {λ | 0 < lim sup_{ε↓0} Im(F(λ + iε)) < ∞},
             Ms = {λ | lim sup_{ε↓0} Im(F(λ + iε)) = ∞},                    (3.77)
             M = Mac ∪ Ms = {λ | 0 < lim sup_{ε↓0} Im(F(λ + iε))}.
Then, by Theorem 3.23 we conclude that these sets are minimal supports for
µac , µs , and µ, respectively. In fact, by Theorem 3.23 we could even restrict
ourselves to values of λ, where the lim sup is a lim (finite or infinite).
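    This dichotomy is easy to observe for explicit Borel transforms: for Lebesgue
measure on [0, 1] one has F(z) = log((1 − z)/(−z)), so Im(F(0.5 + iε)) stays bounded
(in fact tends to π, consistent with (3.92) below), whereas for a unit point mass at
0.5 one has F(z) = 1/(0.5 − z) and Im(F(0.5 + iε)) = 1/ε blows up. A minimal sketch
(assuming NumPy):

    import numpy as np

    F_ac = lambda z: np.log((1 - z) / (-z))   # Borel transform of Lebesgue measure on [0, 1]
    F_pp = lambda z: 1.0 / (0.5 - z)          # Borel transform of a unit point mass at 0.5

    for eps in (1e-1, 1e-3, 1e-5):
        z = 0.5 + 1j * eps
        print(eps, F_ac(z).imag, F_pp(z).imag)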
Lemma 3.14. The spectrum of µ is given by
     σ(µ) = \overline{M},    M = {λ | 0 < lim inf_{ε↓0} Im(F(λ + iε))}.     (3.78)


Proof. First observe that F is real holomorphic near λ ∉ σ(µ) and hence
Im(F(λ)) = 0 in this case. Thus M ⊆ σ(µ) and since σ(µ) is closed, we
even have \overline{M} ⊆ σ(µ). To see the converse, note that by Theorem 3.23, the
set M is a support for µ. Thus, if λ ∈ σ(µ), then
              0 < µ((λ − ε, λ + ε)) = µ((λ − ε, λ + ε) ∩ M )
for all ε > 0 and we can find a sequence λn ∈ (λ−1/n, λ+1/n)∩M converging
to λ from inside M . This shows the remaining part σ(µ) ⊆ \overline{M}.

    To recover σ(µac ) from Mac , we need the essential closure of a Borel
set N ⊆ R,
      N^ess = {λ ∈ R | |(λ − ε, λ + ε) ∩ N | > 0 for all ε > 0}.            (3.79)
Note that N^ess is closed, whereas, in contradistinction to the ordinary clo-
sure, we might have N ⊄ N^ess (e.g., any isolated point of N will disappear).

Lemma 3.15. The absolutely continuous spectrum of µ is given by
                          σ(µac ) = Mac^ess .                               (3.80)

Proof. We use that 0 < µac ((λ − ε, λ + ε)) = µac ((λ − ε, λ + ε) ∩ Mac )
is equivalent to |(λ − ε, λ + ε) ∩ Mac | > 0. One direction follows from the
definition of absolute continuity and the other from minimality of Mac .
Problem 3.13. Construct a multiplication operator on L2 (R) which has
dense point spectrum.
Problem 3.14. Let λ be Lebesgue measure on R. Show that if f ∈ AC(R)
with f ′ > 0, then
                    d(f λ) = 1/(f ′ (f⁻¹(λ))) dλ.
Problem 3.15. Let dµ(λ) = χ[0,1] (λ)dλ and f (λ) = χ(−∞,t] (λ), t ∈ R.
Compute f µ.
Problem 3.16. Let A be the multiplication operator by the Cantor function
in L2 (0, 1). Compute the spectrum of A. Determine the spectral types.
Problem 3.17. Show the missing direction in the proof of Lemma 3.13.
Problem 3.18. Show N^ess ⊆ \overline{N} (the closure of N ).

3.3. Spectral types
Our next aim is to transfer the results of the previous section to arbitrary
self-adjoint operators A using Lemma 3.4. To this end, we will need a
spectral measure which contains the information from all measures in a
spectral basis. This will be the case if there is a vector ψ such that for every


ϕ ∈ H its spectral measure µϕ is absolutely continuous with respect to µψ .
Such a vector will be called a maximal spectral vector of A and µψ will
be called a maximal spectral measure of A.

Lemma 3.16. For every self-adjoint operator A there is a maximal spectral
vector.

Proof. Let {ψj }j∈J be a spectral basis and choose nonzero numbers εj with
Σ_{j∈J} |εj |² = 1. Then I claim that
                            ψ = Σ_{j∈J} εj ψj
is a maximal spectral vector. Let ϕ be given. Then we can write it as ϕ =
Σj fj (A)ψj and hence dµϕ = Σj |fj |² dµψj . But µψ (Ω) = Σj |εj |² µψj (Ω) =
0 implies µψj (Ω) = 0 for every j ∈ J and thus µϕ (Ω) = 0.

   A set {ψj } of spectral vectors is called ordered if ψk is a maximal
spectral vector for A restricted to (⊕_{j=1}^{k−1} Hψj )⊥ . As in the unordered case
one can show

Theorem 3.17. For every self-adjoint operator there is an ordered spectral
basis.

   Observe that if {ψj } is an ordered spectral basis, then µψj+1 is absolutely
continuous with respect to µψj .
    If µ is a maximal spectral measure, we have σ(A) = σ(µ) and the fol-
lowing generalization of Lemma 3.12 holds.

Theorem 3.18 (Spectral mapping). Let µ be a maximal spectral measure
and let f be real-valued. Then the spectrum of f(A) is given by
    σ(f(A)) = {λ ∈ R | µ(f⁻¹((λ − ε, λ + ε))) > 0 for all ε > 0}.           (3.81)
In particular,
                      σ(f(A)) ⊆ \overline{f(σ(A))},                         (3.82)
where equality holds if f is continuous and the closure can be dropped if, in
addition, σ(A) is bounded.

     Next, we want to introduce the splitting (3.74) for arbitrary self-adjoint
operators A. It is tempting to pick a spectral basis and treat each summand
in the direct sum separately. However, since it is not clear that this approach
is independent of the spectral basis chosen, we use the more sophisticated


definition
                  Hac = {ψ ∈ H|µψ is absolutely continuous},
                   Hsc = {ψ ∈ H|µψ is singularly continuous},
                  Hpp = {ψ ∈ H|µψ is pure point}.                                   (3.83)
Lemma 3.19. We have
                                      H = Hac ⊕ Hsc ⊕ Hpp .                         (3.84)
There are Borel sets Mxx such that the projector onto Hxx is given by P xx =
χMxx (A), xx ∈ {ac, sc, pp}. In particular, the subspaces Hxx reduce A. For
the sets Mxx one can choose the corresponding supports of some maximal
spectral measure µ.

Proof. We will use the unitary operator U of Lemma 3.4. Pick ϕ ∈ H and
write ϕ = Σn ϕn with ϕn ∈ Hψn . Let fn = U ϕn . Then, by construction
of the unitary operator U , ϕn = fn (A)ψn and hence dµϕn = |fn |² dµψn .
Moreover, since the subspaces Hψn are orthogonal, we have
                        dµϕ = Σn |fn |² dµψn
and hence
            dµϕ,xx = Σn |fn |² dµψn ,xx ,    xx ∈ {ac, sc, pp}.
This shows
            U Hxx = ⊕n L2 (R, dµψn ,xx ),    xx ∈ {ac, sc, pp}
and reduces our problem to the considerations of the previous section.
    Furthermore, note that if µ is a maximal spectral measure, then every
support for µxx is also a support for µϕ,xx for any ϕ ∈ H.

    The absolutely continuous, singularly continuous, and pure point
spectrum of A are defined as
   σac (A) = σ(A|Hac ),   σsc (A) = σ(A|Hsc ),   σpp (A) = σ(A|Hpp ),       (3.85)
respectively. If µ is a maximal spectral measure, we have σac (A) = σ(µac ),
σsc (A) = σ(µsc ), and σpp (A) = σ(µpp ).
    If A and Ã are unitarily equivalent via U , then so are A|Hxx and Ã|H̃xx
by (3.52). In particular, σxx (A) = σxx (Ã).

Problem 3.19. Compute σ(A), σac (A), σsc (A), and σpp (A) for the multi-
plication operator A = 1/(1 + x²) in L2 (R). What is its spectral multiplicity?


3.4. Appendix: The Herglotz theorem
Let C± = {z ∈ C | ±Im(z) > 0} be the upper, respectively, lower, half plane.
A holomorphic function F : C+ → C+ mapping the upper half plane to itself
is called a Herglotz function. We can define F on C− using F(z∗ ) = F(z)∗ .
    In Theorem 3.10 we have seen that the Borel transform of a finite mea-
sure is a Herglotz function satisfying a growth estimate. It turns out that
the converse is also true.
Theorem 3.20 (Herglotz representation). Suppose F is a Herglotz function
satisfying
                  |F(z)| ≤ M/Im(z),    z ∈ C+ .                             (3.86)
Then there is a Borel measure µ, satisfying µ(R) ≤ M , such that F is the
Borel transform of µ.

Proof. We abbreviate F(z) = v(z) + i w(z) and z = x + i y. Next we choose
a contour
      Γ = {x + iε + λ | λ ∈ [−R, R]} ∪ {x + iε + Re^{iϕ} | ϕ ∈ [0, π]}
and note that z lies inside Γ and z∗ + 2iε lies outside Γ if 0 < ε < y < R.
Hence we have by Cauchy's formula
      F(z) = (1/2πi) ∫_Γ (1/(ζ − z) − 1/(ζ − z∗ − 2iε)) F(ζ)dζ.
Inserting the explicit form of Γ, we see
      F(z) = (1/π) ∫_{−R}^{R} (y − ε)/(λ² + (y − ε)²) F(x + iε + λ)dλ
           + (i/π) ∫_0^π (y − ε)/(R²e^{2iϕ} + (y − ε)²) F(x + iε + Re^{iϕ})Re^{iϕ} dϕ.
The integral over the semi-circle vanishes as R → ∞ and hence we obtain
      F(z) = (1/π) ∫_R (y − ε)/((λ − x)² + (y − ε)²) F(λ + iε)dλ
and taking imaginary parts,
                      w(z) = ∫_R φε (λ)wε (λ)dλ,
where φε (λ) = (y − ε)/((λ − x)² + (y − ε)²) and wε (λ) = w(λ + iε)/π. Letting
y → ∞, we infer from our bound
                          ∫_R wε (λ)dλ ≤ M.
In particular, since |φε (λ) − φ0 (λ)| ≤ const ε, we have
                    w(z) = lim_{ε↓0} ∫_R φ0 (λ)dµε (λ),
where µε (λ) = ∫_{−∞}^λ wε (x)dx. Since µε (R) ≤ M , Lemma A.26 implies that
there is a subsequence which converges vaguely to some measure µ. Moreover,
by Lemma A.27 we even have
                        w(z) = ∫_R φ0 (λ)dµ(λ).
    Now F(z) and ∫_R (λ − z)⁻¹ dµ(λ) have the same imaginary part and thus
they only differ by a real constant. By our bound this constant must be
zero.

    Observe
                  Im(F(z)) = Im(z) ∫_R dµ(λ)/|λ − z|²                       (3.87)
and
                  lim_{λ→∞} λ Im(F(iλ)) = µ(R).                             (3.88)

Theorem 3.21. Let F be the Borel transform of some finite Borel measure
µ. Then the measure µ is unique and can be reconstructed via the Stieltjes
inversion formula
   (1/2)(µ((λ1 , λ2 )) + µ([λ1 , λ2 ])) = lim_{ε↓0} (1/π) ∫_{λ1}^{λ2} Im(F(λ + iε))dλ.   (3.89)

Proof. By Fubini we have
   (1/π) ∫_{λ1}^{λ2} Im(F(λ + iε))dλ = (1/π) ∫_{λ1}^{λ2} ∫_R ε/((x − λ)² + ε²) dµ(x)dλ
                                     = ∫_R ((1/π) ∫_{λ1}^{λ2} ε/((x − λ)² + ε²) dλ) dµ(x),
where
   (1/π) ∫_{λ1}^{λ2} ε/((x − λ)² + ε²) dλ = (1/π)(arctan((λ2 − x)/ε) − arctan((λ1 − x)/ε))
                                          → (1/2)(χ[λ1 ,λ2 ] (x) + χ(λ1 ,λ2 ) (x))
pointwise. Hence the result follows from the dominated convergence theorem
since 0 ≤ (1/π)(arctan((λ2 − x)/ε) − arctan((λ1 − x)/ε)) ≤ 1.


    Furthermore, the Radon–Nikodym derivative of µ can be obtained from
the boundary values of F .


Theorem 3.22. Let µ be a finite Borel measure and F its Borel transform.
Then
   (\underline{D}µ)(λ) ≤ lim inf_{ε↓0} (1/π) Im(F(λ + iε)) ≤ lim sup_{ε↓0} (1/π) Im(F(λ + iε)) ≤ (\overline{D}µ)(λ).   (3.90)

Proof. We need to estimate
      Im(F(λ + iε)) = ∫_R Kε (t)dµ(t),        Kε (t) = ε/(t² + ε²).
We first split the integral into two parts:
   Im(F(λ+iε)) = ∫_{Iδ} Kε (t−λ)dµ(t) + ∫_{R\Iδ} Kε (t−λ)dµ(t),    Iδ = (λ−δ, λ+δ).
Clearly the second part can be estimated by
                ∫_{R\Iδ} Kε (t − λ)dµ(t) ≤ Kε (δ)µ(R).
To estimate the first part, we integrate
                          Kε′ (s) ds dµ(t)
over the triangle {(s, t) | λ − s < t < λ + s, 0 < s < δ} = {(s, t) | λ − δ < t <
λ + δ, t − λ < s < δ} and obtain
      ∫_0^δ µ(Is )Kε′ (s)ds = ∫_{Iδ} (Kε (δ) − Kε (t − λ))dµ(t).
Now suppose there are constants c and C such that c ≤ µ(Is )/(2s) ≤ C, 0 ≤ s ≤ δ.
Then
      2c arctan(δ/ε) ≤ ∫_{Iδ} Kε (t − λ)dµ(t) ≤ 2C arctan(δ/ε)
since
      δKε (δ) + ∫_0^δ (−s)Kε′ (s)ds = arctan(δ/ε).
Thus the claim follows combining both estimates.

    As a consequence of Theorem A.37 and Theorem A.38 we obtain (cf.
also Lemma A.39)
Theorem 3.23. Let µ be a finite Borel measure and F its Borel transform.
Then the limit
                Im(F(λ)) = lim_{ε↓0} Im(F(λ + iε))                          (3.91)
exists a.e. with respect to both µ and Lebesgue measure (finite or infinite)
and
                    (Dµ)(λ) = (1/π) Im(F(λ))                                (3.92)
whenever (Dµ)(λ) exists.


    Moreover, the set {λ | Im(F(λ)) = ∞} is a support for the singularly
continuous part and {λ | 0 < Im(F(λ)) < ∞} is a minimal support for the
absolutely continuous part.

    In particular,

Corollary 3.24. The measure µ is purely absolutely continuous on I if
lim sup_{ε↓0} Im(F(λ + iε)) < ∞ for all λ ∈ I.

      The limit of the real part can be computed as well.
Corollary 3.25. The limit
                                      lim F (λ + iε)                               (3.93)
                                      ε↓0
exists a.e. with respect to both µ and Lebesgue measure. It is finite a.e. with
respect to Lebesgue measure.

Proof. If F(z) is a Herglotz function, then so is √F(z). Moreover, √F(z)
has values in the first quadrant; that is, both Re(√F(z)) and Im(√F(z)) are
positive for z ∈ C+ . Hence both √F(z) and i√F(z) are Herglotz functions
and by Theorem 3.23 both limε↓0 Re(√F(λ + iε)) and limε↓0 Im(√F(λ + iε))
exist and are finite a.e. with respect to Lebesgue measure. By taking squares,
the same is true for F(z) and hence limε↓0 F(λ + iε) exists and is finite a.e.
with respect to Lebesgue measure. Since limε↓0 Im(F(λ + iε)) = ∞ implies
limε↓0 F(λ + iε) = ∞, the result follows.
Problem 3.20. Find all rational Herglotz functions F : C → C satisfying
F(z∗ ) = F(z)∗ and lim_{|z|→∞} |zF(z)| = M < ∞. What can you say about
the zeros of F ?

Problem 3.21. A complex measure dµ is a measure which can be written
as a complex linear combination of positive measures dµj :
                dµ = dµ1 − dµ2 + i(dµ3 − dµ4 ).
Let
                      F(z) = ∫_R dµ(λ)/(λ − z)
be the Borel transform of a complex measure. Show that µ is uniquely de-
termined by F via the Stieltjes inversion formula
   (1/2)(µ((λ1 , λ2 )) + µ([λ1 , λ2 ])) = lim_{ε↓0} (1/2πi) ∫_{λ1}^{λ2} (F(λ + iε) − F(λ − iε))dλ.

Problem 3.22. Compute the Borel transform of the complex measure given
by dµ(λ) = dλ/(λ − i)².
Chapter 4




Applications of the
spectral theorem


This chapter can be mostly skipped on first reading. You might want to have a
look at the first section and then come back to the remaining ones later.
    Now let us show how the spectral theorem can be used. We will give a
few typical applications:
    First we will derive an operator-valued version of the Stieltjes inversion
formula. To do this, we need to show how to integrate a family of functions
of A with respect to a parameter. Moreover, we will show that these integrals
can be evaluated by computing the corresponding integrals of the complex-
valued functions.
    Secondly we will consider commuting operators and show how certain
facts, which are known to hold for the resolvent of an operator A, can be
established for a larger class of functions.
   Then we will show how the eigenvalues below the essential spectrum and
dimension of Ran PA (Ω) can be estimated using the quadratic form.
   Finally, we will investigate tensor products of operators.



4.1. Integral formulas
We begin with the first task by having a closer look at the projections PA (Ω).
They project onto subspaces corresponding to expectation values in the set
Ω. In particular, the number

                                 ψ, χΩ (A)ψ                              (4.1)



is the probability for a measurement of a to lie in Ω. In addition, we have

        ⟨ψ, Aψ⟩ = ∫_Ω λ dµψ (λ) ∈ hull(Ω),        ψ ∈ PA (Ω)H, ‖ψ‖ = 1,     (4.2)
where hull(Ω) is the convex hull of Ω.
    The space Ran χ{λ0 } (A) is called the eigenspace corresponding to λ0
since we have
   ⟨ϕ, Aψ⟩ = ∫_R λ χ{λ0 } (λ)dµϕ,ψ (λ) = λ0 ∫_R dµϕ,ψ (λ) = λ0 ⟨ϕ, ψ⟩       (4.3)
and hence Aψ = λ0 ψ for all ψ ∈ Ran χ{λ0 } (A). The dimension of the
eigenspace is called the multiplicity of the eigenvalue.
    Moreover, since
              lim_{ε↓0} (−iε)/(λ − λ0 − iε) = χ{λ0 } (λ),                   (4.4)
we infer from Theorem 3.1 that
              lim_{ε↓0} −iεRA (λ0 + iε)ψ = χ{λ0 } (A)ψ.                     (4.5)

Similarly, we can obtain an operator-valued version of the Stieltjes inversion
formula. But first we need to recall a few facts from integration in Banach
spaces.
    We will consider the case of mappings f : I → X where I = [t0 , t1 ] ⊂ R is
a compact interval and X is a Banach space. As before, a function f : I → X
is called simple if the image of f is finite, f(I) = {xi }_{i=1}^n , and if each inverse
image f⁻¹(xi ), 1 ≤ i ≤ n, is a Borel set. The set of simple functions S(I, X)
forms a linear space and can be equipped with the sup norm
                      ‖f ‖∞ = sup_{t∈I} ‖f(t)‖.                             (4.6)
The corresponding Banach space obtained after completion is called the set
of regulated functions R(I, X).
    Observe that C(I, X) ⊂ R(I, X). In fact, consider the simple function
fn = Σ_{i=0}^{n−1} f(si )χ[si ,si+1 ) , where si = t0 + i(t1 − t0 )/n. Since f ∈ C(I, X) is
uniformly continuous, we infer that fn converges uniformly to f .
    For f ∈ S(I, X) we can define a linear map ∫ : S(I, X) → X by
                ∫_I f(t)dt = Σ_{i=1}^n xi |f⁻¹(xi )|,                       (4.7)
where |Ω| denotes the Lebesgue measure of Ω. This map satisfies
                ‖∫_I f(t)dt‖ ≤ ‖f ‖∞ (t1 − t0 )                             (4.8)


and hence it can be extended uniquely to a linear map ∫ : R(I, X) → X
with the same norm (t1 − t0 ) by Theorem 0.26. We even have
                ‖∫_I f(t)dt‖ ≤ ∫_I ‖f(t)‖ dt,                               (4.9)
which clearly holds for f ∈ S(I, X) and thus for all f ∈ R(I, X) by conti-
nuity. In addition, if ℓ ∈ X ∗ is a continuous linear functional, then
            ℓ(∫_I f(t)dt) = ∫_I ℓ(f(t))dt,        f ∈ R(I, X).              (4.10)
In particular, if A(t) ∈ R(I, L(H)), then
                (∫_I A(t)dt)ψ = ∫_I (A(t)ψ)dt.                              (4.11)
    If I = R, we say that f : I → X is integrable if f ∈ R([−r, r], X) for all
r > 0 and if ‖f(t)‖ is integrable. In this case we can set
                ∫_R f(t)dt = lim_{r→∞} ∫_{[−r,r]} f(t)dt                    (4.12)
and (4.9) and (4.10) still hold.
    We will use the standard notation ∫_{t2}^{t3} f(s)ds = ∫_I χ(t2 ,t3 ) (s)f(s)ds and
∫_{t3}^{t2} f(s)ds = − ∫_{t2}^{t3} f(s)ds.
    We write f ∈ C 1 (I, X) if
            (d/dt) f(t) = lim_{ε→0} (f(t + ε) − f(t))/ε                     (4.13)
exists for all t ∈ I. In particular, if f ∈ C(I, X), then F(t) = ∫_{t0}^t f(s)ds ∈
C 1 (I, X) and dF/dt = f as can be seen from
   ‖F(t + ε) − F(t) − f(t)ε‖ = ‖∫_t^{t+ε} (f(s) − f(t))ds‖ ≤ |ε| sup_{s∈[t,t+ε]} ‖f(s) − f(t)‖.
                                                                            (4.14)
       The important facts for us are the following two results.

Lemma 4.1. Suppose f : I × R → C is a bounded Borel function and set
F(λ) = ∫_I f(t, λ)dt. Let A be self-adjoint. Then f(t, A) ∈ R(I, L(H)) and
   F(A) = ∫_I f(t, A)dt,    respectively,    F(A)ψ = ∫_I f(t, A)ψ dt.       (4.15)

Proof. That f (t, A) ∈ R(I, L(H)) follows from the spectral theorem, since
it is no restriction to assume that A is multiplication by λ in some L2 space.


We compute
      ⟨ϕ, (∫_I f(t, A)dt)ψ⟩ = ∫_I ⟨ϕ, f(t, A)ψ⟩ dt
                            = ∫_I ∫_R f(t, λ)dµϕ,ψ (λ) dt
                            = ∫_R ∫_I f(t, λ)dt dµϕ,ψ (λ)
                            = ∫_R F(λ)dµϕ,ψ (λ) = ⟨ϕ, F(A)ψ⟩
by Fubini's theorem and hence the first claim follows.

Lemma 4.2. Suppose f : R → L(H) is integrable and A ∈ L(H). Then
   A ∫_R f(t)dt = ∫_R Af(t)dt,    respectively,    (∫_R f(t)dt)A = ∫_R f(t)A dt.   (4.16)

Proof. It suffices to prove the case where f is simple and of compact sup-
port. But for such functions the claim is straightforward.

    Now we can prove an operator-valued version of the Stieltjes inversion
formula.

Theorem 4.3 (Stone's formula). Let A be self-adjoint. Then, as ε ↓ 0,
   (1/2πi) ∫_{λ1}^{λ2} (RA (λ + iε) − RA (λ − iε)) dλ → (1/2)(PA ([λ1 , λ2 ]) + PA ((λ1 , λ2 )))   (4.17)
strongly.

Proof. By
   (1/2πi) ∫_{λ1}^{λ2} (1/(x − λ − iε) − 1/(x − λ + iε)) dλ = (1/π) ∫_{λ1}^{λ2} ε/((x − λ)² + ε²) dλ
        = (1/π)(arctan((λ2 − x)/ε) − arctan((λ1 − x)/ε))
        → (1/2)(χ[λ1 ,λ2 ] (x) + χ(λ1 ,λ2 ) (x)),
the result follows combining the last part of Theorem 3.1 with Lemma 4.1.


    Note that by using the first resolvent formula, Stone's formula can also
be written in the form
   ⟨ψ, (1/2)(PA ([λ1 , λ2 ]) + PA ((λ1 , λ2 )))ψ⟩ = lim_{ε↓0} (1/π) ∫_{λ1}^{λ2} Im⟨ψ, RA (λ + iε)ψ⟩ dλ
                                                 = lim_{ε↓0} (ε/π) ∫_{λ1}^{λ2} ‖RA (λ + iε)ψ‖² dλ.
                                                                            (4.18)
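    For a Hermitian matrix, Stone's formula can be verified by quadrature: integrating
ε/π ‖RA (λ + iε)ψ‖² over an interval whose endpoints are not eigenvalues approximates
⟨ψ, PA ((λ1 , λ2 ))ψ⟩. A rough numerical sketch (assuming NumPy; the finite ε and the
Riemann sum are the sources of error):

    import numpy as np

    rng = np.random.default_rng(2)
    B = rng.standard_normal((5, 5))
    A = (B + B.T) / 2
    psi = rng.standard_normal(5)

    lam, phi = np.linalg.eigh(A)
    l1, l2 = -0.5, 1.5                                 # endpoints, assumed not to be eigenvalues
    exact = sum(abs(phi[:, j] @ psi) ** 2 for j in range(5) if l1 < lam[j] < l2)

    eps = 1e-3
    grid = np.linspace(l1, l2, 40001)
    vals = np.array([eps / np.pi * np.linalg.norm(
        np.linalg.solve(A - (x + 1j * eps) * np.eye(5), psi)) ** 2 for x in grid])
    print(exact, np.sum(vals) * (grid[1] - grid[0]))   # the two numbers should be close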
Problem 4.1. Let Γ be a differentiable Jordan curve in ρ(A). Show

                              χΩ (A) =        RA (z)dz,
                                          Γ
where Ω is the intersection of the interior of Γ with R.

4.2. Commuting operators
Now we come to commuting operators. As a preparation we can now prove
Lemma 4.4. Let K ⊆ R be closed and let C∞ (K) be the set of all continuous
functions on K which vanish at ∞ (if K is unbounded) with the sup norm.
The ∗-subalgebra generated by the function
                            λ → 1/(λ − z)                                   (4.19)
for one z ∈ C\K is dense in C∞ (K).

Proof. If K is compact, the claim follows directly from the complex Stone–
Weierstraß theorem since (λ1 − z)⁻¹ = (λ2 − z)⁻¹ implies λ1 = λ2 . Otherwise,
replace K by K̃ = K ∪ {∞}, which is compact, and set (∞ − z)⁻¹ = 0. Then
we can again apply the complex Stone–Weierstraß theorem to conclude that
our ∗-subalgebra is equal to {f ∈ C(K̃) | f(∞) = 0}, which is equivalent to
C∞ (K).

   We say that two bounded operators A, B commute if
                             [A, B] = AB − BA = 0.                                (4.20)
If A or B is unbounded, we soon run into trouble with this definition since
the above expression might not even make sense for any nonzero vector (e.g.,
take B = ⟨ϕ, ·⟩ψ with ψ ∉ D(A)). To avoid this nuisance, we will replace A
by a bounded function of A. A good candidate is the resolvent. Hence if A
is self-adjoint and B is bounded, we will say that A and B commute if
                          [RA (z), B] = [RA (z ∗ ), B] = 0                        (4.21)
for one z ∈ ρ(A).


Lemma 4.5. Suppose A is self-adjoint and commutes with the bounded
operator B. Then
                         [f (A), B] = 0                      (4.22)
for any bounded Borel function f . If f is unbounded, the claim holds for
any ψ ∈ D(f (A)) in the sense that Bf (A) ⊆ f (A)B.

Proof. Equation (4.21) tells us that (4.22) holds for any f in the ∗-sub-
algebra generated by RA (z). Since this subalgebra is dense in C∞ (σ(A)),
the claim follows for all such f ∈ C∞ (σ(A)). Next fix ψ ∈ H and let f be
bounded. Choose a sequence fn ∈ C∞ (σ(A)) converging to f in L2 (R, dµψ ).
Then
          Bf (A)ψ = lim Bfn (A)ψ = lim fn (A)Bψ = f (A)Bψ.
                       n→∞                n→∞
If f is unbounded, let ψ ∈ D(f (A)) and choose fn as in (3.26). Then
                  f (A)Bψ = lim fn (A)Bψ = lim Bfn (A)ψ
                               n→∞              n→∞

shows f ∈ L2 (R, dµBψ ) (i.e., Bψ ∈ D(f(A))) and f(A)Bψ = Bf(A)ψ.

      In the special case where B is an orthogonal projection, we obtain
Corollary 4.6. Let A be self-adjoint and H1 a closed subspace with corre-
sponding projector P1 . Then H1 reduces A if and only if P1 and A commute.

      Furthermore, note
Corollary 4.7. If A is self-adjoint and bounded, then (4.21) holds if and
only if (4.20) holds.

Proof. Since σ(A) is compact, we have λ ∈ C∞ (σ(A)) and hence (4.20)
follows from (4.22) by our lemma. Conversely, since B commutes with any
polynomial of A, the claim follows from the Neumann series.

      As another consequence we obtain
Theorem 4.8. Suppose A is self-adjoint and has simple spectrum. A bounded
operator B commutes with A if and only if B = f (A) for some bounded Borel
function.

Proof. Let ψ be a cyclic vector for A. By our unitary equivalence it is no
restriction to assume H = L2 (R, dµψ ). Then
                       Bg(λ) = Bg(λ) · 1 = g(λ)(B1)(λ)
since B commutes with the multiplication operator g(λ). Hence B is multi-
plication by f (λ) = (B1)(λ).


    The assumption that the spectrum of A is simple is crucial as the exam-
ple A = I shows. Note that the functions exp(−itA) can also be used
instead of resolvents.
Lemma 4.9. Suppose A is self-adjoint and B is bounded. Then B commutes
with A if and only if
                             [e−iAt , B] = 0                     (4.23)
for all t ∈ R.
Proof. It suffices to show [f̂(A), B] = 0 for f ∈ S(R), since these functions
are dense in C∞(R) by the complex Stone–Weierstraß theorem. Here f̂
denotes the Fourier transform of f; see Section 7.1. But for such f we have

    [f̂(A), B] = (1/√(2π)) [∫_R f(t) e^{−iAt} dt, B] = (1/√(2π)) ∫_R f(t) [e^{−iAt}, B] dt = 0
by Lemma 4.2.

    The extension to the case where B is self-adjoint and unbounded is
straightforward. We say that A and B commute in this case if

    [R_A(z_1), R_B(z_2)] = [R_A(z_1*), R_B(z_2)] = 0                     (4.24)

for one z_1 ∈ ρ(A) and one z_2 ∈ ρ(B) (the claim for z_2* follows by taking
adjoints). From our above analysis it follows that this is equivalent to
                        [e−iAt , e−iBs ] = 0,   t, s ∈ R,              (4.25)
respectively,
                             [f (A), g(B)] = 0                         (4.26)
for arbitrary bounded Borel functions f and g.
Problem 4.2. Let A and B be self-adjoint. Show that A and B commute if
and only if the corresponding spectral projections PA (Ω) and PB (Ω) commute
for every Borel set Ω. In particular, Ran(PB (Ω)) reduces A and vice versa.
Problem 4.3. Let A and B be self-adjoint operators with pure point spec-
trum. Show that A and B commute if and only if they have a common
orthonormal basis of eigenfunctions.

4.3. The min-max theorem
In many applications a self-adjoint operator has a number of eigenvalues
below the bottom of the essential spectrum. The essential spectrum is ob-
tained from the spectrum by removing all discrete eigenvalues with finite
multiplicity (we will have a closer look at this in Section 6.2). In general
there is no way of computing the lowest eigenvalues and their corresponding
eigenfunctions explicitly. However, one often has some idea about how the
eigenfunctions might approximately look.


    So suppose we have a normalized function ψ1 which is an approximation
for the eigenfunction ϕ1 of the lowest eigenvalue E1. Then by Theorem 2.19
we know that

    ⟨ψ1, Aψ1⟩ ≥ ⟨ϕ1, Aϕ1⟩ = E1.                                          (4.27)

If we add some free parameters to ψ1, we can optimize them and obtain
quite good upper bounds for the first eigenvalue.
     But is there also something one can say about the next eigenvalues?
Suppose we know the first eigenfunction ϕ1 . Then we can restrict A to
the orthogonal complement of ϕ1 and proceed as before: E2 will be the
infimum over all expectations restricted to this subspace. If we restrict to
the orthogonal complement of an approximating eigenfunction ψ1 , there will
still be a component in the direction of ϕ1 left and hence the infimum of the
expectations will be lower than E2 . Thus the optimal choice ψ1 = ϕ1 will
give the maximal value E2 .
    More precisely, let {ϕ_j}_{j=1}^N be an orthonormal basis for the space spanned
by the eigenfunctions corresponding to eigenvalues below the essential spec-
trum. Here the essential spectrum σ_ess(A) is given by precisely those values
in the spectrum which are not isolated eigenvalues of finite multiplicity (see
Section 6.2). Assume they satisfy (A − E_j)ϕ_j = 0, where E_j ≤ E_{j+1} are
the eigenvalues (counted according to their multiplicity). If the number of
eigenvalues N is finite, we set E_j = inf σ_ess(A) for j > N and choose ϕ_j
orthonormal such that ‖(A − E_j)ϕ_j‖ ≤ ε.
    Define

    U(ψ_1, . . . , ψ_n) = {ψ ∈ D(A) | ‖ψ‖ = 1, ψ ∈ span{ψ_1, . . . , ψ_n}^⊥}.        (4.28)

    (i) We have

    inf_{ψ∈U(ψ_1,...,ψ_{n−1})} ⟨ψ, Aψ⟩ ≤ E_n + O(ε).                     (4.29)

In fact, set ψ = Σ_{j=1}^n α_j ϕ_j and choose α_j such that ψ ∈ U(ψ_1, . . . , ψ_{n−1}).
Then

    ⟨ψ, Aψ⟩ = Σ_{j=1}^n |α_j|² E_j + O(ε) ≤ E_n + O(ε)                   (4.30)

and the claim follows.
    (ii) We have

    inf_{ψ∈U(ϕ_1,...,ϕ_{n−1})} ⟨ψ, Aψ⟩ ≥ E_n − O(ε).                     (4.31)

In fact, set ψ = ϕ_n.
      Since ε can be chosen arbitrarily small, we have proven the following.


Theorem 4.10 (Min-max). Let A be self-adjoint and let E_1 ≤ E_2 ≤ E_3 ≤ · · ·
be the eigenvalues of A below the essential spectrum, respectively, the in-
fimum of the essential spectrum, once there are no more eigenvalues left.
Then

    E_n = sup_{ψ_1,...,ψ_{n−1}} inf_{ψ∈U(ψ_1,...,ψ_{n−1})} ⟨ψ, Aψ⟩.      (4.32)

    Clearly the same result holds if D(A) is replaced by the quadratic form
domain Q(A) in the definition of U . In addition, as long as En is an eigen-
value, the sup and inf are in fact max and min, explaining the name.

Corollary 4.11. Suppose A and B are self-adjoint operators with A ≥ B
(i.e., A − B ≥ 0). Then En (A) ≥ En (B).

Problem 4.4. Suppose A, A_n are bounded and A_n → A. Then E_k(A_n) →
E_k(A). (Hint: ‖A − A_n‖ ≤ ε is equivalent to A_n − ε ≤ A ≤ A_n + ε.)


4.4. Estimating eigenspaces
Next, we show that the dimension of the range of PA (Ω) can be estimated
if we have some functions which lie approximately in this space.

Theorem 4.12. Suppose A is a self-adjoint operator and ψ_j, 1 ≤ j ≤ k,
are linearly independent elements of a Hilbert space H.
    (i) Let λ ∈ R, ψ_j ∈ Q(A). If

        ⟨ψ, Aψ⟩ < λ‖ψ‖²                                                  (4.33)

    for any nonzero linear combination ψ = Σ_{j=1}^k c_j ψ_j, then

        dim Ran P_A((−∞, λ)) ≥ k.                                        (4.34)

    Similarly, ⟨ψ, Aψ⟩ > λ‖ψ‖² implies dim Ran P_A((λ, ∞)) ≥ k.

    (ii) Let λ_1 < λ_2, ψ_j ∈ D(A). If

        ‖(A − (λ_2 + λ_1)/2)ψ‖ < ((λ_2 − λ_1)/2)‖ψ‖                      (4.35)

    for any nonzero linear combination ψ = Σ_{j=1}^k c_j ψ_j, then

        dim Ran P_A((λ_1, λ_2)) ≥ k.                                     (4.36)

Proof. (i) Let M = span{ψ_j} ⊆ H. We claim dim P_A((−∞, λ))M =
dim M = k. For this it suffices to show Ker P_A((−∞, λ))|_M = {0}. Sup-
pose P_A((−∞, λ))ψ = 0 for some nonzero linear combination ψ. Then

    ⟨ψ, Aψ⟩ = ∫_R η dμ_ψ(η) = ∫_{[λ,∞)} η dμ_ψ(η)
            ≥ λ ∫_{[λ,∞)} dμ_ψ(η) = λ‖ψ‖².

This contradicts our assumption (4.33).
    (ii) This is just the previous case (i) applied to (A − (λ_2 + λ_1)/2)² with
λ = (λ_2 − λ_1)²/4.

      Another useful estimate is
Theorem 4.13 (Temple’s inequality). Let λ1  λ2 and ψ ∈ D(A) with
 ψ = 1 such that
                     λ = ψ, Aψ ∈ (λ1 , λ2 ).               (4.37)
If there is one isolated eigenvalue E between λ1 and λ2 , that is, σ(A) ∩
(λ1 , λ2 ) = E, then
                       (A − λ)ψ     2                 (A − λ)ψ      2
                  λ−                    ≤E ≤λ+                          .    (4.38)
                         λ2 − λ                         λ − λ1
Proof. First of all we can assume λ = 0 if we replace A by A − λ. To prove
the first inequality, observe that by assumption (E, λ_2) ⊂ ρ(A) and hence the
spectral theorem implies (A − λ_2)(A − E) ≥ 0. Thus ⟨ψ, (A − λ_2)(A − E)ψ⟩ =
‖Aψ‖² + λ_2 E ≥ 0 and the first inequality follows after dividing by λ_2 > 0.
Similarly, (A − λ_1)(A − E) ≥ 0 implies the second inequality.

    Note that the last inequality only provides additional information if
‖(A − λ)ψ‖² ≤ (λ_2 − λ)(λ − λ_1).
    A typical application is if E = E_0 is the lowest eigenvalue. In this case
any normalized trial function ψ will give the bound E_0 ≤ ⟨ψ, Aψ⟩. If, in
addition, we also have some estimate λ_2 ≤ E_1 for the second eigenvalue E_1,
then Temple's inequality can give a bound from below. For λ_1 we can choose
any value λ_1 < E_0; in fact, if we let λ_1 → −∞, we just recover the bound
we already know.
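A small numerical sketch of Temple's inequality (Python with NumPy; the matrix, the trial vector, and the choices of λ_1, λ_2 are illustrative assumptions only):

    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.standard_normal((6, 6))
    A = (X + X.T) / 2
    E, V = np.linalg.eigh(A)                           # exact eigenvalues E[0] < E[1] < ...

    psi = V[:, 0] + 0.05 * rng.standard_normal(6)      # trial vector near the ground state
    psi /= np.linalg.norm(psi)

    lam = psi @ A @ psi                                # λ = ⟨ψ, Aψ⟩, an upper bound for E[0]
    res2 = np.linalg.norm((A - lam * np.eye(6)) @ psi) ** 2
    lam1 = E[0] - 1.0                                  # any λ1 < E0
    lam2 = 0.5 * (E[0] + E[1])                         # some λ2 ≤ E1 (here known exactly)

    lower = lam - res2 / (lam2 - lam)                  # lower bound from (4.38)
    upper = lam + res2 / (lam - lam1)                  # upper bound from (4.38)
    print(lower, E[0], upper)                          # lower ≤ E[0] ≤ upper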

4.5. Tensor products of operators
Recall the definition of the tensor product of Hilbert spaces from Section 1.4.
Suppose A_j, 1 ≤ j ≤ n, are (essentially) self-adjoint operators on H_j. For
every monomial λ_1^{n_1} · · · λ_n^{n_n} we can define

    (A_1^{n_1} ⊗ · · · ⊗ A_n^{n_n}) ψ_1 ⊗ · · · ⊗ ψ_n = (A_1^{n_1} ψ_1) ⊗ · · · ⊗ (A_n^{n_n} ψ_n),   ψ_j ∈ D(A_j^{n_j}),   (4.39)


and extend this definition by linearity to the span of all such functions
(check that this definition is well-defined by showing that the corresponding
operator on F(H1 , . . . , Hn ) vanishes on N (H1 , . . . , Hn )). Hence for every
polynomial P(λ_1, . . . , λ_n) of degree N we obtain an operator

    P(A_1, . . . , A_n)ψ_1 ⊗ · · · ⊗ ψ_n,   ψ_j ∈ D(A_j^N),              (4.40)

defined on the set

    D = span{ψ_1 ⊗ · · · ⊗ ψ_n | ψ_j ∈ D(A_j^N)}.                        (4.41)
Moreover, if P is real-valued, then the operator P (A1 , . . . , An ) on D is sym-
metric and we can consider its closure, which will again be denoted by
P (A1 , . . . , An ).
Theorem 4.14. Suppose Aj , 1 ≤ j ≤ n, are self-adjoint operators on Hj
and let P (λ1 , . . . , λn ) be a real-valued polynomial and define P (A1 , . . . , An )
as above.
    Then P(A_1, . . . , A_n) is self-adjoint and its spectrum is the closure of the
range of P on the product of the spectra of the A_j; that is,

    σ(P(A_1, . . . , A_n)) = \overline{P(σ(A_1), . . . , σ(A_n))}.        (4.42)

Proof. By the spectral theorem it is no restriction to assume that A_j is
multiplication by λ_j on L²(R, dμ_j) and P(A_1, . . . , A_n) is hence multiplication
by P(λ_1, . . . , λ_n) on L²(Rⁿ, dμ_1 × · · · × dμ_n). Since D contains the set of
all functions ψ_1(λ_1) · · · ψ_n(λ_n) for which ψ_j ∈ L²_c(R, dμ_j), it follows that the
domain of the closure of P contains L²_c(Rⁿ, dμ_1 × · · · × dμ_n). Hence P is
the maximally defined multiplication operator by P(λ_1, . . . , λ_n), which is
self-adjoint.
    Now let λ = P(λ_1, . . . , λ_n) with λ_j ∈ σ(A_j). Then there exist Weyl
sequences ψ_{j,k} ∈ D(A_j^N) with (A_j − λ_j)ψ_{j,k} → 0 as k → ∞. Consequently,
(P − λ)ψ_k → 0, where ψ_k = ψ_{1,k} ⊗ · · · ⊗ ψ_{n,k}, and hence λ ∈ σ(P). Conversely,
if λ ∉ \overline{P(σ(A_1), . . . , σ(A_n))}, then |P(λ_1, . . . , λ_n) − λ| ≥ ε for a.e. λ_j with
respect to μ_j and hence (P − λ)^{-1} exists and is bounded; that is, λ ∈
ρ(P).

    The two main cases of interest are A_1 ⊗ A_2, in which case

    σ(A_1 ⊗ A_2) = \overline{σ(A_1)σ(A_2)} = \overline{{λ_1 λ_2 | λ_j ∈ σ(A_j)}},              (4.43)

and A_1 ⊗ I + I ⊗ A_2, in which case

    σ(A_1 ⊗ I + I ⊗ A_2) = \overline{σ(A_1) + σ(A_2)} = \overline{{λ_1 + λ_2 | λ_j ∈ σ(A_j)}}.   (4.44)
Problem 4.5. Show that the closure can be omitted in (4.44) if at least one
operator is bounded and in (4.43) if both operators are bounded.
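In finite dimensions the tensor product is the Kronecker product, and (4.43), (4.44) can be checked directly; a short sketch (Python with NumPy; the matrices are illustrative):

    import numpy as np

    rng = np.random.default_rng(2)
    X, Y = rng.standard_normal((3, 3)), rng.standard_normal((4, 4))
    A1, A2 = (X + X.T) / 2, (Y + Y.T) / 2              # self-adjoint factors
    a, b = np.linalg.eigvalsh(A1), np.linalg.eigvalsh(A2)

    I1, I2 = np.eye(3), np.eye(4)
    prod = np.kron(A1, A2)                             # A1 ⊗ A2
    summ = np.kron(A1, I2) + np.kron(I1, A2)           # A1 ⊗ I + I ⊗ A2

    # Spectra are the products {a_i b_j} and the sums {a_i + b_j}, as in (4.43), (4.44).
    print(np.allclose(np.sort(np.linalg.eigvalsh(prod)), np.sort(np.outer(a, b).ravel())))
    print(np.allclose(np.sort(np.linalg.eigvalsh(summ)), np.sort(np.add.outer(a, b).ravel())))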
Chapter 5




Quantum dynamics


As in the finite dimensional case, the solution of the Schrödinger equation

    i (d/dt) ψ(t) = Hψ(t)                                                (5.1)
is given by
                           ψ(t) = exp(−itH)ψ(0).                         (5.2)
A detailed investigation of this formula will be our first task. Moreover, in
the finite dimensional case the dynamics is understood once the eigenvalues
are known and the same is true in our case once we know the spectrum. Note
that, like any Hamiltonian system from classical mechanics, our system is
not hyperbolic (i.e., the spectrum is not away from the real axis) and hence
simple results such as all solutions tend to the equilibrium position cannot
be expected.

5.1. The time evolution and Stone’s theorem
In this section we want to have a look at the initial value problem associated
with the Schrödinger equation (2.12) in the Hilbert space H. If H is one-
dimensional (and hence A is a real number), the solution is given by
                              ψ(t) = e−itA ψ(0).                         (5.3)
Our hope is that this formula also applies in the general case and that we
can reconstruct a one-parameter unitary group U (t) from its generator A
(compare (2.11)) via U (t) = exp(−itA). We first investigate the family of
operators exp(−itA).
Theorem 5.1. Let A be self-adjoint and let U (t) = exp(−itA).
      (i) U (t) is a strongly continuous one-parameter unitary group.



       (ii) The limit lim_{t→0} (1/t)(U(t)ψ − ψ) exists if and only if ψ ∈ D(A), in
            which case lim_{t→0} (1/t)(U(t)ψ − ψ) = −iAψ.
       (iii) U (t)D(A) = D(A) and AU (t) = U (t)A.

Proof. The group property (i) follows directly from Theorem 3.1 and the
corresponding statements for the function exp(−itλ). To prove strong con-
tinuity, observe that

    lim_{t→t_0} ‖e^{−itA}ψ − e^{−it_0 A}ψ‖² = lim_{t→t_0} ∫_R |e^{−itλ} − e^{−it_0 λ}|² dμ_ψ(λ)
                                            = ∫_R lim_{t→t_0} |e^{−itλ} − e^{−it_0 λ}|² dμ_ψ(λ) = 0

by the dominated convergence theorem.
    Similarly, if ψ ∈ D(A), we obtain

    lim_{t→0} ‖(1/t)(e^{−itA}ψ − ψ) + iAψ‖² = lim_{t→0} ∫_R |(1/t)(e^{−itλ} − 1) + iλ|² dμ_ψ(λ) = 0
since |e^{itλ} − 1| ≤ |tλ|. Now let Ã be the generator defined as in (2.11). Then
Ã is a symmetric extension of A since we have

    ⟨ϕ, Ãψ⟩ = lim_{t→0} ⟨ϕ, (i/t)(U(t) − 1)ψ⟩ = lim_{t→0} ⟨(i/(−t))(U(−t) − 1)ϕ, ψ⟩ = ⟨Ãϕ, ψ⟩

and hence Ã = A by Corollary 2.2. This settles (ii).
      To see (iii), replace ψ → U (s)ψ in (ii).

    For our original problem this implies that formula (5.3) is indeed the
solution to the initial value problem of the Schrödinger equation. Moreover,

    ⟨U(t)ψ, AU(t)ψ⟩ = ⟨U(t)ψ, U(t)Aψ⟩ = ⟨ψ, Aψ⟩                          (5.4)
shows that the expectations of A are time independent. This corresponds
to conservation of energy.
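In finite dimensions U(t) = exp(−itA) can be computed explicitly and the conservation law (5.4) checked numerically; a minimal sketch (Python with SciPy; the matrix and the initial state are illustrative):

    import numpy as np
    from scipy.linalg import expm

    rng = np.random.default_rng(3)
    X = rng.standard_normal((5, 5))
    A = (X + X.T) / 2                                  # self-adjoint generator
    psi0 = rng.standard_normal(5) + 1j * rng.standard_normal(5)
    psi0 /= np.linalg.norm(psi0)

    for t in (0.0, 0.5, 2.0, 10.0):
        U = expm(-1j * t * A)                          # U(t) = exp(-itA), unitary
        psi = U @ psi0
        # norm and energy expectation are conserved, cf. (5.4)
        print(t, np.linalg.norm(psi), np.vdot(psi, A @ psi).real)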
    On the other hand, the generator of the time evolution of a quantum
mechanical system should always be a self-adjoint operator since it corre-
sponds to an observable (energy). Moreover, there should be a one-to-one
correspondence between the unitary group and its generator. This is ensured
by Stone’s theorem.
Theorem 5.2 (Stone). Let U (t) be a weakly continuous one-parameter uni-
tary group. Then its generator A is self-adjoint and U (t) = exp(−itA).

Proof. First of all observe that weak continuity together with item (iv) of
Lemma 1.12 shows that U (t) is in fact strongly continuous.


    Next we show that A is densely defined. Pick ψ ∈ H and set

    ψ_τ = ∫_0^τ U(t)ψ dt

(the integral is defined as in Section 4.1) implying lim_{τ→0} τ^{−1}ψ_τ = ψ. More-
over,

    (1/t)(U(t)ψ_τ − ψ_τ) = (1/t) ∫_t^{t+τ} U(s)ψ ds − (1/t) ∫_0^τ U(s)ψ ds
                          = (1/t) ∫_τ^{τ+t} U(s)ψ ds − (1/t) ∫_0^t U(s)ψ ds
                          = U(τ) (1/t) ∫_0^t U(s)ψ ds − (1/t) ∫_0^t U(s)ψ ds → U(τ)ψ − ψ

as t → 0 shows ψ_τ ∈ D(A). As in the proof of the previous theorem, we can
show that A is symmetric and that U (t)D(A) = D(A).
    Next, let us prove that A is essentially self-adjoint. By Lemma 2.7 it
suffices to prove Ker(A* − z*) = {0} for z ∈ C\R. Suppose A*ϕ = z*ϕ.
Then for each ψ ∈ D(A) we have

    (d/dt) ⟨ϕ, U(t)ψ⟩ = ⟨ϕ, −iAU(t)ψ⟩ = −i ⟨A*ϕ, U(t)ψ⟩ = −iz ⟨ϕ, U(t)ψ⟩

and hence ⟨ϕ, U(t)ψ⟩ = exp(−izt)⟨ϕ, ψ⟩. Since the left-hand side is bounded
for all t ∈ R and the exponential on the right-hand side is not, we must have
⟨ϕ, ψ⟩ = 0, implying ϕ = 0 since D(A) is dense.
   So A is essentially self-adjoint and we can introduce V (t) = exp(−itA).
We are done if we can show U (t) = V (t).
    Let ψ ∈ D(A) and abbreviate ψ(t) = (U(t) − V(t))ψ. Then

    lim_{s→0} (ψ(t + s) − ψ(t))/s = iAψ(t)

and hence (d/dt)‖ψ(t)‖² = 2 Re⟨ψ(t), iAψ(t)⟩ = 0. Since ψ(0) = 0, we have
ψ(t) = 0 and hence U(t) and V(t) coincide on D(A). Furthermore, since
D(A) is dense, we have U(t) = V(t) by continuity.

    As an immediate consequence of the proof we also note the following
useful criterion.

Corollary 5.3. Suppose D ⊆ D(A) is dense and invariant under U (t).
Then A is essentially self-adjoint on D.

Proof. As in the above proof it follows that ⟨ϕ, ψ⟩ = 0 for any ψ ∈ D and
ϕ ∈ Ker(A* − z*).


   Note that by Lemma 4.9 two strongly continuous one-parameter groups
commute,
                              [e−itA , e−isB ] = 0,                             (5.5)
if and only if the generators commute.
    Clearly, for a physicist, one of the goals must be to understand the time
evolution of a quantum mechanical system. We have seen that the time
evolution is generated by a self-adjoint operator, the Hamiltonian, and is
given by a linear first order differential equation, the Schrödinger equation.
To understand the dynamics of such a first order differential equation, one
must understand the spectrum of the generator. Some general tools for this
endeavor will be provided in the following sections.

Problem 5.1. Let H = L2 (0, 2π) and consider the one-parameter unitary
group given by U (t)f (x) = f (x − t mod 2π). What is the generator of U ?


5.2. The RAGE theorem
Now, let us discuss why the decomposition of the spectrum introduced in
Section 3.3 is of physical relevance. Let ‖ϕ‖ = ‖ψ‖ = 1. The vector ⟨ϕ, ψ⟩ϕ
is the projection of ψ onto the (one-dimensional) subspace spanned by ϕ.
Hence |⟨ϕ, ψ⟩|² can be viewed as the part of ψ which is in the state ϕ. The
first question one might raise is, how does

    |⟨ϕ, U(t)ψ⟩|²,   U(t) = e^{−itA},                                    (5.6)

behave as t → ∞? By the spectral theorem,

    µ̂_{ϕ,ψ}(t) = ⟨ϕ, U(t)ψ⟩ = ∫_R e^{−itλ} dμ_{ϕ,ψ}(λ)                  (5.7)

is the Fourier transform of the measure μ_{ϕ,ψ}. Thus our question is an-
swered by Wiener’s theorem.

Theorem 5.4 (Wiener). Let µ be a finite complex Borel measure on R and
let
    µ̂(t) = ∫_R e^{−itλ} dμ(λ)                                           (5.8)

be its Fourier transform. Then the Cesàro time average of µ̂(t) has the limit

    lim_{T→∞} (1/T) ∫_0^T |µ̂(t)|² dt = Σ_{λ∈R} |μ({λ})|²,               (5.9)

where the sum on the right-hand side is finite.


Proof. By Fubini we have

    (1/T) ∫_0^T |µ̂(t)|² dt = (1/T) ∫_0^T ∫_R ∫_R e^{−i(x−y)t} dμ(x)dμ*(y) dt
                            = ∫_R ∫_R ((1/T) ∫_0^T e^{−i(x−y)t} dt) dμ(x)dμ*(y).

The function in parentheses is bounded by one and converges pointwise to
χ_{{0}}(x − y) as T → ∞. Thus, by the dominated convergence theorem, the
limit of the above expression is given by

    ∫_R ∫_R χ_{{0}}(x − y) dμ(x)dμ*(y) = ∫_R μ({y}) dμ*(y) = Σ_{y∈R} |μ({y})|²,

which finishes the proof.

     To apply this result to our situation, observe that the subspaces H_{ac},
H_{sc}, and H_{pp} are invariant with respect to time evolution since P^{xx} U(t) =
χ_{M_{xx}}(A) exp(−itA) = exp(−itA) χ_{M_{xx}}(A) = U(t) P^{xx}, xx ∈ {ac, sc, pp}.
Moreover, if ψ ∈ H_{xx}, we have P^{xx}ψ = ψ, which shows ⟨ϕ, f(A)ψ⟩ =
⟨ϕ, P^{xx} f(A)ψ⟩ = ⟨P^{xx}ϕ, f(A)ψ⟩, implying dμ_{ϕ,ψ} = dμ_{P^{xx}ϕ,ψ}. Thus if μ_ψ
is ac, sc, or pp, so is μ_{ϕ,ψ} for every ϕ ∈ H.
    That is, if ψ ∈ H_c = H_{ac} ⊕ H_{sc}, then the Cesàro mean of ⟨ϕ, U(t)ψ⟩ tends
to zero. In other words, the average of the probability of finding the system
in any prescribed state tends to zero if we start in the continuous subspace
Hc of A.
    If ψ ∈ H_{ac}, then dμ_{ϕ,ψ} is absolutely continuous with respect to Lebesgue
measure and thus µ̂_{ϕ,ψ}(t) is continuous and tends to zero as |t| → ∞. In
fact, this follows from the Riemann–Lebesgue lemma (see Lemma 7.6 below).
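A numerical sketch of Wiener's theorem (Python with NumPy; the measure, a point mass plus a Gaussian absolutely continuous part, is an illustrative assumption): the Cesàro average of |µ̂(t)|² approaches the squared weight of the point mass.

    import numpy as np

    w, lam0, c = 0.7, 1.3, 0.5      # µ = w·δ_{λ0} + c·(standard Gaussian density) dλ

    def mu_hat(t):
        # Fourier transform of µ: w e^{-i t λ0} + c e^{-t²/2}
        return w * np.exp(-1j * t * lam0) + c * np.exp(-t**2 / 2)

    for T in (10.0, 100.0, 1000.0):
        t = np.linspace(0.0, T, 200000)
        avg = np.mean(np.abs(mu_hat(t))**2)            # ≈ (1/T) ∫_0^T |µ̂(t)|² dt
        print(T, avg, w**2)                            # avg → w² = Σ_λ |µ({λ})|², cf. (5.9)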
   Now we want to draw some additional consequences from Wiener’s the-
orem. This will eventually yield a dynamical characterization of the contin-
uous and pure point spectrum due to Ruelle, Amrein, Georgescu, and Enß.
But first we need a few definitions.
    An operator K ∈ L(H) is called a finite rank operator if its range is
finite dimensional. The dimension

                                  rank(K) = dim Ran(K)

is called the rank of K. If {ψ_j}_{j=1}^n is an orthonormal basis for Ran(K), we
have

    Kψ = Σ_{j=1}^n ⟨ψ_j, Kψ⟩ ψ_j = Σ_{j=1}^n ⟨ϕ_j, ψ⟩ ψ_j,               (5.10)


where ϕj = K ∗ ψj . The elements ϕj are linearly independent since Ran(K) =
Ker(K ∗ )⊥ . Hence every finite rank operator is of the form (5.10). In addi-
tion, the adjoint of K is also finite rank and is given by
    K*ψ = Σ_{j=1}^n ⟨ψ_j, ψ⟩ ϕ_j.                                        (5.11)

    The closure of the set of all finite rank operators in L(H) is called the set
of compact operators C(H). It is straightforward to verify (Problem 5.2)
Lemma 5.5. The set of all compact operators C(H) is a closed ∗-ideal in
L(H).

   There is also a weaker version of compactness which is useful for us. The
operator K is called relatively compact with respect to A if
                               KRA (z) ∈ C(H)                            (5.12)
for one z ∈ ρ(A). By the first resolvent formula this then follows for all
z ∈ ρ(A). In particular we have D(A) ⊆ D(K).
      Now let us return to our original problem.
Theorem 5.6. Let A be self-adjoint and suppose K is relatively compact.
Then
    lim_{T→∞} (1/T) ∫_0^T ‖K e^{−itA} P^c ψ‖² dt = 0    and    lim_{t→∞} ‖K e^{−itA} P^{ac} ψ‖ = 0        (5.13)
for every ψ ∈ D(A). If, in addition, K is bounded, then the result holds for
any ψ ∈ H.

Proof. Let ψ ∈ H_c, respectively, ψ ∈ H_{ac}, and drop the projectors. Then,
if K is a rank one operator (i.e., K = ⟨ϕ_1, ·⟩ϕ_2), the claim follows from
Wiener's theorem, respectively, the Riemann–Lebesgue lemma. Hence it
holds for any finite rank operator K.
    If K is compact, there is a sequence K_n of finite rank operators such
that ‖K − K_n‖ ≤ 1/n and hence

    ‖K e^{−itA} ψ‖ ≤ ‖K_n e^{−itA} ψ‖ + (1/n)‖ψ‖.

Thus the claim holds for any compact operator K.
    If ψ ∈ D(A), we can set ψ = (A − i)^{-1}ϕ, where ϕ ∈ H_c if and only if
ψ ∈ H_c (since H_c reduces A). Since K(A + i)^{-1} is compact by assumption,
the claim can be reduced to the previous situation. If K is also bounded,
we can find a sequence ψ_n ∈ D(A) such that ‖ψ − ψ_n‖ ≤ 1/n and hence

    ‖K e^{−itA} ψ‖ ≤ ‖K e^{−itA} ψ_n‖ + (1/n)‖K‖,

concluding the proof.


  With the help of this result we can now prove an abstract version of the
RAGE theorem.
Theorem 5.7 (RAGE). Let A be self-adjoint. Suppose Kn ∈ L(H) is a se-
quence of relatively compact operators which converges strongly to the iden-
tity. Then
    H_c  = {ψ ∈ H | lim_{n→∞} lim_{T→∞} (1/T) ∫_0^T ‖K_n e^{−itA} ψ‖ dt = 0},
    H_pp = {ψ ∈ H | lim_{n→∞} sup_{t≥0} ‖(I − K_n) e^{−itA} ψ‖ = 0}.                 (5.14)


Proof. Abbreviate ψ(t) = exp(−itA)ψ. We begin with the first equation.
    Let ψ ∈ H_c. Then

    (1/T) ∫_0^T ‖K_n ψ(t)‖ dt ≤ ((1/T) ∫_0^T ‖K_n ψ(t)‖² dt)^{1/2} → 0

by Cauchy–Schwarz and the previous theorem. Conversely, if ψ ∉ H_c, we
can write ψ = ψ^c + ψ^{pp}. By our previous estimate it suffices to show
‖K_n ψ^{pp}(t)‖ ≥ ε > 0 for n large. In fact, we even claim

    lim_{n→∞} sup_{t≥0} ‖K_n ψ^{pp}(t) − ψ^{pp}(t)‖ = 0.                 (5.15)

By the spectral theorem, we can write ψ^{pp}(t) = Σ_j α_j(t)ψ_j, where the ψ_j
are orthonormal eigenfunctions and α_j(t) = exp(−itλ_j)α_j. Truncate this
expansion after N terms. Then this part converges uniformly to the desired
limit by strong convergence of K_n. Moreover, by Lemma 1.14 we have
‖K_n‖ ≤ M, and hence the error can be made arbitrarily small by choosing
N large.
    Now let us turn to the second equation. If ψ ∈ H_pp, the claim follows
by (5.15). Conversely, if ψ ∉ H_pp, we can write ψ = ψ^c + ψ^{pp} and by our
previous estimate it suffices to show that ‖(I − K_n)ψ^c(t)‖ does not tend to
0 as n → ∞. If it did, we would have

    0 = lim_{T→∞} (1/T) ∫_0^T ‖(I − K_n)ψ^c(t)‖² dt
      ≥ ‖ψ^c(t)‖² − lim_{T→∞} (1/T) ∫_0^T ‖K_n ψ^c(t)‖² dt = ‖ψ^c(t)‖²,

a contradiction.

    In summary, regularity properties of spectral measures are related to
the long time behavior of the corresponding quantum mechanical system.
However, a more detailed investigation of this topic is beyond the scope of
this manuscript. For a survey containing several recent results, see [28].


    It is often convenient to treat the observables as time dependent rather
than the states. We set
                              K(t) = eitA Ke−itA                      (5.16)
and note

    ⟨ψ(t), Kψ(t)⟩ = ⟨ψ, K(t)ψ⟩,   ψ(t) = e^{−itA}ψ.                      (5.17)
This point of view is often referred to as the Heisenberg picture in the
physics literature. If K is unbounded, we will assume D(A) ⊆ D(K) such
that the above equations make sense at least for ψ ∈ D(A). The main
interest is the behavior of K(t) for large t. The strong limits are called
asymptotic observables if they exist.
Theorem 5.8. Suppose A is self-adjoint and K is relatively compact. Then
    lim_{T→∞} (1/T) ∫_0^T e^{itA} K e^{−itA} ψ dt = Σ_{λ∈σ_p(A)} P_A({λ}) K P_A({λ}) ψ,   ψ ∈ D(A).       (5.18)
If K is in addition bounded, the result holds for any ψ ∈ H.

Proof. We will assume that K is bounded. To obtain the general result,
use the same trick as before and replace K by KRA (z). Write ψ = ψ c + ψ pp .
Then
    lim_{T→∞} (1/T) ‖∫_0^T K(t)ψ^c dt‖ ≤ lim_{T→∞} (1/T) ∫_0^T ‖K(t)ψ^c‖ dt = 0

by Theorem 5.6. As in the proof of the previous theorem we can write
ψ^{pp} = Σ_j α_j ψ_j and hence

    Σ_j α_j (1/T) ∫_0^T K(t)ψ_j dt = Σ_j α_j ((1/T) ∫_0^T e^{it(A−λ_j)} dt) Kψ_j.

As in the proof of Wiener’s theorem, we see that the operator in parentheses
tends to PA ({λj }) strongly as T → ∞. Since this operator is also bounded
by 1 for all T , we can interchange the limit with the summation and the
claim follows.
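In finite dimensions every self-adjoint matrix has pure point spectrum, so (5.18) can be tested directly; a sketch (Python with NumPy; the matrices, T, and the quadrature are illustrative assumptions):

    import numpy as np

    rng = np.random.default_rng(4)
    X, Y = rng.standard_normal((4, 4)), rng.standard_normal((4, 4))
    A, K = (X + X.T) / 2, Y                            # self-adjoint A, arbitrary bounded K
    lam, V = np.linalg.eigh(A)

    # Right-hand side of (5.18): sum of P_A({λ}) K P_A({λ}) over the (simple) eigenvalues.
    rhs = sum(np.outer(V[:, j], V[:, j]) @ K @ np.outer(V[:, j], V[:, j]) for j in range(4))

    def K_of_t(t):
        U = V @ np.diag(np.exp(-1j * t * lam)) @ V.T   # e^{-itA} via the spectral theorem
        return U.conj().T @ K @ U                      # K(t) = e^{itA} K e^{-itA}

    T, N = 500.0, 5000
    lhs = sum(K_of_t(t) for t in np.linspace(0.0, T, N)) / N   # Cesàro time average
    print(np.linalg.norm(lhs - rhs))                   # small, and → 0 as T → ∞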

      We also note the following corollary.
Corollary 5.9. Under the same assumptions as in the RAGE theorem we
have
    lim_{n→∞} lim_{T→∞} (1/T) ∫_0^T e^{itA} K_n e^{−itA} ψ dt = P^{pp} ψ,            (5.19)

respectively,

    lim_{n→∞} lim_{T→∞} (1/T) ∫_0^T e^{itA} (I − K_n) e^{−itA} ψ dt = P^c ψ.         (5.20)
Problem 5.2. Prove Lemma 5.5.


Problem 5.3. Prove Corollary 5.9.

5.3. The Trotter product formula
In many situations the operator is of the form A + B, where eitA and eitB
can be computed explicitly. Since A and B will not commute in general, we
cannot obtain eit(A+B) from eitA eitB . However, we at least have
Theorem 5.10 (Trotter product formula). Suppose A, B, and A + B are
self-adjoint. Then
    e^{it(A+B)} = s-lim_{n→∞} (e^{i(t/n)A} e^{i(t/n)B})^n.               (5.21)

Proof. First of all note that we have

    (e^{iτA} e^{iτB})^n − e^{it(A+B)}
        = Σ_{j=0}^{n−1} (e^{iτA} e^{iτB})^{n−1−j} (e^{iτA} e^{iτB} − e^{iτ(A+B)}) (e^{iτ(A+B)})^j,

where τ = t/n, and hence

    ‖((e^{iτA} e^{iτB})^n − e^{it(A+B)})ψ‖ ≤ |t| max_{|s|≤|t|} F_τ(s),

where

    F_τ(s) = (1/τ)‖(e^{iτA} e^{iτB} − e^{iτ(A+B)}) e^{is(A+B)} ψ‖.

Now for ψ ∈ D(A + B) = D(A) ∩ D(B) we have

    (1/τ)(e^{iτA} e^{iτB} − e^{iτ(A+B)})ψ → iAψ + iBψ − i(A + B)ψ = 0
as τ → 0. So limτ →0 Fτ (s) = 0 at least pointwise, but we need this uniformly
with respect to s ∈ [−|t|, |t|].
    Pointwise convergence implies

    ‖(1/τ)(e^{iτA} e^{iτB} − e^{iτ(A+B)})ψ‖ ≤ C(ψ)

and, since D(A + B) is a Hilbert space when equipped with the graph norm
‖ψ‖²_{Γ(A+B)} = ‖ψ‖² + ‖(A + B)ψ‖², we can invoke the uniform boundedness
principle to obtain

    ‖(1/τ)(e^{iτA} e^{iτB} − e^{iτ(A+B)})ψ‖ ≤ C ‖ψ‖_{Γ(A+B)}.

Now

    |F_τ(s) − F_τ(r)| ≤ (1/τ)‖(e^{iτA} e^{iτB} − e^{iτ(A+B)})(e^{is(A+B)} − e^{ir(A+B)})ψ‖
                      ≤ C ‖(e^{is(A+B)} − e^{ir(A+B)})ψ‖_{Γ(A+B)}


shows that F_τ(·) is uniformly continuous and the claim follows by a standard
ε/2 argument.
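A quick numerical check of the Trotter product formula (Python with SciPy; the matrices and the value of t are illustrative): the error of (e^{i(t/n)A} e^{i(t/n)B})^n decreases as n grows.

    import numpy as np
    from scipy.linalg import expm
    from numpy.linalg import matrix_power, norm

    rng = np.random.default_rng(5)
    X, Y = rng.standard_normal((4, 4)), rng.standard_normal((4, 4))
    A, B = (X + X.T) / 2, (Y + Y.T) / 2                # bounded self-adjoint A, B (so A+B is self-adjoint)
    t = 1.0
    exact = expm(1j * t * (A + B))

    for n in (1, 10, 100, 1000):
        step = expm(1j * t / n * A) @ expm(1j * t / n * B)
        print(n, norm(matrix_power(step, n) - exact))  # error shrinks with n, cf. (5.21)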

      If the operators are semi-bounded from below, the same proof shows
Theorem 5.11 (Trotter product formula). Suppose A, B, and A + B are
self-adjoint and semi-bounded from below. Then
    e^{−t(A+B)} = s-lim_{n→∞} (e^{−(t/n)A} e^{−(t/n)B})^n,   t ≥ 0.      (5.22)

Problem 5.4. Prove Theorem 5.11.
Chapter 6




Perturbation theory for
self-adjoint operators


The Hamiltonian of a quantum mechanical system is usually the sum of
the kinetic energy H_0 (free Schrödinger operator) plus an operator V cor-
responding to the potential energy. Since H_0 is easy to investigate, one
usually tries to consider V as a perturbation of H0 . This will only work
if V is small with respect to H0 . Hence we study such perturbations of
self-adjoint operators next.


6.1. Relatively bounded operators and the Kato–Rellich
     theorem
An operator B is called A bounded or relatively bounded with respect
to A if D(A) ⊆ D(B) and if there are constants a, b ≥ 0 such that

                   ‖Bψ‖ ≤ a‖Aψ‖ + b‖ψ‖,           ψ ∈ D(A).              (6.1)

The infimum of all constants a for which a corresponding b exists such that
(6.1) holds is called the A-bound of B.
   The triangle inequality implies

Lemma 6.1. Suppose B_j, j = 1, 2, are A bounded with respective A-bounds
a_j, j = 1, 2. Then α_1 B_1 + α_2 B_2 is also A bounded with A-bound less than
|α1 |a1 + |α2 |a2 . In particular, the set of all A bounded operators forms a
linear space.

   There are also the following equivalent characterizations:



Lemma 6.2. Suppose A is closed and B is closable. Then the following are
equivalent:
        (i) B is A bounded.
       (ii) D(A) ⊆ D(B).
       (iii) BRA (z) is bounded for one (and hence for all) z ∈ ρ(A).
Moreover, the A-bound of B is no larger than inf_{z∈ρ(A)} ‖BR_A(z)‖.

Proof. (i) ⇒ (ii) is true by definition. (ii) ⇒ (iii) since BRA (z) is a closed
(Problem 2.9) operator defined on all of H and hence bounded by the closed
graph theorem (Theorem 2.8). To see (iii) ⇒ (i), let ψ ∈ D(A). Then
    ‖Bψ‖ = ‖BR_A(z)(A − z)ψ‖ ≤ a‖(A − z)ψ‖ ≤ a‖Aψ‖ + (a|z|)‖ψ‖,

where a = ‖BR_A(z)‖. Finally, note that if BR_A(z) is bounded for one
z ∈ ρ(A), it is bounded for all z ∈ ρ(A) by the first resolvent formula.
Example. Let A be the self-adjoint operator A = −d²/dx², D(A) = {f ∈
H²[0, 1] | f(0) = f(1) = 0}, in the Hilbert space L²(0, 1). If we want to add a
potential represented by a multiplication operator with a real-valued (mea-
surable) function q, then q will be relatively bounded if q ∈ L2 (0, 1): Indeed,
since all functions in D(A) are continuous on [0, 1] and hence bounded, we
clearly have D(A) ⊂ D(q) in this case.

    We are mainly interested in the situation where A is self-adjoint and B
is symmetric. Hence we will restrict our attention to this case.
Lemma 6.3. Suppose A is self-adjoint and B relatively bounded. The A-
bound of B is given by
    lim_{λ→∞} ‖BR_A(±iλ)‖.                                               (6.2)
If A is bounded from below, we can also replace ±iλ by −λ.

Proof. Let ϕ = R_A(±iλ)ψ, λ > 0, and let a_∞ be the A-bound of B. Then
(use the spectral theorem to estimate the norms)

    ‖BR_A(±iλ)ψ‖ ≤ a‖AR_A(±iλ)ψ‖ + b‖R_A(±iλ)ψ‖ ≤ (a + b/λ)‖ψ‖.

Hence lim sup_{λ→∞} ‖BR_A(±iλ)‖ ≤ a_∞ which, together with the inequality a_∞ ≤
inf_λ ‖BR_A(±iλ)‖ from the previous lemma, proves the claim.
    The case where A is bounded from below is similar, using

    ‖BR_A(−λ)ψ‖ ≤ (a max(1, |γ|/(λ + γ)) + b/(λ + γ))‖ψ‖

for −λ < γ.

      Now we will show the basic perturbation result due to Kato and Rellich.


Theorem 6.4 (Kato–Rellich). Suppose A is (essentially) self-adjoint and
B is symmetric with A-bound less than one. Then A + B, D(A + B) =
D(A), is (essentially) self-adjoint. If A is essentially self-adjoint, we have
D(\overline{A}) ⊆ D(\overline{B}) and \overline{A + B} = \overline{A} + \overline{B}.
   If A is bounded from below by γ, then A + B is bounded from below by
    γ − max(a|γ| + b, b/(a − 1)).                                        (6.3)

Proof. Since D(A) ⊆ D(B) and D(A) ⊆ D(A + B) by (6.1), we can assume
that A is closed (i.e., self-adjoint). It suffices to show that Ran(A+B ±iλ) =
H. By the above lemma we can find a λ > 0 such that ‖BR_A(±iλ)‖ < 1.
Hence −1 ∈ ρ(BRA (±iλ)) and thus I + BRA (±iλ) is onto. Thus
                  (A + B ± iλ) = (I + BRA (±iλ))(A ± iλ)
is onto and the proof of the first part is complete.
    If A is bounded from below, we can replace ±iλ by −λ and the above
equation shows that RA+B exists for λ sufficiently large. By the proof of
the previous lemma we can choose −λ < min(γ, b/(a − 1)).

Example. In our previous example we have seen that q ∈ L2 (0, 1) is rel-
atively bounded by checking D(A) ⊂ D(q). However, working a bit harder
(Problem 6.2), one can even show that the relative bound is 0 and hence
A + q is self-adjoint by the Kato–Rellich theorem.

   Finally, let us show that there is also a connection between the resolvents.

Lemma 6.5. Suppose A and B are closed and D(A) ⊆ D(B). Then we
have the second resolvent formula
    RA+B (z) − RA (z) = −RA (z)BRA+B (z) = −RA+B (z)BRA (z)              (6.4)
for z ∈ ρ(A) ∩ ρ(A + B).

Proof. We compute
   RA+B (z) + RA (z)BRA+B (z) = RA (z)(A + B − z)RA+B (z) = RA (z).
The second identity is similar.

Problem 6.1. Show that (6.1) implies

    ‖Bψ‖² ≤ ã²‖Aψ‖² + b̃²‖ψ‖²

with ã = a(1 + ε²) and b̃ = b(1 + ε^{−2}) for any ε > 0. Conversely, show that
this inequality implies (6.1) with a = ã and b = b̃.


Problem 6.2. Let A be the self-adjoint operator A = −d²/dx², D(A) = {f ∈
H²[0, 1] | f(0) = f(1) = 0}, in the Hilbert space L²(0, 1) and q ∈ L²(0, 1).
    Show that for every f ∈ D(A) we have

    ‖f‖²_∞ ≤ (ε/2)‖f″‖² + (1/(2ε))‖f‖²

for any ε > 0. Conclude that the relative bound of q with respect to A is
zero. (Hint: |f(x)|² ≤ (∫_0^1 |f′(t)| dt)² ≤ ∫_0^1 |f′(t)|² dt = −∫_0^1 f″(t)* f(t) dt.)
Problem 6.3. Let A be as in the previous example. Show that q is relatively
bounded if and only if x(1 − x)q(x) ∈ L2 (0, 1).
Problem 6.4. Compute the resolvent of A + α⟨ψ, ·⟩ψ. (Hint: Show

    (I + α⟨ϕ, ·⟩ψ)^{-1} = I − (α/(1 + α⟨ϕ, ψ⟩)) ⟨ϕ, ·⟩ψ

and use the second resolvent formula.)
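The inversion formula in the hint is easy to test numerically (Python with NumPy; the vectors and α are illustrative, assuming 1 + α⟨ϕ, ψ⟩ ≠ 0):

    import numpy as np

    rng = np.random.default_rng(6)
    phi, psi = rng.standard_normal(5), rng.standard_normal(5)
    alpha = 0.7                                        # assumes 1 + alpha*⟨phi, psi⟩ ≠ 0

    M = np.eye(5) + alpha * np.outer(psi, phi)         # I + α⟨ϕ, ·⟩ψ (rank-one perturbation of I)
    Minv = np.eye(5) - alpha / (1 + alpha * (phi @ psi)) * np.outer(psi, phi)
    print(np.linalg.norm(M @ Minv - np.eye(5)))        # ≈ 0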

6.2. More on compact operators
Recall from Section 5.2 that we have introduced the set of compact operators
C(H) as the closure of the set of all finite rank operators in L(H). Before we
can proceed, we need to establish some further results for such operators.
We begin by investigating the spectrum of self-adjoint compact operators
and show that the spectral theorem takes a particularly simple form in this
case.
Theorem 6.6 (Spectral theorem for compact operators). Suppose the op-
erator K is self-adjoint and compact. Then the spectrum of K consists of
an at most countable number of eigenvalues which can only cluster at 0.
Moreover, the eigenspace to each nonzero eigenvalue is finite dimensional.
      In addition, we have
    K = Σ_{λ∈σ(K)} λ P_K({λ}).                                           (6.5)

Proof. It suffices to show rank(P_K((λ − ε, λ + ε))) < ∞ for 0 < ε < |λ|.
Let K_n be a sequence of finite rank operators such that ‖K − K_n‖ ≤ 1/n. If
Ran P_K((λ − ε, λ + ε)) is infinite dimensional, we can find a vector ψ_n in this
range such that ‖ψ_n‖ = 1 and K_n ψ_n = 0. But this yields a contradiction
since

    1/n ≥ |⟨ψ_n, (K − K_n)ψ_n⟩| = |⟨ψ_n, Kψ_n⟩| ≥ |λ| − ε > 0

by (4.2).

   As a consequence we obtain the canonical form of a general compact
operator.


Theorem 6.7 (Canonical form of compact operators). Let K be compact.
There exist orthonormal sets {φ̂_j}, {φ_j} and positive numbers s_j = s_j(K)
such that

    K = Σ_j s_j ⟨φ_j, ·⟩ φ̂_j,     K* = Σ_j s_j ⟨φ̂_j, ·⟩ φ_j.            (6.6)

Note Kφ_j = s_j φ̂_j and K*φ̂_j = s_j φ_j, and hence K*Kφ_j = s_j² φ_j and KK*φ̂_j =
s_j² φ̂_j.
    The numbers s_j(K)² > 0 are the nonzero eigenvalues of KK*, respec-
tively, K ∗ K (counted with multiplicity) and sj (K) = sj (K ∗ ) = sj are called
singular values of K. There are either finitely many singular values (if K
is finite rank) or they converge to zero.

Proof. By Lemma 5.5, K*K is compact and hence Theorem 6.6 applies.
Let {φ_j} be an orthonormal basis of eigenvectors for P_{K*K}((0, ∞))H and let
s_j² be the eigenvalue corresponding to φ_j. Then, for any ψ ∈ H we can write

    ψ = Σ_j ⟨φ_j, ψ⟩ φ_j + ψ̃

with ψ̃ ∈ Ker(K*K) = Ker(K). Then

    Kψ = Σ_j s_j ⟨φ_j, ψ⟩ φ̂_j,

where φ̂_j = s_j^{-1} Kφ_j, since ‖Kψ̃‖² = ⟨ψ̃, K*Kψ̃⟩ = 0. By ⟨φ̂_j, φ̂_k⟩ =
(s_j s_k)^{-1} ⟨Kφ_j, Kφ_k⟩ = (s_j s_k)^{-1} ⟨K*Kφ_j, φ_k⟩ = s_j s_k^{-1} ⟨φ_j, φ_k⟩ we see that
the {φ̂_j} are orthonormal and the formula for K* follows by taking the
adjoint of the formula for K (Problem 6.5).

    If K is self-adjoint, then φ_j = σ_j φ̂_j, σ_j² = 1, are the eigenvectors of K
and σ_j s_j are the corresponding eigenvalues.
    Moreover, note that we have
    ‖K‖ = max_j s_j(K).                                                  (6.7)
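For matrices, Theorem 6.7 is the singular value decomposition; a short numerical check (Python with NumPy; the matrix is illustrative):

    import numpy as np

    rng = np.random.default_rng(7)
    K = rng.standard_normal((4, 6))                    # a (finite rank) operator

    U, s, Vt = np.linalg.svd(K, full_matrices=False)
    # Columns of Vt.T play the role of the φ_j, columns of U that of the φ̂_j, and s_j are the singular values.
    print(np.allclose(K, (U * s) @ Vt))                # K = Σ_j s_j ⟨φ_j, ·⟩ φ̂_j, cf. (6.6)
    print(np.allclose(K @ Vt.T, U * s))                # K φ_j = s_j φ̂_j
    print(np.allclose(s**2, np.linalg.eigvalsh(K.T @ K)[::-1][:4]))  # s_j² are eigenvalues of K*K
    print(np.isclose(np.linalg.norm(K, 2), s.max()))                 # ‖K‖ = max_j s_j, cf. (6.7)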

    Finally, let me remark that there are a number of other equivalent defi-
nitions for compact operators.
Lemma 6.8. For K ∈ L(H) the following statements are equivalent:
      (i) K is compact.
     (i’) K ∗ is compact.
      (ii) A_n ∈ L(H) and A_n → A strongly implies A_n K → AK in norm.
      (iii) ψ_n ⇀ ψ weakly implies Kψ_n → Kψ in norm.


       (iv) ψn bounded implies that Kψn has a (norm) convergent subse-
            quence.

Proof. (i) ⇔ (i’). This is immediate from Theorem 6.7.
   (i) ⇒ (ii). Translating An → An − A, it is no restriction to assume
A = 0. Since An ≤ M , it suffices to consider the case where K is finite
rank. Then (by (6.6))
                            N                                     N
         An K   2
                    = sup                      2    ˆ
                                 s j | φ j , ψ | An φ j   2
                                                              ≤                ˆ
                                                                        s j An φ j   2
                                                                                         → 0.
                      ψ =1 j=1                                    j=1

    (ii) ⇒ (iii). Again, replace ψ_n → ψ_n − ψ and assume ψ = 0. Choose
A_n = ⟨ψ_n, ·⟩ϕ, ‖ϕ‖ = 1. Then ‖Kψ_n‖ = ‖A_n K*‖ → 0.
   (iii) ⇒ (iv). If ψn is bounded, it has a weakly convergent subsequence
by Lemma 1.13. Now apply (iii) to this subsequence.
    (iv) ⇒ (i). Let ϕ_j be an orthonormal basis and set

    K_n = Σ_{j=1}^n ⟨ϕ_j, ·⟩ Kϕ_j.

Then

    γ_n = ‖K − K_n‖ = sup_{ψ∈span{ϕ_j}_{j=n}^∞, ‖ψ‖=1} ‖Kψ‖

is a decreasing sequence tending to a limit ε ≥ 0. Moreover, we can find
a sequence of unit vectors ψ_n ∈ span{ϕ_j}_{j=n}^∞ for which ‖Kψ_n‖ ≥ ε. By
assumption, Kψn has a convergent subsequence which, since ψn converges
weakly to 0, converges to 0. Hence ε must be 0 and we are done.

    The last condition explains the name compact. Moreover, note that one
cannot replace An K → AK by KAn → KA in (ii) unless one additionally
requires An to be normal (then this follows by taking adjoints — recall
that only for normal operators is taking adjoints continuous with respect
to strong convergence). Without the requirement that An be normal, the
claim is wrong as the following example shows.
Example. Let H = ℓ²(N) and let A_n be the operator which shifts each
sequence n places to the left and let K = ⟨δ_1, ·⟩δ_1, where δ_1 = (1, 0, . . . ).
Then s-lim A_n = 0 but ‖KA_n‖ = 1.

Problem 6.5. Deduce the formula for K ∗ from the one for K in (6.6).

Problem 6.6. Show that it suffices to check conditions (iii) and (iv) from
Lemma 6.8 on a dense subset.


6.3. Hilbert–Schmidt and trace class operators
Among the compact operators two special classes are of particular impor-
tance. The first ones are integral operators

    Kψ(x) = ∫_M K(x, y)ψ(y) dμ(y),   ψ ∈ L²(M, dμ),                      (6.8)

where K(x, y) ∈ L2 (M ×M, dµ⊗dµ). Such an operator is called a Hilbert–
Schmidt operator. Using Cauchy–Schwarz,
    ∫_M |Kψ(x)|² dμ(x) ≤ ∫_M (∫_M |K(x, y)ψ(y)| dμ(y))² dμ(x)
        ≤ ∫_M (∫_M |K(x, y)|² dμ(y)) (∫_M |ψ(y)|² dμ(y)) dμ(x)
        = (∫_M ∫_M |K(x, y)|² dμ(y) dμ(x)) (∫_M |ψ(y)|² dμ(y)),          (6.9)

we see that K is bounded. Next, pick an orthonormal basis ϕ_j(x) for
L²(M, dμ). Then, by Lemma 1.10, ϕ_i(x)ϕ_j(y) is an orthonormal basis for
L²(M × M, dμ ⊗ dμ) and

    K(x, y) = Σ_{i,j} c_{i,j} ϕ_i(x)ϕ_j(y),   c_{i,j} = ⟨ϕ_i, Kϕ_j*⟩,     (6.10)

where

    Σ_{i,j} |c_{i,j}|² = ∫_M ∫_M |K(x, y)|² dμ(y) dμ(x) < ∞.             (6.11)


    In particular,

    Kψ(x) = Σ_{i,j} c_{i,j} ⟨ϕ_j*, ψ⟩ ϕ_i(x)                             (6.12)

shows that K can be approximated by finite rank operators (take finitely
many terms in the sum) and is hence compact.
   Using (6.6), we can also give a different characterization of Hilbert–
Schmidt operators.

Lemma 6.9. If H = L²(M, dμ), then a compact operator K is Hilbert–
Schmidt if and only if Σ_j s_j(K)² < ∞ and

    Σ_j s_j(K)² = ∫_M ∫_M |K(x, y)|² dμ(x) dμ(y),                        (6.13)

in this case.


Proof. If K is compact, we can define approximating finite rank operators
K_n by considering only finitely many terms in (6.6):

    K_n = Σ_{j=1}^n s_j ⟨φ_j, ·⟩ φ̂_j.

Then K_n has the kernel K_n(x, y) = Σ_{j=1}^n s_j φ_j(y)* φ̂_j(x) and

    ∫_M ∫_M |K_n(x, y)|² dμ(x) dμ(y) = Σ_{j=1}^n s_j(K)².

Now if one side converges, so does the other and, in particular, (6.13) holds
in this case.

    Hence we will call a compact operator Hilbert–Schmidt if its singular
values satisfy

    Σ_j s_j(K)² < ∞.                                                     (6.14)
By our lemma this coincides with our previous definition if H = L2 (M, dµ).
    Since every Hilbert space is isomorphic to some L2 (M, dµ), we see that
the Hilbert–Schmidt operators together with the norm
    ‖K‖_2 = (Σ_j s_j(K)²)^{1/2}                                          (6.15)

form a Hilbert space (isomorphic to L²(M × M, dμ ⊗ dμ)). Note that ‖K‖_2 =
‖K*‖_2 (since s_j(K) = s_j(K*)). There is another useful characterization for
identifying Hilbert–Schmidt operators:
Lemma 6.10. A compact operator K is Hilbert–Schmidt if and only if
    Σ_n ‖Kψ_n‖² < ∞                                                      (6.16)

for some orthonormal basis and

    Σ_n ‖Kψ_n‖² = ‖K‖²_2                                                 (6.17)
for any orthonormal basis in this case.

Proof. This follows from

    Σ_n ‖Kψ_n‖² = Σ_{n,j} |⟨φ̂_j, Kψ_n⟩|² = Σ_{n,j} |⟨K*φ̂_j, ψ_n⟩|²
                = Σ_j ‖K*φ̂_j‖² = Σ_j s_j(K)².
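For matrices the Hilbert–Schmidt norm is the Frobenius norm, and (6.13), (6.17) can be verified directly; a sketch (Python with NumPy; the matrix is illustrative):

    import numpy as np

    rng = np.random.default_rng(8)
    K = rng.standard_normal((5, 5))

    s = np.linalg.svd(K, compute_uv=False)
    hs_from_sv = np.sum(s**2)                          # Σ_j s_j(K)², cf. (6.15)
    hs_from_entries = np.sum(K**2)                     # discrete analogue of ∫∫|K(x, y)|², cf. (6.13)
    basis = np.eye(5)                                  # an orthonormal basis {ψ_n}
    hs_from_basis = sum(np.linalg.norm(K @ basis[:, n])**2 for n in range(5))   # Σ_n ‖Kψ_n‖², cf. (6.17)
    print(hs_from_sv, hs_from_entries, hs_from_basis)  # all three agree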


Corollary 6.11. The set of Hilbert–Schmidt operators forms a ∗-ideal in
L(H) and

    ‖KA‖_2 ≤ ‖A‖ ‖K‖_2,     respectively,     ‖AK‖_2 ≤ ‖A‖ ‖K‖_2.        (6.18)

Proof. Let K be Hilbert–Schmidt and A bounded. Then AK is compact
and
    ‖AK‖²_2 = Σ_n ‖AKψ_n‖² ≤ ‖A‖² Σ_n ‖Kψ_n‖² = ‖A‖² ‖K‖²_2.

For KA just consider adjoints.

    This approach can be generalized by defining

    ‖K‖_p = (Σ_j s_j(K)^p)^{1/p}                                         (6.19)

plus corresponding spaces

    J_p(H) = {K ∈ C(H) | ‖K‖_p < ∞},                                     (6.20)

which are known as Schatten p-classes. Note that by (6.7)

    ‖K‖ ≤ ‖K‖_p                                                          (6.21)

and that by s_j(K) = s_j(K*) we have

    ‖K‖_p = ‖K*‖_p.                                                      (6.22)

Lemma 6.12. The spaces J_p(H) together with the norm ‖·‖_p are Banach
spaces. Moreover,

    ‖K‖_p = sup{ (Σ_j |⟨ψ_j, Kϕ_j⟩|^p)^{1/p} | {ψ_j}, {ϕ_j} ONS },       (6.23)

where the sup is taken over all orthonormal sets.

Proof. The hard part is to prove (6.23): Choose $q$ such that $\frac{1}{p} + \frac{1}{q} = 1$ and use Hölder's inequality to obtain (since $s_j|...|^2 = (s_j^p|...|^2)^{1/p}|...|^{2/q}$)
$$\sum_j s_j|\langle\varphi_n, \phi_j\rangle|^2 \le \Big(\sum_j s_j^p|\langle\varphi_n, \phi_j\rangle|^2\Big)^{1/p}\Big(\sum_j |\langle\varphi_n, \phi_j\rangle|^2\Big)^{1/q} \le \Big(\sum_j s_j^p|\langle\varphi_n, \phi_j\rangle|^2\Big)^{1/p}.$$


Clearly the analogous equation holds for $\hat\phi_j$, $\psi_n$. Now using Cauchy–Schwarz, we have
$$|\langle\psi_n, K\varphi_n\rangle|^p = \Big|\sum_j s_j^{1/2}\langle\varphi_n, \phi_j\rangle\, s_j^{1/2}\langle\hat\phi_j, \psi_n\rangle\Big|^p \le \Big(\sum_j s_j^p|\langle\varphi_n, \phi_j\rangle|^2\Big)^{1/2}\Big(\sum_j s_j^p|\langle\psi_n, \hat\phi_j\rangle|^2\Big)^{1/2}.$$

Summing over $n$, a second appeal to Cauchy–Schwarz and interchanging the order of summation finally gives
$$\sum_n |\langle\psi_n, K\varphi_n\rangle|^p \le \Big(\sum_{n,j} s_j^p|\langle\varphi_n, \phi_j\rangle|^2\Big)^{1/2}\Big(\sum_{n,j} s_j^p|\langle\psi_n, \hat\phi_j\rangle|^2\Big)^{1/2} \le \Big(\sum_j s_j^p\Big)^{1/2}\Big(\sum_j s_j^p\Big)^{1/2} = \sum_j s_j^p.$$

Since equality is attained for $\varphi_n = \phi_n$ and $\psi_n = \hat\phi_n$, equation (6.23) holds.
    Now the rest is straightforward. From
$$\Big(\sum_j |\langle\psi_j, (K_1+K_2)\varphi_j\rangle|^p\Big)^{1/p} \le \Big(\sum_j |\langle\psi_j, K_1\varphi_j\rangle|^p\Big)^{1/p} + \Big(\sum_j |\langle\psi_j, K_2\varphi_j\rangle|^p\Big)^{1/p} \le \|K_1\|_p + \|K_2\|_p$$

we infer that $\mathcal{J}_p(H)$ is a vector space and that the triangle inequality holds. The other requirements for a norm are obvious and it remains to check completeness. If $K_n$ is a Cauchy sequence with respect to $\|\cdot\|_p$, it is also a Cauchy sequence with respect to $\|\cdot\|$ (since $\|K\| \le \|K\|_p$). Since $C(H)$ is closed, there is a compact $K$ with $\|K - K_n\| \to 0$ and by $\|K_n\|_p \le C$ we have
$$\Big(\sum_j |\langle\psi_j, K\varphi_j\rangle|^p\Big)^{1/p} \le C$$
for any finite ONS. Since the right-hand side is independent of the ONS (and in particular of the number of vectors), $K$ is in $\mathcal{J}_p(H)$.

    The two most important cases are p = 1 and p = 2: J2 (H) is the space
of Hilbert–Schmidt operators investigated in the previous section and J1 (H)
is the space of trace class operators. Since Hilbert–Schmidt operators are
easy to identify, it is important to relate J1 (H) with J2 (H):
Lemma 6.13. An operator is trace class if and only if it can be written as
the product of two Hilbert–Schmidt operators, K = K1 K2 , and in this case


we have
$$\|K\|_1 \le \|K_1\|_2\,\|K_2\|_2. \qquad (6.24)$$

Proof. By Cauchy–Schwarz we have
$$\sum_n |\langle\varphi_n, K\psi_n\rangle| = \sum_n |\langle K_1^*\varphi_n, K_2\psi_n\rangle| \le \Big(\sum_n \|K_1^*\varphi_n\|^2\sum_n \|K_2\psi_n\|^2\Big)^{1/2} = \|K_1\|_2\|K_2\|_2$$
and hence $K = K_1K_2$ is trace class if both $K_1$ and $K_2$ are Hilbert–Schmidt operators. To see the converse, let $K$ be given by (6.6) and choose $K_1 = \sum_j \sqrt{s_j(K)}\,\langle\phi_j, \cdot\rangle\hat\phi_j$, respectively, $K_2 = \sum_j \sqrt{s_j(K)}\,\langle\phi_j, \cdot\rangle\phi_j$.
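To check that these factors indeed reproduce $K$, note that by orthonormality of the $\phi_j$,
$$K_1K_2\psi = \sum_j \sqrt{s_j(K)}\,\langle\phi_j, K_2\psi\rangle\,\hat\phi_j = \sum_j s_j(K)\langle\phi_j, \psi\rangle\,\hat\phi_j = K\psi,$$
and $\|K_1\|_2^2 = \|K_2\|_2^2 = \sum_j s_j(K) = \|K\|_1$, so for this particular choice (6.24) even holds with equality.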

Corollary 6.14. The set of trace class operators forms a ∗-ideal in L(H)
and
$$\|KA\|_1 \le \|A\|\,\|K\|_1, \qquad \text{respectively,} \qquad \|AK\|_1 \le \|A\|\,\|K\|_1. \qquad (6.25)$$

Proof. Write K = K1 K2 with K1 , K2 Hilbert–Schmidt and use Corol-
lary 6.11.

   Now we can also explain the name trace class:
Lemma 6.15. If K is trace class, then for any orthonormal basis {ϕn } the
trace
$$\mathrm{tr}(K) = \sum_n \langle\varphi_n, K\varphi_n\rangle \qquad (6.26)$$
is finite and independent of the orthonormal basis.

Proof. Let {ψn } be another ONB. If we write K = K1 K2 with K1 , K2
Hilbert–Schmidt, we have
$$\sum_n \langle\varphi_n, K_1K_2\varphi_n\rangle = \sum_n \langle K_1^*\varphi_n, K_2\varphi_n\rangle = \sum_{n,m}\langle K_1^*\varphi_n, \psi_m\rangle\langle\psi_m, K_2\varphi_n\rangle = \sum_{m,n}\langle K_2^*\psi_m, \varphi_n\rangle\langle\varphi_n, K_1\psi_m\rangle = \sum_m \langle K_2^*\psi_m, K_1\psi_m\rangle = \sum_m \langle\psi_m, K_2K_1\psi_m\rangle.$$
Hence the trace is independent of the ONB and we even have tr(K1 K2 ) =
tr(K2 K1 ).
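As a simple illustration, consider a rank-one operator $K = \langle\phi, \cdot\rangle\psi$ with $\phi, \psi \in H$. For any ONB $\{\varphi_n\}$ one finds
$$\mathrm{tr}(K) = \sum_n \langle\varphi_n, \langle\phi, \varphi_n\rangle\psi\rangle = \sum_n \langle\phi, \varphi_n\rangle\langle\varphi_n, \psi\rangle = \langle\phi, \psi\rangle,$$
independently of the chosen basis, in accordance with the lemma.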

     Clearly for self-adjoint trace class operators, the trace is the sum over
all eigenvalues (counted with their multiplicity). To see this, one just has to
choose the orthonormal basis to consist of eigenfunctions. This is even true
for all trace class operators and is known as Lidskij trace theorem (see [44]
or [20] for an easy to read introduction).


      Finally we note the following elementary properties of the trace:
Lemma 6.16. Suppose K, K1 , K2 are trace class and A is bounded.
        (i) The trace is linear.
       (ii) tr(K ∗ ) = tr(K)∗ .
       (iii) If K1 ≤ K2 , then tr(K1 ) ≤ tr(K2 ).
       (iv) tr(AK) = tr(KA).

Proof. (i) and (ii) are straightforward. (iii) follows from $K_1 \le K_2$ if and only if $\langle\varphi, K_1\varphi\rangle \le \langle\varphi, K_2\varphi\rangle$ for every $\varphi \in H$. (iv) By Problem 6.7 and (i) it is no restriction to assume that $A$ is unitary. Let $\{\varphi_n\}$ be some ONB and note that $\{\psi_n = A\varphi_n\}$ is also an ONB. Then
$$\mathrm{tr}(AK) = \sum_n \langle\psi_n, AK\psi_n\rangle = \sum_n \langle A\varphi_n, AKA\varphi_n\rangle = \sum_n \langle\varphi_n, KA\varphi_n\rangle = \mathrm{tr}(KA)$$
and the claim follows.
Problem 6.7. Show that every bounded operator can be written as a linear
combination of two self-adjoint operators. Furthermore, show that every
bounded self-adjoint operator can be written as a linear combination of two unitary operators. (Hint: $x \pm \mathrm{i}\sqrt{1 - x^2}$ has absolute value one for $x \in [-1, 1]$.)
Problem 6.8. Let $H = \ell^2(\mathbb{N})$ and let $A$ be multiplication by a sequence $a(n)$. Show that $A \in \mathcal{J}_p(\ell^2(\mathbb{N}))$ if and only if $a \in \ell^p(\mathbb{N})$. Furthermore, show that $\|A\|_p = \|a\|_p$ in this case.
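For example (using the result of this problem), the multiplication operator given by $a(n) = \frac{1}{n}$ belongs to $\mathcal{J}_p(\ell^2(\mathbb{N}))$ for every $p > 1$ but is not trace class, since $\sum_n n^{-p} < \infty$ for $p > 1$ while $\sum_n n^{-1} = \infty$.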
Problem 6.9. Show that $A \ge 0$ is trace class if (6.26) is finite for one (and hence all) ONB. (Hint: $\sqrt{A}$ is self-adjoint (why?) and $A = \sqrt{A}\sqrt{A}$.)
Problem 6.10. Show that for an orthogonal projection P we have
                                  dim Ran(P ) = tr(P ),
where we set tr(P ) = ∞ if (6.26) is infinite (for one and hence all ONB by
the previous problem).
Problem 6.11. Show that for $K \in C(H)$ we have
$$|K| = \sum_j s_j\langle\phi_j, \cdot\rangle\phi_j,$$
where $|K| = \sqrt{K^*K}$. Conclude that
$$\|K\|_p = \big(\mathrm{tr}(|K|^p)\big)^{1/p}.$$


Problem 6.12. Show that $K: \ell^2(\mathbb{N}) \to \ell^2(\mathbb{N})$, $f(n) \mapsto \sum_{j\in\mathbb{N}} k(n+j)f(j)$, is Hilbert–Schmidt with $\|K\|_2 \le \|c\|_1$ if $|k(n)| \le c(n)$, where $c(n)$ is decreasing and summable.

6.4. Relatively compact operators and Weyl’s theorem
In the previous section we have seen that the sum of a self-adjoint operator
and a symmetric operator is again self-adjoint if the perturbing operator is
small. In this section we want to study the influence of perturbations on
the spectrum. Our hope is that at least some parts of the spectrum remain
invariant.
    We introduce some notation first. The discrete spectrum $\sigma_d(A)$ is the set of all eigenvalues which are discrete points of the spectrum and whose corresponding eigenspace is finite dimensional. The complement of the discrete spectrum is called the essential spectrum $\sigma_{ess}(A) = \sigma(A)\setminus\sigma_d(A)$. If $A$ is self-adjoint, we might equivalently set
$$\sigma_d(A) = \{\lambda \in \sigma_p(A)\,|\,\mathrm{rank}(P_A((\lambda-\varepsilon, \lambda+\varepsilon))) < \infty \text{ for some } \varepsilon > 0\}, \qquad (6.27)$$
respectively,
$$\sigma_{ess}(A) = \{\lambda \in \mathbb{R}\,|\,\mathrm{rank}(P_A((\lambda-\varepsilon, \lambda+\varepsilon))) = \infty \text{ for all } \varepsilon > 0\}. \qquad (6.28)$$

Example. For a self-adjoint compact operator K we have by Theorem 6.6
that
                             σess (K) ⊆ {0},                    (6.29)
where equality holds if and only if H is infinite dimensional.

    Let A be self-adjoint. Note that if we add a multiple of the identity to
A, we shift the entire spectrum. Hence, in general, we cannot expect a (rel-
atively) bounded perturbation to leave any part of the spectrum invariant.
Next, if λ0 is in the discrete spectrum, we can easily remove this eigenvalue
with a finite rank perturbation of arbitrarily small norm. In fact, consider
                              A + εPA ({λ0 }).                         (6.30)
Hence our only hope is that the remainder, namely the essential spectrum,
is stable under finite rank perturbations. To show this, we first need a good
criterion for a point to be in the essential spectrum of A.
Lemma 6.17 (Weyl criterion). A point $\lambda$ is in the essential spectrum of a self-adjoint operator $A$ if and only if there is a sequence $\psi_n$ such that $\|\psi_n\| = 1$, $\psi_n$ converges weakly to $0$, and $\|(A - \lambda)\psi_n\| \to 0$. Moreover, the sequence can be chosen orthonormal. Such a sequence is called a singular Weyl sequence.


Proof. Let $\psi_n$ be a singular Weyl sequence for the point $\lambda_0$. By Lemma 2.16 we have $\lambda_0 \in \sigma(A)$ and hence it suffices to show $\lambda_0 \notin \sigma_d(A)$. If $\lambda_0 \in \sigma_d(A)$, we can find an $\varepsilon > 0$ such that $P_\varepsilon = P_A((\lambda_0-\varepsilon, \lambda_0+\varepsilon))$ is finite rank. Consider $\tilde\psi_n = P_\varepsilon\psi_n$. Clearly $\|(A - \lambda_0)\tilde\psi_n\| = \|P_\varepsilon(A - \lambda_0)\psi_n\| \le \|(A - \lambda_0)\psi_n\| \to 0$ and Lemma 6.8 (iii) implies $\tilde\psi_n \to 0$. However,
$$\|\psi_n - \tilde\psi_n\|^2 = \int_{\mathbb{R}\setminus(\lambda_0-\varepsilon,\lambda_0+\varepsilon)} d\mu_{\psi_n}(\lambda) \le \frac{1}{\varepsilon^2}\int_{\mathbb{R}\setminus(\lambda_0-\varepsilon,\lambda_0+\varepsilon)} (\lambda - \lambda_0)^2\, d\mu_{\psi_n}(\lambda) \le \frac{1}{\varepsilon^2}\|(A - \lambda_0)\psi_n\|^2$$
and hence $\|\tilde\psi_n\| \to 1$, a contradiction.
    Conversely, if $\lambda_0 \in \sigma_{ess}(A)$, consider $P_n = P_A([\lambda_0 - \frac{1}{n}, \lambda_0 - \frac{1}{n+1}) \cup (\lambda_0 + \frac{1}{n+1}, \lambda_0 + \frac{1}{n}])$. Then $\mathrm{rank}(P_{n_j}) > 0$ for an infinite subsequence $n_j$. Now pick $\psi_j \in \mathrm{Ran}\,P_{n_j}$.
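As a concrete illustration, let $A$ be multiplication by $x$ in $L^2(\mathbb{R})$ and $\lambda \in \mathbb{R}$. The normalized functions
$$\psi_n = \sqrt{n}\,\chi_{[\lambda, \lambda + \frac{1}{n}]}$$
satisfy $\|\psi_n\| = 1$, $\psi_n \rightharpoonup 0$, and $\|(A - \lambda)\psi_n\| \le \frac{1}{n}\|\psi_n\| \to 0$, so every real $\lambda$ lies in $\sigma_{ess}(A)$.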

   Now let K be a self-adjoint compact operator and ψn a singular Weyl
sequence for A. Then ψn converges weakly to zero and hence
$$\|(A + K - \lambda)\psi_n\| \le \|(A - \lambda)\psi_n\| + \|K\psi_n\| \to 0 \qquad (6.31)$$
since $\|(A - \lambda)\psi_n\| \to 0$ by assumption and $\|K\psi_n\| \to 0$ by Lemma 6.8 (iii).
Hence σess (A) ⊆ σess (A + K). Reversing the roles of A + K and A shows
σess (A + K) = σess (A). In particular, note that A and A + K have the same
singular Weyl sequences.
   Since we have shown that we can remove any point in the discrete spec-
trum by a self-adjoint finite rank operator, we obtain the following equivalent
characterization of the essential spectrum.
Lemma 6.18. The essential spectrum of a self-adjoint operator A is pre-
cisely the part which is invariant under compact perturbations. In particular,
$$\sigma_{ess}(A) = \bigcap_{K \in C(H),\, K^* = K} \sigma(A + K). \qquad (6.32)$$

   There is even a larger class of operators under which the essential spec-
trum is invariant.
Theorem 6.19 (Weyl). Suppose A and B are self-adjoint operators. If
                              RA (z) − RB (z) ∈ C(H)                     (6.33)
for one z ∈ ρ(A) ∩ ρ(B), then
                                σess (A) = σess (B).                     (6.34)


Proof. In fact, suppose $\lambda \in \sigma_{ess}(A)$ and let $\psi_n$ be a corresponding singular Weyl sequence. Then
$$\Big(R_A(z) - \frac{1}{\lambda - z}\Big)\psi_n = \frac{R_A(z)}{z - \lambda}(A - \lambda)\psi_n$$
and thus $\|(R_A(z) - \frac{1}{\lambda - z})\psi_n\| \to 0$. Moreover, by our assumption we also have $\|(R_B(z) - \frac{1}{\lambda - z})\psi_n\| \to 0$ and thus $\|(B - \lambda)\varphi_n\| \to 0$, where $\varphi_n = R_B(z)\psi_n$. Since
$$\lim_{n\to\infty}\|\varphi_n\| = \lim_{n\to\infty}\|R_A(z)\psi_n\| = |\lambda - z|^{-1} \ne 0$$
(since $\|(R_A(z) - \frac{1}{\lambda - z})\psi_n\| = \frac{1}{|\lambda - z|}\|R_A(z)(A - \lambda)\psi_n\| \to 0$), we obtain a singular Weyl sequence for $B$, showing $\lambda \in \sigma_{ess}(B)$. Now interchange the roles of $A$ and $B$.

   As a first consequence note the following result:
Theorem 6.20. Suppose A is symmetric with equal finite defect indices.
Then all self-adjoint extensions have the same essential spectrum.

Proof. By Lemma 2.29 the resolvent difference of two self-adjoint extensions
is a finite rank operator if the defect indices are finite.

   In addition, the following result is of interest.
Lemma 6.21. Suppose
                             RA (z) − RB (z) ∈ C(H)                     (6.35)
for one z ∈ ρ(A)∩ρ(B). Then this holds for all z ∈ ρ(A)∩ρ(B). In addition,
if A and B are self-adjoint, then
                              f (A) − f (B) ∈ C(H)                      (6.36)
for all f ∈ C∞ (R).

Proof. If the condition holds for one z, it holds for all since we have (using
both resolvent formulas)
$$R_A(z') - R_B(z') = \big(1 - (z - z')R_B(z')\big)\big(R_A(z) - R_B(z)\big)\big(1 - (z - z')R_A(z')\big).$$

    Let A and B be self-adjoint. The set of all functions f for which the
claim holds is a closed ∗-subalgebra of C∞ (R) (with sup norm). Hence the
claim follows from Lemma 4.4.

    Remember that we have called K relatively compact with respect to
A if KRA (z) is compact (for one and hence for all z) and note that the
resolvent difference RA+K (z) − RA (z) is compact if K is relatively compact.


In particular, Theorem 6.19 applies if B = A + K, where K is relatively
compact.
    For later use observe that the set of all operators which are relatively
compact with respect to A forms a linear space (since compact operators
do) and relatively compact operators have A-bound zero.
Lemma 6.22. Let A be self-adjoint and suppose K is relatively compact
with respect to A. Then the A-bound of K is zero.

Proof. Write
                   KRA (λi) = (KRA (i))((A + i)RA (λi))
and observe that the first operator is compact and the second is normal
and converges strongly to 0 (cf. Problem 3.7). Hence the claim follows from
Lemma 6.3 and the discussion after Lemma 6.8 (since RA is normal).

   In addition, note the following result which is a straightforward conse-
quence of the second resolvent formula.
Lemma 6.23. Suppose $A$ is self-adjoint and $B$ is symmetric with $A$-bound less than one. If $K$ is relatively compact with respect to $A$, then it is also relatively compact with respect to $A + B$.

Proof. Since $B$ is $A$ bounded with $A$-bound less than one, we can choose a $z \in \mathbb{C}$ such that $\|BR_A(z)\| < 1$ and hence
$$BR_{A+B}(z) = BR_A(z)(\mathbb{I} + BR_A(z))^{-1} \qquad (6.37)$$
shows that $B$ is also $A + B$ bounded and the result follows from
$$KR_{A+B}(z) = KR_A(z)(\mathbb{I} - BR_{A+B}(z)) \qquad (6.38)$$
since $KR_A(z)$ is compact and $BR_{A+B}(z)$ is bounded.
Problem 6.13. Let A and B be self-adjoint operators. Suppose B is rel-
atively bounded with respect to A and A + B is self-adjoint. Show that if
|B|1/2 RA (z) is Hilbert–Schmidt for one z ∈ ρ(A), then this is true for all
z ∈ ρ(A). Moreover, |B|1/2 RA+B (z) is also Hilbert–Schmidt and RA+B (z) −
RA (z) is trace class.
Problem 6.14. Show that $A = -\frac{d^2}{dx^2} + q(x)$, $D(A) = H^2(\mathbb{R})$ is self-adjoint if $q \in L^\infty(\mathbb{R})$. Show that if $-u''(x) + q(x)u(x) = zu(x)$ has a solution for which $u$ and $u'$ are bounded near $+\infty$ (or $-\infty$) but $u$ is not square integrable near $+\infty$ (or $-\infty$), then $z \in \sigma_{ess}(A)$. (Hint: Use $u$ to construct a Weyl sequence by restricting it to a compact set. Now modify your construction to get a singular Weyl sequence by observing that functions with disjoint support are orthogonal.)


6.5. Relatively form bounded operators and the KLMN
     theorem
In Section 6.1 we have considered the case where the operators A and B
have a common domain on which the operator sum is well-defined. In this
section we want to look at the case where this is no longer possible, but where
it is still possible to add the corresponding quadratic forms. Under suitable
conditions this form sum will give rise to an operator via Theorem 2.13.
Example. Let $A$ be the self-adjoint operator $A = -\frac{d^2}{dx^2}$, $D(A) = \{f \in H^2[0,1]\,|\,f(0) = f(1) = 0\}$ in the Hilbert space $L^2(0,1)$. If we want to add a potential represented by a multiplication operator with a real-valued (measurable) function $q$, then we have already seen that $q$ will be relatively bounded if $q \in L^2(0,1)$. Hence, if $q \notin L^2(0,1)$, we are out of luck with the theory developed so far. On the other hand, if we look at the corresponding quadratic forms, we have $Q(A) = \{f \in H^1[0,1]\,|\,f(0) = f(1) = 0\}$ and $Q(q) = D(|q|^{1/2})$. Thus we see that $Q(A) \subset Q(q)$ if $q \in L^1(0,1)$.
    In summary, the operators can be added if $q \in L^2(0,1)$ while the forms can be added under the less restrictive condition $q \in L^1(0,1)$.
    Finally, note that in some drastic cases, there might even be no way to define the operator sum: Let $x_j$ be an enumeration of the rational numbers in $(0,1)$ and set
$$q(x) = \sum_{j=1}^\infty \frac{1}{2^j\sqrt{|x - x_j|}},$$
where the sum is to be understood as a limit in $L^1(0,1)$. Then $q$ gives rise to a self-adjoint multiplication operator in $L^2(0,1)$. However, note that $D(A) \cap D(q) = \{0\}$! In fact, let $f \in D(A) \cap D(q)$. Then $f$ is continuous and $q(x)f(x) \in L^2(0,1)$. Now suppose $f(x_j) \ne 0$ for some rational number $x_j \in (0,1)$. Then by continuity $|f(x)| \ge \delta$ for $x \in (x_j - \varepsilon, x_j + \varepsilon)$ and $q(x)|f(x)| \ge \delta 2^{-j}|x - x_j|^{-1/2}$ for $x \in (x_j - \varepsilon, x_j + \varepsilon)$, which shows that $q(x)f(x) \notin L^2(0,1)$ and hence $f$ must vanish at every rational point. By continuity, we conclude $f = 0$.

    Recall from Section 2.3 that every closed semi-bounded form q = qA
corresponds to a self-adjoint operator A (Theorem 2.13).
    Given a self-adjoint operator $A \ge \gamma$ and a (hermitian) form $q: Q \to \mathbb{R}$ with $Q(A) \subseteq Q$, we call $q$ relatively form bounded with respect to $q_A$ if there are constants $a, b \ge 0$ such that
$$|q(\psi)| \le a\, q_{A-\gamma}(\psi) + b\|\psi\|^2, \qquad \psi \in Q(A). \qquad (6.39)$$
The infimum of all possible a is called the form bound of q with respect
to qA .


   Note that we do not require that q is associated with some self-adjoint
operator (though it will be in most cases).
Example. Let $A = -\frac{d^2}{dx^2}$, $D(A) = \{f \in H^2[0,1]\,|\,f(0) = f(1) = 0\}$. Then
$$q(f) = |f(c)|^2, \qquad f \in H^1[0,1],\ c \in (0,1),$$
is a well-defined nonnegative form. Formally, one can interpret $q$ as the quadratic form of the multiplication operator with the delta distribution at $x = c$. But for $f \in Q(A) = \{f \in H^1[0,1]\,|\,f(0) = f(1) = 0\}$ we have by Cauchy–Schwarz
$$|f(c)|^2 = 2\,\mathrm{Re}\int_0^c f(t)^*f'(t)\,dt \le 2\int_0^1 |f(t)^*f'(t)|\,dt \le \varepsilon\|f'\|^2 + \frac{1}{\varepsilon}\|f\|^2.$$
Consequently $q$ is relatively bounded with bound $0$ and hence $q_A + q$ gives rise to a well-defined operator as we will show in the next theorem.

   The following result is the analog of the Kato–Rellich theorem and is
due to Kato, Lions, Lax, Milgram, and Nelson.
Theorem 6.24 (KLMN). Suppose qA : Q(A) → R is a semi-bounded closed
hermitian form and q a relatively bounded hermitian form with relative bound
less than one. Then qA + q defined on Q(A) is closed and hence gives rise to
a semi-bounded self-adjoint operator. Explicitly we have qA +q ≥ (1−a)γ −b.

Proof. A straightforward estimate shows $q_A(\psi) + q(\psi) \ge (1-a)q_A(\psi) - b\|\psi\|^2 \ge ((1-a)\gamma - b)\|\psi\|^2$; that is, $q_A + q$ is semi-bounded. Moreover, by
$$q_A(\psi) \le \frac{1}{1-a}\big(|q_A(\psi) + q(\psi)| + b\|\psi\|^2\big)$$
we see that the norms $\|\cdot\|_{q_A}$ and $\|\cdot\|_{q_A+q}$ are equivalent. Hence $q_A + q$ is closed and the result follows from Theorem 2.13.
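For instance, the estimate in the example preceding this theorem applies verbatim to the negative form $q(f) = -|f(c)|^2$, which is therefore relatively bounded with bound zero. Hence the KLMN theorem yields a closed semi-bounded form $q_A + q$ and a corresponding semi-bounded self-adjoint operator, which can formally be interpreted as $-\frac{d^2}{dx^2} - \delta(x - c)$ with Dirichlet boundary conditions.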

     In the investigation of the spectrum of the operator A + B a key role
is played by the second resolvent formula. In our present case we have the
following analog.
Theorem 6.25. Suppose $A - \gamma \ge 0$ is self-adjoint and let $q$ be a hermitian form with $Q(A) \subseteq Q(q)$. Then the hermitian form
$$q(R_A(-\lambda)^{1/2}\psi), \qquad \psi \in H, \qquad (6.40)$$
corresponds to a bounded operator $C_q(\lambda)$ with $\|C_q(\lambda)\| \le a$ for $\lambda > \frac{b}{a} - \gamma$ if and only if $q$ is relatively form bounded with constants $a$ and $b$.
    In particular, the form bound is given by
$$\lim_{\lambda\to\infty}\|C_q(\lambda)\|. \qquad (6.41)$$


    Moreover, if $a < 1$, then
$$R_{q_A+q}(-\lambda) = R_A(-\lambda)^{1/2}\big(1 + C_q(\lambda)\big)^{-1}R_A(-\lambda)^{1/2}. \qquad (6.42)$$
Here $R_{q_A+q}(z)$ is the resolvent of the self-adjoint operator corresponding to $q_A + q$.
Proof. We will abbreviate $C = C_q(\lambda)$ and $R_A^{1/2} = R_A(-\lambda)^{1/2}$. If $q$ is form bounded, we have for $\lambda > \frac{b}{a} - \gamma$ that
$$|q(R_A^{1/2}\psi)| \le a\, q_{A-\gamma}(R_A^{1/2}\psi) + b\|R_A^{1/2}\psi\|^2 = a\,\langle\psi, (A - \gamma + \tfrac{b}{a})R_A\psi\rangle \le a\|\psi\|^2$$
and hence $q(R_A^{1/2}\psi)$ corresponds to a bounded operator $C$. The converse is similar.
    If $a < 1$, then $(1 + C)^{-1}$ is a well-defined bounded operator and so is $R = R_A^{1/2}(1 + C)^{-1}R_A^{1/2}$. To see that $R$ is the inverse of $A_1 + \lambda$, where $A_1$ is the operator associated with $q_A + q$, take $\varphi = R_A^{1/2}\tilde\varphi \in Q(A)$ and $\psi \in H$. Then
$$s_{A_1+\lambda}(\varphi, R\psi) = s_{A+\lambda}(\varphi, R\psi) + s(\varphi, R\psi) = \langle\tilde\varphi, (1 + C)^{-1}R_A^{1/2}\psi\rangle + \langle\tilde\varphi, C(1 + C)^{-1}R_A^{1/2}\psi\rangle = \langle\varphi, \psi\rangle.$$
Taking $\varphi \in D(A_1) \subseteq Q(A)$, we see $\langle(A_1 + \lambda)\varphi, R\psi\rangle = \langle\varphi, \psi\rangle$ and thus $R = R_{A_1}(-\lambda)$ (Problem 6.15).

    Furthermore, we can define $C_q(z)$ for all $z \in \rho(A)$ using
$$C_q(z) = \big((A + \lambda)^{1/2}R_A(-z)^{1/2}\big)^*\, C_q(\lambda)\, (A + \lambda)^{1/2}R_A(-z)^{1/2}. \qquad (6.43)$$
We will call q relatively form compact if the operator Cq (z) is compact for
one and hence all z ∈ ρ(A). As in the case of relatively compact operators
we have
Lemma 6.26. Suppose A − γ ≥ 0 is self-adjoint and let q be a hermitian
form. If q is relatively form compact with respect to qA , then its relative
form bound is 0 and the resolvents of qA + q and qA differ by a compact
operator.
   In particular, by Weyl’s theorem, the operators associated with qA and
qA + q have the same essential spectrum.
Proof. Fix $\lambda_0 > \frac{b}{a} - \gamma$ and let $\lambda \ge \lambda_0$. Consider the operator $D(\lambda) = (A + \lambda_0)^{1/2}R_A(-\lambda)^{1/2}$ and note that $D(\lambda)$ is a bounded self-adjoint operator with $\|D(\lambda)\| \le 1$. Moreover, $D(\lambda)$ converges strongly to $0$ as $\lambda \to \infty$ (cf. Problem 3.7). Hence $\|D(\lambda)C(\lambda_0)\| \to 0$ by Lemma 6.8 and the same is true for $C(\lambda) = D(\lambda)C(\lambda_0)D(\lambda)$. So the relative bound is zero by (6.41).


Finally, the resolvent difference is compact by (6.42) since (1 + C)−1 =
1 − C(1 + C)−1 .

Corollary 6.27. Suppose A−γ ≥ 0 is self-adjoint and let q1 , q2 be hermitian
forms. If q1 is relatively bounded with bound less than one and q2 is relatively
compact, then the resolvent difference of qA + q1 + q2 and qA + q1 is compact.
In particular, the operators associated with qA + q1 and qA + q1 + q2 have the
same essential spectrum.

Proof. Just observe that Cq1 +q2 = Cq1 + Cq2 and (1 + Cq1 + Cq2 )−1 =
(1 + Cq1 )−1 − (1 + Cq1 )−1 Cq2 (1 + Cq1 + Cq2 )−1 .

   Finally we turn to the special case where q = qB for some self-adjoint
operator B. In this case we have

         CB (z) = (|B|1/2 RA (−z)1/2 )∗ sign(B)|B|1/2 RA (−z)1/2         (6.44)
and hence
$$\|C_B(z)\| \le \||B|^{1/2}R_A(-z)^{1/2}\|^2 \qquad (6.45)$$
with equality if $B \ge 0$. Thus the following result is not too surprising.

Lemma 6.28. Suppose A − γ ≥ 0 and B is self-adjoint. Then the following
are equivalent:
       (i) B is A form bounded.
      (ii) Q(A) ⊆ Q(B).
      (iii) |B|1/2 RA (z)1/2 is bounded for one (and hence for all) z ∈ ρ(A).

Proof. (i) ⇒ (ii) is true by definition. (ii) ⇒ (iii) since |B|1/2 RA (z)1/2
is a closed (Problem 2.9) operator defined on all of H and hence bounded
by the closed graph theorem (Theorem 2.8). To see (iii) ⇒ (i), observe
|B|1/2 RA (z)1/2 = |B|1/2 RA (z0 )1/2 (A − z0 )1/2 RA (z)1/2 which shows that
|B|1/2 RA (z)1/2 is bounded for all z ∈ ρ(A) if it is bounded for one z0 ∈ ρ(A).
But then (6.45) shows that (i) holds.

    Clearly $C(\lambda)$ will be compact if $|B|^{1/2}R_A(z)^{1/2}$ is compact. However, since $R_A(z)^{1/2}$ might be hard to compute, we provide the following more handy criterion.

Lemma 6.29. Suppose A − γ ≥ 0 and B is self-adjoint where B is rela-
tively form bounded with bound less than one. Then the resolvent difference
RA+B (z) − RA (z) is compact if |B|1/2 RA (z) is compact and trace class if
|B|1/2 RA (z) is Hilbert–Schmidt.


Proof. Abbreviate $R_A = R_A(-\lambda)$, $B_1 = |B|^{1/2}$, $B_2 = \mathrm{sign}(B)|B|^{1/2}$. Then we have $(1 + C_B)^{-1} = 1 - (B_1R_A^{1/2})^*(1 + \tilde C_B)^{-1}B_2R_A^{1/2}$, where $\tilde C_B = B_2R_A^{1/2}(B_1R_A^{1/2})^*$. Hence $R_{A+B} - R_A = -(B_1R_A)^*(1 + \tilde C_B)^{-1}B_2R_A$ and the claim follows.

    Moreover, the second resolvent formula still holds when interpreted suit-
ably:
Lemma 6.30. Suppose $A - \gamma \ge 0$ and $B$ is self-adjoint. If $Q(A) \subseteq Q(B)$ and $q_A + q_B$ is a closed semi-bounded form, then
     RA+B (z) = RA (z) − (|B|1/2 RA+B (z ∗ ))∗ sign(B)|B|1/2 RA (z)
               = RA (z) − (|B|1/2 RA (z ∗ ))∗ sign(B)|B|1/2 RA+B (z)   (6.46)
for z ∈ ρ(A) ∩ ρ(A + B). Here A + B is the self-adjoint operator associated
with qA + qB .

Proof. Let $\varphi \in D(A + B)$ and $\psi \in H$. Denote the right-hand side in (6.46) by $R(z)$ and abbreviate $R = R(z)$, $R_A = R_A(z)$, $B_1 = |B|^{1/2}$, $B_2 = \mathrm{sign}(B)|B|^{1/2}$. Then, using $s_{A+B-z}(\varphi, \psi) = \langle(A + B - z^*)\varphi, \psi\rangle$,
$$s_{A+B-z}(\varphi, R\psi) = s_{A+B-z}(\varphi, R_A\psi) - \langle B_1R_{A+B}(z^*)(A + B - z^*)\varphi, B_2R_A\psi\rangle = s_{A+B-z}(\varphi, R_A\psi) - s_B(\varphi, R_A\psi) = s_{A-z}(\varphi, R_A\psi) = \langle\varphi, \psi\rangle.$$
Thus R = RA+B (z) (Problem 6.15). The second equality follows after ex-
changing the roles of A and A + B.

    It can be shown using abstract interpolation techniques that if B is
relatively bounded with respect to A, then it is also relatively form bounded.
In particular, if B is relatively bounded, then BRA (z) is bounded and it is
not hard to check that (6.46) coincides with (6.4). Consequently A + B
defined as operator sum is the same as A + B defined as form sum.
Problem 6.15. Suppose $A$ is closed and $R$ is bounded. Show that $R = R_A(z)$ if and only if $\langle(A - z)^*\varphi, R\psi\rangle = \langle\varphi, \psi\rangle$ for all $\varphi \in D(A^*)$, $\psi \in H$.
Problem 6.16. Let $q$ be relatively form bounded with constants $a$ and $b$. Show that $C_q(\lambda)$ satisfies $\|C_q(\lambda)\| \le \max(a, \frac{b}{\lambda+\gamma})$ for $\lambda > -\gamma$. Furthermore, show that $\|C_q(\lambda)\|$ decreases as $\lambda \to \infty$.

6.6. Strong and norm resolvent convergence
Suppose An and A are self-adjoint operators. We say that An converges to
A in the norm, respectively, strong resolvent sense, if
    lim RAn (z) = RA (z),    respectively,   s-lim RAn (z) = RA (z),   (6.47)
    n→∞                                       n→∞


for one $z \in \Gamma = \mathbb{C}\setminus\Sigma$, $\Sigma = \sigma(A) \cup \bigcup_n \sigma(A_n)$. In fact, in the case of strong resolvent convergence it will be convenient to include the case if $A_n$ is only defined on some subspace $H_n \subseteq H$, where we require $P_n \stackrel{s}{\to} 1$ for the orthogonal projection onto $H_n$. In this case $R_{A_n}(z)$ (respectively, any other function of $A_n$) has to be understood as $R_{A_n}(z)P_n$, where $P_n$ is the orthogonal projector onto $H_n$. (This generalization will produce nothing new in the norm case, since $P_n \to 1$ implies $P_n = 1$ for sufficiently large $n$.)
      Using the Stone–Weierstraß theorem, we obtain as a first consequence
Theorem 6.31. Suppose An converges to A in the norm resolvent sense.
Then f (An ) converges to f (A) in norm for any bounded continuous function
f : Σ → C with limλ→−∞ f (λ) = limλ→∞ f (λ).
    If An converges to A in the strong resolvent sense, then f (An ) converges
to f (A) strongly for any bounded continuous function f : Σ → C.

Proof. The set of functions for which the claim holds clearly forms a ∗-subalgebra (since resolvents are normal, taking adjoints is continuous even with respect to strong convergence) and since it contains $f(\lambda) = 1$ and $f(\lambda) = \frac{1}{\lambda - z_0}$, this ∗-subalgebra is dense by the Stone–Weierstraß theorem (cf. Problem 1.21). The usual $\frac{\varepsilon}{3}$ argument shows that this ∗-subalgebra is also closed.
    It remains to show the strong resolvent case for arbitrary bounded continuous functions. Let $\chi_m$ be a compactly supported continuous function ($0 \le \chi_m \le 1$) which is one on the interval $[-m, m]$. Then $\chi_m(A_n) \stackrel{s}{\to} \chi_m(A)$, $f(A_n)\chi_m(A_n) \stackrel{s}{\to} f(A)\chi_m(A)$ by the first part and hence
$$\|(f(A_n) - f(A))\psi\| \le \|f(A_n)\|\,\|(1 - \chi_m(A))\psi\| + \|f(A_n)\|\,\|(\chi_m(A) - \chi_m(A_n))\psi\| + \|(f(A_n)\chi_m(A_n) - f(A)\chi_m(A))\psi\| + \|f(A)\|\,\|(1 - \chi_m(A))\psi\|$$
can be made arbitrarily small since $\|f(\cdot)\| \le \|f\|_\infty$ and $\chi_m(\cdot) \stackrel{s}{\to} \mathbb{I}$ by Theorem 3.1.

      As a consequence, note that the point z ∈ Γ is of no importance, that
is,
Corollary 6.32. Suppose An converges to A in the norm or strong resolvent
sense for one z0 ∈ Γ. Then this holds for all z ∈ Γ.
Also,
Corollary 6.33. Suppose An converges to A in the strong resolvent sense.
Then
                             s
                       eitAn → eitA ,   t ∈ R,                    (6.48)


and if all operators are semi-bounded by the same bound
                                s
                         e−tAn → e−tA ,      t ≥ 0.                   (6.49)

    Next we need some good criteria to check for norm, respectively, strong,
resolvent convergence.
Lemma 6.34. Let $A_n$, $A$ be self-adjoint operators with $D(A_n) = D(A)$. Then $A_n$ converges to $A$ in the norm resolvent sense if there are sequences $a_n$ and $b_n$ converging to zero such that
$$\|(A_n - A)\psi\| \le a_n\|\psi\| + b_n\|A\psi\|, \qquad \psi \in D(A) = D(A_n). \qquad (6.50)$$

Proof. From the second resolvent formula
$$R_{A_n}(z) - R_A(z) = R_{A_n}(z)(A - A_n)R_A(z),$$
we infer
$$\|(R_{A_n}(\mathrm{i}) - R_A(\mathrm{i}))\psi\| \le \|R_{A_n}(\mathrm{i})\|\big(a_n\|R_A(\mathrm{i})\psi\| + b_n\|AR_A(\mathrm{i})\psi\|\big) \le (a_n + b_n)\|\psi\|$$
and hence $\|R_{A_n}(\mathrm{i}) - R_A(\mathrm{i})\| \le a_n + b_n \to 0$.

   In particular, norm convergence implies norm resolvent convergence:
Corollary 6.35. Let An , A be bounded self-adjoint operators with An → A.
Then An converges to A in the norm resolvent sense.
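For example, if $A$ is self-adjoint and $B$ is a bounded self-adjoint operator, then $A_n = A + \frac{1}{n}B$ (with $D(A_n) = D(A)$) converges to $A$ in the norm resolvent sense, since
$$\|(A_n - A)\psi\| = \frac{1}{n}\|B\psi\| \le \frac{\|B\|}{n}\|\psi\|$$
shows that (6.50) holds with $a_n = \frac{\|B\|}{n}$ and $b_n = 0$.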

   Similarly, if no domain problems get in the way, strong convergence
implies strong resolvent convergence:
Lemma 6.36. Let $A_n$, $A$ be self-adjoint operators. Then $A_n$ converges to $A$ in the strong resolvent sense if there is a core $D_0$ of $A$ such that for any $\psi \in D_0$ we have $P_n\psi \in D(A_n)$ for $n$ sufficiently large and $A_nP_n\psi \to A\psi$.

Proof. We begin with the case $H_n = H$. Using the second resolvent formula, we have
$$\|(R_{A_n}(\mathrm{i}) - R_A(\mathrm{i}))\psi\| \le \|(A - A_n)R_A(\mathrm{i})\psi\| \to 0$$
for $\psi \in (A - \mathrm{i})D_0$ which is dense, since $D_0$ is a core. The rest follows from Lemma 1.14.
    If $H_n \subset H$, we can consider $\tilde A_n = A_n \oplus 0$ and conclude $R_{\tilde A_n}(\mathrm{i}) \stackrel{s}{\to} R_A(\mathrm{i})$ from the first case. By $R_{A_n}(\mathrm{i}) = R_{\tilde A_n}(\mathrm{i}) - \mathrm{i}(1 - P_n)$ the same is true for $R_{A_n}(\mathrm{i})$ since $1 - P_n \stackrel{s}{\to} 0$ by assumption.

    If you wonder why we did not define weak resolvent convergence, here
is the answer: it is equivalent to strong resolvent convergence.


Lemma 6.37. Suppose w-limn→∞ RAn (z) = RA (z) for some z ∈ Γ. Then
s-limn→∞ RAn (z) = RA (z) also.

Proof. By $R_{A_n}(z) \rightharpoonup R_A(z)$ we also have $R_{A_n}(z)^* \rightharpoonup R_A(z)^*$ and thus by the first resolvent formula
$$\|R_{A_n}(z)\psi\|^2 - \|R_A(z)\psi\|^2 = \langle\psi, R_{A_n}(z^*)R_{A_n}(z)\psi - R_A(z^*)R_A(z)\psi\rangle = \frac{1}{z - z^*}\langle\psi, (R_{A_n}(z) - R_{A_n}(z^*) - R_A(z) + R_A(z^*))\psi\rangle \to 0.$$
Together with $R_{A_n}(z)\psi \rightharpoonup R_A(z)\psi$ we have $R_{A_n}(z)\psi \to R_A(z)\psi$ by virtue of Lemma 1.12 (iv).

      Now what can we say about the spectrum?
Theorem 6.38. Let An and A be self-adjoint operators. If An converges
to A in the strong resolvent sense, we have σ(A) ⊆ limn→∞ σ(An ). If An
converges to A in the norm resolvent sense, we have σ(A) = limn→∞ σ(An ).

Proof. Suppose the first claim were incorrect. Then we can find a $\lambda \in \sigma(A)$ and some $\varepsilon > 0$ such that $\sigma(A_n) \cap (\lambda - \varepsilon, \lambda + \varepsilon) = \emptyset$. Choose a bounded continuous function $f$ which is one on $(\lambda - \frac{\varepsilon}{2}, \lambda + \frac{\varepsilon}{2})$ and which vanishes outside $(\lambda - \varepsilon, \lambda + \varepsilon)$. Then $f(A_n) = 0$ and hence $f(A)\psi = \lim f(A_n)\psi = 0$ for every $\psi$. On the other hand, since $\lambda \in \sigma(A)$, there is a nonzero $\psi \in \mathrm{Ran}\,P_A((\lambda - \frac{\varepsilon}{2}, \lambda + \frac{\varepsilon}{2}))$ implying $f(A)\psi = \psi$, a contradiction.
    To see the second claim, recall that the norm of $R_A(z)$ is just one over the distance from the spectrum. In particular, $\lambda \in \sigma(A)$ if and only if $\|R_A(\lambda + \mathrm{i})\| = 1$. So $\lambda \notin \sigma(A)$ implies $\|R_A(\lambda + \mathrm{i})\| < 1$, which implies $\|R_{A_n}(\lambda + \mathrm{i})\| < 1$ for $n$ sufficiently large, which implies $\lambda \notin \sigma(A_n)$ for $n$ sufficiently large.

Example. Note that the spectrum can contract if we only have convergence in the strong resolvent sense: Let $A_n$ be multiplication by $\frac{1}{n}x$ in $L^2(\mathbb{R})$. Then $A_n$ converges to $0$ in the strong resolvent sense, but $\sigma(A_n) = \mathbb{R}$ and $\sigma(0) = \{0\}$.

Lemma 6.39. Suppose An converges in the strong resolvent sense to A. If
PA ({λ}) = 0, then
 s-lim PAn ((−∞, λ)) = s-lim PAn ((−∞, λ]) = PA ((−∞, λ)) = PA ((−∞, λ]).
 n→∞                       n→∞
                                                                              (6.51)

Proof. By Theorem 6.31 the spectral measures $\mu_{n,\psi}$ corresponding to $A_n$ converge vaguely to those of $A$. Hence $\|P_{A_n}(\Omega)\psi\|^2 = \mu_{n,\psi}(\Omega)$ together with Lemma A.25 implies the claim.


     Using P ((λ0 , λ1 )) = P ((−∞, λ1 )) − P ((−∞, λ0 ]), we also obtain the
following.
Corollary 6.40. Suppose An converges in the strong resolvent sense to A.
If PA ({λ0 }) = PA ({λ1 }) = 0, then
  s-lim PAn ((λ0 , λ1 )) = s-lim PAn ([λ0 , λ1 ]) = PA ((λ0 , λ1 )) = PA ([λ0 , λ1 ]).
   n→∞                     n→∞
                                                                                 (6.52)

Example. The following example shows that the requirement $P_A(\{\lambda\}) = 0$ is crucial, even if we have bounded operators and norm convergence. In fact, let $H = \mathbb{C}^2$ and
$$A_n = \frac{1}{n}\begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix}. \qquad (6.53)$$
Then $A_n \to 0$ and
$$P_{A_n}((-\infty, 0)) = P_{A_n}((-\infty, 0]) = \begin{pmatrix} 0 & 0\\ 0 & 1 \end{pmatrix}, \qquad (6.54)$$
but $P_0((-\infty, 0)) = 0$ and $P_0((-\infty, 0]) = \mathbb{I}$.

Problem 6.17. Show that for self-adjoint operators, strong resolvent convergence is equivalent to convergence with respect to the metric
$$d(A, B) = \sum_{n\in\mathbb{N}}\frac{1}{2^n}\|(R_A(\mathrm{i}) - R_B(\mathrm{i}))\varphi_n\|, \qquad (6.55)$$
where $\{\varphi_n\}_{n\in\mathbb{N}}$ is some (fixed) ONB.
Problem 6.18 (Weak convergence of spectral measures). Suppose An → A
in the strong resolvent sense and let µn,ψ , µψ be the corresponding spectral
measures. Show that
$$\int f(\lambda)\,d\mu_{n,\psi}(\lambda) \to \int f(\lambda)\,d\mu_\psi(\lambda) \qquad (6.56)$$

for every bounded continuous f . Give a counterexample when f is not con-
tinuous.
Part 2

Schrödinger Operators
Chapter 7




The free Schrödinger operator


7.1. The Fourier transform
We first review some basic facts concerning the Fourier transform which
will be needed in the following section.
    Let $C^\infty(\mathbb{R}^n)$ be the set of all complex-valued functions which have partial derivatives of arbitrary order. For $f \in C^\infty(\mathbb{R}^n)$ and $\alpha \in \mathbb{N}_0^n$ we set
$$\partial_\alpha f = \frac{\partial^{|\alpha|} f}{\partial x_1^{\alpha_1}\cdots\partial x_n^{\alpha_n}}, \qquad x^\alpha = x_1^{\alpha_1}\cdots x_n^{\alpha_n}, \qquad |\alpha| = \alpha_1 + \cdots + \alpha_n. \qquad (7.1)$$

An element $\alpha \in \mathbb{N}_0^n$ is called a multi-index and $|\alpha|$ is called its order. Recall the Schwartz space
$$\mathcal{S}(\mathbb{R}^n) = \{f \in C^\infty(\mathbb{R}^n)\,|\,\sup_x |x^\alpha(\partial_\beta f)(x)| < \infty,\ \alpha, \beta \in \mathbb{N}_0^n\} \qquad (7.2)$$
which is dense in $L^2(\mathbb{R}^n)$ (since $C_c^\infty(\mathbb{R}^n) \subset \mathcal{S}(\mathbb{R}^n)$ is). Note that if $f \in \mathcal{S}(\mathbb{R}^n)$, then the same is true for $x^\alpha f(x)$ and $(\partial_\alpha f)(x)$ for any multi-index $\alpha$. For $f \in \mathcal{S}(\mathbb{R}^n)$ we define
$$\mathcal{F}(f)(p) \equiv \hat f(p) = \frac{1}{(2\pi)^{n/2}}\int_{\mathbb{R}^n} \mathrm{e}^{-\mathrm{i}px}f(x)\,d^n x. \qquad (7.3)$$
Then,
Lemma 7.1. The Fourier transform maps the Schwartz space into itself, $\mathcal{F}: \mathcal{S}(\mathbb{R}^n) \to \mathcal{S}(\mathbb{R}^n)$. Furthermore, for any multi-index $\alpha \in \mathbb{N}_0^n$ and any $f \in \mathcal{S}(\mathbb{R}^n)$ we have
$$(\partial_\alpha f)^\wedge(p) = (\mathrm{i}p)^\alpha \hat f(p), \qquad (x^\alpha f(x))^\wedge(p) = \mathrm{i}^{|\alpha|}\partial_\alpha \hat f(p). \qquad (7.4)$$



Proof. First of all, by integration by parts, we see
$$\Big(\frac{\partial}{\partial x_j}f(x)\Big)^\wedge(p) = \frac{1}{(2\pi)^{n/2}}\int_{\mathbb{R}^n}\mathrm{e}^{-\mathrm{i}px}\frac{\partial}{\partial x_j}f(x)\,d^n x = \frac{1}{(2\pi)^{n/2}}\int_{\mathbb{R}^n}\Big(-\frac{\partial}{\partial x_j}\mathrm{e}^{-\mathrm{i}px}\Big)f(x)\,d^n x = \frac{1}{(2\pi)^{n/2}}\int_{\mathbb{R}^n}\mathrm{i}p_j\,\mathrm{e}^{-\mathrm{i}px}f(x)\,d^n x = \mathrm{i}p_j\hat f(p).$$
So the first formula follows by induction.
    Similarly, the second formula follows from induction using
$$(x_j f(x))^\wedge(p) = \frac{1}{(2\pi)^{n/2}}\int_{\mathbb{R}^n} x_j\mathrm{e}^{-\mathrm{i}px}f(x)\,d^n x = \frac{1}{(2\pi)^{n/2}}\int_{\mathbb{R}^n}\Big(\mathrm{i}\frac{\partial}{\partial p_j}\mathrm{e}^{-\mathrm{i}px}\Big)f(x)\,d^n x = \mathrm{i}\frac{\partial}{\partial p_j}\hat f(p),$$
where interchanging the derivative and integral is permissible by Problem A.8. In particular, $\hat f(p)$ is differentiable.
    To see that $\hat f \in \mathcal{S}(\mathbb{R}^n)$ if $f \in \mathcal{S}(\mathbb{R}^n)$, we begin with the observation that $\hat f$ is bounded; in fact, $\|\hat f\|_\infty \le (2\pi)^{-n/2}\|f\|_1$. But then $p^\alpha(\partial_\beta \hat f)(p) = \mathrm{i}^{-|\alpha|-|\beta|}(\partial_\alpha x^\beta f(x))^\wedge(p)$ is bounded since $\partial_\alpha x^\beta f(x) \in \mathcal{S}(\mathbb{R}^n)$ if $f \in \mathcal{S}(\mathbb{R}^n)$.

    Hence we will sometimes write pf (x) for −i∂f (x), where ∂ = (∂1 , . . . , ∂n )
is the gradient.
      Two more simple properties are left as an exercise.
Lemma 7.2. Let $f \in \mathcal{S}(\mathbb{R}^n)$. Then
$$(f(x + a))^\wedge(p) = \mathrm{e}^{\mathrm{i}ap}\hat f(p), \qquad a \in \mathbb{R}^n, \qquad (7.5)$$
$$(f(\lambda x))^\wedge(p) = \frac{1}{\lambda^n}\hat f\Big(\frac{p}{\lambda}\Big), \qquad \lambda > 0. \qquad (7.6)$$
    Next, we want to compute the inverse of the Fourier transform. For this
the following lemma will be needed.
Lemma 7.3. We have $\mathrm{e}^{-zx^2/2} \in \mathcal{S}(\mathbb{R}^n)$ for $\mathrm{Re}(z) > 0$ and
$$\mathcal{F}(\mathrm{e}^{-zx^2/2})(p) = \frac{1}{z^{n/2}}\mathrm{e}^{-p^2/(2z)}. \qquad (7.7)$$
Here $z^{n/2}$ has to be understood as $(\sqrt{z})^n$, where the branch cut of the root is chosen along the negative real axis.

Proof. Due to the product structure of the exponential, one can treat each
coordinate separately, reducing the problem to the case n = 1.


    Let $\phi_z(x) = \exp(-zx^2/2)$. Then $\phi_z'(x) + zx\phi_z(x) = 0$ and hence $\mathrm{i}(p\hat\phi_z(p) + z\hat\phi_z'(p)) = 0$. Thus $\hat\phi_z(p) = c\phi_{1/z}(p)$ and (Problem 7.1)
$$c = \hat\phi_z(0) = \frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}}\exp(-zx^2/2)\,dx = \frac{1}{\sqrt{z}}$$
at least for $z > 0$. However, since the integral is holomorphic for $\mathrm{Re}(z) > 0$ by Problem A.10, this holds for all $z$ with $\mathrm{Re}(z) > 0$ if we choose the branch cut of the root along the negative real axis.
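In particular, setting $z = 1$ in (7.7) shows that the standard Gaussian is left invariant by the Fourier transform,
$$\mathcal{F}(\mathrm{e}^{-x^2/2})(p) = \mathrm{e}^{-p^2/2},$$
exhibiting an eigenfunction of $\mathcal{F}$ with eigenvalue $1$ (compare Theorem 7.5 below).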

   Now we can show
Theorem 7.4. The Fourier transform F : S(Rn ) → S(Rn ) is a bijection.
Its inverse is given by
$$\mathcal{F}^{-1}(g)(x) \equiv \check g(x) = \frac{1}{(2\pi)^{n/2}}\int_{\mathbb{R}^n}\mathrm{e}^{\mathrm{i}px}g(p)\,d^n p. \qquad (7.8)$$

We have F 2 (f )(x) = f (−x) and thus F 4 = I.

Proof. Abbreviate $\phi_\varepsilon(x) = \exp(-\varepsilon x^2/2)$. By dominated convergence we have
$$(\hat f(p))^\vee(x) = \frac{1}{(2\pi)^{n/2}}\int_{\mathbb{R}^n}\mathrm{e}^{\mathrm{i}px}\hat f(p)\,d^n p = \lim_{\varepsilon\to 0}\frac{1}{(2\pi)^{n/2}}\int_{\mathbb{R}^n}\phi_\varepsilon(p)\mathrm{e}^{\mathrm{i}px}\hat f(p)\,d^n p = \lim_{\varepsilon\to 0}\frac{1}{(2\pi)^n}\int_{\mathbb{R}^n}\int_{\mathbb{R}^n}\phi_\varepsilon(p)\mathrm{e}^{\mathrm{i}px}f(y)\mathrm{e}^{-\mathrm{i}py}\,d^n y\,d^n p,$$
and, invoking Fubini and Lemma 7.2, we further see
$$= \lim_{\varepsilon\to 0}\frac{1}{(2\pi)^{n/2}}\int_{\mathbb{R}^n}(\phi_\varepsilon(p)\mathrm{e}^{\mathrm{i}px})^\wedge(y)f(y)\,d^n y = \lim_{\varepsilon\to 0}\frac{1}{(2\pi)^{n/2}}\int_{\mathbb{R}^n}\frac{1}{\varepsilon^{n/2}}\phi_{1/\varepsilon}(y - x)f(y)\,d^n y = \lim_{\varepsilon\to 0}\frac{1}{(2\pi)^{n/2}}\int_{\mathbb{R}^n}\phi_1(z)f(x + \sqrt{\varepsilon}z)\,d^n z = f(x),$$
which finishes the proof, where we used the change of coordinates $z = \frac{y - x}{\sqrt{\varepsilon}}$ and again dominated convergence in the last two steps.

    From Fubini's theorem we also obtain Parseval's identity
$$\int_{\mathbb{R}^n}|\hat f(p)|^2\,d^n p = \frac{1}{(2\pi)^{n/2}}\int_{\mathbb{R}^n}\int_{\mathbb{R}^n} f(x)^*\hat f(p)\mathrm{e}^{\mathrm{i}px}\,d^n p\,d^n x = \int_{\mathbb{R}^n}|f(x)|^2\,d^n x \qquad (7.9)$$


for $f \in \mathcal{S}(\mathbb{R}^n)$. Thus, by Theorem 0.26, we can extend $\mathcal{F}$ to $L^2(\mathbb{R}^n)$ by setting
$$\hat f(p) = \lim_{R\to\infty}\frac{1}{(2\pi)^{n/2}}\int_{|x|\le R}\mathrm{e}^{-\mathrm{i}px}f(x)\,d^n x, \qquad (7.10)$$
where the limit is to be understood in $L^2(\mathbb{R}^n)$ (Problem 7.5). If $f \in L^1(\mathbb{R}^n) \cap L^2(\mathbb{R}^n)$, we can omit the limit (why?) and $\hat f$ is still given by (7.3).
Theorem 7.5. The Fourier transform F extends to a unitary operator F :
L2 (Rn ) → L2 (Rn ). Its spectrum is given by
                       σ(F) = {z ∈ C|z 4 = 1} = {1, −1, i, −i}.                    (7.11)

Proof. As already noted, F extends uniquely to a bounded operator on
L2 (Rn ). Moreover, the same is true for F −1 . Since Parseval’s identity
remains valid by continuity of the norm, this extension is a unitary operator.
    It remains to compute the spectrum. In fact, if ψn is a Weyl sequence,
then (F 2 + z 2 )(F + z)(F − z)ψn = (F 4 − z 4 )ψn = (1 − z 4 )ψn → 0 implies
z 4 = 1. Hence σ(F) ⊆ {z ∈ C|z 4 = 1}. We defer the proof for equality
to Section 8.3, where we will explicitly compute an orthonormal basis of
eigenfunctions.

    Lemma 7.1 also allows us to extend differentiation to a larger class. Let us introduce the Sobolev space
$$H^r(\mathbb{R}^n) = \{f \in L^2(\mathbb{R}^n)\,|\,|p|^r\hat f(p) \in L^2(\mathbb{R}^n)\}. \qquad (7.12)$$
Then, every function in $H^r(\mathbb{R}^n)$ has partial derivatives up to order $r$, which are defined via
$$\partial_\alpha f = ((\mathrm{i}p)^\alpha\hat f(p))^\vee, \qquad f \in H^r(\mathbb{R}^n),\ |\alpha| \le r. \qquad (7.13)$$
By Lemma 7.1 this definition coincides with the usual one for every $f \in \mathcal{S}(\mathbb{R}^n)$ and we have
$$\int_{\mathbb{R}^n} g(x)(\partial_\alpha f)(x)\,d^n x = \langle g^*, \partial_\alpha f\rangle = \langle\hat g(p)^*, (\mathrm{i}p)^\alpha\hat f(p)\rangle = (-1)^{|\alpha|}\langle(\mathrm{i}p)^\alpha\hat g(p)^*, \hat f(p)\rangle = (-1)^{|\alpha|}\langle\partial_\alpha g^*, f\rangle = (-1)^{|\alpha|}\int_{\mathbb{R}^n}(\partial_\alpha g)(x)f(x)\,d^n x, \qquad (7.14)$$
for $f, g \in H^r(\mathbb{R}^n)$. Furthermore, recall that a function $h \in L^1_{loc}(\mathbb{R}^n)$ satisfying
$$\int_{\mathbb{R}^n}\varphi(x)h(x)\,d^n x = (-1)^{|\alpha|}\int_{\mathbb{R}^n}(\partial_\alpha\varphi)(x)f(x)\,d^n x, \qquad \varphi \in C_c^\infty(\mathbb{R}^n), \qquad (7.15)$$
is also called the weak derivative or the derivative in the sense of distributions of $f$ (by Lemma 0.37 such a function is unique if it exists). Hence,


choosing g = ϕ in (7.14), we see that H r (Rn ) is the set of all functions hav-
ing partial derivatives (in the sense of distributions) up to order r, which
are in L2 (Rn ).
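For instance, in one dimension the tent function $f(x) = \max(1 - |x|, 0)$ has the weak derivative $f'(x) = -\mathrm{sign}(x)\chi_{(-1,1)}(x) \in L^2(\mathbb{R})$ and hence lies in $H^1(\mathbb{R})$, but $f \notin H^2(\mathbb{R})$: its Fourier transform decays only like $|p|^{-2}$, so $|p|^2\hat f(p) \notin L^2(\mathbb{R})$.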
    Finally, we note that on L1 (Rn ) we have

Lemma 7.6 (Riemann-Lebesgue). Let C∞ (Rn ) denote the Banach space
of all continuous functions f : Rn → C which vanish at ∞ equipped with
the sup norm. Then the Fourier transform is a bounded injective map from
L1 (Rn ) into C∞ (Rn ) satisfying
$$\|\hat f\|_\infty \le (2\pi)^{-n/2}\|f\|_1. \qquad (7.16)$$

Proof. Clearly we have $\hat f \in C_\infty(\mathbb{R}^n)$ if $f \in \mathcal{S}(\mathbb{R}^n)$. Moreover, since $\mathcal{S}(\mathbb{R}^n)$ is dense in $L^1(\mathbb{R}^n)$, the estimate
$$\sup_p|\hat f(p)| \le \frac{1}{(2\pi)^{n/2}}\sup_p\int_{\mathbb{R}^n}|\mathrm{e}^{-\mathrm{i}px}f(x)|\,d^n x = \frac{1}{(2\pi)^{n/2}}\int_{\mathbb{R}^n}|f(x)|\,d^n x$$
shows that the Fourier transform extends to a continuous map from $L^1(\mathbb{R}^n)$ into $C_\infty(\mathbb{R}^n)$.
    To see that the Fourier transform is injective, suppose $\hat f = 0$. Then Fubini implies
$$0 = \int_{\mathbb{R}^n}\varphi(x)\hat f(x)\,d^n x = \int_{\mathbb{R}^n}\hat\varphi(x)f(x)\,d^n x$$
for every $\varphi \in \mathcal{S}(\mathbb{R}^n)$. Hence Lemma 0.37 implies $f = 0$.

    Note that F : L1 (Rn ) → C∞ (Rn ) is not onto (cf. Problem 7.7).
    Another useful property is the convolution formula.

Lemma 7.7. The convolution
$$(f * g)(x) = \int_{\mathbb{R}^n} f(y)g(x - y)\,d^n y = \int_{\mathbb{R}^n} f(x - y)g(y)\,d^n y \qquad (7.17)$$
of two functions $f, g \in L^1(\mathbb{R}^n)$ is again in $L^1(\mathbb{R}^n)$ and we have Young's inequality
$$\|f * g\|_1 \le \|f\|_1\|g\|_1. \qquad (7.18)$$
Moreover, its Fourier transform is given by
$$(f * g)^\wedge(p) = (2\pi)^{n/2}\hat f(p)\hat g(p). \qquad (7.19)$$

Proof. The fact that f ∗ g is in L1 together with Young’s inequality follows
by applying Fubini’s theorem to h(x, y) = f (x − y)g(y). For the last claim


we compute
$$(f * g)^\wedge(p) = \frac{1}{(2\pi)^{n/2}}\int_{\mathbb{R}^n}\mathrm{e}^{-\mathrm{i}px}\int_{\mathbb{R}^n} f(y)g(x - y)\,d^n y\,d^n x = \int_{\mathbb{R}^n}\mathrm{e}^{-\mathrm{i}py}f(y)\,\frac{1}{(2\pi)^{n/2}}\int_{\mathbb{R}^n}\mathrm{e}^{-\mathrm{i}p(x - y)}g(x - y)\,d^n x\,d^n y = \int_{\mathbb{R}^n}\mathrm{e}^{-\mathrm{i}py}f(y)\hat g(p)\,d^n y = (2\pi)^{n/2}\hat f(p)\hat g(p),$$
where we have again used Fubini's theorem.
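As a quick consistency check of (7.19), take two Gaussians $f(x) = \mathrm{e}^{-ax^2/2}$ and $g(x) = \mathrm{e}^{-bx^2/2}$ with $a, b > 0$ in one dimension. Combining (7.19) with (7.7) gives $(f * g)^\wedge(p) = \sqrt{2\pi}\,(ab)^{-1/2}\mathrm{e}^{-p^2(a+b)/(2ab)}$, and applying (7.7) once more to invert the transform yields
$$(f * g)(x) = \sqrt{\frac{2\pi}{a + b}}\;\mathrm{e}^{-\frac{ab}{2(a+b)}x^2};$$
that is, the convolution of two Gaussians is again a Gaussian.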

    In other words, L1 (Rn ) together with convolution as a product is a
Banach algebra (without identity). For the case of convolution on L2 (Rn )
see Problem 7.9.
                                               √
Problem 7.1. Show that R exp(−x2 /2)dx = 2π. (Hint: Square the inte-
gral and evaluate it using polar coordinates.)
Problem 7.2. Compute the Fourier transform of the following functions
f : R → C:
    (i) $f(x) = \chi_{(-1,1)}(x)$.    (ii) $f(p) = \frac{1}{p^2 + k^2}$, $\mathrm{Re}(k) > 0$.

Problem 7.3. Suppose $f(x) \in L^1(\mathbb{R})$ and $g(x) = -\mathrm{i}xf(x) \in L^1(\mathbb{R})$. Then $\hat f$ is differentiable and $\hat f' = \hat g$.
Problem 7.4. A function f : Rn → C is called spherically symmetric if
it is invariant under rotations; that is, f (Ox) = f (x) for all O ∈ SO(Rn )
(equivalently, f depends only on the distance to the origin |x|). Show that the
Fourier transform of a spherically symmetric function is again spherically
symmetric.
Problem 7.5. Show (7.10). (Hint: First suppose f has compact support.
                                               ∞
Then there is a sequence of functions fn ∈ Cc (Rn ) converging to f in L2 .
The support of these functions can be chosen inside a fixed set and hence
this sequence also converges to f in L1 . Thus (7.10) follows for f ∈ L2 with
compact support. To remove this restriction, use that the projection onto a
ball with radius R converges strongly to the identity as R → ∞.)
Problem 7.6. Show that C∞ (Rn ) is indeed a Banach space. Show that
S(Rn ) is dense.
Problem 7.7. Show that F : L1 (Rn ) → C∞ (Rn ) is not onto as follows:
      (i) The range of F is dense.
      (ii) F is onto if and only if it has a bounded inverse.
      (iii) F has no bounded inverse.


    (Hint for (iii): Suppose $\varphi$ is smooth with compact support in $(0,1)$ and set $f_m(x) = \sum_{k=1}^m \mathrm{e}^{\mathrm{i}kx}\varphi(x - k)$. Then $\|f_m\|_1 = m\|\varphi\|_1$ and $\|\hat f_m\|_\infty \le \mathrm{const}$ since $\varphi \in \mathcal{S}(\mathbb{R})$ and hence $|\hat\varphi(p)| \le \mathrm{const}\,(1 + |p|)^{-2}$.)


Problem 7.8. Show that the convolution of two S(Rn ) functions is in
S(Rn ).
Problem 7.9. Show that the convolution of two L²(Rⁿ) functions is in C_\infty(Rⁿ) and we have \|f \ast g\|_\infty \le \|f\|_2 \|g\|_2.
Problem 7.10 (Wiener). Suppose f ∈ L²(Rⁿ). Then the set {f(x+a) | a ∈ Rⁿ} is total in L²(Rⁿ) if and only if \hat f(p) ≠ 0 a.e. (Hint: Use Lemma 7.2 and the fact that a subspace is total if and only if its orthogonal complement is zero.)
Problem 7.11. Suppose f(x) e^{k|x|} ∈ L¹(R) for some k > 0. Then \hat f(p) has an analytic extension to the strip |Im(p)| < k.

7.2. The free Schrödinger operator
In Section 2.1 we have seen that the Hilbert space corresponding to one
particle in R3 is L2 (R3 ). More generally, the Hilbert space for N particles
in Rd is L2 (Rn ), n = N d. The corresponding nonrelativistic Hamilton
operator, if the particles do not interact, is given by
                                  H0 = −∆,                             (7.20)
where ∆ is the Laplace operator
    \Delta = \sum_{j=1}^{n} \frac{\partial^2}{\partial x_j^2}.    (7.21)

Here we have chosen units such that all relevant physical constants disap-
pear; that is, = 1 and the mass of the particles is equal to m = 1 . Be
                                                                    2
                                                             1
aware that some authors prefer to use m = 1; that is, H0 = − 2 ∆.
   Our first task is to find a good domain such that H0 is a self-adjoint
operator.
   By Lemma 7.1 we have that
    -\Delta\psi(x) = (p^2 \hat\psi(p))^\vee(x), \qquad \psi \in H^2(\mathbb{R}^n),    (7.22)
and hence the operator
                    H0 ψ = −∆ψ,        D(H0 ) = H 2 (Rn ),             (7.23)
is unitarily equivalent to the maximally defined multiplication operator
  (F H0 F −1 )ϕ(p) = p2 ϕ(p),     D(p2 ) = {ϕ ∈ L2 (Rn )|p2 ϕ(p) ∈ L2 (Rn )}.
                                                                        (7.24)


Theorem 7.8. The free Schrödinger operator H_0 is self-adjoint and its spectrum is characterized by
    \sigma(H_0) = \sigma_{ac}(H_0) = [0, \infty), \qquad \sigma_{sc}(H_0) = \sigma_{pp}(H_0) = \emptyset.    (7.25)

Proof. It suffices to show that dµ_ψ is purely absolutely continuous for every ψ. First observe that
    \langle\psi, R_{H_0}(z)\psi\rangle = \langle\hat\psi, R_{p^2}(z)\hat\psi\rangle = \int_{\mathbb{R}^n} \frac{|\hat\psi(p)|^2}{p^2 - z}\, d^n p = \int_{\mathbb{R}} \frac{1}{r^2 - z}\, d\tilde\mu_\psi(r),
where
    d\tilde\mu_\psi(r) = \chi_{[0,\infty)}(r)\, r^{n-1} \left( \int_{S^{n-1}} |\hat\psi(r\omega)|^2\, d^{n-1}\omega \right) dr.
Hence, after a change of coordinates, we have
    \langle\psi, R_{H_0}(z)\psi\rangle = \int_{\mathbb{R}} \frac{1}{\lambda - z}\, d\mu_\psi(\lambda),
where
    d\mu_\psi(\lambda) = \frac{1}{2} \chi_{[0,\infty)}(\lambda)\, \lambda^{n/2-1} \left( \int_{S^{n-1}} |\hat\psi(\sqrt{\lambda}\,\omega)|^2\, d^{n-1}\omega \right) d\lambda,
proving the claim.

    Finally, we note that the compactly supported smooth functions are a core for H_0.

Lemma 7.9. The set C_c^\infty(Rⁿ) = {f ∈ S(Rⁿ) | supp(f) is compact} is a core for H_0.

Proof. It is not hard to see that S(Rⁿ) is a core (Problem 7.12) and hence it suffices to show that the closure of H_0|_{C_c^\infty(\mathbb{R}^n)} contains H_0|_{S(\mathbb{R}^n)}. To see this, let φ(x) ∈ C_c^\infty(Rⁿ) which is one for |x| ≤ 1 and vanishes for |x| ≥ 2. Set φ_n(x) = φ(x/n). Then ψ_n(x) = φ_n(x)ψ(x) is in C_c^\infty(Rⁿ) for every ψ ∈ S(Rⁿ) and ψ_n → ψ as well as ∆ψ_n → ∆ψ.

    Note also that the quadratic form of H_0 is given by
    q_{H_0}(\psi) = \sum_{j=1}^{n} \int_{\mathbb{R}^n} |\partial_j \psi(x)|^2\, d^n x, \qquad \psi \in Q(H_0) = H^1(\mathbb{R}^n).    (7.26)

Problem 7.12. Show that S(Rn ) is a core for H0 . (Hint: Show that the
closure of H0 |S(Rn ) contains H0 .)
Problem 7.13. Show that {ψ ∈ S(R) | ψ(0) = 0} is dense but not a core for H_0 = -\frac{d^2}{dx^2}.


7.3. The time evolution in the free case
Now let us look at the time evolution. We have
    e^{-itH_0}\psi(x) = F^{-1}\!\left(e^{-itp^2} \hat\psi(p)\right)(x).    (7.27)
The right-hand side is a product and hence our operator should be expressible as an integral operator via the convolution formula. However, since e^{-itp^2} is not in L², a more careful analysis is needed.
   Consider
    f_\varepsilon(p^2) = e^{-(it+\varepsilon)p^2}, \qquad \varepsilon > 0.    (7.28)
Then f_\varepsilon(H_0)\psi \to e^{-itH_0}\psi by Theorem 3.1. Moreover, by Lemma 7.3 and the convolution formula we have
    f_\varepsilon(H_0)\psi(x) = \frac{1}{(4\pi(it+\varepsilon))^{n/2}} \int_{\mathbb{R}^n} e^{-\frac{|x-y|^2}{4(it+\varepsilon)}}\, \psi(y)\, d^n y    (7.29)
and hence
    e^{-itH_0}\psi(x) = \frac{1}{(4\pi i t)^{n/2}} \int_{\mathbb{R}^n} e^{i\frac{|x-y|^2}{4t}}\, \psi(y)\, d^n y    (7.30)
for t ≠ 0 and ψ ∈ L¹ ∩ L². For general ψ ∈ L² the integral has to be understood as a limit.
    Using this explicit form, it is not hard to draw some immediate consequences. For example, if ψ ∈ L²(Rⁿ) ∩ L¹(Rⁿ), then ψ(t) ∈ C(Rⁿ) for t ≠ 0 (use dominated convergence and continuity of the exponential) and satisfies
    \|\psi(t)\|_\infty \le \frac{1}{|4\pi t|^{n/2}}\, \|\psi(0)\|_1.    (7.31)
Thus we have spreading of wave functions in this case. Moreover, it is even
possible to determine the asymptotic form of the wave function for large t
as follows. Observe
    e^{-itH_0}\psi(x) = \frac{e^{i\frac{x^2}{4t}}}{(4\pi i t)^{n/2}} \int_{\mathbb{R}^n} e^{i\frac{y^2}{4t}} \psi(y)\, e^{-i\frac{xy}{2t}}\, d^n y
                      = \left(\frac{1}{2it}\right)^{n/2} e^{i\frac{x^2}{4t}} \left( e^{i\frac{y^2}{4t}} \psi(y) \right)^\wedge\!\left(\frac{x}{2t}\right).    (7.32)
Moreover, since \exp(i\frac{y^2}{4t})\psi(y) \to \psi(y) in L² as |t| → ∞ (dominated convergence), we obtain
Lemma 7.10. For any ψ ∈ L²(Rⁿ) we have
    e^{-itH_0}\psi(x) - \left(\frac{1}{2it}\right)^{n/2} e^{i\frac{x^2}{4t}}\, \hat\psi\!\left(\frac{x}{2t}\right) \to 0    (7.33)
in L² as |t| → ∞.


     Note that this result is not too surprising from a physical point of view. In fact, if a classical particle starts at a point x(0) = x_0 with velocity v = 2p (recall that we use units where the mass is m = 1/2), then we will find it at x = x_0 + 2pt at time t. Dividing by 2t, we get x/(2t) = p + x_0/(2t) ≈ p for large t. Hence the probability distribution for finding a particle at a point x at time t should approach the probability distribution for the momentum at p = x/(2t); that is, |\psi(x,t)|^2\, d^n x = |\hat\psi(\frac{x}{2t})|^2\, \frac{d^n x}{(2t)^n}. This could also be stated as follows: The probability of finding the particle in a region Ω ⊆ Rⁿ is asymptotically for |t| → ∞ equal to the probability of finding the momentum of the particle in \frac{1}{2t}\Omega.
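
The following minimal numerical sketch (an illustration, not part of the original text) shows Lemma 7.10 at work for n = 1 and the Gaussian ψ(x) = π^{-1/4} e^{-x²/2}, for which e^{-itH_0}ψ(x) = π^{-1/4}(1+2it)^{-1/2} e^{-x²/(2(1+2it))} in closed form (a standard computation not carried out in the text) and \hat\psi = ψ. Grid sizes and sample times are arbitrary choices.

    # L^2 distance between e^{-itH_0}psi and the asymptotic form (7.33) for a Gaussian.
    import numpy as np

    x = np.linspace(-400, 400, 200001)
    dx = x[1] - x[0]

    def error(t):
        exact = np.pi**-0.25 / np.sqrt(1 + 2j*t) * np.exp(-x**2 / (2*(1 + 2j*t)))
        asymp = (1/np.sqrt(2j*t)) * np.exp(1j*x**2/(4*t)) \
                * np.pi**-0.25 * np.exp(-(x/(2*t))**2 / 2)   # (1/2it)^{1/2} e^{ix^2/4t} psihat(x/2t)
        return np.sqrt(np.sum(np.abs(exact - asymp)**2) * dx)

    for t in (1.0, 10.0, 100.0):
        print(t, error(t))   # the L^2 error decreases as t grows
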
     Next we want to apply the RAGE theorem in order to show that for any
initial condition, a particle will escape to infinity.
Lemma 7.11. Let g(x) be the multiplication operator by g and let f(p) be the operator given by f(p)\psi(x) = F^{-1}(f(p)\hat\psi(p))(x). Denote by L^\infty_\infty(Rⁿ) the bounded Borel functions which vanish at infinity. Then
    f(p)g(x) \quad\text{and}\quad g(x)f(p)    (7.34)
are compact if f, g ∈ L^\infty_\infty(Rⁿ) and (extend to) Hilbert–Schmidt operators if f, g ∈ L²(Rⁿ).

Proof. By symmetry it suffices to consider g(x)f(p). Let f, g ∈ L². Then
    g(x)f(p)\psi(x) = \frac{1}{(2\pi)^{n/2}} \int_{\mathbb{R}^n} g(x)\check f(x-y)\psi(y)\, d^n y
shows that g(x)f(p) is Hilbert–Schmidt since g(x)\check f(x-y) ∈ L²(Rⁿ × Rⁿ).
    If f, g are bounded, then the functions f_R(p) = \chi_{\{p|p^2\le R\}}(p)f(p) and g_R(x) = \chi_{\{x|x^2\le R\}}(x)g(x) are in L². Thus g_R(x)f_R(p) is compact and by
    \|g(x)f(p) - g_R(x)f_R(p)\| \le \|g\|_\infty \|f - f_R\|_\infty + \|g - g_R\|_\infty \|f_R\|_\infty
it tends to g(x)f(p) in norm since f, g vanish at infinity.

       In particular, this lemma implies that
    \chi_\Omega (H_0 + \mathrm{i})^{-1}    (7.35)
is compact if Ω ⊆ Rⁿ is bounded and hence
    \lim_{t\to\infty} \|\chi_\Omega e^{-itH_0}\psi\|^2 = 0    (7.36)
for any ψ ∈ L²(Rⁿ) and any bounded subset Ω of Rⁿ. In other words, the particle will eventually escape to infinity since the probability of finding the particle in any bounded set tends to zero. (If ψ ∈ L¹(Rⁿ), this of course also follows from (7.31).)


7.4. The resolvent and Green’s function
Now let us compute the resolvent of H0 . We will try to use an approach
similar to that for the time evolution in the previous section. However,
since it is highly nontrivial to compute the inverse Fourier transform of
exp(−εp2 )(p2 − z)−1 directly, we will use a small ruse.
    Note that
    R_{H_0}(z) = \int_0^\infty e^{zt} e^{-tH_0}\, dt, \qquad \mathrm{Re}(z) < 0,    (7.37)
by Lemma 4.1. Moreover,
    e^{-tH_0}\psi(x) = \frac{1}{(4\pi t)^{n/2}} \int_{\mathbb{R}^n} e^{-\frac{|x-y|^2}{4t}}\, \psi(y)\, d^n y, \qquad t > 0,    (7.38)

by the same analysis as in the previous section. Hence, by Fubini, we have
    R_{H_0}(z)\psi(x) = \int_{\mathbb{R}^n} G_0(z, |x-y|)\, \psi(y)\, d^n y,    (7.39)
where
    G_0(z, r) = \int_0^\infty \frac{1}{(4\pi t)^{n/2}}\, e^{-\frac{r^2}{4t}+zt}\, dt, \qquad r > 0,\ \mathrm{Re}(z) < 0.    (7.40)
The function G_0(z, r) is called Green's function of H_0. The integral can be evaluated in terms of modified Bessel functions of the second kind as follows: First of all it suffices to consider z < 0 since the remaining values will follow by analytic continuation. Then, making the substitution t = \frac{r}{2\sqrt{-z}} e^s, we obtain
    \int_0^\infty \frac{1}{(4\pi t)^{n/2}} e^{-\frac{r^2}{4t}+zt}\, dt
        = \frac{1}{4\pi} \left(\frac{\sqrt{-z}}{2\pi r}\right)^{\frac{n}{2}-1} \int_{-\infty}^\infty e^{-\nu s} e^{-x\cosh(s)}\, ds
        = \frac{1}{2\pi} \left(\frac{\sqrt{-z}}{2\pi r}\right)^{\frac{n}{2}-1} \int_0^\infty \cosh(\nu s)\, e^{-x\cosh(s)}\, ds,    (7.41)
where we have abbreviated x = \sqrt{-z}\, r and \nu = \frac{n}{2}-1. But the last integral is given by the modified Bessel function K_\nu(x) (see [1, (9.6.24)]) and thus
    G_0(z, r) = \frac{1}{2\pi} \left(\frac{\sqrt{-z}}{2\pi r}\right)^{\frac{n}{2}-1} K_{\frac{n}{2}-1}(\sqrt{-z}\, r).    (7.42)


Note K_\nu(x) = K_{-\nu}(x) and K_\nu(x) > 0 for ν, x ∈ R. The functions K_\nu(x) satisfy the differential equation (see [1, (9.6.1)])
    \left( \frac{d^2}{dx^2} + \frac{1}{x}\frac{d}{dx} - 1 - \frac{\nu^2}{x^2} \right) K_\nu(x) = 0    (7.43)


and have the asymptotics (see [1, (9.6.8) and (9.6.9)])
    K_\nu(x) = \begin{cases} \frac{\Gamma(\nu)}{2}\left(\frac{x}{2}\right)^{-\nu} + O(x^{-\nu+2}), & \nu > 0, \\ -\log(\frac{x}{2}) + O(1), & \nu = 0, \end{cases}    (7.44)
for |x| → 0 and (see [1, (9.7.2)])
    K_\nu(x) = \sqrt{\frac{\pi}{2x}}\, e^{-x}\, (1 + O(x^{-1}))    (7.45)
for |x| → ∞. For more information see for example [1] or [59]. In particular,
G_0(z, r) has an analytic continuation for z ∈ C∖[0, ∞) = ρ(H_0). Hence we can define the right-hand side of (7.39) for all z ∈ ρ(H_0) such that
    \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} \varphi(x)\, G_0(z, |x-y|)\, \psi(y)\, d^n y\, d^n x    (7.46)
is analytic for z ∈ ρ(H_0) and φ, ψ ∈ S(Rⁿ) (by Morera's theorem). Since it is equal to ⟨φ, R_{H_0}(z)ψ⟩ for Re(z) < 0, it is equal to this function for all z ∈ ρ(H_0), since both functions are analytic in this domain. In particular, (7.39) holds for all z ∈ ρ(H_0).
    If n is odd, we have the case of spherical Bessel functions which can be
expressed in terms of elementary functions. For example, we have
    G_0(z, r) = \frac{1}{2\sqrt{-z}}\, e^{-\sqrt{-z}\, r}, \qquad n = 1,    (7.47)
and
    G_0(z, r) = \frac{1}{4\pi r}\, e^{-\sqrt{-z}\, r}, \qquad n = 3.    (7.48)
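
As a quick sanity check (not part of the original text), one can verify numerically that the Bessel-function formula (7.42) reduces to the elementary expressions (7.47) and (7.48); the sketch below uses scipy.special.kv for K_\nu and arbitrary sample values z < 0, r > 0.

    # Check that (7.42) gives (7.47) for n = 1 and (7.48) for n = 3.
    import numpy as np
    from scipy.special import kv

    def G0(z, r, n):
        s = np.sqrt(-z)                    # sqrt(-z) > 0 for z < 0
        return (1/(2*np.pi)) * (s/(2*np.pi*r))**(n/2 - 1) * kv(n/2 - 1, s*r)

    z, r = -2.0, 1.3
    print(G0(z, r, 1), np.exp(-np.sqrt(-z)*r) / (2*np.sqrt(-z)))   # n = 1: should agree
    print(G0(z, r, 3), np.exp(-np.sqrt(-z)*r) / (4*np.pi*r))       # n = 3: should agree
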
Problem 7.14. Verify (7.39) directly in the case n = 1.
Chapter 8




Algebraic methods


8.1. Position and momentum
Apart from the Hamiltonian H0 , which corresponds to the kinetic energy,
there are several other important observables associated with a single par-
ticle in three dimensions. Using the commutation relation between these
observables, many important consequences about these observables can be
derived.
    First consider the one-parameter unitary group
                   (Uj (t)ψ)(x) = e−itxj ψ(x),      1 ≤ j ≤ 3.               (8.1)
For ψ ∈ S(R³) we compute
    \lim_{t\to 0} \mathrm{i}\, \frac{e^{-itx_j}\psi(x) - \psi(x)}{t} = x_j \psi(x)    (8.2)
and hence the generator is the multiplication operator by the j’th coordinate
function. By Corollary 5.3 it is essentially self-adjoint on ψ ∈ S(R3 ). It is
customary to combine all three operators into one vector-valued operator
x, which is known as the position operator. Moreover, it is not hard
to see that the spectrum of xj is purely absolutely continuous and given
by σ(xj ) = R. In fact, let ϕ(x) be an orthonormal basis for L2 (R). Then
ϕi (x1 )ϕj (x2 )ϕk (x3 ) is an orthonormal basis for L2 (R3 ) and x1 can be written
as an orthogonal sum of operators restricted to the subspaces spanned by
ϕj (x2 )ϕk (x3 ). Each subspace is unitarily equivalent to L2 (R) and x1 is
given by multiplication with the identity. Hence the claim follows (or use
Theorem 4.14).
    Next, consider the one-parameter unitary group of translations
                   (Uj (t)ψ)(x) = ψ(x − tej ),      1 ≤ j ≤ 3,               (8.3)



where e_j is the unit vector in the j'th coordinate direction. For ψ ∈ S(R³) we compute
    \lim_{t\to 0} \mathrm{i}\, \frac{\psi(x - te_j) - \psi(x)}{t} = \frac{1}{\mathrm{i}} \frac{\partial}{\partial x_j} \psi(x)    (8.4)
and hence the generator is p_j = \frac{1}{\mathrm{i}} \frac{\partial}{\partial x_j}. Again it is essentially self-adjoint on ψ ∈ S(R³). Moreover, since it is unitarily equivalent to x_j by virtue of the Fourier transform, we conclude that the spectrum of p_j is again purely absolutely continuous and given by σ(p_j) = R. The operator p is known as the momentum operator. Note that since
                          [H0 , pj ]ψ(x) = 0,       ψ ∈ S(R3 ),                    (8.5)
we have
    \frac{d}{dt} \langle\psi(t), p_j \psi(t)\rangle = 0, \qquad \psi(t) = e^{-itH_0}\psi(0) \in S(\mathbb{R}^3);    (8.6)
that is, the momentum is a conserved quantity for the free motion. More
generally we have
Theorem 8.1 (Noether). Suppose A is a self-adjoint operator which com-
mutes with a self-adjoint operator H. Then D(A) is invariant under e−itH ,
that is, e−itH D(A) = D(A), and A is a conserved quantity, that is,
       ψ(t), Aψ(t) = ψ(0), Aψ(0) ,              ψ(t) = e−itH ψ(0) ∈ D(A).          (8.7)

Proof. By the second part of Lemma 4.5 (with f (λ) = λ and B = e−itH ) we
see D(A) = D(e−itH A) ⊆ D(Ae−itH ) = {ψ|e−itH ψ ∈ D(A)} which implies
e−itH D(A) ⊆ D(A), and [e−itH , A]ψ = 0 for ψ ∈ D(A).

       Similarly one has
    \mathrm{i}[p_j, x_k]\psi(x) = \delta_{jk}\psi(x), \qquad \psi \in S(\mathbb{R}^3),    (8.8)
which is known as the Weyl relations. In terms of the corresponding unitary groups they read
    e^{-isp_j} e^{-itx_k} = e^{ist\delta_{jk}}\, e^{-itx_k} e^{-isp_j}.    (8.9)
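
The commutation relation (8.8) can be checked symbolically; the following minimal sketch (an illustration, not part of the original text) verifies i[p, x]ψ = ψ in one variable for an arbitrary smooth test function using sympy.

    # Symbolic check of i[p, x] psi = psi with p = (1/i) d/dx.
    import sympy as sp

    x = sp.symbols('x', real=True)
    psi = sp.Function('psi')(x)
    p = lambda f: -sp.I * sp.diff(f, x)        # p = (1/i) d/dx

    commutator = p(x*psi) - x*p(psi)           # [p, x] psi
    print(sp.simplify(sp.I*commutator - psi))  # prints 0
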
   The Weyl relations also imply that the mean-square deviation of position
and momentum cannot be made arbitrarily small simultaneously:
Theorem 8.2 (Heisenberg Uncertainty Principle). Suppose A and B are two symmetric operators. Then for any ψ ∈ D(AB) ∩ D(BA) we have
    \Delta_\psi(A)\, \Delta_\psi(B) \ge \frac{1}{2} |E_\psi([A, B])|    (8.10)
with equality if
    (B - E_\psi(B))\psi = \mathrm{i}\lambda (A - E_\psi(A))\psi, \qquad \lambda \in \mathbb{R}\setminus\{0\},    (8.11)
or if ψ is an eigenstate of A or B.


Proof. Let us fix ψ ∈ D(AB) ∩ D(BA) and abbreviate
    \hat A = A - E_\psi(A), \qquad \hat B = B - E_\psi(B).
Then \Delta_\psi(A) = \|\hat A\psi\|, \Delta_\psi(B) = \|\hat B\psi\| and hence by Cauchy–Schwarz
    |\langle \hat A\psi, \hat B\psi\rangle| \le \Delta_\psi(A)\Delta_\psi(B).
Now note that
    \hat A\hat B = \frac{1}{2}\{\hat A, \hat B\} + \frac{1}{2}[\hat A, \hat B], \qquad \{\hat A, \hat B\} = \hat A\hat B + \hat B\hat A,
where \{\hat A, \hat B\} and \mathrm{i}[\hat A, \hat B] are symmetric. So
    |\langle \hat A\psi, \hat B\psi\rangle|^2 = |\langle\psi, \hat A\hat B\psi\rangle|^2 = \frac{1}{4}|\langle\psi, \{\hat A, \hat B\}\psi\rangle|^2 + \frac{1}{4}|\langle\psi, [\hat A, \hat B]\psi\rangle|^2
which proves (8.10).
    To have equality if ψ is not an eigenstate, we need \hat B\psi = z\hat A\psi for equality in Cauchy–Schwarz and \langle\psi, \{\hat A, \hat B\}\psi\rangle = 0. Inserting the first into the second requirement gives 0 = (z + z^*)\|\hat A\psi\|^2 and shows Re(z) = 0.

   In the case of position and momentum we have (\|\psi\| = 1)
    \Delta_\psi(p_j)\, \Delta_\psi(x_k) \ge \frac{\delta_{jk}}{2}    (8.12)
and the minimum is attained for the Gaussian wave packets
    \psi(x) = \left(\frac{\lambda}{\pi}\right)^{n/4} e^{-\frac{\lambda}{2}|x-x_0|^2 - \mathrm{i} p_0 x},    (8.13)
which satisfy E_\psi(x) = x_0 and E_\psi(p) = p_0, respectively, \Delta_\psi(p_j)^2 = \frac{\lambda}{2} and \Delta_\psi(x_k)^2 = \frac{1}{2\lambda}.
Problem 8.1. Check that (8.13) realizes the minimum.

8.2. Angular momentum
Now consider the one-parameter unitary group of rotations
                   (Uj (t)ψ)(x) = ψ(Mj (t)x),             1 ≤ j ≤ 3,       (8.14)
where Mj (t) is the matrix of rotation around ej by an angle of t. For
ψ ∈ S(R³) we compute
    \lim_{t\to 0} \mathrm{i}\, \frac{\psi(M_i(t)x) - \psi(x)}{t} = \sum_{j,k=1}^{3} \varepsilon_{ijk}\, x_j p_k\, \psi(x),    (8.15)
where
    \varepsilon_{ijk} = \begin{cases} 1 & \text{if } ijk \text{ is an even permutation of } 123, \\ -1 & \text{if } ijk \text{ is an odd permutation of } 123, \\ 0 & \text{otherwise.} \end{cases}    (8.16)


Again one combines the three components into one vector-valued operator
L = x ∧ p, which is known as the angular momentum operator. Since
ei2πLj = I, we see that the spectrum is a subset of Z. In particular, the
continuous spectrum is empty. We will show below that we have σ(Lj ) = Z.
Note that since
                       [H0 , Lj ]ψ(x) = 0,     ψ ∈ S(R3 ),          (8.17)
we again have
          d
             ψ(t), Lj ψ(t) = 0,       ψ(t) = e−itH0 ψ(0) ∈ S(R3 );  (8.18)
         dt
that is, the angular momentum is a conserved quantity for the free motion
as well.
      Moreover, we even have
    [L_i, K_j]\psi(x) = \mathrm{i} \sum_{k=1}^{3} \varepsilon_{ijk}\, K_k\, \psi(x), \qquad \psi \in S(\mathbb{R}^3),\ K_j \in \{L_j, p_j, x_j\},    (8.19)
and these algebraic commutation relations are often used to derive information on the point spectra of these operators. In this respect the domain
    D = \mathrm{span}\{x^\alpha e^{-\frac{x^2}{2}} \mid \alpha \in \mathbb{N}_0^n\} \subset S(\mathbb{R}^n)    (8.20)
is often used. It has the nice property that the finite dimensional subspaces
    D_k = \mathrm{span}\{x^\alpha e^{-\frac{x^2}{2}} \mid |\alpha| \le k\}    (8.21)
are invariant under L_j (and hence they reduce L_j); one instance of (8.19) is checked symbolically in the sketch below.
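
The following minimal sketch (an illustration, not part of the original text) verifies the instance [L_1, L_2]ψ = i L_3 ψ of (8.19) symbolically, again with an arbitrary smooth test function.

    # Symbolic check of [L1, L2] f = i L3 f with L = x ^ p, p_j = (1/i) d/dx_j.
    import sympy as sp

    x1, x2, x3 = sp.symbols('x1 x2 x3', real=True)
    f = sp.Function('f')(x1, x2, x3)
    p = lambda g, v: -sp.I * sp.diff(g, v)

    L1 = lambda g: x2*p(g, x3) - x3*p(g, x2)
    L2 = lambda g: x3*p(g, x1) - x1*p(g, x3)
    L3 = lambda g: x1*p(g, x2) - x2*p(g, x1)

    print(sp.simplify(sp.expand(L1(L2(f)) - L2(L1(f)) - sp.I*L3(f))))   # prints 0
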
Lemma 8.3. The subspace D ⊂ L²(Rⁿ) defined in (8.20) is dense.

Proof. By Lemma 1.10 it suffices to consider the case n = 1. Suppose ⟨φ, ψ⟩ = 0 for every ψ ∈ D. Then
    \frac{1}{\sqrt{2\pi}} \int \varphi(x)\, e^{-\frac{x^2}{2}} \sum_{j=0}^{k} \frac{(\mathrm{i}tx)^j}{j!}\, dx = 0
for any finite k and hence also in the limit k → ∞ by the dominated convergence theorem. But the limit is the Fourier transform of \varphi(x)e^{-\frac{x^2}{2}}, which shows that this function is zero. Hence φ(x) = 0.

    Since D is invariant under the unitary groups generated by Lj , the op-
erators Lj are essentially self-adjoint on D by Corollary 5.3.
       Introducing L^2 = L_1^2 + L_2^2 + L_3^2, it is straightforward to check
    [L^2, L_j]\psi(x) = 0, \qquad \psi \in S(\mathbb{R}^3).    (8.22)
Moreover, D_k is invariant under L^2 and L_3 and hence D_k reduces L^2 and L_3. In particular, L^2 and L_3 are given by finite matrices on D_k. Now


let Hm = Ker(L3 − m) and denote by Pk the projector onto Dk . Since
L2 and L3 commute on Dk , the space Pk Hm is invariant under L2 , which
shows that we can choose an orthonormal basis consisting of eigenfunctions
of L2 for Pk Hm . Increasing k, we get an orthonormal set of simultaneous
eigenfunctions whose span is equal to D. Hence there is an orthonormal
basis of simultaneous eigenfunctions of L2 and L3 .
    Now let us try to draw some further consequences by using the commuta-
tion relations (8.19). (All commutation relations below hold for ψ ∈ S(R3 ).)
Denote by Hl,m the set of all functions in D satisfying
                        L3 ψ = mψ,       L2 ψ = l(l + 1)ψ.             (8.23)
By L2 ≥ 0 and σ(L3 ) ⊆ Z we can restrict our attention to the case l ≥ 0
and m ∈ Z.
    First introduce two new operators
    L_\pm = L_1 \pm \mathrm{i}L_2, \qquad [L_3, L_\pm] = \pm L_\pm.    (8.24)
Then, for every ψ ∈ H_{l,m} we have
    L_3(L_\pm\psi) = (m \pm 1)(L_\pm\psi), \qquad L^2(L_\pm\psi) = l(l+1)(L_\pm\psi);    (8.25)
that is, L_\pm H_{l,m} \to H_{l,m\pm 1}. Moreover, since
    L^2 = L_3^2 \pm L_3 + L_\mp L_\pm,    (8.26)
we obtain
    \|L_\pm\psi\|^2 = \langle\psi, L_\mp L_\pm\psi\rangle = (l(l+1) - m(m\pm 1))\, \|\psi\|^2    (8.27)
for every ψ ∈ H_{l,m}. If ψ ≠ 0, we must have l(l+1) − m(m±1) ≥ 0, which shows H_{l,m} = {0} for |m| > l. Moreover, L_\pm H_{l,m} \to H_{l,m\pm 1} is injective unless |m| = l. Hence we must have H_{l,m} = {0} for l ∉ N_0.
     Up to this point we know σ(L²) ⊆ {l(l+1) | l ∈ N_0}, σ(L_3) ⊆ Z. In order to show that equality holds in both cases, we need to show that H_{l,m} ≠ {0} for l ∈ N_0, m = −l, −l+1, …, l−1, l. First of all we observe
    \psi_{0,0}(x) = \frac{1}{\pi^{3/4}}\, e^{-\frac{x^2}{2}} \in H_{0,0}.    (8.28)
Next, we note that (8.19) implies
    [L_3, x_\pm] = \pm x_\pm, \qquad x_\pm = x_1 \pm \mathrm{i}x_2,
    [L_\pm, x_\pm] = 0, \qquad [L_\pm, x_\mp] = \pm 2x_3,
    [L^2, x_\pm] = 2x_\pm(1 \pm L_3) \mp 2x_3 L_\pm.    (8.29)
Hence if ψ ∈ H_{l,l}, then (x_1 ± \mathrm{i}x_2)ψ ∈ H_{l\pm 1, l\pm 1}. Thus
    \psi_{l,l}(x) = \frac{1}{\sqrt{l!}} (x_1 \pm \mathrm{i}x_2)^l\, \psi_{0,0}(x) \in H_{l,l},    (8.30)


respectively,
    \psi_{l,m}(x) = \sqrt{\frac{(l+m)!}{(l-m)!\,(2l)!}}\; L_-^{l-m}\, \psi_{l,l}(x) \in H_{l,m}.    (8.31)
The constants are chosen such that \|\psi_{l,m}\| = 1.
      In summary,
Theorem 8.4. There exists an orthonormal basis of simultaneous eigenvec-
tors for the operators L2 and Lj . Moreover, their spectra are given by
                    σ(L2 ) = {l(l + 1)|l ∈ N0 },                σ(L3 ) = Z.          (8.32)

      We will give an alternate derivation of this result in Section 10.3.

8.3. The harmonic oscillator
Finally, let us consider another important model whose algebraic structure
is similar to those of the angular momentum, the harmonic oscillator
    H = H_0 + \omega^2 x^2, \qquad \omega > 0.    (8.33)
We will choose as domain
    D(H) = D = \mathrm{span}\{x^\alpha e^{-\frac{x^2}{2}} \mid \alpha \in \mathbb{N}_0^3\} \subseteq L^2(\mathbb{R}^3)    (8.34)
from our previous section.
      We will first consider the one-dimensional case. Introducing
    A_\pm = \frac{1}{\sqrt{2}}\left( \sqrt{\omega}\, x \mp \frac{1}{\sqrt{\omega}} \frac{d}{dx} \right), \qquad D(A_\pm) = D,    (8.35)
we have
    [A_-, A_+] = 1    (8.36)
and
    H = \omega(2N + 1), \qquad N = A_+ A_-, \qquad D(N) = D,    (8.37)
for any function in D. In particular, note that D is invariant under A± .
      Moreover, since
    [N, A_\pm] = \pm A_\pm,    (8.38)
we see that Nψ = nψ implies N A_\pm\psi = (n ± 1)A_\pm\psi. Moreover, \|A_+\psi\|^2 = \langle\psi, A_-A_+\psi\rangle = (n+1)\|\psi\|^2, respectively, \|A_-\psi\|^2 = n\|\psi\|^2, in this case and hence we conclude that σ_p(N) ⊆ N_0.
    If Nψ_0 = 0, then we must have A_-\psi_0 = 0 and the normalized solution of this last equation is given by
    \psi_0(x) = \left(\frac{\omega}{\pi}\right)^{1/4} e^{-\frac{\omega x^2}{2}} \in D.    (8.39)


Hence
    \psi_n(x) = \frac{1}{\sqrt{n!}}\, A_+^n \psi_0(x)    (8.40)
is a normalized eigenfunction of N corresponding to the eigenvalue n. Moreover, since
    \psi_n(x) = \frac{1}{\sqrt{2^n n!}} \left(\frac{\omega}{\pi}\right)^{1/4} H_n(\sqrt{\omega}\, x)\, e^{-\frac{\omega x^2}{2}}    (8.41)
where H_n(x) is a polynomial of degree n given by
    H_n(x) = e^{\frac{x^2}{2}} \left( x - \frac{d}{dx} \right)^n e^{-\frac{x^2}{2}} = (-1)^n e^{x^2} \frac{d^n}{dx^n} e^{-x^2},    (8.42)
we conclude span{ψ_n} = D. The polynomials H_n(x) are called Hermite polynomials.
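
As a numerical illustration (not part of the original text), the functions (8.41) can be evaluated with the physicists' Hermite polynomials from scipy and checked to be orthonormal in L²(R); the grid and the number of states are arbitrary choices (ω = 1).

    # Orthonormality check for the oscillator eigenfunctions (8.41), omega = 1.
    import numpy as np
    from math import factorial
    from scipy.special import eval_hermite

    omega = 1.0
    x = np.linspace(-15, 15, 6001)
    dx = x[1] - x[0]

    def psi(n):
        c = (omega/np.pi)**0.25 / np.sqrt(2.0**n * factorial(n))
        return c * eval_hermite(n, np.sqrt(omega)*x) * np.exp(-omega*x**2/2)

    gram = np.array([[np.sum(psi(m)*psi(n))*dx for n in range(6)] for m in range(6)])
    print(np.max(np.abs(gram - np.eye(6))))   # close to zero
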
   In summary,
Theorem 8.5. The harmonic oscillator H is essentially self-adjoint on D
and has an orthonormal basis of eigenfunctions
                   ψn1 ,n2 ,n3 (x) = ψn1 (x1 )ψn2 (x2 )ψn3 (x3 ),                (8.43)
with ψnj (xj ) from (8.41). The spectrum is given by
                         σ(H) = {(2n + 3)ω|n ∈ N0 }.                             (8.44)

    Finally, there is also a close connection with the Fourier transformation. Without restriction we choose ω = 1 and consider only one dimension. Then it is easy to verify that H commutes with the Fourier transformation,
    \mathcal{F}H = H\mathcal{F},    (8.45)
on D. Moreover, by \mathcal{F}A_\pm = \mp\mathrm{i} A_\pm \mathcal{F} we even infer
    \mathcal{F}\psi_n = \frac{1}{\sqrt{n!}}\, \mathcal{F}A_+^n\psi_0 = \frac{(-\mathrm{i})^n}{\sqrt{n!}}\, A_+^n \mathcal{F}\psi_0 = (-\mathrm{i})^n \psi_n,    (8.46)
since \mathcal{F}\psi_0 = \psi_0 by Lemma 7.3. In particular,
    \sigma(\mathcal{F}) = \{z \in \mathbb{C} \mid z^4 = 1\}.    (8.47)
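
The eigenvalue relation (8.46) can also be observed numerically; the sketch below (an illustration, not part of the original text) computes \hat\psi_n by direct quadrature and compares it with (−i)^n ψ_n for small n (ω = 1, arbitrary grids).

    # Check that the Fourier transform of psi_n equals (-i)^n psi_n for n = 0,...,3.
    import numpy as np
    from math import factorial
    from scipy.special import eval_hermite

    x = np.linspace(-20, 20, 8001)
    dx = x[1] - x[0]
    ps = np.linspace(-5, 5, 41)

    def psi(n, t):
        c = np.pi**-0.25 / np.sqrt(2.0**n * factorial(n))
        return c * eval_hermite(n, t) * np.exp(-t**2/2)

    for n in range(4):
        phat = np.array([(np.exp(-1j*pp*x) * psi(n, x)).sum() * dx for pp in ps]) / np.sqrt(2*np.pi)
        print(n, np.max(np.abs(phat - (-1j)**n * psi(n, ps))))   # close to zero
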

8.4. Abstract commutation
The considerations of the previous section can be generalized as follows.
First of all, the starting point was a factorization of H according to H = A∗ A
(note that A± from the previous section are adjoint to each other when
restricted to D). Then it turned out that commuting both operators just
corresponds to a shift of H; that is, AA∗ = H + c. Hence one could exploit
the close spectral relation of A∗ A and AA∗ to compute both the eigenvalues
and eigenvectors.


     More generally, let A be a closed operator and recall that H0 = A∗ A is a
self-adjoint operator (cf. Problem 2.12) with Ker(H0 ) = Ker(A). Similarly,
H1 = AA∗ is a self-adjoint operator with Ker(H1 ) = Ker(A∗ ).
Theorem 8.6. Let A be a closed operator. The operators H_0 = A^*A\big|_{\mathrm{Ker}(A)^\perp} and H_1 = AA^*\big|_{\mathrm{Ker}(A^*)^\perp} are unitarily equivalent.
    If H_0\psi_0 = E\psi_0, \psi_0 \in D(H_0), then \psi_1 = A\psi_0 \in D(H_1) with H_1\psi_1 = E\psi_1 and \|\psi_1\| = \sqrt{E}\, \|\psi_0\|. Moreover,
    R_{H_1}(z) \supseteq \frac{1}{z}\left( A R_{H_0}(z) A^* - 1 \right), \qquad R_{H_0}(z) \supseteq \frac{1}{z}\left( A^* R_{H_1}(z) A - 1 \right).    (8.48)
Proof. Introducing |A| = H_0^{1/2}, we have the polar decomposition (Problem 3.11)
    A = U|A|,
where
    U : \mathrm{Ker}(A)^\perp \to \mathrm{Ker}(A^*)^\perp
is unitary. Taking adjoints, we have (Problem 2.3)
    A^* = |A|U^*
and thus H_1 = AA^* = U|A||A|U^* = U H_0 U^* shows the claimed unitary equivalence.
    The claims about the eigenvalues are straightforward (for the norm note A\psi_0 = \sqrt{E}\, U\psi_0). To see the connection between the resolvents, abbreviate P_1 = P_{H_1}(\{0\}). Then
    R_{H_1}(z) = R_{H_1}(z)(1 - P_1) - \frac{1}{z}P_1 = U R_{H_0}(z) U^* - \frac{1}{z}P_1
               \supseteq \frac{1}{z} U\left( H_0^{1/2} R_{H_0}(z) H_0^{1/2} - 1 \right)U^* - \frac{1}{z}P_1
               = \frac{1}{z}\left( A R_{H_0}(z) A^* - (1 - P_1) - P_1 \right) = \frac{1}{z}\left( A R_{H_0}(z) A^* - 1 \right),
where we have used UU^* = 1 - P_1.

    We will use this result to compute the eigenvalues and eigenfunctions of
the hydrogen atom in Section 10.4. In the physics literature this approach
is also known as supersymmetric quantum mechanics.
Problem 8.2. Show that H_0 = -\frac{d^2}{dx^2} + q can formally (i.e., ignoring domains) be written as H_0 = A^*A, where A = -\frac{d}{dx} + \phi, if the differential equation -\psi'' + q\psi = 0 has a positive solution. Compute H_1 = AA^*. (Hint: \phi = \psi'/\psi.)
Problem 8.3. Take H_0 = -\frac{d^2}{dx^2} + \lambda, \lambda > 0, and compute H_1. What about domains?
Chapter 9




One-dimensional
Schrödinger operators


9.1. Sturm–Liouville operators
In this section we want to illustrate some of the results obtained thus far by
investigating a specific example, the Sturm–Liouville equation

    \tau f(x) = \frac{1}{r(x)} \left( -\frac{d}{dx}\, p(x) \frac{d}{dx} f(x) + q(x) f(x) \right), \qquad f, pf' \in AC_{loc}(I).    (9.1)

    The case p = r = 1 can be viewed as the model of a particle in one-
dimension in the external potential q. Moreover, the case of a particle in
three dimensions can in some situations be reduced to the investigation of
Sturm–Liouville equations. In particular, we will see how this works when
explicitly solving the hydrogen atom.
   The suitable Hilbert space is
    L^2((a,b), r(x)dx), \qquad \langle f, g\rangle = \int_a^b f(x)^* g(x)\, r(x)\, dx,    (9.2)
where I = (a, b) ⊆ R is an arbitrary open interval.
   We require
    (i) p^{-1} \in L^1_{loc}(I), positive,
    (ii) q \in L^1_{loc}(I), real-valued,
    (iii) r \in L^1_{loc}(I), positive.




If a is finite and if p^{-1}, q, r ∈ L¹((a, c)) (c ∈ I), then the Sturm–Liouville equation (9.1) is called regular at a. Similarly for b. If it is regular at both a and b, it is called regular.
    The maximal domain of definition for τ in L²(I, r dx) is given by
    D(\tau) = \{f \in L^2(I, r\,dx) \mid f, pf' \in AC_{loc}(I),\ \tau f \in L^2(I, r\,dx)\}.    (9.3)
It is not clear that D(τ) is dense unless (e.g.) p ∈ AC_{loc}(I), p', q ∈ L^2_{loc}(I), r^{-1} ∈ L^\infty_{loc}(I) since C_0^\infty(I) ⊂ D(τ) in this case. We will defer the general case to Lemma 9.4 below.
    Since we are interested in self-adjoint operators H associated with (9.1), we perform a little calculation. Using integration by parts (twice), we obtain the Lagrange identity (a < c < d < b)
    \int_c^d g^* (\tau f)\, r\, dy = W_d(g^*, f) - W_c(g^*, f) + \int_c^d (\tau g)^* f\, r\, dy,    (9.4)
for f, g, pf', pg' ∈ AC_{loc}(I), where
    W_x(f_1, f_2) = \left( p (f_1 f_2' - f_1' f_2) \right)(x)    (9.5)
is called the modified Wronskian.
Equation (9.4) also shows that the Wronskian of two solutions of τu = zu is constant,
    W_x(u_1, u_2) = W(u_1, u_2), \qquad \tau u_{1,2} = z u_{1,2}.    (9.6)
Moreover, it is nonzero if and only if u_1 and u_2 are linearly independent (compare Theorem 9.1 below).
    If we choose f, g ∈ D(τ) in (9.4), then we can take the limits c → a and d → b, which results in
    \langle g, \tau f\rangle = W_b(g^*, f) - W_a(g^*, f) + \langle \tau g, f\rangle, \qquad f, g \in D(\tau).    (9.7)
Here W_{a,b}(g^*, f) has to be understood as a limit.
    Finally, we recall the following well-known result from ordinary differ-
ential equations.
Theorem 9.1. Suppose rg ∈ L^1_{loc}(I). Then there exists a unique solution f, pf' ∈ AC_{loc}(I) of the differential equation
    (\tau - z)f = g, \qquad z \in \mathbb{C},    (9.8)
satisfying the initial condition
    f(c) = \alpha, \qquad (pf')(c) = \beta, \qquad \alpha, \beta \in \mathbb{C},\ c \in I.    (9.9)
In addition, f is entire with respect to z.


Proof. Introducing
    u = \begin{pmatrix} f \\ pf' \end{pmatrix}, \qquad v = \begin{pmatrix} 0 \\ rg \end{pmatrix},
we can rewrite (9.8) as the linear first order system
    u' - Au = v, \qquad A(x) = \begin{pmatrix} 0 & p(x)^{-1} \\ q(x) - z\, r(x) & 0 \end{pmatrix}.
Integrating with respect to x, we see that this system is equivalent to the Volterra integral equation
    u - Ku = w, \qquad (Ku)(x) = \int_c^x A(y) u(y)\, dy, \qquad w(x) = \begin{pmatrix} \alpha \\ \beta \end{pmatrix} + \int_c^x v(y)\, dy.
We will choose some d ∈ (c, b) and consider the integral operator K in the Banach space C([c,d]). Then for any h ∈ C([c,d]) and x ∈ [c,d] we have the estimate
    |K^n(h)(x)| \le \frac{a_1(x)^n}{n!} \|h\|, \qquad a_1(x) = \int_c^x a(y)\, dy, \qquad a(x) = \|A(x)\|,
which follows by induction:
    |K^{n+1}(h)(x)| = \left| \int_c^x A(y) K^n(h)(y)\, dy \right| \le \int_c^x a(y) |K^n(h)(y)|\, dy
                    \le \|h\| \int_c^x a(y) \frac{a_1(y)^n}{n!}\, dy = \frac{a_1(x)^{n+1}}{(n+1)!} \|h\|.
Hence the unique solution of our integral equation is given by the Neumann series (show this)
    u(x) = \sum_{n=0}^{\infty} K^n(w)(x).
To see that the solution u(x) is entire with respect to z, note that the partial sums are entire (in fact polynomial) in z and hence so is the limit by uniform convergence with respect to z in compact sets. An analogous argument for d ∈ (a, c) finishes the proof.

   Note that f, pf' can be extended continuously to a regular endpoint.
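
The Neumann series from the proof of Theorem 9.1 is easy to implement; the following minimal sketch (an illustration, not part of the original text) iterates the fixed point equation u = w + Ku for p = r = 1, q = 0, z = 1 on [0, 1] with f(0) = 1, (pf')(0) = 0, where the exact solution is f(x) = cos(x). Grid and iteration count are arbitrary choices.

    # Picard/Neumann iteration for the Volterra equation u = w + Ku (p = r = 1, q = 0, z = 1).
    import numpy as np

    x = np.linspace(0, 1, 2001)
    dx = x[1] - x[0]
    q, z = 0.0, 1.0
    A = np.array([[0.0, 1.0], [q - z, 0.0]])           # u' = A u with u = (f, f')

    w = np.tile(np.array([1.0, 0.0]), (len(x), 1))     # initial-condition term (alpha, beta) = (1, 0)
    u = w.copy()
    for _ in range(30):                                # iterate u <- w + K u
        integrand = u @ A.T                            # A(y) u(y) at every grid point
        u = w + np.cumsum(integrand, axis=0) * dx      # crude quadrature of the integral from 0 to x

    print(np.max(np.abs(u[:, 0] - np.cos(x))))         # small; remaining error is quadrature error
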
Lemma 9.2. Suppose u_1, u_2 are two solutions of (τ − z)u = 0 which satisfy W(u_1, u_2) = 1. Then any other solution of (9.8) can be written as (α, β ∈ C)
    f(x) = u_1(x)\left( \alpha + \int_c^x u_2\, g\, r\, dy \right) + u_2(x)\left( \beta - \int_c^x u_1\, g\, r\, dy \right),
    f'(x) = u_1'(x)\left( \alpha + \int_c^x u_2\, g\, r\, dy \right) + u_2'(x)\left( \beta - \int_c^x u_1\, g\, r\, dy \right).    (9.10)
Note that the constants α, β coincide with those from Theorem 9.1 if u_1(c) = (pu_2')(c) = 1 and (pu_1')(c) = u_2(c) = 0.


Proof. It suffices to check τf − zf = g. Differentiating the first equation of (9.10) gives the second. Next we compute
    (pf')' = (pu_1')'\left( \alpha + \int u_2\, g\, r\, dy \right) + (pu_2')'\left( \beta - \int u_1\, g\, r\, dy \right) - W(u_1, u_2)\, g\, r
           = (q - zr)\, u_1\left( \alpha + \int u_2\, g\, r\, dy \right) + (q - zr)\, u_2\left( \beta - \int u_1\, g\, r\, dy \right) - g\, r
           = (q - zr) f - g\, r,
which proves the claim.

      Now we want to obtain a symmetric operator and hence we choose
                  A0 f = τ f,      D(A0 ) = D(τ ) ∩ ACc (I),              (9.11)
where ACc (I) denotes the functions in AC(I) with compact support. This
definition clearly ensures that the Wronskian of two such functions vanishes
on the boundary, implying that A0 is symmetric by virtue of (9.7). Our first
task is to compute the closure of A0 and its adjoint. For this the following
elementary fact will be needed.
Lemma 9.3. Suppose V is a vector space and l, l_1, …, l_n are linear functionals (defined on all of V) such that \bigcap_{j=1}^{n} \mathrm{Ker}(l_j) \subseteq \mathrm{Ker}(l). Then l = \sum_{j=1}^{n} \alpha_j l_j for some constants \alpha_j \in \mathbb{C}.

Proof. First of all it is no restriction to assume that the functionals l_j are linearly independent. Then the map L : V → Cⁿ, f ↦ (l_1(f), …, l_n(f)) is surjective (since x ∈ Ran(L)^⊥ implies \sum_{j=1}^{n} x_j l_j(f) = 0 for all f). Hence there are vectors f_k ∈ V such that l_j(f_k) = 0 for j ≠ k and l_j(f_j) = 1. Then f − \sum_{j=1}^{n} l_j(f) f_j \in \bigcap_{j=1}^{n} \mathrm{Ker}(l_j) and hence l(f) − \sum_{j=1}^{n} l_j(f)\, l(f_j) = 0. Thus we can choose \alpha_j = l(f_j).

      Now we are ready to prove
Lemma 9.4. The operator A_0 is densely defined and its closure is given by
    \overline{A_0} f = \tau f, \qquad D(\overline{A_0}) = \{f \in D(\tau) \mid W_a(f, g) = W_b(f, g) = 0,\ \forall g \in D(\tau)\}.    (9.12)
Its adjoint is given by
    A_0^* f = \tau f, \qquad D(A_0^*) = D(\tau).    (9.13)

Proof. We start by computing A_0^* and ignore the fact that we do not know whether D(A_0) is dense for now.
    By (9.7) we have D(τ) ⊆ D(A_0^*) and it remains to show D(A_0^*) ⊆ D(τ). If h ∈ D(A_0^*), we must have
    \langle h, A_0 f\rangle = \langle k, f\rangle, \qquad \forall f \in D(A_0),


for some k ∈ L²(I, r dx). Using (9.10), we can find an \tilde h such that \tau\tilde h = k and from integration by parts we obtain
    \int_a^b (h(x) - \tilde h(x))^* (\tau f)(x)\, r(x)\, dx = 0, \qquad \forall f \in D(A_0).    (9.14)
Clearly we expect that h − \tilde h will be a solution of τu = 0 and to prove this, we will invoke Lemma 9.3. Therefore we consider the linear functionals
    l(g) = \int_a^b (h(x) - \tilde h(x))^* g(x)\, r(x)\, dx, \qquad l_j(g) = \int_a^b u_j(x)^* g(x)\, r(x)\, dx,
on L^2_c(I, r dx), where u_j are two solutions of τu = 0 with W(u_1, u_2) = 1. Then we have Ker(l_1) ∩ Ker(l_2) ⊆ Ker(l). In fact, if g ∈ Ker(l_1) ∩ Ker(l_2), then
    f(x) = u_1(x) \int_a^x u_2(y) g(y)\, r(y)\, dy + u_2(x) \int_x^b u_1(y) g(y)\, r(y)\, dy
is in D(A_0) and g = τf ∈ Ker(l) by (9.14). Now Lemma 9.3 implies
    \int_a^b (h(x) - \tilde h(x) + \alpha_1 u_1(x) + \alpha_2 u_2(x))^* g(x)\, r(x)\, dx = 0, \qquad \forall g \in L^2_c(I, r\,dx),
and hence h = \tilde h + \alpha_1 u_1 + \alpha_2 u_2 \in D(\tau).
     Now what if D(A_0) were not dense? Then there would be some freedom in the choice of k since we could always add a component in D(A_0)^⊥. So suppose we have two choices k_1 ≠ k_2. Then by the above calculation, there are corresponding functions \tilde h_1 and \tilde h_2 such that h = \tilde h_1 + \alpha_{1,1} u_1 + \alpha_{1,2} u_2 = \tilde h_2 + \alpha_{2,1} u_1 + \alpha_{2,2} u_2. In particular, \tilde h_1 − \tilde h_2 is in the kernel of τ and hence k_1 = \tau\tilde h_1 = \tau\tilde h_2 = k_2, a contradiction to our assumption.
    Next we turn to \overline{A_0}. Denote the set on the right-hand side of (9.12) by D. Then we have D ⊆ D(A_0^{**}) = D(\overline{A_0}) by (9.7). Conversely, since \overline{A_0} ⊆ A_0^*, we can use (9.7) to conclude
    W_a(f, h) + W_b(f, h) = 0, \qquad f \in D(\overline{A_0}),\ h \in D(A_0^*).
Now replace h by a \tilde h ∈ D(A_0^*) which coincides with h near a and vanishes identically near b (Problem 9.1). Then W_a(f, h) = W_a(f, \tilde h) + W_b(f, \tilde h) = 0. Finally, W_b(f, h) = −W_a(f, h) = 0 shows f ∈ D.

Example. If τ is regular at a, then W_a(f, g) = 0 for all g ∈ D(τ) if and only if f(a) = (pf')(a) = 0. This follows since we can prescribe the values of g(a), (pg')(a) for g ∈ D(τ) arbitrarily.

    This result shows that any self-adjoint extension of A_0 must lie between \overline{A_0} and A_0^*. Moreover, self-adjointness seems to be related to the Wronskian of two functions at the boundary. Hence we collect a few properties first.


Lemma 9.5. Suppose v ∈ D(τ) with W_a(v^*, v) = 0 and suppose there is an \hat f ∈ D(τ) with W_a(v^*, \hat f) ≠ 0. Then, for f, g ∈ D(τ), we have
    W_a(v, f) = 0 \quad\Leftrightarrow\quad W_a(v, f^*) = 0    (9.15)
and
    W_a(v, f) = W_a(v, g) = 0 \quad\Rightarrow\quad W_a(g^*, f) = 0.    (9.16)

Proof. For all f_1, …, f_4 ∈ D(τ) we have the Plücker identity
    W_x(f_1, f_2) W_x(f_3, f_4) + W_x(f_1, f_3) W_x(f_4, f_2) + W_x(f_1, f_4) W_x(f_2, f_3) = 0    (9.17)
which remains valid in the limit x → a. Choosing f_1 = v, f_2 = f, f_3 = v^*, f_4 = \hat f, we infer (9.15). Choosing f_1 = f, f_2 = g^*, f_3 = v, f_4 = \hat f, we infer (9.16).
Problem 9.1. Given α, β, γ, δ, show that there is a function f in D(τ) restricted to [c, d] ⊆ (a, b) such that f(c) = α, (pf')(c) = β and f(d) = γ, (pf')(d) = δ. (Hint: Lemma 9.2.)
Problem 9.2. Let A_0 = -\frac{d^2}{dx^2}, D(A_0) = \{f \in H^2[0,1] \mid f(0) = f(1) = 0\} and B = q, D(B) = \{f \in L^2(0,1) \mid qf \in L^2(0,1)\}. Find a q ∈ L¹(0,1) such that D(A_0) ∩ D(B) = \{0\}. (Hint: Problem 0.30.)
Problem 9.3. Let φ ∈ L^1_{loc}(I). Define
    A_\pm = \pm\frac{d}{dx} + \phi, \qquad D(A_\pm) = \{f \in L^2(I) \mid f \in AC(I),\ \pm f' + \phi f \in L^2(I)\}
and A_{0,\pm} = A_\pm|_{AC_c(I)}. Show A_{0,\pm}^* = A_\mp and
    D(\overline{A_{0,\pm}}) = \{f \in D(A_\pm) \mid \lim_{x\to a,b} f(x)g(x) = 0,\ \forall g \in D(A_\mp)\}.
In particular, show that the limits above exist.
Problem 9.4 (Liouville normal form). Show that every Sturm–Liouville equation can be transformed into one with r = p = 1 as follows: Show that the transformation U : L²((a,b), r dx) → L²(0, c), c = \int_a^b \sqrt{r(t)/p(t)}\, dt, defined via u(x) ↦ v(y), where
    y(x) = \int_a^x \sqrt{\frac{r(t)}{p(t)}}\, dt, \qquad v(y) = \sqrt[4]{r(x(y))\, p(x(y))}\; u(x(y)),
is unitary. Moreover, if p, r, p', r' ∈ AC(a, b), then
    -(p u')' + q u = r \lambda u
transforms into
    -v'' + Q v = \lambda v,
9.2. Weyl’s limit circle, limit point alternative                          187


where
    Q = q - \frac{(pr)^{1/4}}{r} \left( p \left( (pr)^{-1/4} \right)' \right)'.

9.2. Weyl’s limit circle, limit point alternative
Inspired by Lemma 9.5, we make the following definition: We call τ limit
circle (l.c.) at a if there is a v ∈ D(τ ) with Wa (v ∗ , v) = 0 such that
Wa (v, f ) = 0 for at least one f ∈ D(τ ). Otherwise τ is called limit point
(l.p.) at a and similarly for b.
Example. If τ is regular at a, it is limit circle at a. Since
                   Wa (v, f ) = (pf )(a)v(a) − (pv )(a)f (a),           (9.18)
any real-valued v with (v(a), (pv )(a)) = (0, 0) works.

    Note that if W_a(f, v) ≠ 0, then W_a(f, Re(v)) ≠ 0 or W_a(f, Im(v)) ≠ 0. Hence it is no restriction to assume that v is real and W_a(v^*, v) = 0 is trivially satisfied in this case. In particular, τ is limit point if and only if W_a(f, g) = 0 for all f, g ∈ D(τ).
Theorem 9.6. If τ is l.c. at a, let v ∈ D(τ) with W_a(v^*, v) = 0 and W_a(v, f) ≠ 0 for some f ∈ D(τ). Similarly, if τ is l.c. at b, let w be an analogous function. Then the operator
    A : D(A) \to L^2(I, r\,dx), \qquad f \mapsto \tau f,    (9.19)
with
    D(A) = \{f \in D(\tau) \mid W_a(v, f) = 0 \text{ if l.c. at } a,\ W_b(w, f) = 0 \text{ if l.c. at } b\}    (9.20)
is self-adjoint. Moreover, the set
    D_1 = \{f \in D(\tau) \mid \exists x_0 \in I : \forall x \in (a, x_0),\ W_x(v, f) = 0,\ \ \exists x_1 \in I : \forall x \in (x_1, b),\ W_x(w, f) = 0\}    (9.21)
is a core for A.

Proof. By Lemma 9.5, A is symmetric and hence A ⊆ A^* ⊆ A_0^*. Let g ∈ D(A^*). As in the computation of \overline{A_0} we conclude W_a(f, g) = W_b(f, g) = 0 for all f ∈ D(A). Moreover, we can choose f such that it coincides with v near a and hence W_a(v, g) = 0. Similarly W_b(w, g) = 0; that is, g ∈ D(A).
    To see that D_1 is a core, let A_1 be the corresponding operator and observe that the argument from above, with A_1 in place of A, shows A_1^* = A.

   The name limit circle, respectively, limit point, stems from the original approach of Weyl, who considered the set of solutions τu = zu, z ∈ C∖R, which satisfy W_x(u^*, u) = 0. They can be shown to lie on a circle which


converges to a circle, respectively, a point, as x → a or x → b (see Prob-
lem 9.9).
     Before proceeding, let us shed some light on the number of possible boundary conditions. Suppose τ is l.c. at a and let u_1, u_2 be two solutions of τu = 0 with W(u_1, u_2) = 1. Abbreviate
    BC^j_x(f) = W_x(u_j, f), \qquad f \in D(\tau).    (9.22)
Let v be as in Theorem 9.6. Then, using Lemma 9.5, it is not hard to see that
    W_a(v, f) = 0 \quad\Leftrightarrow\quad \cos(\alpha)\, BC^1_a(f) - \sin(\alpha)\, BC^2_a(f) = 0,    (9.23)
where \tan(\alpha) = -\frac{BC^1_a(v)}{BC^2_a(v)}. Hence all possible boundary conditions can be parametrized by α ∈ [0, π). If τ is regular at a and if we choose u_1(a) = (pu_2')(a) = 1 and (pu_1')(a) = u_2(a) = 0, then
    BC^1_a(f) = f(a), \qquad BC^2_a(f) = (pf')(a)    (9.24)
and the boundary condition takes the simple form
    \sin(\alpha)\, (pf')(a) - \cos(\alpha)\, f(a) = 0.    (9.25)
The most common choice of α = 0 is known as the Dirichlet boundary condition f(a) = 0. The choice α = π/2 is known as the Neumann boundary condition (pf')(a) = 0.
    Finally, note that if τ is l.c. at both a and b, then Theorem 9.6 does not give all possible self-adjoint extensions. For example, one could also choose
    BC^1_a(f) = e^{i\alpha} BC^1_b(f), \qquad BC^2_a(f) = e^{i\alpha} BC^2_b(f).    (9.26)
The case α = 0 gives rise to periodic boundary conditions in the regular case.
       Next we want to compute the resolvent of A.
Lemma 9.7. Suppose z ∈ ρ(A). Then there exists a solution u_a(z, x) of (τ − z)u = g which is in L²((a, c), r dx) and which satisfies the boundary condition at a if τ is l.c. at a. Similarly, there exists a solution u_b(z, x) with the analogous properties near b.
    The resolvent of A is given by
    (A - z)^{-1} g(x) = \int_a^b G(z, x, y)\, g(y)\, r(y)\, dy,    (9.27)
where
    G(z, x, y) = \frac{1}{W(u_b(z), u_a(z))} \begin{cases} u_b(z, x)\, u_a(z, y), & x \ge y, \\ u_a(z, x)\, u_b(z, y), & x \le y. \end{cases}    (9.28)
Proof. Let g ∈ L²_c(I, r dx) be real-valued and consider f = (A − z)^{-1} g ∈
D(A). Since (τ − z)f = 0 near a, respectively, b, we obtain ua (z, x) by
setting it equal to f near a and using the differential equation to extend it
to the rest of I. Similarly we obtain ub . The only problem is that ua or ub
might be identically zero. Hence we need to show that this can be avoided
by choosing g properly.
    Fix z and let g be supported in (c, d) ⊂ I. Since (τ − z)f = g, we have
      f(x) = u_1(x) (α + ∫_a^x u_2 g r dy) + u_2(x) (β + ∫_x^b u_1 g r dy).          (9.29)
Near a (x < c) we have f(x) = α u_1(x) + β̃ u_2(x) and near b (x > d) we have
f(x) = α̃ u_1(x) + β u_2(x), where α̃ = α + ∫_a^b u_2 g r dy and β̃ = β + ∫_a^b u_1 g r dy.
If f vanishes identically near both a and b, we must have α = β̃ = α̃ = β = 0
and thus α = β = 0 and ∫_a^b u_j(y) g(y) r(y) dy = 0, j = 1, 2. This case can
be avoided by choosing a suitable g and hence there is at least one solution,
say ub (z).
    Now choose u_1 = u_b and consider the behavior near b. If u_2 is not square
integrable on (d, b), we must have β = 0 since β u_2 = f − α̃ u_b is. If u_2 is
square integrable, we can find two functions in D(τ) which coincide with u_b
and u_2 near b. Since W(u_b, u_2) = 1, we see that τ is l.c. at b and hence
0 = W_b(u_b, f) = W_b(u_b, α̃ u_b + β u_2) = β. Thus β = 0 in both cases and we
have
        f(x) = u_b(x) (α + ∫_a^x u_2 g r dy) + u_2(x) ∫_x^b u_b g r dy.
Now choosing g such that ∫_a^b u_b g r dy ≠ 0, we infer the existence of u_a(z).
Choosing u2 = ua and arguing as before, we see α = 0 and hence
        f(x) = u_b(x) ∫_a^x u_a(y) g(y) r(y) dy + u_a(x) ∫_x^b u_b(y) g(y) r(y) dy
             = ∫_a^b G(z, x, y) g(y) r(y) dy

for any g ∈ L²_c(I, r dx). Since this set is dense, the claim follows.

Example. If τ is regular at a with a boundary condition as in the pre-
vious example, we can choose ua (z, x) to be the solution corresponding to
the initial conditions (u_a(z, a), (p u_a′)(z, a)) = (sin(α), cos(α)). In particular,
ua (z, x) exists for all z ∈ C.
    If τ is regular at both a and b, there is a corresponding solution ub (z, x),
again for all z. So the only values of z for which (A − z)−1 does not exist
must be those with W (ub (z), ua (z)) = 0. However, in this case ua (z, x)
and ub (z, x) are linearly dependent and ua (z, x) = γub (z, x) satisfies both
boundary conditions. That is, z is an eigenvalue in this case.
    In particular, regular operators have pure point spectrum. We will see
in Theorem 9.10 below that this holds for any operator which is l.c. at both
endpoints.

    In the previous example ua (z, x) is holomorphic with respect to z and
satisfies ua (z, x)∗ = ua (z ∗ , x) (since it corresponds to real initial conditions
and our differential equation has real coefficients). In general we have:
Lemma 9.8. Suppose z ∈ ρ(A). Then ua (z, x) from the previous lemma
can be chosen locally holomorphic with respect to z such that
                               ua (z, x)∗ = ua (z ∗ , x)                    (9.30)
and similarly for ub (z, x).

Proof. Since this is a local property near a, we can assume b is regular
and choose u_b(z, x) such that (u_b(z, b), (p u_b′)(z, b)) = (sin(β), − cos(β)) as in
the example above. In addition, choose a second solution v_b(z, x) such that
(v_b(z, b), (p v_b′)(z, b)) = (cos(β), sin(β)) and observe W(u_b(z), v_b(z)) = 1. If
z ∈ ρ(A), z is no eigenvalue and hence u_a(z, x) cannot be a multiple of
u_b(z, x). Thus we can set
                        u_a(z, x) = v_b(z, x) + m(z) u_b(z, x)
and it remains to show that m(z) is holomorphic with m(z)∗ = m(z∗).
    Choosing h with compact support in (a, c) and g with support in (c, b),
we have
            ⟨h, (A − z)^{-1} g⟩ = ⟨h, u_a(z)⟩ ⟨g∗, u_b(z)⟩
                               = (⟨h, v_b(z)⟩ + m(z) ⟨h, u_b(z)⟩) ⟨g∗, u_b(z)⟩
(with a slight abuse of notation since u_b, v_b might not be square integrable).
Choosing (real-valued) functions h and g such that ⟨h, u_b(z)⟩ ⟨g∗, u_b(z)⟩ ≠ 0,
we can solve for m(z):
                 m(z) = \frac{⟨h, (A − z)^{-1} g⟩ − ⟨h, v_b(z)⟩ ⟨g∗, u_b(z)⟩}{⟨h, u_b(z)⟩ ⟨g∗, u_b(z)⟩}.
This finishes the proof.
Example. We already know that τ = −d²/dx² on I = (−∞, ∞) gives rise to
the free Schrödinger operator H_0. Furthermore,
                          u_±(z, x) = e^{∓√(−z) x},       z ∈ C,                   (9.31)
are two linearly independent solutions (for z ≠ 0) and since Re(√(−z)) > 0
for z ∈ C\[0, ∞), there is precisely one solution (up to a constant multiple)
which is square integrable near ±∞, namely u± . In particular, the only
choice for ua is u− and for ub is u+ and we get

                         G(z, x, y) = \frac{1}{2√(−z)} e^{−√(−z)|x−y|}                      (9.32)

which we already found in Section 7.4.
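    As a quick numerical sanity check (not part of the text), one can apply the kernel (9.32) to a test function on a grid and verify that the result solves (−d²/dx² − z)f = g up to discretization error; the square root below is the principal branch, so Re(√(−z)) > 0 for z outside [0, ∞).

    import numpy as np

    z = -2.0 + 1.5j                           # any z outside [0, infinity)
    sq = np.sqrt(-z)                          # principal branch, Re(sq) > 0
    x = np.linspace(-12.0, 12.0, 2001)
    h = x[1] - x[0]
    g = np.exp(-x**2)                         # test function

    G = np.exp(-sq * np.abs(x[:, None] - x[None, :])) / (2 * sq)   # kernel (9.32)
    f = G @ g * h                             # f = (H0 - z)^{-1} g by quadrature

    # -f'' - z f should reproduce g away from the ends of the grid
    residual = -(f[2:] - 2 * f[1:-1] + f[:-2]) / h**2 - z * f[1:-1] - g[1:-1]
    print(np.max(np.abs(residual)))           # small (of the order of h^2)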

    If, as in the previous example, there is only one square integrable solu-
tion, there is no choice for G(z, x, y). But since different boundary condi-
tions must give rise to different resolvents, there is no room for boundary
conditions in this case. This indicates a connection between our l.c., l.p.
distinction and square integrability of solutions.

Theorem 9.9 (Weyl alternative). The operator τ is l.c. at a if and only if
for one z0 ∈ C all solutions of (τ − z0 )u = 0 are square integrable near a.
This then holds for all z ∈ C and similarly for b.

Proof. If all solutions are square integrable near a, τ is l.c. at a since the
Wronskian of two linearly independent solutions does not vanish.
    Conversely, take two functions v, ṽ ∈ D(τ) with W_a(v, ṽ) ≠ 0. By con-
sidering real and imaginary parts, it is no restriction to assume that v and
ṽ are real-valued. Thus they give rise to two different self-adjoint operators
A and Ã (choose any fixed w for the other endpoint). Let u_a and ũ_a be the
corresponding solutions from above. Then W(u_a, ũ_a) ≠ 0 (since otherwise
A = Ã by Lemma 9.5) and thus there are two linearly independent solutions
which are square integrable near a. Since any other solution can be written
as a linear combination of those two, every solution is square integrable near
a.
    It remains to show that all solutions of (τ − z)u = 0 for all z ∈ C are
square integrable near a if τ is l.c. at a. In fact, the above argument ensures
this for every z ∈ ρ(A) ∩ ρ(Ã), that is, at least for all z ∈ C\R.
   Suppose (τ − z)u = 0 and choose two linearly independent solutions uj ,
j = 1, 2, of (τ − z0 )u = 0 with W (u1 , u2 ) = 1. Using (τ − z0 )u = (z − z0 )u
and (9.10), we have (a < c < x < b)
   u(x) = α u_1(x) + β u_2(x) + (z − z_0) ∫_c^x (u_1(x) u_2(y) − u_1(y) u_2(x)) u(y) r(y) dy.

Since u_j ∈ L²((c, b), r dx), we can find a constant M ≥ 0 such that
                       ∫_c^b |u_{1,2}(y)|² r(y) dy ≤ M.
Now choose c close to b such that |z − z_0| M² ≤ 1/4. Next, estimating the
integral using Cauchy–Schwarz gives
    |∫_c^x (u_1(x) u_2(y) − u_1(y) u_2(x)) u(y) r(y) dy|²
         ≤ ∫_c^x |u_1(x) u_2(y) − u_1(y) u_2(x)|² r(y) dy ∫_c^x |u(y)|² r(y) dy
         ≤ M (|u_1(x)|² + |u_2(x)|²) ∫_c^x |u(y)|² r(y) dy
and hence
    ∫_c^x |u(y)|² r(y) dy ≤ (|α|² + |β|²) M + 2|z − z_0| M² ∫_c^x |u(y)|² r(y) dy
                          ≤ (|α|² + |β|²) M + (1/2) ∫_c^x |u(y)|² r(y) dy.
Thus
                     ∫_c^x |u(y)|² r(y) dy ≤ 2 (|α|² + |β|²) M
and since u ∈ AC_loc(I), we have u ∈ L²((c, b), r dx) for every c ∈ (a, b).

   Now we turn to the investigation of the spectrum of A. If τ is l.c. at
both endpoints, then the spectrum of A is very simple.
Theorem 9.10. If τ is l.c. at both endpoints, then the resolvent is a Hilbert–
Schmidt operator; that is,
                  ∫_a^b ∫_a^b |G(z, x, y)|² r(y) dy r(x) dx < ∞.                  (9.33)
In particular, the spectrum of any self-adjoint extension is purely discrete
and the eigenfunctions (which are simple) form an orthonormal basis.

Proof. This follows from the estimate
   ∫_a^b ( ∫_a^x |u_b(x) u_a(y)|² r(y) dy + ∫_x^b |u_b(y) u_a(x)|² r(y) dy ) r(x) dx
        ≤ 2 ∫_a^b |u_a(y)|² r(y) dy ∫_a^b |u_b(y)|² r(y) dy,
which shows that the resolvent is Hilbert–Schmidt and hence compact.

    Note that all eigenvalues are simple. If τ is l.p. at one endpoint, this is
clear, since there is at most one solution of (τ − λ)u = 0 which is square
integrable near this endpoint. If τ is l.c., this also follows since the fact that
two solutions of (τ − λ)u = 0 satisfy the same boundary condition implies
that their Wronskian vanishes.
   If τ is not l.c., the situation is more complicated and we can only say
something about the essential spectrum.
Theorem 9.11. All self-adjoint extensions have the same essential spec-
trum. Moreover, if Aac and Acb are self-adjoint extensions of τ restricted to
(a, c) and (c, b) (for any c ∈ I), then
                       σess (A) = σess (Aac ) ∪ σess (Acb ).                                      (9.34)

Proof. Since (τ − i)u = 0 has two linearly independent solutions, the defect
indices are at most two (they are zero if τ is l.p. at both endpoints, one if
τ is l.c. at one and l.p. at the other endpoint, and two if τ is l.c. at both
endpoints). Hence the first claim follows from Theorem 6.20.
     For the second claim restrict τ to the functions with compact support
in (a, c) ∪ (c, b). Then, this operator is the orthogonal sum of the operators
A0,ac and A0,cb . Hence the same is true for the adjoints and hence the defect
indices of A0,ac ⊕ A0,cb are at most four. Now note that A and Aac ⊕ Acb
are both self-adjoint extensions of this operator. Thus the second claim also
follows from Theorem 6.20.

   In particular, this result implies that for the essential spectrum only the
behaviour near the endpoints a and b is relevant.
    Another useful result to determine if q is relatively compact is the fol-
lowing:
Lemma 9.12. Suppose k ∈ L²_loc((a, b), r dx). Then k R_A(z) is Hilbert–
Schmidt if and only if
             ‖k R_A(z)‖²_2 = \frac{1}{Im(z)} ∫_a^b |k(x)|² Im(G(z, x, x)) r(x) dx            (9.35)
is finite.

Proof. From the first resolvent formula we have
      G(z, x, y) − G(z′, x, y) = (z − z′) ∫_a^b G(z, x, t) G(z′, t, y) r(t) dt.
Setting x = y and z′ = z∗, we obtain
             Im(G(z, x, x)) = Im(z) ∫_a^b |G(z, x, t)|² r(t) dt.                  (9.36)
Using this last formula to compute the Hilbert–Schmidt norm proves the
lemma.
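    For the free operator on the line the identity (9.36) can be checked directly; the following sketch (not from the text) does so numerically with the explicit kernel (9.32).

    import numpy as np

    z = 1.0 + 0.5j
    sq = np.sqrt(-z)                                  # Re(sq) > 0
    x0 = 0.3
    t = np.linspace(-60.0, 60.0, 200001)
    G = np.exp(-sq * np.abs(x0 - t)) / (2 * sq)       # G(z, x0, t) from (9.32)

    lhs = (1 / (2 * sq)).imag                         # Im G(z, x0, x0)
    rhs = z.imag * np.sum(np.abs(G)**2) * (t[1] - t[0])
    print(lhs, rhs)                                   # agree up to quadrature error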
Problem 9.5. Compute the spectrum and the resolvent of τ = −d²/dx², I =
(0, ∞) defined on D(A) = {f ∈ D(τ) | f(0) = 0}.
Problem 9.6. Suppose τ is given on (a, ∞), where a is a regular endpoint.
Suppose there are two solutions u_± of τu = zu satisfying r(x)^{1/2} |u_±(x)| ≤
C e^{∓αx} for some C, α > 0. Then z is not in the essential spectrum of any self-
adjoint operator corresponding to τ . (Hint: You can take any self-adjoint
extension, say the one for which ua = u− and ub = u+ . Write down what
you expect the resolvent to be and show that it is a bounded operator by
comparison with the resolvent from the previous problem.)
Problem 9.7. Suppose a is regular and limx→b q(x)/r(x) = ∞. Show that
σess (A) = ∅ for every self-adjoint extension. (Hint: Fix some positive con-
stant n and choose c ∈ (a, b) such that q(x)/r(x) ≥ n in (c, b) and use
Theorem 9.11.)
Problem 9.8 (Approximation by regular operators). Fix functions v, w ∈
D(τ ) as in Theorem 9.6. Pick Im = (cm , dm ) with cm ↓ a, dm ↑ b and define
                        A_m : D(A_m) → L²(I_m, r dx),
                               f      ↦ τ f
where
       D(A_m) = {f ∈ L²(I_m, r dx) | f, pf′ ∈ AC(I_m), τ f ∈ L²(I_m, r dx),
                                     W_{c_m}(v, f) = W_{d_m}(w, f) = 0}.
Then Am converges to A in the strong resolvent sense as m → ∞. (Hint:
Lemma 6.36.)
Problem 9.9 (Weyl circles). Fix z ∈ C\R and c ∈ (a, b). Introduce
                            [u]_x = \frac{W(u, u∗)_x}{z − z∗} ∈ R
and use (9.4) to show that
                [u]_x = [u]_c + ∫_c^x |u(y)|² r(y) dy,        (τ − z)u = 0.
Hence [u]_x is increasing and [u]_b = lim_{x→b} [u]_x exists if and only if u ∈ L²((c, b), r dx).
    Let u_{1,2} be two solutions of (τ − z)u = 0 which satisfy [u_1]_c = [u_2]_c = 0
and W(u_1, u_2) = 1. Then, all (nonzero) solutions u of (τ − z)u = 0 which
satisfy [u]_b = 0 can be written as
                          u = u_2 + m u_1,        m ∈ C,
up to a complex multiple (note [u_1]_x > 0 for x > c).
    Show that
                [u_2 + m u_1]_x = [u_1]_x (|m − M(x)|² − R(x)²),
where
                          M(x) = − \frac{W(u_2, u_1∗)_x}{W(u_1, u_1∗)_x}
and
      R(x)² = (|W(u_2, u_1∗)_x|² + W(u_2, u_2∗)_x W(u_1, u_1∗)_x) (|z − z∗| [u_1]_x)^{−2}
            = (|z − z∗| [u_1]_x)^{−2}.
Hence the numbers m for which [u]_x = 0 lie on a circle which either converges
to a circle (if lim_{x→b} R(x) > 0) or to a point (if lim_{x→b} R(x) = 0) as x → b.
Show that τ is l.c. at b in the first case and l.p. in the second case.


9.3. Spectral transformations I
In this section we want to provide some fundamental tools for investigating
the spectra of Sturm–Liouville operators and, at the same time, give some
nice illustrations of the spectral theorem.
Example. Consider again τ = −d²/dx² on I = (−∞, ∞). From Section 7.2
we know that the Fourier transform maps the associated operator H_0 to the
multiplication operator with p² in L²(R). To get multiplication by λ, as in
the spectral theorem, we set p = √λ and split the Fourier integral into a
positive and negative part, that is,
     (U f)(λ) = ( ∫_R e^{i√λ x} f(x) dx , ∫_R e^{−i√λ x} f(x) dx ),    λ ∈ σ(H_0) = [0, ∞).   (9.37)
Then
                  U : L²(R) → ⊕_{j=1}^{2} L²(R, \frac{χ_{[0,∞)}(λ)}{2√λ} dλ)                  (9.38)
is the spectral transformation whose existence is guaranteed by the spectral
theorem (Lemma 3.4). Note, however, that the measure is not finite. This
can be easily fixed if we replace exp(±i√λ x) by γ(λ) exp(±i√λ x).
    Note that in the previous example the kernel e^{±i√λ x} of the integral trans-
form U is just a pair of linearly independent solutions of the underlying
differential equation (though no eigenfunctions, since they are not square
integrable).
      More generally, if

       U : L2 (I, r dx) → L2 (R, dµ),            f (x) →         u(λ, x)f (x)r(x) dx         (9.39)
                                                             I

is an integral transformation which maps a self-adjoint Sturm–Liouville op-
erator A to multiplication by λ, then its kernel u(λ, x) is a solution of the
underlying differential equation. This formally follows from U Af = λU f
which implies
    0 = ∫_I u(λ, x) (τ − λ)f(x) r(x) dx = ∫_I (τ − λ)u(λ, x) f(x) r(x) dx              (9.40)

and hence (τ − λ)u(λ, .) = 0.

Lemma 9.13. Suppose
                          U : L²(I, r dx) → ⊕_{j=1}^{k} L²(R, dµ_j)                      (9.41)
is a spectral mapping as in Lemma 3.4. Then U is of the form
                          (U f)(λ) = ∫_a^b u(λ, x) f(x) r(x) dx,                         (9.42)
where u(λ, x) = (u_1(λ, x), . . . , u_k(λ, x)) is measurable and for a.e. λ (with
respect to µ_j) each u_j(λ, .) is a solution of τu_j = λu_j which satisfies the
boundary conditions of A (if any). Here the integral has to be understood as
∫_a^b dx = lim_{c↓a, d↑b} ∫_c^d dx with the limit taken in ⊕_j L²(R, dµ_j).
    The inverse is given by
                    (U^{-1} F)(x) = Σ_{j=1}^{k} ∫_R u_j(λ, x)∗ F_j(λ) dµ_j(λ).           (9.43)
Again the integrals have to be understood as ∫_R dµ_j = lim_{R→∞} ∫_{−R}^{R} dµ_j with
limits taken in L²(I, r dx).
    If the spectral measures are ordered, then the solutions uj (λ), 1 ≤ j ≤ l,
are linearly independent for a.e. λ with respect to µl . In particular, for
ordered spectral measures we always have k ≤ 2 and even k = 1 if τ is l.c.
at one endpoint.

Proof. Using U_j R_A(z) = \frac{1}{λ − z} U_j, we have
                    U_j f(λ) = (λ − z) U_j ∫_a^b G(z, x, y) f(y) r(y) dy.
                                                       a

If we restrict RA (z) to a compact interval [c, d] ⊂ (a, b), then RA (z)χ[c,d]
is Hilbert–Schmidt since G(z, x, y)χ[c,d] (y) is square integrable over (a, b) ×
(a, b). Hence Uj χ[c,d] = (λ − z)Uj RA (z)χ[c,d] is Hilbert–Schmidt as well and
by Lemma 6.9 there is a corresponding kernel u_j^{[c,d]}(λ, y) such that
                     (U_j χ_{[c,d]} f)(λ) = ∫_a^b u_j^{[c,d]}(λ, x) f(x) r(x) dx.
                                               a
Now take a larger compact interval [ĉ, d̂] ⊇ [c, d]. Then the kernels coincide
on [c, d], u_j^{[c,d]}(λ, .) = u_j^{[ĉ,d̂]}(λ, .) χ_{[c,d]}, since we have U_j χ_{[c,d]} = U_j χ_{[ĉ,d̂]} χ_{[c,d]}.
In particular, there is a kernel u_j(λ, x) such that
                                     (U_j f)(λ) = ∫_a^b u_j(λ, x) f(x) r(x) dx
for every f with compact support in (a, b). Since functions with compact
support are dense and Uj is continuous, this formula holds for any f provided
the integral is understood as the corresponding limit.
    Using the fact that U is unitary, ⟨F, U g⟩ = ⟨U^{-1}F, g⟩, we see
   Σ_j ∫_R F_j(λ)∗ ( ∫_a^b u_j(λ, x) g(x) r(x) dx ) dµ_j(λ) = ∫_a^b (U^{-1}F)(x)∗ g(x) r(x) dx.

Interchanging integrals on the right-hand side (which is permitted at least
for g, F with compact support), the formula for the inverse follows.
    Next, from Uj Af = λUj f we have
         ∫_a^b u_j(λ, x) (τ f)(x) r(x) dx = λ ∫_a^b u_j(λ, x) f(x) r(x) dx
for a.e. λ and every f ∈ D(A_0). Restricting everything to [c, d] ⊂ (a, b),
the above equation implies u_j(λ, .)|_{[c,d]} ∈ D(A∗_{cd,0}) and A∗_{cd,0} u_j(λ, .)|_{[c,d]} =
λ u_j(λ, .)|_{[c,d]}. In particular, u_j(λ, .) is a solution of τ u_j = λ u_j. Moreover, if
τ is l.c. near a, we can choose c = a and allow all f ∈ D(τ ) which satisfy
the boundary condition at a and vanish identically near b.
    Finally, assume the µj are ordered and fix l ≤ k. Suppose
                         Σ_{j=1}^{l} c_j(λ) u_j(λ, x) = 0.

Then we have
                         Σ_{j=1}^{l} c_j(λ) F_j(λ) = 0,        F_j = U_j f,
for every f . Since U is surjective, we can prescribe Fj arbitrarily on σ(µl ),
e.g., Fj (λ) = 1 for j = j0 and Fj (λ) = 0 otherwise, which shows cj0 (λ) = 0.
Hence the solutions uj (λ, x), 1 ≤ j ≤ l, are linearly independent for λ ∈
σ(µl ) which shows k ≤ 2 since there are at most two linearly independent
solutions. If τ is l.c. and uj (λ, x) must satisfy the boundary condition, there
is only one linearly independent solution and thus k = 1.

     Note that since we can replace uj (λ, x) by γj (λ)uj (λ, x) where |γj (λ)| =
1, it is no restriction to assume that uj (λ, x) is real-valued.
    For simplicity we will only pursue the case where one endpoint, say a,
is regular. The general case can often be reduced to this case and will be
postponed until Section 9.6.
      We choose a boundary condition
                       cos(α) f(a) − sin(α) p(a) f′(a) = 0                         (9.44)
and introduce two solutions s(z, x) and c(z, x) of τu = zu satisfying the
initial conditions
                  s(z, a) = sin(α),     p(a) s′(z, a) = cos(α),
                  c(z, a) = cos(α),     p(a) c′(z, a) = − sin(α).                  (9.45)
Note that s(z, x) is the solution which satisfies the boundary condition at
a; that is, we can choose ua (z, x) = s(z, x). In fact, if τ is not regular
at a but only l.c., everything below remains valid if one chooses s(z, x) to
be a solution satisfying the boundary condition at a and c(z, x) a linearly
independent solution with W (c(z), s(z)) = 1.
    Moreover, in our previous lemma we have u_1(λ, x) = γ_a(λ) s(λ, x) and
using the rescaling dµ(λ) = |γ_a(λ)|² dµ_1(λ) and (U_1 f)(λ) = γ_a(λ)(U f)(λ),
we obtain a unitary map
   U : L²(I, r dx) → L²(R, dµ),      (U f)(λ) = ∫_a^b s(λ, x) f(x) r(x) dx        (9.46)
with inverse
                      (U^{-1} F)(x) = ∫_R s(λ, x) F(λ) dµ(λ).                      (9.47)
Note, however, that while this rescaling gets rid of the unknown factor γa (λ),
it destroys the normalization of the measure µ. For µ1 we know µ1 (R) (if
the corresponding vector is normalized), but µ might not even be bounded!
In fact, it turns out that µ is indeed unbounded.
     So up to this point we have our spectral transformation U which maps A
to multiplication by λ, but we know nothing about the measure µ. Further-
more, the measure µ is the object of desire since it contains all the spectral
information of A. So our next aim must be to compute µ. If A has only
pure point spectrum (i.e., only eigenvalues), this is straightforward as the
following example shows.
Example. Suppose E ∈ σ_p(A) is an eigenvalue. Then s(E, x) is the cor-
responding eigenfunction and the same is true for S_E(λ) = (U s(E))(λ). In
particular, χ_{{E}}(A) s(E, x) = s(E, x) shows S_E(λ) = (U χ_{{E}}(A) s(E))(λ) =
χ_{{E}}(λ) S_E(λ); that is,
              S_E(λ) = \begin{cases} ‖s(E)‖², & λ = E, \\ 0, & λ ≠ E. \end{cases}            (9.48)
Moreover, since U is unitary, we have
   ‖s(E)‖² = ∫_a^b s(E, x)² r(x) dx = ∫_R S_E(λ)² dµ(λ) = ‖s(E)‖⁴ µ({E});          (9.49)
that is, µ({E}) = ‖s(E)‖^{−2}. In particular, if A has pure point spectrum
(e.g., if τ is limit circle at both endpoints), we have
        dµ(λ) = Σ_{j=1}^{∞} \frac{1}{‖s(E_j)‖²} dΘ(λ − E_j),      σ_p(A) = {E_j}_{j=1}^{∞},   (9.50)

where dΘ is the Dirac measure centered at 0. For arbitrary A, the above
formula holds at least for the pure point part µpp .
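    As a concrete illustration (not in the text; it assumes p = r = 1, q = 0 and Dirichlet conditions on (0, π)), one has s(z, x) = sin(√z x)/√z, eigenvalues E_j = j², and (9.49) predicts µ({E_j}) = ‖s(E_j)‖^{−2} = 2j²/π, which the following sketch confirms numerically.

    import numpy as np

    x = np.linspace(0.0, np.pi, 20001)
    dx = x[1] - x[0]
    for j in (1, 2, 3):
        E = j**2
        s = np.sin(np.sqrt(E) * x) / np.sqrt(E)      # solution with s(0)=0, s'(0)=1
        norm2 = np.sum(s**2) * dx                    # ||s(E)||^2 (simple Riemann sum)
        print(j, 1.0 / norm2, 2 * j**2 / np.pi)      # mu({E_j}) vs. 2 j^2 / pi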

    In the general case we have to work a bit harder. Since c(z, x) and s(z, x)
are linearly independent solutions,
                                                 W (c(z), s(z)) = 1,                               (9.51)
we can write u_b(z, x) = γ_b(z)(c(z, x) + m_b(z) s(z, x)), where
     m_b(z) = \frac{cos(α) p(a) u_b′(z, a) + sin(α) u_b(z, a)}{cos(α) u_b(z, a) − sin(α) p(a) u_b′(z, a)},     z ∈ ρ(A),     (9.52)
is known as the Weyl–Titchmarsh m-function. Note that mb (z) is holo-
morphic in ρ(A) and that
                                                     mb (z)∗ = mb (z ∗ )                           (9.53)
since the same is true for ub (z, x) (the denominator in (9.52) only vanishes if
ub (z, x) satisfies the boundary condition at a, that is, if z is an eigenvalue).
Moreover, the constant γb (z) is of no importance and can be chosen equal
to one,
                        ub (z, x) = c(z, x) + mb (z)s(z, x).              (9.54)
Lemma 9.14. The Weyl m-function is a Herglotz function and satisfies
                     Im(m_b(z)) = Im(z) ∫_a^b |u_b(z, x)|² r(x) dx,                  (9.55)
where ub (z, x) is normalized as in (9.54).

Proof. Given two solutions u(x), v(x) of τu = zu, τv = ẑv, respectively, it
is straightforward to check
            (ẑ − z) ∫_a^x u(y) v(y) r(y) dy = W_x(u, v) − W_a(u, v)
(clearly it is true for x = a; now differentiate with respect to x). Now choose
u(x) = u_b(z, x) and v(x) = u_b(z, x)∗ = u_b(z∗, x),
      −2 Im(z) ∫_a^x |u_b(z, y)|² r(y) dy = W_x(u_b(z), u_b(z)∗) − 2 Im(m_b(z)),
and observe that W_x(u_b, u_b∗) vanishes as x ↑ b, since both u_b and u_b∗ are in
D(τ) near b.
Lemma 9.15. Let
          G(z, x, y) = \begin{cases} s(z, x) u_b(z, y), & y ≥ x, \\ s(z, y) u_b(z, x), & y ≤ x, \end{cases}                (9.56)
be the Green function of A. Then
   (U G(z, x, .))(λ) = \frac{s(λ, x)}{λ − z}    and    (U p(x) ∂_x G(z, x, .))(λ) = \frac{p(x) s′(λ, x)}{λ − z}        (9.57)
for every x ∈ (a, b) and every z ∈ ρ(A).

Proof. First of all note that G(z, x, .) ∈ L²((a, b), r dx) for every x ∈ (a, b)
and z ∈ ρ(A). Moreover, from R_A(z) f = U^{-1} \frac{1}{λ − z} U f we have

              ∫_a^b G(z, x, y) f(y) r(y) dy = ∫_R \frac{s(λ, x) F(λ)}{λ − z} dµ(λ),              (9.58)
where F = U f . Here equality is to be understood in L2 , that is, for a.e.
x. However, the left-hand side is continuous with respect to x and so is the
right-hand side, at least if F has compact support. Since this set is dense,
the first equality follows. Similarly, the second follows after differentiating
(9.58) with respect to x.
Corollary 9.16. We have
                              (U u_b(z))(λ) = \frac{1}{λ − z},                          (9.59)
where ub (z, x) is normalized as in (9.54).

Proof. Choosing x = a in the lemma, we obtain the claim from the first
identity if sin(α) ≠ 0 and from the second if cos(α) ≠ 0.

    Now combining Lemma 9.14 and Corollary 9.16, we infer from unitarity
of U that
   Im(m_b(z)) = Im(z) ∫_a^b |u_b(z, x)|² r(x) dx = Im(z) ∫_R \frac{1}{|λ − z|²} dµ(λ)        (9.60)
and since a holomorphic function is determined up to a real constant by its
imaginary part, we obtain
Theorem 9.17. The Weyl m-function is given by
              m_b(z) = d + ∫_R ( \frac{1}{λ − z} − \frac{λ}{1 + λ²} ) dµ(λ),      d ∈ R,      (9.61)
and
         d = Re(m_b(i)),        ∫_R \frac{1}{1 + λ²} dµ(λ) = Im(m_b(i)) < ∞.           (9.62)
Moreover, µ is given by the Stieltjes inversion formula
              µ(λ) = lim_{δ↓0} lim_{ε↓0} \frac{1}{π} ∫_δ^{λ+δ} Im(m_b(λ + iε)) dλ,     (9.63)
where
                 Im(m_b(λ + iε)) = ε ∫_a^b |u_b(λ + iε, x)|² r(x) dx.                  (9.64)

Proof. Choosing z = i in (9.60) shows (9.62) and hence the right-hand side
of (9.61) is a well-defined holomorphic function in C\R. By
                   Im( \frac{1}{λ − z} − \frac{λ}{1 + λ²} ) = \frac{Im(z)}{|λ − z|²}
its imaginary part coincides with that of m_b(z) and hence equality follows.
The Stieltjes inversion formula follows as in the case where the measure is
bounded.
Example. Consider τ = −d²/dx² on I = (0, ∞). Then
                 c(z, x) = cos(α) cos(√z x) − \frac{sin(α)}{√z} sin(√z x)                    (9.65)
and
                 s(z, x) = sin(α) cos(√z x) + \frac{cos(α)}{√z} sin(√z x).                   (9.66)
Moreover,
                             u_b(z, x) = u_b(z, 0) e^{−√(−z) x}                              (9.67)
and thus
                           m_b(z) = \frac{sin(α) − √(−z) cos(α)}{cos(α) + √(−z) sin(α)},     (9.68)
respectively,
                      dµ(λ) = \frac{√λ}{π (cos(α)² + λ sin(α)²)} dλ.                         (9.69)
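    For α = 0 the m-function (9.68) is m_b(z) = −√(−z), and the Stieltjes inversion formula (9.63) then recovers exactly the density √λ/π from (9.69). A minimal numerical sketch (not from the text):

    import numpy as np

    def mb(z):                    # Dirichlet m-function (9.68) with alpha = 0
        return -np.sqrt(-z)

    lam = np.array([0.5, 1.0, 4.0, 9.0])
    for eps in (1e-2, 1e-4, 1e-6):
        print(eps, mb(lam + 1j * eps).imag / np.pi)   # (1/pi) Im m_b(lam + i eps)
    print(np.sqrt(lam) / np.pi)                       # limiting density sqrt(lam)/pi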

    Note that if α ≠ 0, we even have ∫ \frac{1}{|λ − z|} dµ(λ) < ∞ in the previous
example and hence
                      m_b(z) = − cot(α) + ∫_R \frac{1}{λ − z} dµ(λ)                  (9.70)
in this case (the factor − cot(α) follows by considering the limit |z| → ∞
of both sides). Formally this even follows in the general case by choosing
x = a in u_b(z, x) = (U^{-1} \frac{1}{λ − z})(x); however, since we know equality only for
a.e. x, a more careful analysis is needed. We will address this problem in
the next section.

Problem 9.10. Show
                   m_{b,α}(z) = \frac{cos(α − β) m_{b,β}(z) + sin(α − β)}{cos(α − β) − sin(α − β) m_{b,β}(z)}.         (9.71)

(Hint: The case β = 0 is (9.52).)

Problem 9.11. Let φ0 (x), θ0 (x) be two real-valued solutions of τ u = λ0 u
for some fixed λ0 ∈ R such that W (θ0 , φ0 ) = 1. We will call τ quasi-regular
at a if the limits

                     lim_{x→a} W_x(φ_0, u(z)),       lim_{x→a} W_x(θ_0, u(z))         (9.72)

exist for every solution u(z) of τ u = zu. Show that this definition is inde-
pendent of λ_0 (Hint: Plücker's identity). Show that τ is quasi-regular at a
if it is l.c. at a.
      Introduce

              φ(z, x) = Wa (c(z), φ0 )s(z, x) − Wa (s(z), φ0 )c(z, x),
              θ(z, x) = Wa (s(z), θ0 )c(z, x) − Wa (c(z), θ0 )s(z, x),   (9.73)

where c(z, x) and s(z, x) are chosen with respect to some base point c ∈ (a, b),
and a singular Weyl m-function Mb (z) such that

                  ψ(z, x) = θ(z, x) + Mb (z)φ(z, x) ∈ L2 (c, b).         (9.74)

Show that all claims from this section still hold true in this case for the
operator associated with the boundary condition Wa (φ0 , f ) = 0 if τ is l.c. at
a.


9.4. Inverse spectral theory
In this section we want to show that the Weyl m-function (respectively,
the corresponding spectral measure) uniquely determines the operator. For
simplicity we only consider the case p = r ≡ 1.
    We begin with some asymptotics for large z away from the spectrum.
We recall that √z always denotes the branch with arg(z) ∈ (−π, π]. We will
write c(z, x) = cα (z, x) and s(z, x) = sα (z, x) to display the dependence on
α whenever necessary.
      We first observe (Problem 9.12)
Lemma 9.18. For α = 0 we have
      c_0(z, x) = cosh(√(−z)(x − a)) + O(\frac{1}{√(−z)} e^{√(−z)(x − a)}),
      s_0(z, x) = \frac{1}{√(−z)} sinh(√(−z)(x − a)) + O(\frac{1}{z} e^{√(−z)(x − a)}),        (9.75)
uniformly for x ∈ (a, c) as |z| → ∞.

    Note that for z ∈ C\[0, ∞) this can be written as
                   c_0(z, x) = \frac{1}{2} e^{√(−z)(x − a)} (1 + O(\frac{1}{√(−z)})),
                   s_0(z, x) = \frac{1}{2√(−z)} e^{√(−z)(x − a)} (1 + O(\frac{1}{z})),         (9.76)
for Im(z) → ∞ and for z = λ ∈ [0, ∞) we have
                 c_0(λ, x) = cos(√λ (x − a)) + O(\frac{1}{√λ}),
                 s_0(λ, x) = \frac{1}{√λ} sin(√λ (x − a)) + O(\frac{1}{λ}),                    (9.77)
as λ → ∞.
   From this lemma we obtain
Lemma 9.19. The Weyl m-function satisfies
          m_b(z) = \begin{cases} − cot(α) + O(\frac{1}{√(−z)}), & α ≠ 0, \\ −√(−z) + O(1), & α = 0, \end{cases}        (9.78)
as z → ∞ in any sector | Re(z)| ≤ C | Im(z)|.

Proof. As in the proof of Theorem 9.17 we obtain from Lemma 9.15
           G(z, x, x) = d(x) + ∫_R ( \frac{1}{λ − z} − \frac{λ}{1 + λ²} ) s(λ, x)² dµ(λ).
Hence, since the integrand converges pointwise to 0, dominated convergence
(Problem 9.13) implies G(z, x, x) = o(z) as z → ∞ in any sector | Re(z)| ≤
C | Im(z)|. Now solving G(z, x, x) = s(z, x) u_b(z, x) for m_b(z) and using the
asymptotic expansions from Lemma 9.18, we see
                    m_b(z) = − \frac{c(z, x)}{s(z, x)} + o(z e^{−2√(−z)(x − a)})
from which the claim follows.

    Note that assuming q ∈ C k ([a, b)), one can obtain further asymptotic
terms in Lemma 9.18 and hence also in the expansion of mb (z).
   The asymptotics of mb (z) in turn tell us more about L2 (R, dµ).
Lemma 9.20. Let
                 F(z) = d + ∫_R ( \frac{1}{λ − z} − \frac{λ}{1 + λ²} ) dµ(λ)
be a Herglotz function. Then, for any 0 < γ < 2, we have
         ∫_R \frac{dµ(λ)}{1 + |λ|^γ} < ∞        ⇐⇒        ∫_1^∞ \frac{Im(F(iy))}{y^γ} dy < ∞.        (9.79)

Proof. First of all note that we can split F(z) = F_1(z) + F_2(z) according
to dµ = χ_{[−1,1]} dµ + (1 − χ_{[−1,1]}) dµ. The part F_1(z) corresponds to a finite
measure and does not contribute by Theorem 3.20. Hence we can assume
that µ is not supported near 0. Then Fubini shows
   ∫_0^∞ \frac{Im(F(iy))}{y^γ} dy = ∫_0^∞ ∫_R \frac{y^{1−γ}}{λ² + y²} dµ(λ) dy = \frac{π/2}{sin(γπ/2)} ∫_R \frac{1}{|λ|^γ} dµ(λ),
which proves the claim. Here we have used (Problem 9.14)
                          ∫_0^∞ \frac{y^{1−γ}}{λ² + y²} dy = \frac{π/2}{|λ|^γ sin(γπ/2)}.



   For the case γ = 0 see Theorem 3.20 and for the case γ = 2 see Prob-
lem 9.15.
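    As a check of the lemma (this example is not in the original text), take the Dirichlet m-function m_b(z) = −√(−z) from (9.68) with α = 0, whose measure is dµ(λ) = (√λ/π) χ_{[0,∞)}(λ) dλ by (9.69). Then
      ∫_R \frac{dµ(λ)}{1 + |λ|^γ} = \frac{1}{π} ∫_0^∞ \frac{√λ}{1 + λ^γ} dλ < ∞   ⇐⇒   γ > 3/2,
and since Im(m_b(iy)) = Im(−√(−iy)) = √(y/2),
      ∫_1^∞ \frac{Im(m_b(iy))}{y^γ} dy = \frac{1}{√2} ∫_1^∞ y^{1/2−γ} dy < ∞   ⇐⇒   γ > 3/2,
so both conditions in (9.79) fail or hold together, as they must for γ ∈ (0, 2).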

Corollary 9.21. We have
                    G(z, x, y) = ∫_R \frac{s(λ, x) s(λ, y)}{λ − z} dµ(λ),                       (9.80)
where the integrand is integrable. Moreover, for any ε > 0 we have
                    G(z, x, y) = O(z^{−1/2+ε} e^{−√(−z)|y−x|}),                                 (9.81)
as z → ∞ in any sector | Re(z)| ≤ C | Im(z)|.

Proof. The previous lemma implies ∫ s(λ, x)² (1 + |λ|)^{−γ} dµ(λ) < ∞ for γ > 1/2.
This already proves the first part and also the second in the case x = y, and
hence the result follows from |λ − z|^{−1} ≤ const Im(z)^{−1/2+ε} (1 + λ²)^{−1/4−ε/2}
(Problem 9.13) in any sector | Re(z)| ≤ C | Im(z)|. But the case x = y implies
                             u_b(z, x) = O(z^{−1/2+ε} e^{−√(−z)(a−x)}),
which in turn implies the x ≠ y case.

      Now we come to our main result of this section:
Theorem 9.22. Suppose τ_j, j = 0, 1, are given on (a, b) and both are regular
at a. Moreover, A_j are some self-adjoint operators associated with τ_j and
the same boundary condition at a.
   Let c ∈ (a, b). Then q_0(x) = q_1(x) for x ∈ (a, c) if and only if for every
ε > 0 we have that m_{1,b}(z) − m_{0,b}(z) = O(e^{−2(c−a−ε) Re(√(−z))}) as z → ∞ along
some nonreal ray.

Proof. By (9.75) we have s_1(z, x)/s_0(z, x) → 1 as z → ∞ along any nonreal
ray. Moreover, (9.81) in the case y = x shows s_0(z, x) u_{1,b}(z, x) → 0 and
s_1(z, x) u_{0,b}(z, x) → 0 as well. In particular, the same is true for the difference
   s_1(z, x) c_0(z, x) − s_0(z, x) c_1(z, x) + (m_{1,b}(z) − m_{0,b}(z)) s_0(z, x) s_1(z, x).
Since the first two terms cancel for x ∈ (a, c), (9.75) implies m_{1,b}(z) −
m_{0,b}(z) = O(e^{−2(c−a−ε) Re(√(−z))}).
    To see the converse, first note that the entire function
   s_1(z, x) c_0(z, x) − s_0(z, x) c_1(z, x) = s_1(z, x) u_{0,b}(z, x) − s_0(z, x) u_{1,b}(z, x)
                                              − (m_{1,b}(z) − m_{0,b}(z)) s_0(z, x) s_1(z, x)
vanishes as z → ∞ along any nonreal ray for fixed x ∈ (a, c) by the same
arguments used before together with the assumption on m_{1,b}(z) − m_{0,b}(z).
Moreover, by (9.75) this function has an order of growth ≤ 1/2 and thus
by the Phragmén–Lindelöf theorem (e.g., [53, Thm. 4.3.4]) is bounded on
all of C. By Liouville's theorem it must be constant and since it vanishes
along rays, it must be zero; that is, s_1(z, x) c_0(z, x) = s_0(z, x) c_1(z, x) for all
z ∈ C and x ∈ (a, c). Differentiating this identity with respect to x and us-
ing W(c_j(z), s_j(z)) = 1 shows s_1(z, x)² = s_0(z, x)². Taking the logarithmic
derivative further gives s_1′(z, x)/s_1(z, x) = s_0′(z, x)/s_0(z, x) and differentiat-
ing once more shows s_1″(z, x)/s_1(z, x) = s_0″(z, x)/s_0(z, x). This finishes the
proof since q_j(x) = z + s_j″(z, x)/s_j(z, x).

Problem 9.12. Prove Lemma 9.18. (Hint: Without loss set a = 0. Now
use that
          c(z, x) = cos(α) cosh(√(−z) x) − \frac{sin(α)}{√(−z)} sinh(√(−z) x)
                        − \frac{1}{√(−z)} ∫_0^x sinh(√(−z)(x − y)) q(y) c(z, y) dy
by Lemma 9.2 and consider c̃(z, x) = e^{−√(−z) x} c(z, x).)


Problem 9.13. Show
                                1      2     |z|
                                   ≤√                                              (9.82)
                               λ−z    1+λ 2 Im(z)
206                                         9. One-dimensional Schr¨dinger operators
                                                                   o


and
                         1   λ        2 (1 + |z|)|z|
                           −   2
                                 ≤                                                  (9.83)
                        λ−z 1+λ    1 + λ2 Im(z)
for any λ ∈ R. (Hint: To obtain the first, search for the maximum as
a function of λ (cf. also Problem 3.7). The second then follows from the
first.)
Problem 9.14. Show
                    ∫_0^∞ \frac{y^{1−γ}}{1 + y²} dy = \frac{π/2}{sin(γπ/2)},        γ ∈ (0, 2),
by proving
                    ∫_{−∞}^{∞} \frac{e^{αx}}{1 + e^x} dx = \frac{π}{sin(απ)},        α ∈ (0, 1).
(Hint: To compute the last integral, use a contour consisting of the straight
lines connecting the points −R, R, R + 2πi, −R + 2πi. Evaluate the contour
integral using the residue theorem and let R → ∞. Show that the contribu-
tions from the vertical lines vanish in the limit and relate the integrals along
the horizontal lines.)
Problem 9.15. In Lemma 9.20 we assumed 0 < γ < 2. Show that in the
case γ = 2 we have
        ∫_R \frac{log(1 + λ²)}{1 + λ²} dµ(λ) < ∞        ⇐⇒        ∫_1^∞ \frac{Im(F(iy))}{y²} dy < ∞.
(Hint: ∫_1^∞ \frac{y^{−1}}{λ² + y²} dy = \frac{log(1 + λ²)}{2λ²}.)


9.5. Absolutely continuous spectrum
In this section we will show how to locate the absolutely continuous spec-
trum. We will again assume that a is a regular endpoint. Moreover, we
assume that b is l.p. since otherwise the spectrum is discrete and there will
be no absolutely continuous spectrum.
    In this case we have seen in Section 9.3 that A is unitarily equivalent
to multiplication by λ in the space L²(R, dµ), where µ is the measure asso-
ciated to the Weyl m-function. Hence by Theorem 3.23 we conclude that
the set
                   M_s = {λ | lim sup_{ε↓0} Im(m_b(λ + iε)) = ∞}                   (9.84)
is a support for the singularly continuous part and
                   M_ac = {λ | 0 < lim sup_{ε↓0} Im(m_b(λ + iε)) < ∞}              (9.85)
is a minimal support for the absolutely continuous part. Moreover, σ(A_ac)
can be recovered from the essential closure of M_ac; that is,
                                 σ(A_ac) = \overline{M_ac}^{ess}.                            (9.86)
Compare also Section 3.2.
    We now begin our investigation with a crucial estimate on Im(m_b(λ + iε)).
Set
                     ‖f‖²_{(a,x)} = ∫_a^x |f(y)|² r(y) dy,        x ∈ (a, b).                (9.87)

Lemma 9.23. Let
                          ε = (2 ‖s(λ)‖_{(a,x)} ‖c(λ)‖_{(a,x)})^{−1}                         (9.88)
and note that since b is l.p., there is a one-to-one correspondence between
ε ∈ (0, ∞) and x ∈ (a, b). Then
               5 − √24 ≤ |m_b(λ + iε)| \frac{‖s(λ)‖_{(a,x)}}{‖c(λ)‖_{(a,x)}} ≤ 5 + √24.      (9.89)

Proof. Let x > a. Then by Lemma 9.2
   u_b(λ + iε, x) = c(λ, x) − m_b(λ + iε) s(λ, x)
                        − iε ∫_a^x (c(λ, x) s(λ, y) − c(λ, y) s(λ, x)) u_b(λ + iε, y) r(y) dy.
Hence one obtains after a little calculation (as in the proof of Theorem 9.9)
   ‖c(λ) − m_b(λ + iε) s(λ)‖_{(a,x)} ≤ ‖u_b(λ + iε)‖_{(a,x)}
                        + 2ε ‖s(λ)‖_{(a,x)} ‖c(λ)‖_{(a,x)} ‖u_b(λ + iε)‖_{(a,x)}.
Using the definition of ε and (9.55), we obtain
   ‖c(λ) − m_b(λ + iε) s(λ)‖²_{(a,x)} ≤ 4 ‖u_b(λ + iε)‖²_{(a,x)} ≤ 4 ‖u_b(λ + iε)‖²_{(a,b)}
                        = \frac{4}{ε} Im(m_b(λ + iε)) ≤ 8 ‖s(λ)‖_{(a,x)} ‖c(λ)‖_{(a,x)} Im(m_b(λ + iε)).
Combining this estimate with
   ‖c(λ) − m_b(λ + iε) s(λ)‖²_{(a,x)} ≥ (‖c(λ)‖_{(a,x)} − |m_b(λ + iε)| ‖s(λ)‖_{(a,x)})²
shows (1 − t)² ≤ 8t, where t = |m_b(λ + iε)| ‖s(λ)‖_{(a,x)} ‖c(λ)‖^{−1}_{(a,x)}.

    We now introduce the concept of subordinacy. A nonzero solution u of
τu = zu is called sequentially subordinate at b with respect to another
solution v if
                     lim inf_{x→b} \frac{‖u‖_{(a,x)}}{‖v‖_{(a,x)}} = 0.                    (9.90)

If the lim inf can be replaced by a lim, the solution is called subordinate.
Both concepts will eventually lead to the same results (cf. Remark 9.26
below). We will work with (9.90) since this will simplify proofs later on and
hence we will drop the additional sequentially.
    It is easy to see that if u is subordinate with respect to v, then it is
subordinate with respect to any linearly independent solution. In particular,
a subordinate solution is unique up to a constant. Moreover, if a solution
u of τ u = λu, λ ∈ R, is subordinate, then it is real up to a constant, since
both the real and the imaginary parts are subordinate. For z ∈ C\R we
know that there is always a subordinate solution near b, namely u_b(z, x).
The following result considers the case z ∈ R.
Lemma 9.24. Let λ ∈ R. There is a subordinate solution u(λ) near b if
and only if there is a sequence εn ↓ 0 such that mb (λ + iεn ) converges to a
limit in R ∪ {∞} as n → ∞. Moreover,
          lim_{n→∞} m_b(λ + iε_n) = (cos(α)p(a)u'(λ, a) + sin(α)u(λ, a)) / (cos(α)u(λ, a) − sin(α)p(a)u'(λ, a))     (9.91)
in this case (compare (9.52)).

Proof. We will consider the number α fixing the boundary condition as a
parameter and write sα (z, x), cα (z, x), mb,α , etc., to emphasize the depen-
dence on α.
    Every solution can (up to a constant) be written as s_β(λ, x) for some
β ∈ [0, π). But by Lemma 9.23, s_β(λ, x) is subordinate if and only if there is
a sequence ε_n ↓ 0 such that lim_{n→∞} m_{b,β}(λ + iε_n) = ∞, and by (9.71) this is
the case if and only if
   lim_{n→∞} m_{b,α}(λ+iε_n) = lim_{n→∞} (cos(α − β)m_{b,β}(λ + iε_n) + sin(α − β)) / (cos(α − β) − sin(α − β)m_{b,β}(λ + iε_n)) = cot(α − β)
is a number in R ∪ {∞}.

    We are interested in N (τ ), the set of all λ ∈ R for which no subordinate
solution exists, that is,
       N (τ ) = {λ ∈ R|No solution of τ u = λu is subordinate at b}        (9.92)
and the set
                   S_α(τ) = {λ | s_α(λ, x) is subordinate at b}.             (9.93)
      From the previous lemma we obtain
Corollary 9.25. We have λ ∈ N (τ ) if and only if
         lim inf_{ε↓0} Im(m_b(λ + iε)) > 0    and    lim sup_{ε↓0} |m_b(λ + iε)| < ∞.

Similarly, λ ∈ Sα (τ ) if and only if lim supε↓0 |mb (λ + iε)| = ∞.


Remark 9.26. Since the set, for which the limit limε↓0 mb (λ + iε) does not
exist, is of zero spectral and Lebesgue measure (Corollary 3.25), changing
the lim in (9.90) to a lim inf will affect N (τ ) only on such a set (which
is irrelevant for our purpose). Moreover, by (9.71) the set where the limit
exists (finitely or infinitely) is independent of the boundary condition α.

   Then, as a consequence of the previous corollary, we have
Theorem 9.27. The set N (τ ) ⊆ Mac is a minimal support for the absolutely
continuous spectrum of H. In particular,
                             σ_ac(H) = \overline{N(τ)}^{ess}.                    (9.94)
Moreover, the set Sα (τ ) ⊇ Ms is a minimal support for the singular spectrum
of H.

Proof. By our corollary we have N(τ) ⊆ M_ac. Moreover, if λ ∈ M_ac \ N(τ),
then either 0 = lim inf Im(m_b) < lim sup Im(m_b) or lim sup Re(m_b) = ∞.
The first case can only happen on a set of Lebesgue measure zero by Theo-
rem 3.23 and the same is true for the second by Corollary 3.25.
    Similarly, by our corollary we also have S_α(τ) ⊇ M_s and λ ∈ S_α(τ) \ M_s
happens precisely when lim sup Re(m_b) = ∞, which can only happen on a
set of Lebesgue measure zero by Corollary 3.25.

    Note that if (λ1 , λ2 ) ⊆ N (τ ), then the spectrum of any self-adjoint
extension H of τ is purely absolutely continuous in the interval (λ1 , λ2 ).
Example. Consider H_0 = −d²/dx² on (0, ∞) with a Dirichlet boundary con-
dition at x = 0. Then it is easy to check H_0 ≥ 0 and N(τ_0) = (0, ∞). Hence
σ_ac(H_0) = [0, ∞). Moreover, since the singular spectrum is supported on
[0, ∞) \ N(τ_0) = {0}, we see σ_sc(H_0) = ∅ (since the singular continuous spec-
trum cannot be supported on a finite set) and σ_pp(H_0) ⊆ {0}. Since 0 is not an
eigenvalue, we have σ_pp(H_0) = ∅.
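This dichotomy is easy to observe numerically. The following sketch (Python with NumPy; the energies λ = ±1, the endpoints x, and the grid are arbitrary illustrative choices) compares the norms ‖u‖_(0,x) of the two fundamental solutions of −u'' = λu: for λ = 1 the ratio stays of order one (no subordinate solution, so λ ∈ N(τ_0)), while for λ = −1 it tends to zero (the decaying exponential is subordinate).

import numpy as np

def norm_0x(u, dt):
    # ||u||_(0,x) = (integral_0^x |u(t)|^2 dt)^(1/2), via a Riemann sum on the grid
    return np.sqrt(np.sum(np.abs(u)**2)*dt)

for lam in [1.0, -1.0]:
    for x_end in [5.0, 10.0, 20.0]:
        t = np.linspace(0.0, x_end, 200001)
        dt = t[1] - t[0]
        if lam > 0:
            k = np.sqrt(lam)
            u, v = np.sin(k*t)/k, np.cos(k*t)     # s(lam,.) and c(lam,.)
        else:
            k = np.sqrt(-lam)
            u, v = np.exp(-k*t), np.exp(k*t)      # decaying vs. growing solution
        print(f"lambda={lam:+.0f}, x={x_end:5.1f}: ratio of norms = {norm_0x(u,dt)/norm_0x(v,dt):.3e}")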

Problem 9.16. Determine the spectrum of H_0 = −d²/dx² on (0, ∞) with a
general boundary condition (9.44) at a = 0.

9.6. Spectral transformations II
In Section 9.3 we have looked at the case of one regular endpoint. In this
section we want to remove this restriction. In the case of a regular endpoint
(or more generally an l.c. endpoint), the choice of u(λ, x) in Lemma 9.13 was
dictated by the fact that u(λ, x) is required to satisfy the boundary condition
at the regular (l.c.) endpoint. We begin by showing that in the general case
we can choose any pair of linearly independent solutions. We will choose


some arbitrary point c ∈ I and two linearly independent solutions according
to the initial conditions
   c(z, c) = 1,    p(c)c'(z, c) = 0,    s(z, c) = 0,    p(c)s'(z, c) = 1.   (9.95)
We will abbreviate
                         s(z, x) = ( c(z, x) )
                                   ( s(z, x) )                              (9.96)
(a two-component column vector).
Lemma 9.28. There is a measure dµ(λ) and a nonnegative matrix R(λ) with
trace one such that
                U : L²(I, r dx) → L²(R, R dµ)
                        f(x)    ↦ ∫_a^b s(λ, x)f(x)r(x) dx                   (9.97)
is a spectral mapping as in Lemma 9.13. As before, the integral has to be
understood as ∫_a^b dx = lim_{c↓a, d↑b} ∫_c^d dx with the limit taken in L²(R, R dµ), where
L²(R, R dµ) is the Hilbert space of all C²-valued measurable functions with
scalar product
                        ⟨f, g⟩ = ∫_R f* R g dµ.                               (9.98)
The inverse is given by

                  (U^{-1} F)(x) = ∫_R s(λ, x) R(λ) F(λ) dµ(λ).                (9.99)

Proof. Let U0 be a spectral transformation as in Lemma 9.13 with corre-
sponding real solutions uj (λ, x) and measures dµj (x), 1 ≤ j ≤ k. Without
loss of generality we can assume k = 2 since we can always choose dµ2 = 0
and u2 (λ, x) such that u1 and u2 are linearly independent.
      Now define the 2 × 2 matrix C(λ) via
                  ( u_1(λ, x) )          ( c(λ, x) )
                  ( u_2(λ, x) )  = C(λ)  ( s(λ, x) )
and note that C(λ) is nonsingular since u_1, u_2 as well as s, c are linearly
independent.
    Set dµ̃ = dµ_1 + dµ_2. Then dµ_j = r̃_j dµ̃ and we can introduce
                  R̃ = C* ( r̃_1  0  ) C.
                          ( 0    r̃_2 )
By construction R̃ is a (symmetric) nonnegative matrix. Moreover, since C(λ)
is nonsingular, tr(R̃) is positive a.e. with respect to µ̃. Thus we can set
R = tr(R̃)^{-1} R̃ and dµ = tr(R̃) dµ̃.
      This matrix gives rise to an operator
          C : L²(R, R dµ) → ⊕_j L²(R, dµ_j),    F(λ) ↦ C(λ)F(λ),

which, by our choice of R dµ, is norm preserving. By CU = U0 it is onto
and hence it is unitary (this also shows that L2 (R, R dµ) is a Hilbert space,
i.e., complete).


    It is left as an exercise to check that C maps multiplication by λ in
L²(R, R dµ) to multiplication by λ in ⊕_j L²(R, dµ_j) and to verify the formula
for U^{-1}.

    Clearly the matrix-valued measure R dµ contains all the spectral in-
formation of A. Hence it remains to relate it to the resolvent of A as in
Section 9.3.
    For our base point x = c there are corresponding Weyl m-functions
ma (z) and mb (z) such that
 u_a(z, x) = c(z, x) − m_a(z)s(z, x),        u_b(z, x) = c(z, x) + m_b(z)s(z, x). (9.100)
The different sign in front of ma (z) is introduced such that ma (z) will again
be a Herglotz function. In fact, this follows using reflection at c, x − c →
−(x − c), which will interchange the roles of ma (z) and mb (z). In particular,
all considerations from Section 9.3 hold for ma (z) as well.
    Furthermore, we will introduce the Weyl M-matrix
   M(z) = 1/(m_a(z) + m_b(z)) (        −1              (m_a(z) − m_b(z))/2 )
                              ( (m_a(z) − m_b(z))/2       m_a(z)m_b(z)    ).
                                                                        (9.101)
Note det(M(z)) = −1/4. Since
   m_a(z) = −p(c)u_a'(z, c)/u_a(z, c)   and   m_b(z) = p(c)u_b'(z, c)/u_b(z, c),      (9.102)
it follows that W(u_a(z), u_b(z)) = m_a(z) + m_b(z) and
   M(z) = lim_{x,y→c} (           G(z, x, y)                 (p(x)∂_x + p(y)∂_y)G(z, x, y)/2 )
                      ( (p(x)∂_x + p(y)∂_y)G(z, x, y)/2         p(x)∂_x p(y)∂_y G(z, x, y)  ),
                                                                        (9.103)
where G(z, x, y) is the Green function of A. The limit is necessary since
∂_x G(z, x, y) has different limits as y → x from y < x, respectively, y > x.
    We begin by showing

Lemma 9.29. Let U be the spectral mapping from the previous lemma.
Then
                 (U G(z, x, .))(λ) = (1/(λ − z)) s(λ, x),
         (U p(x)∂_x G(z, x, .))(λ) = (1/(λ − z)) p(x)s'(λ, x)                (9.104)
for every x ∈ (a, b) and every z ∈ ρ(A).


Proof. First of all note that G(z, x, .) ∈ L²((a, b), r dx) for every x ∈ (a, b)
and z ∈ ρ(A). Moreover, from R_A(z)f = U^{-1} (1/(λ − z)) U f we have

         ∫_a^b G(z, x, y)f(y)r(y)dy = ∫_R (1/(λ − z)) s(λ, x) R(λ) F(λ) dµ(λ),
where F = U f . Now proceed as in the proof of Lemma 9.15.

      With the aid of this lemma we can now show
Theorem 9.30. The Weyl M-matrix is given by
          M(z) = D + ∫_R ( 1/(λ − z) − λ/(1 + λ²) ) R(λ)dµ(λ),     D_jk ∈ R,   (9.105)
and
          D = Re(M(i)),        ∫_R (1/(1 + λ²)) R(λ)dµ(λ) = Im(M(i)),          (9.106)
where
   Re(M(z)) = (1/2)( M(z) + M*(z) ),        Im(M(z)) = (1/(2i))( M(z) − M*(z) ). (9.107)
Proof. By the previous lemma we have
                ∫_a^b |G(z, c, y)|² r(y)dy = ∫_R (1/|z − λ|²) R_11(λ)dµ(λ).
Moreover, by (9.28), (9.55), and (9.100) we infer
   ∫_a^b |G(z, c, y)|² r(y)dy = (1/|W(u_a, u_b)|²) ( |u_b(z, c)|² ∫_a^c |u_a(z, y)|² r(y)dy
                                  + |u_a(z, c)|² ∫_c^b |u_b(z, y)|² r(y)dy ) = Im(M_11(z))/Im(z).
Similarly we obtain
                ∫_R (1/|z − λ|²) R_22(λ)dµ(λ) = Im(M_22(z))/Im(z)
and
                ∫_R (1/|z − λ|²) R_12(λ)dµ(λ) = Im(M_12(z))/Im(z).
Hence the result follows as in the proof of Theorem 9.17.

       Now we are also able to extend Theorem 9.27. Note that by
   tr(M(z)) = M_11(z) + M_22(z) = d + ∫_R ( 1/(λ − z) − λ/(1 + λ²) ) dµ(λ)     (9.108)
(with d = tr(D) ∈ R) we have that the set
            M_s = {λ | lim sup_{ε↓0} Im(tr(M(λ + iε))) = ∞}                    (9.109)


is a support for the singularly continuous part and
            M_ac = {λ | 0 < lim sup_{ε↓0} Im(tr(M(λ + iε))) < ∞}               (9.110)

is a minimal support for the absolutely continuous part.
Theorem 9.31. The set N_a(τ) ∪ N_b(τ) ⊆ M_ac is a minimal support for the
absolutely continuous spectrum of H. In particular,
                     σ_ac(H) = \overline{N_a(τ) ∪ N_b(τ)}^{ess}.               (9.111)
Moreover, the set
                     ∪_{α∈[0,π)} ( S_{a,α}(τ) ∩ S_{b,α}(τ) ) ⊇ M_s             (9.112)
is a support for the singular spectrum of H.

Proof. By Corollary 9.25 we have 0 < lim inf Im(m_a) and lim sup |m_a| < ∞
if and only if λ ∈ N_a(τ) and similarly for m_b.
    Now suppose λ ∈ N_a(τ). Then lim sup |M_11| < ∞ since lim sup |M_11| =
∞ is impossible by 0 = lim inf |M_11|^{-1} = lim inf |m_a + m_b| ≥ lim inf Im(m_a) >
0. Similarly lim sup |M_22| < ∞. Moreover, if lim sup |m_b| < ∞, we also have
   lim inf Im(M_11) = lim inf Im(m_a + m_b)/|m_a + m_b|²
                    ≥ lim inf Im(m_a)/(lim sup |m_a|² + lim sup |m_b|²) > 0
and if lim sup |m_b| = ∞, we have
   lim inf Im(M_22) = lim inf Im( m_a/(1 + m_a/m_b) ) ≥ lim inf Im(m_a) > 0.
Thus N_a(τ) ⊆ M_ac and similarly N_b(τ) ⊆ M_ac.
    Conversely, let λ ∈ M_ac. By Corollary 3.25 we can assume that the
limits lim m_a and lim m_b both exist and are finite after disregarding a set of
Lebesgue measure zero. For such λ, lim Im(M_11) and lim Im(M_22) both exist
and are finite. Moreover, either lim Im(M_11) > 0, in which case lim Im(m_a +
m_b) > 0, or lim Im(M_11) = 0, in which case
   0 < lim Im(M_22) = lim ( |m_a|² Im(m_b) + |m_b|² Im(m_a) ) / |m_a + m_b|² = 0
yields a contradiction. Thus λ ∈ N_a(τ) ∪ N_b(τ) and the first part is proven.
    To prove the second part, let λ ∈ M_s. If lim sup Im(M_11) = ∞, we have
lim sup |M_11| = ∞ and thus lim inf |m_a + m_b| = 0. But this implies that
there is some subsequence such that lim m_b = − lim m_a = cot(α) ∈ R ∪ {∞}.
Similarly, if lim sup Im(M_22) = ∞, we have lim inf |m_a^{-1} + m_b^{-1}| = 0 and there
is some subsequence such that lim m_b^{-1} = − lim m_a^{-1} = tan(α) ∈ R ∪ {∞}.
This shows M_s ⊆ ∪_α ( S_{a,α}(τ) ∩ S_{b,α}(τ) ).


Problem 9.17. Show that
   R(λ)dµ_ac(λ) = 1/(π |m_a(λ) + m_b(λ)|²) ×
       ( Im(m_a(λ) + m_b(λ))        Im(m_a(λ)m_b(λ)*)                           )
       ( Im(m_a(λ)m_b(λ)*)          |m_a(λ)|² Im(m_b(λ)) + |m_b(λ)|² Im(m_a(λ)) ) dλ,
where m_a(λ) = lim_{ε↓0} m_a(λ + iε) and similarly for m_b(λ).
       Moreover, show that the choice of solutions
              ( u_b(λ, x) )          ( c(λ, x) )
              ( u_a(λ, x) )  = V(λ)  ( s(λ, x) ),
where
              V(λ) = 1/(m_a(λ) + m_b(λ)) ( 1    m_b(λ)  )
                                         ( 1   −m_a(λ) ),
diagonalizes the absolutely continuous part,
       V^{-1}(λ)* R(λ) V(λ)^{-1} dµ_ac(λ) = (1/π) ( Im(m_a(λ))       0         ) dλ.
                                                  (      0       Im(m_b(λ))   )

9.7. The spectra of one-dimensional Schrödinger operators
In this section we want to look at the case of one-dimensional Schrödinger
operators; that is, r = p = 1 on (a, b) = (−∞, ∞).
       Recall that
                 H_0 = −d²/dx²,        D(H_0) = H²(R),                       (9.113)
is self-adjoint and
                 q_{H_0}(f) = ‖f'‖²,        Q(H_0) = H¹(R).                  (9.114)
Hence we can try to apply the results from Chapter 6. We begin with a
simple estimate:
Lemma 9.32. Suppose f ∈ H¹(0, 1). Then
     sup_{x∈[0,1]} |f(x)|² ≤ ε ∫_0^1 |f'(x)|² dx + (1 + 1/ε) ∫_0^1 |f(x)|² dx   (9.115)

for every ε > 0.

Proof. First note that
   |f(x)|² = |f(c)|² + 2 ∫_c^x Re(f(t)* f'(t))dt ≤ |f(c)|² + 2 ∫_0^1 |f(t)f'(t)|dt
           ≤ |f(c)|² + ε ∫_0^1 |f'(t)|² dt + (1/ε) ∫_0^1 |f(t)|² dt
for any c ∈ [0, 1]. But by the mean value theorem there is a c ∈ (0, 1) such
that |f(c)|² = ∫_0^1 |f(t)|² dt.
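A concrete test function makes the estimate (9.115) tangible. The snippet below (Python/NumPy; the function f(x) = sin(3πx) + 1/2 and the grid are arbitrary illustrative choices) evaluates both sides for several values of ε.

import numpy as np

x = np.linspace(0.0, 1.0, 100001)
dx = x[1] - x[0]
f  = np.sin(3*np.pi*x) + 0.5            # an arbitrary test function in H^1(0,1)
fp = 3*np.pi*np.cos(3*np.pi*x)          # its derivative
int_fp2 = np.sum(fp**2)*dx              # integral of |f'|^2
int_f2  = np.sum(f**2)*dx               # integral of |f|^2
sup_f2  = np.max(f**2)
for eps in [0.01, 0.1, 1.0, 10.0]:
    print(f"eps={eps:5.2f}: sup|f|^2 = {sup_f2:.3f} <= {eps*int_fp2 + (1 + 1/eps)*int_f2:.3f}")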


   As a consequence we obtain
Lemma 9.33. Suppose q ∈ L²_loc(R) and
                       sup_{n∈Z} ∫_n^{n+1} |q(x)|² dx < ∞.                    (9.116)
Then q is relatively bounded with respect to H_0 with bound zero.
    Similarly, if q ∈ L¹_loc(R) and
                       sup_{n∈Z} ∫_n^{n+1} |q(x)| dx < ∞,                     (9.117)
then q is relatively form bounded with respect to H_0 with bound zero.
Proof. Let Q ∈ L²_loc(R) and abbreviate M = sup_{n∈Z} ∫_n^{n+1} |Q(x)|² dx.
Using the previous lemma, we have for f ∈ H¹(R) that
   ‖Qf‖² ≤ Σ_{n∈Z} ∫_n^{n+1} |Q(x)f(x)|² dx ≤ M Σ_{n∈Z} sup_{x∈[n,n+1]} |f(x)|²
         ≤ M Σ_{n∈Z} ( ε ∫_n^{n+1} |f'(x)|² dx + (1 + 1/ε) ∫_n^{n+1} |f(x)|² dx )
         = M ( ε ‖f'‖² + (1 + 1/ε) ‖f‖² ).
Choosing Q = |q|^{1/2}, this already proves the form case since ‖f'‖² = q_{H_0}(f).
Choosing Q = q and observing q_{H_0}(f) = ⟨f, H_0 f⟩ ≤ ‖H_0 f‖ ‖f‖ for f ∈
H²(R) shows the operator case.

    Hence in both cases H0 + q is a well-defined (semi-bounded) operator
defined as operator sum on D(H0 + q) = D(H0 ) = H 2 (R) in the first case
and as form sum on Q(H0 + q) = Q(H0 ) = H 1 (R) in the second case. Note
also that the first case implies the second one since by Cauchy–Schwarz we
have
                   ∫_n^{n+1} |q(x)| dx ≤ ( ∫_n^{n+1} |q(x)|² dx )^{1/2}.       (9.118)
This is not too surprising since we already know how to turn H_0 + q into
a self-adjoint operator without imposing any conditions on q (except for
q ∈ L¹_loc(R)) at all. However, we get at least a simple description of the (form)
domains and by requiring a bit more, we can even compute the essential
spectrum of the perturbed operator.
Lemma 9.34. Suppose q ∈ L1 (R). Then the resolvent difference of H0 and
H0 + q is trace class.
Proof. Using G_0(z, x, x) = 1/(2√(−z)), Lemma 9.12 implies that |q|^{1/2} R_{H_0}(z)
is Hilbert–Schmidt and hence the result follows from Lemma 6.29.


Lemma 9.35. Suppose q ∈ L¹_loc(R) and
                       lim_{|n|→∞} ∫_n^{n+1} |q(x)| dx = 0.                    (9.119)

Then RH0 +q (z) − RH0 (z) is compact and hence σess (H0 + q) = σess (H0 ) =
[0, ∞).

Proof. By Weyl's theorem it suffices to show that the resolvent difference is
compact. Let q_n(x) = q(x)χ_{R\[−n,n]}(x). Then R_{H_0+q}(z) − R_{H_0+q_n}(z) is trace
class, which can be shown as in the previous theorem since q − q_n has compact
support (no information on the corresponding diagonal Green's function is
needed since by continuity it is bounded on every compact set). Moreover,
by the proof of Lemma 9.33, q_n is form bounded with respect to H_0 with
constants a = M_n and b = 2M_n, where M_n = sup_{|m|≥n} ∫_m^{m+1} |q(x)| dx.
Hence by Theorem 6.25 we see
     R_{H_0+q_n}(−λ) = R_{H_0}(−λ)^{1/2} (1 − C_{q_n}(λ))^{-1} R_{H_0}(−λ)^{1/2},       λ > 2,
with ‖C_{q_n}(λ)‖ ≤ M_n. So we conclude

   R_{H_0+q_n}(−λ) − R_{H_0}(−λ) = −R_{H_0}(−λ)^{1/2} C_{q_n}(λ)(1 − C_{q_n}(λ))^{-1} R_{H_0}(−λ)^{1/2},
λ > 2, which implies that the sequence of compact operators R_{H_0+q}(−λ) −
R_{H_0+q_n}(−λ) converges to R_{H_0+q}(−λ) − R_{H_0}(−λ) in norm, which implies
that the limit is also compact and finishes the proof.

      Using Lemma 6.23, respectively, Corollary 6.27, we even obtain
Corollary 9.36. Let q = q1 + q2 where q1 and q2 satisfy the assumptions of
Lemma 9.33 and Lemma 9.35, respectively. Then H0 + q1 + q2 is self-adjoint
and σess (H0 + q1 + q2 ) = σess (H0 + q1 ).

    This result applies for example in the case where q2 is a decaying per-
turbation of a periodic potential q1 .
      Finally we turn to the absolutely continuous spectrum.
Lemma 9.37. Suppose q = q_1 + q_2, where q_1 ∈ L¹(0, ∞) and q_2 ∈ AC[0, ∞)
with q_2' ∈ L¹(0, ∞) and lim_{x→∞} q_2(x) = 0. Then there are two solutions
u_±(λ, x) of τu = λu, λ > 0, of the form
   u_±(λ, x) = (1 + o(1))u_{0,±}(λ, x),     u_±'(λ, x) = (1 + o(1))u_{0,±}'(λ, x)   (9.120)
as x → ∞, where
                 u_{0,±}(λ, x) = exp( ±i ∫_0^x √(λ − q_2(y)) dy ).                  (9.121)


Proof. We will omit the dependence on λ for notational simplicity. More-
over, we will choose x so large that W_x(u_−, u_+) = 2i√(λ − q_2(x)) ≠ 0. Write
   u(x) = U_0(x)a(x),    U_0(x) = ( u_{0,+}(x)    u_{0,−}(x)  ),    a(x) = ( a_+(x) )
                                  ( u_{0,+}'(x)   u_{0,−}'(x) )            ( a_−(x) ).
Then
   u'(x) = (    0       1 ) u(x) − (         0                       0            ) a(x) + U_0(x)a'(x),
           ( q(x) − λ   0 )        ( q_+(x)u_{0,+}(x)      q_−(x)u_{0,−}(x)       )
where
                 q_±(x) = q_1(x) ± i q_2'(x)/(2√(λ − q_2(x))).
Hence u(x) will solve τu = λu if
   a'(x) = 1/W_x(u_−, u_+) (        q_+(x)                q_−(x)u_{0,−}(x)² ) a(x).
                           ( −q_+(x)u_{0,+}(x)²             −q_−(x)         )
Since the coefficient matrix of this linear system is integrable, the claim
follows by a simple application of Gronwall's inequality.
Theorem 9.38 (Weidmann). Let q_1 and q_2 be as in the previous lemma
and suppose q = q_1 + q_2 satisfies the assumptions of Lemma 9.35. Let
H = H_0 + q_1 + q_2. Then σ_ac(H) = [0, ∞), σ_sc(H) = ∅, and σ_p(H) ⊆ (−∞, 0].

Proof. By the previous lemma there is no subordinate solution for λ > 0 on
(0, ∞) and hence 0 < Im(m_b(λ+i0)) < ∞. Similarly, there is no subordinate
solution on (−∞, 0) and hence 0 < Im(m_a(λ + i0)) < ∞. Thus the same is true
for the diagonal entries M_jj(z) of the Weyl M-matrix, 0 < Im(M_jj(λ +
i0)) < ∞, and hence dµ is purely absolutely continuous on (0, ∞). Since
σ_ess(H) = [0, ∞), we conclude σ_ac(H) = [0, ∞) and σ_sc(H) ⊆ {0}. Since
the singular continuous part cannot live on a single point, we are done.
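The mechanism behind the proof, namely that for λ > 0 every solution keeps oscillating with an asymptotically constant amplitude (so none is subordinate), can be seen numerically. The sketch below (Python with SciPy; the potential q(x) = (1+x)^{-2} ∈ L¹(0, ∞), i.e. q_1 = q and q_2 = 0, and the integration range are illustrative assumptions only) monitors λu² + u'², which settles down to a constant.

import numpy as np
from scipy.integrate import solve_ivp

lam = 1.0
q = lambda x: 1.0/(1.0 + x)**2          # q = q_1 in L^1(0,oo), q_2 = 0

def rhs(x, y):                          # y = (u, u'); -u'' + q u = lam u
    return [y[1], (q(x) - lam)*y[0]]

sol = solve_ivp(rhs, (0.0, 400.0), [0.0, 1.0], rtol=1e-10, atol=1e-12, dense_output=True)
for x in [50.0, 100.0, 200.0, 400.0]:
    u, up = sol.sol(x)
    print(f"x = {x:5.0f}:  lam*u^2 + u'^2 = {lam*u**2 + up**2:.6f}")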

    Note that the same results hold for operators on [0, ∞) rather than R.
Moreover, observe that the conditions from Lemma 9.37 are only imposed
near +∞ but not near −∞. The conditions from Lemma 9.35 are only used
to ensure that there is no essential spectrum in (−∞, 0).
    Having dealt with the essential spectrum, let us next look at the discrete
spectrum. In the case of decaying potentials, as in the previous theorem,
one key question is whether the number of eigenvalues below the essential spectrum
is finite or not.
   As preparation, we shall prove Sturm’s comparison theorem:


Theorem 9.39 (Sturm). Let τ_0, τ_1 be associated with q_0 ≥ q_1 on (a, b),
respectively. Let (c, d) ⊆ (a, b) and τ_0 u = 0, τ_1 v = 0. Suppose at each end
of (c, d) either W_x(u, v) = 0 or, if c, d ∈ (a, b), u = 0. Then v is either a
multiple of u in (c, d) or v must vanish at some point in (c, d).

Proof. By decreasing d to the first zero of u in (c, d] (and perhaps flipping
signs), we can suppose u > 0 on (c, d). If v has no zeros in (c, d), we can
suppose v > 0 on (c, d), again by perhaps flipping signs. At each endpoint,
W(u, v) vanishes or else u = 0, v > 0, and u'(c) > 0 (or u'(d) < 0). Thus,
W_c(u, v) ≤ 0, W_d(u, v) ≥ 0. But this is inconsistent with
          W_d(u, v) − W_c(u, v) = ∫_c^d (q_1(t) − q_0(t)) u(t)v(t) dt ≤ 0,     (9.122)
unless both sides vanish.

    In particular, choosing q0 = q − λ0 and q1 = q − λ1 , this result holds for
solutions of τ u = λ0 u and τ v = λ1 v.
      Now we can prove
Theorem 9.40. Suppose q satisfies (9.117) such that H is semi-bounded
and Q(H) = H¹(R). Let λ_0 < λ_1 < · · · < λ_n < · · · be its eigenvalues below
the essential spectrum and ψ_0, ψ_1, . . . , ψ_n, . . . the corresponding eigenfunctions.
Then ψ_n has n zeros.

Proof. We first prove that ψ_n has at least n zeros and then that if ψ_n has
m zeros, then (−∞, λ_n] contains at least m + 1 eigenvalues. If ψ_n has m zeros
at x_1, x_2, . . . , x_m and we let x_0 = a, x_{m+1} = b, then by Theorem 9.39, ψ_{n+1}
must have at least one zero in each of (x_0, x_1), (x_1, x_2), . . . , (x_m, x_{m+1}); that
is, ψ_{n+1} has at least m + 1 zeros. It follows by induction that ψ_n has at least
n zeros.
       On the other hand, if ψ_n has m zeros x_1, . . . , x_m, define
          η_j(x) = { ψ_n(x),   x_j ≤ x ≤ x_{j+1},         j = 0, . . . , m,   (9.123)
                   { 0,        otherwise,
where we set x_0 = −∞ and x_{m+1} = ∞. Then η_j is in the form domain
of H and satisfies ⟨η_j, Hη_j⟩ = λ_n ‖η_j‖². Hence if η = Σ_{j=0}^m c_j η_j, then
⟨η, Hη⟩ = λ_n ‖η‖² and it follows by Theorem 4.12 (i) that there are at least
m + 1 eigenvalues in (−∞, λ_n].

    Note that by Theorem 9.39, the zeros of ψ_{n+1} interlace the zeros of ψ_n.
The second part of the proof also shows
Corollary 9.41. Let H be as in the previous theorem. If the Weyl solution
u_±(λ, .) has m zeros, then dim Ran P_{(−∞,λ)}(H) ≥ m. In particular, λ below
the spectrum of H implies that u_±(λ, .) has no zeros.
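Theorem 9.40 is easy to watch in action for a concrete potential. The following finite-difference sketch (Python with SciPy; the potential q(x) = x², the interval [−10, 10] with Dirichlet conditions, and the grid are illustrative assumptions, not part of the theorem) computes the first few eigenfunctions and counts their sign changes; the n-th one has n interior zeros.

import numpy as np
from scipy.linalg import eigh_tridiagonal

# crude finite-difference model of H = -d^2/dx^2 + x^2 on [-10,10], Dirichlet b.c.
N, L = 4000, 10.0
x = np.linspace(-L, L, N + 2)[1:-1]
h = x[1] - x[0]
d = 2.0/h**2 + x**2                     # diagonal entries
e = -np.ones(N - 1)/h**2                # off-diagonal entries
w, v = eigh_tridiagonal(d, e, select='i', select_range=(0, 4))
for n in range(5):
    zeros = np.count_nonzero(v[:-1, n]*v[1:, n] < 0)     # sign changes of the eigenvector
    print(f"n = {n}: eigenvalue ~ {w[n]:.4f}, interior zeros = {zeros}")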


     The equation (τ − λ)u = 0 is called oscillating if one solution has an infinite
number of zeros. Theorem 9.39 implies that this is then true for all solu-
tions. By our previous considerations this is the case if and only if σ(H) has
infinitely many points below λ. Hence it remains to find a good oscillation
criterion.
Theorem 9.42 (Kneser). Consider q on (0, ∞). Then
     lim inf_{x→∞} x²q(x) > −1/4   implies nonoscillation of τ near ∞          (9.124)
and
     lim sup_{x→∞} x²q(x) < −1/4   implies oscillation of τ near ∞.            (9.125)

Proof. The key idea is that the equation
                       τ_0 = −d²/dx² + µ/x²
is of Euler type. Hence it is explicitly solvable with a fundamental system
given by
                       x^(1/2 ± √(µ + 1/4)).
There are two cases to distinguish. If µ ≥ −1/4, all solutions are nonoscil-
latory. If µ < −1/4, one has to take real/imaginary parts and all solutions
are oscillatory. Hence a straightforward application of Sturm's comparison
theorem between τ_0 and τ yields the result.
Corollary 9.43. Suppose q satisfies (9.117). Then H has finitely many
eigenvalues below the infimum of the essential spectrum, 0, if
                       lim inf_{|x|→∞} x²q(x) > −1/4                            (9.126)
and infinitely many if
                       lim sup_{|x|→∞} x²q(x) < −1/4.                           (9.127)
Problem 9.18. Show that if q is relatively bounded with respect to H_0, then
necessarily q ∈ L²_loc(R) and (9.116) holds. Similarly, if q is relatively form
bounded with respect to H_0, then necessarily q ∈ L¹_loc(R) and (9.117) holds.
Problem 9.19. Suppose q ∈ L¹(R) and consider H = −d²/dx² + q. Show
that inf σ(H) ≤ ∫_R q(x)dx. In particular, there is at least one eigenvalue
below the essential spectrum if ∫_R q(x)dx < 0. (Hint: Let ϕ ∈ C_c^∞(R) with
ϕ(x) = 1 for |x| ≤ 1 and investigate q_H(ϕ_n), where ϕ_n(x) = ϕ(x/n).)
Chapter 10




One-particle Schrödinger operators


10.1. Self-adjointness and spectrum
Our next goal is to apply these results to Schrödinger operators. The Hamil-
tonian of one particle in d dimensions is given by
                                        H = H0 + V,                       (10.1)
where V : Rd → R is the potential energy of the particle. We are mainly
interested in the case 1 ≤ d ≤ 3 and want to find classes of potentials which
are relatively bounded, respectively, relatively compact. To do this, we need
a better understanding of the functions in the domain of H0 .
Lemma 10.1. Suppose n ≤ 3 and ψ ∈ H²(R^n). Then ψ ∈ C_∞(R^n) and for
any a > 0 there is a b > 0 such that
                       ‖ψ‖_∞ ≤ a ‖H_0 ψ‖ + b ‖ψ‖.                              (10.2)

Proof. The important observation is that (p² + γ²)^{-1} ∈ L²(R^n) if n ≤ 3.
Hence, since (p² + γ²)ψ̂ ∈ L²(R^n), the Cauchy–Schwarz inequality
           ‖ψ̂‖_1 = ‖(p² + γ²)^{-1} (p² + γ²)ψ̂(p)‖_1
                 ≤ ‖(p² + γ²)^{-1}‖ ‖(p² + γ²)ψ̂(p)‖
shows ψ̂ ∈ L¹(R^n). But now everything follows from the Riemann–Lebesgue
lemma, that is,
       ‖ψ‖_∞ ≤ (2π)^{-n/2} ‖(p² + γ²)^{-1}‖ ( ‖p²ψ̂(p)‖ + γ² ‖ψ̂(p)‖ )
             = (γ/(2π))^{n/2} ‖(p² + 1)^{-1}‖ ( γ^{-2} ‖H_0 ψ‖ + ‖ψ‖ ),
which finishes the proof.



      Now we come to our first result.
Theorem 10.2. Let V be real-valued and V ∈ L∞ (Rn ) if n  3 and V ∈
                                                     ∞
L∞ (Rn ) + L2 (Rn ) if n ≤ 3. Then V is relatively compact with respect to H0 .
  ∞
In particular,
                       H = H0 + V,     D(H) = H 2 (Rn ),                 (10.3)
is self-adjoint, bounded from below and
                               σess (H) = [0, ∞).                          (10.4)
           ∞
Moreover, Cc (Rn ) is a core for H.

Proof. Our previous lemma shows D(H_0) ⊆ D(V). Moreover, invoking
Lemma 7.11 with f(p) = (p² − z)^{-1} and g(x) = V(x) (note that f ∈
L^∞_∞(R^n) ∩ L²(R^n) for n ≤ 3) shows that V is relatively compact. Since
C_c^∞(R^n) is a core for H_0 by Lemma 7.9, the same is true for H by the
Kato–Rellich theorem.

    Observe that since C_c^∞(R^n) ⊆ D(H_0), we must have V ∈ L²_loc(R^n) if
D(H_0) ⊆ D(V).

10.2. The hydrogen atom
We begin with the simple model of a single electron in R3 moving in the
external potential V generated by a nucleus (which is assumed to be fixed
at the origin). If one takes only the electrostatic force into account, then
V is given by the Coulomb potential and the corresponding Hamiltonian is
given by
                 H^(1) = −∆ − γ/|x|,        D(H^(1)) = H²(R³).                 (10.5)
If the potential is attracting, that is, if γ > 0, then it describes the hydrogen
atom and is probably the most famous model in quantum mechanics.
       We have chosen as domain D(H^(1)) = D(H_0) ∩ D(1/|x|) = D(H_0) and by
Theorem 10.2 we conclude that H (1) is self-adjoint. Moreover, Theorem 10.2
also tells us
                          σess (H (1) ) = [0, ∞)                     (10.6)
and that H^(1) is bounded from below,
                       E_0 = inf σ(H^(1)) > −∞.                               (10.7)
If γ ≤ 0, we have H^(1) ≥ 0 and hence E_0 = 0, but if γ > 0, we might have
E_0 < 0 and there might be some discrete eigenvalues below the essential
spectrum.


    In order to say more about the eigenvalues of H (1) , we will use the fact
that both H0 and V (1) = −γ/|x| have a simple behavior with respect to
scaling. Consider the dilation group
                      U (s)ψ(x) = e−ns/2 ψ(e−s x),       s ∈ R,            (10.8)
which is a strongly continuous one-parameter unitary group. The generator
can be easily computed:
      Dψ(x) = (1/2)(xp + px)ψ(x) = (xp − in/2)ψ(x),        ψ ∈ S(R^n).        (10.9)
Now let us investigate the action of U (s) on H (1) :
 H (1) (s) = U (−s)H (1) U (s) = e−2s H0 + e−s V (1) ,   D(H (1) (s)) = D(H (1) ).
                                                                          (10.10)
Now suppose Hψ = λψ. Then
          ⟨ψ, [U(s), H]ψ⟩ = ⟨U(−s)ψ, Hψ⟩ − ⟨Hψ, U(s)ψ⟩ = 0                    (10.11)
and hence
       0 = lim_{s→0} (1/s) ⟨ψ, [U(s), H]ψ⟩ = lim_{s→0} ⟨U(−s)ψ, ((H − H(s))/s) ψ⟩
         = ⟨ψ, (2H_0 + V^(1))ψ⟩.                                              (10.12)
   Thus we have proven the virial theorem.
Theorem 10.3. Suppose H = H_0 + V with U(−s)V U(s) = e^{−s}V. Then
any normalized eigenfunction ψ corresponding to an eigenvalue λ satisfies
                 λ = −⟨ψ, H_0 ψ⟩ = (1/2) ⟨ψ, V ψ⟩.                             (10.13)
In particular, all eigenvalues must be negative.
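For the hydrogen Hamiltonian the relation (10.13) can be checked on the well-known ground state: with γ = 2 the reduced radial ground state is u(r) = r e^{−r} with eigenvalue λ = −1 (this anticipates Section 10.4 and is used here only as input for a consistency check). The following SciPy sketch evaluates the three quantities in (10.13).

import numpy as np
from scipy.integrate import quad

gamma = 2.0
u  = lambda r: r*np.exp(-gamma*r/2.0)                   # reduced radial ground state (l = 0)
up = lambda r: (1.0 - gamma*r/2.0)*np.exp(-gamma*r/2.0) # its derivative

norm2 = quad(lambda r: u(r)**2, 0, np.inf)[0]
kin   = quad(lambda r: up(r)**2, 0, np.inf)[0]          # <psi, H0 psi>
pot   = quad(lambda r: -gamma*u(r)**2/r, 0, np.inf)[0]  # <psi, V psi>

print("lambda =", (kin + pot)/norm2)    # expect -gamma^2/4 = -1
print("-<H0>  =", -kin/norm2)           # virial: should equal lambda
print("<V>/2  =", pot/(2*norm2))        # virial: should equal lambda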

    This result even has some further consequences for the point spectrum
of H^(1).
Corollary 10.4. Suppose γ > 0. Then
   σ_p(H^(1)) = σ_d(H^(1)) = {E_j}_{j∈N_0},        E_j < E_{j+1} < 0,          (10.14)
with lim_{j→∞} E_j = 0.

Proof. Choose ψ ∈ C_c^∞(R \ {0}) and set ψ(s) = U(−s)ψ. Then
          ⟨ψ(s), H^(1) ψ(s)⟩ = e^{−2s} ⟨ψ, H_0 ψ⟩ + e^{−s} ⟨ψ, V^(1) ψ⟩,
which is negative for s large. Now choose a sequence s_n → ∞ such that
we have supp(ψ(s_n)) ∩ supp(ψ(s_m)) = ∅ for n ≠ m. Then Theorem 4.12
(i) shows that rank(P_{H^(1)}((−∞, 0))) = ∞. Since each eigenvalue E_j has
finite multiplicity (it lies in the discrete spectrum), there must be an infinite
number of eigenvalues which accumulate at 0.
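The accumulation of eigenvalues at 0 can also be seen numerically from the radial l = 0 problem (anticipating the reduction carried out in Section 10.4). The finite-difference sketch below (SciPy; the coupling γ = 2, the box size R, and the grid are illustrative assumptions) produces the lowest eigenvalues, which approach the well-known values −γ²/(4n²) = −1/n².

import numpy as np
from scipy.linalg import eigh_tridiagonal

# finite-difference model of -u'' - (2/r) u = E u on (0, R] with u(0) = 0  (l = 0, gamma = 2)
R, N = 80.0, 8000
r = np.linspace(0.0, R, N + 1)[1:]
h = r[1] - r[0]
d = 2.0/h**2 - 2.0/r
e = -np.ones(N - 1)/h**2
w = eigh_tridiagonal(d, e, eigvals_only=True, select='i', select_range=(0, 3))
print("lowest eigenvalues:", np.round(w, 4))   # expect approximately -1, -1/4, -1/9, -1/16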


      If γ ≤ 0, we have σd (H (1) ) = ∅ since H (1) ≥ 0 in this case.
    Hence we have obtained quite a complete picture of the spectrum of
H^(1). Next, we could try to compute the eigenvalues of H^(1) (in the case
γ > 0) by solving the corresponding eigenvalue equation, which is given by
the partial differential equation
                 −∆ψ(x) − (γ/|x|) ψ(x) = λψ(x).                               (10.15)
For a general potential this is hopeless, but in our case we can use the rota-
tional symmetry of our operator to reduce our partial differential equation
to ordinary ones.
     First of all, it suggests itself to switch from Cartesian coordinates x =
(x1 , x2 , x3 ) to spherical coordinates (r, θ, ϕ) defined by
       x1 = r sin(θ) cos(ϕ),    x2 = r sin(θ) sin(ϕ),     x3 = r cos(θ),   (10.16)
where r ∈ [0, ∞), θ ∈ [0, π], and ϕ ∈ (−π, π]. This change of coordinates
corresponds to a unitary transform
 L2 (R3 ) → L2 ((0, ∞), r2 dr) ⊗ L2 ((0, π), sin(θ)dθ) ⊗ L2 ((0, 2π), dϕ). (10.17)
In these new coordinates (r, θ, ϕ) our operator reads
   H^(1) = − (1/r²) ∂/∂r ( r² ∂/∂r ) + (1/r²) L² + V(r),        V(r) = −γ/r,        (10.18)
where
   L² = L_1² + L_2² + L_3² = − (1/sin(θ)) ∂/∂θ ( sin(θ) ∂/∂θ ) − (1/sin(θ)²) ∂²/∂ϕ².   (10.19)
(Recall the angular momentum operators Lj from Section 8.2.)
      Making the product ansatz (separation of variables)
                            ψ(r, θ, ϕ) = R(r)Θ(θ)Φ(ϕ),                     (10.20)
we obtain the three Sturm–Liouville equations
     ( − (1/r²) d/dr ( r² d/dr ) + l(l + 1)/r² + V(r) ) R(r) = λR(r),
     (1/sin(θ)) ( − d/dθ ( sin(θ) d/dθ ) + m²/sin(θ) ) Θ(θ) = l(l + 1)Θ(θ),
                                   − d²/dϕ² Φ(ϕ) = m²Φ(ϕ).                    (10.21)
The form chosen for the constants l(l + 1) and m2 is for convenience later
on. These equations will be investigated in the following sections.
Problem 10.1. Generalize the virial theorem to the case U(−s)V U(s) =
e^{−αs}V, α ∈ R \ {0}. What about Corollary 10.4?


10.3. Angular momentum
We start by investigating the equation for Φ(ϕ) which is associated with the
Sturm–Liouville equation
                           τ Φ = −Φ ,      I = (0, 2π).                   (10.22)
Since we want ψ defined via (10.20) to be in the domain of H_0 (in particular
continuous), we choose periodic boundary conditions for the Sturm–Liouville
equation (10.22):
   AΦ = τΦ,    D(A) = {Φ ∈ L²(0, 2π) | Φ ∈ AC¹[0, 2π],
                                       Φ(0) = Φ(2π), Φ'(0) = Φ'(2π)}.
                                                                    (10.23)
   From our analysis in Section 9.1 we immediately obtain
Theorem 10.5. The operator A defined via (10.22) is self-adjoint. Its
spectrum is purely discrete, that is,
                       σ(A) = σd (A) = {m2 |m ∈ Z},                       (10.24)
and the corresponding eigenfunctions
                                 1
                     Φm (ϕ) = √ eimϕ ,            m ∈ Z,                  (10.25)
                                 2π
form an orthonormal basis for L2 (0, 2π).

    Note that except for the lowest eigenvalue, all eigenvalues are doubly de-
generate.
    We note that this operator is essentially the square of the angular mo-
mentum in the third coordinate direction, since in polar coordinates
                       L_3 = (1/i) ∂/∂ϕ.                                      (10.26)
    Now we turn to the equation for Θ(θ):
   τ_m Θ(θ) = (1/sin(θ)) ( − d/dθ ( sin(θ) d/dθ ) + m²/sin(θ) ) Θ(θ),    I = (0, π), m ∈ N_0.
                                                                              (10.27)
    For the investigation of the corresponding operator we use the unitary
transform
  L2 ((0, π), sin(θ)dθ) → L2 ((−1, 1), dx),
                                        Θ(θ) → f (x) = Θ(arccos(x)).
                                                                (10.28)
The operator τ transforms to the somewhat simpler form
                 τ_m = − d/dx ( (1 − x²) d/dx ) + m²/(1 − x²).                 (10.29)


The corresponding eigenvalue equation
                                 τm u = l(l + 1)u                            (10.30)
is the associated Legendre equation. For l ∈ N_0 it is solved by the
associated Legendre functions [1, (8.6.6)]
           P_l^m(x) = (−1)^m (1 − x²)^{m/2} (d^m/dx^m) P_l(x),        |m| ≤ l,   (10.31)
where the
           P_l(x) = (1/(2^l l!)) (d^l/dx^l) (x² − 1)^l,        l ∈ N_0,          (10.32)
are the Legendre polynomials [1, (8.6.18)] (Problem 10.2). Moreover,
note that the P_l(x) are (nonzero) polynomials of degree l and since τ_m
depends only on m², there must be a relation between P_l^m(x) and P_l^{−m}(x).
In fact, (Problem 10.3)
                 P_l^{−m}(x) = (−1)^m ((l − m)!/(l + m)!) P_l^m(x).              (10.33)
A second, linearly independent, solution is given by
           Q_l^m(x) = P_l^m(x) ∫_0^x dt/((1 − t²)P_l^m(t)²).                     (10.34)
In fact, for every Sturm–Liouville equation, v(x) = u(x) ∫^x dt/(p(t)u(t)²) satisfies
τv = 0 whenever τu = 0. Now fix l = 0 and note P_0(x) = 1. For m = 0 we
have Q_0^0 = arctanh(x) ∈ L² and so τ_0 is l.c. at both endpoints. For m > 0
we have Q_0^m = (x ± 1)^{−m/2}(C + O(x ± 1)) which shows that it is not square
integrable. Thus τ_m is l.c. for m = 0 and l.p. for m > 0 at both endpoints.
In order to make sure that the eigenfunctions for m = 0 are continuous (such
that ψ defined via (10.20) is continuous), we choose the boundary condition
generated by P_0(x) = 1 in this case:
     A_m f = τ_m f,
     D(A_m) = {f ∈ L²(−1, 1) | f ∈ AC¹(−1, 1), τ_m f ∈ L²(−1, 1),               (10.35)
                               lim_{x→±1} (1 − x²)f'(x) = 0 if m = 0}.
Theorem 10.6. The operator Am , m ∈ N0 , defined via (10.35) is self-
adjoint. Its spectrum is purely discrete, that is,
                 σ(Am ) = σd (Am ) = {l(l + 1)|l ∈ N0 , l ≥ m},              (10.36)
and the corresponding eigenfunctions

     u_{l,m}(x) = √( ((2l + 1)/2) ((l − m)!/(l + m)!) ) P_l^m(x),        l ∈ N_0, l ≥ m,   (10.37)
form an orthonormal basis for L2 (−1, 1).


Proof. By Theorem 9.6, Am is self-adjoint. Moreover, Plm is an eigenfunc-
tion corresponding to the eigenvalue l(l + 1) and it suffices to show that the
Plm form a basis. To prove this, it suffices to show that the functions Plm (x)
are dense. Since (1 − x²) > 0 for x ∈ (−1, 1), it suffices to show that the
functions (1 − x2 )−m/2 Plm (x) are dense. But the span of these functions
contains every polynomial. Every continuous function can be approximated
by polynomials (in the sup norm, Theorem 0.15, and hence in the L2 norm)
and since the continuous functions are dense, so are the polynomials.
    For the normalization of the eigenfunctions see Problem 10.7, respec-
tively, [1, (8.14.13)].

    Returning to our original setting, we conclude that the
     Θ_l^m(θ) = √( ((2l + 1)/2) ((l − m)!/(l + m)!) ) P_l^m(cos(θ)),        |m| ≤ l,   (10.38)
form an orthonormal basis for L²((0, π), sin(θ)dθ) for any fixed m ∈ N_0.
Theorem 10.7. The operator L² on L²((0, π), sin(θ)dθ) ⊗ L²((0, 2π)) has
a purely discrete spectrum given by
                 σ(L²) = {l(l + 1) | l ∈ N_0}.                                 (10.39)
The spherical harmonics
   Y_l^m(θ, ϕ) = Θ_l^m(θ)Φ_m(ϕ) = √( ((2l + 1)/(4π)) ((l − m)!/(l + m)!) ) P_l^m(cos(θ)) e^{imϕ},    |m| ≤ l,
                                                                              (10.40)
form an orthonormal basis and satisfy L² Y_l^m = l(l + 1)Y_l^m and L_3 Y_l^m =
mY_l^m.

Proof. Everything follows from our construction, if we can show that the
Ylm form a basis. But this follows as in the proof of Lemma 1.10.
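The orthonormality of the Y_l^m can be verified numerically with SciPy's spherical harmonics; note that scipy.special.sph_harm takes the azimuthal angle first and calls the polar angle phi, the opposite of the convention used here (newer SciPy versions provide sph_harm_y instead). This is only a sanity check of (10.40); the test pairs (l, m) are arbitrary.

import numpy as np
from scipy.special import sph_harm          # newer SciPy: sph_harm_y (angles in the other order)
from scipy.integrate import dblquad

def overlap(l1, m1, l2, m2):
    # real part of the inner product of Y_{l1}^{m1} and Y_{l2}^{m2} over the sphere
    # (the imaginary part vanishes for the pairs tested below)
    f = lambda pol, az: (np.conj(sph_harm(m1, l1, az, pol))*sph_harm(m2, l2, az, pol)).real*np.sin(pol)
    return dblquad(f, 0.0, 2*np.pi, 0.0, np.pi)[0]

print(overlap(2, 1, 2, 1))    # expect 1
print(overlap(2, 1, 3, 1))    # expect 0
print(overlap(2, 1, 2, -1))   # expect 0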

    Note that transforming the Y_l^m back to Cartesian coordinates gives
   Y_l^{±m}(x) = (−1)^m √( ((2l + 1)/(4π)) ((l − |m|)!/(l + |m|)!) ) P̃_l^m(x_3/r) ( (x_1 ± ix_2)/r )^m,    r = |x|,
                                                                              (10.41)
where P̃_l^m is a polynomial of degree l − m given by
           P̃_l^m(x) = (1 − x²)^{−m/2} P_l^m(x) = (d^{l+m}/dx^{l+m}) (1 − x²)^l.   (10.42)
In particular, the Y_l^m are smooth away from the origin and by construction
they satisfy
                 −∆ Y_l^m = (l(l + 1)/r²) Y_l^m.                               (10.43)


Problem 10.2. Show that the associated Legendre functions satisfy the
differential equation (10.30). (Hint: Start with the Legendre polynomials
(10.32) which correspond to m = 0. Set v(x) = (x² − 1)^l and observe
(x² − 1)v'(x) = 2lx v(x). Then differentiate this identity l + 1 times using
Leibniz's rule. For the case of the associated Legendre functions, substitute
v(x) = (1 − x²)^{m/2} u(x) in (10.30) and differentiate the resulting equation
once.)

Problem 10.3. Show (10.33). (Hint: Write (x2 − 1)l = (x − 1)l (x + 1)l
and use Leibniz’s rule.)

Problem 10.4 (Orthogonal polynomials). Suppose the monic polynomials
P_j(x) = x^j + β_j x^{j−1} + . . . are orthogonal with respect to the weight function
w(x):
           ∫_a^b P_i(x)P_j(x)w(x)dx = { α_j²,    i = j,
                                      { 0,       otherwise.
Note that they are uniquely determined by the Gram–Schmidt procedure.
Let P̄_j(x) = α_j^{−1} P_j(x) and show that they satisfy the three term recurrence
relation
           a_j P̄_{j+1}(x) + b_j P̄_j(x) + a_{j−1} P̄_{j−1}(x) = x P̄_j(x),
where
     a_j = ∫_a^b x P̄_{j+1}(x) P̄_j(x) w(x)dx,        b_j = ∫_a^b x P̄_j(x)² w(x)dx.
Moreover, show
           a_j = α_{j+1}/α_j,        b_j = β_j − β_{j+1}.
(Note that w(x)dx could be replaced by a measure dµ(x).)

Problem 10.5. Consider the orthogonal polynomials with respect to the
weight function w(x) as in the previous problem. Suppose |w(x)| ≤ Ce^{−k|x|}
for some C, k > 0. Show that the orthogonal polynomials are dense in
L²(R, w(x)dx). (Hint: It suffices to show that ∫ f(x)x^j w(x)dx = 0 for
all j ∈ N_0 implies f = 0. Consider the Fourier transform of f(x)w(x) and
note that it has an analytic extension by Problem 7.11. Hence this Fourier
transform will be zero if, e.g., all derivatives at p = 0 are zero (cf. Prob-
lem 7.3).)

Problem 10.6. Show
           P_l(x) = Σ_{k=0}^{⌊l/2⌋} ( (−1)^k (2l − 2k)! / (2^l k!(l − k)!(l − 2k)!) ) x^{l−2k}.
Moreover, by Problem 10.4 there is a recurrence relation of the form P_{l+1}(x) =
(ã_l + b̃_l x)P_l(x) + c̃_l P_{l−1}(x). Find the coefficients by comparing the highest
powers in x and conclude
           (l + 1)P_{l+1}(x) = (2l + 1)xP_l(x) − lP_{l−1}(x).
Use this to prove
           ∫_{−1}^1 P_l(x)² dx = 2/(2l + 1).
Problem 10.7. Prove
           ∫_{−1}^1 P_l^m(x)² dx = (2/(2l + 1)) ((l + m)!/(l − m)!).
(Hint: Use (10.33) to compute ∫_{−1}^1 P_l^m(x)P_l^{−m}(x)dx by integrating by parts
until you can use the case m = 0 from the previous problem.)
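This normalization is easy to confirm numerically; scipy.special.lpmv evaluates P_l^m including the Condon–Shortley phase, which drops out after squaring. The test values of (l, m) below are arbitrary.

from math import factorial
from scipy.special import lpmv
from scipy.integrate import quad

for l, m in [(3, 0), (3, 2), (5, 4)]:
    val = quad(lambda x: lpmv(m, l, x)**2, -1.0, 1.0)[0]
    exact = 2.0/(2*l + 1)*factorial(l + m)/factorial(l - m)
    print(f"l={l}, m={m}: numerical = {val:.6f}, 2/(2l+1)*(l+m)!/(l-m)! = {exact:.6f}")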

10.4. The eigenvalues of the hydrogen atom
Now we want to use the considerations from the previous section to decom-
pose the Hamiltonian of the hydrogen atom. In fact, we can even admit any
spherically symmetric potential V (x) = V (|x|) with
                 V(r) ∈ L^∞_∞(R) + L²((0, ∞), r² dr)                           (10.44)
such that Theorem 10.2 holds.
    The important observation is that the spaces
     H_{l,m} = {ψ(x) = R(r)Y_l^m(θ, ϕ) | R(r) ∈ L²((0, ∞), r² dr)}             (10.45)
with corresponding projectors
     P_l^m ψ(r, θ, ϕ) = ( ∫_0^{2π} ∫_0^π ψ(r, θ', ϕ') Y_l^m(θ', ϕ')* sin(θ')dθ' dϕ' ) Y_l^m(θ, ϕ)   (10.46)
reduce our operator H = H_0 + V. By Lemma 2.24 it suffices to check
this for H restricted to C_c^∞(R³), which is straightforward. Hence, again by
Lemma 2.24,
                 H = H_0 + V = ⊕_{l,m} H̃_l,                                    (10.47)
where
     H̃_l R(r) = τ̃_l R(r),    τ̃_l = − (1/r²) d/dr ( r² d/dr ) + l(l + 1)/r² + V(r),
     D(H̃_l) ⊆ L²((0, ∞), r² dr).                                              (10.48)
Using the unitary transformation
     L²((0, ∞), r² dr) → L²((0, ∞)),        R(r) ↦ u(r) = rR(r),               (10.49)


our operator transforms to
     A_l f = τ_l f,    τ_l = − d²/dr² + l(l + 1)/r² + V(r),
     D(A_l) = P_l^m D(H) ⊆ L²((0, ∞)).                                         (10.50)
It remains to investigate this operator (that its domain is indeed independent
of m follows from the next theorem).
Theorem 10.8. The domain of the operator A_l is given by
     D(A_l) = {f ∈ L²(I) | f, f' ∈ AC(I), τ_l f ∈ L²(I),                       (10.51)
                           lim_{r→0} (f(r) − rf'(r)) = 0 if l = 0},
where I = (0, ∞). Moreover,
   σess (Al ) = σac (Al ) = [0, ∞),     σsc (Al ) = ∅,            σp ⊂ (−∞, 0].   (10.52)

Proof. By construction of Al we know that it is self-adjoint and satisfies
σess (Al ) ⊆ [0, ∞) (Problem 10.8). By Lemma 9.37 we have (0, ∞) ⊆ N∞ (τl )
and hence Theorem 9.31 implies σac (Al ) = [0, ∞), σsc (Al ) = ∅, and σp ⊂
(−∞, 0]. So it remains to compute the domain. We know at least D(Al ) ⊆
D(τ ) and since D(H) = D(H0 ), it suffices to consider the case V = 0. In this
case the solutions of −u (r)+ l(l+1) u(r) = 0 are given by u(r) = αrl+1 +βr−l .
                                  r2
Thus we are in the l.p. case at ∞ for any l ∈ N0 . However, at 0 we are in
the l.p. case only if l  0; that is, we need an additional boundary condition
at 0 if l = 0. Since we need R(r) = u(r) to be bounded (such that (10.20)
                                          r
is in the domain of H0 , that is, continuous), we have to take the boundary
condition generated by u(r) = r.

    Finally let us turn to some explicit choices for V , where the correspond-
ing differential equation can be explicitly solved. The simplest case is V = 0.
In this case the solutions of
                                      l(l + 1)
                        − u (r) +              u(r) = zu(r)             (10.53)
                                         r2
are given by
                                      √                        √
                u(r) = α z −l/2 r jl ( zr) + β z (l+1)/2 r yl ( zr),    (10.54)
where jl (r) and yl (r) are the spherical Bessel, respectively, spherical
Neumann, functions
                                                          l
                        π                        1 d          sin(r)
           jl (r) =       J      (r) = (−r)l                         ,
                        2r l+1/2                 r dr            r
                                                              l
                        π                          1 d            cos(r)
           yl (r) =       Y      (r) = −(−r)l                            .        (10.55)
                        2r l+1/2                   r dr             r
10.4. The eigenvalues of the hydrogen atom                                              231


                       √                        √
Note that z −l/2 r jl ( zr) and z (l+1)/2 r yl ( zr) are entire as functions of z
                                                   √                  √
and their Wronskian is given by W (z −l/2 r jl ( zr), z (l+1)/2 r yl ( zr)) = 1.
See [1, Sects. 10.1 and 10.3]. In particular,
                    r     √   2l l!
   ua (z, r) =         j ( zr) =
                    l/2 l
                                     rl+1 (1 + O(r2 )),
              z            (2l + 1)!
              √         √             √             √              1
   ub (z, r) = −zr jl (i −zr) + iyl (i −zr) = e− −zr+ilπ/2 (1 + O( ))
                                                                   r
                                                                  (10.56)
are the functions which are square integrable and satisfy the boundary con-
dition (if any) near a = 0 and b = ∞, respectively.
    The second case is that of our Coulomb potential
                                    γ
                          V (r) = − ,     γ  0,                   (10.57)
                                    r
where we will try to compute the eigenvalues plus corresponding eigenfunc-
tions. It turns out that they can be expressed in terms of the Laguerre
polynomials ([1, (22.2.13)])
                                                       er dj −r j
                                           Lj (r) =           e r                    (10.58)
                                                       j! drj
and the generalized Laguerre polynomials ([1, (22.2.12)])

                                     (k)                  dk
                                Lj (r) = (−1)k                Lj+k (r).              (10.59)
                                                          drk
                        (k)
Note that the Lj (r) are polynomials of degree j − k which are explicitly
given by
                                                  j
                                 (k)                               j + k ri
                                Lj (r)       =         (−1)i                         (10.60)
                                                                   j − i i!
                                                 i=0
and satisfy the differential equation (Problem 10.9)
                          r y (r) + (k + 1 − r)y (r) + j y(r) = 0.                   (10.61)
Moreover, they are orthogonal in the Hilbert space L2 ((0, ∞), rk e−r dr) (Prob-
lem 10.10):
               ∞                                           (j+k)!
                    (k)        (k)                           j! ,       i = j,
                   Lj (r)Lj (r)rk e−r dr =                                           (10.62)
           0                                               0,           otherwise.

Theorem 10.9. The eigenvalues of H (1) are explicitly given by
                                                           2
                                                γ
                              En = −                           ,      n ∈ N0 .       (10.63)
                                             2(n + 1)
232                                         10. One-particle Schr¨dinger operators
                                                                 o


An orthonormal basis for the corresponding eigenspace is given by the (n+1)2
functions
                   ψn,l,m (x) = Rn,l (r)Ylm (x),              |m| ≤ l ≤ n,             (10.64)
where
                                                              l
                        γ 3 (n − l)!              γr                  γr
                                                                  − 2(n+1)     γr
                                                                              (2l+1)
      Rn,l (r) =                                                  e          Ln−l ( ).
                   2(n + 1)4 (n + l + 1)!        n+1                          n+1
                                                                                  (10.65)
                                                        2
In particular, the lowest eigenvalue E0 = − γ4              is simple and the correspond-
                                     γ 3 −γr/2
ing eigenfunction ψ000 (x) =          2 e         is positive.

Proof. Since all eigenvalues are negative, we need to look at the equation
                               l(l + 1) γ
                      −u (r) + (       − )u(r) = λu(r)
                                  r2     r
                                           √                ex/2     x
for λ  0. Introducing new variables x = 2 −λ r and v(x) = xl+1 u( 2√−λ ),
this equation transforms into Kummer’s equation
                                                          γ
 xv (x) + (k + 1 − x)v (x) + j v(x) = 0, k = 2l + 1, j = √     − (l + 1).
                                                        2 −λ
Now let us search for a solution which can be expanded into a convergent
power series
                                       ∞
                             v(x) =         vi xi ,    v0 = 1.                         (10.66)
                                      i=0
The corresponding u(r) is square integrable near 0 and satisfies the boundary
condition (if any). Thus we need to find those values of λ for which it is
square integrable near +∞.
    Substituting the ansatz (10.66) into our differential equation and com-
paring powers of x gives the following recursion for the coefficients
                                            (i − j)
                            vi+1 =                       vi
                                      (i + 1)(i + k + 1)
and thus
                                           i−1
                                      1           −j
                               vi =                   .
                                      i!         +k+1
                                           =0
Now there are two cases to distinguish. If j ∈ N0 , then vi = 0 for i  j and
v(x) is a polynomial; namely
                                                  −1
                                        j+k             (k)
                            v(x) =                     Lj (x).
                                         j
10.4. The eigenvalues of the hydrogen atom                                       233


In this case u(r) is square integrable and hence an eigenfunction correspond-
                                   γ
ing to the eigenvalue λj = −( 2(n+1) )2 , n = j + l. This proves the formula
for Rn,l (r) except for the normalization which follows from (Problem 10.11)
                      ∞
                           (k)                     (j + k)!
                          Lj (r)2 rk+1 e−r dr =             (2j + k + 1).     (10.67)
                  0                                   j!
It remains to show that we have found all eigenfunctions, that is, that there
are no other square integrable solutions. Otherwise, if j ∈ N, we have
vi+1    (1−ε)
 vi ≥ i+1 for i sufficiently large. Hence by adding a polynomial to v(x)
(and perhaps flipping its sign), we can get a function v (x) such that vi ≥
                                                        ˜                ˜
(1−ε)i
   i!  for all i. But then v (x) ≥ exp((1 − ε)x) and thus the corresponding
                           ˜
u(r) is not square integrable near +∞.

    Finally, let us also look at an alternative algebraic approach for com-
puting the eigenvalues and eigenfunctions of Al based on the commutation
methods from Section 8.4. We begin by introducing
                 d    l+1       γ
     Ql f = −       +     −          ,
                 dr    r    2(l + 1)
   D(Ql ) = {f ∈ L2 ((0, ∞))|f ∈ AC((0, ∞)), Ql f ∈ L2 ((0, ∞))}.             (10.68)

Then (Problem 9.3) Ql is closed and its adjoint is given by
            d    l+1       γ
   Q∗ f =
    l          +     −          ,
            dr    r    2(l + 1)
 D(Q∗ ) = {f ∈ L2 ((0, ∞))| f ∈ AC((0, ∞)), Q∗ f ∈ L2 ((0, ∞)),
    l                                          l                              (10.69)
                            limx→0,∞ f (x)g(x) = 0, ∀g ∈ D(Ql )}.
It is straightforward to check
                   Ker(Ql ) = span{ul,0 },             Ker(Q∗ ) = {0},
                                                            l                 (10.70)
where
                                                (l+1)+1/2
                                 1      γ                             γ
                                                                 − 2(l+1) r
         ul,0 (r) =                                         rl+1 e            (10.71)
                           (2l + 2)!   l+1
is normalized.

Theorem 10.10. The radial Schr¨dinger operator Al satisfies
                              o
                      Al = Q∗ Ql − c2 ,
                            l       l            Al+1 = Ql Q∗ − c2 ,
                                                            l    l            (10.72)
where
                                                  γ
                                       cl =            .                      (10.73)
                                              2(l + 1)
234                                            10. One-particle Schr¨dinger operators
                                                                    o


Proof. Equality is easy to check for f ∈ AC 2 with compact support. Hence
Q∗ Ql − c2 is a self-adjoint extension of τl restricted to this set. If l  0,
  l       l
there is only one self-adjoint extension and equality follows. If l = 0, we
know u0,0 ∈ D(Q∗ Ql ) and since Al is the only self-adjoint extension with
                  l
u0,0 ∈ D(Al ), equality follows in this case as well.

    Hence, as a consequence of Theorem 8.6 we see σ(Al ) = σ(Al+1 )∪{−c2 },
                                                                       l
or, equivalently,
                               σp (Al ) = {−c2 |j ≥ l}
                                             j                                    (10.74)
if we use that σp (Al ) ⊂ (−∞, 0), which already follows from the virial the-
orem. Moreover, using Ql , we can turn any eigenfunction of Hl into one
of Hl+1 . However, we only know the lowest eigenfunction ul,0 , which is
mapped to 0 by Ql . On the other hand, we can also use Q∗ to turn an
                                                               l
eigenfunction of Hl+1 into one of Hl . Hence Q∗ ul+1,0 will give the second
                                                l
eigenfunction of Hl . Proceeding inductively, the normalized eigenfunction
of Hl corresponding to the eigenvalue −c2 is given by
                                          l+j

                 j−1                      −1

        ul,j =         (cl+j − cl+k )          Q∗ Q∗ · · · Q∗
                                                l l+1       l+j−1 ul+j,0 .        (10.75)
                 k=0

The connection with Theorem 10.9 is given by
                                          1
                                Rn,l (r) = ul,n−l (r).                            (10.76)
                                          r
Problem 10.8. Let A =            n An .   Then       n σess (An )   ⊆ σess (A).

Problem 10.9. Show that the generalized Laguerre polynomials satisfy the
differential equation (10.61). (Hint: Start with the Laguerre polynomials
(10.58) which correspond to k = 0. Set v(r) = rj e−r and observe r v (r) =
(j − r)v(r). Then differentiate this identity j + 1 times using Leibniz’s
rule. For the case of the generalized Laguerre polynomials, start with the
differential equation for Lj+k (r) and differentiate k times.)

Problem 10.10. Show that the differential equation (10.58) can be rewritten
in Sturm–Liouville form as
                                     d k+1 −r d
                           −r−k er      r e     u = ju.
                                     dr      dr
We have found one entire solution in the proof of Theorem 10.9. Show that
any linearly independent solution behaves like log(r) if k = 0, respectively,
like r−k otherwise. Show that it is l.c. at the endpoint r = 0 if k = 0 and
l.p. otherwise.
10.5. Nondegeneracy of the ground state                                    235


    Let H = L2 ((0, ∞), rk e−r dr). The operator
                                     d k+1 −r d
                 Ak f = τ f = −r−k er  r e         f,
                                    dr         dr
               D(Ak ) = {f ∈ H| f ∈ AC 1 (0, ∞), τk f ∈ H,
                                limr→0 rf (r) = 0 if k = 0}
for k ∈ N0 is self-adjoint. Its spectrum is purely discrete, that is,
                            σ(Ak ) = σd (Ak ) = N0 ,                    (10.77)
and the corresponding eigenfunctions
                               (k)
                             Lj (r),      j ∈ N0 ,                      (10.78)
form an orthogonal base for H. (Hint: Compare the argument for the asso-
ciated Legendre equation and Problem 10.5.)
Problem 10.11. By Problem 10.4 there is a recurrence relation of the form
  (k)                   (k)         (k)
Lj+1 (r) = (˜j + ˜j r)Lj (r) + cj Lj−1 (r). Find the coefficients by comparing
             a   b               ˜
the highest powers in r and conclude
         (k)          1                      (k)             (k)
        Lj+1 (r) =          (2j + k + 1 − r)Lj (r) − (j + k)Lj−1 (r) .
                    1+j
Use this to prove (10.62) and (10.67).

10.5. Nondegeneracy of the ground state
The lowest eigenvalue (below the essential spectrum) of a Schr¨dinger op-
                                                                  o
erator is called the ground state. Since the laws of physics state that a
quantum system will transfer energy to its surroundings (e.g., an atom emits
radiation) until it eventually reaches its ground state, this state is in some
sense the most important state. We have seen that the hydrogen atom has
a nondegenerate (simple) ground state with a corresponding positive eigen-
function. In particular, the hydrogen atom is stable in the sense that there
is a lowest possible energy. This is quite surprising since the corresponding
classical mechanical system is not — the electron could fall into the nucleus!
    Our aim in this section is to show that the ground state is simple with a
corresponding positive eigenfunction. Note that it suffices to show that any
ground state eigenfunction is positive since nondegeneracy then follows for
free: two positive functions cannot be orthogonal.
    To set the stage, let us introduce some notation. Let H = L2 (Rn ). We
call f ∈ L2 (Rn ) positive if f ≥ 0 a.e. and f ≡ 0. We call f strictly posi-
tive if f  0 a.e. A bounded operator A is called positivity preserving if
f ≥ 0 implies Af ≥ 0 and positivity improving if f ≥ 0 implies Af  0.
Clearly A is positivity preserving (improving) if and only if f, Ag ≥ 0
( 0) for f, g ≥ 0.
236                                  10. One-particle Schr¨dinger operators
                                                          o


Example. Multiplication by a positive function is positivity preserving (but
not improving). Convolution with a strictly positive function is positivity
improving.

    We first show that positivity improving operators have positive eigen-
functions.

Theorem 10.11. Suppose A ∈ L(L2 (Rn )) is a self-adjoint, positivity im-
proving and real (i.e., it maps real functions to real functions) operator. If
 A is an eigenvalue, then it is simple and the corresponding eigenfunction
is strictly positive.

Proof. Let ψ be an eigenfunction. It is no restriction to assume that ψ is
real (since A is real, both real and imaginary part of ψ are eigenfunctions
as well). We assume ψ = 1 and denote by ψ± = f ±2 f | the positive and
negative parts of ψ. Then by |Aψ| = |Aψ+ − Aψ− | ≤ Aψ+ + Aψ− = A|ψ|
we have
              A = ψ, Aψ ≤ |ψ|, |Aψ| ≤ |ψ|, A|ψ| ≤ A ;
that is, ψ, Aψ = |ψ|, A|ψ| and thus
                             1
                 ψ+ , Aψ− = ( |ψ|, A|ψ| − ψ, Aψ ) = 0.
                             4
Consequently ψ− = 0 or ψ+ = 0 since otherwise Aψ−  0 and hence also
 ψ+ , Aψ−  0. Without restriction ψ = ψ+ ≥ 0 and since A is positivity
increasing, we even have ψ = A −1 Aψ  0.

    So we need a positivity improving operator. By (7.38) and (7.39) both
e−tH0 , t  0, and Rλ (H0 ), λ  0, are since they are given by convolution
with a strictly positive function. Our hope is that this property carries over
to H = H0 + V .

Theorem 10.12. Suppose H = H0 + V is self-adjoint and bounded from
             ∞
below with Cc (Rn ) as a core. If E0 = min σ(H) is an eigenvalue, it is
simple and the corresponding eigenfunction is strictly positive.

Proof. We first show that e−tH , t  0, is positivity preserving. If we set
Vn = V χ{x| |V (x)|≤n} , then Vn is bounded and Hn = H0 + Vn is positivity
preserving by the Trotter product formula since both e−tH0 and e−tV are.
                                               ∞
Moreover, we have Hn ψ → Hψ for ψ ∈ Cc (Rn ) (note that necessarily
                            sr
V ∈ L2 ) and hence Hn → H in the strong resolvent sense by Lemma 6.36.
       loc
               s
Hence e−tHn → e−tH by Theorem 6.31, which shows that e−tH is at least
positivity preserving (since 0 cannot be an eigenvalue of e−tH , it cannot map
a positive function to 0).
10.5. Nondegeneracy of the ground state                                   237


   Next I claim that for ψ positive the closed set
           N (ψ) = {ϕ ∈ L2 (Rn ) | ϕ ≥ 0, ϕ, e−sH ψ = 0 ∀s ≥ 0}
is just {0}. If ϕ ∈ N (ψ), we have by e−sH ψ ≥ 0 that ϕe−sH ψ = 0. Hence
etVn ϕe−sH ψ = 0; that is, etVn ϕ ∈ N (ψ). In other words, both etVn and e−tH
leave N (ψ) invariant and invoking Trotter’s formula again, the same is true
for
                                              t   t    k
                      e−t(H−Vn ) = s-lim e− k H e k Vn .
                                       k→∞
                 s
Since e−t(H−Vn ) → e−tH0 , we finally obtain that e−tH0 leaves N (ψ) invariant,
but this operator is positivity increasing and thus N (ψ) = {0}.
   Now it remains to use (7.37), which shows
                                  ∞
             ϕ, RH (λ)ψ =             eλt ϕ, e−tH ψ dt  0,   λ  E0 ,
                              0
for ϕ, ψ positive. So RH (λ) is positivity increasing for λ  E0 .
    If ψ is an eigenfunction of H corresponding to E0 , it is an eigenfunction
of RH (λ) corresponding to E01 and the claim follows since RH (λ) =
                                −λ
  1
E0 −λ .

    The assumptions are for example satisfied for the potentials V considered
in Theorem 10.2.
Mathematical methods in quantum mechanics
Chapter 11




Atomic Schr¨dinger
           o
operators


11.1. Self-adjointness
In this section we want to have a look at the Hamiltonian corresponding to
more than one interacting particle. It is given by

                               N           N
                      H=−           ∆j +         Vj,k (xj − xk ).          (11.1)
                              j=1          jk


    We first consider the case of two particles, which will give us a feeling
for how the many particle case differs from the one particle case and how
the difficulties can be overcome.
    We denote the coordinates corresponding to the first particle by x1 =
(x1,1 , x1,2 , x1,3 ) and those corresponding to the second particle by x2 =
(x2,1 , x2,2 , x2,3 ). If we assume that the interaction is again of the Coulomb
type, the Hamiltonian is given by

                                        γ
              H = −∆1 − ∆2 −                   ,     D(H) = H 2 (R6 ).     (11.2)
                                    |x1 − x2 |

Since Theorem 10.2 does not allow singularities for n ≥ 3, it does not tell
us whether H is self-adjoint or not. Let

                                    1       I I
                      (y1 , y2 ) = √                   (x1 , x2 ).         (11.3)
                                     2     −I I

                                                                             239
240                                        11. Atomic Schr¨dinger operators
                                                          o


Then H reads in this new coordinate system as
                                              √
                                           γ/ 2
                       H = (−∆1 ) + (−∆2 −        ).                     (11.4)
                                            |y2 |
In particular, it is the sum of a free particle plus a particle in an external
Coulomb field. From a physics point of view, the first part corresponds to
the center of mass motion and the second part to the relative motion.
                     √
     Using that γ/( 2|y2 |) has (−∆2 )-bound 0 in L2 (R3 ), it is not hard to
see that the same is true for the (−∆1 − ∆2 )-bound in L2 (R6 ) (details will
follow in the next section). In particular, H is self-adjoint and semi-bounded
                                                            √
for any γ ∈ R. Moreover, you might suspect that γ/( 2|y2 |) is relatively
compact with respect to −∆1 − ∆2 in L2 (R6 ) since it is with respect to −∆2
                                                                        √
in L2 (R6 ). However, this is not true! This is due to the fact that γ/( 2|y2 |)
does not vanish as |y| → ∞.
    Let us look at this problem from the physical view point. If λ ∈ σess (H),
this means that the movement of the whole system is somehow unbounded.
There are two possibilities for this.
    First, both particles are far away from each other (such that we can
neglect the interaction) and the energy corresponds to the sum of the kinetic
energies of both particles. Since both can be arbitrarily small (but positive),
we expect [0, ∞) ⊆ σess (H).
    Secondly, both particles remain close to each other and move together.
In the last set of coordinates this corresponds to a bound state of the second
operator. Hence we expect [λ0 , ∞) ⊆ σess (H), where λ0 = −γ 2 /8 is the
smallest eigenvalue of the second operator if the forces are attracting (γ ≥ 0)
and λ0 = 0 if they are repelling (γ ≤ 0).
     It is not hard to translate this intuitive idea into a rigorous proof. Let
ψ1 (y1 ) be a Weyl sequence corresponding to λ ∈ [0, ∞) for −∆1 and let
                                                                 √
ψ2 (y2 ) be a Weyl sequence corresponding to λ0 for −∆2 −γ/( 2|y2 |). Then,
ψ1 (y1 )ψ2 (y2 ) is a Weyl sequence corresponding to λ + λ0 for H and thus
[λ0 ,√ ⊆ σess (H). Conversely, we have −∆1 ≥ 0, respectively, −∆2 −
     ∞)
γ/( 2|y2 |) ≥ λ0 , and hence H ≥ λ0 . Thus we obtain
                                                −γ 2 /8, γ ≥ 0,
        σ(H) = σess (H) = [λ0 , ∞),    λ0 =                              (11.5)
                                                0,       γ ≤ 0.
Clearly, the physically relevant information is the spectrum of the operator
           √
−∆2 − γ/( 2|y2 |) which is hidden by the spectrum of −∆1 . Hence, in order
to reveal the physics, one first has to remove the center of mass motion.
    To avoid clumsy notation, we will restrict ourselves to the case of one
atom with N electrons whose nucleus is fixed at the origin. In particular,
this implies that we do not have to deal with the center of mass motion
11.1. Self-adjointness                                                                               241


encountered in our example above. In this case the Hamiltonian is given by
                                  N              N                     N    N
                 (N )
             H          =−            ∆j −            Vne (xj ) +                Vee (xj − xk ),
                              j=1             j=1                   j=1 jk

         D(H (N ) ) = H 2 (R3N ),                                                                  (11.6)
where Vne describes the interaction of one electron with the nucleus and Vee
describes the interaction of two electrons. Explicitly we have
                              γj
                    Vj (x) =      ,   γj  0, j = ne, ee.             (11.7)
                              |x|
We first need to establish the self-adjointness of H (N ) = H0 + V (N ) . This
will follow from Kato’s theorem.
Theorem 11.1 (Kato). Let Vk ∈ L∞ (Rd ) + L2 (Rd ), d ≤ 3, be real-valued
                                         ∞
and let Vk (y (k) ) be the multiplication operator in L2 (Rn ), n = N d, obtained
by letting y (k) be the first d coordinates of a unitary transform of Rn . Then
Vk is H0 bounded with H0 -bound 0. In particular,

                 H = H0 +                  Vk (y (k) ),         D(H) = H 2 (Rn ),                  (11.8)
                                      k
                     ∞
is self-adjoint and C0 (Rn ) is a core.

Proof. It suffices to consider one k. After a unitary transform of Rn we can
assume y (1) = (x1 , . . . , xd ) since such transformations leave both the scalar
product of L2 (Rn ) and H0 invariant. Now let ψ ∈ S(Rn ). Then
                        2
              Vk ψ          ≤ a2           |∆1 ψ(x)|2 dn x + b2                 |ψ(x)|2 dn x,
                                      Rn                                   Rn
                 d    2  2
where ∆1 =       j=1 ∂ /∂ xj ,            by our previous lemma. Hence we obtain
                                                      d
                     Vk ψ     2
                                  ≤ a2           |            ˆ
                                                           p2 ψ(p)|2 dn p + b2 ψ       2
                                                            j
                                            Rn       j=1
                                                      n
                                  ≤ a2           |            ˆ
                                                           p2 ψ(p)|2 dn p + b2 ψ       2
                                                            j
                                            Rn       j=1
                                       2              2
                                  = a H0 ψ                + b2 ψ 2 ,
which implies that Vk is relatively bounded with bound 0. The rest follows
from the Kato–Rellich theorem.

     So V (N ) is H0 bounded with H0 -bound 0 and thus H (N ) = H0 + V (N )
is self-adjoint on D(H (N ) ) = D(H0 ).
242                                            11. Atomic Schr¨dinger operators
                                                              o


11.2. The HVZ theorem
The considerations of the beginning of this section show that it is not so
easy to determine the essential spectrum of H (N ) since the potential does
not decay in all directions as |x| → ∞. However, there is still something we
can do. Denote the infimum of the spectrum of H (N ) by λN . Then, let us
split the system into H (N −1) plus a single electron. If the single electron is
far away from the remaining system such that there is little interaction, the
energy should be the sum of the kinetic energy of the single electron and
the energy of the remaining system. Hence, arguing as in the two electron
example of the previous section, we expect

Theorem 11.2 (HVZ). Let H (N ) be the self-adjoint operator given in (11.6).
Then H (N ) is bounded from below and


                             σess (H (N ) ) = [λN −1 , ∞),                      (11.9)


where λN −1 = min σ(H (N −1) )  0.


    In particular, the ionization energy (i.e., the energy needed to remove
one electron from the atom in its ground state) of an atom with N electrons
is given by λN − λN −1 .
     Our goal for the rest of this section is to prove this result which is due
to Zhislin, van Winter, and Hunziker and is known as the HVZ theorem. In
fact there is a version which holds for general N -body systems. The proof
is similar but involves some additional notation.
    The idea of proof is the following. To prove [λN −1 , ∞) ⊆ σess (H (N ) ),
we choose Weyl sequences for H (N −1) and −∆N and proceed according to
our intuitive picture from above. To prove σess (H (N ) ) ⊆ [λN −1 , ∞), we
will localize H (N ) on sets where one electron is far away from the nucleus
whenever some of the others are. On these sets, the interaction term between
this electron and the nucleus is decaying and hence does not contribute to
the essential spectrum. So it remain to estimate the infimum of the spectrum
of a system where one electron does not interact with the nucleus. Since the
interaction term with the other electrons is positive, we can finally estimate
this infimum by the infimum of the case where one electron is completely
decoupled from the rest.
   We begin with the first inclusion. Let ψ N −1 (x1 , . . . , xN −1 ) ∈ H 2 (R3(N −1) )
such that ψ N −1 = 1, (H (N −1) − λN −1 )ψ N −1 ≤ ε and ψ 1 ∈ H 2 (R3 )
such that ψ 1 = 1, (−∆N − λ)ψ 1 ≤ ε for some λ ≥ 0. Now consider
11.2. The HVZ theorem                                                                  243


ψr (x1 , . . . , xN ) = ψ N −1 (x1 , . . . , xN −1 )ψr (xN ), ψr (xN ) = ψ 1 (xN − r). Then
                                                     1         1


            (H (N ) − λ − λN −1 )ψr ≤ (H (N −1) − λN −1 )ψ N −1              1
                                                                            ψr
                                            + ψ N −1               1
                                                         (−∆N − λ)ψr
                                                        N −1
                                            + (VN −            VN,j )ψr ,         (11.10)
                                                        j=1

where VN = Vne (xN ) and VN,j = Vee (xN − xj ). Using the fact that
(VN − N −1 VN,j )ψ N −1 ∈ L2 (R3N ) and |ψr | → 0 pointwise as |r| → ∞
           j=1
                                          1

(by Lemma 10.1), the third term can be made smaller than ε by choosing
|r| large (dominated convergence). In summary,
                             (H (N ) − λ − λN −1 )ψr ≤ 3ε                         (11.11)
proving [λN −1 , ∞) ⊆ σess (H (N ) ).
   The second inclusion is more involved. We begin with a localization
formula.
Lemma 11.3 (IMS localization formula). Suppose φj ∈ C ∞ (Rn ), 1 ≤ j ≤
m, is such that
                               m
                                    φj (x)2 = 1,    x ∈ Rn .                      (11.12)
                              j=1
Then
                   m
          ∆ψ =          φj ∆(φj ψ) + |∂φj |2 ψ ,         ψ ∈ H 2 (Rn ).           (11.13)
                  j=1

Proof. The proof follows from a straightforward computation using the
identities j φj ∂k φj = 0 and j ((∂k φj )2 + φj ∂k φj ) = 0 which follow by
                                                 2

differentiating (11.12).

   Now we will choose φj , 1 ≤ j ≤ N , in such a way that, for x outside
some ball, x ∈ supp(φj ) implies that the j’th particle is far away from the
nucleus.
Lemma 11.4. Fix some C ∈ (0, √1 ). There exist smooth functions φj ∈
                                       N
C ∞ (Rn , [0, 1]), 1 ≤ j ≤ N , such that (11.12) holds,
                    supp(φj ) ∩ {x| |x| ≥ 1} ⊆ {x| |xj | ≥ C|x|},                 (11.14)
and |∂φj (x)| → 0 as |x| → ∞.

Proof. The open sets
                            Uj = {x ∈ S 3N −1 | |xj |  C}
244                                                   11. Atomic Schr¨dinger operators
                                                                     o


cover the unit sphere in RN ; that is,
                                     N
                                           Uj = S 3N −1 .
                                     j=1

                                            ˜
By Lemma 0.13 there is a partition of unity φj (x) subordinate to this cover.
       ˜
Extend φj (x) to a smooth function from R  3N {0} to [0, 1] by

                    ˜         ˜
                    φj (λx) = φj (x),            x ∈ S 3N −1 , λ  0,
                     ˜
and pick a function φ ∈ C ∞ (R3N , [0, 1]) with support inside the unit ball
which is 1 in a neighborhood of the origin. Then the
                                           ˜        ˜ ˜
                                           φ + (1 − φ)φj
                             φj =
                                           N    ˜          ˜ ˜
                                            =1 (φ +   (1 − φ)φ )2
are the desired functions. The gradient tends to zero since φj (λx) = φj (x)
for λ ≥ 1 and |x| ≥ 1 which implies (∂φj )(λx) = λ−1 (∂φj )(x).

      By our localization formula we have
                             N
                H (N ) =         φj H (N,j) φj + P − K,
                           j=1
                       N                                     N          N
                K=             φ2 Vj + |∂φj |2 ,
                                j                     P =          φ2
                                                                    j         Vj, ,   (11.15)
                       j=1                                   j=1        =j

where
                                     N           N               N
                       (N,j)
                   H           =−          ∆ −        V +               Vk,           (11.16)
                                     =1          =j         k , k, =j
is the Hamiltonian with the j’th electron decoupled from the rest of the
system. Here we have abbreviated Vj (x) = Vne (xj ) and Vj, = Vee (xj − x ).
    Since K vanishes as |x| → ∞, we expect it to be relatively compact with
respect to the rest. By Lemma 6.23 it suffices to check that it is relatively
compact with respect to H0 . The terms |∂φj |2 are bounded and vanish at
∞; hence they are H0 compact by Lemma 7.11. However, the terms φ2 Vj    j
have singularities and will be covered by the following lemma.
Lemma 11.5. Let V be a multiplication operator which is H0 bounded with
H0 -bound 0 and suppose that χ{x||x|≥R} V RH0 (z) → 0 as R → ∞. Then
V is relatively compact with respect to H0 .

Proof. Let ψn converge to 0 weakly. Note that ψn ≤ M for some
M  0. It suffices to show that V RH0 (z)ψn converges to 0. Choose
11.2. The HVZ theorem                                                              245


     ∞
φ ∈ C0 (Rn , [0, 1]) such that it is one for |x| ≤ R. Note φD(H0 ) ⊂ D(H0 ).
Then
           V RH0 (z)ψn ≤ (1 − φ)V RH0 (z)ψn + V φRH0 (z)ψn
                           ≤ (1 − φ)V RH0 (z) ψn
                             + a H0 φRH0 (z)ψn + b φRH0 (z)ψn .
By assumption, the first term can be made smaller than ε by choosing R
large. Next, the same is true for the second term choosing a small since
H0 φRH0 (z) is bounded (by Problem 2.9 and the closed graph theorem).
Finally, the last term can also be made smaller than ε by choosing n large
since φ is H0 compact.

    So K is relatively compact with respect to H (N ) . In particular H (N ) +
K is self-adjoint on H 2 (R3N ) and σess (H (N ) ) = σess (H (N ) + K). Since
the operators H (N,j) , 1 ≤ j ≤ N , are all of the form H (N −1) plus one
particle which does not interact with the others and the nucleus, we have
H (N,j) − λN −1 ≥ 0, 1 ≤ j ≤ N . Moreover, we have P ≥ 0 and hence
                                           N
                             N −1
         ψ, (H   (N )
                        +K −λ       )ψ =         φj ψ, (H (N,j) − λN −1 )φj ψ
                                           j=1
                                           + ψ, P ψ ≥ 0.                        (11.17)
Thus we obtain the remaining inclusion
    σess (H (N ) ) = σess (H (N ) + K) ⊆ σ(H (N ) + K) ⊆ [λN −1 , ∞),           (11.18)
which finishes the proof of the HVZ theorem.
    Note that the same proof works if we add additional nuclei at fixed
locations. That is, we can also treat molecules if we assume that the nuclei
are fixed in space.
    Finally, let us consider the example of helium-like atoms (N = 2). By
the HVZ theorem and the considerations of the previous section we have
                                            2
                                          γne
                            σess (H (2) ) = [−, ∞).                  (11.19)
                                            4
Moreover, if γee = 0 (no electron interaction), we can take products of one-
particle eigenfunctions to show that
         2     1      1
      − γne      2
                   +            ∈ σp (H (2) (γee = 0)),       n, m ∈ N.         (11.20)
              4n     4m2
In particular, there are eigenvalues embedded in the essential spectrum in
this case. Moreover, since the electron interaction term is positive, we see
                                                  2
                                                 γne
                                    H (2) ≥ −        .                          (11.21)
                                                  2
246                                      11. Atomic Schr¨dinger operators
                                                        o


Note that there can be no positive eigenvalues by the virial theorem. This
even holds for arbitrary N ,
                          σp (H (N ) ) ⊂ (−∞, 0).                  (11.22)
Chapter 12




Scattering theory


12.1. Abstract theory
In physical measurements one often has the following situation. A particle
is shot into a region where it interacts with some forces and then leaves
the region again. Outside this region the forces are negligible and hence the
time evolution should be asymptotically free. Hence one expects asymptotic
states ψ± (t) = exp(−itH0 )ψ± (0) to exist such that
                     ψ(t) − ψ± (t) → 0    as   t → ±∞.                  (12.1)

                         !
                         ¡         ψ+ (t)$$$$
                                            X
                         ¡         $$
                           ¡ $$$$
                            $
                    $$$   ¡
            $$$          ¡
                       ¡
                      ¡      ψ(t)
                     ¡
                   ¡
                  ¡
                 ¡
        ψ− (t) ¡
              ¡
             ¡
           ¡
Rewriting this condition, we see
  0 = lim    e−itH ψ(0) − e−itH0 ψ± (0) = lim     ψ(0) − eitH e−itH0 ψ± (0)
      t→±∞                                t→±∞
                                                                        (12.2)
and motivated by this, we define the wave operators by
               D(Ω± ) = {ψ ∈ H|∃ limt→±∞ eitH e−itH0 ψ},
                                                                        (12.3)
                Ω± ψ = limt→±∞ eitH e−itH0 ψ.

                                                                          247
248                                                        12. Scattering theory


The set D(Ω± ) is the set of all incoming/outgoing asymptotic states ψ± and
Ran(Ω± ) is the set of all states which have an incoming/outgoing asymptotic
state. If a state ψ has both, that is, ψ ∈ Ran(Ω+ ) ∩ Ran(Ω− ), it is called a
scattering state.
      By construction we have
                Ω± ψ = lim        eitH e−itH0 ψ = lim     ψ = ψ           (12.4)
                           t→±∞                  t→±∞

and it is not hard to see that D(Ω± ) is closed. Moreover, interchanging the
roles of H0 and H amounts to replacing Ω± by Ω−1 and hence Ran(Ω± ) is
                                                  ±
also closed. In summary,
Lemma 12.1. The sets D(Ω± ) and Ran(Ω± ) are closed and Ω± : D(Ω± ) →
Ran(Ω± ) is unitary.

      Next, observe that
        lim eitH e−itH0 (e−isH0 ψ) = lim e−isH (ei(t+s)H e−i(t+s)H0 ψ)    (12.5)
       t→±∞                          t→±∞
and hence
                 Ω± e−itH0 ψ = e−itH Ω± ψ, ψ ∈ D(Ω± ).                (12.6)
In addition, D(Ω± ) is invariant under exp(−itH0 ) and Ran(Ω± ) is invariant
under exp(−itH). Moreover, if ψ ∈ D(Ω± )⊥ , then
          ϕ, exp(−itH0 )ψ = exp(itH0 )ϕ, ψ = 0,         ϕ ∈ D(Ω± ).       (12.7)
Hence D(Ω± )⊥ is invariant under exp(−itH0 ) and Ran(Ω± )⊥ is invariant
under exp(−itH). Consequently, D(Ω± ) reduces exp(−itH0 ) and Ran(Ω± )
reduces exp(−itH). Moreover, differentiating (12.6) with respect to t, we
obtain from Theorem 5.1 the intertwining property of the wave operators.
Theorem 12.2. The subspaces D(Ω± ), respectively, Ran(Ω± ), reduce H0 ,
respectively, H, and the operators restricted to these subspaces are unitarily
equivalent:
                Ω± H0 ψ = HΩ± ψ,        ψ ∈ D(Ω± ) ∩ D(H0 ).           (12.8)

    It is interesting to know the correspondence between incoming and out-
going states. Hence we define the scattering operator
         S = Ω−1 Ω− ,
              +            D(S) = {ψ ∈ D(Ω− )|Ω− ψ ∈ Ran(Ω+ )}.           (12.9)
Note that we have D(S) = D(Ω− ) if and only if Ran(Ω− ) ⊆ Ran(Ω+ ) and
Ran(S) = D(Ω+ ) if and only if Ran(Ω+ ) ⊆ Ran(Ω− ). Moreover, S is unitary
from D(S) onto Ran(S) and we have
                        H0 Sψ = SH0 ψ,      D(H0 ) ∩ D(S).               (12.10)
However, note that this whole theory is meaningless until we can show that
the domains D(Ω± ) are nontrivial. We first show a criterion due to Cook.
12.1. Abstract theory                                                                    249


Lemma 12.3 (Cook). Suppose D(H) ⊆ D(H0 ). If
              ∞
                  (H − H0 ) exp( itH0 )ψ dt  ∞,                  ψ ∈ D(H0 ),         (12.11)
          0
then ψ ∈ D(Ω± ), respectively. Moreover, we even have
                                            ∞
              (Ω± − I)ψ ≤                       (H − H0 ) exp( itH0 )ψ dt             (12.12)
                                        0
in this case.

Proof. The result follows from
                                        t
     eitH e−itH0 ψ = ψ + i                  exp(isH)(H − H0 ) exp(−isH0 )ψds          (12.13)
                                    0
which holds for ψ ∈ D(H0 ).

   As a simple consequence we obtain the following result for Schr¨dinger
                                                                  o
operators in R3
Theorem 12.4. Suppose H0 is the free Schr¨dinger operator and H =
                                             o
H0 + V with V ∈ L2 (R3 ). Then the wave operators exist and D(Ω± ) = H.

Proof. Since we want to use Cook’s lemma, we need to estimate
                       2
              V ψ(s)       =        |V (x)ψ(s, x)|2 dx,      ψ(s) = exp(isH0 )ψ,
                               R3
for given ψ ∈ D(H0 ). Invoking (7.31), we get
                                         1
            V ψ(s) ≤ ψ(s) ∞ V ≤                ψ                   1   V ,   s  0,
                                      (4πs)3/2
at least for ψ ∈ L1 (R3 ). Moreover, this implies
                          ∞
                                            1
                            V ψ(s) ds ≤ 3/2 ψ 1 V
                        1                 4π
and thus any such ψ is in D(Ω+ ). Since such functions are dense, we obtain
D(Ω+ ) = H, and similarly for Ω− .

     By the intertwining property ψ is an eigenfunction of H0 if and only
if it is an eigenfunction of H. Hence for ψ ∈ Hpp (H0 ) it is easy to check
whether it is in D(Ω± ) or not and only the continuous subspace is of interest.
We will say that the wave operators exist if all elements of Hac (H0 ) are
asymptotic states, that is,
                                            Hac (H0 ) ⊆ D(Ω± ),                       (12.14)
and that they are complete if, in addition, all elements of Hac (H) are
scattering states, that is,
                                        Hac (H) ⊆ Ran(Ω± ).                           (12.15)
250                                                        12. Scattering theory


If we even have
                                Hc (H) ⊆ Ran(Ω± ),                        (12.16)
they are called asymptotically complete.
     We will be mainly interested in the case where H0 is the free Schr¨dinger
                                                                       o
operator and hence Hac (H0 ) = H. In this latter case the wave operators ex-
ist if D(Ω± ) = H, they are complete if Hac (H) = Ran(Ω± ), and they are
asymptotically complete if Hc (H) = Ran(Ω± ). In particular, asymptotic
completeness implies Hsc (H) = {0} since H restricted to Ran(Ω± ) is uni-
tarily equivalent to H0 . Completeness implies that the scattering operator
is unitary. Hence, by the intertwining property, kinetic energy is preserved
during scattering:
      ψ− , H0 ψ− = Sψ− , SH0 ψ− = Sψ− , H0 Sψ− = ψ+ , H0 ψ+               (12.17)
for ψ− ∈ D(H0 ) and ψ+ = Sψ− .

12.2. Incoming and outgoing states
In the remaining sections we want to apply this theory to Schr¨dinger op-
                                                                 o
erators. Our first goal is to give a precise meaning to some terms in the
intuitive picture of scattering theory introduced in the previous section.
    This physical picture suggests that we should be able to decompose
ψ ∈ H into an incoming and an outgoing part. But how should incoming,
respectively, outgoing, be defined for ψ ∈ H? Well, incoming (outgoing)
means that the expectation of x2 should decrease (increase). Set x(t)2 =
exp(iH0 t)x2 exp(−iH0 t). Then, abbreviating ψ(t) = e−itH0 ψ,
       d
          Eψ (x(t)2 ) = ψ(t), i[H0 , x2 ]ψ(t) = 4 ψ(t), Dψ(t) ,   ψ ∈ S(Rn ),
       dt
                                                                      (12.18)
where D is the dilation operator introduced in (10.9). Hence it is natural to
consider ψ ∈ Ran(P± ),
                            P± = PD ((0, ±∞)),                        (12.19)
as outgoing, respectively, incoming, states. If we project a state in Ran(P± )
to energies in the interval (a2 , b2 ), we expect that it cannot be found in a
ball of radius proportional to a|t| as t → ±∞ (a is the minimal velocity of
the particle, since we have assumed the mass to be two). In fact, we will
show below that the tail decays faster then any inverse power of |t|.
      We first collect some properties of D which will be needed later on. Note
                                   FD = −DF                               (12.20)
and hence Ff (D) = f (−D)F. Additionally, we will look for a transforma-
tion which maps D to a multiplication operator.
12.2. Incoming and outgoing states                                                            251


    Since the dilation group acts on |x| only, it seems reasonable to switch to
polar coordinates x = rω, (t, ω) ∈ R+ × S n−1 . Since U (s) essentially trans-
forms r into r exp(s), we will replace r by ρ = log(r). In these coordinates
we have
                       U (s)ψ(eρ ω) = e−ns/2 ψ(e(ρ−s) ω)                (12.21)
and hence U (s) corresponds to a shift of ρ (the constant in front is absorbed
by the volume element). Thus D corresponds to differentiation with respect
to this coordinate and all we have to do to make it a multiplication operator
is to take the Fourier transform with respect to ρ.
   This leads us to the Mellin transform
  M : L2 (Rn ) → L2 (R × S n−1 ),                                                          (12.22)
                                                          ∞
                                       1                                      n
        ψ(rω) → (Mψ)(λ, ω) = √                                r−iλ ψ(rω)r     2
                                                                                −1
                                                                                     dr.
                                       2π             0
By construction, M is unitary; that is,

                  |(Mψ)(λ, ω)|2 dλdn−1 ω =                        |ψ(rω)|2 rn−1 drdn−1 ω,
      R   S n−1                                  R+       S n−1
                                                                                           (12.23)
where   dn−1 ω    is the normalized surface measure on              S n−1 .   Moreover,
                                   −1                 −isλ
                               M        U (s)M = e                                         (12.24)
and hence
                               M−1 DM = λ.                                                 (12.25)
From this it is straightforward to show that
                   σ(D) = σac (D) = R,           σsc (D) = σpp (D) = ∅                     (12.26)
and that S(Rn ) is a core for D. In particular we have P+ + P− = I.
   Using the Mellin transform, we can now prove Perry’s estimate.
                                     ∞
Lemma 12.5. Suppose f ∈ Cc (R) with supp(f ) ⊂ (a2 , b2 ) for some a, b 
0. For any R ∈ R, N ∈ N there is a constant C such that
                                                     C
   χ{x| |x|2a|t|} e−itH0 f (H0 )PD ((±R, ±∞)) ≤            , ±t ≥ 0, (12.27)
                                                 (1 + |t|)N
respectively.

Proof. We prove only the + case, the remaining one being similar. Consider
ψ ∈ S(Rn ). Introducing
        ψ(t, x) = e−itH0 f (H0 )PD ((R, ∞))ψ(x) = Kt,x , FPD ((R, ∞))ψ
                                        ˆ
                  = Kt,x , PD ((−∞, −R))ψ ,
where
                                         1          2
                         Kt,x (p) =        n/2
                                               ei(tp −px) f (p2 )∗ ,
                                      (2π)
252                                                                    12. Scattering theory


we see that it suffices to show
                               2          const
         PD ((−∞, −R))Kt,x         ≤               ,             for |x|  2a|t|, t  0.
                                       (1 + |t|)2N
Now we invoke the Mellin transform to estimate this norm:
                                         −R
                               2
         PD ((−∞, −R))Kt,x         =                   |(MKt,x )(λ, ω)|2 dλdn−1 ω.
                                        −∞     S n−1
Since
                                                             ∞
                                              1                  ˜
                 (MKt,x )(λ, ω) =                                f (r)eiα(r) dr         (12.28)
                                       (2π)(n+1)/2       0
     ˜
with f (r) = f (r2 )∗ rn/2−1 ∈ Cc ((a, b)), α(r) = tr2 + rωx − λ log(r). Esti-
                                ∞

mating the derivative of α, we see
                      α (r) = 2tr + ωx − λ/r  0,             r ∈ (a, b),
for λ ≤ −R and t       −R(2εa)−1 ,
                               where ε is the distance of a to the support
   ˜. Hence we can find a constant such that
of f
                  |α (r)| ≥ const(1 + |λ| + |t|),                     ˜
                                                             r ∈ supp(f ),
for λ ≤ −R, t  −R(εa)−1 . Now the method of stationary phase (Prob-
lem 12.1) implies
                                          const
                  |(MKt,x )(λ, ω)| ≤
                                     (1 + |λ| + |t|)N
for λ, t as before. By increasing the constant, we can even assume that it
holds for t ≥ 0 and λ ≤ −R. This finishes the proof.
                                      ∞
Corollary 12.6. Suppose that f ∈ Cc ((0, ∞)) and R ∈ R. Then the
operator PD ((±R, ±∞))f (H0 ) exp(−itH0 ) converges strongly to 0 as t →
  ∞.

Proof. Abbreviating PD = PD ((±R, ±∞)) and χ = χ{x| |x|2a|t|} , we have
        PD f (H0 )e−itH0 ψ ≤ χeitH0 f (H0 )∗ PD           ψ + f (H0 )             (I − χ)ψ
since A =        A∗
                 . Taking t → ∞, the first term goes to zero by our
lemma and the second goes to zero since χψ → ψ.
Problem 12.1 (Method of stationary phase). Consider the integral
                                          ∞
                             I(t) =           f (r)eitφ(r) dr
                                        −∞
with f ∈     ∞
            Cc (R)and a real-valued phase φ ∈ C ∞ (R). Show that |I(t)| ≤
CN t−N for any N ∈ N if |φ (r)| ≥ 1 for r ∈ supp(f ). (Hint: Make a change

of variables ρ = φ(r) and conclude that it suffices to show the case φ(r) = r.
Now use integration by parts.)
12.3. Schr¨dinger operators with short range potentials
          o                                                                        253


12.3. Schr¨dinger operators with short range potentials
          o
By the RAGE theorem we know that for ψ ∈ Hc , ψ(t) will eventually leave
every compact ball (at least on the average). Hence we expect that the
time evolution will asymptotically look like the free one for ψ ∈ Hc if the
potential decays sufficiently fast. In other words, we expect such potentials
to be asymptotically complete.
   Suppose V is relatively bounded with bound less than one. Introduce
        h1 (r) = V RH0 (z)χr ,    h2 (r) = χr V RH0 (z) ,         r ≥ 0,       (12.29)
where
                               χr = χ{x| |x|≥r} .                      (12.30)
The potential V will be called short range if these quantities are integrable.
We first note that it suffices to check this for h1 or h2 and for one z ∈ ρ(H0 ).
Lemma 12.7. The function h1 is integrable if and only if h2 is. Moreover,
hj integrable for one z0 ∈ ρ(H0 ) implies hj integrable for all z ∈ ρ(H0 ).
                     ∞
Proof. Pick φ ∈ Cc (Rn , [0, 1]) such that φ(x) = 0 for 0 ≤ |x| ≤ 1/2 and
φ(x) = 0 for 1 ≤ |x|. Then it is not hard to see that hj is integrable if and
        ˜
only if hj is integrable, where
           ˜
           h1 (r) = V RH0 (z)φr ,     ˜
                                      h2 (r) = φr V RH0 (z) ,         r ≥ 1,
and φr (x) = φ(x/r). Using
                 [RH0 (z), φr ] = −RH0 (z)[H0 (z), φr ]RH0 (z)
                              = RH0 (z)(∆φr + (∂φr )∂)RH0 (z)
and ∆φr = φr/2 ∆φr , ∆φr ∞ ≤ ∆φ                 2,
                                         ∞ /r        respectively, (∂φr ) = φr/2 (∂φr ),
 ∂φr ∞ ≤ ∂φ ∞ /r2 , we see
                      ˜        ˜         c˜
                     |h1 (r) − h2 (r)| ≤ h1 (r/2), r ≥ 1.
                                         r
      ˜ 2 is integrable if h1 is. Conversely,
Hence h                    ˜

        ˜         ˜        c˜           ˜       c˜        2c ˜
        h1 (r) ≤ h2 (r) + h1 (r/2) ≤ h2 (r) + h2 (r/2) + 2 h1 (r/4)
                           r                    r         r
            ˜                    ˜
shows that h2 is integrable if h1 is.
   Invoking the first resolvent formula
              φr V RH0 (z) ≤ φr V RH0 (z0 )          I − (z − z0 )RH0 (z)
finishes the proof.

   As a first consequence note
Lemma 12.8. If V is short range, then RH (z) − RH0 (z) is compact.
254                                                      12. Scattering theory


Proof. The operator RH (z)V (I−χr )RH0 (z) is compact since (I−χr )RH0 (z)
is by Lemma 7.11 and RH (z)V is bounded by Lemma 6.23. Moreover, by
our short range condition it converges in norm to
                     RH (z)V RH0 (z) = RH (z) − RH0 (z)
as r → ∞ (at least for some subsequence).

   In particular, by Weyl’s theorem we have σess (H) = [0, ∞). Moreover,
V short range implies that H and H0 look alike far outside.

Lemma 12.9. Suppose RH (z)−RH0 (z) is compact. Then so is f (H)−f (H0 )
for any f ∈ C∞ (R) and
                        lim (f (H) − f (H0 ))χr = 0.                     (12.31)
                        r→∞

Proof. The first part is Lemma 6.21 and the second part follows from part
(ii) of Lemma 6.8 since χr converges strongly to 0.

   However, this is clearly not enough to prove asymptotic completeness
and we need a more careful analysis.
   We begin by showing that the wave operators exist. By Cook’s criterion
(Lemma 12.3) we need to show that
      V exp( itH0 )ψ ≤ V RH0 (−1)        (I − χ2a|t| ) exp( itH0 )(H0 + I)ψ
                         + V RH0 (−1)χ2a|t|      (H0 + I)ψ               (12.32)
is integrable for a dense set of vectors ψ. The second term is integrable by our
short range assumption. The same is true by Perry’s estimate (Lemma 12.5)
for the first term if we choose ψ = f (H0 )PD ((±R, ±∞))ϕ. Since vectors of
this form are dense, we see that the wave operators exist,
                                 D(Ω± ) = H.                             (12.33)
Since H restricted to Ran(Ω∗ ) is unitarily equivalent to H0 , we obtain
                              ±
[0, ∞) = σac (H0 ) ⊆ σac (H). Furthermore, by σac (H) ⊆ σess (H) = [0, ∞)
we even have σac (H) = [0, ∞).
    To prove asymptotic completeness of the wave operators, we will need
that the (Ω± − I)f (H0 )P± are compact.
                      ∞
Lemma 12.10. Let f ∈ Cc ((0, ∞)) and suppose ψn converges weakly to 0.
Then
                       lim (Ω± − I)f (H0 )P± ψn = 0;                     (12.34)
                      n→∞

that is, (Ω± − I)f (H0 )P± is compact.
12.3. Schr¨dinger operators with short range potentials
          o                                                                255


Proof. By (12.13) we see
                                       ∞
  RH (z)(Ω± − I)f (H0 )P± ψn ≤             RH (z)V exp(−isH0 )f (H0 )P± ψn dt.
                                   0
Since RH (z)V RH0 is compact, we see that the integrand
           RH (z)V exp(−isH0 )f (H0 )P± ψn
                 = RH (z)V RH0 exp(−isH0 )(H0 + 1)f (H0 )P± ψn
converges pointwise to 0. Moreover, arguing as in (12.32), the integrand
is bounded by an L1 function depending only on ψn . Thus RH (z)(Ω± −
I)f (H0 )P± is compact by the dominated convergence theorem. Furthermore,
using the intertwining property, we see that
                     ˜
             (Ω± − I)f (H0 )P± =RH (z)(Ω± − I)f (H0 )P±
                                  − (RH (z) − RH0 (z))f (H0 )P±
                                ˜
is compact by Lemma 6.21, where f (λ) = (λ + 1)f (λ).

   Now we have gathered enough information to tackle the problem of
asymptotic completeness.
    We first show that the singular continuous spectrum is absent. This
is not really necessary, but it avoids the use of Ces`ro means in our main
                                                     a
argument.
                         sc
     Abbreviate P = PH PH ((a, b)), 0  a  b. Since H restricted to
Ran(Ω± ) is unitarily equivalent to H0 (which has purely absolutely continu-
ous spectrum), the singular part must live on Ran(Ω± )⊥ ; that is, PH Ω± = 0.
                                                                    sc

Thus P f (H0 ) = P (I − Ω+ )f (H0 )P+ + P (I − Ω− )f (H0 )P− is compact. Since
f (H) − f (H0 ) is compact, it follows that P f (H) is also compact. Choos-
ing f such that f (λ) = 1 for λ ∈ [a, b], we see that P = P f (H) is com-
pact and hence finite dimensional. In particular σsc (H) ∩ (a, b) is a fi-
nite set. But a continuous measure cannot be supported on a finite set,
showing σsc (H) ∩ (a, b) = ∅. Since 0  a  b are arbitrary, we even
have σsc (H) ∩ (0, ∞) = ∅ and by σsc (H) ⊆ σess (H) = [0, ∞), we obtain
σsc (H) = ∅.
                              sc     pp
   Observe that replacing PH by PH , the same argument shows that all
nonzero eigenvalues are finite dimensional and cannot accumulate in (0, ∞).
   In summary we have shown
Theorem 12.11. Suppose V is short range. Then
                σac (H) = σess (H) = [0, ∞),        σsc (H) = ∅.        (12.35)
All nonzero eigenvalues have finite multiplicity and cannot accumulate in
(0, ∞).
256                                                          12. Scattering theory


    Now we come to the anticipated asymptotic completeness result of Enß.
Choose
              ψ ∈ Hc (H) = Hac (H) such that ψ = f (H)ψ           (12.36)
for some f ∈ Cc∞ ((0, ∞)). By the RAGE theorem the sequence ψ(t) con-

verges weakly to zero as t → ±∞. Abbreviate ψ(t) = exp(−itH)ψ. Intro-
duce
                          ϕ± (t) = f (H0 )P± ψ(t),                (12.37)
which satisfy
                      lim ψ(t) − ϕ+ (t) − ϕ− (t) = 0.             (12.38)
                        t→±∞
Indeed this follows from
               ψ(t) = ϕ+ (t) + ϕ− (t) + (f (H) − f (H0 ))ψ(t)              (12.39)
and Lemma 6.21. Moreover, we even have
                                 lim   (Ω± − I)ϕ± (t) = 0                  (12.40)
                                t→±∞

by Lemma 12.10. Now suppose ψ ∈ Ran(Ω± )⊥ . Then
                        2
                    ψ       = lim ψ(t), ψ(t)
                                t→±∞
                            = lim ψ(t), ϕ+ (t) + ϕ− (t)
                                t→±∞
                            = lim ψ(t), Ω+ ϕ+ (t) + Ω− ϕ− (t) .            (12.41)
                                t→±∞

By Theorem 12.2, Ran(Ω± )⊥ is invariant under H and thus ψ(t) ∈ Ran(Ω± )⊥
implying
                            2
                    ψ           = lim ψ(t), Ω ϕ (t)                        (12.42)
                                  t→±∞
                                = lim P f (H0 )∗ Ω∗ ψ(t), ψ(t) .
                                  t→±∞
Invoking the intertwining property, we see
                ψ   2
                        = lim P f (H0 )∗ e−itH0 Ω∗ ψ, ψ(t) = 0             (12.43)
                            t→±∞
by Corollary 12.6. Hence Ran(Ω± ) = Hac (H) = Hc (H) and we thus have
shown
Theorem 12.12 (Enß). Suppose V is short range. Then the wave operators
are asymptotically complete.
Part 3

Appendix
Mathematical methods in quantum mechanics
Appendix A




Almost everything
about Lebesgue
integration


In this appendix I give a brief introduction to measure theory. Good refer-
ences are [7], [32], or [47].


A.1. Borel measures in a nut shell
The first step in defining the Lebesgue integral is extending the notion of
size from intervals to arbitrary sets. Unfortunately, this turns out to be too
much, since a classical paradox by Banach and Tarski shows that one can
break the unit ball in R3 into a finite number of (wild – choosing the pieces
uses the Axiom of Choice and cannot be done with a jigsaw;-) pieces, rotate
and translate them, and reassemble them to get two copies of the unit ball
(compare Problem A.1). Hence any reasonable notion of size (i.e., one which
is translation and rotation invariant) cannot be defined for all sets!
   A collection of subsets A of a given set X such that
       • X ∈ A,
       • A is closed under finite unions,
       • A is closed under complements
is called an algebra. Note that ∅ ∈ A and that, by de Morgan, A is also
closed under finite intersections. If an algebra is closed under countable
unions (and hence also countable intersections), it is called a σ-algebra.

                                                                          259
260                            A. Almost everything about Lebesgue integration


    Moreover, the intersection of any family of (σ-)algebras {Aα } is again
a (σ-)algebra and for any collection S of subsets there is a unique smallest
(σ-)algebra Σ(S) containing S (namely the intersection of all (σ-)algebras
containing S). It is called the (σ-)algebra generated by S.
    If X is a topological space, the Borel σ-algebra of X is defined to be
the σ-algebra generated by all open (respectively, all closed) sets. Sets in
the Borel σ-algebra are called Borel sets.
Example. In the case X = Rn the Borel σ-algebra will be denoted by Bn
and we will abbreviate B = B1 .

    Now let us turn to the definition of a measure: A set X together with
a σ-algebra Σ is called a measurable space. A measure µ is a map
µ : Σ → [0, ∞] on a σ-algebra Σ such that
        • µ(∅) = 0,
                              ∞
               ∞
        • µ(   j=1 Aj )   =         µ(Aj ) if Aj ∩ Ak = ∅ for all j = k (σ-additivity).
                              j=1

It is called σ-finite if there is a countable cover {Xj }∞ of X with µ(Xj ) 
                                                        j=1
∞ for all j. (Note that it is no restriction to assume Xj ⊆ Xj+1 .) It is
called finite if µ(X)  ∞. The sets in Σ are called measurable sets and
the triple X, Σ, and µ is referred to as a measure space.
   If we replace the σ-algebra by an algebra A, then µ is called a premea-
sure. In this case σ-additivity clearly only needs to hold for disjoint sets
An for which n An ∈ A.
   We will write An           A if An ⊆ An+1 (note A =          n An )   and An   A if
An+1 ⊆ An (note A =           n An ).

Theorem A.1. Any measure µ satisfies the following properties:
       (i) A ⊆ B implies µ(A) ≤ µ(B) (monotonicity).
      (ii) µ(An ) → µ(A) if An            A (continuity from below).
      (iii) µ(An ) → µ(A) if An           A and µ(A1 )  ∞ (continuity from above).

                                                            ˜
Proof. The first claim is obvious. The second follows using An = An An−1
                                                          ˜
and σ-additivity. The third follows from the second using An = A1 An and
  ˜n ) = µ(A1 ) − µ(An ).
µ(A

Example. Let A ∈ P(M ) and set µ(A) to be the number of elements of A
(respectively, ∞ if A is infinite). This is the so-called counting measure.
    Note that if X = N and An = {j ∈ N|j ≥ n}, then µ(An ) = ∞, but
µ( n An ) = µ(∅) = 0 which shows that the requirement µ(A1 )  ∞ in the
last claim of Theorem A.1 is not superfluous.
A.1. Borel measures in a nut shell                                         261


   A measure on the Borel σ-algebra is called a Borel measure if µ(C) 
∞ for any compact set C. A Borel measures is called outer regular if

                           µ(A) =        inf     µ(O)                    (A.1)
                                    A⊆O,O   open
and inner regular if

                        µ(A) =         sup      µ(C).                    (A.2)
                                 C⊆A,C  compact
It is called regular if it is both outer and inner regular.
    But how can we obtain some more interesting Borel measures? We will
restrict ourselves to the case of X = R for simplicity. Then the strategy
is as follows: Start with the algebra of finite unions of disjoint intervals
and define µ for those sets (as the sum over the intervals). This yields a
premeasure. Extend this to an outer measure for all subsets of R. Show
that the restriction to the Borel sets is a measure.
   Let us first show how we should define µ for intervals: To every Borel
measure on B we can assign its distribution function
                            
                             −µ((x, 0]), x  0,
                    µ(x) =     0,          x = 0,                 (A.3)
                               µ((0, x]),  x  0,
                            

which is right continuous and nondecreasing. Conversely, given a right con-
tinuous nondecreasing function µ : R → R, we can set
                         
                          µ(b) − µ(a),
                                            A = (a, b],
                            µ(b) − µ(a−),    A = [a, b],
                         
                  µ(A) =                                              (A.4)
                          µ(b−) − µ(a),
                                            A = (a, b),
                            µ(b−) − µ(a−), A = [a, b),
                         

where µ(a−) = limε↓0 µ(a − ε). In particular, this gives a premeasure on the
algebra of finite unions of intervals which can be extended to a measure:

Theorem A.2. For every right continuous nondecreasing function µ : R →
R there exists a unique regular Borel measure µ which extends (A.4). Two
different functions generate the same measure if and only if they differ by a
constant.

    Since the proof of this theorem is rather involved, we defer it to the next
section and look at some examples first.
Example. Suppose Θ(x) = 0 for x  0 and Θ(x) = 1 for x ≥ 0. Then we
obtain the so-called Dirac measure at 0, which is given by Θ(A) = 1 if
0 ∈ A and Θ(A) = 0 if 0 ∈ A.
262                         A. Almost everything about Lebesgue integration


Example. Suppose λ(x) = x. Then the associated measure is the ordinary
Lebesgue measure on R. We will abbreviate the Lebesgue measure of a
Borel set A by λ(A) = |A|.

    It can be shown that Borel measures on a locally compact second count-
able space are always regular ([7, Thm. 29.12]).
    A set A ∈ Σ is called a support for µ if µ(XA) = 0. A property is
said to hold µ-almost everywhere (a.e.) if it holds on a support for µ or,
equivalently, if the set where it does not hold is contained in a set of measure
zero.
Example. The set of rational numbers has Lebesgue measure zero: λ(Q) =
0. In fact, any single point has Lebesgue measure zero, and so has any
countable union of points (by countable additivity).

Example. The Cantor set is an example of a closed uncountable set of
Lebesgue measure zero. It is constructed as follows: Start with C0 = [0, 1]
and remove the middle third to obtain C1 = [0, 3 ]∪[ 2 , 1]. Next, again remove
                                                  1
                                                     3
the middle third’s of the remaining sets to obtain C2 = [0, 1 ]∪[ 2 , 1 ]∪[ 2 , 7 ]∪
                                                              9    9 3      3 9
[ 8 , 1]:
  9
                                                           C0
                                                           C1
                                                           C2
                                         .
                                         .                 C3
                                         .
Proceeding like this, we obtain a sequence of nesting sets Cn and the limit
C = n Cn is the Cantor set. Since Cn is compact, so is C. Moreover,
Cn consists of 2n intervals of length 3−n , and thus its Lebesgue measure
is λ(Cn ) = (2/3)n . In particular, λ(C) = limn→∞ λ(Cn ) = 0. Using the
ternary expansion, it is extremely simple to describe: C is the set of all
x ∈ [0, 1] whose ternary expansion contains no one’s, which shows that C is
uncountable (why?). It has some further interesting properties: it is totally
disconnected (i.e., it contains no subintervals) and perfect (it has no isolated
points).




Problem A.1 (Vitali set). Call two numbers x, y ∈ [0, 1) equivalent if x − y
is rational. Construct the set V by choosing one representative from each
equivalence class. Show that V cannot be measurable with respect to any
nontrivial finite translation invariant measure on [0, 1). (Hint: How can
you build up [0, 1) from translations of V ?)
A.2. Extending a premeasure to a measure                                    263


A.2. Extending a premeasure to a measure
The purpose of this section is to prove Theorem A.2. It is rather technical and
should be skipped on first reading.
    In order to prove Theorem A.2, we need to show how a premeasure can
be extended to a measure. As a prerequisite we first establish that it suffices
to check increasing (or decreasing) sequences of sets when checking whether
a given algebra is in fact a σ-algebra:
    A collection of sets M is called a monotone class if An       A implies
A ∈ M whenever An ∈ M and An           A implies A ∈ M whenever An ∈ M.
Every σ-algebra is a monotone class and the intersection of monotone classes
is a monotone class. Hence every collection of sets S generates a smallest
monotone class M(S).
Theorem A.3. Let A be an algebra. Then M(A) = Σ(A).

Proof. We first show that M = M(A) is an algebra.
     Put M (A) = {B ∈ M|A ∪ B ∈ M}. If Bn is an increasing sequence
of sets in M (A), then A ∪ Bn is an increasing sequence in M and hence
  n (A ∪ Bn ) ∈ M. Now

                         A∪         Bn =       (A ∪ Bn )
                                n          n
shows that M (A) is closed under increasing sequences. Similarly, M (A) is
closed under decreasing sequences and hence it is a monotone class. But
does it contain any elements? Well, if A ∈ A, we have A ⊆ M (A) implying
M (A) = M for A ∈ A. Hence A ∪ B ∈ M if at least one of the sets is in A.
But this shows A ⊆ M (A) and hence M (A) = M for any A ∈ M. So M is
closed under finite unions.
    To show that we are closed under complements, consider M = {A ∈
M|XA ∈ M}. If An is an increasing sequence, then XAn is a decreasing
sequence and X n An = n XAn ∈ M if An ∈ M and similarly for
decreasing sequences. Hence M is a monotone class and must be equal to
M since it contains A.
   So we know that M is an algebra. To show that it is a σ-algebra, let
                        ˜                         ˜
An ∈ M be given and put An = k≤n An ∈ M. Then An is increasing and
   ˜
 n An =   n An ∈ M.

    The typical use of this theorem is as follows: First verify some property
for sets in an algebra A. In order to show that it holds for any set in Σ(A), it
suffices to show that the collection of sets for which it holds is closed under
countable increasing and decreasing sequences (i.e., is a monotone class).
    Now we start by proving that (A.4) indeed gives rise to a premeasure.
264                          A. Almost everything about Lebesgue integration


Lemma A.4. The interval function µ defined in (A.4) gives rise to a unique
σ-finite regular premeasure on the algebra A of finite unions of disjoint in-
tervals.

Proof. First of all, (A.4) can be extended to finite unions of disjoint inter-
vals by summing over all intervals. It is straightforward to verify that µ is
well-defined (one set can be represented by different unions of intervals) and
by construction additive.
    To show regularity, we can assume any such union to consist of open
intervals and points only. To show outer regularity, replace each point {x}
by a small open interval (x+ε, x−ε) and use that µ({x}) = limε↓0 µ(x+ε)−
µ(x−ε). Similarly, to show inner regularity, replace each open interval (a, b)
by a compact one, [an , bn ] ⊆ (a, b), and use µ((a, b)) = limn→∞ µ(bn )−µ(an )
if an ↓ a and bn ↑ b.
      It remains to verify σ-additivity. We need to show

                                  µ(       Ik ) =            µ(Ik )
                                       k                k

whenever In ∈ A and I = k Ik ∈ A. Since each In is a finite union of in-
tervals, we can as well assume each In is just one interval (just split In into
its subintervals and note that the sum does not change by additivity). Sim-
ilarly, we can assume that I is just one interval (just treat each subinterval
separately).
      By additivity µ is monotone and hence
                          n                         n
                                  µ(Ik ) = µ(               Ik ) ≤ µ(I)
                         k=1                      k=1

which shows
                                       ∞
                                           µ(Ik ) ≤ µ(I).
                                    k=1
To get the converse inequality, we need to work harder.
    By outer regularity we can cover each Ik by some open interval Jk such
that µ(Jk ) ≤ µ(Ik ) + 2εk . First suppose I is compact. Then finitely many of
the Jk , say the first n, cover I and we have
                              n               n                   ∞
                 µ(I) ≤ µ(         Jk ) ≤           µ(Jk ) ≤            µ(Ik ) + ε.
                             k=1             k=1                  k=1

Since ε  0 is arbitrary, this shows σ-additivity for compact intervals. By
additivity we can always add/subtract the endpoints of I and hence σ-
additivity holds for any bounded interval. If I is unbounded, say I = [a, ∞),
A.2. Extending a premeasure to a measure                                      265


then given x  0, we can find an n such that Jn cover at least [0, x] and
hence
                    n                n
                         µ(Ik ) ≥         µ(Jk ) − ε ≥ µ([a, x]) − ε.
                   k=1              k=1
Since x  a and ε  0 are arbitrary, we are done.

    This premeasure determines the corresponding measure µ uniquely (if
there is one at all):
Theorem A.5 (Uniqueness of measures). Let µ be a σ-finite premeasure
on an algebra A. Then there is at most one extension to Σ(A).

Proof. We first assume that µ(X)  ∞. Suppose there is another extension
µ and consider the set
˜
                          S = {A ∈ Σ(A)|µ(A) = µ(A)}.
                                               ˜
I claim S is a monotone class and hence S = Σ(A) since A ⊆ S by assump-
tion (Theorem A.3).
    Let An     A. If An ∈ S, we have µ(An ) = µ(An ) and taking limits
                                                     ˜
(Theorem A.1 (ii)), we conclude µ(A) = µ(A). Next let An
                                           ˜                       A and take
limits again. This finishes the finite case. To extend our result to the σ-finite
case, let Xj    X be an increasing sequence such that µ(Xj )  ∞. By the
finite case µ(A ∩ Xj ) = µ(A ∩ Xj ) (just restrict µ, µ to Xj ). Hence
                         ˜                           ˜
              µ(A) = lim µ(A ∩ Xj ) = lim µ(A ∩ Xj ) = µ(A)
                                          ˜            ˜
                         j→∞                    j→∞

and we are done.

    Note that if our premeasure is regular, so will the extension be:
Lemma A.6. Suppose µ is a σ-finite measure on the Borel sets B. Then
outer (inner) regularity holds for all Borel sets if it holds for all sets in some
algebra A generating the Borel sets B.

Proof. We first assume that µ(X)  ∞. Set
                         µ◦ (A) =          inf     µ(O) ≥ µ(A)
                                    A⊆O,O     open
and let M = {A ∈ B|µ◦ (A) = µ(A)}. Since by assumption M contains
some algebra generating B, it suffices to prove that M is a monotone class.
    Let An ∈ M be a monotone sequence and let On ⊇ An be open sets such
that µ(On ) ≤ µ(An ) + 2ε . Then
                        n

                                                 ε
                      µ(An ) ≤ µ(On ) ≤ µ(An ) + n .
                                                2
266                            A. Almost everything about Lebesgue integration


Now if An    A, just take limits and use continuity from below of µ to see
that On ⊇ An ⊇ A is a sequence of open sets with µ(On ) → µ(A). Similarly
if An   A, observe that O = n On satisfies O ⊇ A and
                   µ(O) ≤ µ(A) +                 µ(On A) ≤ µ(A) + ε
                                       ε
since µ(On A) ≤ µ(On An ) ≤         2n .
    Next let µ be arbitrary. Let Xj be a cover with µ(Xj )  ∞. Given
A, we can split it into disjoint sets Aj such that Aj ⊆ Xj (A1 = A ∩ X1 ,
A2 = (AA1 ) ∩ X2 , etc.). By regularity, we can assume Xj open. Thus there
are open (in X) sets Oj covering Aj such that µ(Oj ) ≤ µ(Aj ) + 2εj . Then
O = j Oj is open, covers A, and satisfies

                    µ(A) ≤ µ(O) ≤                 µ(Oj ) ≤ µ(A) + ε.
                                             j

This settles outer regularity.
    Next let us turn to inner regularity. If µ(X)  ∞, one can show as
before that M = {A ∈ B|µ◦ (A) = µ(A)}, where
                    µ◦ (A) =             sup      µ(C) ≤ µ(A)
                                 C⊆A,C    compact
is a monotone class. This settles the finite case.
    For the σ-finite case split A again as before. Since Xj has finite measure,
there are compact subsets Kj of Aj such that µ(Aj ) ≤ µ(Kj ) + 2εj . Now
we need to distinguish two cases: If µ(A) = ∞, the sum             j µ(Aj ) will
diverge and so will                       ˜ n = n ⊆ A is compact with
                        j µ(Kj ). Hence K          j=1
µ(K˜ n ) → ∞ = µ(A). If µ(A)  ∞, the sum
                                                    j µ(Aj ) will converge and
choosing n sufficiently large, we will have
                           ˜                 ˜
                        µ(Kn ) ≤ µ(A) ≤ µ(Kn ) + 2ε.
This finishes the proof.

   So it remains to ensure that there is an extension at all. For any pre-
measure µ we define
                                  ∞                     ∞
                µ∗ (A) = inf           µ(An ) A ⊆            An , A n ∈ A   (A.5)
                                 n=1                   n=1
where the infimum extends over all countable covers from A. Then the
function µ∗ : P(X) → [0, ∞] is an outer measure; that is, it has the
properties (Problem A.2)
       • µ∗ (∅) = 0,
       • A1 ⊆ A2 ⇒ µ∗ (A1 ) ≤ µ∗ (A2 ), and
                ∞                ∞
       • µ∗ (   n=1 An )   ≤          ∗
                                 n=1 µ (An )         (subadditivity).
A.2. Extending a premeasure to a measure                                 267


Note that µ∗ (A) = µ(A) for A ∈ A (Problem A.3).
Theorem A.7 (Extensions via outer measures). Let µ∗ be an outer measure.
Then the set Σ of all sets A satisfying the Carath´odory condition
                                                  e
               µ∗ (E) = µ∗ (A ∩ E) + µ∗ (A ∩ E),            ∀E ⊆ X     (A.6)
(where A = XA is the complement of A) forms a σ-algebra and µ∗ re-
stricted to this σ-algebra is a measure.

Proof. We first show that Σ is an algebra. It clearly contains X and is closed
under complements. Let A, B ∈ Σ. Applying Carath´odory’s condition
                                                          e
twice finally shows
        µ∗ (E) =µ∗ (A ∩ B ∩ E) + µ∗ (A ∩ B ∩ E) + µ∗ (A ∩ B ∩ E)
                 + µ∗ (A ∩ B ∩ E)
               ≥µ∗ ((A ∪ B) ∩ E) + µ∗ ((A ∪ B) ∩ E),
where we have used de Morgan and
   µ∗ (A ∩ B ∩ E) + µ∗ (A ∩ B ∩ E) + µ∗ (A ∩ B ∩ E) ≥ µ∗ ((A ∪ B) ∩ E)
which follows from subadditivity and (A ∪ B) ∩ E = (A ∩ B ∩ E) ∪ (A ∩
B ∩ E) ∪ (A ∩ B ∩ E). Since the reverse inequality is just subadditivity, we
conclude that Σ is an algebra.
   Next, let An be a sequence of sets from Σ. Without restriction we
can assume that they are disjoint (compare the last argument in proof of
                          ˜
Theorem A.3). Abbreviate An = k≤n An , A = n An . Then for any set E
we have
           µ∗ (An ∩ E) = µ∗ (An ∩ An ∩ E) + µ∗ (An ∩ An ∩ E)
               ˜                   ˜                 ˜
                        = µ∗ (An ∩ E) + µ∗ (An−1 ∩ E)
                                            ˜
                                        n
                        = ... =              µ∗ (Ak ∩ E).
                                       k=1
      ˜
Using An ∈ Σ and monotonicity of µ∗ , we infer
                 µ∗ (E) = µ∗ (An ∩ E) + µ∗ (An ∩ E)
                              ˜              ˜
                                 n
                            ≥         µ∗ (Ak ∩ E) + µ∗ (A ∩ E).
                                k=1
Letting n → ∞ and using subadditivity finally gives
                            ∞
                 µ∗ (E) ≥         µ∗ (Ak ∩ E) + µ∗ (A ∩ E)
                            k=1
                       ≥ µ∗ (A ∩ E) + µ∗ (B ∩ E) ≥ µ∗ (E)              (A.7)
268                              A. Almost everything about Lebesgue integration


and we infer that Σ is a σ-algebra.
      Finally, setting E = A in (A.7), we have
                           ∞                                  ∞
               ∗                  ∗              ∗
              µ (A) =            µ (Ak ∩ A) + µ (A ∩ A) =          µ∗ (Ak )
                           k=1                               k=1
and we are done.

    Remark: The constructed measure µ is complete; that is, for any mea-
surable set A of measure zero, any subset of A is again measurable (Prob-
lem A.4).
    The only remaining question is whether there are any nontrivial sets
satisfying the Carath´odory condition.
                     e
Lemma A.8. Let µ be a premeasure on A and let µ∗ be the associated outer
measure. Then every set in A satisfies the Carath´odory condition.
                                                e

Proof. Let An ∈ A be a countable cover for E. Then for any A ∈ A we
have
  ∞                ∞                   ∞
        µ(An ) =         µ(An ∩ A) +         µ(An ∩ A ) ≥ µ∗ (E ∩ A) + µ∗ (E ∩ A )
  n=1              n=1                 n=1
since An ∩ A ∈ A is a cover for E ∩ A and An ∩ A ∈ A is a cover for E ∩ A .
Taking the infimum, we have µ∗ (E) ≥ µ∗ (E ∩A)+µ∗ (E ∩A ), which finishes
the proof.

      Thus, as a consequence we obtain Theorem A.2.
Problem A.2. Show that µ∗ defined in (A.5) is an outer measure. (Hint
for the last property: Take a cover {Bnk }∞ for An such that µ∗ (An ) =
                                            k=1
 ε      ∞                               ∞
2n +    k=1 µ(Bnk ) and note that {Bnk }n,k=1 is a cover for n An .)

Problem A.3. Show that µ∗ defined in (A.5) extends µ. (Hint: For the
cover An it is no restriction to assume An ∩ Am = ∅ and An ⊆ A.)
Problem A.4. Show that the measure constructed in Theorem A.7 is com-
plete.

A.3. Measurable functions
The Riemann integral works by splitting the x coordinate into small intervals
and approximating f (x) on each interval by its minimum and maximum.
The problem with this approach is that the difference between maximum
and minimum will only tend to zero (as the intervals get smaller) if f (x) is
sufficiently nice. To avoid this problem, we can force the difference to go to
zero by considering, instead of an interval, the set of x for which f (x) lies
A.3. Measurable functions                                                     269


between two given numbers a  b. Now we need the size of the set of these
x, that is, the size of the preimage f −1 ((a, b)). For this to work, preimages
of intervals must be measurable.
    A function f : X → Rn is called measurable if f −1 (A) ∈ Σ for every
A ∈ Bn . A complex-valued function is called measurable if both its real and
imaginary parts are. Clearly it suffices to check this condition for every set A
in a collection of sets which generate Bn , since the collection of sets for which
it holds forms a σ-algebra by f −1 (Rn A) = Xf −1 (A) and f −1 ( j Aj ) =
     −1 (A ).
  jf       j

Lemma A.9. A function f : X → Rn is measurable if and only if
                                                n
                       f −1 (I) ∈ Σ     ∀I =        (aj , ∞).               (A.8)
                                               j=1
In particular, a function f : X → Rn is measurable if and only if every
component is measurable.

Proof. We need to show that B is generated by rectangles of the above
form. The σ-algebra generated by these rectangles also contains all open
rectangles of the form I = n (aj , bj ). Moreover, given any open set O,
                             j=1
we can cover it by such open rectangles satisfying I ⊆ O. By Lindel¨f’s
                                                                     o
theorem there is a countable subcover and hence every open set can be
written as a countable union of open rectangles.

    Clearly the intervals (aj , ∞) can also be replaced by [aj , ∞), (−∞, aj ),
or (−∞, aj ].
    If X is a topological space and Σ the corresponding Borel σ-algebra,
we will also call a measurable function a Borel function. Note that, in
particular,
Lemma A.10. Let X be a topological space and Σ its Borel σ-algebra. Any
continuous function is Borel. Moreover, if f : X → Rn and g : Y ⊆ Rn →
Rm are Borel functions, then the composition g ◦ f is again Borel.
    Sometimes it is also convenient to allow ±∞ as possible values for f ,
that is, functions f : X → R, R = R ∪ {−∞, ∞}. In this case A ⊆ R is
called Borel if A ∩ R is.
    The set of all measurable functions forms an algebra.
Lemma A.11. Let X be a topological space and Σ its Borel σ-algebra.
Suppose f, g : X → R are measurable functions. Then the sum f + g and
the product f g are measurable.

Proof. Note that addition and multiplication are continuous functions from
R2 → R and hence the claim follows from the previous lemma.
270                                  A. Almost everything about Lebesgue integration


    Moreover, the set of all measurable functions is closed under all impor-
tant limiting operations.
Lemma A.12. Suppose fn : X → R is a sequence of measurable functions.
Then
             inf fn , sup fn , lim inf fn , lim sup fn         (A.9)
                        n∈N              n∈N             n→∞         n→∞
are measurable as well.

Proof. It suffices to prove that sup fn is measurable since the rest follows
from inf fn = − sup(−fn ), lim inf fn = supk inf n≥k fn , and lim sup fn =
inf k supn≥k fn . But (sup fn )−1 ((a, ∞)) = n fn ((a, ∞)) and we are done.
                                                −1




    A few immediate consequences are worthwhile noting: It follows that
if f and g are measurable functions, so are min(f, g), max(f, g), |f | =
max(f, −f ), and f ± = max(±f, 0). Furthermore, the pointwise limit of
measurable functions is again measurable.

A.4. The Lebesgue integral
Now we can define the integral for measurable functions as follows. A mea-
surable function s : X → R is called simple if its range is finite; that is,
if
                                 p
                          s=         α j χA j ,           Aj = s−1 (αj ) ∈ Σ.            (A.10)
                               j=1
Here χA is the characteristic function of A; that is, χA (x) = 1 if x ∈ A
and χA (x) = 0 otherwise.
      For a nonnegative simple function we define its integral as
                                                     p
                                         s dµ =           αj µ(Aj ∩ A).                  (A.11)
                                     A              j=1

Here we use the convention 0 · ∞ = 0.
Lemma A.13. The integral has the following properties:
        (i)    A s dµ = X χA s dµ.
       (ii)    S∞      s dµ = ∞ Aj
                               j=1                  s dµ,       Aj ∩ Ak = ∅ for j = k.
                j=1 Aj

       (iii)   A α s dµ   =α     A s dµ,        α ≥ 0.
       (iv)    A (s   + t)dµ =       A s dµ     +    A t dµ.
       (v) A ⊆ B ⇒            A s dµ      ≤     B   s dµ.
       (vi) s ≤ t ⇒        A s dµ        ≤     A t dµ.
A.4. The Lebesgue integral                                                                         271


Proof. (i) is clear from the definition. (ii) follows from σ-additivity of µ.
(iii) is obvious. (iv) Let s =    j α j χA j , t =  j βj χBj and abbreviate
Cjk = (Aj ∩ Bk ) ∩ A. Then, by (ii),

           (s + t)dµ =               (s + t)dµ =               (αj + βk )µ(Cjk )
          A              j,k    Cjk                      j,k


                     =                 s dµ +         t dµ        =         s dµ +       t dµ.
                         j,k       Cjk              Cjk                 A            A

(v) follows from monotonicity of µ. (vi) follows since by (iv) we can write
s = j αj χCj , t = j βj χCj where, by assumption, αj ≤ βj .

     Our next task is to extend this definition to arbitrary positive functions
by
                                      f dµ = sup             s dµ,                               (A.12)
                                 A             s≤f       A
where the supremum is taken over all simple functions s ≤ f . Note that,
except for possibly (ii) and (iv), Lemma A.13 still holds for this extension.
Theorem A.14 (Monotone convergence). Let fn be a monotone nondecreas-
ing sequence of nonnegative measurable functions, fn f . Then

                                       fn dµ →            f dµ.                                  (A.13)
                                   A                 A

Proof. By property (vi), A fn dµ is monotone and converges to some num-
ber α. By fn ≤ f and again (vi) we have

                                       α≤          f dµ.
                                               A
To show the converse, let s be simple such that s ≤ f and let θ ∈ (0, 1). Put
An = {x ∈ A|fn (x) ≥ θs(x)} and note An       A (show this). Then

                             fn dµ ≥          fn dµ ≥ θ               s dµ.
                         A               An                      An
Letting n → ∞, we see
                                       α≥θ          s dµ.
                                                A
Since this is valid for any θ  1, it still holds for θ = 1. Finally, since s ≤ f
is arbitrary, the claim follows.

     In particular
                                   f dµ = lim                sn dµ,                              (A.14)
                               A              n→∞ A
272                                  A. Almost everything about Lebesgue integration


for any monotone sequence sn     f of simple functions. Note that there is
always such a sequence, for example,
                    2n
                         k                                k k+1
      sn (x) =             χ −1     (x),         Ak = [     ,   ), A2n = [n, ∞).                  (A.15)
                         2n f (Ak )                       2n 2n
                 k=0

By construction sn converges uniformly if f is bounded, since sn (x) = n if
                               1
f (x) = ∞ and f (x) − sn (x)  n if f (x)  n + 1.
    Now what about the missing items (ii) and (iv) from Lemma A.13? Since
limits can be spread over sums, the extension is linear (i.e., item (iv) holds)
and (ii) also follows directly from the monotone convergence theorem. We
even have the following result:
Lemma A.15. If f ≥ 0 is measurable, then dν = f dµ defined via

                                            ν(A) =           f dµ                                 (A.16)
                                                         A
is a measure such that
                                               g dν =        gf dµ.                               (A.17)

Proof. As already mentioned, additivity of µ is equivalent to linearity of the
integral and σ-additivity follows from the monotone convergence theorem:
                ∞                   ∞                     ∞                        ∞
          ν(         An ) =     (         χAn )f dµ =               χAn f dµ =          ν(An ).
             n=1                    n=1                  n=1                      n=1
The second claim holds for simple functions and hence for all functions by
construction of the integral.

      If fn is not necessarily monotone, we have at least
Theorem A.16 (Fatou’s lemma). If fn is a sequence of nonnegative mea-
surable function, then

                                    lim inf fn dµ ≤ lim inf              fn dµ.                   (A.18)
                               A n→∞                      n→∞        A

Proof. Set gn = inf k≥n fk . Then gn ≤ fn implying

                                               gn dµ ≤        fn dµ.
                                           A              A
Now take the lim inf on both sides and note that by the monotone conver-
gence theorem

      lim inf        gn dµ = lim               gn dµ =        lim gn dµ =          lim inf fn dµ,
       n→∞       A             n→∞ A                      A n→∞                   A n→∞
proving the claim.
A.4. The Lebesgue integral                                                                              273


    If the integral is finite for both the positive and negative part f ± of an
arbitrary measurable function f , we call f integrable and set

                                    f dµ =            f + dµ −            f − dµ.                   (A.19)
                                A                 A                   A
The set of all integrable functions is denoted by L1 (X, dµ).
Lemma A.17. Lemma A.13 holds for integrable functions s, t.

    Similarly, we handle the case where f is complex-valued by calling f
integrable if both the real and imaginary part are and setting

                              f dµ =            Re(f )dµ + i              Im(f )dµ.                 (A.20)
                          A                 A                         A
Clearly f is integrable if and only if |f | is.
Lemma A.18. For any integrable functions f , g we have

                                      |       f dµ| ≤           |f | dµ                             (A.21)
                                          A                 A
and (triangle inequality)

                                |f + g| dµ ≤              |f | dµ +            |g| dµ.              (A.22)
                            A                         A                    A

                    z∗
Proof. Put α =      |z| ,   where z =           Af    dµ (without restriction z = 0). Then

      |       f dµ| = α         f dµ =            α f dµ =            Re(α f ) dµ ≤          |f | dµ,
          A                 A                 A                   A                      A
proving the first claim. The second follows from |f + g| ≤ |f | + |g|.

    In addition, our integral is well behaved with respect to limiting opera-
tions.
Theorem A.19 (Dominated convergence). Let fn be a convergent sequence
of measurable functions and set f = limn→∞ fn . Suppose there is an inte-
grable function g such that |fn | ≤ g. Then f is integrable and

                                      lim         fn dµ =         f dµ.                             (A.23)
                                     n→∞

Proof. The real and imaginary parts satisfy the same assumptions and so
do the positive and negative parts. Hence it suffices to prove the case where
fn and f are nonnegative.
    By Fatou’s lemma

                                    lim inf         fn dµ ≥           f dµ
                                    n→∞         A                 A
274                         A. Almost everything about Lebesgue integration


and
                     lim inf        (g − fn )dµ ≥          (g − f )dµ.
                      n→∞       A                      A
Subtracting A g dµ on both sides of the last inequality finishes the proof
since lim inf(−fn ) = − lim sup fn .

    Remark: Since sets of measure zero do not contribute to the value of the
integral, it clearly suffices if the requirements of the dominated convergence
theorem are satisfied almost everywhere (with respect to µ).
                                                                    1
   Note that the existence of g is crucial, as the example fn (x) = n χ[−n,n] (x)
on R with Lebesgue measure shows.
Example. If µ(x) =          n αn Θ(x     − xn ) is a sum of Dirac measures, Θ(x)
centered at x = 0, then

                               f (x)dµ(x) =           αn f (xn ).          (A.24)
                                                  n

Hence our integral contains sums as special cases.

Problem A.5. Show that the set B(X) of bounded measurable functions
with the sup norm is a Banach space. Show that the set S(X) of simple
functions is dense in B(X). Show that the integral is a bounded linear func-
tional on B(X). (Hence Theorem 0.26 could be used to extend the integral
from simple to bounded measurable functions.)

Problem A.6. Show that the dominated convergence theorem implies (un-
der the same assumptions)

                                lim      |fn − f |dµ = 0.
                               n→∞

Problem A.7. Let X ⊆ R, Y be some measure space, and f : X × Y → R.
Suppose y → f (x, y) is measurable for every x and x → f (x, y) is continuous
for every y. Show that

                               F (x) =        f (x, y) dµ(y)               (A.25)
                                          A

is continuous if there is an integrable function g(y) such that |f (x, y)| ≤ g(y).

Problem A.8. Let X ⊆ R, Y be some measure space, and f : X × Y → R.
Suppose y → f (x, y) is measurable for all x and x → f (x, y) is differentiable
for a.e. y. Show that
                               F (x) =        f (x, y) dµ(y)               (A.26)
                                          A
A.5. Product measures                                                               275


                                                                     ∂
is differentiable if there is an integrable function g(y) such that | ∂x f (x, y)| ≤
                         ∂
g(y). Moreover, x → ∂x f (x, y) is measurable and
                                           ∂
                            F (x) =           f (x, y) dµ(y)                      (A.27)
                                       A   ∂x
in this case.

A.5. Product measures
Let µ1 and µ2 be two measures on Σ1 and Σ2 , respectively. Let Σ1 ⊗ Σ2 be
the σ-algebra generated by rectangles of the form A1 × A2 .
Example. Let B be the Borel sets in R. Then B2 = B ⊗ B are the Borel
sets in R2 (since the rectangles are a basis for the product topology).

    Any set in Σ1 ⊗ Σ2 has the section property; that is,
Lemma A.20. Suppose A ∈ Σ1 ⊗ Σ2 . Then its sections
   A1 (x2 ) = {x1 |(x1 , x2 ) ∈ A}    and       A2 (x1 ) = {x2 |(x1 , x2 ) ∈ A}   (A.28)
are measurable.

Proof. Denote all sets A ∈ Σ1 ⊗ Σ2 with the property that A1 (x2 ) ∈ Σ1 by
S. Clearly all rectangles are in S and it suffices to show that S is a σ-algebra.
Now, if A ∈ S, then (A )1 (x2 ) = (A1 (x2 )) ∈ Σ2 and thus S is closed under
complements. Similarly, if An ∈ S, then ( n An )1 (x2 ) = n (An )1 (x2 ) shows
that S is closed under countable unions.

    This implies that if f is a measurable function on X1 ×X2 , then f (., x2 ) is
measurable on X1 for every x2 and f (x1 , .) is measurable on X2 for every x1
(observe A1 (x2 ) = {x1 |f (x1 , x2 ) ∈ B}, where A = {(x1 , x2 )|f (x1 , x2 ) ∈ B}).
    Given two measures µ1 on Σ1 and µ2 on Σ2 , we now want to construct
the product measure µ1 ⊗ µ2 on Σ1 ⊗ Σ2 such that
       µ1 ⊗ µ2 (A1 × A2 ) = µ1 (A1 )µ2 (A2 ),          Aj ∈ Σj , j = 1, 2.        (A.29)
Theorem A.21. Let µ1 and µ2 be two σ-finite measures on Σ1 and Σ2 ,
respectively. Let A ∈ Σ1 ⊗ Σ2 . Then µ2 (A2 (x1 )) and µ1 (A1 (x2 )) are mea-
surable and
                     µ2 (A2 (x1 ))dµ1 (x1 ) =         µ1 (A1 (x2 ))dµ2 (x2 ).     (A.30)
                X1                               X2

Proof. Let S be the set of all subsets for which our claim holds. Note
that S contains at least all rectangles. It even contains the algebra of finite
disjoint unions of rectangles. Thus it suffices to show that S is a monotone
class. If µ1 and µ2 are finite, measurability and equality of both integrals
follow from the monotone convergence theorem for increasing sequences of
276                             A. Almost everything about Lebesgue integration


sets and from the dominated convergence theorem for decreasing sequences
of sets.
   If µ1 and µ2 are σ-finite, let Xi,j       Xi with µi (Xi,j )  ∞ for i = 1, 2.
Now µ2 ((A ∩ X1,j × X2,j )2 (x1 )) = µ2 (A2 (x1 ) ∩ X2,j )χX1,j (x1 ) and similarly
with 1 and 2 exchanged. Hence by the finite case

               µ2 (A2 ∩ X2,j )χX1,j dµ1 =            µ1 (A1 ∩ X1,j )χX2,j dµ2          (A.31)
          X1                                    X2

and the σ-finite case follows from the monotone convergence theorem.

      Hence we can define

   µ1 ⊗ µ2 (A) =            µ2 (A2 (x1 ))dµ1 (x1 ) =          µ1 (A1 (x2 ))dµ2 (x2 )   (A.32)
                       X1                                X2

or equivalently, since χA1 (x2 ) (x1 ) = χA2 (x1 ) (x2 ) = χA (x1 , x2 ),

          µ1 ⊗ µ2 (A) =                   χA (x1 , x2 )dµ2 (x2 ) dµ1 (x1 )
                              X1     X2

                        =                 χA (x1 , x2 )dµ1 (x1 ) dµ2 (x2 ).            (A.33)
                              X2     X1

Additivity of µ1 ⊗ µ2 follows from the monotone convergence theorem.
    Note that (A.29) uniquely defines µ1 ⊗ µ2 as a σ-finite premeasure on
the algebra of finite disjoint unions of rectangles. Hence by Theorem A.5 it
is the only measure on Σ1 ⊗ Σ2 satisfying (A.29).
      Finally we have
Theorem A.22 (Fubini). Let f be a measurable function on X1 × X2 and
let µ1 , µ2 be σ-finite measures on X1 , X2 , respectively.
        (i) If f ≥ 0, then          f (., x2 )dµ2 (x2 ) and       f (x1 , .)dµ1 (x1 ) are both
            measurable and

           f (x1 , x2 )dµ1 ⊗ µ2 (x1 , x2 ) =             f (x1 , x2 )dµ1 (x1 ) dµ2 (x2 )

            =           f (x1 , x2 )dµ2 (x2 ) dµ1 (x1 ).                               (A.34)

       (ii) If f is complex, then

                             |f (x1 , x2 )|dµ1 (x1 ) ∈ L1 (X2 , dµ2 ),                 (A.35)

           respectively,

                             |f (x1 , x2 )|dµ2 (x2 ) ∈ L1 (X1 , dµ1 ),                 (A.36)
A.5. Product measures                                                      277


          if and only if f ∈ L1 (X1 × X2 , dµ1 ⊗ dµ2 ). In this case (A.34)
          holds.

Proof. By Theorem A.21 and linearity the claim holds for simple functions.
To see (i), let sn  f be a sequence of nonnegative simple functions. Then it
follows by applying the monotone convergence theorem (twice for the double
integrals).
   For (ii) we can assume that f is real-valued by considering its real and
imaginary parts separately. Moreover, splitting f = f + −f − into its positive
and negative parts, the claim reduces to (i).

   In particular, if f (x1 , x2 ) is either nonnegative or integrable, then the
order of integration can be interchanged.
Lemma A.23. If µ1 and µ2 are σ-finite regular Borel measures, then so is
µ 1 ⊗ µ2 .

Proof. Regularity holds for every rectangle and hence also for the algebra of
finite disjoint unions of rectangles. Thus the claim follows from Lemma A.6.


   Note that we can iterate this procedure.
Lemma A.24. Suppose µj , j = 1, 2, 3, are σ-finite measures. Then
                      (µ1 ⊗ µ2 ) ⊗ µ3 = µ1 ⊗ (µ2 ⊗ µ3 ).                 (A.37)

Proof. First of all note that (Σ1 ⊗ Σ2 ) ⊗ Σ3 = Σ1 ⊗ (Σ2 ⊗ Σ3 ) is the sigma
algebra generated by the rectangles A1 ×A2 ×A3 in X1 ×X2 ×X3 . Moreover,
since
          ((µ1 ⊗ µ2 ) ⊗ µ3 )(A1 × A2 × A3 ) = µ1 (A1 )µ2 (A2 )µ3 (A3 )
               = (µ1 ⊗ (µ2 ⊗ µ3 ))(A1 × A2 × A3 ),
the two measures coincide on the algebra of finite disjoint unions of rectan-
gles. Hence they coincide everywhere by Theorem A.5.

Example. If λ is Lebesgue measure on R, then λn = λ ⊗ · · · ⊗ λ is Lebesgue
measure on Rn . Since λ is regular, so is λn .

Problem A.9. Show that the set of all finite union of rectangles A1 × A2
forms an algebra.
Problem A.10. Let U ⊆ C be a domain, Y be some measure space, and
f : U × Y → R. Suppose y → f (z, y) is measurable for every z and z →
f (z, y) is holomorphic for every y. Show that

                           F (z) =       f (z, y) dµ(y)                  (A.38)
                                     A
278                         A. Almost everything about Lebesgue integration


is holomorphic if for every compact subset V ⊂ U there is an integrable
function g(y) such that |f (z, y)| ≤ g(y), z ∈ V . (Hint: Use Fubini and
Morera.)

A.6. Vague convergence of measures
Let µn be a sequence of Borel measures, we will say that µn converges to µ
vaguely if
                                     f dµn →       f dµ                       (A.39)
                                 X             X
for every f ∈ Cc (X).
   We are only interested in the case of Borel measures on R. In this case
we have the following equivalent characterization of vague convergence.
Lemma A.25. Let µn be a sequence of Borel measures on R. Then µn → µ
vaguely if and only if the (normalized) distribution functions converge at
every point of continuity of µ.

Proof. Suppose µn → µ vaguely. Let I be any bounded interval (closed, half
closed, or open) with boundary points x0 , x1 . Moreover, choose continuous
functions f, g with compact support such that f ≤ χI ≤ g. Then we have
  f dµ ≤ µ(I) ≤ gdµ and similarly for µn . Hence

   µ(I) − µn (I) ≤      gdµ −    f dµn ≤       (g − f )dµ +        f dµ −   f dµn

and
  µ(I) − µn (I) ≥       f dµ −   gdµn ≥        (f − g)dµ −         gdµ −    gdµn .

Combining both estimates, we see

 |µ(I) − µn (I)| ≤      (g − f )dµ +     f dµ −      f dµn +        gdµ −    gdµn

and so
                     lim sup |µ(I) − µn (I)| ≤       (g − f )dµ.
                      n→∞
Choosing f , g such that g − f → χ{x0 } + χ{x1 } pointwise, we even get from
dominated convergence that
                lim sup |µ(I) − µn (I)| ≤ µ({x0 }) + µ({x1 }),
                 n→∞
which proves that the distribution functions converge at every point of con-
tinuity of µ.
    Conversely, suppose that the distribution functions converge at every
point of continuity of µ. To see that in fact µn → µ vaguely, let f ∈ Cc (R).
Fix some ε  0 and note that, since f is uniformly continuous, there is a
A.6. Vague convergence of measures                                                       279


δ  0 such that |f (x) − f (y)| ≤ ε whenever |x − y| ≤ δ. Next, choose some
points x0  x1  · · ·  xk such that supp(f ) ⊂ (x0 , xk ), µ is continuous at
xj , and xj −xj−1 ≤ δ (recall that a monotone function has at most countable
discontinuities). Furthermore, there is some N such that |µn (xj ) − µ(xj )| ≤
 ε
2k for all j and n ≥ N . Then
                              k
         f dµn −    f dµ ≤                           |f (x) − f (xj )|dµn (x)
                             j=1       (xj−1 ,xj ]

                                   k
                             +          |f (xj )||µ((xj−1 , xj ]) − µn ((xj−1 , xj ])|
                                  j=1
                                   k
                             +                          |f (x) − f (xj )|dµ(x).
                                  j=1     (xj−1 ,xj ]

Now, for n ≥ N , the first and the last term on the right-hand side are both
                            ε
bounded by (µ((x0 , xk ]) + k )ε and the middle term is bounded by max |f |ε.
Thus the claim follows.

   Moreover, every bounded sequence of measures has a vaguely convergent
subsequence.
Lemma A.26. Suppose µn is a sequence of finite Borel measures on R such
that µn (R) ≤ M . Then there exists a subsequence which converges vaguely
to some measure µ with µ(R) ≤ M .

Proof. Let µn (x) = µn ((−∞, x]) be the corresponding distribution func-
tions. By 0 ≤ µn (x) ≤ M there is a convergent subsequence for fixed x.
Moreover, by the standard diagonal series trick, we can assume that µn (x)
converges to some number µ(x) for each rational x. For irrational x we set
µ(x) = inf x0 x {µ(x0 )|x0 rational}. Then µ(x) is monotone, 0 ≤ µ(x1 ) ≤
µ(x2 ) ≤ M for x1 ≤ x2 . Furthermore,
               µ(x−) ≤ lim inf µn (x) ≤ lim sup µn (x) ≤ µ(x)
shows that µn (x) → µ(x) at every point of continuity of µ. So we can
redefine µ to be right continuous without changing this last fact.

    In the case where the sequence is bounded, (A.39) even holds for a larger
class of functions.
Lemma A.27. Suppose µn → µ vaguely and µn (R) ≤ M . Then (A.39)
holds for any f ∈ C∞ (R).

Proof. Split f = f1 + f2 , where f1 has compact support and |f2 | ≤ ε. Then
| f dµ − f dµn | ≤ | f1 dµ − f1 dµn | + 2εM and the claim follows.
280                         A. Almost everything about Lebesgue integration


Example. The example dµn (λ) = dΘ(λ − n) shows that in the above claim
f cannot be replaced by a bounded continuous function. Moreover, the
example dµn (λ) = n dΘ(λ − n) also shows that the uniform bound cannot
be dropped.


A.7. Decomposition of measures
Let µ, ν be two measures on a measure space (X, Σ). They are called
mutually singular (in symbols µ ⊥ ν) if they are supported on disjoint
sets. That is, there is a measurable set N such that µ(N ) = 0 and ν(XN ) =
0.
Example. Let λ be the Lebesgue measure and Θ the Dirac measure (cen-
tered at 0). Then λ ⊥ Θ: Just take N = {0}; then λ({0}) = 0 and
Θ(R{0}) = 0.

    On the other hand, ν is called absolutely continuous with respect to
µ (in symbols ν   µ) if µ(A) = 0 implies ν(A) = 0.
Example. The prototypical example is the measure dν = f dµ (compare
Lemma A.15). Indeed µ(A) = 0 implies

                              ν(A) =           f dµ = 0                         (A.40)
                                           A
and shows that ν is absolutely continuous with respect to µ. In fact, we will
show below that every absolutely continuous measure is of this form.

    The two main results will follow as simple consequences of the following
result:
Theorem A.28. Let µ, ν be σ-finite measures. Then there exists a unique
(a.e.) nonnegative function f and a set N of µ measure zero, such that

                        ν(A) = ν(A ∩ N ) + ∫_A f dµ.                            (A.41)

Proof. We first assume µ, ν to be finite measures. Let α = µ + ν and
consider the Hilbert space L2 (X, dα). Then

                                  ℓ(h) = ∫_X h dν

is a bounded linear functional by Cauchy–Schwarz:

        |ℓ(h)|² = |∫_X 1 · h dν|² ≤ ∫_X |1|² dν · ∫_X |h|² dν ≤ ν(X) ∫_X |h|² dα = ν(X) ‖h‖².


Hence by the Riesz lemma (Theorem 1.8) there exists a g ∈ L²(X, dα) such
that
                                  ℓ(h) = ∫_X h g dα.
By construction

        ν(A) = ∫ χA dν = ∫ χA g dα = ∫_A g dα.                                  (A.42)
In particular, g must be positive a.e. (take A the set where g is negative).
Furthermore, let N = {x | g(x) ≥ 1}. Then

        ν(N ) = ∫_N g dα ≥ α(N ) = µ(N ) + ν(N ),

which shows µ(N ) = 0. Now set

        f = g/(1 − g) · χ_{N′},         N′ = X \ N.

Then, since (A.42) implies dν = g dα, respectively, dµ = (1 − g) dα, we have

        ∫_A f dµ = ∫ χA g/(1 − g) χ_{N′} dµ = ∫ χ_{A∩N′} g dα = ν(A ∩ N′)
as desired. Clearly f is unique, since if there is a second function f̃, then
∫_A (f − f̃) dµ = 0 for every A shows f − f̃ = 0 a.e.
    To see the σ-finite case, observe that Xn ↗ X, µ(Xn ) < ∞ and Yn ↗ X,
ν(Yn ) < ∞ implies Xn ∩ Yn ↗ X and α(Xn ∩ Yn ) < ∞. Hence when
restricted to Xn ∩ Yn , we have sets Nn and functions fn . Now take N = ⋃ Nn
and choose f such that f |Xn = fn (this is possible since fn+1 |Xn = fn a.e.).
Then µ(N ) = 0 and

        ν(A ∩ N′) = lim_{n→∞} ν(A ∩ (Xn \ N )) = lim_{n→∞} ∫_{A∩Xn} f dµ = ∫_A f dµ,

which finishes the proof.

   Now the anticipated results follow with no effort:
Theorem A.29 (Lebesgue decomposition). Let µ, ν be two σ-finite mea-
sures on a measure space (X, Σ). Then ν can be uniquely decomposed as
ν = νac + νsing , where µ and νsing are mutually singular and νac is abso-
lutely continuous with respect to µ.

Proof. Taking νsing (A) = ν(A ∩ N ) and dνac = f dµ, there is at least
one such decomposition. To show uniqueness, first let ν be finite. If there
is another one, ν = ν̃ac + ν̃sing , then let Ñ be such that µ(Ñ) = 0 and
ν̃sing (X \ Ñ) = 0. Then ν̃sing (A) − νsing (A) = ∫_A (f − f̃) dµ. In particular,
∫_{A∩N′∩Ñ′} (f − f̃) dµ = 0 and hence f = f̃ a.e. away from N ∪ Ñ. Since
µ(N ∪ Ñ) = 0, we have f = f̃ a.e. and hence ν̃ac = νac as well as ν̃sing =
ν − ν̃ac = ν − νac = νsing . The σ-finite case follows as usual.

Theorem A.30 (Radon–Nikodym). Let µ, ν be two σ-finite measures on a
measure space (X, Σ). Then ν is absolutely continuous with respect to µ if
and only if there is a positive measurable function f such that

                              ν(A) =        f dµ                          (A.43)
                                        A
for every A ∈ Σ. The function f is determined uniquely a.e. with respect to
                                                   dν
µ and is called the Radon–Nikodym derivative dµ of ν with respect to
µ.

Proof. Just observe that in this case ν(A ∩ N ) = 0 for every A; that is,
νsing = 0.
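
    For purely atomic measures on a finite set the Radon–Nikodym derivative
reduces to a ratio of point weights, which makes Theorem A.30 easy to illustrate.
A small Python sketch (ours, not from the text; the finite setting is of course
only a toy instance of the theorem):

        # Measures on a finite set, given as dicts point -> weight. If nu << mu,
        # the Radon-Nikodym derivative is the pointwise ratio of the weights.
        def radon_nikodym(mu, nu):
            if any(nu.get(x, 0) > 0 and mu.get(x, 0) == 0 for x in nu):
                raise ValueError("nu is not absolutely continuous with respect to mu")
            return {x: nu.get(x, 0) / mu[x] if mu[x] > 0 else 0.0 for x in mu}

        mu = {0: 0.5, 1: 0.25, 2: 0.25}
        nu = {0: 0.1, 1: 0.6}                 # nu({2}) = 0, so nu << mu
        f = radon_nikodym(mu, nu)
        # check nu(A) = sum over A of f dmu for A = {0, 1}
        print(f, sum(f[x] * mu[x] for x in (0, 1)))   # {0: 0.2, 1: 2.4, 2: 0.0} and 0.7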
Problem A.11. Let µ be a Borel measure on B and suppose its distribution
function µ(x) is differentiable. Show that the Radon–Nikodym derivative
equals the ordinary derivative µ′(x).
Problem A.12. Suppose µ and ν are inner regular measures. Show that
ν ≪ µ if and only if µ(C) = 0 implies ν(C) = 0 for every compact set.
Problem A.13. Let dν = f dµ. Suppose f > 0 a.e. with respect to µ. Then
µ ≪ ν and dµ = f −1 dν.
Problem A.14 (Chain rule). Show that ν ≪ µ is a transitive relation. In
particular, if ω ≪ ν ≪ µ, show that

                           dω/dµ = (dω/dν)(dν/dµ).

Problem A.15. Suppose ν ≪ µ. Show that for any measure ω we have

                           (dω/dµ) dµ = (dω/dν) dν + dζ,

where ζ is a positive measure (depending on ω) which is singular with respect
to ν. Show that ζ = 0 if and only if µ ≪ ν.

A.8. Derivatives of measures
If µ is a Borel measure on B and its distribution function µ(x) is differen-
tiable, then the Radon–Nikodym derivative is just the ordinary derivative
µ′(x) (Problem A.11). Our aim in this section is to generalize this result to
arbitrary regular Borel measures on Bn .
      We call
                           (Dµ)(x) = lim_{ε↓0} µ(Bε (x)) / |Bε (x)|             (A.44)


the derivative of µ at x ∈ Rn provided the above limit exists. (Here Br (x) ⊂
Rn is a ball of radius r centered at x ∈ Rn and |A| denotes the Lebesgue
measure of A ∈ Bn .)
    Note that for a Borel measure on B, (Dµ)(x) exists if and only if µ(x)
(as defined in (A.3)) is differentiable at x and (Dµ)(x) = µ′(x) in this case.
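
    In dimension one the limit (A.44) can be checked numerically for a measure
with a continuous density: the ratio µ(Bε (x))/|Bε (x)| approaches the density at
x as ε ↓ 0, in accordance with Problem A.11. A short Python sketch (ours; the
quadrature is a plain Riemann sum and only meant as an illustration):

        import math

        def mu_of_interval(f, a, b, steps=10000):
            # Riemann sum approximation of mu((a, b)) for d mu = f(x) dx
            h = (b - a) / steps
            return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h

        f = lambda x: math.exp(-x * x)        # density of mu
        x = 0.7
        for eps in (1.0, 0.1, 0.01, 0.001):
            ratio = mu_of_interval(f, x - eps, x + eps) / (2 * eps)
            print(eps, ratio, f(x))           # the ratio approaches f(0.7)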
   To compute the derivative of µ, we introduce the upper and lower
derivative,
   (D̄µ)(x) = lim sup_{ε↓0} µ(Bε (x))/|Bε (x)|   and   (D̲µ)(x) = lim inf_{ε↓0} µ(Bε (x))/|Bε (x)|.   (A.45)

Clearly µ is differentiable if (D̄µ)(x) = (D̲µ)(x) < ∞. First of all note that
they are measurable:
Lemma A.31. The upper derivative is lower semicontinuous; that is, the
set {x | (D̄µ)(x) > α} is open for every α ∈ R. Similarly, the lower derivative
is upper semicontinuous; that is, {x | (D̲µ)(x) < α} is open.

Proof. We only prove the claim for D̄µ, the case D̲µ being similar. Abbre-
viate

                        Mr (x) = sup_{0<ε<r} µ(Bε (x))/|Bε (x)|

and note that it suffices to show that Or = {x | Mr (x) > α} is open.
   If x ∈ Or , there is some ε < r such that

                        µ(Bε (x))/|Bε (x)| > α.

Let δ > 0 and y ∈ Bδ (x). Then Bε (x) ⊆ Bε+δ (y), implying

                µ(Bε+δ (y))/|Bε+δ (y)| ≥ (ε/(ε + δ))ⁿ · µ(Bε (x))/|Bε (x)| > α

for δ sufficiently small. That is, Bδ (x) ⊆ Or .

   In particular, both the upper and lower derivatives are measurable.
Next, the following geometric fact of Rn will be needed.
Lemma A.32. Given open balls B1 , . . . , Bm in Rn , there is a subset of
disjoint balls Bj1 , . . . , Bjk such that
                        |⋃_{i=1}^m Bi | ≤ 3ⁿ Σ_{i=1}^k |Bji |.                  (A.46)

Proof. Assume that the balls Bj are ordered by decreasing radius. Start with
Bj1 = B1 = Br1 (x1 ) and remove all balls from our list which intersect Bj1 .
Observe that the removed balls are all contained in 3B1 = B3r1 (x1 ). Proceeding
like this, we obtain Bj1 , . . . , Bjk such that

                        ⋃_{i=1}^m Bi ⊆ ⋃_{i=1}^k B_{3rji}(xji )

and the claim follows since |3B| = 3ⁿ |B|.
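
    The selection procedure in the proof is a simple greedy algorithm and can be
written down directly. Here is a Python sketch for intervals in R, i.e. n = 1
(ours, purely illustrative): balls are (center, radius) pairs, we always keep the
largest remaining ball and discard everything it intersects; each discarded ball
then lies inside the kept ball scaled by 3, which is where the factor 3ⁿ in
(A.46) comes from.

        def select_disjoint(balls):
            # balls: list of (center, radius) pairs in R
            remaining = sorted(balls, key=lambda b: -b[1])    # largest radius first
            chosen = []
            while remaining:
                c, r = remaining.pop(0)
                chosen.append((c, r))
                # keep only the balls that do not intersect the chosen one
                remaining = [(c2, r2) for (c2, r2) in remaining if abs(c2 - c) > r + r2]
            return chosen

        balls = [(0.0, 1.0), (0.5, 0.4), (2.5, 0.7), (2.0, 0.2), (5.0, 0.1)]
        print(select_disjoint(balls))   # [(0.0, 1.0), (2.5, 0.7), (5.0, 0.1)]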

      Now we can show
Lemma A.33. Let α > 0. For any Borel set A we have

                |{x ∈ A | (D̄µ)(x) > α}| ≤ 3ⁿ µ(A)/α                            (A.47)

and

                |{x ∈ A | (D̄µ)(x) > 0}| = 0,   whenever µ(A) = 0.              (A.48)

Proof. Let Aα = {x ∈ A | (D̄µ)(x) > α}. We will show

                                 |K| ≤ 3ⁿ µ(O)/α

for any compact set K and open set O with K ⊆ Aα ⊆ O. The first claim
then follows from regularity of µ and the Lebesgue measure.
    Given fixed K, O, for every x ∈ K there is some rx such that Brx (x) ⊆ O
and |Brx (x)| < α−1 µ(Brx (x)). Since K is compact, we can choose a finite
subcover of K. Moreover, by Lemma A.32 we can refine our set of balls such
that

        |K| ≤ 3ⁿ Σ_{i=1}^k |Bri (xi )| ≤ (3ⁿ/α) Σ_{i=1}^k µ(Bri (xi )) ≤ 3ⁿ µ(O)/α.

    To see the second claim, observe that

        {x ∈ A | (D̄µ)(x) > 0} = ⋃_{j=1}^∞ {x ∈ A | (D̄µ)(x) > 1/j}

and by the first part |{x ∈ A | (D̄µ)(x) > 1/j}| = 0 for any j if µ(A) = 0.
                                         j

Theorem A.34 (Lebesgue). Let f be (locally) integrable. Then for a.e. x ∈
Rⁿ we have

        lim_{r↓0} (1/|Br (x)|) ∫_{Br (x)} |f (y) − f (x)| dy = 0.               (A.49)


Proof. Decompose f as f = g + h, where g is continuous and ‖h‖₁ < ε
(Theorem 0.34) and abbreviate

        Dr (f )(x) = (1/|Br (x)|) ∫_{Br (x)} |f (y) − f (x)| dy.


Then, since lim_{r↓0} Dr (g)(x) = 0 (for every x) and Dr (f ) ≤ Dr (g) + Dr (h), we
have

        lim sup_{r↓0} Dr (f )(x) ≤ lim sup_{r↓0} Dr (h)(x) ≤ (D̄µ)(x) + |h(x)|,

where dµ = |h|dx. This implies

   {x | lim sup_{r↓0} Dr (f )(x) ≥ 2α} ⊆ {x | (D̄µ)(x) ≥ α} ∪ {x | |h(x)| ≥ α}

and using the first part of Lemma A.33 plus |{x | |h(x)| ≥ α}| ≤ α−1 ‖h‖₁ ,
we see

        |{x | lim sup_{r↓0} Dr (f )(x) ≥ 2α}| ≤ (3ⁿ + 1) ε/α.

Since ε is arbitrary, the Lebesgue measure of this set must be zero for every
α. That is, the set where the lim sup is positive has Lebesgue measure
zero.

   The points where (A.49) holds are called Lebesgue points of f .
    Note that the balls can be replaced by more general sets: A sequence of
sets Aj (x) is said to shrink to x nicely if there are balls Brj (x) with rj → 0
and a constant ε > 0 such that Aj (x) ⊆ Brj (x) and |Aj (x)| ≥ ε|Brj (x)|. For
example, Aj (x) could be some balls or cubes (not necessarily containing x).
However, the portion of Brj (x) which they occupy must not go to zero! For
example, the rectangles (0, 1/j) × (0, 2/j) ⊂ R² do shrink nicely to 0, but the
rectangles (0, 1/j) × (0, 2/j²) do not.


Lemma A.35. Let f be (locally) integrable. Then at every Lebesgue point
we have
                 f (x) = lim_{j→∞} (1/|Aj (x)|) ∫_{Aj (x)} f (y) dy             (A.50)

whenever Aj (x) shrinks to x nicely.

Proof. Let x be a Lebesgue point and choose some nicely shrinking sets
Aj (x) with corresponding Brj (x) and ε. Then
      (1/|Aj (x)|) ∫_{Aj (x)} |f (y) − f (x)| dy ≤ (1/(ε|Brj (x)|)) ∫_{Brj (x)} |f (y) − f (x)| dy

and the claim follows.

Corollary A.36. Suppose µ is an absolutely continuous Borel measure on
R. Then its distribution function is differentiable a.e. and dµ(x) = µ′(x)dx.

   As another consequence we obtain


Theorem A.37. Let µ be a Borel measure on Rn . The derivative Dµ
exists a.e. with respect to Lebesgue measure and equals the Radon–Nikodym
derivative of the absolutely continuous part of µ with respect to Lebesgue
measure; that is,
                            µac (A) = ∫_A (Dµ)(x) dx.                          (A.51)

Proof. If dµ = f dx is absolutely continuous with respect to Lebesgue mea-
sure, the claim follows from Theorem A.34. To see the general case, use the
Lebesgue decomposition of µ and let N be a support for the singular part
with |N | = 0. Then (Dµsing )(x) = 0 for a.e. x ∈ Rⁿ \ N by the second part
of Lemma A.33.

    In particular, µ is singular with respect to Lebesgue measure if and only
if Dµ = 0 a.e. with respect to Lebesgue measure.
   Using the upper and lower derivatives, we can also give supports for the
absolutely and singularly continuous parts.
Theorem A.38. The set {x | (Dµ)(x) = ∞} is a support for the singular
and {x | 0 < (Dµ)(x) < ∞} is a support for the absolutely continuous part.

Proof. First suppose µ is purely singular. Let us show that the set Ok =
{x | (D̲µ)(x) < k} satisfies µ(Ok ) = 0 for every k ∈ N.
     Let K ⊂ Ok be compact, and let Vj ⊃ K be some open set such that
|Vj \ K| ≤ 1/j. For every x ∈ K there is some ε = ε(x) such that Bε (x) ⊆ Vj
and µ(Bε (x)) ≤ k|Bε (x)|. By compactness, finitely many of these balls cover
K and hence

        µ(K) ≤ Σ_i µ(Bεi (xi )) ≤ k Σ_i |Bεi (xi )|.

Selecting disjoint balls as in Lemma A.32 further shows

        µ(K) ≤ k 3ⁿ Σ_i |Bεi (xi )| ≤ k 3ⁿ |Vj |.

Letting j → ∞, we see µ(K) ≤ k3n |K| and by regularity we even have
µ(A) ≤ k3n |A| for every A ⊆ Ok . Hence µ is absolutely continuous on Ok
and since we assumed µ to be singular, we must have µ(Ok ) = 0.
      Thus (Dµsing )(x) = ∞ for a.e. x with respect to µsing and we are done.


     Finally, we note that these supports are minimal. Here a support M of
some measure µ is called a minimal support (it is sometimes also called
an essential support) if any subset M0 ⊆ M which does not support µ
(i.e., µ(M0 ) = 0) has Lebesgue measure zero.


Lemma A.39. The set Mac = {x | 0 < (Dµ)(x) < ∞} is a minimal support
for µac .

Proof. Suppose M0 ⊆ Mac and µac (M0 ) = 0. Set Mε = {x ∈ M0 | ε <
(Dµ)(x)} for ε > 0. Then Mε ↗ M0 and

   |Mε | = ∫_{Mε} dx ≤ (1/ε) ∫_{Mε} (Dµ)(x) dx = (1/ε) µac (Mε ) ≤ (1/ε) µac (M0 ) = 0

shows |M0 | = lim_{ε↓0} |Mε | = 0.

   Note that the set M = {x | 0 < (Dµ)(x)} is a minimal support of µ.
Example. The Cantor function is constructed as follows: Take the sets
Cn used in the construction of the Cantor set C: Cn is the union of 2ⁿ
closed intervals with 2ⁿ − 1 open gaps in between. Set fn equal to j/2ⁿ
on the j'th gap of Cn and extend it to [0, 1] by linear interpolation. Note
that, since we are creating precisely one new gap between every old gap
when going from Cn to Cn+1 , the value of fn+1 is the same as the value of
fn on the gaps of Cn . In particular, ‖fn − fm ‖∞ ≤ 2^{−min(n,m)} and hence
we can define the Cantor function as f = lim_{n→∞} fn . By construction f
is a continuous function which is constant on every subinterval of [0, 1] \ C.
Since C is of Lebesgue measure zero, this set is of full Lebesgue measure
and hence f ′ = 0 a.e. in [0, 1]. In particular, the corresponding measure, the
Cantor measure, is supported on C and is purely singular with respect to
Lebesgue measure.
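
    Numerically, the Cantor function is most easily evaluated through its
self-similarity, f (x) = f (3x)/2 on [0, 1/3], f (x) = 1/2 on [1/3, 2/3] and
f (x) = 1/2 + f (3x − 2)/2 on [2/3, 1], which produces the same limit as the
gap-by-gap construction above. A small Python sketch (ours):

        def cantor(x, depth=40):
            # iterate the self-similarity relation; y collects the accumulated value
            y, scale = 0.0, 0.5
            for _ in range(depth):
                if x < 1.0 / 3.0:
                    x = 3.0 * x
                elif x > 2.0 / 3.0:
                    y += scale
                    x = 3.0 * x - 2.0
                else:
                    return y + scale          # x lies in a gap: f is constant there
                scale /= 2.0
            return y                          # accurate up to 2**(-depth)

        print(cantor(0.5), cantor(1.0 / 3.0), cantor(0.75))   # 0.5, 0.5, and about 2/3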

Problem A.16. Show that M = {x | 0 < (Dµ)(x)} is a minimal support of µ.
Bibliographical notes


The aim of this section is not to give a comprehensive guide to the literature,
but to document the sources from which I have learned the materials and
which I have used during the preparation of this text. In addition, I will
point out some standard references for further reading. In some sense all
books on this topic are inspired by von Neumann’s celebrated monograph
[64] and the present text is no exception.
   General references for the first part are Akhiezer and Glazman [2],
Berthier (Boutet de Monvel) [9], Blank, Exner, and Havlíček [10], Edmunds
and Evans [16], Lax [25], Reed and Simon [40], Weidmann [60], [62], or
Yosida [66].
Chapter 0: A first look at Banach and Hilbert spaces
As a reference for general background I can warmly recommend Kelley's
classical book [26]. The rest is standard material and can be found in any
book on functional analysis.
Chapter 1: Hilbert spaces
The material in this chapter is again classical and can be found in any book
on functional analysis. I mainly follow Reed and Simon [40], respectively,
Weidmann [60], with the main difference being that I use orthonormal sets
and their projections as the central theme from which everything else is
derived. For an alternate problem based approach see Halmos’ book [22].
Chapter 2: Self-adjointness and spectrum
This chapter is still similar in spirit to [40], [60] with some ideas taken from
Schechter [48].






Chapter 3: The spectral theorem
The approach via the Herglotz representation theorem follows Weidmann
[60]. However, I use projection-valued measures as in Reed and Simon [40]
rather than the resolution of the identity. Moreover, I have augmented the
discussion by adding material on spectral types and the connections with
the boundary values of the resolvent. For a survey containing several recent
results see [28].
Chapter 4: Applications of the spectral theorem
This chapter collects several applications from various sources which I have
found useful or which are needed later on. Again Reed and Simon [40] and
Weidmann [60], [63] are the main references here.
Chapter 5: Quantum dynamics
The material is a synthesis of the lecture notes by Enß [18], Reed and Simon
[40], [42], and Weidmann [63].
Chapter 6: Perturbation theory for self-adjoint operators
This chapter is similar to [60] (which contains more results) with the main
difference that I have added some material on quadratic forms. In particular,
the section on quadratic forms contains, in addition to the classical results,
some material which I consider useful but was unable to find (at least not
in the present form) in the literature. The prime reference here is Kato’s
monumental treatise [24] and Simon’s book [49]. For further information
on trace class operators see Simon’s classic [52]. The idea to extend the
usual notion of strong resolvent convergence by allowing the approximating
operators to live on subspaces is taken from Weidmann [62].
Chapter 7: The free Schrödinger operator
Most of the material is classical. Much more on the Fourier transform can
be found in Reed and Simon [41].
Chapter 8: Algebraic methods
This chapter collects some material which can be found in almost any physics
text book on quantum mechanics. My only contribution is to provide some
mathematical details. I recommend the classical book by Thirring [58] and
the visual guides by Thaller [56], [57].
Chapter 9: One-dimensional Schrödinger operators
One-dimensional models have always played a central role in understand-
ing quantum mechanical phenomena. In particular, general wisdom used to
say that Schrödinger operators should have absolutely continuous spectrum
plus some discrete point spectrum, while singular continuous spectrum is a
pathology that should not occur in examples with bounded V [14, Sect. 10.4].


In fact, a large part of [43] is devoted to establishing the absence of sin-
gular continuous spectrum. This was proven wrong by Pearson, who con-
structed an explicit one-dimensional example with singular continuous spec-
trum. Moreover, after the appearance of random models, it became clear
that such kind of exotic spectra (singular continuous or dense pure point)
are frequently generic. The starting point is often the boundary behaviour
of the Weyl m-function and its connection with the growth properties of
solutions of the underlying differential equation, the latter being known as
Gilbert and Pearson or subordinacy theory. One of my main goals is to give
a modern introduction to this theory. The section on inverse spectral theory
presents a simple proof for the Borg–Marchenko theorem (in the local ver-
sion of Simon) from Bennewitz [8]. Again this result is the starting point of
almost all other inverse spectral results for Sturm–Liouville equations and
should enable the reader to start reading research papers in this area.
    Other references with further information are the lecture notes by Weid-
mann [61] or the classical books by Coddington and Levinson [13], Levitan
[29], Levitan and Sargsjan [30], [31], Marchenko [33], Naimark [34], Pear-
son [37]. See also the recent monographs by Rofe-Beketov and Kholkin [46],
Zettl [67] or the recent collection of historic and survey articles [4]. For a
nice introduction to random models I can recommend the recent notes by
Kirsch [27] or the classical monographs by Carmona and Lacroix [11] or Pas-
tur and Figotin [36]. For the discrete analog of Sturm–Liouville operators,
Jacobi operators, see my monograph [54].
Chapter 10: One-particle Schrödinger operators
The presentation in the first two sections is influenced by Enß [18] and
Thirring [58]. The solution of the Schrödinger equation in spherical coordi-
nates can be found in any text book on quantum mechanics. Again I tried
to provide some missing mathematical details. Several other explicitly solv-
able examples can be found in the books by Albeverio et al. [3] or Flügge
[19]. For the formulation of quantum mechanics via path integrals I suggest
Roepstorff [45] or Simon [50].
Chapter 11: Atomic Schrödinger operators
This chapter essentially follows Cycon, Froese, Kirsch, and Simon [14]. For
a recent review see Simon [51].
Chapter 12: Scattering theory
This chapter follows the lecture notes by Enß [18] (see also [17]) using some
material from Perry [38]. Further information on mathematical scattering
theory can be found in Amrein, Jauch, and Sinha [5], Baumgaertel and
Wollenberg [6], Chadan and Sabatier [12], Cycon, Froese, Kirsch, and Simon
[14], Newton [35], Pearson [37], Reed and Simon [42], or Yafaev [65].


Appendix A: Almost everything about Lebesgue integration
Most parts follow Rudin’s book [47], respectively, Bauer [7], with some ideas
also taken from Weidmann [60]. I have tried to strip everything down to the
results needed here while staying self-contained. Another useful reference is
the book by Lieb and Loss [32].
Bibliography


  [1] M. Abramovitz and I. A. Stegun, Handbook of Mathematical Functions, Dover,
      New York, 1972.
  [2] N. I. Akhiezer and I. M. Glazman, Theory of Linear Operators in Hilbert Space,
      Vols. I and II, Pitman, Boston, 1981.
  [3] S. Albeverio, F. Gesztesy, R. Høegh-Krohn, and H. Holden, Solvable Models
      in Quantum Mechanics, 2nd ed., American Mathematical Society, Providence,
      2005.
  [4] W. O. Amrein, A. M. Hinz, and D. B. Pearson, Sturm–Liouville Theory: Past
      and Present, Birkhäuser, Basel, 2005.
  [5] W. O. Amrein, J. M. Jauch, and K. B. Sinha, Scattering Theory in Quantum
      Mechanics, W. A. Benjamin Inc., New York, 1977.
  [6] H. Baumgaertel and M. Wollenberg, Mathematical Scattering Theory,
      Birkhäuser, Basel, 1983.
  [7] H. Bauer, Measure and Integration Theory, de Gruyter, Berlin, 2001.
  [8] C. Bennewitz, A proof of the local Borg–Marchenko theorem, Commun. Math.
      Phys. 218, 131–132 (2001).
  [9] A. M. Berthier, Spectral Theory and Wave Operators for the Schrödinger Equa-
      tion, Pitman, Boston, 1982.
 [10] J. Blank, P. Exner, and M. Havlíček, Hilbert-Space Operators in Quantum
      Physics, 2nd ed., Springer, Dordrecht, 2008.
 [11] R. Carmona and J. Lacroix, Spectral Theory of Random Schrödinger Operators,
      Birkhäuser, Boston, 1990.
 [12] K. Chadan and P. C. Sabatier, Inverse Problems in Quantum Scattering Theory,
      Springer, New York, 1989.
 [13] E. A. Coddington and N. Levinson, Theory of Ordinary Differential Equations,
      Krieger, Malabar, 1985.
 [14] H. L. Cycon, R. G. Froese, W. Kirsch, and B. Simon, Schrödinger Operators,
      2nd printing, Springer, Berlin, 2008.
 [15] M. Demuth and M. Krishna, Determining Spectra in Quantum Theory,
      Birkhäuser, Boston, 2005.



      [16] D. E. Edmunds and W. D. Evans, Spectral Theory and Differential Operators,
           Oxford University Press, Oxford, 1987.
      [17] V. Enss, Asymptotic completeness for quantum mechanical potential scattering,
           Comm. Math. Phys. 61, 285–291 (1978).
       [18] V. Enß, Schrödinger Operators, lecture notes (unpublished).
       [19] S. Flügge, Practical Quantum Mechanics, Springer, Berlin, 1994.
       [20] I. Gohberg, S. Goldberg, and N. Krupnik, Traces and Determinants of Linear
            Operators, Birkhäuser, Basel, 2000.
      [21] S. Gustafson and I. M. Sigal, Mathematical Concepts of Quantum Mechanics,
           Springer, Berlin, 2003.
      [22] P. R. Halmos, A Hilbert Space Problem Book, 2nd ed., Springer, New York, 1984.
      [23] P. D. Hislop and I. M. Sigal, Introduction to Spectral Theory, Springer, New
           York, 1996.
      [24] T. Kato, Perturbation Theory for Linear Operators, Springer, New York, 1966.
      [25] P. D. Lax, Functional Analysis, Wiley-Interscience, New York, 2002.
       [26] J. L. Kelley, General Topology, Springer, New York, 1955.
       [27] W. Kirsch, An Invitation to Random Schrödinger Operators, in Random
            Schrödinger Operators, M. Dissertori et al. (eds.), 1–119, Panoramas et Synthèses
            25, Société Mathématique de France, Paris, 2008.
      [28] Y. Last, Quantum dynamics and decompositions of singular continuous spectra,
           J. Funct. Anal. 142, 406–445 (1996).
      [29] B. M. Levitan, Inverse Sturm–Liouville Problems, VNU Science Press, Utrecht,
           1987.
      [30] B. M. Levitan and I. S. Sargsjan, Introduction to Spectral Theory, American
           Mathematical Society, Providence, 1975.
      [31] B. M. Levitan and I. S. Sargsjan, Sturm–Liouville and Dirac Operators, Kluwer
           Academic Publishers, Dordrecht, 1991.
      [32] E. Lieb and M. Loss, Analysis, American Mathematical Society, Providence,
           1997.
       [33] V. A. Marchenko, Sturm–Liouville Operators and Applications, Birkhäuser,
            Basel, 1986.
      [34] M.A. Naimark, Linear Differential Operators, Parts I and II , Ungar, New York,
           1967 and 1968.
      [35] R. G. Newton, Scattering Theory of Waves and Particles, 2nd ed., Dover, New
           York, 2002.
      [36] L. Pastur and A. Figotin, Spectra of Random and Almost-Periodic Operators,
           Springer, Berlin, 1992.
      [37] D. Pearson, Quantum Scattering and Spectral Theory, Academic Press, London,
           1988.
      [38] P. Perry, Mellin transforms and scattering theory, Duke Math. J. 47, 187–193
           (1987).
       [39] E. Prugovečki, Quantum Mechanics in Hilbert Space, 2nd ed., Academic Press,
            New York, 1981.
      [40] M. Reed and B. Simon, Methods of Modern Mathematical Physics I. Functional
           Analysis, rev. and enl. ed., Academic Press, San Diego, 1980.


   [41] M. Reed and B. Simon, Methods of Modern Mathematical Physics II. Fourier
        Analysis, Self-Adjointness, Academic Press, San Diego, 1975.
   [42] M. Reed and B. Simon, Methods of Modern Mathematical Physics III. Scattering
        Theory, Academic Press, San Diego, 1979.
   [43] M. Reed and B. Simon, Methods of Modern Mathematical Physics IV. Analysis
        of Operators, Academic Press, San Diego, 1978.
   [44] J. R. Retherford, Hilbert Space: Compact Operators and the Trace Theorem,
        Cambridge University Press, Cambridge, 1993.
   [45] G. Roepstorff, Path Integral Approach to Quantum Physics, Springer, Berlin,
        1994.
   [46] F.S. Rofe-Beketov and A.M. Kholkin, Spectral Analysis of Differential Operators.
        Interplay Between Spectral and Oscillatory Properties, World Scientific, Hacken-
        sack, 2005.
   [47] W. Rudin, Real and Complex Analysis, 3rd ed., McGraw-Hill, New York, 1987.
   [48] M. Schechter, Operator Methods in Quantum Mechanics, North Holland, New
        York, 1981.
   [49] B. Simon, Quantum Mechanics for Hamiltonians Defined as Quadratic Forms,
        Princeton University Press, Princeton, 1971.
   [50] B. Simon, Functional Integration and Quantum Physics, Academic Press, New
        York, 1979.
    [51] B. Simon, Schrödinger operators in the twentieth century, J. Math. Phys. 41:6,
         3523–3555 (2000).
    [52] B. Simon, Trace Ideals and Their Applications, 2nd ed., American Mathemat-
        ical Society, Providence, 2005.
   [53] E. Stein and R. Shakarchi, Complex Analysis, Princeton University Press, Prince-
        ton, 2003.
   [54] G. Teschl, Jacobi Operators and Completely Integrable Nonlinear Lattices, Math.
        Surv. and Mon. 72, Amer. Math. Soc., Rhode Island, 2000.
   [55] B. Thaller, The Dirac Equation, Springer, Berlin 1992.
   [56] B. Thaller, Visual Quantum Mechanics, Springer, New York, 2000.
   [57] B. Thaller, Advanced Visual Quantum Mechanics, Springer, New York, 2005.
   [58] W. Thirring, Quantum Mechanics of Atoms and Molecules, Springer, New York,
        1981.
   [59] G. N. Watson, A Treatise on the Theory of Bessel Functions, 2nd ed., Cambridge
        University Press, Cambridge, 1962.
   [60] J. Weidmann, Linear Operators in Hilbert Spaces, Springer, New York, 1980.
   [61] J. Weidmann, Spectral Theory of Ordinary Differential Operators, Lecture Notes
        in Mathematics, 1258, Springer, Berlin, 1987.
    [62] J. Weidmann, Lineare Operatoren in Hilberträumen, Teil 1: Grundlagen, B. G.
         Teubner, Stuttgart, 2000.
    [63] J. Weidmann, Lineare Operatoren in Hilberträumen, Teil 2: Anwendungen, B.
        G. Teubner, Stuttgart, 2003.
   [64] J. von Neumann, Mathematical Foundations of Quantum Mechanics, Princeton
        University Press, Princeton, 1996.
   [65] D. R. Yafaev, Mathematical Scattering Theory: General Theory, American Math-
        ematical Society, Providence, 1992.


      [66] K. Yosida, Functional Analysis, 6th ed., Springer, Berlin, 1980.
      [67] A. Zettl, Sturm–Liouville Theory, American Mathematical Society, Providence,
           2005.
Glossary of notation


AC(I)         . . . absolutely continuous functions, 84
B              = B1
Bn            . . . Borel σ-field of Rn , 260
C(H)          . . . set of compact operators, 128
C(U )         . . . set of continuous functions from U to C
C∞ (U )       . . . set of functions in C(U ) which vanish at ∞
C(U, V )      . . . set of continuous functions from U to V
Cc∞ (U, V )   . . . set of compactly supported smooth functions
χΩ (.)        . . . characteristic function of the set Ω
dim           . . . dimension of a linear space
dist(x, Y )    = inf y∈Y x − y , distance between x and Y
D(.)          . . . domain of an operator
e             . . . exponential function, ez = exp(z)
E(A)          . . . expectation of an operator A, 55
F             . . . Fourier transform, 161
H             . . . Schrödinger operator, 221
H0            . . . free Schrödinger operator, 167
H m (a, b)    . . . Sobolev space, 85
H m (Rn )     . . . Sobolev space, 164
hull(.)       . . . convex hull
H             . . . a separable Hilbert space
i             . . . complex unity, i2 = −1
I             . . . identity operator
Im(.)         . . . imaginary part of a complex number
inf           . . . infimum
Ker(A)        . . . kernel of an operator A, 22




      L(X, Y )     . . . set of all bounded linear operators from X to Y , 23
      L(X)          = L(X, X)
      Lp (X, dµ)   . . . Lebesgue space of p integrable functions, 26
       Lploc (X, dµ) . . . locally p integrable functions, 31
       Lpc (X, dµ)  . . . compactly supported p integrable functions
       L∞ (X, dµ)   . . . Lebesgue space of bounded functions, 26
       L∞∞ (Rn )    . . . Lebesgue space of bounded functions vanishing at ∞
       ℓ1 (N)       . . . Banach space of summable sequences, 13
       ℓ2 (N)       . . . Hilbert space of square summable sequences, 17
       ℓ∞ (N)       . . . Banach space of bounded sequences, 13
      λ            . . . a real number
      ma (z)       . . . Weyl m-function, 199
      M (z)        . . . Weyl M -matrix, 211
      max          . . . maximum
      M            . . . Mellin transform, 251
      µψ           . . . spectral measure, 95
      N            . . . the set of positive integers
      N0            = N ∪ {0}
      o(x)         . . . Landau symbol little-o
      O(x)         . . . Landau symbol big-O
      Ω            . . . a Borel set
      Ω±           . . . wave operators, 247
      PA (.)       . . . family of spectral projections of an operator A, 96
      P±           . . . projector onto outgoing/incoming states, 250
      Q            . . . the set of rational numbers
      Q(.)         . . . form domain of an operator, 97
      R(I, X)      . . . set of regulated functions, 112
      RA (z)       . . . resolvent of A, 74
      Ran(A)       . . . range of an operator A, 22
      rank(A)       = dim Ran(A), rank of an operator A, 127
      Re(.)        . . . real part of a complex number
      ρ(A)         . . . resolvent set of A, 73
      R            . . . the set of real numbers
      S(I, X)      . . . set of simple functions, 112
      S(Rn )       . . . set of smooth functions with rapid decay, 161
       sign(x)      . . . +1 for x > 0 and −1 for x < 0; sign function
      σ(A)         . . . spectrum of an operator A, 73
      σac (A)      . . . absolutely continuous spectrum of A, 106
      σsc (A)      . . . singular continuous spectrum of A, 106
      σpp (A)      . . . pure point spectrum of A, 106
      σp (A)       . . . point spectrum (set of eigenvalues) of A, 103
      σd (A)       . . . discrete spectrum of A, 145
      σess (A)     . . . essential spectrum of A, 145


    span(M )      . . . set of finite linear combinations from M , 14
    sup           . . . supremum
    supp(f )      . . . support of a function f , 7
    Z             . . . the set of integers
    z             . . . a complex number
     √z            . . . square root of z with branch cut along (−∞, 0]
    z∗            . . . complex conjugation
    A∗            . . . adjoint of A, 59
     Ā             . . . closure of A, 63
     f̂              = Ff , Fourier transform of f , 161
     f̌              = F −1 f , inverse Fourier transform of f , 163
     ‖.‖            . . . norm in the Hilbert space H, 17
     ‖.‖p           . . . norm in the Banach space Lp , 25
     ⟨., ..⟩        . . . scalar product in H, 17
     Eψ (A)         = ⟨ψ, Aψ⟩, expectation value, 56
     ∆ψ (A)         = Eψ (A²) − Eψ (A)², variance, 56
    ∆             . . . Laplace operator, 167
    ∂             . . . gradient, 162
    ∂α            . . . derivative, 161
    ⊕             . . . orthogonal sum of linear spaces or operators, 45, 79
    M⊥            . . . orthogonal complement, 43
     A′            . . . complement of a set
     (λ1 , λ2 )     = {λ ∈ R | λ1 < λ < λ2 }, open interval
    [λ1 , λ2 ]     = {λ ∈ R | λ1 ≤ λ ≤ λ2 }, closed interval
    ψn → ψ        . . . norm convergence, 12
     ψn ⇀ ψ        . . . weak convergence, 49
    An → A        . . . norm convergence
     An →ˢ A       . . . strong convergence, 50
     An ⇀ A        . . . weak convergence, 50
     An →ⁿʳ A      . . . norm resolvent convergence, 153
     An →ˢʳ A      . . . strong resolvent convergence, 153
Index


a.e., see almost everywhere          operator, 23
absolute value of an operator, 99    sesquilinear form, 21
absolute convergence, 16
absolutely continuous
                                   C-real, 83
   function, 84
                                   canonical form of compact operators, 137
   measure, 280
                                   Cantor
   spectrum, 106
                                      function, 287
adjoint operator, 47, 59
                                      measure, 287
algebra, 259
                                      set, 262
almost everywhere, 262
                                   Cauchy sequence, 6
angular momentum operator, 176
                                   Cauchy–Schwarz–Bunjakowski inequality,
                                         18
B.L.T. theorem, 23                 Cayley transform, 81
Baire category theorem, 32         Cesàro average, 126
Banach algebra, 24                 characteristic function, 270
Banach space, 13                   closable
Banach–Steinhaus theorem, 33          form, 71
base, 5                               operator, 63
basis, 14                          closed
  orthonormal, 40                     form, 71
  spectral, 93                        operator, 63
Bessel function, 171               closed graph theorem, 66
  spherical, 230                   closed set, 5
Bessel inequality, 39              closure, 5
Borel                                 essential, 104
  function, 269                    commute, 115
  measure, 261                     compact, 8
     regular, 261                     locally, 10
  set, 260                            sequentially, 8
  σ-algebra, 260                   complete, 6, 13
  transform, 95, 100               completion, 22
boundary condition                 configuration space, 56
  Dirichlet, 188                   conjugation, 83
  Neumann, 188                     continuous, 7
  periodic, 188                    convergence, 6
bounded                            convolution, 165




core, 63                             graph, 63
cover, 8                             graph norm, 64
C ∗ algebra, 48                      Green’s function, 171
cyclic vector, 93                    ground state, 235

dense, 6                             Hamiltonian, 57
dilation group, 223                  harmonic oscillator, 178
Dirac measure, 261, 274              Hausdorff space, 5
Dirichlet boundary condition, 188    Heine–Borel theorem, 10
discrete topology, 4                 Heisenberg picture, 130
distance, 3, 10                      Hellinger-Toeplitz theorem, 67
distribution function, 261           Herglotz
domain, 22, 56, 58                     function, 95
dominated convergence theorem, 273     representation theorem, 107
                                     Hermite polynomials, 179
eigenspace, 112                      hermitian
eigenvalue, 74                         form, 71
   multiplicity, 112                   operator, 58
eigenvector, 74                      Hilbert space, 17, 37
element                                separable, 41
   adjoint, 48                       Hölder's inequality, 26
   normal, 48                        HVZ theorem, 242
   positive, 48                      hydrogen atom, 222
   self-adjoint, 48
   unitary, 48                       ideal, 48
equivalent norms, 20                 identity, 24
essential                            induced topology, 5
   closure, 104                      inner product, 17
   range, 74                         inner product space, 17
   spectrum, 145                     integrable, 273
   supremum, 26                      integral, 270
expectation, 55                      interior, 6
extension, 59                        interior point, 4
                                     intertwining property, 248
finite intersection property, 8       involution, 48
first resolvent formula, 75           ionization, 242
form, 71
  bound, 149                         Jacobi operator, 67
  bounded, 21, 72
  closable, 71                       Kato–Rellich theorem, 135
  closed, 71                         kernel, 22
  core, 72                           KLMN theorem, 150
  domain, 68, 97
  hermitian, 71                      l.c., see limit circle
  nonnegative, 71                    l.p., see limit point
  semi-bounded, 71                   Lagrange identity, 182
Fourier                              Laguerre polynomial, 231
  series, 41                            generalized, 231
  transform, 126, 161                Lebesgue
Friedrichs extension, 70                decomposition, 281
Fubini theorem, 276                     measure, 262
function                                point, 285
  absolutely continuous, 84          Legendre equation, 226
                                     lemma
Gaussian wave packet, 175               Riemann-Lebesgue, 165
gradient, 162                        Lidskij trace theorem, 143
Gram–Schmidt orthogonalization, 42   limit circle, 187


limit point, 4, 187                 ONS, see orthonormal set
Lindelöf theorem, 8                  open ball, 4
linear                              open set, 4
   functional, 24, 44               operator
   operator, 22                       adjoint, 47, 59
Liouville normal form, 186            bounded, 23
localization formula, 243             bounded from below, 70
                                      closable, 63
maximum norm, 12                      closed, 63
mean-square deviation, 56             closure, 63
measurable                            compact, 128
 function, 269                        domain, 22, 58
 set, 260                             finite rank, 127
 space, 260                           hermitian, 58
measure, 260                          Hilbert–Schmidt, 139
 absolutely continuous, 280           linear, 22, 58
 complete, 268                        nonnegative, 68
 finite, 260                           normal, 60, 67, 91
 growth point, 99                     positive, 68
 Lebesgue, 262                        relatively bounded, 133
 minimal support, 286                 relatively compact, 128
 mutually singular, 280               self-adjoint, 59
 product, 275                         semi-bounded, 70
 projection-valued, 88                strong convergence, 50
 space, 260                           symmetric, 58
 spectral, 95                         unitary, 39, 57
 support, 262                         weak convergence, 50
Mellin transform, 251               orthogonal, 17, 38
metric space, 3                       complement, 43
Minkowski’s inequality, 27            polynomials, 228
mollifier, 30                          projection, 43
momentum operator, 174                sum, 45
monotone convergence theorem, 271   orthonormal
multi-index, 161                      basis, 40
 order, 161                           set, 38
multiplicity                        orthonormal basis, 40
 spectral, 94                       oscillating, 219
                                    outer measure, 266
neighborhood, 4
Neumann                             parallel, 17, 38
  boundary condition, 188           parallelogram law, 19
  function                          parity operator, 98
    spherical, 230                  Parseval’s identity, 163
  series, 76                        partial isometry, 99
Nevanlinna function, 95             partition of unity, 11
Noether theorem, 174                perpendicular, 17, 38
norm, 12                            phase space, 56
   operator, 23                      Plücker identity, 186
norm resolvent convergence, 153     polar decomposition, 99
normal, 11, 91                      polarization identity, 19, 39, 58
normalized, 17, 38                  position operator, 173
normed space, 12                    positivity
nowhere dense, 32                     improving, 235
                                      preserving, 235
observable, 55                      premeasure, 260
ONB, see orthonormal basis          probability density, 55
one-parameter unitary group, 57     product measure, 275


product topology, 8              spectral
projection, 48                      basis, 93
pure point spectrum, 106               ordered, 105
Pythagorean theorem, 17, 38         mapping theorem, 105
                                    measure
quadrangle inequality, 11              maximal, 105
quadratic form, 58, see form        theorem, 97
                                       compact operators, 136
Radon–Nikodym                       vector, 93
   derivative, 282                     maximal, 105
   theorem, 282                  spectrum, 73
RAGE theorem, 129                   absolutely continuous, 106
range, 22                           discrete, 145
   essential, 74                    essential, 145
rank, 127                           pure point, 106
reducing subspace, 80               singularly continuous, 106
regulated function, 112          spherical coordinates, 224
relatively compact, 128          spherical harmonics, 227
resolution of the identity, 89   spherically symmetric, 166
resolvent, 74                    ∗-ideal, 48
   convergence, 153              ∗-subalgebra, 48
   formula                       stationary phase, 252
     first, 75                    Stieltjes inversion formula, 95, 114
     second, 135                 Stone theorem, 124
   Neumann series, 76            Stone’s formula, 114
   set, 73                       Stone–Weierstraß theorem, 52
Riesz lemma, 44                  strong convergence, 50
                                 strong resolvent convergence, 153
scalar product, 17               Sturm comparison theorem, 218
scattering operator, 248         Sturm–Liouville equation, 181
scattering state, 248               regular, 182
Schatten p-class, 141            subcover, 8
Schauder basis, 14               subordinacy, 207
Schrödinger equation, 57          subordinate solution, 208
Schur criterion, 28              subspace
second countable, 5                 reducing, 80
second resolvent formula, 135    superposition, 56
self-adjoint                     supersymmetric quantum mechanics, 180
   essentially, 63               support, 7
semi-metric, 3
separable, 6, 14                 Temple’s inequality, 120
series                           tensor product, 46
   absolutely convergent, 16     theorem
sesquilinear form, 17              B.L.T., 23
   bounded, 21                     Bair, 32
   parallelogram law, 21           Banach–Steinhaus, 33
   polarization identity, 21       closed graph, 66
short range, 253                   dominated convergence, 273
σ-algebra, 259                     Fubini, 276
σ-finite, 260                       Heine–Borel, 10
simple function, 112, 270          Hellinger-Toeplitz, 67
simple spectrum, 94                Herglotz, 107
singular values, 137               HVZ, 242
singularly continuous              Kato–Rellich, 135
   spectrum, 106                   KLMN, 150
Sobolev space, 164                 Lebesgue decomposition, 281
span, 14                           Lindelöf, 8


   monotone convergence, 271        Weyl–Titchmarsh m-function, 199
   Noether, 174                     Wiener theorem, 126
   Pythagorean, 17, 38              Wronskian, 182
   Radon–Nikodym, 282
   RAGE, 129                        Young’s inequality, 165
   Riesz, 44
   Schur, 28
   spectral, 97
   spectral mapping, 105
   Stone, 124
   Stone–Weierstraß, 52
   Sturm, 218
   Urysohn, 10
   virial, 223
   Weierstraß, 14
   Weyl, 146
   Wiener, 126, 167
topological space, 4
topology
   base, 5
   product, 8
total, 14
trace, 143
   class, 142
triangle inequality, 3, 12
   inverse, 3, 12
trivial topology, 4
Trotter product formula, 131

uncertainty principle, 174
uniform boundedness principle, 33
unit vector, 17, 38
unitary group, 57
  generator, 57
  strongly continuous, 57
  weakly continuous, 124
Urysohn lemma, 10

variance, 56
virial theorem, 223
Vitali set, 262

wave
  function, 55
  operators, 247
weak
  Cauchy sequence, 49
  convergence, 24, 49
  derivative, 85, 164
Weierstraß approximation, 14
Weyl
  M -matrix, 211
  circle, 194
  relations, 174
  sequence, 76
     singular, 145
  theorem, 146

More Related Content

PDF
Clarkson r., mc keon d.g.c. quantum field theory (u.waterloo
PDF
A Route to Chaos for the Physical Double Pendulum by
PDF
E03503025029
PPTX
Rigit rotar
PPT
Photos
PPT
Graham Storms Photographer
PPT
ACRLNEC 2009 - Building Community: How Combined Training Improves Customer Se...
Clarkson r., mc keon d.g.c. quantum field theory (u.waterloo
A Route to Chaos for the Physical Double Pendulum by
E03503025029
Rigit rotar
Photos
Graham Storms Photographer
ACRLNEC 2009 - Building Community: How Combined Training Improves Customer Se...

Similar to Mathematical methods in quantum mechanics (20)

PDF
Book algebra
PDF
Essentialphysics1
PDF
Quantum mechanics
PDF
Partial differential equations and complex analysis
PDF
Firk essential physics [yale 2000] 4 ah
PDF
Semigroups For Delay Equations Btkai Andrs Piazzera Susanna
PDF
Semigroups For Delay Equations Btkai Andrs Piazzera Susanna
PDF
An introduction to linear algebra
PDF
Limit Operators And Their Applications In Operator Theory Vladimir Rabinovitch
PDF
Theory Of Functions Of A Real Variable Shlomo Sternberg
PDF
Quantm mechanics book
PDF
A Course On Large Deviations With An Introduction To Gibbs Measures Firas Ras...
PDF
A Course On Large Deviations With An Introduction To Gibbs Measures Firas Ras...
PDF
A Course On Large Deviations With An Introduction To Gibbs Measures Firas Ras...
PDF
Discretetime Dynamics Of Structured Populations And Homogeneous Orderpreservi...
PDF
Problems in mathematics
PDF
Calculus Research Lab 3: Differential Equations!
PDF
Smoothed Particle Hydrodynamics
PDF
Scale Relativity And Fractal Spacetime A New Approach To Unifying Relativity ...
PDF
Ajit Kumar Fundamentals Of Quantum Mechanics Cambridge University Press (2018)
Book algebra
Essentialphysics1
Quantum mechanics
Partial differential equations and complex analysis
Firk essential physics [yale 2000] 4 ah
Semigroups For Delay Equations Btkai Andrs Piazzera Susanna
Semigroups For Delay Equations Btkai Andrs Piazzera Susanna
An introduction to linear algebra
Limit Operators And Their Applications In Operator Theory Vladimir Rabinovitch
Theory Of Functions Of A Real Variable Shlomo Sternberg
Quantm mechanics book
A Course On Large Deviations With An Introduction To Gibbs Measures Firas Ras...
A Course On Large Deviations With An Introduction To Gibbs Measures Firas Ras...
A Course On Large Deviations With An Introduction To Gibbs Measures Firas Ras...
Discretetime Dynamics Of Structured Populations And Homogeneous Orderpreservi...
Problems in mathematics
Calculus Research Lab 3: Differential Equations!
Smoothed Particle Hydrodynamics
Scale Relativity And Fractal Spacetime A New Approach To Unifying Relativity ...
Ajit Kumar Fundamentals Of Quantum Mechanics Cambridge University Press (2018)
Ad

More from Sergio Zaina (7)

PDF
427 chess combinations (collection)
PDF
Alburt, lev & dzindzichashvili & perelshteyn chess openings for black, expl...
PDF
Toward a theory of chaos
PDF
The control of chaos
PDF
Chaos theory
PDF
A mathematical theory of communication
PDF
Mecanica quantica
427 chess combinations (collection)
Alburt, lev & dzindzichashvili & perelshteyn chess openings for black, expl...
Toward a theory of chaos
The control of chaos
Chaos theory
A mathematical theory of communication
Mecanica quantica
Ad

Mathematical methods in quantum mechanics

  • 1. Mathematical Methods in Quantum Mechanics With Applications to Schr¨dinger Operators o Gerald Teschl Note: The AMS has granted the permission to post this online edition! This version is for personal online use only! If you like this book and want to support the idea of online versions, please consider buying this book: http://www.ams.org/bookstore-getitem?item=gsm-99 Graduate Studies in Mathematics Volume 99 American Mathematical Society Providence, Rhode Island
  • 2. Editorial Board David Cox (Chair) Steven G. Krants Rafe Mazzeo Martin Scharlemann 2000 Mathematics subject classification. 81-01, 81Qxx, 46-01, 34Bxx, 47B25 Abstract. This book provides a self-contained introduction to mathematical methods in quan- tum mechanics (spectral theory) with applications to Schr¨dinger operators. The first part cov- o ers mathematical foundations of quantum mechanics from self-adjointness, the spectral theorem, quantum dynamics (including Stone’s and the RAGE theorem) to perturbation theory for self- adjoint operators. The second part starts with a detailed study of the free Schr¨dinger operator respectively o position, momentum and angular momentum operators. Then we develop Weyl–Titchmarsh the- ory for Sturm–Liouville operators and apply it to spherically symmetric problems, in particular to the hydrogen atom. Next we investigate self-adjointness of atomic Schr¨dinger operators and o their essential spectrum, in particular the HVZ theorem. Finally we have a look at scattering theory and prove asymptotic completeness in the short range case. For additional information and updates on this book, visit: http://www.ams.org/bookpages/gsm-99/ Typeset by L TEXand Makeindex. Version: February 17, 2009. A Library of Congress Cataloging-in-Publication Data Teschl, Gerald, 1970– Mathematical methods in quantum mechanics : with applications to Schr¨dinger operators o / Gerald Teschl. p. cm. — (Graduate Studies in Mathematics ; v. 99) Includes bibliographical references and index. ISBN 978-0-8218-4660-5 (alk. paper) 1. Schr¨dinger operators. 2. Quantum theory—Mathematics. I. Title. o QC174.17.S3T47 2009 2008045437 515’.724–dc22 Copying and reprinting. Individual readers of this publication, and nonprofit libraries acting for them, are permitted to make fair use of the material, such as to copy a chapter for use in teaching or research. Permission is granted to quote brief passages from this publication in reviews, provided the customary acknowledgement of the source is given. Republication, systematic copying, or multiple reproduction of any material in this pub- lication (including abstracts) is permitted only under license from the American Mathematical Society. Requests for such permissions should be addressed to the Assistant to the Publisher, American Mathematical Society, P.O. Box 6248, Providence, Rhode Island 02940-6248. Requests can also be made by e-mail to reprint-permission@ams.org. c 2009 by the American Mathematical Society. All rights reserved. The American Mathematical Society retains all rights except those granted too the United States Government.
  • 3. To Susanne, Simon, and Jakob
  • 5. Contents

Preface xi

Part 0. Preliminaries

Chapter 0. A first look at Banach and Hilbert spaces 3
  §0.1. Warm up: Metric and topological spaces 3
  §0.2. The Banach space of continuous functions 12
  §0.3. The geometry of Hilbert spaces 16
  §0.4. Completeness 22
  §0.5. Bounded operators 22
  §0.6. Lebesgue Lp spaces 25
  §0.7. Appendix: The uniform boundedness principle 32

Part 1. Mathematical Foundations of Quantum Mechanics

Chapter 1. Hilbert spaces 37
  §1.1. Hilbert spaces 37
  §1.2. Orthonormal bases 39
  §1.3. The projection theorem and the Riesz lemma 43
  §1.4. Orthogonal sums and tensor products 45
  §1.5. The C∗ algebra of bounded linear operators 47
  §1.6. Weak and strong convergence 49
  §1.7. Appendix: The Stone–Weierstraß theorem 51

Chapter 2. Self-adjointness and spectrum 55
  §2.1. Some quantum mechanics 55
  §2.2. Self-adjoint operators 58
  §2.3. Quadratic forms and the Friedrichs extension 67
  §2.4. Resolvents and spectra 73
  §2.5. Orthogonal sums of operators 79
  §2.6. Self-adjoint extensions 81
  §2.7. Appendix: Absolutely continuous functions 84

Chapter 3. The spectral theorem 87
  §3.1. The spectral theorem 87
  §3.2. More on Borel measures 99
  §3.3. Spectral types 104
  §3.4. Appendix: The Herglotz theorem 107

Chapter 4. Applications of the spectral theorem 111
  §4.1. Integral formulas 111
  §4.2. Commuting operators 115
  §4.3. The min-max theorem 117
  §4.4. Estimating eigenspaces 119
  §4.5. Tensor products of operators 120

Chapter 5. Quantum dynamics 123
  §5.1. The time evolution and Stone’s theorem 123
  §5.2. The RAGE theorem 126
  §5.3. The Trotter product formula 131

Chapter 6. Perturbation theory for self-adjoint operators 133
  §6.1. Relatively bounded operators and the Kato–Rellich theorem 133
  §6.2. More on compact operators 136
  §6.3. Hilbert–Schmidt and trace class operators 139
  §6.4. Relatively compact operators and Weyl’s theorem 145
  §6.5. Relatively form bounded operators and the KLMN theorem 149
  §6.6. Strong and norm resolvent convergence 153

Part 2. Schrödinger Operators

Chapter 7. The free Schrödinger operator 161
  §7.1. The Fourier transform 161
  §7.2. The free Schrödinger operator 167
  §7.3. The time evolution in the free case 169
  §7.4. The resolvent and Green’s function 171

Chapter 8. Algebraic methods 173
  §8.1. Position and momentum 173
  §8.2. Angular momentum 175
  §8.3. The harmonic oscillator 178
  §8.4. Abstract commutation 179

Chapter 9. One-dimensional Schrödinger operators 181
  §9.1. Sturm–Liouville operators 181
  §9.2. Weyl’s limit circle, limit point alternative 187
  §9.3. Spectral transformations I 195
  §9.4. Inverse spectral theory 202
  §9.5. Absolutely continuous spectrum 206
  §9.6. Spectral transformations II 209
  §9.7. The spectra of one-dimensional Schrödinger operators 214

Chapter 10. One-particle Schrödinger operators 221
  §10.1. Self-adjointness and spectrum 221
  §10.2. The hydrogen atom 222
  §10.3. Angular momentum 225
  §10.4. The eigenvalues of the hydrogen atom 229
  §10.5. Nondegeneracy of the ground state 235

Chapter 11. Atomic Schrödinger operators 239
  §11.1. Self-adjointness 239
  §11.2. The HVZ theorem 242

Chapter 12. Scattering theory 247
  §12.1. Abstract theory 247
  §12.2. Incoming and outgoing states 250
  §12.3. Schrödinger operators with short range potentials 253

Part 3. Appendix

Appendix A. Almost everything about Lebesgue integration 259
  §A.1. Borel measures in a nut shell 259
  §A.2. Extending a premeasure to a measure 263
  §A.3. Measurable functions 268
  §A.4. The Lebesgue integral 270
  §A.5. Product measures 275
  §A.6. Vague convergence of measures 278
  §A.7. Decomposition of measures 280
  §A.8. Derivatives of measures 282

Bibliographical notes 289
Bibliography 293
Glossary of notation 297
Index 301
  • 9. Preface Overview The present text was written for my course Schr¨dinger Operators held o at the University of Vienna in winter 1999, summer 2002, summer 2005, and winter 2007. It gives a brief but rather self-contained introduction to the mathematical methods of quantum mechanics with a view towards applications to Schr¨dinger operators. The applications presented are highly o selective and many important and interesting items are not touched upon. Part 1 is a stripped down introduction to spectral theory of unbounded operators where I try to introduce only those topics which are needed for the applications later on. This has the advantage that you will (hopefully) not get drowned in results which are never used again before you get to the applications. In particular, I am not trying to present an encyclopedic reference. Nevertheless I still feel that the first part should provide a solid background covering many important results which are usually taken for granted in more advanced books and research papers. My approach is built around the spectral theorem as the central object. Hence I try to get to it as quickly as possible. Moreover, I do not take the detour over bounded operators but I go straight for the unbounded case. In addition, existence of spectral measures is established via the Herglotz theorem rather than the Riesz representation theorem since this approach paves the way for an investigation of spectral types via boundary values of the resolvent as the spectral parameter approaches the real line. xi
  • 10. xii Preface Part 2 starts with the free Schr¨dinger equation and computes the o free resolvent and time evolution. In addition, I discuss position, momen- tum, and angular momentum operators via algebraic methods. This is usually found in any physics textbook on quantum mechanics, with the only difference that I include some technical details which are typically not found there. Then there is an introduction to one-dimensional mod- els (Sturm–Liouville operators) including generalized eigenfunction expan- sions (Weyl–Titchmarsh theory) and subordinacy theory from Gilbert and Pearson. These results are applied to compute the spectrum of the hy- drogen atom, where again I try to provide some mathematical details not found in physics textbooks. Further topics are nondegeneracy of the ground state, spectra of atoms (the HVZ theorem), and scattering theory (the Enß method). Prerequisites I assume some previous experience with Hilbert spaces and bounded linear operators which should be covered in any basic course on functional analysis. However, while this assumption is reasonable for mathematics students, it might not always be for physics students. For this reason there is a preliminary chapter reviewing all necessary results (including proofs). In addition, there is an appendix (again with proofs) providing all necessary results from measure theory. Literature The present book is highly influenced by the four volumes of Reed and Simon [40]–[43] (see also [14]) and by the book by Weidmann [60] (an extended version of which has recently appeared in two volumes [62], [63], however, only in German). Other books with a similar scope are for example [14], [15], [21], [23], [39], [48], and [55]. For those who want to know more about the physical aspects, I can recommend the classical book by Thirring [58] and the visual guides by Thaller [56], [57]. Further information can be found in the bibliographical notes at the end. Reader’s guide There is some intentional overlap between Chapter 0, Chapter 1, and Chapter 2. Hence, provided you have the necessary background, you can start reading in Chapter 1 or even Chapter 2. Chapters 2 and 3 are key
  • 11. Preface xiii chapters and you should study them in detail (except for Section 2.6 which can be skipped on first reading). Chapter 4 should give you an idea of how the spectral theorem is used. You should have a look at (e.g.) the first section and you can come back to the remaining ones as needed. Chapter 5 contains two key results from quantum dynamics: Stone’s theorem and the RAGE theorem. In particular the RAGE theorem shows the connections between long time behavior and spectral types. Finally, Chapter 6 is again of central importance and should be studied in detail. The chapters in the second part are mostly independent of each other except for Chapter 7, which is a prerequisite for all others except for Chap- ter 9. If you are interested in one-dimensional models (Sturm–Liouville equa- tions), Chapter 9 is all you need. If you are interested in atoms, read Chapter 7, Chapter 10, and Chap- ter 11. In particular, you can skip the separation of variables (Sections 10.3 and 10.4, which require Chapter 9) method for computing the eigenvalues of the hydrogen atom, if you are happy with the fact that there are countably many which accumulate at the bottom of the continuous spectrum. If you are interested in scattering theory, read Chapter 7, the first two sections of Chapter 10, and Chapter 12. Chapter 5 is one of the key prereq- uisites in this case. Updates The AMS is hosting a web page for this book at http://www.ams.org/bookpages/gsm-99/ where updates, corrections, and other material may be found, including a link to material on my own web site: http://www.mat.univie.ac.at/~gerald/ftp/book-schroe/ Acknowledgments I would like to thank Volker Enß for making his lecture notes [18] avail- able to me. Many colleagues and students have made useful suggestions and pointed out mistakes in earlier drafts of this book, in particular: Kerstin Ammann, J¨rg Arnberger, Chris Davis, Fritz Gesztesy, Maria Hoffmann- o Ostenhof, Zhenyou Huang, Helge Kr¨ger, Katrin Grunert, Wang Lanning, u Daniel Lenz, Christine Pfeuffer, Roland M¨ws, Arnold L. Neidhardt, Harald o
  • 12. xiv Preface Rindler, Johannes Temme, Karl Unterkofler, Joachim Weidmann, and Rudi Weikard. If you also find an error or if you have comments or suggestions (no matter how small), please let me know. I have been supported by the Austrian Science Fund (FWF) during much of this writing, most recently under grant Y330. Gerald Teschl Vienna, Austria January 2009 Gerald Teschl Fakult¨t f¨r Mathematik a u Nordbergstraße 15 Universit¨t Wien a 1090 Wien, Austria E-mail: Gerald.Teschl@univie.ac.at URL: http://www.mat.univie.ac.at/~gerald/
  • 15. Chapter 0 A first look at Banach and Hilbert spaces I assume that the reader has some basic familiarity with measure theory and func- tional analysis. For convenience, some facts needed from Banach and Lp spaces are reviewed in this chapter. A crash course in measure theory can be found in the Appendix A. If you feel comfortable with terms like Lebesgue Lp spaces, Banach space, or bounded linear operator, you can skip this entire chapter. However, you might want to at least browse through it to refresh your memory. 0.1. Warm up: Metric and topological spaces Before we begin, I want to recall some basic facts from metric and topological spaces. I presume that you are familiar with these topics from your calculus course. As a general reference I can warmly recommend Kelly’s classical book [26]. A metric space is a space X together with a distance function d : X × X → R such that (i) d(x, y) ≥ 0, (ii) d(x, y) = 0 if and only if x = y, (iii) d(x, y) = d(y, x), (iv) d(x, z) ≤ d(x, y) + d(y, z) (triangle inequality). If (ii) does not hold, d is called a semi-metric. Moreover, it is straight- forward to see the inverse triangle inequality (Problem 0.1) |d(x, y) − d(z, y)| ≤ d(x, z). (0.1) 3
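A minimal numerical sketch (in Python with NumPy; the function name d, the dimension, and the random sample points are ad hoc choices, not taken from the text) of how the axioms (i)–(iv) and the inverse triangle inequality (0.1) can be checked for the Euclidean metric:

    import numpy as np

    def d(x, y):
        """Euclidean metric on R^n."""
        return np.linalg.norm(x - y)

    rng = np.random.default_rng(0)
    for _ in range(1000):
        x, y, z = rng.standard_normal((3, 5))   # three random points in R^5
        # triangle inequality (iv): d(x, z) <= d(x, y) + d(y, z)
        assert d(x, z) <= d(x, y) + d(y, z) + 1e-12
        # inverse triangle inequality (0.1): |d(x, y) - d(z, y)| <= d(x, z)
        assert abs(d(x, y) - d(z, y)) <= d(x, z) + 1e-12
    print("triangle and inverse triangle inequalities hold on all samples")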
  • 16. 4 0. A first look at Banach and Hilbert spaces Example. Euclidean space Rn together with d(x, y) = ( n (xk − yk )2 )1/2 k=1 is a metric space and so is Cn together with d(x, y) = ( n |xk −yk |2 )1/2 . k=1 The set Br (x) = {y ∈ X|d(x, y) < r} (0.2) is called an open ball around x with radius r > 0. A point x of some set U is called an interior point of U if U contains some ball around x. If x is an interior point of U , then U is also called a neighborhood of x. A point x is called a limit point of U if (Br (x){x}) ∩ U = ∅ for every ball around x. Note that a limit point x need not lie in U , but U must contain points arbitrarily close to x. Example. Consider R with the usual metric and let U = (−1, 1). Then every point x ∈ U is an interior point of U . The points ±1 are limit points of U . A set consisting only of interior points is called open. The family of open sets O satisfies the properties (i) ∅, X ∈ O, (ii) O1 , O2 ∈ O implies O1 ∩ O2 ∈ O, (iii) {Oα } ⊆ O implies α Oα ∈ O. That is, O is closed under finite intersections and arbitrary unions. In general, a space X together with a family of sets O, the open sets, satisfying (i)–(iii) is called a topological space. The notions of interior point, limit point, and neighborhood carry over to topological spaces if we replace open ball by open set. There are usually different choices for the topology. Two usually not very interesting examples are the trivial topology O = {∅, X} and the discrete topology O = P(X) (the powerset of X). Given two topologies O1 and O2 on X, O1 is called weaker (or coarser) than O2 if and only if O1 ⊆ O2 . Example. Note that different metrics can give rise to the same topology. For example, we can equip Rn (or Cn ) with the Euclidean distance d(x, y) as before or we could also use n ˜ d(x, y) = |xk − yk |. (0.3) k=1 Then n n n 1 √ |xk | ≤ |xk |2 ≤ |xk | (0.4) n k=1 k=1 k=1
  • 17. 0.1. Warm up: Metric and topological spaces 5 ˜ ˜ shows Br/√n (x) ⊆ Br (x) ⊆ Br (x), where B, B are balls computed using d, ˜ d, respectively. Example. We can always replace a metric d by the bounded metric ˜ d(x, y) d(x, y) = (0.5) 1 + d(x, y) without changing the topology. Every subspace Y of a topological space X becomes a topological space ˜ of its own if we call O ⊆ Y open if there is some open set O ⊆ X such that ˜ O = O ∩ Y (induced topology). Example. The set (0, 1] ⊆ R is not open in the topology of X = R, but it is open in the induced topology when considered as a subset of Y = [−1, 1]. A family of open sets B ⊆ O is called a base for the topology if for each x and each neighborhood U (x), there is some set O ∈ B with x ∈ O ⊆ U (x). Since an open set O is a neighborhood of every one of its points, it can be ˜ written as O = O⊇O∈B O and we have ˜ Lemma 0.1. If B ⊆ O is a base for the topology, then every open set can be written as a union of elements from B. If there exists a countable base, then X is called second countable. Example. By construction the open balls B1/n (x) are a base for the topol- ogy in a metric space. In the case of Rn (or Cn ) it even suffices to take balls with rational center and hence Rn (and Cn ) is second countable. A topological space is called a Hausdorff space if for two different points there are always two disjoint neighborhoods. Example. Any metric space is a Hausdorff space: Given two different points x and y, the balls Bd/2 (x) and Bd/2 (y), where d = d(x, y) > 0, are disjoint neighborhoods (a semi-metric space will not be Hausdorff). The complement of an open set is called a closed set. It follows from de Morgan’s rules that the family of closed sets C satisfies (i) ∅, X ∈ C, (ii) C1 , C2 ∈ C implies C1 ∪ C2 ∈ C, (iii) {Cα } ⊆ C implies α Cα ∈ C. That is, closed sets are closed under finite unions and arbitrary intersections. The smallest closed set containing a given set U is called the closure U= C, (0.6) C∈C,U ⊆C
  • 18. 6 0. A first look at Banach and Hilbert spaces and the largest open set contained in a given set U is called the interior U◦ = O. (0.7) O∈O,O⊆U We can define interior and limit points as before by replacing the word ball by open set. Then it is straightforward to check Lemma 0.2. Let X be a topological space. Then the interior of U is the set of all interior points of U and the closure of U is the union of U with all limit points of U . A sequence (xn )∞ ⊆ X is said to converge to some point x ∈ X if n=1 d(x, xn ) → 0. We write limn→∞ xn = x as usual in this case. Clearly the limit is unique if it exists (this is not true for a semi-metric). Every convergent sequence is a Cauchy sequence; that is, for every ε > 0 there is some N ∈ N such that d(xn , xm ) ≤ ε, n, m ≥ N. (0.8) If the converse is also true, that is, if every Cauchy sequence has a limit, then X is called complete. Example. Both Rn and Cn are complete metric spaces. A point x is clearly a limit point of U if and only if there is some sequence xn ∈ U {x} converging to x. Hence Lemma 0.3. A closed subset of a complete metric space is again a complete metric space. Note that convergence can also be equivalently formulated in terms of topological terms: A sequence xn converges to x if and only if for every neighborhood U of x there is some N ∈ N such that xn ∈ U for n ≥ N . In a Hausdorff space the limit is unique. A set U is called dense if its closure is all of X, that is, if U = X. A metric space is called separable if it contains a countable dense set. Note that X is separable if and only if it is second countable as a topological space. Lemma 0.4. Let X be a separable metric space. Every subset of X is again separable. Proof. Let A = {xn }n∈N be a dense set in X. The only problem is that A ∩ Y might contain no elements at all. However, some elements of A must be at least arbitrarily close: Let J ⊆ N2 be the set of all pairs (n, m) for which B1/m (xn ) ∩ Y = ∅ and choose some yn,m ∈ B1/m (xn ) ∩ Y for all (n, m) ∈ J. Then B = {yn,m }(n,m)∈J ⊆ Y is countable. To see that B is
  • 19. 0.1. Warm up: Metric and topological spaces 7 dense, choose y ∈ Y . Then there is some sequence xnk with d(xnk , y) < 1/k. Hence (nk , k) ∈ J and d(ynk ,k , y) ≤ d(ynk ,k , xnk ) + d(xnk , y) ≤ 2/k → 0. A function between metric spaces X and Y is called continuous at a point x ∈ X if for every ε > 0 we can find a δ > 0 such that dY (f (x), f (y)) ≤ ε if dX (x, y) < δ. (0.9) If f is continuous at every point, it is called continuous. Lemma 0.5. Let X, Y be metric spaces and f : X → Y . The following are equivalent: (i) f is continuous at x (i.e, (0.9) holds). (ii) f (xn ) → f (x) whenever xn → x. (iii) For every neighborhood V of f (x), f −1 (V ) is a neighborhood of x. Proof. (i) ⇒ (ii) is obvious. (ii) ⇒ (iii): If (iii) does not hold, there is a neighborhood V of f (x) such that Bδ (x) ⊆ f −1 (V ) for every δ. Hence we can choose a sequence xn ∈ B1/n (x) such that f (xn ) ∈ f −1 (V ). Thus xn → x but f (xn ) → f (x). (iii) ⇒ (i): Choose V = Bε (f (x)) and observe that by (iii), Bδ (x) ⊆ f −1 (V ) for some δ. The last item implies that f is continuous if and only if the inverse image of every open (closed) set is again open (closed). Note: In a topological space, (iii) is used as the definition for continuity. However, in general (ii) and (iii) will no longer be equivalent unless one uses generalized sequences, so-called nets, where the index set N is replaced by arbitrary directed sets. The support of a function f : X → Cn is the closure of all points x for which f (x) does not vanish; that is, supp(f ) = {x ∈ X|f (x) = 0}. (0.10) If X and Y are metric spaces, then X × Y together with d((x1 , y1 ), (x2 , y2 )) = dX (x1 , x2 ) + dY (y1 , y2 ) (0.11) is a metric space. A sequence (xn , yn ) converges to (x, y) if and only if xn → x and yn → y. In particular, the projections onto the first (x, y) → x, respectively, onto the second (x, y) → y, coordinate are continuous. In particular, by the inverse triangle inequality (0.1), |d(xn , yn ) − d(x, y)| ≤ d(xn , x) + d(yn , y), (0.12) we see that d : X × X → R is continuous.
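The examples above claimed that the Euclidean metric, the metric (0.3), and the bounded metric (0.5) generate the same topology. A small numerical sketch (Python; the limit point and the sequence x + 1/n are ad hoc choices) illustrating that the same sequences converge with respect to all three metrics:

    import numpy as np

    def d_eucl(x, y):
        return np.linalg.norm(x - y)      # Euclidean metric

    def d_taxi(x, y):
        return np.sum(np.abs(x - y))      # the metric (0.3)

    def d_bounded(x, y):
        de = d_eucl(x, y)
        return de / (1 + de)              # the bounded metric (0.5)

    x = np.array([1.0, -2.0, 0.5])
    for n in [1, 10, 100, 1000]:
        xn = x + 1.0 / n                  # a sequence converging to x
        print(n, d_eucl(xn, x), d_taxi(xn, x), d_bounded(xn, x))
    # All three distance columns tend to 0: the same sequences converge
    # in each metric, consistent with the metrics being topologically equivalent.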
  • 20. 8 0. A first look at Banach and Hilbert spaces Example. If we consider R × R, we do not get the Euclidean distance of R2 unless we modify (0.11) as follows: ˜ d((x1 , y1 ), (x2 , y2 )) = dX (x1 , x2 )2 + dY (y1 , y2 )2 . (0.13) As noted in our previous example, the topology (and thus also conver- gence/continuity) is independent of this choice. If X and Y are just topological spaces, the product topology is defined by calling O ⊆ X × Y open if for every point (x, y) ∈ O there are open neighborhoods U of x and V of y such that U × V ⊆ O. In the case of metric spaces this clearly agrees with the topology defined via the product metric (0.11). A cover of a set Y ⊆ X is a family of sets {Uα } such that Y ⊆ α Uα . A cover is called open if all Uα are open. Any subset of {Uα } which still covers Y is called a subcover. Lemma 0.6 (Lindel¨f). If X is second countable, then every open cover o has a countable subcover. Proof. Let {Uα } be an open cover for Y and let B be a countable base. Since every Uα can be written as a union of elements from B, the set of all B ∈ B which satisfy B ⊆ Uα for some α form a countable open cover for Y . Moreover, for every Bn in this set we can find an αn such that Bn ⊆ Uαn . By construction {Uαn } is a countable subcover. A subset K ⊂ X is called compact if every open cover has a finite subcover. Lemma 0.7. A topological space is compact if and only if it has the finite intersection property: The intersection of a family of closed sets is empty if and only if the intersection of some finite subfamily is empty. Proof. By taking complements, to every family of open sets there is a cor- responding family of closed sets and vice versa. Moreover, the open sets are a cover if and only if the corresponding closed sets have empty intersec- tion. A subset K ⊂ X is called sequentially compact if every sequence has a convergent subsequence. Lemma 0.8. Let X be a topological space. (i) The continuous image of a compact set is compact. (ii) Every closed subset of a compact set is compact. (iii) If X is Hausdorff, any compact set is closed.
  • 21. 0.1. Warm up: Metric and topological spaces 9 (iv) The product of finitely many compact sets is compact. (v) A compact set is also sequentially compact. Proof. (i) Observe that if {Oα } is an open cover for f (Y ), then {f −1 (Oα )} is one for Y . (ii) Let {Oα } be an open cover for the closed subset Y . Then {Oα } ∪ {XY } is an open cover for X. (iii) Let Y ⊆ X be compact. We show that XY is open. Fix x ∈ XY (if Y = X, there is nothing to do). By the definition of Hausdorff, for every y ∈ Y there are disjoint neighborhoods V (y) of y and Uy (x) of x. By compactness of Y , there are y1 , . . . , yn such that the V (yj ) cover Y . But then U (x) = n Uyj (x) is a neighborhood of x which does not intersect j=1 Y. (iv) Let {Oα } be an open cover for X × Y . For every (x, y) ∈ X × Y there is some α(x, y) such that (x, y) ∈ Oα(x,y) . By definition of the product topology there is some open rectangle U (x, y) × V (x, y) ⊆ Oα(x,y) . Hence for fixed x, {V (x, y)}y∈Y is an open cover of Y . Hence there are finitely many points yk (x) such that the V (x, yk (x)) cover Y . Set U (x) = k U (x, yk (x)). Since finite intersections of open sets are open, {U (x)}x∈X is an open cover and there are finitely many points xj such that the U (xj ) cover X. By construction, the U (xj ) × V (xj , yk (xj )) ⊆ Oα(xj ,yk (xj )) cover X × Y . (v) Let xn be a sequence which has no convergent subsequence. Then K = {xn } has no limit points and is hence compact by (ii). For every n there is a ball Bεn (xn ) which contains only finitely many elements of K. However, finitely many suffice to cover K, a contradiction. In a metric space compact and sequentially compact are equivalent. Lemma 0.9. Let X be a metric space. Then a subset is compact if and only if it is sequentially compact. Proof. By item (v) of the previous lemma it suffices to show that X is compact if it is sequentially compact. First of all note that every cover of open balls with fixed radius ε > 0 has a finite subcover since if this were false we could construct a sequence xn ∈ X n−1 Bε (xm ) such that d(xn , xm ) > ε for m < n. m=1 In particular, we are done if we can show that for every open cover {Oα } there is some ε > 0 such that for every x we have Bε (x) ⊆ Oα for some α = α(x). Indeed, choosing {xk }n such that Bε (xk ) is a cover, we k=1 have that Oα(xk ) is a cover as well. So it remains to show that there is such an ε. If there were none, for every ε > 0 there must be an x such that Bε (x) ⊆ Oα for every α. Choose
  • 22. 10 0. A first look at Banach and Hilbert spaces 1 ε = n and pick a corresponding xn . Since X is sequentially compact, it is no restriction to assume xn converges (after maybe passing to a subsequence). Let x = lim xn . Then x lies in some Oα and hence Bε (x) ⊆ Oα . But choosing 1 ε ε n so large that n < 2 and d(xn , x) < 2 , we have B1/n (xn ) ⊆ Bε (x) ⊆ Oα , contradicting our assumption. Please also recall the Heine–Borel theorem: Theorem 0.10 (Heine–Borel). In Rn (or Cn ) a set is compact if and only if it is bounded and closed. Proof. By Lemma 0.8 (ii) and (iii) it suffices to show that a closed interval in I ⊆ R is compact. Moreover, by Lemma 0.9 it suffices to show that every sequence in I = [a, b] has a convergent subsequence. Let xn be our sequence and divide I = [a, a+b ] ∪ [ a+b , b]. Then at least one of these two 2 2 intervals, call it I1 , contains infinitely many elements of our sequence. Let y1 = xn1 be the first one. Subdivide I1 and pick y2 = xn2 , with n2 > n1 as before. Proceeding like this, we obtain a Cauchy sequence yn (note that by construction In+1 ⊆ In and hence |yn − ym | ≤ b−a for m ≥ n). n A topological space is called locally compact if every point has a com- pact neighborhood. Example. Rn is locally compact. The distance between a point x ∈ X and a subset Y ⊆ X is dist(x, Y ) = inf d(x, y). (0.14) y∈Y Note that x is a limit point of Y if and only if dist(x, Y ) = 0. Lemma 0.11. Let X be a metric space. Then | dist(x, Y ) − dist(z, Y )| ≤ d(x, z). (0.15) In particular, x → dist(x, Y ) is continuous. Proof. Taking the infimum in the triangle inequality d(x, y) ≤ d(x, z) + d(z, y) shows dist(x, Y ) ≤ d(x, z)+dist(z, Y ). Hence dist(x, Y )−dist(z, Y ) ≤ d(x, z). Interchanging x and z shows dist(z, Y ) − dist(x, Y ) ≤ d(x, z). Lemma 0.12 (Urysohn). Suppose C1 and C2 are disjoint closed subsets of a metric space X. Then there is a continuous function f : X → [0, 1] such that f is zero on C1 and one on C2 . If X is locally compact and C1 is compact, one can choose f with compact support.
  • 23. 0.1. Warm up: Metric and topological spaces 11 dist(x,C2 ) Proof. To prove the first claim, set f (x) = dist(x,C1 )+dist(x,C2 ) . For the second claim, observe that there is an open set O such that O is compact and C1 ⊂ O ⊂ O ⊂ XC2 . In fact, for every x, there is a ball Bε (x) such that Bε (x) is compact and Bε (x) ⊂ XC2 . Since C1 is compact, finitely many of them cover C1 and we can choose the union of those balls to be O. Now replace C2 by XO. Note that Urysohn’s lemma implies that a metric space is normal; that is, for any two disjoint closed sets C1 and C2 , there are disjoint open sets O1 and O2 such that Cj ⊆ Oj , j = 1, 2. In fact, choose f as in Urysohn’s lemma and set O1 = f −1 ([0, 1/2)), respectively, O2 = f −1 ((1/2, 1]). Lemma 0.13. Let X be a locally compact metric space. Suppose K is a compact set and {Oj }n an open cover. Then there is a partition of j=1 unity for K subordinate to this cover; that is, there are continuous functions hj : X → [0, 1] such that hj has compact support contained in Oj and n hj (x) ≤ 1 (0.16) j=1 with equality for x ∈ K. Proof. For every x ∈ K there is some ε and some j such that Bε (x) ⊆ Oj . By compactness of K, finitely many of these balls cover K. Let Kj be the union of those balls which lie inside Oj . By Urysohn’s lemma there are functions gj : X → [0, 1] such that gj = 1 on Kj and gj = 0 on XOj . Now set j−1 hj = gj (1 − gk ). (0.17) k=1 Then hj : X → [0, 1] has compact support contained in Oj and n n hj (x) = 1 − (1 − gj (x)) (0.18) j=1 j=1 shows that the sum is one for x ∈ K, since x ∈ Kj for some j implies gj (x) = 1 and causes the product to vanish. Problem 0.1. Show that |d(x, y) − d(z, y)| ≤ d(x, z). Problem 0.2. Show the quadrangle inequality |d(x, y) − d(x , y )| ≤ d(x, x ) + d(y, y ). Problem 0.3. Let X be some space together with a sequence of distance functions dn , n ∈ N. Show that ∞ 1 dn (x, y) d(x, y) = 2n 1 + dn (x, y) n=1
  • 24. 12 0. A first look at Banach and Hilbert spaces is again a distance function. Problem 0.4. Show that the closure satisfies U = U . Problem 0.5. Let U ⊆ V be subsets of a metric space X. Show that if U is dense in V and V is dense in X, then U is dense in X. Problem 0.6. Show that any open set O ⊆ R can be written as a countable union of disjoint intervals. (Hint: Let {Iα } be the set of all maximal subin- tervals of O; that is, Iα ⊆ O and there is no other subinterval of O which contains Iα . Then this is a cover of disjoint intervals which has a countable subcover.) 0.2. The Banach space of continuous functions Now let us have a first look at Banach spaces by investigating the set of continuous functions C(I) on a compact interval I = [a, b] ⊂ R. Since we want to handle complex models, we will always consider complex-valued functions! One way of declaring a distance, well-known from calculus, is the max- imum norm: f (x) − g(x) ∞ = max |f (x) − g(x)|. (0.19) x∈I It is not hard to see that with this definition C(I) becomes a normed linear space: A normed linear space X is a vector space X over C (or R) with a real-valued function (the norm) . such that • f ≥ 0 for all f ∈ X and f = 0 if and only if f = 0, • α f = |α| f for all α ∈ C and f ∈ X, and • f + g ≤ f + g for all f, g ∈ X (triangle inequality). From the triangle inequality we also get the inverse triangle inequal- ity (Problem 0.7) | f − g |≤ f −g . (0.20) Once we have a norm, we have a distance d(f, g) = f −g and hence we know when a sequence of vectors fn converges to a vector f . We will write fn → f or limn→∞ fn = f , as usual, in this case. Moreover, a mapping F : X → Y between two normed spaces is called continuous if fn → f implies F (fn ) → F (f ). In fact, it is not hard to see that the norm, vector addition, and multiplication by scalars are continuous (Problem 0.8). In addition to the concept of convergence we have also the concept of a Cauchy sequence and hence the concept of completeness: A normed
  • 25. 0.2. The Banach space of continuous functions 13 space is called complete if every Cauchy sequence has a limit. A complete normed space is called a Banach space. Example. The space 1 (N) of all sequences a = (aj )∞ for which the norm j=1 ∞ a 1 = |aj | (0.21) j=1 is finite is a Banach space. To show this, we need to verify three things: (i) 1 (N) is a vector space that is closed under addition and scalar multiplication, (ii) . 1 satisfies the three requirements for a norm, and (iii) 1 (N) is complete. First of all observe k k k |aj + bj | ≤ |aj | + |bj | ≤ a 1 + b 1 (0.22) j=1 j=1 j=1 for any finite k. Letting k → ∞, we conclude that 1 (N) is closed under addition and that the triangle inequality holds. That 1 (N) is closed under scalar multiplication and the two other properties of a norm are straight- forward. It remains to show that 1 (N) is complete. Let an = (an )∞ be j j=1 a Cauchy sequence; that is, for given ε > 0 we can find an Nε such that am − an 1 ≤ ε for m, n ≥ Nε . This implies in particular |am − an | ≤ ε for j j any fixed j. Thus an is a Cauchy sequence for fixed j and by completeness j of C has a limit: limn→∞ an = aj . Now consider j k |am − an | ≤ ε j j (0.23) j=1 and take m → ∞: k |aj − an | ≤ ε. j (0.24) j=1 Since this holds for any finite k, we even have a−an 1 ≤ ε. Hence (a−an ) ∈ 1 (N) and since a ∈ 1 (N), we finally conclude a = a + (a − a ) ∈ 1 (N). n n n Example. The space ∞ (N) of all bounded sequences a = (aj )∞ together j=1 with the norm a ∞ = sup |aj | (0.25) j∈N is a Banach space (Problem 0.10). Now what about convergence in the space C(I)? A sequence of functions fn (x) converges to f if and only if lim f − fn = lim sup |fn (x) − f (x)| = 0. (0.26) n→∞ n→∞ x∈I
  • 26. 14 0. A first look at Banach and Hilbert spaces That is, in the language of real analysis, fn converges uniformly to f . Now let us look at the case where fn is only a Cauchy sequence. Then fn (x) is clearly a Cauchy sequence of real numbers for any fixed x ∈ I. In particular, by completeness of C, there is a limit f (x) for each x. Thus we get a limiting function f (x). Moreover, letting m → ∞ in |fm (x) − fn (x)| ≤ ε ∀m, n > Nε , x ∈ I, (0.27) we see |f (x) − fn (x)| ≤ ε ∀n > Nε , x ∈ I; (0.28) that is, fn (x) converges uniformly to f (x). However, up to this point we do not know whether it is in our vector space C(I) or not, that is, whether it is continuous or not. Fortunately, there is a well-known result from real analysis which tells us that the uniform limit of continuous functions is again continuous. Hence f (x) ∈ C(I) and thus every Cauchy sequence in C(I) converges. Or, in other words Theorem 0.14. C(I) with the maximum norm is a Banach space. Next we want to know if there is a countable basis for C(I). We will call a set of vectors {un } ⊂ X linearly independent if every finite subset is and we will call a countable set of linearly independent vectors {un }N ⊂ X n=1 a Schauder basis if every element f ∈ X can be uniquely written as a countable linear combination of the basis elements: N f= cn un , cn = cn (f ) ∈ C, (0.29) n=1 where the sum has to be understood as a limit if N = ∞. In this case the span span{un } (the set of all finite linear combinations) of {un } is dense in X. A set whose span is dense is called total and if we have a countable total set, we also have a countable dense set (consider only linear combinations with rational coefficients — show this). A normed linear space containing a countable dense set is called separable. Example. The Banach space 1 (N) is separable. In fact, the set of vectors δ n , with δn = 1 and δm = 0, n = m, is total: Let a = (aj )∞ ∈ 1 (N) be n n j=1 n = n given and set a j=1 aj δ j . Then ∞ n a−a 1 = |aj | → 0 (0.30) j=n+1 since an = aj for 1 ≤ j ≤ n and an = 0 for j > n. j j Luckily this is also the case for C(I): Theorem 0.15 (Weierstraß). Let I be a compact interval. Then the set of polynomials is dense in C(I).
  • 27. 0.2. The Banach space of continuous functions 15 Proof. Let f (x) ∈ C(I) be given. By considering f (x) − f (a) + (f (b) − f (a))(x − b) it is no loss to assume that f vanishes at the boundary points. Moreover, without restriction we only consider I = [ −1 , 1 ] (why?). 2 2 Now the claim follows from the lemma below using 1 un (x) = (1 − x2 )n , In where 1 1 n In = (1 − x2 )n dx = (1 − x)n−1 (1 + x)n+1 dx −1 n+1 −1 n! n! = ··· = 22n+1 = 1 1 (n + 1) · · · (2n + 1) 2 ( 2 + 1) · · · ( 1 + n) 2 √ Γ(1 + n) π 1 = π 3 = (1 + O( )). Γ( 2 + n) n n √ In the last step we have used Γ( 1 ) = π [1, (6.1.8)] and the asymptotics 2 follow from Stirling’s formula [1, (6.1.37)]. Lemma 0.16 (Smoothing). Let un (x) be a sequence of nonnegative contin- uous functions on [−1, 1] such that un (x)dx = 1 and un (x)dx → 0, δ > 0. (0.31) |x|≤1 δ≤|x|≤1 (In other words, un has mass one and concentrates near x = 0 as n → ∞.) 1 Then for every f ∈ C[− 2 , 1 ] which vanishes at the endpoints, f (− 2 ) = 2 1 f ( 1 ) = 0, we have that 2 1/2 fn (x) = un (x − y)f (y)dy (0.32) −1/2 converges uniformly to f (x). Proof. Since f is uniformly continuous, for given ε we can find a δ (indepen- dent of x) such that |f (x)−f (y)| ≤ ε whenever |x−y| ≤ δ. Moreover, we can choose n such that δ≤|y|≤1 un (y)dy ≤ ε. Now abbreviate M = max{1, |f |} and note 1/2 1/2 |f (x) − un (x − y)f (x)dy| = |f (x)| |1 − un (x − y)dy| ≤ M ε. −1/2 −1/2 In fact, either the distance of x to one of the boundary points ± 1 is smaller 2 than δ and hence |f (x)| ≤ ε or otherwise the difference between one and the integral is smaller than ε.
  • 28. 16 0. A first look at Banach and Hilbert spaces Using this, we have 1/2 |fn (x) − f (x)| ≤ un (x − y)|f (y) − f (x)|dy + M ε −1/2 ≤ un (x − y)|f (y) − f (x)|dy |y|≤1/2,|x−y|≤δ + un (x − y)|f (y) − f (x)|dy + M ε |y|≤1/2,|x−y|≥δ =ε + 2M ε + M ε = (1 + 3M )ε, (0.33) which proves the claim. Note that fn will be as smooth as un , hence the title smoothing lemma. The same idea is used to approximate noncontinuous functions by smooth ones (of course the convergence will no longer be uniform in this case). Corollary 0.17. C(I) is separable. However, ∞ (N) is not separable (Problem 0.11)! Problem 0.7. Show that | f − g | ≤ f − g . Problem 0.8. Let X be a Banach space. Show that the norm, vector ad- dition, and multiplication by scalars are continuous. That is, if fn → f , gn → g, and αn → α, then fn → f , fn + gn → f + g, and αn gn → αg. ∞ Problem 0.9. Let X be a Banach space. Show that j=1 fj < ∞ implies that ∞ n fj = lim fj n→∞ j=1 j=1 exists. The series is called absolutely convergent in this case. Problem 0.10. Show that ∞ (N) is a Banach space. Problem 0.11. Show that ∞ (N) is not separable. (Hint: Consider se- quences which take only the value one and zero. How many are there? What is the distance between two such sequences?) 0.3. The geometry of Hilbert spaces So it looks like C(I) has all the properties we want. However, there is still one thing missing: How should we define orthogonality in C(I)? In Euclidean space, two vectors are called orthogonal if their scalar product vanishes, so we would need a scalar product:
  • 29. 0.3. The geometry of Hilbert spaces 17 Suppose H is a vector space. A map ., .. : H × H → C is called a sesquilinear form if it is conjugate linear in the first argument and linear in the second; that is, α1 f1 + α2 f2 , g ∗ ∗ = α1 f1 , g + α2 f2 , g , α1 , α2 ∈ C, (0.34) f, α1 g1 + α2 g2 = α1 f, g1 + α2 f, g2 , where ‘∗’ denotes complex conjugation. A sesquilinear form satisfying the requirements (i) f, f > 0 for f = 0 (positive definite), (ii) f, g = g, f ∗ (symmetry) is called an inner product or scalar product. Associated with every scalar product is a norm f = f, f . (0.35) The pair (H, ., .. ) is called an inner product space. If H is complete, it is called a Hilbert space. Example. Clearly Cn with the usual scalar product n a, b = a∗ bj j (0.36) j=1 is a (finite dimensional) Hilbert space. Example. A somewhat more interesting example is the Hilbert space 2 (N), that is, the set of all sequences ∞ (aj )∞ j=1 |aj |2 < ∞ (0.37) j=1 with scalar product ∞ a, b = a∗ bj . j (0.38) j=1 (Show that this is in fact a separable Hilbert space — Problem 0.13.) Of course I still owe you a proof for the claim that f, f is indeed a norm. Only the triangle inequality is nontrivial, which will follow from the Cauchy–Schwarz inequality below. A vector f ∈ H is called normalized or a unit vector if f = 1. Two vectors f, g ∈ H are called orthogonal or perpendicular (f ⊥ g) if f, g = 0 and parallel if one is a multiple of the other. If f and g are orthogonal, we have the Pythagorean theorem: 2 2 f +g = f + g 2, f ⊥ g, (0.39) which is one line of computation.
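A quick numerical sketch (Python; the vectors f and h are ad hoc random samples and the helper names are not from the text) of the scalar product (0.36) and the Pythagorean theorem (0.39): removing from h its component along f produces a vector g orthogonal to f, and the squared norms then add up.

    import numpy as np

    def inner(a, b):
        """Scalar product (0.36), conjugate linear in the first argument."""
        return np.vdot(a, b)    # np.vdot conjugates its first argument

    def norm(a):
        return np.sqrt(inner(a, a).real)

    rng = np.random.default_rng(1)
    f = rng.standard_normal(4) + 1j * rng.standard_normal(4)
    h = rng.standard_normal(4) + 1j * rng.standard_normal(4)

    # remove the component of h along f to obtain g with <f, g> = 0
    g = h - (inner(f, h) / inner(f, f)) * f
    print(abs(inner(f, g)))                          # ~ 0 (orthogonality)
    print(norm(f + g)**2, norm(f)**2 + norm(g)**2)   # equal up to rounding: (0.39)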
  • 30. 18 0. A first look at Banach and Hilbert spaces Suppose u is a unit vector. Then the projection of f in the direction of u is given by f = u, f u (0.40) and f⊥ defined via f⊥ = f − u, f u (0.41) is perpendicular to u since u, f⊥ = u, f − u, f u = u, f − u, f u, u = 0. f  w  f f   f f⊥   f   I f     f u I   Taking any other vector parallel to u, it is easy to see 2 2 2 f − αu = f⊥ + (f − αu) = f⊥ + | u, f − α|2 (0.42) and hence f = u, f u is the unique vector parallel to u which is closest to f. As a first consequence we obtain the Cauchy–Schwarz–Bunjakowski inequality: Theorem 0.18 (Cauchy–Schwarz–Bunjakowski). Let H0 be an inner prod- uct space. Then for every f, g ∈ H0 we have | f, g | ≤ f g (0.43) with equality if and only if f and g are parallel. Proof. It suffices to prove the case g = 1. But then the claim follows from f 2 = | g, f |2 + f⊥ 2 . Note that the Cauchy–Schwarz inequality entails that the scalar product is continuous in both variables; that is, if fn → f and gn → g, we have fn , gn → f, g . As another consequence we infer that the map . is indeed a norm. In fact, 2 2 2 f +g = f + f, g + g, f + g ≤ ( f + g )2 . (0.44) But let us return to C(I). Can we find a scalar product which has the maximum norm as associated norm? Unfortunately the answer is no! The reason is that the maximum norm does not satisfy the parallelogram law (Problem 0.17).
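A concrete instance of this failure, along the lines of Problem 0.17 (a sketch only; the choice f(x) = 1, g(x) = x on [0, 1] is just one convenient counterexample): for these two functions the maximum norm gives ‖f + g‖ = 2 and ‖f − g‖ = 1, so ‖f + g‖² + ‖f − g‖² = 5 while 2‖f‖² + 2‖g‖² = 4.

    import numpy as np

    x = np.linspace(0.0, 1.0, 10001)
    f = np.ones_like(x)                  # f(x) = 1
    g = x                                # g(x) = x

    sup = lambda h: np.max(np.abs(h))    # maximum norm on C[0, 1]

    lhs = sup(f + g)**2 + sup(f - g)**2  # = 4 + 1 = 5
    rhs = 2 * sup(f)**2 + 2 * sup(g)**2  # = 2 + 2 = 4
    print(lhs, rhs)   # 5.0 vs 4.0: the parallelogram law fails for the max norm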
  • 31. 0.3. The geometry of Hilbert spaces 19 Theorem 0.19 (Jordan–von Neumann). A norm is associated with a scalar product if and only if the parallelogram law 2 2 2 2 f +g + f −g =2 f +2 g (0.45) holds. In this case the scalar product can be recovered from its norm by virtue of the polarization identity 1 2 2 2 2 f, g = f +g − f −g + i f − ig − i f + ig . (0.46) 4 Proof. If an inner product space is given, verification of the parallelogram law and the polarization identity is straightforward (Problem 0.14). To show the converse, we define 1 2 2 2 2 s(f, g) = f +g − f −g + i f − ig − i f + ig . 4 Then s(f, f ) = f 2 and s(f, g) = s(g, f )∗ are straightforward to check. Moreover, another straightforward computation using the parallelogram law shows g+h s(f, g) + s(f, h) = 2s(f, ). 2 Now choosing h = 0 (and using s(f, 0) = 0) shows s(f, g) = 2s(f, g ) and 2 thus s(f, g) + s(f, h) = s(f, g + h). Furthermore, by induction we infer m m 2n s(f, g) = s(f, 2n g); that is, α s(f, g) = s(f, αg) for every positive rational α. By continuity (check this!) this holds for all α 0 and s(f, −g) = −s(f, g), respectively, s(f, ig) = i s(f, g), finishes the proof. Note that the parallelogram law and the polarization identity even hold for sesquilinear forms (Problem 0.14). But how do we define a scalar product on C(I)? One possibility is b f, g = f ∗ (x)g(x)dx. (0.47) a The corresponding inner product space is denoted by L2 (I). Note that cont we have f ≤ |b − a| f ∞ (0.48) and hence the maximum norm is stronger than the L2 norm. cont Suppose we have two norms . 1 and . 2 on a space X. Then . 2 is said to be stronger than . 1 if there is a constant m 0 such that f 1 ≤m f 2. (0.49) It is straightforward to check the following.
  • 32. 20 0. A first look at Banach and Hilbert spaces Lemma 0.20. If . 2 is stronger than . 1 , then any . 2 Cauchy sequence is also a . 1 Cauchy sequence. Hence if a function F : X → Y is continuous in (X, . 1 ), it is also continuous in (X, . 2 ) and if a set is dense in (X, . 2 ), it is also dense in (X, . 1 ). In particular, L2 is separable. But is it also complete? Unfortunately cont the answer is no: Example. Take I = [0, 2] and define  1 0,  0 ≤ x ≤ 1 − n, fn (x) = 1 + n(x − 1), 1 1 − n ≤ x ≤ 1, (0.50)  1, 1 ≤ x ≤ 2.  Then fn (x) is a Cauchy sequence in L2 , but there is no limit in L2 ! cont cont Clearly the limit should be the step function which is 0 for 0 ≤ x 1 and 1 for 1 ≤ x ≤ 2, but this step function is discontinuous (Problem 0.18)! This shows that in infinite dimensional spaces different norms will give rise to different convergent sequences! In fact, the key to solving problems in infinite dimensional spaces is often finding the right norm! This is something which cannot happen in the finite dimensional case. Theorem 0.21. If X is a finite dimensional space, then all norms are equiv- alent. That is, for any two given norms . 1 and . 2 , there are constants m1 and m2 such that 1 f 1 ≤ f 2 ≤ m1 f 1 . (0.51) m2 Proof. Clearly we can choose a basis uj , 1 ≤ j ≤ n, and assume that . 2 2 2 is the usual Euclidean norm, j αj uj 2 = j |αj | . Let f = j αj uj . Then by the triangle and Cauchy–Schwartz inequalities f ≤ |αj | uj ≤ uj 2 f 1 1 1 2 j j and we can choose m2 = j uj 1. In particular, if fn is convergent with respect to . 2 , it is also convergent with respect to . 1 . Thus . 1 is continuous with respect to . 2 and attains its minimum m 0 on the unit sphere (which is compact by the Heine-Borel theorem). Now choose m1 = 1/m. Problem 0.12. Show that the norm in a Hilbert space satisfies f + g = f + g if and only if f = αg, α ≥ 0, or g = 0.
  • 33. 0.3. The geometry of Hilbert spaces 21 Problem 0.13. Show that 2 (N) is a separable Hilbert space. Problem 0.14. Suppose Q is a vector space. Let s(f, g) be a sesquilinear form on Q and q(f ) = s(f, f ) the associated quadratic form. Prove the parallelogram law q(f + g) + q(f − g) = 2q(f ) + 2q(g) (0.52) and the polarization identity 1 s(f, g) = (q(f + g) − q(f − g) + i q(f − ig) − i q(f + ig)) . (0.53) 4 Conversely, show that any quadratic form q(f ) : Q → R satisfying q(αf ) = |α|2 q(f ) and the parallelogram law gives rise to a sesquilinear form via the polarization identity. Problem 0.15. A sesquilinear form is called bounded if s = sup |s(f, g)| f = g =1 is finite. Similarly, the associated quadratic form q is bounded if q = sup |q(f )| f =1 is finite. Show q ≤ s ≤2 q . (Hint: Use the parallelogram law and the polarization identity from the pre- vious problem.) Problem 0.16. Suppose Q is a vector space. Let s(f, g) be a sesquilinear form on Q and q(f ) = s(f, f ) the associated quadratic form. Show that the Cauchy–Schwarz inequality |s(f, g)| ≤ q(f )1/2 q(g)1/2 (0.54) holds if q(f ) ≥ 0. (Hint: Consider 0 ≤ q(f + αg) = q(f ) + 2 Re(α s(f, g)) + |α|2 q(g) and choose α = t s(f, g)∗ /|s(f, g)| with t ∈ R.) Problem 0.17. Show that the maximum norm (on C[0, 1]) does not satisfy the parallelogram law. Problem 0.18. Prove the claims made about fn , defined in (0.50), in the last example.
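The claims about fn from (0.50) (cf. Problem 0.18) can at least be checked numerically. The following sketch (illustrative only; the integral (0.47) is approximated by a Riemann sum on a uniform grid, and the helper names are ad hoc) shows that ‖fn − fm‖ tends to zero, so fn is Cauchy in L²cont, while the L² limit is the discontinuous step function, which lies outside C(I).

    import numpy as np

    def f(n, x):
        """The sequence (0.50) on I = [0, 2]."""
        return np.clip(1 + n * (x - 1), 0.0, 1.0)

    x = np.linspace(0.0, 2.0, 200001)
    dx = x[1] - x[0]
    l2 = lambda h: np.sqrt(np.sum(np.abs(h)**2) * dx)   # Riemann-sum L^2 norm

    for n, m in [(10, 20), (100, 200), (1000, 2000)]:
        print(n, m, l2(f(n, x) - f(m, x)))   # tends to 0: Cauchy in L^2_cont

    step = np.where(x < 1.0, 0.0, 1.0)       # pointwise limit, discontinuous at x = 1
    print(l2(f(1000, x) - step))             # small: the L^2 limit is the step function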
  • 34. 22 0. A first look at Banach and Hilbert spaces 0.4. Completeness Since L2cont is not complete, how can we obtain a Hilbert space from it? Well, the answer is simple: take the completion. If X is an (incomplete) normed space, consider the set of all Cauchy ˜ sequences X. Call two Cauchy sequences equivalent if their difference con- ¯ verges to zero and denote by X the set of all equivalence classes. It is easy to see that X¯ (and X) inherit the vector space structure from X. Moreover, ˜ Lemma 0.22. If xn is a Cauchy sequence, then xn converges. Consequently the norm of a Cauchy sequence (xn )∞ can be defined by n=1 (xn )∞ n=1 = limn→∞ xn and is independent of the equivalence class (show ¯ ˜ this!). Thus X is a normed space (X is not! Why?). ¯ Theorem 0.23. X is a Banach space containing X as a dense subspace if we identify x ∈ X with the equivalence class of all sequences converging to x. ¯ Proof. (Outline) It remains to show that X is complete. Let ξn = [(xn,j )∞ ] j=1 ¯ be a Cauchy sequence in X. Then it is not hard to see that ξ = [(xj,j )∞ ] j=1 is its limit. ¯ Let me remark that the completion X is unique. More precisely any other complete space which contains X as a dense subset is isomorphic to ¯ X. This can for example be seen by showing that the identity map on X ¯ has a unique extension to X (compare Theorem 0.26 below). In particular it is no restriction to assume that a normed linear space or an inner product space is complete. However, in the important case of L2cont it is somewhat inconvenient to work with equivalence classes of Cauchy sequences and hence we will give a different characterization using the Lebesgue integral later. 0.5. Bounded operators A linear map A between two normed spaces X and Y will be called a (lin- ear) operator A : D(A) ⊆ X → Y. (0.55) The linear subspace D(A) on which A is defined is called the domain of A and is usually required to be dense. The kernel Ker(A) = {f ∈ D(A)|Af = 0} (0.56) and range Ran(A) = {Af |f ∈ D(A)} = AD(A) (0.57)
  • 35. 0.5. Bounded operators 23 are defined as usual. The operator A is called bounded if the operator norm A = sup Af Y (0.58) f X =1 is finite. The set of all bounded linear operators from X to Y is denoted by L(X, Y ). If X = Y , we write L(X, X) = L(X). Theorem 0.24. The space L(X, Y ) together with the operator norm (0.58) is a normed space. It is a Banach space if Y is. Proof. That (0.58) is indeed a norm is straightforward. If Y is complete and An is a Cauchy sequence of operators, then An f converges to an element g for every f . Define a new operator A via Af = g. By continuity of the vector operations, A is linear and by continuity of the norm Af = limn→∞ An f ≤ (limn→∞ An ) f , it is bounded. Furthermore, given ε 0 there is some N such that An − Am ≤ ε for n, m ≥ N and thus An f −Am f ≤ ε f . Taking the limit m → ∞, we see An f −Af ≤ ε f ; that is, An → A. By construction, a bounded operator is Lipschitz continuous, Af Y ≤ A f X, (0.59) and hence continuous. The converse is also true Theorem 0.25. An operator A is bounded if and only if it is continuous. Proof. Suppose A is continuous but not bounded. Then there is a sequence 1 of unit vectors un such that Aun ≥ n. Then fn = n un converges to 0 but Afn ≥ 1 does not converge to 0. Moreover, if A is bounded and densely defined, it is no restriction to assume that it is defined on all of X. Theorem 0.26 (B.L.T. theorem). Let A ∈ L(X, Y ) and let Y be a Banach space. If D(A) is dense, there is a unique (continuous) extension of A to X which has the same norm. Proof. Since a bounded operator maps Cauchy sequences to Cauchy se- quences, this extension can only be given by Af = lim Afn , fn ∈ D(A), f ∈ X. n→∞ To show that this definition is independent of the sequence fn → f , let gn → f be a second sequence and observe Afn − Agn = A(fn − gn ) ≤ A fn − gn → 0.
  • 36. 24 0. A first look at Banach and Hilbert spaces From continuity of vector addition and scalar multiplication it follows that our extension is linear. Finally, from continuity of the norm we conclude that the norm does not increase. An operator in L(X, C) is called a bounded linear functional and the space X ∗ = L(X, C) is called the dual space of X. A sequence fn is said to converge weakly, fn f , if (fn ) → (f ) for every ∈ X ∗ . The Banach space of bounded linear operators L(X) even has a multi- plication given by composition. Clearly this multiplication satisfies (A + B)C = AC + BC, A(B + C) = AB + BC, A, B, C ∈ L(X) (0.60) and (AB)C = A(BC), α (AB) = (αA)B = A (αB), α ∈ C. (0.61) Moreover, it is easy to see that we have AB ≤ A B . (0.62) However, note that our multiplication is not commutative (unless X is one- dimensional). We even have an identity, the identity operator I satisfying I = 1. A Banach space together with a multiplication satisfying the above re- quirements is called a Banach algebra. In particular, note that (0.62) ensures that multiplication is continuous (Problem 0.22). Problem 0.19. Show that the integral operator 1 (Kf )(x) = K(x, y)f (y)dy, 0 where K(x, y) ∈ C([0, 1] × [0, 1]), defined on D(K) = C[0, 1] is a bounded operator both in X = C[0, 1] (max norm) and X = L2 (0, 1). cont d Problem 0.20. Show that the differential operator A = dx defined on D(A) = C 1 [0, 1] ⊂ C[0, 1] is an unbounded operator. Problem 0.21. Show that AB ≤ A B for every A, B ∈ L(X). Problem 0.22. Show that the multiplication in a Banach algebra X is con- tinuous: xn → x and yn → y imply xn yn → xy. Problem 0.23. Let ∞ f (z) = fj z j , |z| R, j=0
  • 37. 0.6. Lebesgue Lp spaces 25 be a convergent power series with convergence radius R 0. Suppose A is a bounded operator with A R. Show that ∞ f (A) = fj Aj j=0 exists and defines a bounded linear operator (cf. Problem 0.9). 0.6. Lebesgue Lp spaces For this section some basic facts about the Lebesgue integral are required. The necessary background can be found in Appendix A. To begin with, Sections A.1, A.3, and A.4 will be sufficient. We fix some σ-finite measure space (X, Σ, µ) and denote by Lp (X, dµ), 1 ≤ p, the set of all complex-valued measurable functions for which 1/p f p = |f |p dµ (0.63) X is finite. First of all note that Lp (X, dµ) is a linear space, since |f + g|p ≤ 2p max(|f |, |g|)p ≤ 2p max(|f |p , |g|p ) ≤ 2p (|f |p + |g|p ). Of course our hope is that Lp (X, dµ) is a Banach space. However, there is a small technical problem (recall that a property is said to hold almost everywhere if the set where it fails to hold is contained in a set of measure zero): Lemma 0.27. Let f be measurable. Then |f |p dµ = 0 (0.64) X if and only if f (x) = 0 almost everywhere with respect to µ. Proof. Observe that we have A = {x|f (x) = 0} = n An , where An = 1 {x| |f (x)| n }. If |f |p dµ = 0, we must have µ(An ) = 0 for every n and hence µ(A) = limn→∞ µ(An ) = 0. The converse is obvious. Note that the proof also shows that if f is not 0 almost everywhere, there is an ε 0 such that µ({x| |f (x)| ≥ ε}) 0. Example. Let λ be the Lebesgue measure on R. Then the characteristic function of the rationals χQ is zero a.e. (with respect to λ). Let Θ be the Dirac measure centered at 0. Then f (x) = 0 a.e. (with respect to Θ) if and only if f (0) = 0. Thus f p = 0 only implies f (x) = 0 for almost every x, but not for all! Hence . p is not a norm on Lp (X, dµ). The way out of this misery is to identify functions which are equal almost everywhere: Let N (X, dµ) = {f |f (x) = 0 µ-almost everywhere}. (0.65)
  • 38. 26 0. A first look at Banach and Hilbert spaces Then N (X, dµ) is a linear subspace of Lp (X, dµ) and we can consider the quotient space Lp (X, dµ) = Lp (X, dµ)/N (X, dµ). (0.66) If dµ is the Lebesgue measure on X ⊆ Rn , we simply write Lp (X). Observe that f p is well-defined on Lp (X, dµ). Even though the elements of Lp (X, dµ) are, strictly speaking, equiva- lence classes of functions, we will still call them functions for notational convenience. However, note that for f ∈ Lp (X, dµ) the value f (x) is not well-defined (unless there is a continuous representative and different con- tinuous functions are in different equivalence classes, e.g., in the case of Lebesgue measure). With this modification we are back in business since Lp (X, dµ) turns out to be a Banach space. We will show this in the following sections. But before that let us also define L∞ (X, dµ). It should be the set of bounded measurable functions B(X) together with the sup norm. The only problem is that if we want to identify functions equal almost everywhere, the supremum is no longer independent of the equivalence class. The solution is the essential supremum f ∞ = inf{C | µ({x| |f (x)| C}) = 0}. (0.67) That is, C is an essential bound if |f (x)| ≤ C almost everywhere and the essential supremum is the infimum over all essential bounds. Example. If λ is the Lebesgue measure, then the essential sup of χQ with respect to λ is 0. If Θ is the Dirac measure centered at 0, then the essential sup of χQ with respect to Θ is 1 (since χQ (0) = 1, and x = 0 is the only point which counts for Θ). As before we set L∞ (X, dµ) = B(X)/N (X, dµ) (0.68) and observe that f ∞ is independent of the equivalence class. If you wonder where the ∞ comes from, have a look at Problem 0.24. As a preparation for proving that Lp is a Banach space, we will need H¨lder’s inequality, which plays a central role in the theory of Lp spaces. o In particular, it will imply Minkowski’s inequality, which is just the triangle inequality for Lp . Theorem 0.28 (H¨lder’s inequality). Let p and q be dual indices; that is, o 1 1 + =1 (0.69) p q
  • 39. 0.6. Lebesgue Lp spaces 27 with 1 ≤ p ≤ ∞. If f ∈ Lp (X, dµ) and g ∈ Lq (X, dµ), then f g ∈ L1 (X, dµ) and f g 1 ≤ f p g q. (0.70) Proof. The case p = 1, q = ∞ (respectively p = ∞, q = 1) follows directly from the properties of the integral and hence it remains to consider 1 p, q ∞. First of all it is no restriction to assume f p = g q = 1. Then, using the elementary inequality (Problem 0.25) 1 1 a1/p b1/q ≤ a + b, a, b ≥ 0, (0.71) p q with a = |f |p and b = |g|q and integrating over X gives 1 1 |f g|dµ ≤ |f |p dµ + |g|q dµ = 1 X p X q X and finishes the proof. As a consequence we also get Theorem 0.29 (Minkowski’s inequality). Let f, g ∈ Lp (X, dµ). Then f +g p ≤ f p + g p. (0.72) Proof. Since the cases p = 1, ∞ are straightforward, we only consider 1 p ∞. Using |f + g|p ≤ |f | |f + g|p−1 + |g| |f + g|p−1 , we obtain from H¨lder’s inequality (note (p − 1)q = p) o p f +g p ≤ f p (f + g)p−1 q + g p (f + g)p−1 q p−1 =( f p + g p ) (f + g) p . This shows that Lp (X, dµ) is a normed linear space. Finally it remains to show that Lp (X, dµ) is complete. Theorem 0.30. The space Lp (X, dµ) is a Banach space. Proof. Suppose fn is a Cauchy sequence. It suffices to show that some subsequence converges (show this). Hence we can drop some terms such that 1 fn+1 − fn p ≤ n . 2 Now consider gn = fn − fn−1 (set f0 = 0). Then ∞ G(x) = |gk (x)| k=1
  • 40. 28 0. A first look at Banach and Hilbert spaces is in Lp . This follows from n n 1 |gk | ≤ gk (x) p ≤ f1 p + p 2 k=1 k=1 using the monotone convergence theorem. In particular, G(x) ∞ almost everywhere and the sum ∞ gn (x) = lim fn (x) n→∞ n=1 is absolutely convergent for those x. Now let f (x) be this limit. Since |f (x) − fn (x)|p converges to zero almost everywhere and |f (x) − fn (x)|p ≤ 2p G(x)p ∈ L1 , dominated convergence shows f − fn p → 0. In particular, in the proof of the last theorem we have seen: Corollary 0.31. If fn − f p → 0, then there is a subsequence which con- verges pointwise almost everywhere. Note that the statement is not true in general without passing to a subsequence (Problem 0.28). Using H¨lder’s inequality, we can also identify a class of bounded oper- o ators in Lp . Lemma 0.32 (Schur criterion). Consider Lp (X, dµ) and Lq (X, dν) with 1 1 p + q = 1. Suppose that K(x, y) is measurable and there are measurable functions K1 (x, y), K2 (x, y) such that |K(x, y)| ≤ K1 (x, y)K2 (x, y) and K1 (x, .) q ≤ C1 , K2 (., y) p ≤ C2 (0.73) for µ-almost every x, respectively, for ν-almost every y. Then the operator K : Lp (X, dµ) → Lp (X, dµ) defined by (Kf )(x) = K(x, y)f (y)dν(y) (0.74) Y for µ-almost every x is bounded with K ≤ C1 C2 . Proof. Choose f ∈ Lp (X, dµ). By Fubini’s theorem Y |K(x, y)f (y)|dν(y) is measurable and by H¨lder’s inequality we have o |K(x, y)f (y)|dν(y) ≤ K1 (x, y)K2 (x, y)|f (y)|dν(y) Y Y 1/q 1/p ≤ K1 (x, y)q dν(y) |K2 (x, y)f (y)|p dν(y) Y Y 1/p ≤ C1 |K2 (x, y)f (y)|p dν(y) Y
  • 41. 0.6. Lebesgue Lp spaces 29 (if K2 (x, .)f (.) ∈ Lp (X, dν), the inequality is trivially true). Now take this inequality to the p’th power and integrate with respect to x using Fubini p p |K(x, y)f (y)|dν(y) dµ(x) ≤ C1 |K2 (x, y)f (y)|p dν(y)dµ(x) X Y X Y p p p = C1 |K2 (x, y)f (y)|p dµ(x)dµ(y) ≤ C1 C2 f p p. Y X Hence Y |K(x, y)f (y)|dν(y) ∈ Lp (X, dµ) and in particular it is finite for µ-almost every x. Thus K(x, .)f (.) is ν integrable for µ-almost every x and Y K(x, y)f (y)dν(y) is measurable. It even turns out that Lp is separable. Lemma 0.33. Suppose X is a second countable topological space (i.e., it has a countable basis) and µ is a regular Borel measure. Then Lp (X, dµ), 1 ≤ p ∞, is separable. Proof. The set of all characteristic functions χA (x) with A ∈ Σ and µ(A) ∞ is total by construction of the integral. Now our strategy is as follows: Using outer regularity, we can restrict A to open sets and using the existence of a countable base, we can restrict A to open sets from this base. Fix A. By outer regularity, there is a decreasing sequence of open sets On such that µ(On ) → µ(A). Since µ(A) ∞, it is no restriction to assume µ(On ) ∞, and thus µ(On A) = µ(On ) − µ(A) → 0. Now dominated convergence implies χA − χOn p → 0. Thus the set of all characteristic functions χO (x) with O open and µ(O) ∞ is total. Finally let B be a countable basis for the topology. Then, every open set O can be written as O = ∞ Oj with Oj ∈ B. Moreover, by considering the set of all finite j=1 ˜ ˜ unions of elements from B, it is no restriction to assume n Oj ∈ B. Hence j=1 ˜ there is an increasing sequence On˜ ˜ O with On ∈ B. By monotone con- vergence, χO − χOn p → 0 and hence the set of all characteristic functions ˜ χO with O ˜ ˜ ∈ B is total. To finish this chapter, let us show that continuous functions are dense in Lp . Theorem 0.34. Let X be a locally compact metric space and let µ be a σ-finite regular Borel measure. Then the set Cc (X) of continuous functions with compact support is dense in Lp (X, dµ), 1 ≤ p ∞. Proof. As in the previous proof the set of all characteristic functions χK (x) with K compact is total (using inner regularity). Hence it suffices to show that χK (x) can be approximated by continuous functions. By outer regu- larity there is an open set O ⊃ K such that µ(OK) ≤ ε. By Urysohn’s
  • 42. 30 0. A first look at Banach and Hilbert spaces lemma (Lemma 0.12) there is a continuous function fε which is 1 on K and 0 outside O. Since |χK − fε |p dµ = |fε |p dµ ≤ µ(OK) ≤ ε, X OK we have fε − χK → 0 and we are done. If X is some subset of Rn , we can do even better. A nonnegative function ∞ u ∈ Cc (Rn ) is called a mollifier if u(x)dx = 1. (0.75) Rn 1 The standard mollifier is u(x) = exp( |x|2 −1 ) for |x| 1 and u(x) = 0 otherwise. If we scale a mollifier according to uk (x) = k n u(k x) such that its mass is preserved ( uk 1 = 1) and it concentrates more and more around the origin, T u k E we have the following result (Problem 0.29): Lemma 0.35. Let u be a mollifier in Rn and set uk (x) = k n u(k x). Then for any (uniformly) continuous function f : Rn → C we have that fk (x) = uk (x − y)f (y)dy (0.76) Rn is in C ∞ (Rn ) and converges to f (uniformly). Now we are ready to prove Theorem 0.36. If X ⊆ Rn and µ is a regular Borel measure, then the set ∞ Cc (X) of all smooth functions with compact support is dense in Lp (X, dµ), 1 ≤ p ∞. Proof. By our previous result it suffices to show that any continuous func- tion f (x) with compact support can be approximated by smooth ones. By setting f (x) = 0 for x ∈ X, it is no restriction to assume X = Rn . Now choose a mollifier u and observe that fk has compact support (since f
  • 43. 0.6. Lebesgue Lp spaces 31 has). Moreover, since f has compact support, it is uniformly continuous and fk → f uniformly. But this implies fk → f in Lp . We say that f ∈ Lp (X) if f ∈ Lp (K) for any compact subset K ⊂ X. loc Lemma 0.37. Suppose f ∈ L1 (Rn ). Then loc ∞ ϕ(x)f (x)dx = 0, ∀ϕ ∈ Cc (Rn ), (0.77) Rn if and only if f (x) = 0 (a.e.). Proof. First of all we claim that for any bounded function g with compact ∞ support K, there is a sequence of functions ϕn ∈ Cc (Rn ) with support in K which converges pointwise to g such that ϕn ∞ ≤ g ∞ . To see this, take a sequence of continuous functions ϕn with support in K which converges to g in L1 . To make sure that ϕn ∞ ≤ g ∞ , just set it equal to g ∞ whenever ϕn g ∞ and equal to − g ∞ whenever ϕn g ∞ (show that the resulting sequence still converges). Finally use (0.76) to make ϕn smooth (note that this operation does not change the range) and extract a pointwise convergent subsequence. Now let K be some compact set and choose g = sign(f )χK . Then |f |dx = f sign(f )dx = lim f χn dx = 0, K K n→∞ which shows f = 0 for a.e. x ∈ K. Since K is arbitrary, we are done. Problem 0.24. Suppose µ(X) ∞. Show that lim f p = f ∞ p→∞ for any bounded measurable function. Problem 0.25. Prove (0.71). (Hint: Take logarithms on both sides.) Problem 0.26. Show the following generalization of H¨lder’s inequality: o fg r ≤ f p g q, (0.78) 1 1 where p + q = 1. r Problem 0.27 (Lyapunov inequality). Let 0 θ 1. Show that if f ∈ Lp1 ∩ Lp2 , then f ∈ Lp and θ 1−θ f p ≤ f p1 f p2 , (0.79) 1 θ 1−θ where p = p1 + p2 .
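As a quick numerical sanity check for Problems 0.25–0.27, here is a minimal Python sketch; it works with counting measure on finitely many points (so the $L^p$ norms become plain sums), and the exponents and random data are ad hoc choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def lp_norm(f, p):
    """||f||_p with respect to counting measure on the index set."""
    return (np.abs(f) ** p).sum() ** (1.0 / p)

f = rng.normal(size=1000)
g = rng.normal(size=1000)

# Hoelder (0.70): ||f g||_1 <= ||f||_p ||g||_q with 1/p + 1/q = 1
p, q = 3.0, 1.5
assert np.isclose(1 / p + 1 / q, 1.0)
assert lp_norm(f * g, 1) <= lp_norm(f, p) * lp_norm(g, q) + 1e-12

# generalized Hoelder (0.78): ||f g||_r <= ||f||_p ||g||_q with 1/p + 1/q = 1/r
p, q = 4.0, 4.0
r = 1.0 / (1 / p + 1 / q)          # r = 2 here
assert lp_norm(f * g, r) <= lp_norm(f, p) * lp_norm(g, q) + 1e-12

# Lyapunov (0.79): ||f||_p <= ||f||_{p1}^theta ||f||_{p2}^(1-theta),
# where 1/p = theta/p1 + (1-theta)/p2
p1, p2, theta = 1.0, 4.0, 0.3
p = 1.0 / (theta / p1 + (1 - theta) / p2)
assert lp_norm(f, p) <= lp_norm(f, p1) ** theta * lp_norm(f, p2) ** (1 - theta) + 1e-12

print("all inequalities hold on this sample")
```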
  • 44. 32 0. A first look at Banach and Hilbert spaces Problem 0.28. Find a sequence fn which converges to 0 in Lp ([0, 1], dx) but for which fn (x) → 0 for a.e. x ∈ [0, 1] does not hold. (Hint: Every n ∈ N can be uniquely written as n = 2m +k with 0 ≤ m and 0 ≤ k 2m . Now consider the characteristic functions of the intervals Im,k = [k2−m , (k + 1)2−m ].) Problem 0.29. Prove Lemma 0.35. (Hint: To show that fk is smooth, use Problems A.7 and A.8.) Problem 0.30. Construct a function f ∈ Lp (0, 1) which has a singularity at every rational number in [0, 1]. (Hint: Start with the function f0 (x) = |x|−α which has a single pole at 0. Then fj (x) = f0 (x − xj ) has a pole at xj .) 0.7. Appendix: The uniform boundedness principle Recall that the interior of a set is the largest open subset (that is, the union of all open subsets). A set is called nowhere dense if its closure has empty interior. The key to several important theorems about Banach spaces is the observation that a Banach space cannot be the countable union of nowhere dense sets. Theorem 0.38 (Baire category theorem). Let X be a complete metric space. Then X cannot be the countable union of nowhere dense sets. Proof. Suppose X = ∞ Xn . We can assume that the sets Xn are closed n=1 and none of them contains a ball; that is, XXn is open and nonempty for every n. We will construct a Cauchy sequence xn which stays away from all Xn . Since XX1 is open and nonempty, there is a closed ball Br1 (x1 ) ⊆ XX1 . Reducing r1 a little, we can even assume Br1 (x1 ) ⊆ XX1 . More- over, since X2 cannot contain Br1 (x1 ), there is some x2 ∈ Br1 (x1 ) that is not in X2 . Since Br1 (x1 ) ∩ (XX2 ) is open, there is a closed ball Br2 (x2 ) ⊆ Br1 (x1 ) ∩ (XX2 ). Proceeding by induction, we obtain a sequence of balls such that Brn (xn ) ⊆ Brn−1 (xn−1 ) ∩ (XXn ). Now observe that in every step we can choose rn as small as we please; hence without loss of generality rn → 0. Since by construction xn ∈ BrN (xN ) for n ≥ N , we conclude that xn is Cauchy and converges to some point x ∈ X. But x ∈ Brn (xn ) ⊆ XXn for every n, contradicting our assumption that the Xn cover X. (Sets which can be written as the countable union of nowhere dense sets are said to be of first category. All other sets are second category. Hence we have the name category theorem.)
In other words, if $X_n \subseteq X$ is a sequence of closed subsets which cover $X$, at least one $X_n$ contains a ball of radius $\varepsilon > 0$.

Now we come to the first important consequence, the uniform boundedness principle.

Theorem 0.39 (Banach–Steinhaus). Let $X$ be a Banach space and $Y$ some normed linear space. Let $\{A_\alpha\} \subseteq \mathcal{L}(X,Y)$ be a family of bounded operators. Suppose $\|A_\alpha x\| \le C(x)$ is bounded for fixed $x \in X$. Then $\|A_\alpha\| \le C$ is uniformly bounded.

Proof. Let
$$X_n = \{x \,|\, \|A_\alpha x\| \le n \text{ for all } \alpha\} = \bigcap_\alpha \{x \,|\, \|A_\alpha x\| \le n\}.$$
Then $\bigcup_n X_n = X$ by assumption. Moreover, by continuity of $A_\alpha$ and the norm, each $X_n$ is an intersection of closed sets and hence closed. By Baire's theorem at least one contains a ball of positive radius: $B_\varepsilon(x_0) \subset X_n$. Now observe
$$\|A_\alpha y\| \le \|A_\alpha (y + x_0)\| + \|A_\alpha x_0\| \le n + \|A_\alpha x_0\|$$
for $\|y\| < \varepsilon$. Setting $y = \varepsilon \frac{x}{\|x\|}$, we obtain
$$\|A_\alpha x\| \le \frac{n + C(x_0)}{\varepsilon}\, \|x\|$$
for any $x$.
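A numerical aside on Problem 0.28 and Corollary 0.31: the "typewriter" sequence from the hint can be inspected directly. The sketch below is a minimal illustration (the grid used to estimate the $L^p$ norm and the sample point $x_0 = 0.3$ are arbitrary choices); the norms tend to zero, yet every point is hit by an interval of each dyadic generation, so pointwise convergence fails.

```python
import numpy as np

def f(n, x):
    """Characteristic function of I_{m,k} = [k 2^-m, (k+1) 2^-m], where n = 2^m + k."""
    m = int(np.floor(np.log2(n)))
    k = n - 2 ** m
    a, b = k * 2.0 ** (-m), (k + 1) * 2.0 ** (-m)
    return ((a <= x) & (x <= b)).astype(float)

x = np.linspace(0, 1, 4001)                    # crude quadrature grid on [0, 1]
p = 2.0
for n in (1, 2, 4, 16, 256, 4096):
    lp = np.mean(f(n, x) ** p) ** (1 / p)      # ~ |I_{m,k}|^{1/p} = 2^{-m/p}
    print(f"n = {n:5d}   ||f_n||_{p:.0f} ~ {lp:.4f}")

# at a fixed point the sequence takes the value 1 once in every dyadic generation
x0 = 0.3
hits = sum(f(n, np.array([x0]))[0] for n in range(1, 1025))
print("f_n(0.3) = 1 for", int(hits), "of the first 1024 indices")
```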
Chapter 1

Hilbert spaces

The phase space in classical mechanics is the Euclidean space $\mathbb{R}^{2n}$ (for the $n$ position and $n$ momentum coordinates). In quantum mechanics the phase space is always a Hilbert space $\mathfrak{H}$. Hence the geometry of Hilbert spaces stands at the outset of our investigations.

1.1. Hilbert spaces

Suppose $\mathfrak{H}$ is a vector space. A map $\langle\cdot,\cdot\rangle : \mathfrak{H} \times \mathfrak{H} \to \mathbb{C}$ is called a sesquilinear form if it is conjugate linear in the first argument and linear in the second. A positive definite sesquilinear form is called an inner product or scalar product. Associated with every scalar product is a norm
$$\|\psi\| = \sqrt{\langle\psi,\psi\rangle}. \tag{1.1}$$
The triangle inequality follows from the Cauchy–Schwarz–Bunjakowski inequality
$$|\langle\psi,\varphi\rangle| \le \|\psi\|\,\|\varphi\| \tag{1.2}$$
with equality if and only if $\psi$ and $\varphi$ are parallel.

If $\mathfrak{H}$ is complete with respect to the above norm, it is called a Hilbert space. It is no restriction to assume that $\mathfrak{H}$ is complete since one can easily replace it by its completion.

Example. The space $L^2(M, d\mu)$ is a Hilbert space with scalar product given by
$$\langle f, g\rangle = \int_M f(x)^* g(x)\, d\mu(x). \tag{1.3}$$
  • 50. 38 1. Hilbert spaces Similarly, the set of all square summable sequences 2 (N) is a Hilbert space with scalar product ∗ f, g = fj gj . (1.4) j∈N (Note that the second example is a special case of the first one; take M = R and µ a sum of Dirac measures.) A vector ψ ∈ H is called normalized or a unit vector if ψ = 1. Two vectors ψ, ϕ ∈ H are called orthogonal or perpendicular (ψ ⊥ ϕ) if ψ, ϕ = 0 and parallel if one is a multiple of the other. If ψ and ϕ are orthogonal, we have the Pythagorean theorem: 2 2 ψ+ϕ = ψ + ϕ 2, ψ ⊥ ϕ, (1.5) which is one line of computation. Suppose ϕ is a unit vector. Then the projection of ψ in the direction of ϕ is given by ψ = ϕ, ψ ϕ (1.6) and ψ⊥ defined via ψ⊥ = ψ − ϕ, ψ ϕ (1.7) is perpendicular to ϕ. These results can also be generalized to more than one vector. A set of vectors {ϕj } is called an orthonormal set (ONS) if ϕj , ϕk = 0 for j = k and ϕj , ϕj = 1. Lemma 1.1. Suppose {ϕj }n is an orthonormal set. Then every ψ ∈ H j=0 can be written as n ψ = ψ + ψ⊥ , ψ = ϕj , ψ ϕj , (1.8) j=0 where ψ and ψ⊥ are orthogonal. Moreover, ϕj , ψ⊥ = 0 for all 1 ≤ j ≤ n. In particular, n 2 ψ = | ϕj , ψ |2 + ψ⊥ 2 . (1.9) j=0 ˆ Moreover, every ψ in the span of {ϕj }n satisfies j=0 ˆ ψ − ψ ≥ ψ⊥ (1.10) ˆ with equality holding if and only if ψ = ψ . In other words, ψ is uniquely characterized as the vector in the span of {ϕj }n closest to ψ. j=0
  • 51. 1.2. Orthonormal bases 39 Proof. A straightforward calculation shows ϕj , ψ − ψ = 0 and hence ψ and ψ⊥ = ψ − ψ are orthogonal. The formula for the norm follows by applying (1.5) iteratively. Now, fix a vector n ˆ ψ= cj ϕj j=0 in the span of {ϕj }n . Then one computes j=0 ˆ ψ−ψ 2 ˆ = ψ + ψ⊥ − ψ 2 = ψ⊥ 2 ˆ + ψ −ψ 2 n 2 = ψ⊥ + |cj − ϕj , ψ |2 j=0 from which the last claim follows. From (1.9) we obtain Bessel’s inequality n | ϕj , ψ |2 ≤ ψ 2 (1.11) j=0 with equality holding if and only if ψ lies in the span of {ϕj }n . j=0 Recall that a scalar product can be recovered from its norm by virtue of the polarization identity 1 2 2 2 2 ϕ, ψ = ϕ+ψ − ϕ−ψ + i ϕ − iψ − i ϕ + iψ . (1.12) 4 A bijective linear operator U ∈ L(H1 , H2 ) is called unitary if U preserves scalar products: U ϕ, U ψ 2 = ϕ, ψ 1 , ϕ, ψ ∈ H1 . (1.13) By the polarization identity this is the case if and only if U preserves norms: U ψ 2 = ψ 1 for all ψ ∈ H1 . The two Hilbert space H1 and H2 are called unitarily equivalent in this case. Problem 1.1. The operator 2 2 S: (N) → (N), (a1 , a2 , a3 , . . . ) → (0, a1 , a2 , . . . ) satisfies Sa = a . Is it unitary? 1.2. Orthonormal bases Of course, since we cannot assume H to be a finite dimensional vector space, we need to generalize Lemma 1.1 to arbitrary orthonormal sets {ϕj }j∈J .
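Before doing so, it may help to see Lemma 1.1 in coordinates. The following minimal Python sketch (the dimension, the number of vectors, and the random data are arbitrary choices) builds an orthonormal set in $\mathbb{C}^5$, splits a vector as $\psi = \psi_\parallel + \psi_\perp$, and checks (1.9), Bessel's inequality (1.11), and the polarization identity (1.12).

```python
import numpy as np

rng = np.random.default_rng(1)
dim, n = 5, 3

# orthonormal set {phi_j} in C^5: orthonormalize random vectors via QR
A = rng.normal(size=(dim, n)) + 1j * rng.normal(size=(dim, n))
Phi, _ = np.linalg.qr(A)               # columns are orthonormal

def inner(x, y):                       # conjugate linear in the first argument
    return np.vdot(x, y)

psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)

coeffs = np.array([inner(Phi[:, j], psi) for j in range(n)])
psi_par = Phi @ coeffs                 # projection onto span{phi_j}
psi_perp = psi - psi_par

# psi_perp is orthogonal to every phi_j, and (1.9) holds
assert np.allclose([inner(Phi[:, j], psi_perp) for j in range(n)], 0)
assert np.isclose(np.linalg.norm(psi) ** 2,
                  np.sum(np.abs(coeffs) ** 2) + np.linalg.norm(psi_perp) ** 2)

# Bessel's inequality (1.11)
assert np.sum(np.abs(coeffs) ** 2) <= np.linalg.norm(psi) ** 2 + 1e-12

# polarization identity (1.12)
phi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
pol = 0.25 * (np.linalg.norm(phi + psi) ** 2 - np.linalg.norm(phi - psi) ** 2
              + 1j * np.linalg.norm(phi - 1j * psi) ** 2
              - 1j * np.linalg.norm(phi + 1j * psi) ** 2)
assert np.isclose(pol, inner(phi, psi))
print("Lemma 1.1, Bessel, and polarization verified numerically")
```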
  • 52. 40 1. Hilbert spaces We start by assuming that J is countable. Then Bessel’s inequality (1.11) shows that | ϕj , ψ |2 (1.14) j∈J converges absolutely. Moreover, for any finite subset K ⊂ J we have 2 ϕj , ψ ϕj = | ϕj , ψ |2 (1.15) j∈K j∈K by the Pythagorean theorem and thus j∈J ϕj , ψ ϕj is Cauchy if and only 2 if j∈J | ϕj , ψ | is. Now let J be arbitrary. Again, Bessel’s inequality shows that for any given ε 0 there are at most finitely many j for which | ϕj , ψ | ≥ ε. Hence there are at most countably many j for which | ϕj , ψ | 0. Thus it follows that | ϕj , ψ |2 (1.16) j∈J is well-defined and so is ϕj , ψ ϕj . (1.17) j∈J In particular, by continuity of the scalar product we see that Lemma 1.1 holds for arbitrary orthonormal sets without modifications. Theorem 1.2. Suppose {ϕj }j∈J is an orthonormal set. Then every ψ ∈ H can be written as ψ = ψ + ψ⊥ , ψ = ϕj , ψ ϕj , (1.18) j∈J where ψ and ψ⊥ are orthogonal. Moreover, ϕj , ψ⊥ = 0 for all j ∈ J. In particular, ψ 2= | ϕj , ψ |2 + ψ⊥ 2 . (1.19) j∈J ˆ Moreover, every ψ in the span of {ϕj }j∈J satisfies ˆ ψ − ψ ≥ ψ⊥ (1.20) ˆ with equality holding if and only if ψ = ψ . In other words, ψ is uniquely characterized as the vector in the span of {ϕj }j∈J closest to ψ. Note that from Bessel’s inequality (which of course still holds) it follows that the map ψ → ψ is continuous. An orthonormal set which is not a proper subset of any other orthonor- mal set is called an orthonormal basis (ONB) due to the following result: Theorem 1.3. For an orthonormal set {ϕj }j∈J the following conditions are equivalent:
  • 53. 1.2. Orthonormal bases 41 (i) {ϕj }j∈J is a maximal orthonormal set. (ii) For every vector ψ ∈ H we have ψ= ϕj , ψ ϕj . (1.21) j∈J (iii) For every vector ψ ∈ H we have 2 ψ = | ϕj , ψ |2 . (1.22) j∈J (iv) ϕj , ψ = 0 for all j ∈ J implies ψ = 0. Proof. We will use the notation from Theorem 1.2. ˜ (i) ⇒ (ii): If ψ⊥ = 0, then we can normalize ψ⊥ to obtain a unit vector ψ⊥ ˜ which is orthogonal to all vectors ϕj . But then {ϕj }j∈J ∪ {ψ⊥ } would be a larger orthonormal set, contradicting the maximality of {ϕj }j∈J . (ii) ⇒ (iii): This follows since (ii) implies ψ⊥ = 0. (iii) ⇒ (iv): If ψ, ϕj = 0 for all j ∈ J, we conclude ψ 2 = 0 and hence ψ = 0. (iv) ⇒ (i): If {ϕj }j∈J were not maximal, there would be a unit vector ϕ such that {ϕj }j∈J ∪ {ϕ} is a larger orthonormal set. But ϕj , ϕ = 0 for all j ∈ J implies ϕ = 0 by (iv), a contradiction. Since ψ → ψ is continuous, it suffices to check conditions (ii) and (iii) on a dense set. Example. The set of functions 1 ϕn (x) = √ ein x , n ∈ Z, (1.23) 2π forms an orthonormal basis for H = L2 (0, 2π). The corresponding orthogo- nal expansion is just the ordinary Fourier series (Problem 1.20). A Hilbert space is separable if and only if there is a countable orthonor- mal basis. In fact, if H is separable, then there exists a countable total set {ψj }N . Here N ∈ N if H is finite dimensional and N = ∞ otherwise. After j=0 throwing away some vectors, we can assume that ψn+1 cannot be expressed as a linear combinations of the vectors ψ0 , . . . , ψn . Now we can construct an orthonormal basis as follows: We begin by normalizing ψ0 , ψ0 ϕ0 = . (1.24) ψ0 Next we take ψ1 and remove the component parallel to ϕ0 and normalize again: ψ1 − ϕ0 , ψ1 ϕ0 ϕ1 = . (1.25) ψ1 − ϕ0 , ψ1 ϕ0
  • 54. 42 1. Hilbert spaces Proceeding like this, we define recursively n−1 ψn − j=0 ϕj , ψn ϕj ϕn = n−1 . (1.26) ψn − j=0 ϕj , ψn ϕj This procedure is known as Gram–Schmidt orthogonalization. Hence we obtain an orthonormal set {ϕj }N such that span{ϕj }n = span{ψj }n j=0 j=0 j=0 for any finite n and thus also for N (if N = ∞). Since {ψj }N is total, so j=0 is {ϕj }N . Now suppose there is some ψ = ψ + ψ⊥ ∈ H for which ψ⊥ = 0. j=0 ˆ ˆ Since {ϕj }N is total, we can find a ψ in its span, such that ψ − ψ ψ⊥ j=1 N contradicting (1.20). Hence we infer that {ϕj }j=1 is an orthonormal basis. Theorem 1.4. Every separable Hilbert space has a countable orthonormal basis. Example. In L2 (−1, 1) we can orthogonalize the polynomial fn (x) = xn . The resulting polynomials are up to a normalization equal to the Legendre polynomials 3 x2 − 1 P0 (x) = 1, P1 (x) = x, P2 (x) = , ... (1.27) 2 (which are normalized such that Pn (1) = 1). If fact, if there is one countable basis, then it follows that any other basis is countable as well. Theorem 1.5. If H is separable, then every orthonormal basis is countable. Proof. We know that there is at least one countable orthonormal basis {ϕj }j∈J . Now let {φk }k∈K be a second basis and consider the set Kj = {k ∈ K| φk , ϕj = 0}. Since these are the expansion coefficients of ϕj with ˜ respect to {φk }k∈K , this set is countable. Hence the set K = j∈J Kj is ˜ ˜ countable as well. But k ∈ KK implies φk = 0 and hence K = K. We will assume all Hilbert spaces to be separable. In particular, it can be shown that L2 (M, dµ) is separable. Moreover, it turns out that, up to unitary equivalence, there is only one (separable) infinite dimensional Hilbert space: Let H be an infinite dimensional Hilbert space and let {ϕj }j∈N be any orthogonal basis. Then the map U : H → 2 (N), ψ → ( ϕj , ψ )j∈N is unitary (by Theorem 1.3 (iii)). In particular, Theorem 1.6. Any separable infinite dimensional Hilbert space is unitarily equivalent to 2 (N).
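The Gram–Schmidt procedure (1.24)–(1.26) and the Legendre example are easy to reproduce numerically. The sketch below is a minimal illustration (the quadrature order and the number of monomials are arbitrary choices): it orthonormalizes $f_n(x) = x^n$ in $L^2(-1,1)$ and compares the result with the Legendre polynomials (1.27), normalized to unit norm.

```python
import numpy as np

# Gauss-Legendre nodes/weights: integrals over [-1, 1] are exact for the
# polynomials appearing below
nodes, weights = np.polynomial.legendre.leggauss(50)

def inner(f, g):
    return np.sum(weights * f(nodes) * g(nodes))

def gram_schmidt(fs):
    """Orthonormalize a list of functions following (1.24)-(1.26)."""
    phis = []
    for f in fs:
        def h(x, f=f, prev=tuple(phis)):
            # remove the components parallel to the previously constructed phi_j
            return f(x) - sum(inner(p, f) * p(x) for p in prev)
        norm = np.sqrt(inner(h, h))
        phis.append(lambda x, h=h, norm=norm: h(x) / norm)
    return phis

monomials = [lambda x, n=n: x ** n for n in range(4)]
phis = gram_schmidt(monomials)

# up to normalization the phi_n are the Legendre polynomials (1.27);
# P_n has L^2(-1,1) norm sqrt(2/(2n+1))
for n in range(4):
    P = np.polynomial.legendre.Legendre.basis(n)
    dev = max(abs(phis[n](x) - np.sqrt((2 * n + 1) / 2) * P(x))
              for x in np.linspace(-1, 1, 11))
    print(f"n = {n}: deviation from normalized P_n = {dev:.2e}")
```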
  • 55. 1.3. The projection theorem and the Riesz lemma 43 Let me remark that if H is not separable, there still exists an orthonor- mal basis. However, the proof requires Zorn’s lemma: The collection of all orthonormal sets in H can be partially ordered by inclusion. Moreover, any linearly ordered chain has an upper bound (the union of all sets in the chain). Hence Zorn’s lemma implies the existence of a maximal element, that is, an orthonormal basis. Problem 1.2. Let {ϕj } be some orthonormal basis. Show that a bounded linear operator A is uniquely determined by its matrix elements Ajk = ϕj , Aϕk with respect to this basis. Problem 1.3. Show that L(H) is not separable if H is infinite dimensional. 1.3. The projection theorem and the Riesz lemma Let M ⊆ H be a subset. Then M ⊥ = {ψ| ϕ, ψ = 0, ∀ϕ ∈ M } is called the orthogonal complement of M . By continuity of the scalar prod- uct it follows that M ⊥ is a closed linear subspace and by linearity that (span(M ))⊥ = M ⊥ . For example we have H⊥ = {0} since any vector in H⊥ must be in particular orthogonal to all vectors in some orthonormal basis. Theorem 1.7 (Projection theorem). Let M be a closed linear subspace of a Hilbert space H. Then every ψ ∈ H can be uniquely written as ψ = ψ + ψ⊥ with ψ ∈ M and ψ⊥ ∈ M ⊥ . One writes M ⊕ M⊥ = H (1.28) in this situation. Proof. Since M is closed, it is a Hilbert space and has an orthonormal basis {ϕj }j∈J . Hence the result follows from Theorem 1.2. In other words, to every ψ ∈ H we can assign a unique vector ψ which is the vector in M closest to ψ. The rest, ψ − ψ , lies in M ⊥ . The operator PM ψ = ψ is called the orthogonal projection corresponding to M . Note that we have 2 PM = PM and PM ψ, ϕ = ψ, PM ϕ (1.29) since PM ψ, ϕ = ψ , ϕ = ψ, PM ϕ . Clearly we have PM ⊥ ψ = ψ − PM ψ = ψ⊥ . Furthermore, (1.29) uniquely characterizes orthogonal projec- tions (Problem 1.6). Moreover, we see that the vectors in a closed subspace M are precisely those which are orthogonal to all vectors in M ⊥ ; that is, M ⊥⊥ = M . If M is an arbitrary subset, we have at least M ⊥⊥ = span(M ). (1.30)
  • 56. 44 1. Hilbert spaces Note that by H⊥ = {0} we see that M ⊥ = {0} if and only if M is total. Finally we turn to linear functionals, that is, to operators : H → C. By the Cauchy–Schwarz inequality we know that ϕ : ψ → ϕ, ψ is a bounded linear functional (with norm ϕ ). It turns out that in a Hilbert space every bounded linear functional can be written in this way. Theorem 1.8 (Riesz lemma). Suppose is a bounded linear functional on a Hilbert space H. Then there is a unique vector ϕ ∈ H such that (ψ) = ϕ, ψ for all ψ ∈ H. In other words, a Hilbert space is equivalent to its own dual space H∗ = H. Proof. If ≡ 0, we can choose ϕ = 0. Otherwise Ker( ) = {ψ| (ψ) = 0} is a proper subspace and we can find a unit vector ϕ ∈ Ker( )⊥ . For every ˜ ψ ∈ H we have (ψ)ϕ − (ϕ)ψ ∈ Ker( ) and hence ˜ ˜ 0 = ϕ, (ψ)ϕ − (ϕ)ψ = (ψ) − (ϕ) ϕ, ψ . ˜ ˜ ˜ ˜ ˜ In other words, we can choose ϕ = (ϕ)∗ ϕ. To see uniqueness, let ϕ1 , ϕ2 be ˜ ˜ two such vectors. Then ϕ1 − ϕ2 , ψ = ϕ1 , ψ − ϕ2 , ψ = (ψ) − (ψ) = 0 for any ψ ∈ H, which shows ϕ1 − ϕ2 ∈ H⊥ = {0}. The following easy consequence is left as an exercise. Corollary 1.9. Suppose s is a bounded sesquilinear form; that is, |s(ψ, ϕ)| ≤ C ψ ϕ . (1.31) Then there is a unique bounded operator A such that s(ψ, ϕ) = Aψ, ϕ . (1.32) Moreover, A ≤ C. Note that by the polarization identity (Problem 0.14), A is already uniquely determined by its quadratic form qA (ψ) = ψ, Aψ . Problem 1.4. Suppose U : H → H is unitary and M ⊆ H. Show that U M ⊥ = (U M )⊥ . Problem 1.5. Show that an orthogonal projection PM = 0 has norm one. Problem 1.6. Suppose P ∈ L satisfies P2 = P and P ψ, ϕ = ψ, P ϕ and set M = Ran(P ). Show • P ψ = ψ for ψ ∈ M and M is closed, • ϕ ∈ M ⊥ implies P ϕ ∈ M ⊥ and thus P ϕ = 0, and conclude P = PM .
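In finite dimensions the projection theorem becomes very concrete: if the columns of a matrix $V$ form an orthonormal basis of a subspace $M$, then $P_M = V V^*$. Here is a minimal Python sketch along the lines of Problem 1.6 (the dimension, the subspace, and the test vector are arbitrary choices) checking (1.29) and the decomposition $\psi = P_M\psi + (1 - P_M)\psi$.

```python
import numpy as np

rng = np.random.default_rng(2)
dim, m = 6, 2

# orthonormal basis of a 2-dimensional subspace M of C^6
V, _ = np.linalg.qr(rng.normal(size=(dim, m)) + 1j * rng.normal(size=(dim, m)))
P = V @ V.conj().T                    # orthogonal projection onto M

# (1.29): P^2 = P and <P psi, phi> = <psi, P phi> (i.e., P is self-adjoint)
assert np.allclose(P @ P, P)
assert np.allclose(P, P.conj().T)

psi = rng.normal(size=dim) + 1j * rng.normal(size=dim)
psi_par, psi_perp = P @ psi, psi - P @ psi

# psi_perp lies in the orthogonal complement of M
assert np.allclose(V.conj().T @ psi_perp, 0)

# Pythagoras: ||psi||^2 = ||P psi||^2 + ||(1 - P) psi||^2
assert np.isclose(np.linalg.norm(psi) ** 2,
                  np.linalg.norm(psi_par) ** 2 + np.linalg.norm(psi_perp) ** 2)
print("projection properties verified")
```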
  • 57. 1.4. Orthogonal sums and tensor products 45 Problem 1.7. Let P1 , P2 be two orthogonal projections. Show that P1 ≤ P2 (that is, ψ, P1 ψ ≤ ψ, P2 ψ ) if and only if Ran(P1 ) ⊆ Ran(P2 ). Show in this case that the two projections commute (that is, P1 P2 = P2 P1 ) and that P2 − P1 is also a projection. (Hints: Pj ψ = ψ if and only if Pj ψ = ψ and Ran(P1 ) ⊆ Ran(P2 ) if and only if P2 P1 = P1 .) 1 Problem 1.8. Show P : L2 (R) → L2 (R), f (x) → 2 (f (x) + f (−x)) is a projection. Compute its range and kernel. Problem 1.9. Prove Corollary 1.9. Problem 1.10. Consider the sesquilinear form 1 x x B(f, g) = f (t)∗ dt g(t)dt dx 0 0 0 in L2 (0, 1). Show that it is bounded and find the corresponding operator A. (Hint: Partial integration.) 1.4. Orthogonal sums and tensor products Given two Hilbert spaces H1 and H2 , we define their orthogonal sum H1 ⊕ H2 to be the set of all pairs (ψ1 , ψ2 ) ∈ H1 × H2 together with the scalar product (ϕ1 , ϕ2 ), (ψ1 , ψ2 ) = ϕ1 , ψ1 1 + ϕ2 , ψ2 2 . (1.33) It is left as an exercise to verify that H1 ⊕ H2 is again a Hilbert space. Moreover, H1 can be identified with {(ψ1 , 0)|ψ1 ∈ H1 } and we can regard H1 as a subspace of H1 ⊕ H2 , and similarly for H2 . It is also customary to write ψ1 + ψ2 instead of (ψ1 , ψ2 ). More generally, let Hj , j ∈ N, be a countable collection of Hilbert spaces and define ∞ ∞ ∞ 2 Hj = { ψj | ψj ∈ Hj , ψj j ∞}, (1.34) j=1 j=1 j=1 which becomes a Hilbert space with the scalar product ∞ ∞ ∞ ϕj , ψj = ϕj , ψj j . (1.35) j=1 j=1 j=1 ∞ 2 (N). Example. j=1 C = ˜ Similarly, if H and H are two Hilbert spaces, we define their tensor ˜ product as follows: The elements should be products ψ⊗ ψ of elements ψ ∈ H
  • 58. 46 1. Hilbert spaces ˜ ˜ and ψ ∈ H. Hence we start with the set of all finite linear combinations of ˜ elements of H × H: n ˜ F(H, H) = { ˜ ˜ ˜ αj (ψj , ψj )|(ψj , ψj ) ∈ H × H, αj ∈ C}. (1.36) j=1 ˜ ˜ ˜ ˜ ˜ ˜ ˜ Since we want (ψ1 +ψ2 )⊗ ψ = ψ1 ⊗ ψ+ψ2 ⊗ ψ, ψ⊗(ψ1 + ψ2 ) = ψ⊗ ψ1 +ψ⊗ ψ2 , and (αψ) ⊗ ψ˜ = ψ ⊗ (αψ), we consider F(H, H)/N (H, H), where ˜ ˜ ˜ n n n ˜ N (H, H) = span{ ˜ αj βk (ψj , ψk ) − ( αj ψj , ˜ βk ψk )} (1.37) j,k=1 j=1 k=1 ˜ ˜ and write ψ ⊗ ψ for the equivalence class of (ψ, ψ). Next we define ˜ ˜ ψ ⊗ ψ, φ ⊗ φ = ψ, φ ψ, φ˜ ˜ (1.38) ˜ which extends to a sesquilinear form on F(H, H)/N (H, H).˜ To show that we ˜ obtain a scalar product, we need to ensure positivity. Let ψ = i αi ψi ⊗ψi = 0 and pick orthonormal bases ϕj , ϕk for span{ψi }, span{ψ ˜ ˜i }, respectively. Then ψ= αjk ϕj ⊗ ϕk , αjk = ˜ ˜ ˜ αi ϕj , ψi ϕk , ψi (1.39) j,k i and we compute ψ, ψ = |αjk |2 0. (1.40) j,k ˜ ˜ The completion of F(H, H)/N (H, H) with respect to the induced norm is called the tensor product H ⊗ H ˜ of H and H. ˜ ˜ Lemma 1.10. If ϕj , ϕk are orthonormal bases for H, H, respectively, then ˜ ˜ ϕj ⊗ ϕk is an orthonormal basis for H ⊗ H. ˜ Proof. That ϕj ⊗ ϕk is an orthonormal set is immediate from (1.38). More- ˜ ˜ over, since span{ϕj }, span{ϕk } are dense in H, H, respectively, it is easy to ˜ ˜ see that ϕj ⊗ ϕk is dense in F(H, H)/N (H, H). ˜ ˜ But the latter is dense in ˜ H ⊗ H. Example. We have H ⊗ Cn = Hn . ˜ µ Example. Let (M, dµ) and (M , d˜) be two measure spaces. Then we have L ˜ µ ˜ 2 (M, dµ) ⊗ L2 (M , d˜ ) = L2 (M × M , dµ × d˜ ). µ Clearly we have L ˜ µ ˜ 2 (M, dµ) ⊗ L2 (M , d˜ ) ⊆ L2 (M × M , dµ × d˜ ). Now µ take an orthonormal basis ϕj ⊗ ϕk for L ˜ ˜ µ 2 (M, dµ) ⊗ L2 (M , d˜ ) as in our previous lemma. Then (ϕj (x)ϕk (y))∗ f (x, y)dµ(x)d˜(y) = 0 ˜ µ (1.41) M ˜ M
  • 59. 1.5. The C ∗ algebra of bounded linear operators 47 implies ϕj (x)∗ fk (x)dµ(x) = 0, fk (x) = ϕk (y)∗ f (x, y)d˜(y) ˜ µ (1.42) M ˜ M and hence fk (x) = 0 µ-a.e. x. But this implies f (x, y) = 0 for µ-a.e. x and ˜ ˜ ˜ µ-a.e. y and thus f = 0. Hence ϕj ⊗ ϕk is a basis for L2 (M × M , dµ × d˜) µ and equality follows. It is straightforward to extend the tensor product to any finite number of Hilbert spaces. We even note ∞ ∞ ( Hj ) ⊗ H = (Hj ⊗ H), (1.43) j=1 j=1 where equality has to be understood in the sense that both spaces are uni- tarily equivalent by virtue of the identification ∞ ∞ ( ψj ) ⊗ ψ = ψj ⊗ ψ. (1.44) j=1 j=1 ˜ ˜ Problem 1.11. Show that ψ ⊗ ψ = 0 if and only if ψ = 0 or ψ = 0. ˜ ˜ Problem 1.12. We have ψ ⊗ ψ = φ ⊗ φ = 0 if and only if there is some α ∈ C{0} such that ψ = αφ and ψ˜ = α−1 φ. ˜ Problem 1.13. Show (1.43) 1.5. The C ∗ algebra of bounded linear operators We start by introducing a conjugation for operators on a Hilbert space H. Let A ∈ L(H). Then the adjoint operator is defined via ϕ, A∗ ψ = Aϕ, ψ (1.45) (compare Corollary 1.9). Example. If H = Cn and A = (ajk )1≤j,k≤n , then A∗ = (a∗ )1≤j,k≤n . kj Lemma 1.11. Let A, B ∈ L(H). Then (i) (A + B)∗ = A∗ + B ∗ , (αA)∗ = α∗ A∗ , (ii) A∗∗ = A, (iii) (AB)∗ = B ∗ A∗ , (iv) A = A∗ and A 2 = A∗ A = AA∗ . Proof. (i) and (ii) are obvious. (iii) follows from ϕ, (AB)ψ = A∗ ϕ, Bψ = B ∗ A∗ ϕ, ψ . (iv) follows from A∗ = sup | ψ, A∗ ϕ | = sup | Aψ, ϕ | = A ϕ = ψ =1 ϕ = ψ =1
  • 60. 48 1. Hilbert spaces and A∗ A = sup | ϕ, A∗ Aψ | = sup | Aϕ, Aψ | ϕ = ψ =1 ϕ = ψ =1 2 = sup Aϕ = A 2, ϕ =1 where we have used ϕ = sup ψ =1 | ψ, ϕ |. As a consequence of A∗ = A observe that taking the adjoint is continuous. In general, a Banach algebra A together with an involution (a + b)∗ = a∗ + b∗ , (αa)∗ = α∗ a∗ , a∗∗ = a, (ab)∗ = b∗ a∗ (1.46) satisfying a 2 = a∗ a (1.47) is called a C ∗ algebra. The element a∗ is called the adjoint of a. Note that a∗ = a follows from (1.47) and aa∗ ≤ a a∗ . Any subalgebra which is also closed under involution is called a ∗- subalgebra. An ideal is a subspace I ⊆ A such that a ∈ I, b ∈ A imply ab ∈ I and ba ∈ I. If it is closed under the adjoint map, it is called a ∗-ideal. Note that if there is an identity e, we have e∗ = e and hence (a−1 )∗ = (a∗ )−1 (show this). Example. The continuous functions C(I) together with complex conjuga- tion form a commutative C ∗ algebra. An element a ∈ A is called normal if aa∗ = a∗ a, self-adjoint if a = a∗ , unitary if aa∗ = a∗ a = I, an (orthogonal) projection if a = a∗ = a2 , and positive if a = bb∗ for some b ∈ A. Clearly both self-adjoint and unitary elements are normal. Problem 1.14. Let A ∈ L(H). Show that A is normal if and only if Aψ = A∗ ψ , ∀ψ ∈ H. (Hint: Problem 0.14.) Problem 1.15. Show that U : H → H is unitary if and only if U −1 = U ∗ . Problem 1.16. Compute the adjoint of 2 2 S: (N) → (N), (a1 , a2 , a3 , . . . ) → (0, a1 , a2 , . . . ).
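For matrices the adjoint is simply the conjugate transpose, so Lemma 1.11 (iv) and the shift operator from Problems 1.1 and 1.16 can be explored numerically. The sketch below is a minimal illustration; the matrix size and the truncation of $\ell^2(\mathbb{N})$ to $\mathbb{C}^n$ are ad hoc, and the truncation distorts the products $S^*S$ and $SS^*$ in the last, respectively first, coordinate.

```python
import numpy as np

rng = np.random.default_rng(3)

def op_norm(A):
    return np.linalg.norm(A, 2)        # operator norm = largest singular value

A = rng.normal(size=(5, 5)) + 1j * rng.normal(size=(5, 5))
A_star = A.conj().T                    # <phi, A* psi> = <A phi, psi>

# Lemma 1.11 (iv): ||A|| = ||A*|| and ||A||^2 = ||A* A||
assert np.isclose(op_norm(A), op_norm(A_star))
assert np.isclose(op_norm(A) ** 2, op_norm(A_star @ A))

# truncated shift S: (a_1, a_2, ...) -> (0, a_1, a_2, ...) on C^n
n = 6
S = np.eye(n, k=-1)                    # ones on the first subdiagonal
S_star = S.conj().T                    # shifts the other way: (a_1, a_2, ...) -> (a_2, a_3, ...)

a = np.arange(1, n + 1, dtype=float)
print("S a  =", S @ a)                 # (0, 1, 2, 3, 4, 5)
print("S* a =", S_star @ a)            # (2, 3, 4, 5, 6, 0)
print("diag(S* S) =", np.diag(S_star @ S))   # (1,...,1,0): truncation of S* S = I on l^2(N)
print("diag(S S*) =", np.diag(S @ S_star))   # (0,1,...,1): S S* is only a projection, S is not unitary
```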
  • 61. 1.6. Weak and strong convergence 49 1.6. Weak and strong convergence Sometimes a weaker notion of convergence is useful: We say that ψn con- verges weakly to ψ and write w-lim ψn = ψ or ψn ψ (1.48) n→∞ if ϕ, ψn → ϕ, ψ for every ϕ ∈ H (show that a weak limit is unique). Example. Let ϕn be an (infinite) orthonormal set. Then ψ, ϕn → 0 for every ψ since these are just the expansion coefficients of ψ. (ϕn does not converge to 0, since ϕn = 1.) Clearly ψn → ψ implies ψn ψ and hence this notion of convergence is indeed weaker. Moreover, the weak limit is unique, since ϕ, ψn → ϕ, ψ ˜ ˜ and ϕ, ψn → ϕ, ψ imply ϕ, (ψ − ψ) = 0. A sequence ψn is called a weak Cauchy sequence if ϕ, ψn is Cauchy for every ϕ ∈ H. Lemma 1.12. Let H be a Hilbert space. (i) ψn ψ implies ψ ≤ lim inf ψn . (ii) Every weak Cauchy sequence ψn is bounded: ψn ≤ C. (iii) Every weak Cauchy sequence converges weakly. (iv) For a weakly convergent sequence ψn ψ we have ψn → ψ if and only if lim sup ψn ≤ ψ . Proof. (i) Observe 2 ψ = ψ, ψ = lim inf ψ, ψn ≤ ψ lim inf ψn . (ii) For every ϕ we have that | ϕ, ψn | ≤ C(ϕ) is bounded. Hence by the uniform boundedness principle we have ψn = ψn , . ≤ C. (iii) Let ϕm be an orthonormal basis and define cm = limn→∞ ϕm , ψn . Then ψ = m cm ϕm is the desired limit. (iv) By (i) we have lim ψn = ψ and hence 2 2 2 ψ − ψn = ψ − 2 Re( ψ, ψn ) + ψn → 0. The converse is straightforward. Clearly an orthonormal basis does not have a norm convergent subse- quence. Hence the unit ball in an infinite dimensional Hilbert space is never compact. However, we can at least extract weakly convergent subsequences: Lemma 1.13. Let H be a Hilbert space. Every bounded sequence ψn has a weakly convergent subsequence.
  • 62. 50 1. Hilbert spaces Proof. Let ϕk be an orthonormal basis. Then by the usual diagonal se- quence argument we can find a subsequence ψnm such that ϕk , ψnm con- verges for all k. Since ψn is bounded, ϕ, ψnm converges for every ϕ ∈ H and hence ψnm is a weak Cauchy sequence. Finally, let me remark that similar concepts can be introduced for oper- ators. This is of particular importance for the case of unbounded operators, where convergence in the operator norm makes no sense at all. A sequence of operators An is said to converge strongly to A, s-lim An = A :⇔ An ψ → Aψ ∀x ∈ D(A) ⊆ D(An ). (1.49) n→∞ It is said to converge weakly to A, w-lim An = A :⇔ An ψ Aψ ∀ψ ∈ D(A) ⊆ D(An ). (1.50) n→∞ Clearly norm convergence implies strong convergence and strong conver- gence implies weak convergence. Example. Consider the operator Sn ∈ L( 2 (N)) which shifts a sequence n places to the left, that is, Sn (x1 , x2 , . . . ) = (xn+1 , xn+2 , . . . ), (1.51) ∗ and the operator Sn ∈ L( 2 (N)) which shifts a sequence n places to the right and fills up the first n places with zeros, that is, ∗ Sn (x1 , x2 , . . . ) = (0, . . . , 0, x1 , x2 , . . . ). (1.52) n places Then Sn converges to zero strongly but not in norm (since Sn = 1) and ∗ ∗ Sn converges weakly to zero (since ϕ, Sn ψ = Sn ϕ, ψ ) but not strongly ∗ (since Sn ψ = ψ ) . Note that this example also shows that taking adjoints is not continuous s with respect to strong convergence! If An → A, we only have ϕ, A∗ ψ = An ϕ, ψ → Aϕ, ψ = ϕ, A∗ ψ n (1.53) and hence A∗ n A∗ in general. However, if An and A are normal, we have (An − A)∗ ψ = (An − A)ψ (1.54) s and hence A∗ → n A∗ in this case. Thus at least for normal operators taking adjoints is continuous with respect to strong convergence. Lemma 1.14. Suppose An is a sequence of bounded operators. (i) s-limn→∞ An = A implies A ≤ lim inf n→∞ An . (ii) Every strong Cauchy sequence An is bounded: An ≤ C.
  • 63. 1.7. Appendix: The Stone–Weierstraß theorem 51 (iii) If An ψ → Aψ for ψ in some dense set and An ≤ C, then s-limn→∞ An = A. The same result holds if strong convergence is replaced by weak convergence. Proof. (i) follows from Aψ = lim An ψ ≤ lim inf An n→∞ n→∞ for every ψ with ψ = 1. (ii) follows as in Lemma 1.12 (i). (iii) Just use An ψ − Aψ ≤ An ψ − An ϕ + An ϕ − Aϕ + Aϕ − Aψ ≤ 2C ψ − ϕ + An ϕ − Aϕ ε and choose ϕ in the dense subspace such that ψ − ϕ ≤ 4C and n large ε such that An ϕ − Aϕ ≤ 2 . The case of weak convergence is left as an exercise. (Hint: (2.14).) Problem 1.17. Suppose ψn → ψ and ϕn ϕ. Then ψn , ϕn → ψ, ϕ . Problem 1.18. Let {ϕj }∞ be some orthonormal basis. Show that ψn j=1 ψ if and only if ψn is bounded and ϕj , ψn → ϕj , ψ for every j. Show that this is wrong without the boundedness assumption. Problem 1.19. A subspace M ⊆ H is closed if and only if every weak Cauchy sequence in M has a limit in M . (Hint: M = M ⊥⊥ .) 1.7. Appendix: The Stone–Weierstraß theorem In case of a self-adjoint operator, the spectral theorem will show that the closed ∗-subalgebra generated by this operator is isomorphic to the C ∗ al- gebra of continuous functions C(K) over some compact set. Hence it is important to be able to identify dense sets: Theorem 1.15 (Stone–Weierstraß, real version). Suppose K is a compact set and let C(K, R) be the Banach algebra of continuous functions (with the sup norm). If F ⊂ C(K, R) contains the identity 1 and separates points (i.e., for every x1 = x2 there is some function f ∈ F such that f (x1 ) = f (x2 )), then the algebra generated by F is dense. Proof. Denote by A the algebra generated by F . Note that if f ∈ A, we have |f | ∈ A: By the Weierstraß approximation theorem (Theorem 0.15) 1 there is a polynomial pn (t) such that |t| − pn (t) n for t ∈ f (K) and hence pn (f ) → |f |.
  • 64. 52 1. Hilbert spaces In particular, if f, g are in A, we also have (f + g) + |f − g| (f + g) − |f − g| max{f, g} = , min{f, g} = 2 2 in A. Now fix f ∈ C(K, R). We need to find some fε ∈ A with f − fε ∞ ε. First of all, since A separates points, observe that for given y, z ∈ K there is a function fy,z ∈ A such that fy,z (y) = f (y) and fy,z (z) = f (z) (show this). Next, for every y ∈ K there is a neighborhood U (y) such that fy,z (x) f (x) − ε, x ∈ U (y), and since K is compact, finitely many, say U (y1 ), . . . , U (yj ), cover K. Then fz = max{fy1 ,z , . . . , fyj ,z } ∈ A and satisfies fz f − ε by construction. Since fz (z) = f (z) for every z ∈ K, there is a neighborhood V (z) such that fz (x) f (x) + ε, x ∈ V (z), and a corresponding finite cover V (z1 ), . . . , V (zk ). Now fε = min{fz1 , . . . , fzk } ∈ A satisfies fε f + ε. Since f − ε fzl fε , we have found a required function. Theorem 1.16 (Stone–Weierstraß). Suppose K is a compact set and let C(K) be the C ∗ algebra of continuous functions (with the sup norm). If F ⊂ C(K) contains the identity 1 and separates points, then the ∗- subalgebra generated by F is dense. ˜ Proof. Just observe that F = {Re(f ), Im(f )|f ∈ F } satisfies the assump- tion of the real version. Hence any real-valued continuous functions can be ˜ approximated by elements from F , in particular this holds for the real and imaginary parts for any given complex-valued function. Note that the additional requirement of being closed under complex conjugation is crucial: The functions holomorphic on the unit ball and con- tinuous on the boundary separate points, but they are not dense (since the uniform limit of holomorphic functions is again holomorphic). Corollary 1.17. Suppose K is a compact set and let C(K) be the C ∗ algebra of continuous functions (with the sup norm). If F ⊂ C(K) separates points, then the closure of the ∗-subalgebra gen- erated by F is either C(K) or {f ∈ C(K)|f (t0 ) = 0} for some t0 ∈ K.
  • 65. 1.7. Appendix: The Stone–Weierstraß theorem 53 Proof. There are two possibilities: either all f ∈ F vanish at one point t0 ∈ K (there can be at most one such point since F separates points) or there is no such point. If there is no such point, we can proceed as in the proof of the Stone–Weierstraß theorem to show that the identity can be approximated by elements in A (note that to show |f | ∈ A if f ∈ A, we do not need the identity, since pn can be chosen to contain no constant term). If there is such a t0 , the identity is clearly missing from A. However, adding the identity to A, we get A + C = C(K) and it is easy to see that A = {f ∈ C(K)|f (t0 ) = 0}. Problem 1.20. Show that the functions ϕn (x) = √1 einx , n ∈ Z, form an 2π orthonormal basis for H = L2 (0, 2π). Problem 1.21. Let k ∈ N and I ⊆ R. Show that the ∗-subalgebra generated by fz0 (t) = (t−z0 )k for one z0 ∈ C is dense in the C ∗ algebra C∞ (I) of 1 continuous functions vanishing at infinity • for I = R if z0 ∈ CR and k = 1, 2, • for I = [a, ∞) if z0 ∈ (−∞, a) and any k, • for I = (−∞, a] ∪ [b, ∞) if z0 ∈ (a, b) and k odd. (Hint: Add ∞ to R to make it compact.)
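The only analytic ingredient in the proof of Theorem 1.15 was a sequence of polynomials converging uniformly to $|t|$. One classical choice is the iteration $p_0 = 0$, $p_{n+1}(t) = p_n(t) + (t^2 - p_n(t)^2)/2$, which increases to $|t|$ uniformly on $[-1,1]$. The following minimal Python sketch (the grid and the number of iterations are arbitrary choices) evaluates the iteration on a grid to watch the uniform error shrink.

```python
import numpy as np

t = np.linspace(-1, 1, 2001)

# p_0 = 0, p_{n+1}(t) = p_n(t) + (t^2 - p_n(t)^2)/2; each p_n is a polynomial
p = np.zeros_like(t)
for n in range(1, 65):
    p = p + 0.5 * (t ** 2 - p ** 2)
    if n in (1, 4, 16, 64):
        err = np.max(np.abs(np.abs(t) - p))
        print(f"n = {n:3d}   sup | |t| - p_n(t) | = {err:.4f}")
```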
Chapter 2

Self-adjointness and spectrum

2.1. Some quantum mechanics

In quantum mechanics, a single particle living in $\mathbb{R}^3$ is described by a complex-valued function (the wave function)
$$\psi(x,t), \qquad (x,t) \in \mathbb{R}^3 \times \mathbb{R}, \tag{2.1}$$
where $x$ corresponds to a point in space and $t$ corresponds to time. The quantity $\rho_t(x) = |\psi(x,t)|^2$ is interpreted as the probability density of the particle at the time $t$. In particular, $\psi$ must be normalized according to
$$\int_{\mathbb{R}^3} |\psi(x,t)|^2\, d^3x = 1, \qquad t \in \mathbb{R}. \tag{2.2}$$
The location $x$ of the particle is a quantity which can be observed (i.e., measured) and is hence called observable. Due to our probabilistic interpretation, it is also a random variable whose expectation is given by
$$E_\psi(x) = \int_{\mathbb{R}^3} x\, |\psi(x,t)|^2\, d^3x. \tag{2.3}$$
In a real life setting, it will not be possible to measure $x$ directly and one will only be able to measure certain functions of $x$. For example, it is possible to check whether the particle is inside a certain area $\Omega$ of space (e.g., inside a detector). The corresponding observable is the characteristic function $\chi_\Omega(x)$ of this set. In particular, the number
$$E_\psi(\chi_\Omega) = \int_{\mathbb{R}^3} \chi_\Omega(x)\, |\psi(x,t)|^2\, d^3x = \int_\Omega |\psi(x,t)|^2\, d^3x \tag{2.4}$$
corresponds to the probability of finding the particle inside $\Omega \subseteq \mathbb{R}^3$. An important point to observe is that, in contradistinction to classical mechanics, the particle is no longer localized at a certain point. In particular, the mean-square deviation (or variance) $\Delta_\psi(x)^2 = E_\psi(x^2) - E_\psi(x)^2$ is always nonzero.

In general, the configuration space (or phase space) of a quantum system is a (complex) Hilbert space $\mathfrak{H}$ and the possible states of this system are represented by the elements $\psi$ having norm one, $\|\psi\| = 1$.

An observable $a$ corresponds to a linear operator $A$ in this Hilbert space and its expectation, if the system is in the state $\psi$, is given by the real number
$$E_\psi(A) = \langle\psi, A\psi\rangle = \langle A\psi, \psi\rangle, \tag{2.5}$$
where $\langle\cdot,\cdot\rangle$ denotes the scalar product of $\mathfrak{H}$. Similarly, the mean-square deviation is given by
$$\Delta_\psi(A)^2 = E_\psi(A^2) - E_\psi(A)^2 = \|(A - E_\psi(A))\psi\|^2. \tag{2.6}$$
Note that $\Delta_\psi(A)$ vanishes if and only if $\psi$ is an eigenstate corresponding to the eigenvalue $E_\psi(A)$; that is, $A\psi = E_\psi(A)\psi$.

From a physical point of view, (2.5) should make sense for any $\psi \in \mathfrak{H}$. However, this is not in the cards as our simple example of one particle already shows. In fact, the reader is invited to find a square integrable function $\psi(x)$ for which $x\psi(x)$ is no longer square integrable. The deeper reason behind this nuisance is that $E_\psi(x)$ can attain arbitrarily large values if the particle is not confined to a finite domain, which renders the corresponding operator unbounded. But unbounded operators cannot be defined on the entire Hilbert space in a natural way by the closed graph theorem (Theorem 2.8 below). Hence, $A$ will only be defined on a subset $D(A) \subseteq \mathfrak{H}$ called the domain of $A$. Since we want $A$ to be defined for at least most states, we require $D(A)$ to be dense. However, it should be noted that there is no general prescription for how to find the operator corresponding to a given observable.

Now let us turn to the time evolution of such a quantum mechanical system. Given an initial state $\psi(0)$ of the system, there should be a unique $\psi(t)$ representing the state of the system at time $t \in \mathbb{R}$. We will write
$$\psi(t) = U(t)\psi(0). \tag{2.7}$$
Moreover, it follows from physical experiments that superposition of states holds; that is, $U(t)(\alpha_1\psi_1(0) + \alpha_2\psi_2(0)) = \alpha_1\psi_1(t) + \alpha_2\psi_2(t)$ ($|\alpha_1|^2 + |\alpha_2|^2 = 1$). In other words, $U(t)$ should be a linear operator. Moreover, since $\psi(t)$
  • 69. 2.1. Some quantum mechanics 57 is a state (i.e., ψ(t) = 1), we have U (t)ψ = ψ . (2.8) Such operators are called unitary. Next, since we have assumed uniqueness of solutions to the initial value problem, we must have U (0) = I, U (t + s) = U (t)U (s). (2.9) A family of unitary operators U (t) having this property is called a one- parameter unitary group. In addition, it is natural to assume that this group is strongly continuous; that is, lim U (t)ψ = U (t0 )ψ, ψ ∈ H. (2.10) t→t0 Each such group has an infinitesimal generator defined by i 1 Hψ = lim (U (t)ψ − ψ), D(H) = {ψ ∈ H| lim (U (t)ψ − ψ) exists}. t→0 t t→0 t (2.11) This operator is called the Hamiltonian and corresponds to the energy of the system. If ψ(0) ∈ D(H), then ψ(t) is a solution of the Schr¨dinger o equation (in suitable units) d i ψ(t) = Hψ(t). (2.12) dt This equation will be the main subject of our course. In summary, we have the following axioms of quantum mechanics. Axiom 1. The configuration space of a quantum system is a complex separable Hilbert space H and the possible states of this system are repre- sented by the elements of H which have norm one. Axiom 2. Each observable a corresponds to a linear operator A defined maximally on a dense subset D(A). Moreover, the operator correspond- ing to a polynomial Pn (a) = n αj aj , αj ∈ R, is Pn (A) = n αj Aj , j=0 j=0 D(Pn (A)) = D(An ) = {ψ ∈ D(A)|Aψ ∈ D(An−1 )} (A0 = I). Axiom 3. The expectation value for a measurement of a, when the system is in the state ψ ∈ D(A), is given by (2.5), which must be real for all ψ ∈ D(A). Axiom 4. The time evolution is given by a strongly continuous one- parameter unitary group U (t). The generator of this group corresponds to the energy of the system. In the following sections we will try to draw some mathematical conse- quences from these assumptions: First we will see that Axioms 2 and 3 imply that observables corre- spond to self-adjoint operators. Hence these operators play a central role
  • 70. 58 2. Self-adjointness and spectrum in quantum mechanics and we will derive some of their basic properties. Another crucial role is played by the set of all possible expectation values for the measurement of a, which is connected with the spectrum σ(A) of the corresponding operator A. The problem of defining functions of an observable will lead us to the spectral theorem (in the next chapter), which generalizes the diagonalization of symmetric matrices. Axiom 4 will be the topic of Chapter 5. 2.2. Self-adjoint operators Let H be a (complex separable) Hilbert space. A linear operator is a linear mapping A : D(A) → H, (2.13) where D(A) is a linear subspace of H, called the domain of A. It is called bounded if the operator norm A = sup Aψ = sup | ψ, Aϕ | (2.14) ψ =1 ϕ = ψ =1 is finite. The second equality follows since equality in | ψ, Aϕ | ≤ ψ Aϕ is attained when Aϕ = zψ for some z ∈ C. If A is bounded, it is no restriction to assume D(A) = H and we will always do so. The Banach space of all bounded linear operators is denoted by L(H). Products of (unbounded) operators are defined naturally; that is, ABψ = A(Bψ) for ψ ∈ D(AB) = {ψ ∈ D(B)|Bψ ∈ D(A)}. The expression ψ, Aψ encountered in the previous section is called the quadratic form, qA (ψ) = ψ, Aψ , ψ ∈ D(A), (2.15) associated to A. An operator can be reconstructed from its quadratic form via the polarization identity 1 ϕ, Aψ = (qA (ϕ + ψ) − qA (ϕ − ψ) + iqA (ϕ − iψ) − iqA (ϕ + iψ)) . (2.16) 4 A densely defined linear operator A is called symmetric (or hermitian) if ϕ, Aψ = Aϕ, ψ , ψ, ϕ ∈ D(A). (2.17) The justification for this definition is provided by the following Lemma 2.1. A densely defined operator A is symmetric if and only if the corresponding quadratic form is real-valued.
  • 71. 2.2. Self-adjoint operators 59 Proof. Clearly (2.17) implies that Im(qA (ψ)) = 0. Conversely, taking the imaginary part of the identity qA (ψ + iϕ) = qA (ψ) + qA (ϕ) + i( ψ, Aϕ − ϕ, Aψ ) shows Re Aϕ, ψ = Re ϕ, Aψ . Replacing ϕ by iϕ in this last equation shows Im Aϕ, ψ = Im ϕ, Aψ and finishes the proof. In other words, a densely defined operator A is symmetric if and only if ψ, Aψ = Aψ, ψ , ψ ∈ D(A). (2.18) This already narrows the class of admissible operators to the class of symmetric operators by Axiom 3. Next, let us tackle the issue of the correct domain. ˜ By Axiom 2, A should be defined maximally; that is, if A is another symmetric operator such that A ⊆ A, ˜ then A = A. Here we write A ⊆ A ˜ ˜ ˜ ˜ ˜ if D(A) ⊆ D(A) and Aψ = Aψ for all ψ ∈ D(A). The operator A is called an extension of A in this case. In addition, we write A = A˜ if both A ⊆ A ˜ ˜ and A ⊆ A hold. The adjoint operator A∗ of a densely defined linear operator A is defined by ˜ ˜ D(A∗ ) = {ψ ∈ H|∃ψ ∈ H : ψ, Aϕ = ψ, ϕ , ∀ϕ ∈ D(A)}, ˜ ∗ ψ = ψ. (2.19) A The requirement that D(A) be dense implies that A∗ is well-defined. How- ever, note that D(A∗ ) might not be dense in general. In fact, it might contain no vectors other than 0. Clearly we have (αA)∗ = α∗ A∗ for α ∈ C and (A + B)∗ ⊇ A∗ + B ∗ provided D(A + B) = D(A) ∩ D(B) is dense. However, equality will not hold in general unless one operator is bounded (Problem 2.2). For later use, note that (Problem 2.4) Ker(A∗ ) = Ran(A)⊥ . (2.20) For symmetric operators we clearly have A ⊆ A∗ . If, in addition, A = A∗ holds, then A is called self-adjoint. Our goal is to show that observables correspond to self-adjoint operators. This is for example true in the case of the position operator x, which is a special case of a multiplication operator. Example. (Multiplication operator) Consider the multiplication operator (Af )(x) = A(x)f (x), D(A) = {f ∈ L2 (Rn , dµ) | Af ∈ L2 (Rn , dµ)} (2.21) given by multiplication with the measurable function A : Rn → C. First of all note that D(A) is dense. In fact, consider Ωn = {x ∈ Rn | |A(x)| ≤
  • 72. 60 2. Self-adjointness and spectrum n} Rn . Then, for every f ∈ L2 (Rn , dµ) the function fn = χΩn f ∈ D(A) converges to f as n → ∞ by dominated convergence. Next, let us compute the adjoint of A. Performing a formal computation, we have for h, f ∈ D(A) that h, Af = h(x)∗ A(x)f (x)dµ(x) = (A(x)∗ h(x))∗ f (x)dµ(x) = Ah, f , ˜ (2.22) ˜ where A is multiplication by A(x)∗ , (Af )(x) = A(x)∗ f (x), ˜ ˜ ˜ D(A) = {f ∈ L2 (Rn , dµ) | Af ∈ L2 (Rn , dµ)}. (2.23) Note D(A)˜ = D(A). At first sight this seems to show that the adjoint of ˜ A is A. But for our calculation we had to assume h ∈ D(A) and there might be some functions in D(A∗ ) which do not satisfy this requirement! In ˜ particular, our calculation only shows A ⊆ A∗ . To show that equality holds, we need to work a little harder: If h ∈ D(A∗ ), there is some g ∈ L2 (Rn , dµ) such that h(x)∗ A(x)f (x)dµ(x) = g(x)∗ f (x)dµ(x), f ∈ D(A), (2.24) and thus (h(x)A(x)∗ − g(x))∗ f (x)dµ(x) = 0, f ∈ D(A). (2.25) In particular, χΩn (x)(h(x)A(x)∗ − g(x))∗ f (x)dµ(x) = 0, f ∈ L2 (Rn , dµ), (2.26) which shows that χΩn (h(x)A(x)∗ − g(x))∗ ∈ L2 (Rn , dµ) vanishes. Since n is arbitrary, we even have h(x)A(x)∗ = g(x) ∈ L2 (Rn , dµ) and thus A∗ is multiplication by A(x)∗ and D(A∗ ) = D(A). In particular, A is self-adjoint if A is real-valued. In the general case we have at least Af = A∗ f for all f ∈ D(A) = D(A∗ ). Such operators are called normal. Now note that A⊆B ⇒ B ∗ ⊆ A∗ ; (2.27) that is, increasing the domain of A implies decreasing the domain of A∗ . Thus there is no point in trying to extend the domain of a self-adjoint operator further. In fact, if A is self-adjoint and B is a symmetric extension, we infer A ⊆ B ⊆ B ∗ ⊆ A∗ = A implying A = B. Corollary 2.2. Self-adjoint operators are maximal; that is, they do not have any symmetric extensions.
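A finite-dimensional caricature of the multiplication operator, in which all domain questions disappear (and which is therefore only a caricature of the point being made here), is a diagonal matrix. The following minimal Python sketch (sizes and random symbols are arbitrary choices) checks that the adjoint is multiplication by the complex conjugate symbol, that a real symbol gives a self-adjoint operator, and that a complex symbol gives a normal one; self-adjointness is also visible in the quadratic form being real, as in Lemma 2.1.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 5

# "multiplication operator" on C^n: (A f)(x_j) = A(x_j) f(x_j)
a_real = rng.normal(size=n)                        # real-valued symbol
a_cplx = a_real + 1j * rng.normal(size=n)          # complex-valued symbol

A_real, A_cplx = np.diag(a_real), np.diag(a_cplx)

# adjoint = multiplication by the complex conjugate symbol
assert np.allclose(A_cplx.conj().T, np.diag(a_cplx.conj()))

# real symbol -> self-adjoint; complex symbol -> normal but not self-adjoint
assert np.allclose(A_real, A_real.conj().T)
assert np.allclose(A_cplx @ A_cplx.conj().T, A_cplx.conj().T @ A_cplx)
assert not np.allclose(A_cplx, A_cplx.conj().T)

# the quadratic form of the self-adjoint operator is real
f = rng.normal(size=n) + 1j * rng.normal(size=n)
q = np.vdot(f, A_real @ f)
print("quadratic form of the real multiplication operator:", q)   # imaginary part ~ 0
```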
  • 73. 2.2. Self-adjoint operators 61 Furthermore, if A∗ is densely defined (which is the case if A is symmet- ric), we can consider A∗∗ . From the definition (2.19) it is clear that A ⊆ A∗∗ and thus A∗∗ is an extension of A. This extension is closely related to ex- tending a linear subspace M via M ⊥⊥ = M (as we will see a bit later) and thus is called the closure A = A∗∗ of A. If A is symmetric, we have A ⊆ A∗ and hence A = A∗∗ ⊆ A∗ ; that is, A lies between A and A∗ . Moreover, ψ, A∗ ϕ = Aψ, ϕ for all ψ ∈ D(A), ϕ ∈ D(A∗ ) implies that A is symmetric since A∗ ϕ = Aϕ for ϕ ∈ D(A). Example. (Differential operator) Take H = L2 (0, 2π). (i) Consider the operator d A0 f = −i f, D(A0 ) = {f ∈ C 1 [0, 2π] | f (0) = f (2π) = 0}. (2.28) dx That A0 is symmetric can be shown by a simple integration by parts (do this). Note that the boundary conditions f (0) = f (2π) = 0 are chosen such that the boundary terms occurring from integration by parts vanish. However, this will also follow once we have computed A∗ . If g ∈ D(A∗ ), we 0 0 must have 2π 2π g(x)∗ (−if (x))dx = g (x)∗ f (x)dx ˜ (2.29) 0 0 for some g ∈ L2 (0, 2π). Integration by parts (cf. (2.116)) shows ˜ 2π x ∗ f (x) g(x) − i g (t)dt ˜ dx = 0. (2.30) 0 0 In fact, this formula holds for g ∈ C[0, 2π]. Since the set of continuous ˜ functions is dense, the general case g ∈ L2 (0, 2π) follows by approximating ˜ g with continuous functions and taking limits on both sides using dominated ˜ convergence. x Hence g(x) − i 0 g (t)dt ∈ {f |f ∈ D(A0 )}⊥ . But {f |f ∈ D(A0 )} = ˜ 2π x {h ∈ C[0, 2π]| 0 h(t)dt = 0} (show this) implying g(x) = g(0) + i 0 g (t)dt ˜ since {f |f ∈ D(A0 )} = {h ∈ H| 1, h = 0} = {1}⊥ and {1}⊥⊥ = span{1}. Thus g ∈ AC[0, 2π], where x AC[a, b] = {f ∈ C[a, b]|f (x) = f (a) + g(t)dt, g ∈ L1 (a, b)} (2.31) a denotes the set of all absolutely continuous functions (see Section 2.7). In summary, g ∈ D(A∗ ) implies g ∈ AC[0, 2π] and A∗ g = g = −ig . Conversely, 0 0 ˜ for every g ∈ H 1 (0, 2π) = {f ∈ AC[0, 2π]|f ∈ L2 (0, 2π)}, (2.29) holds with g = −ig and we conclude ˜ d A∗ f = −i 0 f, D(A∗ ) = H 1 (0, 2π). 0 (2.32) dx
  • 74. 62 2. Self-adjointness and spectrum In particular, A0 is symmetric but not self-adjoint. Since A0 = A∗∗ ⊆ A∗ , 0 0 we can use integration by parts to compute 0 = g, A0 f − A∗ g, f = i(f (0)g(0)∗ − f (2π)g(2π)∗ ) 0 (2.33) and since the boundary values of g ∈ D(A∗ ) can be prescribed arbitrarily, 0 we must have f (0) = f (2π) = 0. Thus d A0 f = −i f, D(A0 ) = {f ∈ D(A∗ ) | f (0) = f (2π) = 0}. 0 (2.34) dx (ii) Now let us take d Af = −i f, D(A) = {f ∈ C 1 [0, 2π] | f (0) = f (2π)}, (2.35) dx which is clearly an extension of A0 . Thus A∗ ⊆ A∗ and we compute 0 0 = g, Af − A∗ g, f = if (0)(g(0)∗ − g(2π)∗ ). (2.36) Since this must hold for all f ∈ D(A), we conclude g(0) = g(2π) and d A∗ f = −i f, D(A∗ ) = {f ∈ H 1 (0, 2π) | f (0) = f (2π)}. (2.37) dx Similarly, as before, A = A∗ and thus A is self-adjoint. One might suspect that there is no big difference between the two sym- metric operators A0 and A from the previous example, since they coincide on a dense set of vectors. However, the converse is true: For example, the first operator A0 has no eigenvectors at all (i.e., solutions of the equation A0 ψ = zψ, z ∈ C) whereas the second one has an orthonormal basis of eigenvectors! Example. Compute the eigenvectors of A0 and A from the previous exam- ple. (i) By definition, an eigenvector is a (nonzero) solution of A0 u = zu, z ∈ C, that is, a solution of the ordinary differential equation − i u (x) = zu(x) (2.38) satisfying the boundary conditions u(0) = u(2π) = 0 (since we must have u ∈ D(A0 )). The general solution of the differential equation is u(x) = u(0)eizx and the boundary conditions imply u(x) = 0. Hence there are no eigenvectors. (ii) Now we look for solutions of Au = zu, that is, the same differential equation as before, but now subject to the boundary condition u(0) = u(2π). Again the general solution is u(x) = u(0)eizx and the boundary condition requires u(0) = u(0)e2πiz . Thus there are two possibilities. Either u(0) = 0
  • 75. 2.2. Self-adjoint operators 63 (which is of no use for us) or z ∈ Z. In particular, we see that all eigenvectors are given by 1 un (x) = √ einx , n ∈ Z, (2.39) 2π which are well known to form an orthonormal basis. We will see a bit later that this is a consequence of self-adjointness of A. Hence it will be important to know whether a given operator is self- adjoint or not. Our example shows that symmetry is easy to check (in case of differential operators it usually boils down to integration by parts), but computing the adjoint of an operator is a nontrivial job even in simple situ- ations. However, we will learn soon that self-adjointness is a much stronger property than symmetry, justifying the additional effort needed to prove it. On the other hand, if a given symmetric operator A turns out not to be self-adjoint, this raises the question of self-adjoint extensions. Two cases need to be distinguished. If A is self-adjoint, then there is only one self- adjoint extension (if B is another one, we have A ⊆ B and hence A = B by Corollary 2.2). In this case A is called essentially self-adjoint and D(A) is called a core for A. Otherwise there might be more than one self- adjoint extension or none at all. This situation is more delicate and will be investigated in Section 2.6. Since we have seen that computing A∗ is not always easy, a criterion for self-adjointness not involving A∗ will be useful. Lemma 2.3. Let A be symmetric such that Ran(A + z) = Ran(A + z ∗ ) = H for one z ∈ C. Then A is self-adjoint. ˜ Proof. Let ψ ∈ D(A∗ ) and A∗ ψ = ψ. Since Ran(A + z ∗ ) = H, there is a ϑ ∈ D(A) such that (A + z ˜ ∗ )ϑ = ψ + z ∗ ψ. Now we compute ψ, (A + z)ϕ = ψ + z ∗ ψ, ϕ = (A + z ∗ )ϑ, ϕ = ϑ, (A + z)ϕ , ˜ ϕ ∈ D(A), and hence ψ = ϑ ∈ D(A) since Ran(A + z) = H. To proceed further, we will need more information on the closure of an operator. We will use a different approach which avoids the use of the adjoint operator. We will establish equivalence with our original definition in Lemma 2.4. The simplest way of extending an operator A is to take the closure of its graph Γ(A) = {(ψ, Aψ)|ψ ∈ D(A)} ⊂ H2 . That is, if (ψn , Aψn ) → (ψ, ψ), ˜ we might try to define Aψ = ψ. ˜ For Aψ to be well-defined, we need that ˜ ˜ (ψn , Aψn ) → (0, ψ) implies ψ = 0. In this case A is called closable and the unique operator A which satisfies Γ(A) = Γ(A) is called the closure of A. Clearly, A is called closed if A = A, which is the case if and only if the
  • 76. 64 2. Self-adjointness and spectrum graph of A is closed. Equivalently, A is closed if and only if Γ(A) equipped with the graph norm (ψ, Aψ) 2 Γ(A) = ψ 2 + Aψ 2 is a Hilbert space (i.e., closed). By construction, A is the smallest closed extension of A. Example. Suppose A is bounded. Then the closure was already computed in Theorem 0.26. In particular, D(A) = D(A) and a bounded operator is closed if and only if its domain is closed. Example. Consider again the differential operator A0 from (2.28) and let us compute the closure without the use of the adjoint operator. Let f ∈ D(A0 ) and let fn ∈ D(A0 ) be a sequence such that fn → f , x A0 fn → −ig. Then fn → g and hence f (x) = 0 g(t)dt. Thus f ∈ AC[0, 2π] 2π and f (0) = 0. Moreover f (2π) = limn→0 0 fn (t)dt = 0. Conversely, any such f can be approximated by functions in D(A0 ) (show this). Example. Consider again the multiplication operator by A(x) in L2 (Rn , dµ) but now defined on functions with compact support, that is, D(A0 ) = {f ∈ D(A) | supp(f ) is compact}. (2.40) Then its closure is given by A0 = A. In particular, A0 is essentially self- adjoint and D(A0 ) is a core for A. To prove A0 = A, let some f ∈ D(A) be given and consider fn = χ{x| |x|≤n} f . Then fn ∈ D(A0 ) and fn (x) → f (x) as well as A(x)fn (x) → A(x)f (x) in L2 (Rn , dµ) by dominated convergence. Thus D(A0 ) ⊆ D(A) and since A is closed, we even get equality. Example. Consider the multiplication A(x) = x in L2 (R) defined on D(A0 ) = {f ∈ D(A) | f (x)dx = 0}. (2.41) R Then A0 is closed. Hence D(A0 ) is not a core for A. To show that A0 is closed, suppose there is a sequence fn (x) → f (x) such that xfn (x) → g(x). Since A is closed, we necessarily have f ∈ D(A) and g(x) = xf (x). But then 1 0 = lim fn (x)dx = lim (fn (x) + sign(x)xfn (x))dx n→∞ R n→∞ R 1 + |x| 1 = (f (x) + sign(x)g(x))dx = f (x)dx (2.42) R 1 + |x| R which shows f ∈ D(A0 ). Next, let us collect a few important results.
  • 77. 2.2. Self-adjoint operators 65 Lemma 2.4. Suppose A is a densely defined operator. (i) A∗ is closed. (ii) A is closable if and only if D(A∗ ) is dense and A = A∗∗ , respec- tively, (A)∗ = A∗ , in this case. (iii) If A is injective and Ran(A) is dense, then (A∗ )−1 = (A−1 )∗ . If −1 A is closable and A is injective, then A = A−1 . Proof. Let us consider the following two unitary operators from H2 to itself U (ϕ, ψ) = (ψ, −ϕ), V (ϕ, ψ) = (ψ, ϕ). (i) From Γ(A∗ ) = {(ϕ, ϕ) ∈ H2 | ϕ, Aψ = ϕ, ψ , ∀ψ ∈ D(A)} ˜ ˜ ˜ ˜ ˜ = {(ϕ, ϕ) ∈ H2 | (ϕ, ϕ), (ψ, −ψ) H2 = 0, ∀(ψ, ψ) ∈ Γ(A)} ˜ = (U Γ(A))⊥ (2.43) we conclude that A∗ is closed. (ii) Similarly, using U Γ⊥ = (U Γ)⊥ (Problem 1.4), by Γ(A) = Γ(A)⊥⊥ = (U Γ(A∗ ))⊥ = {(ψ, ψ)| ψ, A∗ ϕ − ψ, ϕ = 0, ∀ϕ ∈ D(A∗ )} ˜ ˜ ˜ ˜ we see that (0, ψ) ∈ Γ(A) if and only if ψ ∈ D(A∗ )⊥ . Hence A is closable if and only if D(A ∗ ) is dense. In this case, equation (2.43) also shows A∗ = A∗ . Moreover, replacing A by A∗ in (2.43) and comparing with the last formula shows A∗∗ = A. (iii) Next note that (provided A is injective) Γ(A−1 ) = V Γ(A). Hence if Ran(A) is dense, then Ker(A∗ ) = Ran(A)⊥ = {0} and Γ((A∗ )−1 ) = V Γ(A∗ ) = V U Γ(A)⊥ = U V Γ(A)⊥ = U (V Γ(A))⊥ shows that (A∗ )−1 = (A−1 )∗ . Similarly, if A is closable and A is injective, −1 then A = A−1 by −1 Γ(A ) = V Γ(A) = V Γ(A) = Γ(A−1 ). Corollary 2.5. If A is self-adjoint and injective, then A−1 is also self- adjoint. Proof. Equation (2.20) in the case A = A∗ implies Ran(A)⊥ = Ker(A) = {0} and hence (iii) is applicable.
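Lemma 2.4 (iii) and Corollary 2.5 also have an elementary matrix analogue which can serve as a sanity check; the sketch below (matrix size, shift, and random data are ad hoc choices) verifies $(A^{-1})^* = (A^*)^{-1}$ and that the inverse of an invertible Hermitian matrix is again Hermitian.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4

A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))

# Lemma 2.4 (iii): (A*)^{-1} = (A^{-1})*
assert np.allclose(np.linalg.inv(A.conj().T), np.linalg.inv(A).conj().T)

# Corollary 2.5: a self-adjoint invertible matrix has a self-adjoint inverse
H = A + A.conj().T                                           # Hermitian
H = H + (1 + abs(np.linalg.eigvalsh(H)).max()) * np.eye(n)   # shift to make it invertible
H_inv = np.linalg.inv(H)
assert np.allclose(H_inv, H_inv.conj().T)
print("adjoint/inverse identities verified")
```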
  • 78. 66 2. Self-adjointness and spectrum If A is densely defined and bounded, we clearly have D(A∗ ) = H and by Corollary 1.9, A∗ ∈ L(H). In particular, since A = A∗∗ , we obtain Theorem 2.6. We have A ∈ L(H) if and only if A∗ ∈ L(H). Now we can also generalize Lemma 2.3 to the case of essential self-adjoint operators. Lemma 2.7. A symmetric operator A is essentially self-adjoint if and only if one of the following conditions holds for one z ∈ CR: • Ran(A + z) = Ran(A + z ∗ ) = H, • Ker(A∗ + z) = Ker(A∗ + z ∗ ) = {0}. If A is nonnegative, that is, ψ, Aψ ≥ 0 for all ψ ∈ D(A), we can also admit z ∈ (−∞, 0). Proof. First of all note that by (2.20) the two conditions are equivalent. By taking the closure of A, it is no restriction to assume that A is closed. Let z = x + iy. From 2 2 (A + z)ψ = (A + x)ψ + iyψ 2 = (A + x)ψ + y2 ψ 2 ≥ y2 ψ 2, (2.44) we infer that Ker(A+z) = {0} and hence (A+z)−1 exists. Moreover, setting ψ = (A + z)−1 ϕ (y = 0) shows (A + z)−1 ≤ |y|−1 . Hence (A + z)−1 is bounded and closed. Since it is densely defined by assumption, its domain Ran(A + z) must be equal to H. Replacing z by z ∗ , we see Ran(A + z ∗ ) = H and applying Lemma 2.3 shows that A is self-adjoint. Conversely, if A = A∗ , the above calculation shows Ker(A∗ + z) = {0}, which finishes the case z ∈ CR. The argument for the nonnegative case with z 0 is similar using ε ψ 2 ≤ ψ, (A + ε)ψ ≤ ψ (A + ε)ψ which shows (A + ε)−1 ≤ ε−1 , ε 0. In addition, we can also prove the closed graph theorem which shows that an unbounded closed operator cannot be defined on the entire Hilbert space. Theorem 2.8 (Closed graph). Let H1 and H2 be two Hilbert spaces and A : H1 → H2 an operator defined on all of H1 . Then A is bounded if and only if Γ(A) is closed. Proof. If A is bounded, then it is easy to see that Γ(A) is closed. So let us assume that Γ(A) is closed. Then A∗ is well-defined and for all unit vectors
  • 79. 2.3. Quadratic forms and the Friedrichs extension 67 ϕ ∈ D(A∗ ) we have that the linear functional ϕ (ψ) = A∗ ϕ, ψ is pointwise bounded, that is, ϕ (ψ) = | ϕ, Aψ | ≤ Aψ . Hence by the uniform boundedness principle there is a constant C such that ∗ ∗ ∗∗ ϕ = A ϕ ≤ C. That is, A is bounded and so is A = A . Note that since symmetric operators are closable, they are automatically closed if they are defined on the entire Hilbert space. Theorem 2.9 (Hellinger-Toeplitz). A symmetric operator defined on the entire Hilbert space is bounded. Problem 2.1 (Jacobi operator). Let a and b be some real-valued sequences in ∞ (Z). Consider the operator 2 Jfn = an fn+1 + an−1 fn−1 + bn fn , f∈ (Z). Show that J is a bounded self-adjoint operator. Problem 2.2. Show that (αA)∗ = α∗ A∗ and (A + B)∗ ⊇ A∗ + B ∗ (where D(A∗ + B ∗ ) = D(A∗ ) ∩ D(B ∗ )) with equality if one operator is bounded. Give an example where equality does not hold. Problem 2.3. Suppose AB is densely defined. Show that (AB)∗ ⊇ B ∗ A∗ . Moreover, if B is bounded, then (BA)∗ = A∗ B ∗ . Problem 2.4. Show (2.20). Problem 2.5. An operator is called normal if Aψ = A∗ ψ for all ψ ∈ D(A) = D(A∗ ). Show that if A is normal, so is A + z for any z ∈ C. Problem 2.6. Show that normal operators are closed. (Hint: A∗ is closed.) Problem 2.7. Show that a bounded operator A is normal if and only if AA∗ = A∗ A. Problem 2.8. Show that the kernel of a closed operator is closed. Problem 2.9. Show that if A is closed and B bounded, then AB is closed. 2.3. Quadratic forms and the Friedrichs extension Finally we want to draw some further consequences of Axiom 2 and show that observables correspond to self-adjoint operators. Since self-adjoint op- erators are already maximal, the difficult part remaining is to show that an observable has at least one self-adjoint extension. There is a good way of doing this for nonnegative operators and hence we will consider this case first.
  • 80. 68 2. Self-adjointness and spectrum An operator is called nonnegative (resp. positive) if ψ, Aψ ≥ 0 (resp. 0 for ψ = 0) for all ψ ∈ D(A). If A is positive, the map (ϕ, ψ) → ϕ, Aψ is a scalar product. However, there might be sequences which are Cauchy with respect to this scalar product but not with respect to our original one. To avoid this, we introduce the scalar product ϕ, ψ A = ϕ, (A + 1)ψ , A ≥ 0, (2.45) defined on D(A), which satisfies ψ ≤ ψ A . Let HA be the completion of D(A) with respect to the above scalar product. We claim that HA can be regarded as a subspace of H; that is, D(A) ⊆ HA ⊆ H. If (ψn ) is a Cauchy sequence in D(A), then it is also Cauchy in H (since ψ ≤ ψ A by assumption) and hence we can identify the limit in HA with the limit of (ψn ) regarded as a sequence in H. For this identification to be unique, we need to show that if (ψn ) ⊂ D(A) is a Cauchy sequence in HA such that ψn → 0, then ψn A → 0. This follows from 2 ψn A = ψn , ψn − ψm A + ψn , ψm A ≤ ψn A ψn − ψm A + ψn (A + 1)ψm (2.46) since the right-hand side can be made arbitrarily small choosing m, n large. Clearly the quadratic form qA can be extended to every ψ ∈ HA by setting qA (ψ) = ψ, ψ A − ψ 2 , ψ ∈ Q(A) = HA . (2.47) The set Q(A) is also called the form domain of A. Example. (Multiplication operator) Let A be multiplication by A(x) ≥ 0 in L2 (Rn , dµ). Then Q(A) = D(A1/2 ) = {f ∈ L2 (Rn , dµ) | A1/2 f ∈ L2 (Rn , dµ)} (2.48) and qA (x) = A(x)|f (x)|2 dµ(x) (2.49) Rn (show this). Now we come to our extension result. Note that A + 1 is injective and ˜ the best we can hope for is that for a nonnegative extension A, the operator ˜ ˜ A + 1 is a bijection from D(A) onto H. Lemma 2.10. Suppose A is a nonnegative operator. Then there is a non- ˜ ˜ negative extension A such that Ran(A + 1) = H. ˜ Proof. Let us define an operator A by ˜ ˜ D(A) = {ψ ∈ HA |∃ψ ∈ H : ϕ, ψ A ˜ = ϕ, ψ , ∀ϕ ∈ HA }, ˜ ˜ Aψ = ψ − ψ.
  • 81. 2.3. Quadratic forms and the Friedrichs extension 69 ˜ Since HA is dense, ψ is well-defined. Moreover, it is straightforward to see ˜ that A is a nonnegative extension of A. ˜ ˜ It is also not hard to see that Ran(A + 1) = H. Indeed, for any ψ ∈ H, ϕ → ψ, ˜ ϕ is a bounded linear functional on HA . Hence there is an element ˜ ψ ∈ HA such that ψ, ϕ = ψ, ϕ A for all ϕ ∈ HA . By the definition of A, ˜ (A ˜ ˜ + 1)ψ = ψ and hence A + 1 is onto. ˜ Example. Let us take H = L2 (0, π) and consider the operator d2 Af = − f, D(A) = {f ∈ C 2 [0, π] | f (0) = f (π) = 0}, (2.50) dx2 which corresponds to the one-dimensional model of a particle confined to a box. (i) First of all, using integration by parts twice, it is straightforward to check that A is symmetric: π π π g(x)∗ (−f )(x)dx = g (x)∗ f (x)dx = (−g )(x)∗ f (x)dx. (2.51) 0 0 0 Note that the boundary conditions f (0) = f (π) = 0 are chosen such that the boundary terms occurring from integration by parts vanish. Moreover, the same calculation also shows that A is positive: π π f (x)∗ (−f )(x)dx = |f (x)|2 dx 0, f = 0. (2.52) 0 0 (ii) Next let us show HA = {f ∈ H 1 (0, π) | f (0) = f (π) = 0}. In fact, since π g, f A = g (x)∗ f (x) + g(x)∗ f (x) dx, (2.53) 0 we see that fn is Cauchy in HA if and only if both fn and fn are Cauchy x in L2 (0, π). Thus fn → f and fn → g in L2 (0, π) and fn (x) = 0 fn (t)dt x implies f (x) = 0 g(t)dt. Thus f ∈ AC[0, π]. Moreover, f (0) = 0 is obvious π π and from 0 = fn (π) = 0 fn (t)dt we have f (π) = limn→∞ 0 fn (t)dt = 0. So we have HA ⊆ {f ∈ H 1 (0, π) | f (0) = f (π) = 0}. To see the converse, 1 π approximate f by smooth functions gn . Using gn − π 0 gn (t)dt instead π of gn , it is no restriction to assume 0 gn (t)dt = 0. Now define fn (x) = x 0 gn (t)dt and note fn ∈ D(A) → f . ˜ ˜ (iii) Finally, let us compute the extension A. We have f ∈ D(A) if for all g ∈ HA there is an f ˜ such that g, f A = g, f . That is, ˜ π π g (x)∗ f (x)dx = g(x)∗ (f (x) − f (x))dx. ˜ (2.54) 0 0
  • 82. 70 2. Self-adjointness and spectrum Integration by parts on the right-hand side shows π π x g (x)∗ f (x)dx = − g (x)∗ ˜ (f (t) − f (t))dt dx (2.55) 0 0 0 or equivalently π x g (x)∗ f (x) + ˜ (f (t) − f (t))dt dx = 0. (2.56) 0 0 π Now observe {g ∈ H|g ∈ HA } = {h ∈ H| 0 h(t)dt = 0} = {1}⊥ and thus x ˜ f (x) + 0 (f (t) − f (t))dt ∈ {1}⊥⊥ = span{1}. So we see f ∈ H 2 (0, π) = ˜ {f ∈ AC[0, π]|f ∈ H 1 (0, π)} and Af = −f . The converse is easy and hence ˜ d2 ˜ Af = − 2 f, D(A) = {f ∈ H 2 [0, π] | f (0) = f (π) = 0}. (2.57) dx Now let us apply this result to operators A corresponding to observables. Since A will, in general, not satisfy the assumptions of our lemma, we will ˜ ˜ consider A2 instead, which has a symmetric extension A2 with Ran(A2 +1) = H. By our requirement for observables, A 2 is maximally defined and hence is equal to this extension. In other words, Ran(A2 + 1) = H. Moreover, for any ϕ ∈ H there is a ψ ∈ D(A2 ) such that (A − i)(A + i)ψ = (A + i)(A − i)ψ = ϕ (2.58) and since (A ± i)ψ ∈ D(A), we infer Ran(A ± i) = H. As an immediate consequence we obtain Corollary 2.11. Observables correspond to self-adjoint operators. But there is another important consequence of the results which is worth- while mentioning. A symmetric operator is called semi-bounded, respec- tively, bounded from below, if qA (ψ) = ψ, Aψ ≥ γ ψ 2 , γ ∈ R. (2.59) We will write A ≥ γ for short. Theorem 2.12 (Friedrichs extension). Let A be a symmetric operator which ˜ is bounded from below by γ. Then there is a self-adjoint extension A which is also bounded from below by γ and which satisfies D(A) ˜ ⊆ HA−γ . ˜ ˜ Moreover, A is the only self-adjoint extension with D(A) ⊆ HA−γ . Proof. If we replace A by A − γ, then existence follows from Lemma 2.10. ˆ ˆ To see uniqueness, let A be another self-adjoint extension with D(A) ⊆ HA . Choose ϕ ∈ D(A) and ψ ∈ D(A). ˆ Then ϕ, (A + 1)ψ = (A + 1)ϕ, ψ = ψ, (A + 1)ϕ ∗ = ψ, ϕ ∗ = ϕ, ψ A ˆ A
  • 83. 2.3. Quadratic forms and the Friedrichs extension 71 ˆ and by continuity we even get ϕ, (A + 1)ψ = ϕ, ψ A for every ϕ ∈ HA . ˜ ˜ ˜ ˆ Hence by the definition of A we have ψ ∈ D(A) and Aψ = Aψ; that is, Aˆ ⊆ A. But self-adjoint operators are maximal by Corollary 2.2 and thus ˜ ˆ ˜ A = A. Clearly Q(A) = HA and qA can be defined for semi-bounded operators as before by using ψ A = ψ, (A − γ)ψ + ψ 2 . In many physical applications, the converse of this result is also of im- portance: given a quadratic form q, when is there a corresponding operator A such that q = qA ? So let q : Q → C be a densely defined quadratic form corresponding to a sesquilinear form s : Q × Q → C; that is, q(ψ) = s(ψ, ψ). As with a scalar product, s can be recovered from q via the polarization identity (cf. Problem 0.14). Furthermore, as in Lemma 2.1 one can show that s is symmetric, s(ϕ, ψ) = s(ψ, ϕ)∗ , if and only if q is real-valued. In this case q will be called hermitian. A hermitian form q is called nonnegative if q(ψ) ≥ 0 and semi- bounded if q(ψ) ≥ γ ψ 2 for some γ ∈ R. As before we can associate a norm ψ q = q(ψ) + (1 − γ) ψ 2 with any semi-bounded q and look at the completion Hq of Q with respect to this norm. However, since we are not assuming that q is steaming from a semi-bounded operator, we do not know whether Hq can be regarded as a subspace of H! Hence we will call q clos- able if for every Cauchy sequence ψn ∈ Q with respect to . q , ψn → 0 implies ψn q → 0. In this case we have Hq ⊆ H and we call the extension of q to Hq the closure of q. In particular, we will call q closed if Q = Hq . Example. Let H = L2 (0, 1). Then q(f ) = |f (c)|2 , f ∈ C[0, 1], c ∈ [0, 1], is a well-defined nonnegative form. However, let fn (x) = max(0, 1−n|x−c|). Then fn is a Cauchy sequence with respect to . q such that fn → 0 but fn q → 1. Hence q is not closable and hence also not associated with a nonnegative operator. Formally, one can interpret q as the quadratic form of the multiplication operator with the delta distribution at x = c. Exercise: Show Hq = H ⊕ C. From our previous considerations we already know that the quadratic form qA of a semi-bounded operator A is closable and its closure is associated with a self-adjoint operator. It turns out that the converse is also true (compare also Corollary 1.9 for the case of bounded operators): Theorem 2.13. To every closed semi-bounded quadratic form q there cor- responds a unique self-adjoint operator A such that Q = Q(A) and q = qA .
  • 84. 72 2. Self-adjointness and spectrum If s is the sesquilinear form corresponding to q, then A is given by ˜ ˜ D(A) = {ψ ∈ Hq |∃ψ ∈ H : s(ϕ, ψ) = ϕ, ψ , ∀ϕ ∈ Hq }, ˜ (2.60) Aψ = ψ − (1 − γ)ψ. ˜ Proof. Since Hq is dense, ψ and hence A is well-defined. Moreover, replacing q by q(.) − γ . and A by A − γ, it is no restriction to assume γ = 0. As in the proof of Lemma 2.10 it follows that A is a nonnegative operator, Aψ 2 ≥ ψ 2 , with Ran(A + 1) = H. In particular, (A + 1)−1 exists and is bounded. Furthermore, for every ϕj ∈ H we can find ψj ∈ D(A) such that ϕj = (A + 1)ψj . Finally, (A + 1)−1 ϕ1 , ϕ2 = ψ1 , (A + 1)ψ2 = s(ψ1 , ψ2 ) = s(ψ2 , ψ1 )∗ ∗ = ψ2 , (A + 1)ψ1 = (A + 1)ψ1 , ψ2 −1 = ϕ1 , (A + 1) ϕ2 shows that (A + 1)−1 is self-adjoint and so is A + 1 by Corollary 2.5. ˜ Any subspace Q ⊆ Q(A) which is dense with respect to . A is called a form core of A and uniquely determines A. Example. We have already seen that the operator d2 Af = − f, D(A) = {f ∈ H 2 [0, π] | f (0) = f (π) = 0} (2.61) dx2 is associated with the closed form π qA (f ) = |f (x)|2 dx, Q(A) = {f ∈ H 1 [0, π] | f (0) = f (π) = 0}. (2.62) 0 However, this quadratic form even makes sense on the larger form domain Q = H 1 [0, π]. What is the corresponding self-adjoint operator? (See Prob- lem 2.13.) A hermitian form q is called bounded if |q(ψ)| ≤ C ψ 2 and we call q = sup |q(ψ)| (2.63) ψ =1 the norm of q. In this case the norm . q is equivalent to . . Hence Hq = H and the corresponding operator is bounded by the Hellinger–Toeplitz theorem (Theorem 2.9). In fact, the operator norm is equal to the norm of q (see also Problem 0.15): Lemma 2.14. A semi-bounded form q is bounded if and only if the associ- ated operator A is. Moreover, in this case q = A . (2.64)
  • 85. 2.4. Resolvents and spectra 73 Proof. Using the polarization identity and the parallelogram law (Prob- lem 0.14), we infer 2 Re ϕ, Aψ ≤ ( ψ 2 + ϕ 2 ) supψ | ψ, Aψ | and choosing ϕ = Aψ −1 Aψ shows A ≤ q|. The converse is easy. As a consequence we see that for symmetric operators we have A = sup | ψ, Aψ | (2.65) ψ =1 generalizing (2.14) in this case. Problem 2.10. Let A be invertible. Show A 0 if and only if A−1 0. d 2 Problem 2.11. Let A = − dx2 , D(A) = {f ∈ H 2 (0, π) | f (0) = f (π) = 0} 1 and let ψ(x) = 2√π x(π −x). Find the error in the following argument: Since A is symmetric, we have 1 = Aψ, Aψ = ψ, A2 ψ = 0. Problem 2.12. Suppose A is a closed operator. Show that A∗ A (with D(A∗ A) = {ψ ∈ D(A)|Aψ ∈ D(A∗ )}) is self-adjoint. Show Q(A∗ A) = D(A). (Hint: A∗ A ≥ 0.) Problem 2.13. Suppose A0 can be written as A0 = S ∗ S. Show that the Friedrichs extension is given by A = S ∗ S. 2 d Use this to compute the Friedrichs extension of A = − dx2 , D(A) = {f ∈ C 2 (0, π)|f (0) = f (π) = 0}. Compute also the self-adjoint operator SS ∗ and its form domain. Problem 2.14. Use the previous problem to compute the Friedrichs exten- d2 ∞ sion A of A0 = − dx2 , D(A0 ) = Cc (R). Show that Q(A) = H 1 (R) and D(A) = H 2 (R). (Hint: Section 2.7.) Problem 2.15. Let A be self-adjoint. Suppose D ⊆ D(A) is a core. Then D is also a form core. Problem 2.16. Show that (2.65) is wrong if A is not symmetric. 2.4. Resolvents and spectra Let A be a (densely defined) closed operator. The resolvent set of A is defined by ρ(A) = {z ∈ C|(A − z)−1 ∈ L(H)}. (2.66) More precisely, z ∈ ρ(A) if and only if (A − z) : D(A) → H is bijective and its inverse is bounded. By the closed graph theorem (Theorem 2.8), it suffices to check that A − z is bijective. The complement of the resolvent set is called the spectrum σ(A) = Cρ(A) (2.67)
  • 86. 74 2. Self-adjointness and spectrum of A. In particular, z ∈ σ(A) if A − z has a nontrivial kernel. A vector ψ ∈ Ker(A − z) is called an eigenvector and z is called an eigenvalue in this case. The function RA : ρ(A) → L(H) (2.68) z → (A − z)−1 is called the resolvent of A. Note the convenient formula RA (z)∗ = ((A − z)−1 )∗ = ((A − z)∗ )−1 = (A∗ − z ∗ )−1 = RA∗ (z ∗ ). (2.69) In particular, ρ(A∗ ) = ρ(A)∗ . (2.70) Example. (Multiplication operator) Consider again the multiplication op- erator (Af )(x) = A(x)f (x), D(A) = {f ∈ L2 (Rn , dµ) | Af ∈ L2 (Rn , dµ)}, (2.71) given by multiplication with the measurable function A : R n → C. Clearly (A − z)−1 is given by the multiplication operator 1 (A − z)−1 f (x) = f (x), A(x) − z 1 D((A − z)−1 ) = {f ∈ L2 (Rn , dµ) | f ∈ L2 (Rn , dµ)} (2.72) A−z whenever this operator is bounded. But (A − z)−1 = 1 A−z ∞ ≤ 1 ε is equivalent to µ({x| |A(x) − z| ε}) = 0 and hence ρ(A) = {z ∈ C|∃ε 0 : µ({x| |A(x) − z| ε}) = 0}. (2.73) The spectrum σ(A) = {z ∈ C|∀ε 0 : µ({x| |A(x) − z| ε}) 0} (2.74) is also known as the essential range of A(x). Moreover, z is an eigenvalue of A if µ(A−1 ({z})) 0 and χA−1 ({z}) is a corresponding eigenfunction in this case. Example. (Differential operator) Consider again the differential operator d Af = −i f, D(A) = {f ∈ AC[0, 2π] | f ∈ L2 , f (0) = f (2π)} (2.75) dx in L2 (0, 2π). We already know that the eigenvalues of A are the integers and that the corresponding normalized eigenfunctions 1 un (x) = √ einx (2.76) 2π form an orthonormal basis.
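Before the resolvent is computed below, a quick numerical sanity check of this statement can be done on a periodic grid (grid size, tolerance, and helper names are ad hoc choices): the eigenfunctions (2.76) are orthonormal and $-\mathrm{i}\,d/dx$ acts on them with the integer eigenvalues.

```python
import numpy as np

N = 2048
h = 2 * np.pi / N
x = h * np.arange(N)                         # grid on [0, 2*pi)

def u(n):
    # normalized eigenfunctions u_n(x) = e^{inx} / sqrt(2*pi)
    return np.exp(1j * n * x) / np.sqrt(2 * np.pi)

def inner(f, g):
    # Riemann sum for the L^2(0, 2*pi) inner product; exact for trigonometric polynomials
    return h * np.vdot(f, g)

def deriv(f):
    # centered difference with periodic wrap-around, approximating d/dx
    return (np.roll(f, -1) - np.roll(f, 1)) / (2 * h)

print(abs(inner(u(3), u(3)) - 1.0) < 1e-12)                  # normalized
print(abs(inner(u(3), u(5))) < 1e-12)                        # orthogonal
print(np.allclose(-1j * deriv(u(7)), 7 * u(7), atol=1e-3))   # A u_7 ~ 7 u_7
```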
  • 87. 2.4. Resolvents and spectra 75 To compute the resolvent, we must find the solution of the correspond- ing inhomogeneous equation −if (x) − z f (x) = g(x). By the variation of constants formula the solution is given by (this can also be easily verified directly) x f (x) = f (0)eizx + i eiz(x−t) g(t)dt. (2.77) 0 Since f must lie in the domain of A, we must have f (0) = f (2π) which gives 2π i f (0) = e−izt g(t)dt, z ∈ CZ. (2.78) e−2πiz − 1 0 (Since z ∈ Z are the eigenvalues, the inverse cannot exist in this case.) Hence 2π (A − z)−1 g(x) = G(z, x, t)g(t)dt, (2.79) 0 where −i 1−e−2πiz , t x, G(z, x, t) = eiz(x−t) i z ∈ CZ. (2.80) 1−e2πiz , t x, In particular σ(A) = Z. If z, z ∈ ρ(A), we have the first resolvent formula RA (z) − RA (z ) = (z − z )RA (z)RA (z ) = (z − z )RA (z )RA (z). (2.81) In fact, (A − z)−1 − (z − z )(A − z)−1 (A − z )−1 = (A − z)−1 (1 − (z − A + A − z )(A − z )−1 ) = (A − z )−1 , (2.82) which proves the first equality. The second follows after interchanging z and z . Now fix z = z0 and use (2.81) recursively to obtain n RA (z) = (z − z0 )j RA (z0 )j+1 + (z − z0 )n+1 RA (z0 )n+1 RA (z). (2.83) j=0 The sequence of bounded operators n Rn = (z − z0 )j RA (z0 )j+1 (2.84) j=0 converges to a bounded operator if |z − z0 | RA (z0 ) −1 and clearly we expect z ∈ ρ(A) and Rn → RA (z) in this case. Let R∞ = limn→∞ Rn and set ϕn = Rn ψ, ϕ = R∞ ψ for some ψ ∈ H. Then a quick calculation shows ARn ψ = (A − z0 )Rn ψ + z0 ϕn = ψ + (z − z0 )ϕn−1 + z0 ϕn . (2.85) Hence (ϕn , Aϕn ) → (ϕ, ψ + zϕ) shows ϕ ∈ D(A) (since A is closed) and (A − z)R∞ ψ = ψ. Similarly, for ψ ∈ D(A), Rn Aψ = ψ + (z − z0 )ϕn−1 + z0 ϕn (2.86)
  • 88. 76 2. Self-adjointness and spectrum and hence R∞ (A − z)ψ = ψ after taking the limit. Thus R∞ = RA (z) as anticipated. If A is bounded, a similar argument verifies the Neumann series for the resolvent n−1 Aj 1 RA (z) = − j+1 + n An RA (z) z z j=0 ∞ Aj =− , |z| A . (2.87) z j+1 j=0 In summary we have proved the following: Theorem 2.15. The resolvent set ρ(A) is open and RA : ρ(A) → L(H) is holomorphic; that is, it has an absolutely convergent power series expansion around every point z0 ∈ ρ(A). In addition, RA (z) ≥ dist(z, σ(A))−1 (2.88) and if A is bounded, we have {z ∈ C| |z| A } ⊆ ρ(A). As a consequence we obtain the useful Lemma 2.16. We have z ∈ σ(A) if there is a sequence ψn ∈ D(A) such that ψn = 1 and (A − z)ψn → 0. If z is a boundary point of ρ(A), then the converse is also true. Such a sequence is called a Weyl sequence. Proof. Let ψn be a Weyl sequence. Then z ∈ ρ(A) is impossible by 1 = ψn = RA (z)(A − z)ψn ≤ RA (z) (A − z)ψn → 0. Conversely, by (2.88) there is a sequence zn → z and corresponding vectors ϕn ∈ H such that RA (z)ϕn ϕn −1 → ∞. Let ψn = RA (zn )ϕn and rescale ϕn such that ψn = 1. Then ϕn → 0 and hence (A − z)ψn = ϕn + (zn − z)ψn ≤ ϕn + |z − zn | → 0 shows that ψn is a Weyl sequence. Let us also note the following spectral mapping result. Lemma 2.17. Suppose A is injective. Then σ(A−1 ){0} = (σ(A){0})−1 . (2.89) In addition, we have Aψ = zψ if and only if A−1 ψ = z −1 ψ. Proof. Suppose z ∈ ρ(A){0}. Then we claim RA−1 (z −1 ) = −zARA (z) = −z − z 2 RA (z).
  • 89. 2.4. Resolvents and spectra 77 In fact, the right-hand side is a bounded operator from H → Ran(A) = D(A−1 ) and (A−1 − z −1 )(−zARA (z))ϕ = (−z + A)RA (z)ϕ = ϕ, ϕ ∈ H. Conversely, if ψ ∈ D(A−1 ) = Ran(A), we have ψ = Aϕ and hence (−zARA (z))(A−1 − z −1 )ψ = ARA (z)((A − z)ϕ) = Aϕ = ψ. Thus z −1 ∈ ρ(A−1 ). The rest follows after interchanging the roles of A and A−1 . Next, let us characterize the spectra of self-adjoint operators. Theorem 2.18. Let A be symmetric. Then A is self-adjoint if and only if σ(A) ⊆ R and (A − E) ≥ 0, E ∈ R, if and only if σ(A) ⊆ [E, ∞). Moreover, RA (z) ≤ | Im(z)|−1 and, if (A − E) ≥ 0, RA (λ) ≤ |λ − E|−1 , λ E. Proof. If σ(A) ⊆ R, then Ran(A + z) = H, z ∈ CR, and hence A is self-adjoint by Lemma 2.7. Conversely, if A is self-adjoint (resp. A ≥ E), then RA (z) exists for z ∈ CR (resp. z ∈ C[E, ∞)) and satisfies the given estimates as has been shown in the proof of Lemma 2.7. In particular, we obtain (show this!) Theorem 2.19. Let A be self-adjoint. Then inf σ(A) = inf ψ, Aψ (2.90) ψ∈D(A), ψ =1 and sup σ(A) = sup ψ, Aψ . (2.91) ψ∈D(A), ψ =1 For the eigenvalues and corresponding eigenfunctions we have Lemma 2.20. Let A be symmetric. Then all eigenvalues are real and eigen- vectors corresponding to different eigenvalues are orthogonal. Proof. If Aψj = λj ψj , j = 1, 2, we have λ1 ψ1 2 = ψ1 , λ1 ψ1 = ψ1 , Aψ1 = ψ1 , Aψ1 = λ1 ψ1 , ψ1 = λ∗ ψ1 1 2 and (λ1 − λ2 ) ψ1 , ψ2 = Aψ1 , ψ2 − Aψ1 , ψ2 = 0, finishing the proof. The result does not imply that two linearly independent eigenfunctions to the same eigenvalue are orthogonal. However, it is no restriction to assume that they are since we can use Gram–Schmidt to find an orthonormal basis for Ker(A − λ). If H is finite dimensional, we can always find an orthonormal basis of eigenvectors. In the infinite dimensional case this is
  • 90. 78 2. Self-adjointness and spectrum no longer true in general. However, if there is an orthonormal basis of eigenvectors, then A is essentially self-adjoint. Theorem 2.21. Suppose A is a symmetric operator which has an orthonor- mal basis of eigenfunctions {ϕj }. Then A is essentially self-adjoint. In particular, it is essentially self-adjoint on span{ϕj }. n Proof. Consider the set of all finite linear combinations ψ = j=0 cj ϕj n cj which is dense in H. Then φ = j=0 λj ±i ϕj ∈ D(A) and (A ± i)φ = ψ shows that Ran(A ± i) is dense. Similarly, we can characterize the spectra of unitary operators. Recall that a bijection U is called unitary if U ψ, U ψ = ψ, U ∗ U ψ = ψ, ψ . Thus U is unitary if and only if U ∗ = U −1 . (2.92) Theorem 2.22. Let U be unitary. Then σ(U ) ⊆ {z ∈ C| |z| = 1}. All eigenvalues have modulus one and eigenvectors corresponding to different eigenvalues are orthogonal. Proof. Since U ≤ 1, we have σ(U ) ⊆ {z ∈ C| |z| ≤ 1}. Moreover, U −1 is also unitary and hence σ(U ) ⊆ {z ∈ C| |z| ≥ 1} by Lemma 2.17. If U ψj = zj ψj , j = 1, 2, we have (z1 − z2 ) ψ1 , ψ2 = U ∗ ψ1 , ψ2 − ψ1 , U ψ2 = 0 since U ψ = zψ implies U ∗ ψ = U −1 ψ = z −1 ψ = z ∗ ψ. Problem 2.17. Suppose A is closed and B bounded: • Show that I + B has a bounded inverse if B 1. • Suppose A has a bounded inverse. Then so does A + B if B ≤ A−1 −1 . Problem 2.18. What is the spectrum of an orthogonal projection? Problem 2.19. Compute the resolvent of Af = f , D(A) = {f ∈ H 1 [0, 1] | f (0) = 0} and show that unbounded operators can have empty spectrum. d 2 Problem 2.20. Compute the eigenvalues and eigenvectors of A = − dx2 , D(A) = {f ∈ H 2 (0, π)|f (0) = f (π) = 0}. Compute the resolvent of A. Problem 2.21. Find a Weyl sequence for the self-adjoint operator A = d2 − dx2 , D(A) = H 2 (R) for z ∈ (0, ∞). What is σ(A)? (Hint: Cut off the solutions of −u (x) = z u(x) outside a finite ball.)
  • 91. 2.5. Orthogonal sums of operators 79 Problem 2.22. Suppose A = A0 . If ψn ∈ D(A) is a Weyl sequence for ˜ z ∈ σ(A), then there is also one with ψn ∈ D(A0 ). Problem 2.23. Suppose A is bounded. Show that the spectra of AA∗ and A∗ A coincide away from 0 by showing 1 1 ∗ RAA∗ (z) = (ARA∗ A (z)A∗ − 1) , RA∗ A (z) = (A RAA∗ (z)A − 1) . z z (2.93) 2.5. Orthogonal sums of operators Let Hj , j = 1, 2, be two given Hilbert spaces and let Aj : D(Aj ) → Hj be two given operators. Setting H = H1 ⊕ H2 , we can define an operator A = A1 ⊕ A2 , D(A) = D(A1 ) ⊕ D(A2 ) (2.94) by setting A(ψ1 + ψ2 ) = A1 ψ1 + A2 ψ2 for ψj ∈ D(Aj ). Clearly A is closed, (essentially) self-adjoint, etc., if and only if both A1 and A2 are. The same considerations apply to countable orthogonal sums. Let H = j Hj and set A= Aj , D(A) = {ψ ∈ D(Aj )|Aψ ∈ H}. (2.95) j j Then we have Theorem 2.23. Suppose Aj are self-adjoint operators on Hj . Then A = j Aj is self-adjoint and RA (z) = RAj (z), z ∈ ρ(A) = Cσ(A) (2.96) j where σ(A) = σ(Aj ) (2.97) j (the closure can be omitted if there are only finitely many terms). Proof. Fix z ∈ j σ(Aj ) and let ε = Im(z). Then, by Theorem 2.18, RAj (z) ≤ ε−1 and so R(z) = j RAj (z) is a bounded operator with R(z) ≤ ε −1 (cf. Problem 2.26). It is straightforward to check that R(z) is in fact the resolvent of A and thus σ(A) ⊆ R. In particular, A is self- adjoint by Theorem 2.18. To see that σ(A) ⊆ j σ(Aj ), note that the above argument can be repeated with ε = dist(z, j σ(Aj )) 0, which will follow from the spectral theorem (Problem 3.5) to be proven in the next chapter. Conversely, if z ∈ σ(Aj ), there is a corresponding Weyl sequence ψn ∈ D(Aj ) ⊆ D(A) and hence z ∈ σ(A).
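For finitely many finite dimensional summands the conclusion of Theorem 2.23 can be verified directly; a minimal sketch (block sizes and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(5)
def herm(n):
    B = rng.standard_normal((n, n))
    return (B + B.T) / 2                     # real symmetric block

A1, A2 = herm(3), herm(4)
A = np.block([[A1, np.zeros((3, 4))],
              [np.zeros((4, 3)), A2]])       # A = A1 (+) A2 on H1 (+) H2

spec_A = np.sort(np.linalg.eigvalsh(A))
spec_union = np.sort(np.concatenate([np.linalg.eigvalsh(A1), np.linalg.eigvalsh(A2)]))
print(np.allclose(spec_A, spec_union))       # sigma(A1 (+) A2) = sigma(A1) u sigma(A2)
```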
  • 92. 80 2. Self-adjointness and spectrum Conversely, given an operator A, it might be useful to write A as an orthogonal sum and investigate each part separately. Let H1 ⊆ H be a closed subspace and let P1 be the corresponding pro- jector. We say that H1 reduces the operator A if P1 A ⊆ AP1 . Note that this is equivalent to P1 D(A) ⊆ D(A) and P1 Aψ = AP1 ψ for ψ ∈ D(A). Moreover, if we set H2 = H⊥ , we have H = H1 ⊕ H2 and P2 = 1 − P1 reduces 1 A as well. Lemma 2.24. Suppose H = j Hj where each Hj reduces A. Then A = j Aj , where Aj ψ = Aψ, D(Aj ) = Pj D(A) ⊆ D(A). (2.98) If A is closable, then Hj also reduces A and A= Aj . (2.99) j Proof. As already noted, Pj D(A) ⊆ D(A) and thus every ψ ∈ D(A) can be written as ψ = j Pj ψ; that is, D(A) = j D(Aj ). Moreover, if ψ ∈ D(Aj ), we have Aψ = APj ψ = Pj Aψ ∈ Hj and thus Aj : D(Aj ) → Hj which proves the first claim. Now let us turn to the second claim. Suppose ψ ∈ D(A). Then there is a sequence ψn ∈ D(A) such that ψn → ψ and Aψn → ϕ = Aψ. Thus Pj ψn → Pj ψ and APj ψn = Pj Aψn → Pj ϕ which shows Pj ψ ∈ D(A) and Pj Aψ = APj ψ; that is, Hj reduces A. Moreover, this argument also shows Pj D(A) ⊆ D(Aj ) and the converse follows analogously. If A is self-adjoint, then H1 reduces A if P1 D(A) ⊆ D(A) and AP1 ψ ∈ H1 for every ψ ∈ D(A). In fact, if ψ ∈ D(A), we can write ψ = ψ1 ⊕ ψ2 , with P2 = 1 − P1 and ψj = Pj ψ ∈ D(A). Since AP1 ψ = Aψ1 and P1 Aψ = P1 Aψ1 + P1 Aψ2 = Aψ1 + P1 Aψ2 , we need to show P1 Aψ2 = 0. But this follows since ϕ, P1 Aψ2 = AP1 ϕ, ψ2 = 0 (2.100) for every ϕ ∈ D(A). Problem 2.24. Show ( j Aj ) ∗ = j A∗ . j Problem 2.25. Show that A defined in (2.95) is closed if and only if all Aj are. Problem 2.26. Show that for A defined in (2.95), we have A = supj Aj .
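A similar finite dimensional sketch (again with ad hoc blocks and seed) illustrates both the notion of a reducing subspace and the norm identity asked for in Problem 2.26:

```python
import numpy as np

rng = np.random.default_rng(6)
def herm(n):
    B = rng.standard_normal((n, n))
    return (B + B.T) / 2

A1, A2 = herm(3), herm(4)
Z = np.zeros((3, 4))
A = np.block([[A1, Z], [Z.T, A2]])
P1 = np.block([[np.eye(3), Z], [Z.T, np.zeros((4, 4))]])   # projector onto the first summand H1

print(np.allclose(P1 @ A, A @ P1))           # H1 reduces A: P1 A = A P1
print(np.isclose(np.linalg.norm(A, 2),
                 max(np.linalg.norm(A1, 2), np.linalg.norm(A2, 2))))   # ||A|| = max_j ||A_j||
```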
  • 93. 2.6. Self-adjoint extensions 81 2.6. Self-adjoint extensions It is safe to skip this entire section on first reading. In many physical applications a symmetric operator is given. If this operator turns out to be essentially self-adjoint, there is a unique self-adjoint extension and everything is fine. However, if it is not, it is important to find out if there are self-adjoint extensions at all (for physical problems there better be) and to classify them. In Section 2.2 we saw that A is essentially self-adjoint if Ker(A∗ − z) = Ker(A∗ − z ∗ ) = {0} for one z ∈ CR. Hence self-adjointness is related to the dimension of these spaces and one calls the numbers d± (A) = dim K± , K± = Ran(A ± i)⊥ = Ker(A∗ i), (2.101) defect indices of A (we have chosen z = i for simplicity; any other z ∈ CR would be as good). If d− (A) = d+ (A) = 0, there is one self-adjoint extension of A, namely A. But what happens in the general case? Is there more than one extension, or maybe none at all? These questions can be answered by virtue of the Cayley transform V = (A − i)(A + i)−1 : Ran(A + i) → Ran(A − i). (2.102) Theorem 2.25. The Cayley transform is a bijection from the set of all symmetric operators A to the set of all isometric operators V (i.e., V ϕ = ϕ for all ϕ ∈ D(V )) for which Ran(1 − V ) is dense. Proof. Since A is symmetric, we have (A ± i)ψ 2 = Aψ 2 + ψ 2 for all ψ ∈ D(A) by a straightforward computation. Thus for every ϕ = (A + i)ψ ∈ D(V ) = Ran(A + i) we have V ϕ = (A − i)ψ = (A + i)ψ = ϕ . Next observe 2A(A + i)−1 , 1 ± V = ((A − i) ± (A + i))(A + i)−1 = 2i(A + i)−1 , which shows that Ran(1 − V ) = D(A) is dense and A = i(1 + V )(1 − V )−1 . Conversely, let V be given and use the last equation to define A. Since V is isometric, we have (1 ± V )ϕ, (1 V )ϕ = ±2i Im V ϕ, ϕ for all ϕ ∈ D(V ) by a straightforward computation. Thus for every ψ = (1 − V )ϕ ∈ D(A) = Ran(1 − V ) we have Aψ, ψ = −i (1 + V )ϕ, (1 − V )ϕ = i (1 − V )ϕ, (1 + V )ϕ = ψ, Aψ ;
  • 94. 82 2. Self-adjointness and spectrum that is, A is symmetric. Finally observe 2i(1 − V )−1 , A ± i = ((1 + V ) ± (1 − V ))(1 − V )−1 = 2iV (1 − V )−1 , which shows that A is the Cayley transform of V and finishes the proof. Thus A is self-adjoint if and only if its Cayley transform V is unitary. Moreover, finding a self-adjoint extension of A is equivalent to finding a unitary extensions of V and this in turn is equivalent to (taking the closure and) finding a unitary operator from D(V )⊥ to Ran(V )⊥ . This is possible if and only if both spaces have the same dimension, that is, if and only if d+ (A) = d− (A). Theorem 2.26. A symmetric operator has self-adjoint extensions if and only if its defect indices are equal. In this case let A1 be a self-adjoint extension and V1 its Cayley trans- form. Then D(A1 ) = D(A) + (1 − V1 )K+ = {ψ + ϕ+ − V1 ϕ+ |ψ ∈ D(A), ϕ+ ∈ K+ } (2.103) and A1 (ψ + ϕ+ − V1 ϕ+ ) = Aψ + iϕ+ + iV1 ϕ+ . (2.104) Moreover, i (A1 ± i)−1 = (A ± i)−1 ⊕ ϕ± , . (ϕ± − ϕj ), j j (2.105) 2 j where {ϕ+ } is an orthonormal basis for K+ and ϕ− = V1 ϕ+ . j j j Proof. From the proof of the previous theorem we know that D(A1 ) = Ran(1 − V1 ) = Ran(1 + V ) + (1 − V1 )K+ = D(A) + (1 − V1 )K+ . Moreover, A1 (ψ +ϕ+ −V1 ϕ+ ) = Aψ +i(1+V1 )(1−V1 )−1 (1−V1 )ϕ+ = Aψ +i(1+V1 )ϕ+ . Similarly, Ran(A1 ± i) = Ran(A ± i) ⊕ K± and (A1 + i)−1 = − 2 (1 − V1 ), i respectively, (A1 + i)−1 = − 2 (1 − V1−1 ). i Note that instead of z = i we could use V (z) = (A + z ∗ )(A + z)−1 for any z ∈ CR. We remark that in this case one can show that the defect indices are independent of z ∈ C+ = {z ∈ C| Im(z) 0}. d Example. Recall the operator A = −i dx , D(A) = {f ∈ H 1 (0, 2π)|f (0) = f (2π) = 0} with adjoint A ∗ = −i d , D(A∗ ) = H 1 (0, 2π). dx Clearly K± = span{e x } (2.106) is one-dimensional and hence all unitary maps are of the form Vθ e2π−x = eiθ ex , θ ∈ [0, 2π). (2.107)
  • 95. 2.6. Self-adjoint extensions 83 The functions in the domain of the corresponding operator Aθ are given by fθ (x) = f (x) + α(e2π−x − eiθ ex ), f ∈ D(A), α ∈ C. (2.108) In particular, fθ satisfies ˜ ˜ 1 − eiθ e2π fθ (2π) = eiθ fθ (0), e iθ = , (2.109) e2π − eiθ and thus we have ˜ D(Aθ ) = {f ∈ H 1 (0, 2π)|f (2π) = eiθ f (0)}. (2.110) Concerning closures, we can combine the fact that a bounded operator is closed if and only if its domain is closed with item (iii) from Lemma 2.4 to obtain Lemma 2.27. The following items are equivalent. • A is closed. • D(V ) = Ran(A + i) is closed. • Ran(V ) = Ran(A − i) is closed. • V is closed. Next, we give a useful criterion for the existence of self-adjoint exten- sions. A conjugate linear map C : H → H is called a conjugation if it satisfies C 2 = I and Cψ, Cϕ = ψ, ϕ . The prototypical example is, of course, complex conjugation Cψ = ψ ∗ . An operator A is called C-real if CD(A) ⊆ D(A), and ACψ = CAψ, ψ ∈ D(A). (2.111) Note that in this case CD(A) = D(A), since D(A) = C 2 D(A) ⊆ CD(A). Theorem 2.28. Suppose the symmetric operator A is C-real. Then its defect indices are equal. Proof. Let {ϕj } be an orthonormal set in Ran(A + i)⊥ . Then {Cϕj } is an orthonormal set in Ran(A − i)⊥ . Hence {ϕj } is an orthonormal basis for Ran(A + i)⊥ if and only if {Cϕj } is an orthonormal basis for Ran(A − i)⊥ . Hence the two spaces have the same dimension. Finally, we note the following useful formula for the difference of resol- vents of self-adjoint extensions. Lemma 2.29. If Aj , j = 1, 2, are self-adjoint extensions of A and if {ϕj (z)} is an orthonormal basis for Ker(A∗ − z), then (A1 − z)−1 − (A2 − z)−1 = (αjk (z) − αjk (z)) ϕj (z ∗ ), . ϕk (z), (2.112) 1 2 j,k
  • 96. 84 2. Self-adjointness and spectrum where αjk (z) = ϕk (z), (Al − z)−1 ϕj (z ∗ ) . l (2.113) Proof. First observe that ((A1 − z)−1 − (A2 − z)−1 )ϕ is zero for every ϕ ∈ Ran(A − z). Hence it suffices to consider vectors of the form ϕ = ∗ ∗ ⊥ ∗ ∗ j ϕj (z ), ϕ ϕj (z ) ∈ Ran(A − z) = Ker(A − z ). Hence we have (A1 − z)−1 − (A2 − z)−1 = ϕj (z ∗ ), . ψj (z), j where ψj (z) = ((A1 − z)−1 − (A2 − z)−1 )ϕj (z ∗ ). Now computing the adjoint once using ((Al − z)−1 )∗ = (Al − z ∗ )−1 and once using ( j ϕj , . ψj )∗ = j ψj , . ϕj , we obtain ϕj (z), . ψj (z ∗ ) = ψj (z), . ϕk (z ∗ ). j j Evaluating at ϕk (z) implies ψk (z) = ψj (z ∗ ), ϕk (z ∗ ) ϕj (z) = 1 2 (αkj (z) − αkj (z))ϕj (z) j j and finishes the proof. Problem 2.27. Compute the defect indices of d ∞ A0 = i, D(A0 ) = Cc ((0, ∞)). dx Can you give a self-adjoint extension of A0 ? Problem 2.28. Let A1 be a self-adjoint extension of A and suppose ϕ ∈ Ker(A∗ − z0 ). Show that ϕ(z) = ϕ + (z − z0 )(A1 − z)−1 ϕ ∈ Ker(A∗ − z). 2.7. Appendix: Absolutely continuous functions Let (a, b) ⊆ R be some interval. We denote by x AC(a, b) = {f ∈ C(a, b)|f (x) = f (c) + g(t)dt, c ∈ (a, b), g ∈ L1 (a, b)} loc c (2.114) the set of all absolutely continuous functions. That is, f is absolutely continuous if and only if it can be written as the integral of some locally integrable function. Note that AC(a, b) is a vector space. x By Corollary A.36, f (x) = f (c) + c g(t)dt is differentiable a.e. (with re- spect to Lebesgue measure) and f (x) = g(x). In particular, g is determined uniquely a.e.
  • 97. 2.7. Appendix: Absolutely continuous functions 85 If [a, b] is a compact interval, we set AC[a, b] = {f ∈ AC(a, b)|g ∈ L1 (a, b)} ⊆ C[a, b]. (2.115) If f, g ∈ AC[a, b], we have the formula of partial integration (Problem 2.29) b b f (x)g (x)dx = f (b)g(b) − f (a)g(a) − f (x)g(x)dx (2.116) a a which also implies that the product rule holds for absolutely continuous functions. We set H (a, b) = {f ∈ L2 (a, b)|f (j) ∈ AC(a, b), f (j+1) ∈ L2 (a, b), 0 ≤ j ≤ m − 1}. m (2.117) Then we have Lemma 2.30. Suppose f ∈ H m (a, b), m ≥ 1. Then f is bounded and limx↓a f (j) (x), respectively, limx↑b f (j) (x), exists for 0 ≤ j ≤ m − 1. More- over, the limit is zero if the endpoint is infinite. Proof. If the endpoint is finite, then f (j+1) is integrable near this endpoint and hence the claim follows. If the endpoint is infinite, note that x |f (j) (x)|2 = |f (j) (c)|2 + 2 Re(f (j) (t)∗ f (j+1) (t))dt c shows that the limit exists (dominated convergence). Since f (j) is square integrable, the limit must be zero. Let me remark that it suffices to check that the function plus the highest derivative are in L2 ; the lower derivatives are then automatically in L2 . That is, H m (a, b) = {f ∈ L2 (a, b)|f (j) ∈ AC(a, b), 0 ≤ j ≤ m − 1, f (m) ∈ L2 (a, b)}. (2.118) For a finite endpoint this is straightforward. For an infinite endpoint this can also be shown directly, but it is much easier to use the Fourier transform (compare Section 7.1). Problem 2.29. Show (2.116). (Hint: Fubini.) Problem 2.30. A function u ∈ L1 (0, 1) is called weakly differentiable if for some v ∈ L1 (0, 1) we have 1 1 v(x)ϕ(x)dx = − u(x)ϕ (x)dx 0 0 ∞ for all test functions ϕ ∈ Cc (0, 1). Show that u is weakly differentiable if and only if u is absolutely continuous and u = v in this case. (Hint: You will
• 98. 86 2. Self-adjointness and spectrum need that $\int_0^1 u(t)\varphi'(t)\,dt = 0$ for all $\varphi \in C_c^\infty(0,1)$ if and only if $u$ is constant. To see this, choose some $\varphi_0 \in C_c^\infty(0,1)$ with $I(\varphi_0) = \int_0^1 \varphi_0(t)\,dt = 1$. Then invoke Lemma 0.37 and use that every $\varphi \in C_c^\infty(0,1)$ can be written as $\varphi(t) = \Phi'(t) + I(\varphi)\varphi_0(t)$ with $\Phi(t) = \int_0^t \varphi(s)\,ds - I(\varphi)\int_0^t \varphi_0(s)\,ds$.) Problem 2.31. Show that $H^1(a,b)$ together with the norm $\|f\|_{2,1}^2 = \int_a^b |f'(t)|^2\,dt + \int_a^b |f(t)|^2\,dt$ is a Hilbert space. Problem 2.32. What is the closure of $C_0^\infty(a,b)$ in $H^1(a,b)$? (Hint: Start with the case where $(a,b)$ is finite.) Problem 2.33. Show that if $f \in AC(a,b)$ and $f' \in L^p(a,b)$, then $f$ is Hölder continuous: $|f(x) - f(y)| \le \|f'\|_p\, |x-y|^{1-1/p}$.
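As a quick plausibility check for Problem 2.33 (the particular function, exponent, and grid below are ad hoc choices), one can compare the difference quotients of $f(x) = \sqrt{x}$ on $(0,1)$ with the bound $\|f'\|_{3/2}\,|x-y|^{1/3}$:

```python
import numpy as np

# f(x) = sqrt(x) on (0, 1), p = 3/2: f'(x) = 1/(2 sqrt(x)) and
# ||f'||_{3/2} = (int_0^1 (2 sqrt(x))^{-3/2} dx)^{2/3} = 2**(1/3) (computed in closed form).
p = 1.5
holder_const = 2.0**(1.0 / 3.0)
exponent = 1.0 - 1.0 / p                      # = 1/3

x = np.linspace(1e-6, 1.0, 400)
X, Y = np.meshgrid(x, x)
with np.errstate(invalid="ignore", divide="ignore"):
    ratio = np.abs(np.sqrt(X) - np.sqrt(Y)) / np.abs(X - Y)**exponent
ratio = np.nan_to_num(ratio)                  # the diagonal X == Y gives 0/0, which we discard
print(ratio.max(), holder_const)              # the maximal quotient stays below ||f'||_p
```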
• 99. Chapter 3. The spectral theorem. The time evolution of a quantum mechanical system is governed by the Schrödinger equation $\mathrm{i}\frac{d}{dt}\psi(t) = H\psi(t)$. (3.1) If $\mathfrak{H} = \mathbb{C}^n$ and $H$ is hence a matrix, this system of ordinary differential equations is solved by the matrix exponential $\psi(t) = \exp(-\mathrm{i}tH)\psi(0)$. (3.2) This matrix exponential can be defined by a convergent power series $\exp(-\mathrm{i}tH) = \sum_{n=0}^{\infty} \frac{(-\mathrm{i}t)^n}{n!}H^n$. (3.3) For this approach the boundedness of $H$ is crucial, which might not be the case for a quantum system. However, the best way to compute the matrix exponential and to understand the underlying dynamics is to diagonalize $H$. But how do we diagonalize a self-adjoint operator? The answer is known as the spectral theorem. 3.1. The spectral theorem. In this section we want to address the problem of defining functions of a self-adjoint operator $A$ in a natural way, that is, such that $(f+g)(A) = f(A)+g(A)$, $(fg)(A) = f(A)g(A)$, $(f^*)(A) = f(A)^*$. (3.4) As long as $f$ and $g$ are polynomials, no problems arise. If we want to extend this definition to a larger class of functions, we will need to perform some limiting procedure. Hence we could consider convergent power series or equip the space of polynomials on the spectrum with the sup norm. In both
  • 100. 88 3. The spectral theorem cases this only works if the operator A is bounded. To overcome this limita- tion, we will use characteristic functions χΩ (A) instead of powers Aj . Since χΩ (λ)2 = χΩ (λ), the corresponding operators should be orthogonal projec- tions. Moreover, we should also have χR (A) = I and χΩ (A) = n χΩj (A) j=1 for any finite union Ω = n Ωj of disjoint sets. The only remaining prob- j=1 lem is of course the definition of χΩ (A). However, we will defer this problem and begin by developing a functional calculus for a family of characteristic functions χΩ (A). Denote the Borel sigma algebra of R by B. A projection-valued mea- sure is a map P : B → L(H), Ω → P (Ω), (3.5) from the Borel sets to the set of orthogonal projections, that is, P (Ω)∗ = P (Ω) and P (Ω)2 = P (Ω), such that the following two conditions hold: (i) P (R) = I. (ii) If Ω = n Ωn with Ωn ∩ Ωm = ∅ for n = m, then n P (Ωn )ψ = P (Ω)ψ for every ψ ∈ H (strong σ-additivity). Note that we require strong convergence, n P (Ωn )ψ = P (Ω)ψ, rather than norm convergence, n P (Ωn ) = P (Ω). In fact, norm convergence does not even hold in the simplest case where H = L2 (I) and P (Ω) = χΩ (multiplication operator), since for a multiplication operator the norm is just the sup norm of the function. Furthermore, it even suffices to require weak convergence, since w-lim Pn = P for some orthogonal projections implies s-lim Pn = P by ψ, Pn ψ = ψ, Pn ψ = Pn ψ, Pn ψ = Pn ψ 2 together 2 with Lemma 1.12 (iv). Example. Let H = Cn and let A ∈ GL(n) be some symmetric matrix. Let λ1 , . . . , λm be its (distinct) eigenvalues and let Pj be the projections onto the corresponding eigenspaces. Then PA (Ω) = Pj (3.6) {j|λj ∈Ω} is a projection-valued measure. Example. Let H = L2 (R) and let f be a real-valued measurable function. Then P (Ω) = χf −1 (Ω) (3.7) is a projection-valued measure (Problem 3.3). It is straightforward to verify that any projection-valued measure satis- fies P (∅) = 0, P (RΩ) = I − P (Ω), (3.8)
  • 101. 3.1. The spectral theorem 89 and P (Ω1 ∪ Ω2 ) + P (Ω1 ∩ Ω2 ) = P (Ω1 ) + P (Ω2 ). (3.9) Moreover, we also have P (Ω1 )P (Ω2 ) = P (Ω1 ∩ Ω2 ). (3.10) Indeed, first suppose Ω1 ∩ Ω2 = ∅. Then, taking the square of (3.9), we infer P (Ω1 )P (Ω2 ) + P (Ω2 )P (Ω1 ) = 0. (3.11) Multiplying this equation from the right by P (Ω2 ) shows that P (Ω1 )P (Ω2 ) = −P (Ω2 )P (Ω1 )P (Ω2 ) is self-adjoint and thus P (Ω1 )P (Ω2 ) = P (Ω2 )P (Ω1 ) = 0. For the general case Ω1 ∩ Ω2 = ∅ we now have P (Ω1 )P (Ω2 ) = (P (Ω1 − Ω2 ) + P (Ω1 ∩ Ω2 ))(P (Ω2 − Ω1 ) + P (Ω1 ∩ Ω2 )) = P (Ω1 ∩ Ω2 ) (3.12) as stated. Moreover, a projection-valued measure is monotone, that is, Ω1 ⊆ Ω2 ⇒ P (Ω1 ) ≤ P (Ω2 ), (3.13) in the sense that ψ, P (Ω1 )ψ ≤ ψ, P (Ω2 )ψ or equivalently Ran(P (Ω1 )) ⊆ Ran(P (Ω2 )) (cf. Problem 1.7). As a useful consequence note that P (Ω2 ) = 0 implies P (Ω1 ) = 0 for every subset Ω1 ⊆ Ω2 . To every projection-valued measure there corresponds a resolution of the identity P (λ) = P ((−∞, λ]) (3.14) which has the properties (Problem 3.4): (i) P (λ) is an orthogonal projection. (ii) P (λ1 ) ≤ P (λ2 ) for λ1 ≤ λ2 . (iii) s-limλn ↓λ P (λn ) = P (λ) (strong right continuity). (iv) s-limλ→−∞ P (λ) = 0 and s-limλ→+∞ P (λ) = I. As before, strong right continuity is equivalent to weak right continuity. Picking ψ ∈ H, we obtain a finite Borel measure µψ (Ω) = ψ, P (Ω)ψ = P (Ω)ψ 2 with µψ (R) = ψ 2 ∞. The corresponding distribution func- tion is given by µψ (λ) = ψ, P (λ)ψ and since for every distribution function there is a unique Borel measure (Theorem A.2), for every resolution of the identity there is a unique projection-valued measure. Using the polarization identity (2.16), we also have the complex Borel measures 1 µϕ,ψ (Ω) = ϕ, P (Ω)ψ = (µϕ+ψ (Ω) − µϕ−ψ (Ω) + iµϕ−iψ (Ω) − iµϕ+iψ (Ω)). 4 (3.15) Note also that, by Cauchy–Schwarz, |µϕ,ψ (Ω)| ≤ ϕ ψ .
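The defining properties, the product formula (3.10), and the identity $\mu_\psi(\Omega) = \|P(\Omega)\psi\|^2$ are easy to check numerically for the matrix example (3.6); the following sketch (matrix size and seed are ad hoc choices) does so:

```python
import numpy as np

rng = np.random.default_rng(8)
B = rng.standard_normal((5, 5))
A = (B + B.T) / 2
w, V = np.linalg.eigh(A)
I = np.eye(5)

def P(omega):
    # P_A(Omega) as in (3.6): sum of the eigenprojections with eigenvalue in Omega
    Q = np.zeros((5, 5))
    for lam, v in zip(w, V.T):
        if omega(lam):
            Q += np.outer(v, v)
    return Q

P_neg, P_pos = P(lambda l: l < 0), P(lambda l: l >= 0)
print(np.allclose(P_neg + P_pos, I))                   # P(R) = I
print(np.allclose(P_neg @ P_neg, P_neg))               # orthogonal projection
print(np.allclose(P_neg @ P_pos, np.zeros((5, 5))))    # P(O1) P(O2) = P(O1 n O2) = 0 here

psi = rng.standard_normal(5)
print(np.isclose(psi @ P_neg @ psi, np.linalg.norm(P_neg @ psi)**2))   # mu_psi(O) = ||P(O) psi||^2
```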
  • 102. 90 3. The spectral theorem Now let us turn to integration with respect to our projection-valued measure. For any simple function f = n αj χΩj (where Ωj = f −1 (αj )) j=1 we set n P (f ) ≡ f (λ)dP (λ) = αj P (Ωj ). (3.16) R j=1 In particular, P (χΩ ) = P (Ω). Then ϕ, P (f )ψ = j αj µϕ,ψ (Ωj ) shows ϕ, P (f )ψ = f (λ)dµϕ,ψ (λ) (3.17) R and, by linearity of the integral, the operator P is a linear map from the set of simple functions into the set of bounded linear operators on H. Moreover, P (f )ψ 2 = j |αj |2 µψ (Ωj ) (the sets Ωj are disjoint) shows 2 P (f )ψ = |f (λ)|2 dµψ (λ). (3.18) R Equipping the set of simple functions with the sup norm, we infer P (f )ψ ≤ f ∞ ψ , (3.19) which implies that P has norm one. Since the simple functions are dense in the Banach space of bounded Borel functions B(R), there is a unique extension of P to a bounded linear operator P : B(R) → L(H) (whose norm is one) from the bounded Borel functions on R (with sup norm) to the set of bounded linear operators on H. In particular, (3.17) and (3.18) remain true. There is some additional structure behind this extension. Recall that the set L(H) of all bounded linear mappings on H forms a C ∗ algebra. A C ∗ algebra homomorphism φ is a linear map between two C ∗ algebras which respects both the multiplication and the adjoint; that is, φ(ab) = φ(a)φ(b) and φ(a∗ ) = φ(a)∗ . Theorem 3.1. Let P (Ω) be a projection-valued measure on H. Then the operator P : B(R) → L(H) (3.20) f → R f (λ)dP (λ) is a C ∗ algebra homomorphism with norm one such that P (g)ϕ, P (f )ψ = g ∗ (λ)f (λ)dµϕ,ψ (λ). (3.21) R In addition, if fn (x) → f (x) pointwise and if the sequence supλ∈R |fn (λ)| is s bounded, then P (fn ) → P (f ) strongly. Proof. The properties P (1) = I, P (f ∗ ) = P (f )∗ , and P (f g) = P (f )P (g) are straightforward for simple functions f . For general f they follow from continuity. Hence P is a C ∗ algebra homomorphism.
  • 103. 3.1. The spectral theorem 91 Equation (3.21) is a consequence of P (g)ϕ, P (f )ψ = ϕ, P (g ∗ f )ψ . The last claim follows from the dominated convergence theorem and (3.18). As a consequence of (3.21), observe µP (g)ϕ,P (f )ψ (Ω) = P (g)ϕ, P (Ω)P (f )ψ = g ∗ (λ)f (λ)dµϕ,ψ (λ), (3.22) Ω which implies dµP (g)ϕ,P (f )ψ = g ∗ f dµϕ,ψ . (3.23) Example. Let H = Cn and A = A∗ ∈ GL(n), respectively, PA , as in the previous example. Then m PA (f ) = f (λj )Pj . (3.24) j=1 In particular, PA (f ) = A for f (λ) = λ. Next we want to define this operator for unbounded Borel functions. Since we expect the resulting operator to be unbounded, we need a suitable domain first. Motivated by (3.18), we set Df = {ψ ∈ H| |f (λ)|2 dµψ (λ) ∞}. (3.25) R This is clearly a linear subspace of H since µαψ (Ω) = |α|2 µψ (Ω) and since µϕ+ψ (Ω) = P (Ω)(ϕ+ψ) 2 ≤ 2( P (Ω)ϕ 2 + P (Ω)ψ 2 ) = 2(µϕ (Ω)+µψ (Ω)) (by the triangle inequality). For every ψ ∈ Df , the sequence of bounded Borel functions fn = χΩn f, Ωn = {λ| |f (λ)| ≤ n}, (3.26) is a Cauchy sequence converging to f in the sense of L2 (R, dµψ ). Hence, by virtue of (3.18), the vectors ψn = P (fn )ψ form a Cauchy sequence in H and we can define P (f )ψ = lim P (fn )ψ, ψ ∈ Df . (3.27) n→∞ By construction, P (f ) is a linear operator such that (3.18) holds. Since f ∈ L1 (R, dµψ ) (µψ is finite), (3.17) also remains true at least for ϕ = ψ. In addition, Df is dense. Indeed, let Ωn be defined as in (3.26) and abbreviate ψn = P (Ωn )ψ. Now observe that dµψn = χΩn dµψ and hence ψn ∈ Df . Moreover, ψn → ψ by (3.18) since χΩn → 1 in L2 (R, dµψ ). The operator P (f ) has some additional properties. One calls an un- bounded operator A normal if D(A) = D(A∗ ) and Aψ = A∗ ψ for all ψ ∈ D(A). Note that normal operators are closed since the graph norms on D(A) = D(A∗ ) are identical.
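In the matrix example (3.24) the homomorphism properties (3.4) can be verified directly; the sketch below (random matrix and the particular test functions are ad hoc choices) also evaluates $f(\lambda) = \mathrm{e}^{-\mathrm{i}\lambda}$, which is the time evolution from (3.2) computed by diagonalization:

```python
import numpy as np

rng = np.random.default_rng(9)
B = rng.standard_normal((5, 5))
A = (B + B.T) / 2
w, V = np.linalg.eigh(A)

def calc(f):
    # finite-dimensional functional calculus (3.24): P_A(f) = sum_j f(lambda_j) P_j
    return (V * f(w)) @ V.conj().T            # same as V diag(f(w)) V^*

f = lambda lam: np.exp(-1j * lam)             # f(A) = exp(-iA), cf. (3.2)
g = lambda lam: lam**2 + 1.0

print(np.allclose(calc(lambda l: f(l) * g(l)), calc(f) @ calc(g)))    # (fg)(A) = f(A) g(A)
print(np.allclose(calc(lambda l: np.conj(f(l))), calc(f).conj().T))   # (f*)(A) = f(A)^*
print(np.allclose(calc(lambda l: l), A))                              # the identity function gives A
print(np.allclose(calc(f).conj().T @ calc(f), np.eye(5)))             # exp(-iA) is unitary
```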
  • 104. 92 3. The spectral theorem Theorem 3.2. For any Borel function f , the operator P (f ) ≡ f (λ)dP (λ), D(P (f )) = Df , (3.28) R is normal and satisfies P (f )∗ = P (f ∗ ). (3.29) Proof. Let f be given and define fn , Ωn as above. Since (3.29) holds for fn by our previous theorem, we get ϕ, P (f )ψ = P (f ∗ )ϕ, ψ for any ϕ, ψ ∈ Df = Df ∗ by continuity. Thus it remains to show that ˜ D(P (f )∗ ) ⊆ Df . If ψ ∈ D(P (f )∗ ), we have ψ, P (f )ϕ = ψ, ϕ for all ϕ ∈ Df by definition. By construction of P (f ) we have P (fn ) = P (f )P (Ωn ) and thus ∗ ˜ P (fn )ψ, ϕ = ψ, P (fn )ϕ = ψ, P (f )P (Ωn )ϕ = P (Ωn )ψ, ϕ ∗ ˜ for any ϕ ∈ H shows P (fn )ψ = P (Ωn )ψ. This proves existence of the limit ∗ ˜ ˜ lim |fn |2 dµψ = lim P (fn )ψ 2 = lim P (Ωn )ψ 2 = ψ 2, n→∞ R n→∞ n→∞ which by monotone convergence implies f ∈ L2 (R, dµψ ); that is, ψ ∈ Df . That P (f ) is normal follows from (3.18), which implies P (f )ψ 2 = P (f ∗ )ψ 2 = R |f (λ)|2 dµψ . These considerations seem to indicate some kind of correspondence be- ˜ tween the operators P (f ) in H and f in L2 (R, dµψ ). Recall that U : H → H is called unitary if it is a bijection which preserves norms U ψ = ψ (and ˜ ˜ hence scalar products). The operators A in H and A in H are said to be unitarily equivalent if ˜ U A = AU, ˜ U D(A) = D(A). (3.30) ˜ ˜ Clearly, A is self-adjoint if and only if A is and σ(A) = σ(A). Now let us return to our original problem and consider the subspace Hψ = {P (g)ψ|g ∈ L2 (R, dµψ )} ⊆ H. (3.31) Note that Hψ is closed since L2 is and ψn = P (gn )ψ converges in H if and only if gn converges in L2 . It even turns out that we can restrict P (f ) to Hψ (see Section 2.5). Lemma 3.3. The subspace Hψ reduces P (f ); that is, Pψ P (f ) ⊆ P (f )Pψ . Here Pψ is the projection onto Hψ .
  • 105. 3.1. The spectral theorem 93 Proof. First suppose f is bounded. Any ϕ ∈ H can be decomposed as ϕ = P (g)ψ + ϕ⊥ . Moreover, P (h)ψ, P (f )ϕ⊥ = P (f ∗ h)ψ, ϕ⊥ = 0 for every bounded function h implies P (f )ϕ⊥ ∈ H⊥ . Hence Pψ P (f )ϕ = ψ Pψ P (f )P (g)ψ = P (f )Pψ ϕ which by definition says that Hψ reduces P (f ). If f is unbounded, we consider fn = f χΩn as before. Then, for every ϕ ∈ Df , P (fn )Pψ ϕ = Pψ P (fn )ϕ. Letting n → ∞, we have P (Ωn )Pψ ϕ → Pψ ϕ and P (fn )Pψ ϕ = P (f )P (Ωn )Pψ ϕ → Pψ P (f )ϕ. Finally, closedness of P (f ) implies Pψ ϕ ∈ Df and P (f )Pψ ϕ = Pψ P (f )ϕ. In particular we can decompose P (f ) = P (f ) Hψ ⊕ P (f ) H⊥ . Note that ψ Pψ Df = Df ∩ Hψ = {P (g)ψ|g, f g ∈ L2 (R, dµψ )} (3.32) and P (f )P (g)ψ = P (f g)ψ ∈ Hψ in this case. By (3.18), the relation Uψ (P (f )ψ) = f (3.33) defines a unique unitary operator Uψ : Hψ → L2 (R, dµψ ) such that Uψ P (f ) Hψ = f Uψ , (3.34) where f is identified with its corresponding multiplication operator. More- over, if f is unbounded, we have Uψ (Df ∩Hψ ) = D(f ) = {g ∈ L2 (R, dµψ )|f g ∈ L2 (R, dµψ )} (since ϕ = P (f )ψ implies dµϕ = |f |2 dµψ ) and the above equa- tion still holds. The vector ψ is called cyclic if Hψ = H and in this case our picture is complete. Otherwise we need to extend this approach. A set {ψj }j∈J (J some index set) is called a set of spectral vectors if ψj = 1 and Hψi ⊥ Hψj for all i = j. A set of spectral vectors is called a spectral basis if j Hψj = H. Luckily a spectral basis always exists: Lemma 3.4. For every projection-valued measure P , there is an (at most countable) spectral basis {ψn } such that H= Hψn (3.35) n and a corresponding unitary operator U= Uψn : H → L2 (R, dµψn ) (3.36) n n such that for any Borel function f , U P (f ) = f U, U Df = D(f ). (3.37)
  • 106. 94 3. The spectral theorem Proof. It suffices to show that a spectral basis exists. This can be easily done using a Gram–Schmidt type construction. First of all observe that if {ψj }j∈J is a spectral set and ψ ⊥ Hψj for all j, we have Hψ ⊥ Hψj for all j. Indeed, ψ ⊥ Hψj implies P (g)ψ ⊥ Hψj for every bounded function g since P (g)ψ, P (f )ψj = ψ, P (g ∗ f )ψj = 0. But P (g)ψ with g bounded is dense in Hψ implying Hψ ⊥ Hψj . ˜ ˜ Now start with some total set {ψj }. Normalize ψ1 and choose this to be ψ1 . Move to the first ψ ˜j which is not in Hψ , project to the orthogonal 1 complement of Hψ1 and normalize it. Choose the result to be ψ2 . Proceeding ˜ like this, we get a set of spectral vectors {ψj } such that span{ψj } ⊆ j Hψj . ˜ Hence H = span{ψj } ⊆ Hψ . j j It is important to observe that the cardinality of a spectral basis is not well-defined (in contradistinction to the cardinality of an ordinary basis of the Hilbert space). However, it can be at most equal to the cardinality of an ordinary basis. In particular, since H is separable, it is at most count- able. The minimal cardinality of a spectral basis is called the spectral multiplicity of P . If the spectral multiplicity is one, the spectrum is called simple. Example. Let H = C2 and A = 0 0 and consider the associated projection- 01 valued measure PA (Ω) as before. Then ψ1 = (1, 0) and ψ2 = (0, 1) are a spectral basis. However, ψ = (1, 1) is cyclic and hence the spectrum of A is simple. If A = 1 0 , there is no cyclic vector (why?) and hence the spectral 01 multiplicity is two. Using this canonical form of projection-valued measures, it is straight- forward to prove Lemma 3.5. Let f, g be Borel functions and α, β ∈ C. Then we have αP (f ) + βP (g) ⊆ P (αf + βg), D(αP (f ) + βP (g)) = D|f |+|g| (3.38) and P (f )P (g) ⊆ P (f g), D(P (f )P (g)) = Dg ∩ Df g . (3.39) Now observe that to every projection-valued measure P we can assign a self-adjoint operator A = R λdP (λ). The question is whether we can invert this map. To do this, we consider the resolvent RA (z) = R (λ − z)−1 dP (λ). From (3.17) the corresponding quadratic form is given by 1 Fψ (z) = ψ, RA (z)ψ = dµψ (λ), (3.40) R λ−z
  • 107. 3.1. The spectral theorem 95 which is know as the Borel transform of the measure µψ . By 1 Im(Fψ (z)) = Im(z) dµψ (λ), (3.41) R |λ − z|2 we infer that Fψ (z) is a holomorphic map from the upper half plane into itself. Such functions are called Herglotz or Nevanlinna functions (see Section 3.4). Moreover, the measure µψ can be reconstructed from Fψ (z) by the Stieltjes inversion formula λ+δ 1 µψ (λ) = lim lim Im(Fψ (t + iε))dt. (3.42) δ↓0 ε↓0 π −∞ (The limit with respect to δ is only here to ensure right continuity of µψ (λ).) M Conversely, if Fψ (z) is a Herglotz function satisfying |Fψ (z)| ≤ Im(z) , then it is the Borel transform of a unique measure µψ (given by the Stieltjes inversion formula) satisfying µψ (R) ≤ M . So let A be a given self-adjoint operator and consider the expectation of the resolvent of A, Fψ (z) = ψ, RA (z)ψ . (3.43) This function is holomorphic for z ∈ ρ(A) and satisfies ψ 2 Fψ (z ∗ ) = Fψ (z)∗ and |Fψ (z)| ≤ (3.44) Im(z) (see (2.69) and Theorem 2.18). Moreover, the first resolvent formula (2.81) shows that it maps the upper half plane to itself: Im(Fψ (z)) = Im(z) RA (z)ψ 2 ; (3.45) that is, it is a Herglotz function. So by our above remarks, there is a corresponding measure µψ (λ) given by the Stieltjes inversion formula. It is called the spectral measure corresponding to ψ. More generally, by polarization, for each ϕ, ψ ∈ H we can find a corre- sponding complex measure µϕ,ψ such that 1 ϕ, RA (z)ψ = dµϕ,ψ (λ). (3.46) R λ−z The measure µϕ,ψ is conjugate linear in ϕ and linear in ψ. Moreover, a comparison with our previous considerations begs us to define a family of operators via the sesquilinear forms sΩ (ϕ, ψ) = χΩ (λ)dµϕ,ψ (λ). (3.47) R Since the associated quadratic form is nonnegative, qΩ (ψ) = sΩ (ψ, ψ) = µψ (Ω) ≥ 0, the Cauchy–Schwarz inequality for sesquilinear forms (Prob- lem 0.16) implies |sΩ (ϕ, ψ)| ≤ qΩ (ϕ)1/2 qΩ (ψ)1/2 = µϕ (Ω)1/2 µψ (Ω)1/2 ≤
  • 108. 96 3. The spectral theorem µϕ (R)1/2 µψ (R)1/2 ≤ ϕ ψ . Hence Corollary 1.9 implies that there is in- deed a family of nonnegative (0 ≤ ψ, PA (Ω)ψ ≤ 1) and hence self-adjoint operators PA (Ω) such that ϕ, PA (Ω)ψ = χΩ (λ)dµϕ,ψ (λ). (3.48) R Lemma 3.6. The family of operators PA (Ω) forms a projection-valued mea- sure. Proof. We first show PA (Ω1 )PA (Ω2 ) = PA (Ω1 ∩ Ω2 ) in two steps. First observe (using the first resolvent formula (2.81)) 1 dµ ∗ (λ) = RA (z ∗ )ϕ, RA (˜)ψ = ϕ, RA (z)RA (˜)ψ z z R λ − z RA (z )ϕ,ψ ˜ 1 = ( ϕ, RA (z)ψ − ϕ, RA (˜)ψ ) z z−z ˜ 1 1 1 1 dµϕ,ψ (λ) = − dµϕ,ψ (λ) = z−z R λ−z λ−z ˜ ˜ R λ−z λ−z ˜ implying dµRA (z ∗ )ϕ,ψ (λ) = (λ − z)−1 dµϕ,ψ (λ) by Problem 3.21. Secondly we compute 1 dµ (λ) = ϕ, RA (z)PA (Ω)ψ = RA (z ∗ )ϕ, PA (Ω)ψ R λ − z ϕ,PA (Ω)ψ 1 = χΩ (λ)dµRA (z ∗ )ϕ,ψ (λ) = χΩ (λ)dµϕ,ψ (λ) R R λ−z implying dµϕ,PA (Ω)ψ (λ) = χΩ (λ)dµϕ,ψ (λ). Equivalently we have ϕ, PA (Ω1 )PA (Ω2 )ψ = ϕ, PA (Ω1 ∩ Ω2 )ψ since χΩ1 χΩ2 = χΩ1 ∩Ω2 . In particular, choosing Ω1 = Ω2 , we see that PA (Ω1 ) is a projector. To see PA (R) = I, let ψ ∈ Ker(PA (R)). Then 0 = ψ, PA (R)ψ = µψ (R) implies ψ, RA (z)ψ = 0 which implies ψ = 0. ∞ Now let Ω = n=1 Ωn with Ωn ∩ Ωm = ∅ for n = m. Then n n ψ, PA (Ωj )ψ = µψ (Ωj ) → ψ, PA (Ω)ψ = µψ (Ω) j=1 j=1 by σ-additivity of µψ . Hence PA is weakly σ-additive which implies strong σ-additivity, as pointed out earlier. Now we can prove the spectral theorem for self-adjoint operators.
  • 109. 3.1. The spectral theorem 97 Theorem 3.7 (Spectral theorem). To every self-adjoint operator A there corresponds a unique projection-valued measure PA such that A= λdPA (λ). (3.49) R Proof. Existence has already been established. Moreover, Lemma 3.5 shows that PA ((λ−z)−1 ) = RA (z), z ∈ CR. Since the measures µϕ,ψ are uniquely determined by the resolvent and the projection-valued measure is uniquely determined by the measures µϕ,ψ , we are done. The quadratic form of A is given by qA (ψ) = λdµψ (λ) (3.50) R and can be defined for every ψ in the form domain Q(A) = D(|A|1/2 ) = {ψ ∈ H| |λ|dµψ (λ) ∞} (3.51) R (which is larger than the domain D(A) = {ψ ∈ H| R λ2 dµψ (λ) ∞}). This extends our previous definition for nonnegative operators. ˜ Note that if A and A are unitarily equivalent as in (3.30), then U RA (z) = RA (z)U and hence ˜ dµψ = d˜U ψ . µ (3.52) In particular, we have U PA (f ) = PA (f )U , U D(PA (f )) = D(PA (f )). ˜ ˜ Finally, let us give a characterization of the spectrum of A in terms of the associated projectors. Theorem 3.8. The spectrum of A is given by σ(A) = {λ ∈ R|PA ((λ − ε, λ + ε)) = 0 for all ε 0}. (3.53) 1 1 Proof. Let Ωn = (λ0 − n , λ0 + n ). Suppose PA (Ωn ) = 0. Then we can find a ψn ∈ PA (Ωn )H with ψn = 1. Since 2 2 (A − λ0 )ψn = (A − λ0 )PA (Ωn )ψn 1 = (λ − λ0 )2 χΩn (λ)dµψn (λ) ≤ , R n2 we conclude λ0 ∈ σ(A) by Lemma 2.16. Conversely, if PA ((λ0 − ε, λ0 + ε)) = 0, set fε (λ) = χR(λ0 −ε,λ0 +ε) (λ)(λ − λ0 )−1 . Then (A − λ0 )PA (fε ) = PA ((λ − λ0 )fε (λ)) = PA (R(λ0 − ε, λ0 + ε)) = I. Similarly PA (fε )(A − λ0 ) = I|D(A) and hence λ0 ∈ ρ(A).
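For a matrix, Theorem 3.8 says that the spectrum consists precisely of the eigenvalues, and $P_A(\{\lambda_0\})$ projects onto the corresponding eigenspace (see also Problem 3.10 below); a minimal numerical sketch (the diagonal example is an ad hoc choice):

```python
import numpy as np

A = np.diag([0.0, 0.0, 1.0, 3.0])            # eigenvalue 0 has multiplicity two
w, V = np.linalg.eigh(A)

def P(a, b):
    # spectral projection P_A((a, b)): sum of eigenprojections with a < lambda_j < b
    Q = np.zeros_like(A)
    for lam, v in zip(w, V.T):
        if a < lam < b:
            Q += np.outer(v, v)
    return Q

print(np.trace(P(-0.1, 0.1)))   # 2.0: rank of P_A({0}) equals the multiplicity of the eigenvalue 0
print(np.trace(P(1.9, 2.1)))    # 0.0: (1.9, 2.1) contains no spectrum, so 2 lies in rho(A)
```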
  • 110. 98 3. The spectral theorem In particular, PA ((λ1 , λ2 )) = 0 if and only if (λ1 , λ2 ) ⊆ ρ(A). Corollary 3.9. We have PA (σ(A)) = I and PA (R ∩ ρ(A)) = 0. (3.54) Proof. For every λ ∈ R ∩ ρ(A) there is some open interval Iλ with PA (Iλ ) = 0. These intervals form an open cover for R ∩ ρ(A) and there is a countable subcover Jn . Setting Ωn = Jn mn Jm , we have disjoint Borel sets which cover R ∩ ρ(A) and satisfy PA (Ωn ) = 0. Finally, strong σ-additivity shows PA (R ∩ ρ(A))ψ = n PA (Ωn )ψ = 0. Consequently, PA (f ) = PA (σ(A))PA (f ) = PA (χσ(A) f ). (3.55) In other words, PA (f ) is not affected by the values of f on Rσ(A)! It is clearly more intuitive to write PA (f ) = f (A) and we will do so from now on. This notation is justified by the elementary observation n n PA ( αj λj ) = αj Aj . (3.56) j=0 j=0 Moreover, this also shows that if A is bounded and f (A) can be defined via a convergent power series, then this agrees with our present definition by Theorem 3.1. Problem 3.1. Show that a self-adjoint operator P is a projection if and only if σ(P ) ⊆ {0, 1}. Problem 3.2. Consider the parity operator Π : L2 (Rn ) → L2 (Rn ), ψ(x) → ψ(−x). Show that Π is self-adjoint. Compute its spectrum σ(Π) and the corresponding projection-valued measure PΠ . Problem 3.3. Show that (3.7) is a projection-valued measure. What is the corresponding operator? Problem 3.4. Show that P (λ) defined in (3.14) satisfies properties (i)–(iv) stated there. Problem 3.5. Show that for a self-adjoint operator A we have RA (z) = dist(z, σ(A)). Problem 3.6. Suppose A is self-adjoint and B − z0 ≤ r. Show that σ(A + B) ⊆ σ(A) + Br (z0 ), where Br (z0 ) is the ball of radius r around z0 . (Hint: Problem 2.17.)
  • 111. 3.2. More on Borel measures 99 Problem 3.7. Show that for a self-adjoint operator A we have ARA (z) ≤ |z| Im(z) . Find some A for which equality is attained. Conclude that for every ψ ∈ H we have lim ARA (z)ψ = 0, (3.57) z→∞ where the limit is taken in any sector ε| Re(z)| ≤ | Im(z)|, ε 0. Problem 3.8. Suppose A is self-adjoint. Show that, if ψ ∈ D(An ), then n Aj ψ An ψ RA (z)ψ = − + O( n ), as z → ∞. (3.58) z j+1 |z| Im(z) j=0 (Hint: Proceed as in (2.87) and use the previous problem.) Problem 3.9. Let λ0 be an eigenvalue and ψ a corresponding normalized eigenvector. Compute µψ . Problem 3.10. Show that λ0 is an eigenvalue if and only if P ({λ0 }) = 0. Show that Ran(P ({λ0 })) is the corresponding eigenspace in this case. Problem 3.11 (Polar decomposition). Let A be a closed operator and √ set |A| = A∗ A (recall that, by Problem 2.12, A∗ A is self-adjoint and Q(A∗ A) = D(A)). Show that |A|ψ = Aψ . Conclude that Ker(A) = Ker(|A|) = Ran(|A|)⊥ and that ϕ = |A|ψ → Aψ if ϕ ∈ Ran(|A|), U= ϕ→0 if ϕ ∈ Ker(|A|) extends to a well-defined partial isometry; that is, U : Ker(U )⊥ → Ran(U ) is unitary, where Ker(U ) = Ker(A) and Ran(U ) = Ker(A∗ )⊥ . In particular, we have the polar decomposition A = U |A|. √ Problem 3.12. Compute |A| = A∗ A for the rank one operator A = √ ϕ, . ψ. Compute AA∗ also. 3.2. More on Borel measures Section 3.1 showed that in order to understand self-adjoint operators, one needs to understand multiplication operators on L2 (R, dµ), where dµ is a finite Borel measure. This is the purpose of the present section. The set of all growth points, that is, σ(µ) = {λ ∈ R|µ((λ − ε, λ + ε)) 0 for all ε 0}, (3.59)
  • 112. 100 3. The spectral theorem is called the spectrum of µ. The same proof as for Corollary 3.9 shows that the spectrum σ = σ(µ) is a support for µ; that is, µ(Rσ) = 0. In the previous section we have already seen that the Borel transform of µ, 1 F (z) = dµ(λ), (3.60) R λ−z plays an important role. Theorem 3.10. The Borel transform of a finite Borel measure is a Herglotz function. It is holomorphic in Cσ(µ) and satisfies µ(R) F (z ∗ ) = F (z)∗ , |F (z)| ≤ , z ∈ C+ . (3.61) Im(z) Proof. First of all note 1 dµ(λ) Im(F (z)) = Im dµ(λ) = Im(z) , R λ−z R |λ − z|2 which shows that F maps C+ to C+ . Moreover, F (z ∗) = F (z)∗ is obvious and dµ(λ) 1 |F (z)| ≤ ≤ dµ(λ) R |λ − z| Im(z) R establishes the bound. Moreover, since µ(Rσ) = 0, we have 1 F (z) = dµ(λ), σ λ−z which together with the bound 1 1 ≤ |λ − z| dist(z, σ) allows the application of the dominated convergence theorem to conclude that F is continuous on Cσ. To show that F is holomorphic in Cσ, by Morera’s theorem, it suffices to check Γ F (z)dz = 0 for every triangle Γ ⊂ Cσ. Since (λ − z)−1 is bounded for (λ, z) ∈ σ × Γ, this follows from −1 −1 Γ (λ − z) dz = 0 by using Fubini, Γ F (z)dz = Γ R (λ − z) dµ(λ) dz = −1 dz dµ(λ) = 0. R Γ (λ − z) Note that F cannot be holomorphically extended to a larger domain. In fact, if F is holomorphic in a neighborhood of some λ ∈ R, then F (λ) = F (λ∗ ) = F (λ)∗ implies Im(F (λ)) = 0 and the Stieltjes inversion formula (Theorem 3.21) shows that λ ∈ Rσ(µ). Associated with this measure is the operator Af (λ) = λf (λ), D(A) = {f ∈ L2 (R, dµ)|λf (λ) ∈ L2 (R, dµ)}. (3.62) By Theorem 3.8 the spectrum of A is precisely the spectrum of µ; that is, σ(A) = σ(µ). (3.63)
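For a purely discrete measure the operator (3.62) is simply a diagonal matrix, so (3.63) and the Herglotz property of its Borel transform (Theorem 3.10) can be verified directly. A minimal sketch, assuming NumPy; the points and weights are illustrative only.

import numpy as np

# A purely discrete finite measure mu = sum_k w_k * delta_{x_k}.  In L^2(R, dmu) the
# operator (Af)(lam) = lam f(lam) is just diag(x_k), so sigma(A) equals the set of
# growth points of mu, i.e. {x_k}.
x = np.array([-1.0, 0.5, 2.0])
w = np.array([0.2, 0.5, 0.3])

A = np.diag(x)
assert np.allclose(np.sort(np.linalg.eigvalsh(A)), np.sort(x))   # sigma(A) = sigma(mu)

def F(z):
    """Borel transform F(z) = integral of dmu(lam) / (lam - z)."""
    return np.sum(w / (x - z))

# Theorem 3.10: F is Herglotz and |F(z)| <= mu(R)/Im(z) on the upper half plane.
for z in [0.3 + 0.1j, -2.0 + 1.0j, 1.0 + 0.01j]:
    assert F(z).imag > 0                        # maps C+ into C+
    assert abs(F(z)) <= w.sum() / z.imag + 1e-12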
  • 113. 3.2. More on Borel measures 101 Note that 1 ∈ L2 (R, dµ) is a cyclic vector for A and that dµg,f (λ) = g(λ)∗ f (λ)dµ(λ). (3.64) Now what can we say about the function f (A) (which is precisely the multiplication operator by f ) of A? We are only interested in the case where f is real-valued. Introduce the measure (f µ)(Ω) = µ(f −1 (Ω)). (3.65) Then g(λ)d(f µ)(λ) = g(f (λ))dµ(λ). (3.66) R R In fact, it suffices to check this formula for simple functions g, which follows since χΩ ◦ f = χf −1 (Ω) . In particular, we have Pf (A) (Ω) = χf −1 (Ω) . (3.67) It is tempting to conjecture that f (A) is unitarily equivalent to multi- plication by λ in L2 (R, d(f µ)) via the map L2 (R, d(f µ)) → L2 (R, dµ), g → g ◦ f. (3.68) However, this map is only unitary if its range is L2 (R, dµ). Lemma 3.11. Suppose f is injective. Then U : L2 (R, dµ) → L2 (R, d(f µ)), g → g ◦ f −1 (3.69) is a unitary map such that U f (λ) = λ. Example. Let f (λ) = λ2 . Then (g ◦ f )(λ) = g(λ2 ) and the range of the above map is given by the symmetric functions. Note that we can still get a unitary map L2 (R, d(f µ)) ⊕ L2 (R, d(f µ)) → L2 (R, dµ), (g1 , g2 ) → g1 (λ2 ) + g2 (λ2 )(χ(0,∞) (λ) − χ(0,∞) (−λ)). Lemma 3.12. Let f be real-valued. The spectrum of f (A) is given by σ(f (A)) = σ(f µ). (3.70) In particular, σ(f (A)) ⊆ f (σ(A)), (3.71) where equality holds if f is continuous and the closure can be dropped if, in addition, σ(A) is bounded (i.e., compact). Proof. The first formula follows by comparing σ(f µ) = {λ ∈ R | µ(f −1 (λ − ε, λ + ε)) 0 for all ε 0} with (2.74).
  • 114. 102 3. The spectral theorem If f is continuous, f −1 ((f (λ) − ε, f (λ) + ε)) contains an open interval around λ and hence f (λ) ∈ σ(f (A)) if λ ∈ σ(A). If, in addition, σ(A) is compact, then f (σ(A)) is compact and hence closed. Whether two operators with simple spectrum are unitarily equivalent can be read off from the corresponding measures: Lemma 3.13. Let A1 , A2 be self-adjoint operators with simple spectrum and corresponding spectral measures µ1 and µ2 of cyclic vectors. Then A1 and A2 are unitarily equivalent if and only if µ1 and µ2 are mutually absolutely continuous. Proof. Without restriction we can assume that Aj is multiplication by λ in L2 (R, dµj ). Let U : L2 (R, dµ1 ) → L2 (R, dµ2 ) be a unitary map such that U A1 = A2 U . Then we also have U f (A1 ) = f (A2 )U for any bounded Borel function and hence U f (λ) = U f (λ) · 1 = f (λ)U (1)(λ) and thus U is multiplication by u(λ) = U (1)(λ). Moreover, since U is unitary, we have µ1 (Ω) = |χΩ |2 dµ1 = |u χΩ |2 dµ2 = |u|2 dµ2 ; R R Ω that is, dµ1 = |u|2 dµ 2 . Reversing the roles of A1 and A2 , we obtain dµ2 = |v|2 dµ1 , where v = U −1 1. The converse is left as an exercise (Problem 3.17). Next we recall the unique decomposition of µ with respect to Lebesgue measure, dµ = dµac + dµs , (3.72) where µac is absolutely continuous with respect to Lebesgue measure (i.e., we have µac (B) = 0 for all B with Lebesgue measure zero) and µs is singular with respect to Lebesgue measure (i.e., µs is supported, µs (RB) = 0, on a set B with Lebesgue measure zero). The singular part µs can be further decomposed into a (singularly) continuous and a pure point part, dµs = dµsc + dµpp , (3.73) where µsc is continuous on R and µpp is a step function. Since the measures dµac , dµsc , and dµpp are mutually singular, they have mutually disjoint supports Mac , Msc , and Mpp . Note that these sets are not unique. We will choose them such that Mpp is the set of all jumps of µ(λ) and such that Msc has Lebesgue measure zero. To the sets Mac , Msc , and Mpp correspond projectors P ac = χMac (A), P sc = χMsc (A), and P pp = χMpp (A) satisfying P ac + P sc + P pp = I. In
  • 115. 3.2. More on Borel measures 103 other words, we have a corresponding direct sum decomposition of both our Hilbert space L2 (R, dµ) = L2 (R, dµac ) ⊕ L2 (R, dµsc ) ⊕ L2 (R, dµpp ) (3.74) and our operator A = (AP ac ) ⊕ (AP sc ) ⊕ (AP pp ). (3.75) The corresponding spectra, σac (A) = σ(µac ), σsc (A) = σ(µsc ), and σpp (A) = σ(µpp ) are called the absolutely continuous, singularly continuous, and pure point spectrum of A, respectively. It is important to observe that σpp (A) is in general not equal to the set of eigenvalues σp (A) = {λ ∈ R|λ is an eigenvalue of A} (3.76) since we only have σpp (A) = σp (A). 1 Example. Let H = 2 (N) and let A be given by Aδn = n δn , where δn is the sequence which is 1 at the n’th place and zero otherwise (that is, A is 1 1 a diagonal matrix with diagonal elements n ). Then σp (A) = { n |n ∈ N} but σ(A) = σpp (A) = σp (A) ∪ {0}. To see this, just observe that δn is the 1 eigenvector corresponding to the eigenvalue n and for z ∈ σ(A) we have n RA (z)δn = 1−nz δn . At z = 0 this formula still gives the inverse of A, but it is unbounded and hence 0 ∈ σ(A) but 0 ∈ σp (A). Since a continuous measure cannot live on a single point and hence also not on a countable set, we have σac (A) = σsc (A) = ∅. Example. An example with purely absolutely continuous spectrum is given by taking µ to be the Lebesgue measure. An example with purely singularly continuous spectrum is given by taking µ to be the Cantor measure. Finally, we show how the spectrum can be read off from the boundary values of Im(F ) towards the real line. We define the following sets: Mac = {λ|0 lim sup Im(F (λ + iε)) ∞}, ε↓0 Ms = {λ| lim sup Im(F (λ + iε)) = ∞}, (3.77) ε↓0 M = Mac ∪ Ms = {λ|0 lim sup Im(F (λ + iε))}. ε↓0 Then, by Theorem 3.23 we conclude that these sets are minimal supports for µac , µs , and µ, respectively. In fact, by Theorem 3.23 we could even restrict ourselves to values of λ, where the lim sup is a lim (finite or infinite). Lemma 3.14. The spectrum of µ is given by σ(µ) = M , M = {λ|0 lim inf Im(F (λ + iε))}. (3.78) ε↓0
  • 116. 104 3. The spectral theorem Proof. First observe that F is real holomorphic near λ ∈ σ(µ) and hence Im(F (λ)) = 0 in this case. Thus M ⊆ σ(µ) and since σ(µ) is closed, we even have M ⊆ σ(µ). To see the converse, note that by Theorem 3.23, the set M is a support for M . Thus, if λ ∈ σ(µ), then 0 µ((λ − ε, λ + ε)) = µ((λ − ε, λ + ε) ∩ M ) for all ε 0 and we can find a sequence λn ∈ (λ−1/n, λ+1/n)∩M converging to λ from inside M . This shows the remaining part σ(µ) ⊆ M . To recover σ(µac ) from Mac , we need the essential closure of a Borel set N ⊆ R, ess N = {λ ∈ R||(λ − ε, λ + ε) ∩ N | 0 for all ε 0}. (3.79) ess Note that N is closed, whereas, in contradistinction to the ordinary clo- ess sure, we might have N ⊂ N (e.g., any isolated point of N will disappear). Lemma 3.15. The absolutely continuous spectrum of µ is given by ess σ(µac ) = M ac . (3.80) Proof. We use that 0 µac ((λ − ε, λ + ε)) = µac ((λ − ε, λ + ε) ∩ Mac ) is equivalent to |(λ − ε, λ + ε) ∩ Mac | 0. One direction follows from the definition of absolute continuity and the other from minimality of Mac . Problem 3.13. Construct a multiplication operator on L2 (R) which has dense point spectrum. Problem 3.14. Let λ be Lebesgue measure on R. Show that if f ∈ AC(R) with f 0, then 1 d(f λ) = dλ. f (λ) Problem 3.15. Let dµ(λ) = χ[0,1] (λ)dλ and f (λ) = χ(−∞,t] (λ), t ∈ R. Compute f µ. Problem 3.16. Let A be the multiplication operator by the Cantor function in L2 (0, 1). Compute the spectrum of A. Determine the spectral types. Problem 3.17. Show the missing direction in the proof of Lemma 3.13. ess Problem 3.18. Show N ⊆ N. 3.3. Spectral types Our next aim is to transfer the results of the previous section to arbitrary self-adjoint operators A using Lemma 3.4. To this end, we will need a spectral measure which contains the information from all measures in a spectral basis. This will be the case if there is a vector ψ such that for every
  • 117. 3.3. Spectral types 105 ϕ ∈ H its spectral measure µϕ is absolutely continuous with respect to µψ . Such a vector will be called a maximal spectral vector of A and µψ will be called a maximal spectral measure of A. Lemma 3.16. For every self-adjoint operator A there is a maximal spectral vector. Proof. Let {ψj }j∈J be a spectral basis and choose nonzero numbers εj with 2 j∈J |εj | = 1. Then I claim that ψ= εj ψj j∈J is a maximal spectral vector. Let ϕ be given. Then we can write it as ϕ = 2 2 j fj (A)ψj and hence dµϕ = j |fj | dµψj . But µψ (Ω) = j |εj | µψj (Ω) = 0 implies µψj (Ω) = 0 for every j ∈ J and thus µϕ (Ω) = 0. A set {ψj } of spectral vectors is called ordered if ψk is a maximal k−1 spectral vector for A restricted to ( j=1 Hψj )⊥ . As in the unordered case one can show Theorem 3.17. For every self-adjoint operator there is an ordered spectral basis. Observe that if {ψj } is an ordered spectral basis, then µψj+1 is absolutely continuous with respect to µψj . If µ is a maximal spectral measure, we have σ(A) = σ(µ) and the fol- lowing generalization of Lemma 3.12 holds. Theorem 3.18 (Spectral mapping). Let µ be a maximal spectral measure and let f be real-valued. Then the spectrum of f (A) is given by σ(f (A)) = {λ ∈ R|µ(f −1 (λ − ε, λ + ε)) 0 for all ε 0}. (3.81) In particular, σ(f (A)) ⊆ f (σ(A)), (3.82) where equality holds if f is continuous and the closure can be dropped if, in addition, σ(A) is bounded. Next, we want to introduce the splitting (3.74) for arbitrary self-adjoint operators A. It is tempting to pick a spectral basis and treat each summand in the direct sum separately. However, since it is not clear that this approach is independent of the spectral basis chosen, we use the more sophisticated
  • 118. 106 3. The spectral theorem definition Hac = {ψ ∈ H|µψ is absolutely continuous}, Hsc = {ψ ∈ H|µψ is singularly continuous}, Hpp = {ψ ∈ H|µψ is pure point}. (3.83) Lemma 3.19. We have H = Hac ⊕ Hsc ⊕ Hpp . (3.84) There are Borel sets Mxx such that the projector onto Hxx is given by P xx = χMxx (A), xx ∈ {ac, sc, pp}. In particular, the subspaces Hxx reduce A. For the sets Mxx one can choose the corresponding supports of some maximal spectral measure µ. Proof. We will use the unitary operator U of Lemma 3.4. Pick ϕ ∈ H and write ϕ = n ϕn with ϕn ∈ Hψn . Let fn = U ϕn . Then, by construction of the unitary operator U , ϕn = fn (A)ψn and hence dµϕn = |fn |2 dµψn . Moreover, since the subspaces Hψn are orthogonal, we have dµϕ = |fn |2 dµψn n and hence dµϕ,xx = |fn |2 dµψn ,xx , xx ∈ {ac, sc, pp}. n This shows U Hxx = L2 (R, dµψn ,xx ), xx ∈ {ac, sc, pp} n and reduces our problem to the considerations of the previous section. Furthermore, note that if µ is a maximal spectral measure, then every support for µxx is also a support for µϕ,xx for any ϕ ∈ H. The absolutely continuous, singularly continuous, and pure point spectrum of A are defined as σac (A) = σ(A|Hac ), σsc (A) = σ(A|Hsc ), and σpp (A) = σ(A|Hpp ), (3.85) respectively. If µ is a maximal spectral measure, we have σac (A) = σ(µac ), σsc (A) = σ(µsc ), and σpp (A) = σ(µpp ). ˜ ˜˜ If A and A are unitarily equivalent via U , then so are A|Hxx and A|Hxx ˜ by (3.52). In particular, σxx (A) = σxx (A). Problem 3.19. Compute σ(A), σac (A), σsc (A), and σpp (A) for the multi- 1 plication operator A = 1+x2 in L2 (R). What is its spectral multiplicity?
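In finite dimensions the notions of this section become elementary: the spectral measure of a vector is a finite sum of point masses at the eigenvalues, and a maximal spectral vector is one which overlaps every eigenspace. A small sketch along these lines (Python with NumPy; the matrix and vectors are illustrative):

import numpy as np

# Finite-dimensional illustration of Lemma 3.16: the spectral measure of psi puts
# weight |<e_k, psi>|^2 at the eigenvalue lam_k, and psi is a maximal spectral vector
# iff it has a nonzero component along every eigenspace.
A = np.diag([1.0, 2.0, 5.0])          # simple spectrum, eigenvectors are the e_k

def weights(psi):
    """Atoms of the spectral measure mu_psi at the eigenvalues 1, 2, 5."""
    return np.abs(psi) ** 2

psi = np.ones(3) / np.sqrt(3.0)       # overlaps every eigenvector
phi = np.array([0.0, 1.0, 0.0])       # supported on a single eigenspace

# mu_phi is absolutely continuous w.r.t. mu_psi: mu_psi(Omega) = 0 forces mu_phi(Omega) = 0.
assert all(weights(psi)[k] > 0 for k in range(3) if weights(phi)[k] > 0)

# e_1 is not maximal: mu_{e_1}({2}) = 0 although mu_phi({2}) > 0.
e1 = np.array([1.0, 0.0, 0.0])
assert weights(e1)[1] == 0.0 and weights(phi)[1] > 0.0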
  • 119. 3.4. Appendix: The Herglotz theorem 107 3.4. Appendix: The Herglotz theorem Let C± = {z ∈ C| ± Im(z) 0} be the upper, respectively, lower, half plane. A holomorphic function F : C+ → C+ mapping the upper half plane to itself is called a Herglotz function. We can define F on C− using F (z ∗ ) = F (z)∗ . In Theorem 3.10 we have seen that the Borel transform of a finite mea- sure is a Herglotz function satisfying a growth estimate. It turns out that the converse is also true. Theorem 3.20 (Herglotz representation). Suppose F is a Herglotz function satisfying M |F (z)| ≤ , z ∈ C+ . (3.86) Im(z) Then there is a Borel measure µ, satisfying µ(R) ≤ M , such that F is the Borel transform of µ. Proof. We abbreviate F (z) = v(z) + i w(z) and z = x + i y. Next we choose a contour Γ = {x + iε + λ|λ ∈ [−R, R]} ∪ {x + iε + Reiϕ |ϕ ∈ [0, π]} and note that z lies inside Γ and z ∗ + 2iε lies outside Γ if 0 ε y R. Hence we have by Cauchy’s formula 1 1 1 F (z) = − ∗ − 2iε F (ζ)dζ. 2πi Γ ζ −z ζ −z Inserting the explicit form of Γ, we see R 1 y−ε F (z) = F (x + iε + λ)dλ π −R λ2 + (y − ε)2 π i y−ε + F (x + iε + Reiϕ )Reiϕ dϕ. π 0 R2 e2iϕ + (y − ε)2 The integral over the semi-circle vanishes as R → ∞ and hence we obtain 1 y−ε F (z) = F (λ + iε)dλ π R (λ − x)2 + (y − ε)2 and taking imaginary parts, w(z) = φε (λ)wε (λ)dλ, R where φε (λ) = (y − ε)/((λ − x)2 + (y − ε)2 ) and wε (λ) = w(λ + iε)/π. Letting y → ∞, we infer from our bound wε (λ)dλ ≤ M. R
  • 120. 108 3. The spectral theorem In particular, since |φε (λ) − φ0 (λ)| ≤ const ε, we have w(z) = lim φ0 (λ)dµε (λ), ε↓0 R λ where µε (λ) = −∞ wε (x)dx. Since µε (R) ≤ M , Lemma A.26 implies that there is subsequence which converges vaguely to some measure µ. Moreover, by Lemma A.27 we even have w(z) = φ0 (λ)dµ(λ). R Now F (z) and R (λ − z)−1 dµ(λ) have the same imaginary part and thus they only differ by a real constant. By our bound this constant must be zero. Observe dµ(λ) Im(F (z)) = Im(z) (3.87) R |λ − z|2 and lim λ Im(F (iλ)) = µ(R). (3.88) λ→∞ Theorem 3.21. Let F be the Borel transform of some finite Borel measure µ. Then the measure µ is unique and can be reconstructed via the Stieltjes inversion formula λ2 1 1 (µ((λ1 , λ2 )) + µ([λ1 , λ2 ])) = lim Im(F (λ + iε))dλ. (3.89) 2 ε↓0 π λ1 Proof. By Fubini we have λ2 λ2 1 1 ε Im(F (λ + iε))dλ = dµ(x)dλ π λ1 π λ1 R (x − λ)2 + ε2 λ2 1 ε = dλ dµ(x), R π λ1 (x − λ)2 + ε2 where λ2 1 ε 1 λ2 − x λ1 − x 2 + ε2 dλ = arctan − arctan π λ1 (x − λ) π ε ε 1 → χ (x) + χ(λ1 ,λ2 ) (x) 2 [λ1 ,λ2 ] pointwise. Hence the result follows from the dominated convergence theorem −x −x since 0 ≤ π arctan( λ2ε ) − arctan( λ1ε ) ≤ 1. 1 Furthermore, the Radon–Nikodym derivative of µ can be obtained from the boundary values of F .
  • 121. 3.4. Appendix: The Herglotz theorem 109 Theorem 3.22. Let µ be a finite Borel measure and F its Borel transform. Then 1 1 (Dµ)(λ) ≤ lim inf F (λ + iε) ≤ lim sup F (λ + iε) ≤ (Dµ)(λ). (3.90) ε↓0 π ε↓0 π Proof. We need to estimate ε Im(F (λ + iε)) = Kε (t)dµ(t), Kε (t) = . R t2 + ε2 We first split the integral into two parts: Im(F (λ+iε)) = Kε (t−λ)dµ(t)+ Kε (t−λ)µ(t), Iδ = (λ−δ, λ+δ). Iδ RIδ Clearly the second part can be estimated by Kε (t − λ)µ(t) ≤ Kε (δ)µ(R). RIδ To estimate the first part, we integrate Kε (s) ds dµ(t) over the triangle {(s, t)|λ − s t λ + s, 0 s δ} = {(s, t)|λ − δ t λ + δ, t − λ s δ} and obtain δ µ(Is )Kε (s)ds = (K(δ) − Kε (t − λ))dµ(t). 0 Iδ Now suppose there are constants c and C such that c ≤ µ(Is ) ≤ C, 0 ≤ s ≤ δ. 2s Then δ δ 2c arctan( ) ≤ Kε (t − λ)dµ(t) ≤ 2C arctan( ) ε Iδ ε since δ δ δKε (δ) + −sKε (s)ds = arctan( ). 0 ε Thus the claim follows combining both estimates. As a consequence of Theorem A.37 and Theorem A.38 we obtain (cf. also Lemma A.39) Theorem 3.23. Let µ be a finite Borel measure and F its Borel transform. Then the limit 1 Im(F (λ)) = lim Im(F (λ + iε)) (3.91) ε↓0 π exists a.e. with respect to both µ and Lebesgue measure (finite or infinite) and 1 (Dµ)(λ) = Im(F (λ)) (3.92) π whenever (Dµ)(λ) exists.
  • 122. 110 3. The spectral theorem Moreover, the set {λ| Im(F (λ)) = ∞} is a support for the singularly continuous part and {λ|0 Im(F (λ)) ∞} is a minimal support for the absolutely continuous part. In particular, Corollary 3.24. The measure µ is purely absolutely continuous on I if lim supε↓0 Im(F (λ + iε)) ∞ for all λ ∈ I. The limit of the real part can be computed as well. Corollary 3.25. The limit lim F (λ + iε) (3.93) ε↓0 exists a.e. with respect to both µ and Lebesgue measure. It is finite a.e. with respect to Lebesgue measure. Proof. If F (z) is a Herglotz function, then so is F (z). Moreover, F (z) has values in the first quadrant; that is, both Re( F (z)) and Im( F (z)) are positive for z ∈ C+ . Hence both F (z) and i F (z) are Herglotz functions and by Theorem 3.23 both limε↓0 Re( F (λ + iε)) and limε↓0 Im( F (λ + iε)) exist and are finite a.e. with respect to Lebesgue measure. By taking squares, the same is true for F (z) and hence limε↓0 F (λ + iε) exists and is finite a.e. with respect to Lebesgue measure. Since limε↓0 Im(F (λ + iε)) = ∞ implies limε↓0 F (λ + iε) = ∞, the result follows. Problem 3.20. Find all rational Herglotz functions F : C → C satisfying F (z ∗ ) = F (z)∗ and lim|z|→∞ |zF (z)| = M ∞. What can you say about the zeros of F ? Problem 3.21. A complex measure dµ is a measure which can be written as a complex linear combinations of positive measures dµj : dµ = dµ1 − dµ2 + i(dµ3 − dµ4 ). Let dµ F (z) = R λ−z be the Borel transform of a complex measure. Show that µ is uniquely de- termined by F via the Stieltjes inversion formula λ2 1 1 (µ((λ1 , λ2 )) + µ([λ1 , λ2 ])) = lim (F (λ + iε) − F (λ − iε))dλ. 2 ε↓0 2πi λ1 Problem 3.22. Compute the Borel transform of the complex measure given dλ by dµ(λ) = (λ−i)2 .
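The Stieltjes inversion formula of Theorem 3.21 is easy to test numerically for a discrete measure, where the Borel transform is an explicit rational function. A minimal sketch (Python with NumPy; the atoms and weights are chosen for illustration, and one endpoint is placed on an atom to exhibit the factor 1/2):

import numpy as np

# Discrete measure mu = sum_k w_k delta_{x_k}; check the Stieltjes inversion formula (3.89).
x = np.array([0.0, 1.0, 3.0])
w = np.array([0.4, 0.35, 0.25])

def ImF(lam, eps):
    # Im F(lam + i eps) = eps * sum_k w_k / ((x_k - lam)^2 + eps^2)
    return np.sum(w[None, :] * eps / ((x[None, :] - lam[:, None]) ** 2 + eps ** 2), axis=1)

lam1, lam2 = 0.5, 3.0            # note: lam2 sits exactly on an atom of weight 0.25
for eps in [1e-2, 1e-3, 1e-4]:
    grid = np.linspace(lam1, lam2, 200001)
    val = np.trapz(ImF(grid, eps), grid) / np.pi
    # Expected limit: (1/2)(mu((lam1, lam2)) + mu([lam1, lam2])) = (0.35 + 0.60)/2 = 0.475
    print(eps, val)              # approaches 0.475 as eps decreases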
Chapter 4

Applications of the spectral theorem

This chapter can be mostly skipped on first reading. You might want to have a look at the first section and then come back to the remaining ones later.
Now let us show how the spectral theorem can be used. We will give a few typical applications:
First we will derive an operator-valued version of the Stieltjes inversion formula. To do this, we need to show how to integrate a family of functions of A with respect to a parameter. Moreover, we will show that these integrals can be evaluated by computing the corresponding integrals of the complex-valued functions.
Secondly we will consider commuting operators and show how certain facts, which are known to hold for the resolvent of an operator A, can be established for a larger class of functions.
Then we will show how the eigenvalues below the essential spectrum and the dimension of Ran PA(Ω) can be estimated using the quadratic form.
Finally, we will investigate tensor products of operators.

4.1. Integral formulas

We begin with the first task by having a closer look at the projections PA(Ω). They project onto subspaces corresponding to expectation values in the set Ω. In particular, the number

    ⟨ψ, χΩ(A)ψ⟩    (4.1)
  • 124. 112 4. Applications of the spectral theorem is the probability for a measurement of a to lie in Ω. In addition, we have ψ, Aψ = λ dµψ (λ) ∈ hull(Ω), ψ ∈ PA (Ω)H, ψ = 1, (4.2) Ω where hull(Ω) is the convex hull of Ω. The space Ran χ{λ0 } (A) is called the eigenspace corresponding to λ0 since we have ϕ, Aψ = λ χ{λ0 } (λ)dµϕ,ψ (λ) = λ0 dµϕ,ψ (λ) = λ0 ϕ, ψ (4.3) R R and hence Aψ = λ0 ψ for all ψ ∈ Ran χ{λ0 } (A). The dimension of the eigenspace is called the multiplicity of the eigenvalue. Moreover, since −iε lim = χ{λ0 } (λ), (4.4) ε↓0 λ − λ0 − iε we infer from Theorem 3.1 that lim −iεRA (λ0 + iε)ψ = χ{λ0 } (A)ψ. (4.5) ε↓0 Similarly, we can obtain an operator-valued version of the Stieltjes inversion formula. But first we need to recall a few facts from integration in Banach spaces. We will consider the case of mappings f : I → X where I = [t0 , t1 ] ⊂ R is a compact interval and X is a Banach space. As before, a function f : I → X is called simple if the image of f is finite, f (I) = {xi }n , and if each inverse i=1 image f −1 (xi ), 1 ≤ i ≤ n, is a Borel set. The set of simple functions S(I, X) forms a linear space and can be equipped with the sup norm f ∞ = sup f (t) . (4.6) t∈I The corresponding Banach space obtained after completion is called the set of regulated functions R(I, X). Observe that C(I, X) ⊂ R(I, X). In fact, consider the simple function n−1 t1 −t0 fn = i=0 f (si )χ[si ,si+1 ) , where si = t0 + i n . Since f ∈ C(I, X) is uniformly continuous, we infer that fn converges uniformly to f . For f ∈ S(I, X) we can define a linear map : S(I, X) → X by n f (t)dt = xi |f −1 (xi )|, (4.7) I i=1 where |Ω| denotes the Lebesgue measure of Ω. This map satisfies f (t)dt ≤ f ∞ (t1 − t0 ) (4.8) I
  • 125. 4.1. Integral formulas 113 and hence it can be extended uniquely to a linear map : R(I, X) → X with the same norm (t1 − t0 ) by Theorem 0.26. We even have f (t)dt ≤ f (t) dt, (4.9) I I which clearly holds for f ∈ S(I, X) and thus for all f ∈ R(I, X) by conti- nuity. In addition, if ∈ X ∗ is a continuous linear functional, then ( f (t)dt) = (f (t))dt, f ∈ R(I, X). (4.10) I I In particular, if A(t) ∈ R(I, L(H)), then A(t)dt ψ = (A(t)ψ)dt. (4.11) I I If I = R, we say that f : I → X is integrable if f ∈ R([−r, r], X) for all r 0 and if f (t) is integrable. In this case we can set f (t)dt = lim f (t)dt (4.12) R r→∞ [−r,r] and (4.9) and (4.10) still hold. t3 We will use the standard notation t2 f (s)ds = I χ(t2 ,t3 ) (s)f (s)ds and t2 t3 t3 f (s)ds = − t2 f (s)ds. We write f ∈ C 1 (I, X) if d f (t + ε) − f (t) f (t) = lim (4.13) dt ε→0 ε t exists for all t ∈ I. In particular, if f ∈ C(I, X), then F (t) = t0 f (s)ds ∈ C 1 (I, X) and dF/dt = f as can be seen from t+ε F (t + ε) − F (t) − f (t)ε = (f (s) − f (t))ds ≤ |ε| sup f (s) − f (t) . t s∈[t,t+ε] (4.14) The important facts for us are the following two results. Lemma 4.1. Suppose f : I × R → C is a bounded Borel function and set F (λ) = I f (t, λ)dt. Let A be self-adjoint. Then f (t, A) ∈ R(I, L(H)) and F (A) = f (t, A)dt, respectively, F (A)ψ = f (t, A)ψ dt. (4.15) I I Proof. That f (t, A) ∈ R(I, L(H)) follows from the spectral theorem, since it is no restriction to assume that A is multiplication by λ in some L2 space.
  • 126. 114 4. Applications of the spectral theorem We compute ϕ, ( f (t, A)dt)ψ = ϕ, f (t, A)ψ dt I I = f (t, λ)dµϕ,ψ (λ)dt I R = f (t, λ)dt dµϕ,ψ (λ) R I = F (λ)dµϕ,ψ (λ) = ϕ, F (A)ψ R by Fubini’s theorem and hence the first claim follows. Lemma 4.2. Suppose f : R → L(H) is integrable and A ∈ L(H). Then A f (t)dt = Af (t)dt, respectively, f (t)dtA = f (t)Adt. R R R R (4.16) Proof. It suffices to prove the case where f is simple and of compact sup- port. But for such functions the claim is straightforward. Now we can prove an operator-valued version of the Stieltjes inversion formula. Theorem 4.3 (Stone’s formula). Let A be self-adjoint. Then λ2 1 s 1 RA (λ + iε) − RA (λ − iε) dλ → PA ([λ1 , λ2 ]) + PA ((λ1 , λ2 )) 2πi λ1 2 (4.17) strongly. Proof. By λ2 1 1 1 1 λ2 ε − dλ = dλ 2πi λ1 x − λ − iε x − λ + iε π λ1 (x − λ)2 + ε2 1 λ2 − x λ1 − x = arctan − arctan π ε ε 1 → χ (x) + χ(λ1 ,λ2 ) (x) 2 [λ1 ,λ2 ] the result follows combining the last part of Theorem 3.1 with Lemma 4.1.
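For a Hermitian matrix, Stone's formula can be checked by a direct numerical quadrature of the resolvent (keeping in mind that the limit ε ↓ 0 is taken after the λ-integral). A minimal sketch, assuming NumPy; the matrix and the interval are illustrative.

import numpy as np

# Stone's formula (4.17) for a small Hermitian matrix: the resolvent integral
# recovers the spectral projection onto the interval (lam1, lam2).
A = np.array([[1.0, 0.3, 0.0],
              [0.3, 2.0, 0.2],
              [0.0, 0.2, 4.0]])
evals, evecs = np.linalg.eigh(A)
I = np.eye(3)

def R(z):
    return np.linalg.inv(A - z * I)           # resolvent R_A(z) = (A - z)^(-1)

lam1, lam2, eps = 0.0, 3.0, 1e-3
grid = np.linspace(lam1, lam2, 20001)
S = np.zeros((3, 3), dtype=complex)
for lam in grid:
    S += R(lam + 1j * eps) - R(lam - 1j * eps)
S *= (grid[1] - grid[0]) / (2j * np.pi)

# No eigenvalue sits on the boundary, so the symmetrized combination in (4.17)
# is just P_A((lam1, lam2)).
cols = evecs[:, (lam1 < evals) & (evals < lam2)]
P_exact = cols @ cols.T
print(np.max(np.abs(S - P_exact)))            # about 1e-3 here; shrinks as eps decreases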
  • 127. 4.2. Commuting operators 115 Note that by using the first resolvent formula, Stone’s formula can also be written in the form λ2 1 1 ψ, PA ([λ1 , λ2 ]) + PA ((λ1 , λ2 )) ψ = lim Im ψ, RA (λ + iε)ψ dλ 2 ε↓0 π λ1 λ2 ε = lim RA (λ + iε)ψ 2 dλ. ε↓0 π λ1 (4.18) Problem 4.1. Let Γ be a differentiable Jordan curve in ρ(A). Show χΩ (A) = RA (z)dz, Γ where Ω is the intersection of the interior of Γ with R. 4.2. Commuting operators Now we come to commuting operators. As a preparation we can now prove Lemma 4.4. Let K ⊆ R be closed and let C∞ (K) be the set of all continuous functions on K which vanish at ∞ (if K is unbounded) with the sup norm. The ∗-subalgebra generated by the function 1 λ→ (4.19) λ−z for one z ∈ CK is dense in C∞ (K). Proof. If K is compact, the claim follows directly from the complex Stone– Weierstraß theorem since (λ1 −z)−1 = (λ2 −z)−1 implies λ1 = λ2 . Otherwise, ˜ replace K by K = K ∪{∞}, which is compact, and set (∞−z)−1 = 0. Then we can again apply the complex Stone–Weierstraß theorem to conclude that ˜ our ∗-subalgebra is equal to {f ∈ C(K)|f (∞) = 0} which is equivalent to C∞ (K). We say that two bounded operators A, B commute if [A, B] = AB − BA = 0. (4.20) If A or B is unbounded, we soon run into trouble with this definition since the above expression might not even make sense for any nonzero vector (e.g., take B = ϕ, . ψ with ψ ∈ D(A)). To avoid this nuisance, we will replace A by a bounded function of A. A good candidate is the resolvent. Hence if A is self-adjoint and B is bounded, we will say that A and B commute if [RA (z), B] = [RA (z ∗ ), B] = 0 (4.21) for one z ∈ ρ(A).
  • 128. 116 4. Applications of the spectral theorem Lemma 4.5. Suppose A is self-adjoint and commutes with the bounded operator B. Then [f (A), B] = 0 (4.22) for any bounded Borel function f . If f is unbounded, the claim holds for any ψ ∈ D(f (A)) in the sense that Bf (A) ⊆ f (A)B. Proof. Equation (4.21) tells us that (4.22) holds for any f in the ∗-sub- algebra generated by RA (z). Since this subalgebra is dense in C∞ (σ(A)), the claim follows for all such f ∈ C∞ (σ(A)). Next fix ψ ∈ H and let f be bounded. Choose a sequence fn ∈ C∞ (σ(A)) converging to f in L2 (R, dµψ ). Then Bf (A)ψ = lim Bfn (A)ψ = lim fn (A)Bψ = f (A)Bψ. n→∞ n→∞ If f is unbounded, let ψ ∈ D(f (A)) and choose fn as in (3.26). Then f (A)Bψ = lim fn (A)Bψ = lim Bfn (A)ψ n→∞ n→∞ shows f ∈ L2 (R, dµBψ ) (i.e., Bψ ∈ D(f (A))) and f (A)Bψ = BF (A)ψ. In the special case where B is an orthogonal projection, we obtain Corollary 4.6. Let A be self-adjoint and H1 a closed subspace with corre- sponding projector P1 . Then H1 reduces A if and only if P1 and A commute. Furthermore, note Corollary 4.7. If A is self-adjoint and bounded, then (4.21) holds if and only if (4.20) holds. Proof. Since σ(A) is compact, we have λ ∈ C∞ (σ(A)) and hence (4.20) follows from (4.22) by our lemma. Conversely, since B commutes with any polynomial of A, the claim follows from the Neumann series. As another consequence we obtain Theorem 4.8. Suppose A is self-adjoint and has simple spectrum. A bounded operator B commutes with A if and only if B = f (A) for some bounded Borel function. Proof. Let ψ be a cyclic vector for A. By our unitary equivalence it is no restriction to assume H = L2 (R, dµψ ). Then Bg(λ) = Bg(λ) · 1 = g(λ)(B1)(λ) since B commutes with the multiplication operator g(λ). Hence B is multi- plication by f (λ) = (B1)(λ).
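A finite-dimensional illustration of Theorem 4.8 (Python with NumPy; matrices chosen for illustration): if A has distinct eigenvalues, the operators commuting with A are exactly those which are diagonal in its eigenbasis, that is, the functions of A.

import numpy as np

A = np.diag([1.0, 2.0, 5.0])                  # simple spectrum
fvals = np.array([7.0, -1.0, 0.5])            # values f(1), f(2), f(5) of some function f
B = np.diag(fvals)                            # then B = f(A) and [A, B] = 0

assert np.allclose(A @ B - B @ A, 0)

# Conversely, any B commuting with this A is diagonal in the eigenbasis of A and hence
# of the form f(A); a non-diagonal perturbation destroys commutativity:
C = B.copy(); C[0, 1] = 1.0
assert not np.allclose(A @ C - C @ A, 0)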
  • 129. 4.3. The min-max theorem 117 The assumption that the spectrum of A is simple is crucial as the exam- ple A = I shows. Note also that the functions exp(−itA) can also be used instead of resolvents. Lemma 4.9. Suppose A is self-adjoint and B is bounded. Then B commutes with A if and only if [e−iAt , B] = 0 (4.23) for all t ∈ R. ˆ Proof. It suffices to show [f (A), B] = 0 for f ∈ S(R), since these functions are dense in C∞ (R) by the complex Stone–Weierstraß theorem. Here f ˆ denotes the Fourier transform of f ; see Section 7.1. But for such f we have 1 1 [f (A), B] = √ [ f (t)e−iAt dt, B] = √ ˆ f (t)[e−iAt , B]dt = 0 2π R 2π R by Lemma 4.2. The extension to the case where B is self-adjoint and unbounded is straightforward. We say that A and B commute in this case if ∗ [RA (z1 ), RB (z2 )] = [RA (z1 ), RB (z2 )] = 0 (4.24) for one z1 ∈ ρ(A) and one z2 ∈ ρ(B) (the claim for ∗ z2 follows by taking adjoints). From our above analysis it follows that this is equivalent to [e−iAt , e−iBs ] = 0, t, s ∈ R, (4.25) respectively, [f (A), g(B)] = 0 (4.26) for arbitrary bounded Borel functions f and g. Problem 4.2. Let A and B be self-adjoint. Show that A and B commute if and only if the corresponding spectral projections PA (Ω) and PB (Ω) commute for every Borel set Ω. In particular, Ran(PB (Ω)) reduces A and vice versa. Problem 4.3. Let A and B be self-adjoint operators with pure point spec- trum. Show that A and B commute if and only if they have a common orthonormal basis of eigenfunctions. 4.3. The min-max theorem In many applications a self-adjoint operator has a number of eigenvalues below the bottom of the essential spectrum. The essential spectrum is ob- tained from the spectrum by removing all discrete eigenvalues with finite multiplicity (we will have a closer look at this in Section 6.2). In general there is no way of computing the lowest eigenvalues and their corresponding eigenfunctions explicitly. However, one often has some idea about how the eigenfunctions might approximately look.
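For instance, in finite dimensions any normalized trial vector ψ immediately gives the upper bound E1 ≤ ⟨ψ, Aψ⟩ derived below. A minimal numerical sketch of this variational idea (Python with NumPy; the matrix and the trial vector are illustrative):

import numpy as np

A = np.array([[2.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 2.0]])
E = np.linalg.eigvalsh(A)                     # exact eigenvalues, E[0] = 2 - sqrt(2)

psi = np.array([1.0, 1.5, 1.0])               # rough guess at the ground state
psi /= np.linalg.norm(psi)
upper = psi @ A @ psi                         # Rayleigh quotient of the trial vector

assert E[0] <= upper
print(E[0], upper)                            # about 0.586 <= 0.588 for this guess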
  • 130. 118 4. Applications of the spectral theorem So suppose we have a normalized function ψ1 which is an approximation for the eigenfunction ϕ1 of the lowest eigenvalue E1 . Then by Theorem 2.19 we know that ψ1 , Aψ1 ≥ ϕ1 , Aϕ1 = E1 . (4.27) If we add some free parameters to ψ1 , one can optimize them and obtain quite good upper bounds for the first eigenvalue. But is there also something one can say about the next eigenvalues? Suppose we know the first eigenfunction ϕ1 . Then we can restrict A to the orthogonal complement of ϕ1 and proceed as before: E2 will be the infimum over all expectations restricted to this subspace. If we restrict to the orthogonal complement of an approximating eigenfunction ψ1 , there will still be a component in the direction of ϕ1 left and hence the infimum of the expectations will be lower than E2 . Thus the optimal choice ψ1 = ϕ1 will give the maximal value E2 . More precisely, let {ϕj }N be an orthonormal basis for the space spanned j=1 by the eigenfunctions corresponding to eigenvalues below the essential spec- trum. Here the essential spectrum σess (A) is given by precisely those values in the spectrum which are not isolated eigenvalues of finite multiplicity (see Section 6.2). Assume they satisfy (A − Ej )ϕj = 0, where Ej ≤ Ej+1 are the eigenvalues (counted according to their multiplicity). If the number of eigenvalues N is finite, we set Ej = inf σess (A) for j N and choose ϕj orthonormal such that (A − Ej )ϕj ≤ ε. Define U (ψ1 , . . . , ψn ) = {ψ ∈ D(A)| ψ = 1, ψ ∈ span{ψ1 , . . . , ψn }⊥ }. (4.28) (i) We have inf ψ, Aψ ≤ En + O(ε). (4.29) ψ∈U (ψ1 ,...,ψn−1 ) n In fact, set ψ = j=1 αj ϕj and choose αj such that ψ ∈ U (ψ1 , . . . , ψn−1 ). Then n ψ, Aψ = |αj |2 Ej + O(ε) ≤ En + O(ε) (4.30) j=1 and the claim follows. (ii) We have inf ψ, Aψ ≥ En − O(ε). (4.31) ψ∈U (ϕ1 ,...,ϕn−1 ) In fact, set ψ = ϕn . Since ε can be chosen arbitrarily small, we have proven the following.
  • 131. 4.4. Estimating eigenspaces 119 Theorem 4.10 (Min-max). Let A be self-adjoint and let E1 ≤ E2 ≤ E3 · · · be the eigenvalues of A below the essential spectrum, respectively, the in- fimum of the essential spectrum, once there are no more eigenvalues left. Then En = sup inf ψ, Aψ . (4.32) ψ1 ,...,ψn−1 ψ∈U (ψ1 ,...,ψn−1 ) Clearly the same result holds if D(A) is replaced by the quadratic form domain Q(A) in the definition of U . In addition, as long as En is an eigen- value, the sup and inf are in fact max and min, explaining the name. Corollary 4.11. Suppose A and B are self-adjoint operators with A ≥ B (i.e., A − B ≥ 0). Then En (A) ≥ En (B). Problem 4.4. Suppose A, An are bounded and An → A. Then Ek (An ) → Ek (A). (Hint: A − An ≤ ε is equivalent to A − ε ≤ A ≤ A + ε.) 4.4. Estimating eigenspaces Next, we show that the dimension of the range of PA (Ω) can be estimated if we have some functions which lie approximately in this space. Theorem 4.12. Suppose A is a self-adjoint operator and ψj , 1 ≤ j ≤ k, are linearly independent elements of a H. (i) Let λ ∈ R, ψj ∈ Q(A). If 2 ψ, Aψ λ ψ (4.33) k for any nonzero linear combination ψ = j=1 cj ψj , then dim Ran PA ((−∞, λ)) ≥ k. (4.34) Similarly, ψ, Aψ λ ψ 2 implies dim Ran PA ((λ, ∞)) ≥ k. (ii) Let λ1 λ2 , ψj ∈ D(A). If λ2 + λ1 λ2 − λ1 (A − )ψ ψ (4.35) 2 2 k for any nonzero linear combination ψ = j=1 cj ψj , then dim Ran PA ((λ1 , λ2 )) ≥ k. (4.36) Proof. (i) Let M = span{ψj } ⊆ H. We claim dim PA ((−∞, λ))M = dim M = k. For this it suffices to show Ker PA ((−∞, λ))|M = {0}. Sup- pose PA ((−∞, λ))ψ = 0, ψ = 0. Then we see that for any nonzero linear
  • 132. 120 4. Applications of the spectral theorem combination ψ ψ, Aψ = η dµψ (η) = η dµψ (η) R [λ,∞) ≥λ dµψ (η) = λ ψ 2 . [λ,∞) This contradicts our assumption (4.33). (ii) This is just the previous case (i) applied to (A − (λ2 + λ1 )/2)2 with λ = (λ2 − λ1 )2 /4. Another useful estimate is Theorem 4.13 (Temple’s inequality). Let λ1 λ2 and ψ ∈ D(A) with ψ = 1 such that λ = ψ, Aψ ∈ (λ1 , λ2 ). (4.37) If there is one isolated eigenvalue E between λ1 and λ2 , that is, σ(A) ∩ (λ1 , λ2 ) = E, then (A − λ)ψ 2 (A − λ)ψ 2 λ− ≤E ≤λ+ . (4.38) λ2 − λ λ − λ1 Proof. First of all we can assume λ = 0 if we replace A by A − λ. To prove the first inequality, observe that by assumption (E, λ2 ) ⊂ ρ(A) and hence the spectral theorem implies (A − λ2 )(A − E) ≥ 0. Thus ψ, (A − λ2 )(A − E) = Aψ 2 + λ2 E ≥ 0 and the first inequality follows after dividing by λ2 0. Similarly, (A − λ1 )(A − E) ≥ 0 implies the second inequality. Note that the last inequality only provides additional information if (A − λ)ψ 2 ≤ (λ2 − λ)(λ − λ1 ). A typical application is if E = E0 is the lowest eigenvalue. In this case any normalized trial function ψ will give the bound E0 ≤ ψ, Aψ . If, in addition, we also have some estimate λ2 ≤ E1 for the second eigenvalue E1 , then Temple’s inequality can give a bound from below. For λ1 we can choose any value λ1 E0 ; in fact, if we let λ1 → −∞, we just recover the bound we already know. 4.5. Tensor products of operators Recall the definition of the tensor product of Hilbert space from Section 1.4. Suppose Aj , 1 ≤ j ≤ n, are (essentially) self-adjoint operators on Hj . For every monomial λn1 · · · λnn we can define 1 n n (An1 ⊗ · · · ⊗ Ann )ψ1 ⊗ · · · ⊗ ψn = (An1 ψ1 ) ⊗ · · · ⊗ (Ann ψn ), ψj ∈ D(Aj j ), 1 n 1 n (4.39)
  • 133. 4.5. Tensor products of operators 121 and extend this definition by linearity to the span of all such functions (check that this definition is well-defined by showing that the corresponding operator on F(H1 , . . . , Hn ) vanishes on N (H1 , . . . , Hn )). Hence for every polynomial P (λ1 , . . . , λn ) of degree N we obtain an operator P (A1 , . . . , An )ψ1 ⊗ · · · ⊗ ψn , ψj ∈ D(AN ), j (4.40) defined on the set D = span{ψ1 ⊗ · · · ⊗ ψn | ψj ∈ D(AN )}. j (4.41) Moreover, if P is real-valued, then the operator P (A1 , . . . , An ) on D is sym- metric and we can consider its closure, which will again be denoted by P (A1 , . . . , An ). Theorem 4.14. Suppose Aj , 1 ≤ j ≤ n, are self-adjoint operators on Hj and let P (λ1 , . . . , λn ) be a real-valued polynomial and define P (A1 , . . . , An ) as above. Then P (A1 , . . . , An ) is self-adjoint and its spectrum is the closure of the range of P on the product of the spectra of the Aj ; that is, σ(P (A1 , . . . , An )) = P (σ(A1 ), . . . , σ(An )). (4.42) Proof. By the spectral theorem it is no restriction to assume that Aj is multiplication by λj on L2 (R, dµj ) and P (A1 , . . . , An ) is hence multiplication by P (λ1 , . . . , λn ) on L2 (Rn , dµ1 × · · · × dµn ). Since D contains the set of all functions ψ1 (λ1 ) · · · ψn (λn ) for which ψj ∈ L2 (R, dµj ), it follows that the c domain of the closure of P contains L2 (Rn , dµ1 × · · · × dµn ). Hence P is c the maximally defined multiplication operator by P (λ1 , . . . , λn ), which is self-adjoint. Now let λ = P (λ1 , . . . , λn ) with λj ∈ σ(Aj ). Then there exist Weyl sequences ψj,k ∈ D(AN ) with (Aj − λj )ψj,k → 0 as k → ∞. Consequently, j (P −λ)ψk → 0, where ψk = ψ1,k ⊗· · ·⊗ψ1,k and hence λ ∈ σ(P ). Conversely, if λ ∈ P (σ(A1 ), . . . , σ(An )), then |P (λ1 , . . . , λn ) − λ| ≥ ε for a.e. λj with respect to µj and hence (P − λ)−1 exists and is bounded; that is, λ ∈ ρ(P ). The two main cases of interest are A1 ⊗ A2 , in which case σ(A1 ⊗ A2 ) = σ(A1 )σ(A2 ) = {λ1 λ2 |λj ∈ σ(Aj )}, (4.43) and A1 ⊗ I + I ⊗ A2 , in which case σ(A1 ⊗ I + I ⊗ A2 ) = σ(A1 ) + σ(A2 ) = {λ1 + λ2 |λj ∈ σ(Aj )}. (4.44) Problem 4.5. Show that the closure can be omitted in (4.44) if at least one operator is bounded and in (4.43) if both operators are bounded.
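In finite dimensions the tensor product is the Kronecker product, and (4.43), (4.44) can be checked directly. A minimal sketch, assuming NumPy; the matrices are illustrative.

import numpy as np

# sigma(A1 ⊗ A2) = products of eigenvalues, sigma(A1 ⊗ I + I ⊗ A2) = sums of eigenvalues.
A1 = np.diag([1.0, 2.0])
A2 = np.diag([3.0, 5.0, 7.0])
I1, I2 = np.eye(2), np.eye(3)

prod_spec = np.sort(np.linalg.eigvalsh(np.kron(A1, A2)))
sum_spec  = np.sort(np.linalg.eigvalsh(np.kron(A1, I2) + np.kron(I1, A2)))

expected_prod = np.sort([a * b for a in [1, 2] for b in [3, 5, 7]])
expected_sum  = np.sort([a + b for a in [1, 2] for b in [3, 5, 7]])

assert np.allclose(prod_spec, expected_prod)
assert np.allclose(sum_spec, expected_sum)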
Chapter 5

Quantum dynamics

As in the finite dimensional case, the solution of the Schrödinger equation

    i d/dt ψ(t) = Hψ(t)    (5.1)

is given by

    ψ(t) = exp(−itH)ψ(0).    (5.2)

A detailed investigation of this formula will be our first task. Moreover, in the finite dimensional case the dynamics is understood once the eigenvalues are known and the same is true in our case once we know the spectrum. Note that, like any Hamiltonian system from classical mechanics, our system is not hyperbolic (i.e., the spectrum is not away from the real axis) and hence simple results such as all solutions tend to the equilibrium position cannot be expected.

5.1. The time evolution and Stone's theorem

In this section we want to have a look at the initial value problem associated with the Schrödinger equation (2.12) in the Hilbert space H. If H is one-dimensional (and hence A is a real number), the solution is given by

    ψ(t) = e^{−itA} ψ(0).    (5.3)

Our hope is that this formula also applies in the general case and that we can reconstruct a one-parameter unitary group U(t) from its generator A (compare (2.11)) via U(t) = exp(−itA).
We first investigate the family of operators exp(−itA).

Theorem 5.1. Let A be self-adjoint and let U(t) = exp(−itA).
(i) U(t) is a strongly continuous one-parameter unitary group.
  • 136. 124 5. Quantum dynamics (ii) The limit limt→0 1 (U (t)ψ − ψ) exists if and only if ψ ∈ D(A) in t which case limt→0 1 (U (t)ψ − ψ) = −iAψ. t (iii) U (t)D(A) = D(A) and AU (t) = U (t)A. Proof. The group property (i) follows directly from Theorem 3.1 and the corresponding statements for the function exp(−itλ). To prove strong con- tinuity, observe that lim e−itA ψ − e−it0 A ψ 2 = lim |e−itλ − e−it0 λ |2 dµψ (λ) t→t0 t→t0 R = lim |e−itλ − e−it0 λ |2 dµψ (λ) = 0 R t→t0 by the dominated convergence theorem. Similarly, if ψ ∈ D(A), we obtain 1 −itA 1 lim (e ψ − ψ) + iAψ 2 = lim | (e−itλ − 1) + iλ|2 dµψ (λ) = 0 t→0 t t→0 R t ˜ since |eitλ − 1| ≤ |tλ|. Now let A be the generator defined as in (2.11). Then ˜ A is a symmetric extension of A since we have ˜ i i ˜ ϕ, Aψ = lim ϕ, (U (t) − 1)ψ = lim (U (−t) − 1)ϕ, ψ = Aϕ, ψ t→0 t t→0 −t ˜ and hence A = A by Corollary 2.2. This settles (ii). To see (iii), replace ψ → U (s)ψ in (ii). For our original problem this implies that formula (5.3) is indeed the solution to the initial value problem of the Schr¨dinger equation. Moreover, o U (t)ψ, AU (t)ψ = U (t)ψ, U (t)Aψ = ψ, Aψ (5.4) shows that the expectations of A are time independent. This corresponds to conservation of energy. On the other hand, the generator of the time evolution of a quantum mechanical system should always be a self-adjoint operator since it corre- sponds to an observable (energy). Moreover, there should be a one-to-one correspondence between the unitary group and its generator. This is ensured by Stone’s theorem. Theorem 5.2 (Stone). Let U (t) be a weakly continuous one-parameter uni- tary group. Then its generator A is self-adjoint and U (t) = exp(−itA). Proof. First of all observe that weak continuity together with item (iv) of Lemma 1.12 shows that U (t) is in fact strongly continuous.
  • 137. 5.1. The time evolution and Stone’s theorem 125 Next we show that A is densely defined. Pick ψ ∈ H and set τ ψτ = U (t)ψdt 0 (the integral is defined as in Section 4.1) implying limτ →0 τ −1 ψτ = ψ. More- over, 1 1 t+τ 1 τ (U (t)ψτ − ψτ ) = U (s)ψds − U (s)ψds t t t t 0 1 τ +t 1 t = U (s)ψds − U (s)ψds t τ t 0 t 1 1 t = U (τ ) U (s)ψds − U (s)ψds → U (τ )ψ − ψ t 0 t 0 as t → 0 shows ψτ ∈ D(A). As in the proof of the previous theorem, we can show that A is symmetric and that U (t)D(A) = D(A). Next, let us prove that A is essentially self-adjoint. By Lemma 2.7 it suffices to prove Ker(A∗ − z ∗ ) = {0} for z ∈ CR. Suppose A∗ ϕ = z ∗ ϕ. Then for each ψ ∈ D(A) we have d ϕ, U (t)ψ = ϕ, −iAU (t)ψ = −i A∗ ϕ, U (t)ψ = −iz ϕ, U (t)ψ dt and hence ϕ, U (t)ψ = exp(−izt) ϕ, ψ . Since the left-hand side is bounded for all t ∈ R and the exponential on the right-hand side is not, we must have ϕ, ψ = 0 implying ϕ = 0 since D(A) is dense. So A is essentially self-adjoint and we can introduce V (t) = exp(−itA). We are done if we can show U (t) = V (t). Let ψ ∈ D(A) and abbreviate ψ(t) = (U (t) − V (t))ψ. Then ψ(t + s) − ψ(t) lim = iAψ(t) s→0 s d and hence dt ψ(t) 2 = 2 Re ψ(t), iAψ(t) = 0. Since ψ(0) = 0, we have ψ(t) = 0 and hence U (t) and V (t) coincide on D(A). Furthermore, since D(A) is dense, we have U (t) = V (t) by continuity. As an immediate consequence of the proof we also note the following useful criterion. Corollary 5.3. Suppose D ⊆ D(A) is dense and invariant under U (t). Then A is essentially self-adjoint on D. Proof. As in the above proof it follows that ϕ, ψ = 0 for any ψ ∈ D and ϕ ∈ Ker(A∗ − z ∗ ).
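For a Hermitian matrix the objects of this section are explicit: U(t) = exp(−itA) is a matrix exponential, and the properties from Theorem 5.1 as well as the conservation of energy (5.4) can be verified numerically. A minimal sketch, assuming NumPy and SciPy; the matrix and initial state are illustrative.

import numpy as np
from scipy.linalg import expm

A = np.array([[1.0, 0.5],
              [0.5, 3.0]])

def U(t):
    return expm(-1j * t * A)                            # U(t) = exp(-itA)

t, s = 0.7, 1.3
assert np.allclose(U(t) @ U(t).conj().T, np.eye(2))     # unitarity
assert np.allclose(U(t + s), U(t) @ U(s))               # group law

psi = np.array([1.0, 2.0]) / np.sqrt(5.0)
h = 1e-6
diff = (U(h) @ psi - psi) / h
assert np.allclose(diff, -1j * A @ psi, atol=1e-4)      # the generator is A

# Energy expectation <U(t)psi, A U(t)psi> is time independent, cf. (5.4).
assert np.isclose((U(t) @ psi).conj() @ A @ (U(t) @ psi), psi @ A @ psi)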
  • 138. 126 5. Quantum dynamics Note that by Lemma 4.9 two strongly continuous one-parameter groups commute, [e−itA , e−isB ] = 0, (5.5) if and only if the generators commute. Clearly, for a physicist, one of the goals must be to understand the time evolution of a quantum mechanical system. We have seen that the time evolution is generated by a self-adjoint operator, the Hamiltonian, and is given by a linear first order differential equation, the Schr¨dinger equation. o To understand the dynamics of such a first order differential equation, one must understand the spectrum of the generator. Some general tools for this endeavor will be provided in the following sections. Problem 5.1. Let H = L2 (0, 2π) and consider the one-parameter unitary group given by U (t)f (x) = f (x − t mod 2π). What is the generator of U ? 5.2. The RAGE theorem Now, let us discuss why the decomposition of the spectrum introduced in Section 3.3 is of physical relevance. Let ϕ = ψ = 1. The vector ϕ, ψ ϕ is the projection of ψ onto the (one-dimensional) subspace spanned by ϕ. Hence | ϕ, ψ |2 can be viewed as the part of ψ which is in the state ϕ. The first question one might raise is, how does | ϕ, U (t)ψ |2 , U (t) = e−itA , (5.6) behave as t → ∞? By the spectral theorem, µϕ,ψ (t) = ϕ, U (t)ψ = ˆ e−itλ dµϕ,ψ (λ) (5.7) R is the Fourier transform of the measure µϕ,ψ . Thus our question is an- swered by Wiener’s theorem. Theorem 5.4 (Wiener). Let µ be a finite complex Borel measure on R and let µ(t) = ˆ e−itλ dµ(λ) (5.8) R be its Fourier transform. Then the Ces`ro time average of µ(t) has the limit a ˆ T 1 lim |ˆ(t)|2 dt = µ |µ({λ})|2 , (5.9) T →∞ T 0 λ∈R where the sum on the right-hand side is finite.
  • 139. 5.2. The RAGE theorem 127 Proof. By Fubini we have T T 1 1 |ˆ(t)|2 dt = µ e−i(x−y)t dµ(x)dµ∗ (y)dt T 0 T 0 R R T 1 = e−i(x−y)t dt dµ(x)dµ∗ (y). R R T 0 The function in parentheses is bounded by one and converges pointwise to χ{0} (x − y) as T → ∞. Thus, by the dominated convergence theorem, the limit of the above expression is given by χ{0} (x − y)dµ(x)dµ∗ (y) = µ({y})dµ∗ (y) = |µ({y})|2 , R R R y∈R which finishes the proof. To apply this result to our situation, observe that the subspaces Hac , Hsc , and Hpp are invariant with respect to time evolution since P xx U (t) = χMxx (A) exp(−itA) = exp(−itA)χMxx (A) = U (t)P xx , xx ∈ {ac, sc, pp}. Moreover, if ψ ∈ Hxx , we have P xx ψ = ψ, which shows ϕ, f (A)ψ = ϕ, P xx f (A)ψ = P xx ϕ, f (A)ψ implying dµϕ,ψ = dµP xx ϕ,ψ . Thus if µψ is ac, sc, or pp, so is µϕ,ψ for every ϕ ∈ H. That is, if ψ ∈ Hc = Hac ⊕Hsc , then the Ces`ro mean of ϕ, U (t)ψ tends a to zero. In other words, the average of the probability of finding the system in any prescribed state tends to zero if we start in the continuous subspace Hc of A. If ψ ∈ Hac , then dµϕ,ψ is absolutely continuous with respect to Lebesgue measure and thus µϕ,ψ (t) is continuous and tends to zero as |t| → ∞. In ˆ fact, this follows from the Riemann-Lebesgue lemma (see Lemma 7.6 below). Now we want to draw some additional consequences from Wiener’s the- orem. This will eventually yield a dynamical characterization of the contin- uous and pure point spectrum due to Ruelle, Amrein, Gorgescu, and Enß. But first we need a few definitions. An operator K ∈ L(H) is called a finite rank operator if its range is finite dimensional. The dimension rank(K) = dim Ran(K) is called the rank of K. If {ψj }n is an orthonormal basis for Ran(K), we j=1 have n n Kψ = ψj , Kψ ψj = ϕj , ψ ψj , (5.10) j=1 j=1
  • 140. 128 5. Quantum dynamics where ϕj = K ∗ ψj . The elements ϕj are linearly independent since Ran(K) = Ker(K ∗ )⊥ . Hence every finite rank operator is of the form (5.10). In addi- tion, the adjoint of K is also finite rank and is given by n K ∗ψ = ψj , ψ ϕj . (5.11) j=1 The closure of the set of all finite rank operators in L(H) is called the set of compact operators C(H). It is straightforward to verify (Problem 5.2) Lemma 5.5. The set of all compact operators C(H) is a closed ∗-ideal in L(H). There is also a weaker version of compactness which is useful for us. The operator K is called relatively compact with respect to A if KRA (z) ∈ C(H) (5.12) for one z ∈ ρ(A). By the first resolvent formula this then follows for all z ∈ ρ(A). In particular we have D(A) ⊆ D(K). Now let us return to our original problem. Theorem 5.6. Let A be self-adjoint and suppose K is relatively compact. Then 1 T lim Ke−itA P c ψ 2 dt = 0 and lim Ke−itA P ac ψ = 0 T →∞ T 0 t→∞ (5.13) for every ψ ∈ D(A). If, in addition, K is bounded, then the result holds for any ψ ∈ H. Proof. Let ψ ∈ Hc , respectively, ψ ∈ Hac , and drop the projectors. Then, if K is a rank one operator (i.e., K = ϕ1 , . ϕ2 ), the claim follows from Wiener’s theorem, respectively, the Riemann-Lebesgue lemma. Hence it holds for any finite rank operator K. If K is compact, there is a sequence Kn of finite rank operators such that K − Kn ≤ 1/n and hence 1 Ke−itA ψ ≤ Kn e−itA ψ + ψ . n Thus the claim holds for any compact operator K. If ψ ∈ D(A), we can set ψ = (A − i)−1 ϕ, where ϕ ∈ Hc if and only if ψ ∈ Hc (since Hc reduces A). Since K(A + i)−1 is compact by assumption, the claim can be reduced to the previous situation. If K is also bounded, we can find a sequence ψn ∈ D(A) such that ψ − ψn ≤ 1/n and hence 1 Ke−itA ψ ≤ Ke−itA ψn + K , n concluding the proof.
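Wiener's theorem, and with it the time averages appearing in Theorem 5.6, can be observed numerically for a purely discrete spectral measure. A minimal sketch (Python with NumPy; eigenvalues, weights, and the averaging time are illustrative):

import numpy as np

# Cesaro mean of |<phi, exp(-itA) psi>|^2 for a Hermitian matrix with eigenvalues evals:
# by Wiener's theorem (5.9) it tends to sum over the atoms of |mu_{phi,psi}({lam})|^2.
evals = np.array([0.0, 1.0, np.sqrt(2.0)])       # distinct, incommensurate eigenvalues
phi = np.array([0.6, 0.2, 0.4])
psi = np.array([0.3, 0.5, 0.7])
weights = phi * psi                              # mu_{phi,psi}({lam_k}) in the eigenbasis

T = 2000.0
ts = np.linspace(0.0, T, 200001)
vals = np.exp(-1j * np.outer(ts, evals)) @ weights   # hat(mu)(t) on the time grid
average = np.trapz(np.abs(vals) ** 2, ts) / T

print(average, np.sum(weights ** 2))             # the two numbers agree up to O(1/T)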
  • 141. 5.2. The RAGE theorem 129 With the help of this result we can now prove an abstract version of the RAGE theorem. Theorem 5.7 (RAGE). Let A be self-adjoint. Suppose Kn ∈ L(H) is a se- quence of relatively compact operators which converges strongly to the iden- tity. Then T 1 Hc = {ψ ∈ H| lim lim Kn e−itA ψ dt = 0}, n→∞ T →∞ T 0 Hpp = {ψ ∈ H| lim sup (I − Kn )e−itA ψ = 0}. (5.14) n→∞ t≥0 Proof. Abbreviate ψ(t) = exp(−itA)ψ. We begin with the first equation. Let ψ ∈ Hc . Then T T 1/2 1 1 Kn ψ(t) dt ≤ Kn ψ(t) 2 dt →0 T 0 T 0 by Cauchy–Schwarz and the previous theorem. Conversely, if ψ ∈ Hc , we can write ψ = ψ c + ψ pp . By our previous estimate it suffices to show Kn ψ pp (t) ≥ ε 0 for n large. In fact, we even claim lim sup Kn ψ pp (t) − ψ pp (t) = 0. (5.15) n→∞ t≥0 By the spectral theorem, we can write ψ pp (t) = j αj (t)ψj , where the ψj are orthonormal eigenfunctions and αj (t) = exp(−itλj )αj . Truncate this expansion after N terms. Then this part converges uniformly to the desired limit by strong convergence of Kn . Moreover, by Lemma 1.14 we have Kn ≤ M , and hence the error can be made arbitrarily small by choosing N large. Now let us turn to the second equation. If ψ ∈ Hpp , the claim follows by (5.15). Conversely, if ψ ∈ Hpp , we can write ψ = ψ c + ψ pp and by our previous estimate it suffices to show that (I − Kn )ψ c (t) does not tend to 0 as n → ∞. If it did, we would have T 1 0 = lim (I − Kn )ψ c (t) 2 dt T →∞ T 0 T 1 ≥ ψ c (t) 2 − lim Kn ψ c (t) 2 dt = ψ c (t) 2 , T →∞ T 0 a contradiction. In summary, regularity properties of spectral measures are related to the long time behavior of the corresponding quantum mechanical system. However, a more detailed investigation of this topic is beyond the scope of this manuscript. For a survey containing several recent results, see [28].
  • 142. 130 5. Quantum dynamics It is often convenient to treat the observables as time dependent rather than the states. We set K(t) = eitA Ke−itA (5.16) and note ψ(t), Kψ(t) = ψ, K(t)ψ , ψ(t) = e−itA ψ. (5.17) This point of view is often referred to as the Heisenberg picture in the physics literature. If K is unbounded, we will assume D(A) ⊆ D(K) such that the above equations make sense at least for ψ ∈ D(A). The main interest is the behavior of K(t) for large t. The strong limits are called asymptotic observables if they exist. Theorem 5.8. Suppose A is self-adjoint and K is relatively compact. Then T 1 lim eitA Ke−itA ψdt = PA ({λ})KPA ({λ})ψ, ψ ∈ D(A). T →∞ T 0 λ∈σp (A) (5.18) If K is in addition bounded, the result holds for any ψ ∈ H. Proof. We will assume that K is bounded. To obtain the general result, use the same trick as before and replace K by KRA (z). Write ψ = ψ c + ψ pp . Then T 1 1 T lim K(t)ψ c dt ≤ lim K(t)ψ c dt = 0 T →∞ T 0 T →∞ T 0 by Theorem 5.6. As in the proof of the previous theorem we can write ψ pp = j αj ψj and hence T T 1 1 αj K(t)ψj dt = αj eit(A−λj ) dt Kψj . T 0 T 0 j j As in the proof of Wiener’s theorem, we see that the operator in parentheses tends to PA ({λj }) strongly as T → ∞. Since this operator is also bounded by 1 for all T , we can interchange the limit with the summation and the claim follows. We also note the following corollary. Corollary 5.9. Under the same assumptions as in the RAGE theorem we have 1 T itA lim lim e Kn e−itA ψdt = P pp ψ, (5.19) n→∞ T →∞ T 0 respectively, T 1 lim lim eitA (I − Kn )e−itA ψdt = P c ψ. (5.20) n→∞ T →∞ T 0 Problem 5.2. Prove Lemma 5.5.
  • 143. 5.3. The Trotter product formula 131 Problem 5.3. Prove Corollary 5.9. 5.3. The Trotter product formula In many situations the operator is of the form A + B, where eitA and eitB can be computed explicitly. Since A and B will not commute in general, we cannot obtain eit(A+B) from eitA eitB . However, we at least have Theorem 5.10 (Trotter product formula). Suppose A, B, and A + B are self-adjoint. Then t t n eit(A+B) = s-lim ei n A ei n B . (5.21) n→∞ Proof. First of all note that we have n eiτ A eiτ B − eit(A+B) n−1 n−1−j j = eiτ A eiτ B eiτ A eiτ B − eiτ (A+B) eiτ (A+B) , j=0 t where τ = n, and hence (eiτ A eiτ B )n − eit(A+B) ψ ≤ |t| max Fτ (s), |s|≤|t| where 1 iτ A iτ B Fτ (s) =(e e − eiτ (A+B) )eis(A+B) ψ . τ Now for ψ ∈ D(A + B) = D(A) ∩ D(B) we have 1 iτ A iτ B (e e − eiτ (A+B) )ψ → iAψ + iBψ − i(A + B)ψ = 0 τ as τ → 0. So limτ →0 Fτ (s) = 0 at least pointwise, but we need this uniformly with respect to s ∈ [−|t|, |t|]. Pointwise convergence implies 1 iτ A iτ B (e e − eiτ (A+B) )ψ ≤ C(ψ) τ and, since D(A + B) is a Hilbert space when equipped with the graph norm ψ 2Γ(A+B) = ψ 2 + (A + B)ψ 2 , we can invoke the uniform boundedness principle to obtain 1 iτ A iτ B (e e − eiτ (A+B) )ψ ≤ C ψ Γ(A+B) . τ Now 1 iτ A iτ B |Fτ (s) − Fτ (r)| ≤ (e e − eiτ (A+B) )(eis(A+B) − eir(A+B) )ψ τ ≤ C (eis(A+B) − eir(A+B) )ψ Γ(A+B)
shows that Fτ(·) is uniformly continuous and the claim follows by a standard ε/2 argument.

If the operators are semi-bounded from below, the same proof shows

Theorem 5.11 (Trotter product formula). Suppose A, B, and A + B are self-adjoint and semi-bounded from below. Then

    e^{−t(A+B)} = s-lim_{n→∞} (e^{−(t/n)A} e^{−(t/n)B})^n,  t ≥ 0.    (5.22)

Problem 5.4. Prove Theorem 5.11.
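The Trotter product formula is easily observed numerically for a pair of non-commuting Hermitian matrices. A minimal sketch, assuming NumPy and SciPy; the matrices are illustrative and the error decays like O(1/n).

import numpy as np
from scipy.linalg import expm

# (exp(itA/n) exp(itB/n))^n converges to exp(it(A+B)) as n grows, cf. (5.21).
A = np.array([[0.0, 1.0], [1.0, 0.0]])
B = np.array([[1.0, 0.0], [0.0, -1.0]])
t = 1.0
exact = expm(1j * t * (A + B))

for n in [1, 10, 100, 1000]:
    step = expm(1j * t / n * A) @ expm(1j * t / n * B)
    approx = np.linalg.matrix_power(step, n)
    print(n, np.max(np.abs(approx - exact)))     # error decays like O(1/n)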
Chapter 6

Perturbation theory for self-adjoint operators

The Hamiltonian of a quantum mechanical system is usually the sum of the kinetic energy H0 (free Schrödinger operator) plus an operator V corresponding to the potential energy. Since H0 is easy to investigate, one usually tries to consider V as a perturbation of H0. This will only work if V is small with respect to H0. Hence we study such perturbations of self-adjoint operators next.

6.1. Relatively bounded operators and the Kato–Rellich theorem

An operator B is called A bounded or relatively bounded with respect to A if D(A) ⊆ D(B) and if there are constants a, b ≥ 0 such that

    ‖Bψ‖ ≤ a‖Aψ‖ + b‖ψ‖,  ψ ∈ D(A).    (6.1)

The infimum of all constants a for which a corresponding b exists such that (6.1) holds is called the A-bound of B.
The triangle inequality implies

Lemma 6.1. Suppose Bj, j = 1, 2, are A bounded with respective A-bounds aj, j = 1, 2. Then α1 B1 + α2 B2 is also A bounded with A-bound less than |α1|a1 + |α2|a2. In particular, the set of all A bounded operators forms a linear space.

There are also the following equivalent characterizations:
  • 146. 134 6. Perturbation theory for self-adjoint operators Lemma 6.2. Suppose A is closed and B is closable. Then the following are equivalent: (i) B is A bounded. (ii) D(A) ⊆ D(B). (iii) BRA (z) is bounded for one (and hence for all) z ∈ ρ(A). Moreover, the A-bound of B is no larger than inf z∈ρ(A) BRA (z) . Proof. (i) ⇒ (ii) is true by definition. (ii) ⇒ (iii) since BRA (z) is a closed (Problem 2.9) operator defined on all of H and hence bounded by the closed graph theorem (Theorem 2.8). To see (iii) ⇒ (i), let ψ ∈ D(A). Then Bψ = BRA (z)(A − z)ψ ≤ a (A − z)ψ ≤ a Aψ + (a|z|) ψ , where a = BRA (z) . Finally, note that if BRA (z) is bounded for one z ∈ ρ(A), it is bounded for all z ∈ ρ(A) by the first resolvent formula. 2 d Example. Let A be the self-adjoint operator A = − dx2 , D(A) = {f ∈ H 2 [0, 1]|f (0) = f (1) = 0} in the Hilbert space L2 (0, 1). If we want to add a potential represented by a multiplication operator with a real-valued (mea- surable) function q, then q will be relatively bounded if q ∈ L2 (0, 1): Indeed, since all functions in D(A) are continuous on [0, 1] and hence bounded, we clearly have D(A) ⊂ D(q) in this case. We are mainly interested in the situation where A is self-adjoint and B is symmetric. Hence we will restrict our attention to this case. Lemma 6.3. Suppose A is self-adjoint and B relatively bounded. The A- bound of B is given by lim BRA (±iλ) . (6.2) λ→∞ If A is bounded from below, we can also replace ±iλ by −λ. Proof. Let ϕ = RA (±iλ)ψ, λ 0, and let a∞ be the A-bound of B. Then (use the spectral theorem to estimate the norms) b BRA (±iλ)ψ ≤ a ARA (±iλ)ψ + b RA (±iλ)ψ ≤ (a + ) ψ . λ Hence lim supλ BRA (±iλ) ≤ a∞ which, together with the inequality a∞ ≤ inf λ BRA (±iλ) from the previous lemma, proves the claim. The case where A is bounded from below is similar, using |γ| b BRA (−λ)ψ ≤ a max 1, + ψ , λ+γ λ+γ for −λ γ. Now we will show the basic perturbation result due to Kato and Rellich.
  • 147. 6.1. Relatively bounded operators and the Kato–Rellich theorem 135 Theorem 6.4 (Kato–Rellich). Suppose A is (essentially) self-adjoint and B is symmetric with A-bound less than one. Then A + B, D(A + B) = D(A), is (essentially) self-adjoint. If A is essentially self-adjoint, we have D(A) ⊆ D(B) and A + B = A + B. If A is bounded from below by γ, then A + B is bounded from below by b γ − max a|γ| + b, . (6.3) a−1 Proof. Since D(A) ⊆ D(B) and D(A) ⊆ D(A + B) by (6.1), we can assume that A is closed (i.e., self-adjoint). It suffices to show that Ran(A+B ±iλ) = H. By the above lemma we can find a λ 0 such that BRA (±iλ) 1. Hence −1 ∈ ρ(BRA (±iλ)) and thus I + BRA (±iλ) is onto. Thus (A + B ± iλ) = (I + BRA (±iλ))(A ± iλ) is onto and the proof of the first part is complete. If A is bounded from below, we can replace ±iλ by −λ and the above equation shows that RA+B exists for λ sufficiently large. By the proof of the previous lemma we can choose −λ min(γ, b/(a − 1)). Example. In our previous example we have seen that q ∈ L2 (0, 1) is rel- atively bounded by checking D(A) ⊂ D(q). However, working a bit harder (Problem 6.2), one can even show that the relative bound is 0 and hence A + q is self-adjoint by the Kato–Rellich theorem. Finally, let us show that there is also a connection between the resolvents. Lemma 6.5. Suppose A and B are closed and D(A) ⊆ D(B). Then we have the second resolvent formula RA+B (z) − RA (z) = −RA (z)BRA+B (z) = −RA+B (z)BRA (z) (6.4) for z ∈ ρ(A) ∩ ρ(A + B). Proof. We compute RA+B (z) + RA (z)BRA+B (z) = RA (z)(A + B − z)RA+B (z) = RA (z). The second identity is similar. Problem 6.1. Show that (6.1) implies Bψ 2 ≤ a2 Aψ ˜ 2 + ˜2 ψ b 2 with a = a(1 + ε2 ) and ˜ = b(1 + ε−2 ) for any ε 0. Conversely, show that ˜ b this inequality implies (6.1) with a = a and b = ˜ ˜ b.
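The second resolvent formula (6.4) is purely algebraic, so it can be sanity-checked on matrices. A small sketch (NumPy; the matrices and the point z are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(2)
d = 5
A = rng.normal(size=(d, d)); A = (A + A.T) / 2
B = rng.normal(size=(d, d)); B = (B + B.T) / 2
z = 0.3 + 1.0j                      # any non-real z lies in both resolvent sets

I = np.eye(d)
RA = np.linalg.inv(A - z * I)       # R_A(z)
RAB = np.linalg.inv(A + B - z * I)  # R_{A+B}(z)

lhs = RAB - RA
print(np.linalg.norm(lhs + RA @ B @ RAB))   # ~ 0, first identity in (6.4)
print(np.linalg.norm(lhs + RAB @ B @ RA))   # ~ 0, second identity in (6.4)
```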
  • 148. 136 6. Perturbation theory for self-adjoint operators 2 d Problem 6.2. Let A be the self-adjoint operator A = − dx2 , D(A) = {f ∈ H 2 [0, 1]|f (0) = f (1) = 0} in the Hilbert space L2 (0, 1) and q ∈ L2 (0, 1). Show that for every f ∈ D(A) we have ε 1 f 2 ≤ f 2+ ∞ f 2 2 2ε for any ε 0. Conclude that the relative bound of q with respect to A is 1 1 1 zero. (Hint: |f (x)|2 ≤ | 0 f (t)dt|2 ≤ 0 |f (t)|2 dt = − 0 f (t)∗ f (t)dt.) Problem 6.3. Let A be as in the previous example. Show that q is relatively bounded if and only if x(1 − x)q(x) ∈ L2 (0, 1). Problem 6.4. Compute the resolvent of A + α ψ, . ψ. (Hint: Show α (I + α ϕ, . ψ)−1 = I − ϕ, . ψ 1 + α ϕ, ψ and use the second resolvent formula.) 6.2. More on compact operators Recall from Section 5.2 that we have introduced the set of compact operators C(H) as the closure of the set of all finite rank operators in L(H). Before we can proceed, we need to establish some further results for such operators. We begin by investigating the spectrum of self-adjoint compact operators and show that the spectral theorem takes a particularly simple form in this case. Theorem 6.6 (Spectral theorem for compact operators). Suppose the op- erator K is self-adjoint and compact. Then the spectrum of K consists of an at most countable number of eigenvalues which can only cluster at 0. Moreover, the eigenspace to each nonzero eigenvalue is finite dimensional. In addition, we have K= λPK ({λ}). (6.5) λ∈σ(K) Proof. It suffices to show rank(PK ((λ − ε, λ + ε))) ∞ for 0 ε |λ|. Let Kn be a sequence of finite rank operators such that K − Kn ≤ 1/n. If Ran PK ((λ − ε, λ + ε)) is infinite dimensional, we can find a vector ψn in this range such that ψn = 1 and Kn ψn = 0. But this yields a contradiction since 1 ≥ | ψn , (K − Kn )ψn | = | ψn , Kψn | ≥ |λ| − ε 0 n by (4.2). As a consequence we obtain the canonical form of a general compact operator.
  • 149. 6.2. More on compact operators 137 Theorem 6.7 (Canonical form of compact operators). Let K be compact. ˆ There exist orthonormal sets {φj }, {φj } and positive numbers sj = sj (K) such that K= ˆ s j φj , . φj , K∗ = ˆ s j φj , . φj . (6.6) j j ˆ ˆ ˆ Note Kφj = sj φj and K ∗ φj = sj φj , and hence K ∗ Kφj = s2 φj and KK ∗ φj = j ˆ s 2 φj . j The numbers sj (K)2 0 are the nonzero eigenvalues of KK ∗ , respec- tively, K ∗ K (counted with multiplicity) and sj (K) = sj (K ∗ ) = sj are called singular values of K. There are either finitely many singular values (if K is finite rank) or they converge to zero. Proof. By Lemma 5.5, K ∗ K is compact and hence Theorem 6.6 applies. Let {φj } be an orthonormal basis of eigenvectors for PK ∗ K ((0, ∞))H and let s2 be the eigenvalue corresponding to φj . Then, for any ψ ∈ H we can write j ψ= ˜ φj , ψ φj + ψ j ˜ with ψ ∈ Ker(K ∗ K) = Ker(K). Then Kψ = ˆ s j φj , ψ φj , j where φj = s−1 Kφj , since K ψ 2 = ψ, K ∗ K ψ = 0. By φj , φk = ˆ j ˜ ˜ ˜ ˆ ˆ (sj sk )−1 Kφj , Kφk = (sj sk )−1 K ∗ Kφj , φk = sj s−1 φj , φk we see that k ˆ the {φj } are orthonormal and the formula for K ∗ follows by taking the adjoint of the formula for K (Problem 6.5). ˆ 2 If K is self-adjoint, then φj = σj φj , σj = 1 are the eigenvectors of K and σj sj are the corresponding eigenvalues. Moreover, note that we have K = max sj (K). (6.7) j Finally, let me remark that there are a number of other equivalent defi- nitions for compact operators. Lemma 6.8. For K ∈ L(H) the following statements are equivalent: (i) K is compact. (i’) K ∗ is compact. s (ii) An ∈ L(H) and An → A strongly implies An K → AK. (iii) ψn ψ weakly implies Kψn → Kψ in norm.
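Theorem 6.7 is the operator-theoretic version of the singular value decomposition, and in finite dimensions it is exactly the SVD. The following sketch (NumPy; the matrix is an arbitrary example) rebuilds K from its singular system and checks that the s_j^2 are the eigenvalues of K*K and that the operator norm equals the largest singular value, cf. (6.7):

```python
import numpy as np

rng = np.random.default_rng(3)
K = rng.normal(size=(6, 6)) + 1j * rng.normal(size=(6, 6))

# SVD: K = sum_j s_j <phi_j, .> phihat_j; columns of U play the role of phihat_j,
# rows of Vh are the conjugated phi_j.
U, s, Vh = np.linalg.svd(K)
K_rebuilt = sum(s[j] * np.outer(U[:, j], Vh[j, :]) for j in range(len(s)))
print(np.linalg.norm(K - K_rebuilt))                 # ~ 0

# s_j^2 are the nonzero eigenvalues of K*K (Theorem 6.7), and ||K|| = max_j s_j (6.7).
eigs = np.sort(np.linalg.eigvalsh(K.conj().T @ K))[::-1]
print(np.allclose(eigs, s**2))
print(np.isclose(np.linalg.norm(K, 2), s.max()))
```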
  • 150. 138 6. Perturbation theory for self-adjoint operators (iv) ψn bounded implies that Kψn has a (norm) convergent subse- quence. Proof. (i) ⇔ (i’). This is immediate from Theorem 6.7. (i) ⇒ (ii). Translating An → An − A, it is no restriction to assume A = 0. Since An ≤ M , it suffices to consider the case where K is finite rank. Then (by (6.6)) N N An K 2 = sup 2 ˆ s j | φ j , ψ | An φ j 2 ≤ ˆ s j An φ j 2 → 0. ψ =1 j=1 j=1 (ii) ⇒ (iii). Again, replace ψn → ψn − ψ and assume ψ = 0. Choose An = ψn , . ϕ, ϕ = 1. Then Kψn = An K ∗ → 0. (iii) ⇒ (iv). If ψn is bounded, it has a weakly convergent subsequence by Lemma 1.13. Now apply (iii) to this subsequence. (iv) ⇒ (i). Let ϕj be an orthonormal basis and set n Kn = ϕj , . Kϕj . j=1 Then γn = K − Kn = sup Kψ ψ∈span{ϕj }∞ , ψ =1 j=n is a decreasing sequence tending to a limit ε ≥ 0. Moreover, we can find a sequence of unit vectors ψn ∈ span{ϕj }∞ for which Kψn ≥ ε. By j=n assumption, Kψn has a convergent subsequence which, since ψn converges weakly to 0, converges to 0. Hence ε must be 0 and we are done. The last condition explains the name compact. Moreover, note that one cannot replace An K → AK by KAn → KA in (ii) unless one additionally requires An to be normal (then this follows by taking adjoints — recall that only for normal operators is taking adjoints continuous with respect to strong convergence). Without the requirement that An be normal, the claim is wrong as the following example shows. Example. Let H = 2 (N) and let An be the operator which shifts each sequence n places to the left and let K = δ1 , . δ1 , where δ1 = (1, 0, . . . ). Then s-lim An = 0 but KAn = 1. Problem 6.5. Deduce the formula for K ∗ from the one for K in (6.6). Problem 6.6. Show that it suffices to check conditions (iii) and (iv) from Lemma 6.8 on a dense subset.
  • 151. 6.3. Hilbert–Schmidt and trace class operators 139 6.3. Hilbert–Schmidt and trace class operators Among the compact operators two special classes are of particular impor- tance. The first ones are integral operators Kψ(x) = K(x, y)ψ(y)dµ(y), ψ ∈ L2 (M, dµ), (6.8) M where K(x, y) ∈ L2 (M ×M, dµ⊗dµ). Such an operator is called a Hilbert– Schmidt operator. Using Cauchy–Schwarz, 2 |Kψ(x)|2 dµ(x) = |K(x, y)ψ(y)|dµ(y) dµ(x) M M M ≤ |K(x, y)|2 dµ(y) |ψ(y)|2 dµ(y) dµ(x) M M M = |K(x, y)|2 dµ(y) dµ(x) |ψ(y)|2 dµ(y) , (6.9) M M M we see that K is bounded. Next, pick an orthonormal basis ϕj (x) for L2 (M, dµ). Then, by Lemma 1.10, ϕi (x)ϕj (y) is an orthonormal basis for L2 (M × M, dµ ⊗ dµ) and K(x, y) = ci,j ϕi (x)ϕj (y), ci,j = ϕi , Kϕ∗ , j (6.10) i,j where |ci,j |2 = |K(x, y)|2 dµ(y) dµ(x) ∞. (6.11) i,j M M In particular, Kψ(x) = ci,j ϕ∗ , ψ ϕi (x) j (6.12) i,j shows that K can be approximated by finite rank operators (take finitely many terms in the sum) and is hence compact. Using (6.6), we can also give a different characterization of Hilbert– Schmidt operators. Lemma 6.9. If H = L2 (M, dµ), then a compact operator K is Hilbert– Schmidt if and only if j sj (K)2 ∞ and sj (K)2 = |K(x, y)|2 dµ(x)dµ(y), (6.13) j M M in this case.
  • 152. 140 6. Perturbation theory for self-adjoint operators Proof. If K is compact, we can define approximating finite rank operators Kn by considering only finitely many terms in (6.6): n Kn = ˆ s j φj , . φj . j=1 n ∗ˆ Then Kn has the kernel Kn (x, y) = j=1 sj φj (y) φj (x) and n 2 |Kn (x, y)| dµ(x)dµ(y) = sj (K)2 . M M j=1 Now if one side converges, so does the other and, in particular, (6.13) holds in this case. Hence we will call a compact operator Hilbert–Schmidt if its singular values satisfy sj (K)2 ∞. (6.14) j By our lemma this coincides with our previous definition if H = L2 (M, dµ). Since every Hilbert space is isomorphic to some L2 (M, dµ), we see that the Hilbert–Schmidt operators together with the norm 1/2 K 2 = sj (K)2 (6.15) j form a Hilbert space (isomorphic to L2 (M ×M, dµ⊗dµ)). Note that K 2 = K ∗ 2 (since sj (K) = sj (K ∗ )). There is another useful characterization for identifying Hilbert–Schmidt operators: Lemma 6.10. A compact operator K is Hilbert–Schmidt if and only if 2 Kψn ∞ (6.16) n for some orthonormal basis and 2 2 Kψn = K 2 (6.17) n for any orthonormal basis in this case. Proof. This follows from Kψn 2 = ˆ | φj , Kψn |2 = | K ∗ φj , ψn |2 ˆ n n,j n,j = K ∗ φn ˆ 2 = sj (K)2 . n j
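To connect (6.13) with the integral-kernel picture, one can discretize an integral operator on L^2(0,1) and compare the sum of the squared singular values with the double integral of |K(x,y)|^2. A rough sketch, assuming NumPy and the illustrative kernel K(x,y) = e^{-|x-y|}, for which the double integral (1+e^{-2})/2 can be computed by hand:

```python
import numpy as np

# Approximate the integral operator with kernel K(x,y) = exp(-|x-y|) on L^2(0,1)
# by the matrix M_ij = K(x_i, x_j) * h on a uniform grid (uniform quadrature weights,
# so adjoints and singular values of M mimic those of the operator).
N = 400
h = 1.0 / N
x = (np.arange(N) + 0.5) * h
M = np.exp(-np.abs(x[:, None] - x[None, :])) * h

s = np.linalg.svd(M, compute_uv=False)
print("sum of s_j^2            :", np.sum(s**2))          # = Frobenius norm squared
print("h^2 * sum |K(x_i,x_j)|^2:", np.sum(np.abs(M)**2))  # approximates the double integral
print("exact double integral   :", (1 + np.exp(-2)) / 2)  # for this particular kernel
```

The first two numbers agree exactly (that is (6.13) in its matrix form), and both approximate the exact double integral as the grid is refined.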
  • 153. 6.3. Hilbert–Schmidt and trace class operators 141 Corollary 6.11. The set of Hilbert–Schmidt operators forms a ∗-ideal in L(H) and KA 2 ≤ A K 2, respectively, AK 2 ≤ A K 2. (6.18) Proof. Let K be Hilbert–Schmidt and A bounded. Then AK is compact and 2 2 2 2 2 2 AK 2 = AKψn ≤ A Kψn = A K 2. n n For KA just consider adjoints. This approach can be generalized by defining 1/p K p = sj (K)p (6.19) j plus corresponding spaces Jp (H) = {K ∈ C(H)| K p ∞}, (6.20) which are known as Schatten p-classes. Note that by (6.7) K ≤ K p (6.21) and that by sj (K) = sj (K ∗ ) we have K p = K ∗ p. (6.22) Lemma 6.12. The spaces Jp (H) together with the norm . p are Banach spaces. Moreover,    1/p  K p = sup | ψj , Kϕj |p {ψj }, {ϕj } ONS , (6.23)   j where the sup is taken over all orthonormal sets. Proof. The hard part is to prove (6.23): Choose q such that p + 1 = 1 and 1 q use H¨lder’s inequality to obtain (sj |...|2 = (sp |...|2 )1/p |...|2/q ) o j 1/p 1/q sj | ϕn , φj |2 ≤ sp | ϕn , φj |2 j | ϕn , φj |2 j j j 1/p ≤ sp | ϕn , φj |2 j . j
  • 154. 142 6. Perturbation theory for self-adjoint operators ˆ Clearly the analogous equation holds for φj , ψn . Now using Cauchy–Schwarz, we have 1/2 1/2 p | ψn , Kϕn |p = sj ϕn , φj sj ˆ φj , ψn j 1/2 1/2 ≤ sp | ϕn , φj |2 j sp | ψn , φj |2 j ˆ . j j Summing over n, a second appeal to Cauchy–Schwarz and interchanging the order of summation finally gives 1/2 1/2 | ψn , Kϕn |p ≤ sp | ϕn , φj |2 j sp | ψn , φj |2 j ˆ n n,j n,j 1/2 1/2 ≤ sp j sp j = sp . j j j j ˆ Since equality is attained for ϕn = φn and ψn = φn , equation (6.23) holds. Now the rest is straightforward. From 1/p | ψj , (K1 + K2 )ϕj |p j 1/p 1/p ≤ | ψj , K1 ϕj |p + | ψj , K2 ϕj |p j j ≤ K1 p + K2 p we infer that Jp (H) is a vector space and the triangle inequality. The other requirements for a norm are obvious and it remains to check completeness. If Kn is a Cauchy sequence with respect to . p , it is also a Cauchy sequence with respect to . ( K ≤ K p ). Since C(H) is closed, there is a compact K with K − Kn → 0 and by Kn p ≤ C we have 1/p | ψj , Kϕj |p ≤C j for any finite ONS. Since the right-hand side is independent of the ONS (and in particular on the number of vectors), K is in Jp (H). The two most important cases are p = 1 and p = 2: J2 (H) is the space of Hilbert–Schmidt operators investigated in the previous section and J1 (H) is the space of trace class operators. Since Hilbert–Schmidt operators are easy to identify, it is important to relate J1 (H) with J2 (H): Lemma 6.13. An operator is trace class if and only if it can be written as the product of two Hilbert–Schmidt operators, K = K1 K2 , and in this case
  • 155. 6.3. Hilbert–Schmidt and trace class operators 143 we have K 1 ≤ K1 2 K2 2 . (6.24) Proof. By Cauchy–Schwarz we have 1/2 ∗ ∗ 2 2 | ϕn , Kψn | = | K1 ϕn , K2 ψn | ≤ K1 ϕn K2 ψn n n n n = K1 2 K2 2 and hence K = K1 K2 is trace class if both K1 and K2 are Hilbert–Schmidt operators. To see the converse, let K be given by (6.6) and choose K1 = ˆ sj (K) φj , . φj , respectively, K2 = j sj (K) φj , . φj . j Corollary 6.14. The set of trace class operators forms a ∗-ideal in L(H) and KA 1 ≤ A K 1, respectively, AK 1 ≤ A K 1. (6.25) Proof. Write K = K1 K2 with K1 , K2 Hilbert–Schmidt and use Corol- lary 6.11. Now we can also explain the name trace class: Lemma 6.15. If K is trace class, then for any orthonormal basis {ϕn } the trace tr(K) = ϕn , Kϕn (6.26) n is finite and independent of the orthonormal basis. Proof. Let {ψn } be another ONB. If we write K = K1 K2 with K1 , K2 Hilbert–Schmidt, we have ∗ ∗ ϕn , K1 K2 ϕn = K1 ϕn , K2 ϕn = K1 ϕn , ψm ψm , K2 ϕn n n n,m ∗ ∗ = K2 ψm , ϕn ϕn , K1 ψm = K2 ψm , K1 ψm m,n m = ψm , K2 K1 ψm . m Hence the trace is independent of the ONB and we even have tr(K1 K2 ) = tr(K2 K1 ). Clearly for self-adjoint trace class operators, the trace is the sum over all eigenvalues (counted with their multiplicity). To see this, one just has to choose the orthonormal basis to consist of eigenfunctions. This is even true for all trace class operators and is known as Lidskij trace theorem (see [44] or [20] for an easy to read introduction).
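In finite dimensions every operator is trace class, so the statements of Lemma 6.15, the Lidskij theorem, and the cyclicity of the trace can all be verified directly. A short NumPy sketch with arbitrary matrices:

```python
import numpy as np

rng = np.random.default_rng(4)
d = 6
K = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))

# Trace is independent of the orthonormal basis: compare the standard basis
# with the columns of a random unitary Q (another ONB).
Q, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
tr_std = np.trace(K)
tr_onb = sum(Q[:, n].conj() @ K @ Q[:, n] for n in range(d))
print(np.isclose(tr_std, tr_onb))

# Lidskij: the trace equals the sum of the eigenvalues (with multiplicity).
print(np.isclose(tr_std, np.sum(np.linalg.eigvals(K))))

# Cyclicity, Lemma 6.16 (iv): tr(AK) = tr(KA).
print(np.isclose(np.trace(A @ K), np.trace(K @ A)))
```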
  • 156. 144 6. Perturbation theory for self-adjoint operators Finally we note the following elementary properties of the trace: Lemma 6.16. Suppose K, K1 , K2 are trace class and A is bounded. (i) The trace is linear. (ii) tr(K ∗ ) = tr(K)∗ . (iii) If K1 ≤ K2 , then tr(K1 ) ≤ tr(K2 ). (iv) tr(AK) = tr(KA). Proof. (i) and (ii) are straightforward. (iii) follows from K1 ≤ K2 if and only if ϕ, K1 ϕ ≤ ϕ, K2 ϕ for every ϕ ∈ H. (iv) By Problem 6.7 and (i) it is no restriction to assume that A is unitary. Let {ϕn } be some ONB and note that {ψn = Aϕn } is also an ONB. Then tr(AK) = ψn , AKψn = Aϕn , AKAϕn n n = ϕn , KAϕn = tr(KA) n and the claim follows. Problem 6.7. Show that every bounded operator can be written as a linear combination of two self-adjoint operators. Furthermore, show that every bounded self-adjoint operator can √ written as a linear combination of two be unitary operators. (Hint: x ± i 1 − x2 has absolute value one for x ∈ [−1, 1].) Problem 6.8. Let H = 2 (N) and let A be multiplication by a sequence a(n). Show that A ∈ Jp ( 2 (N)) if and only if a ∈ p (N). Furthermore, show that A p = a p in this case. Problem 6.9. Show that A ≥ 0 is trace class if (6.26) is finite for one (and √ √ hence all) ONB. (Hint: A is self-adjoint (why?) and A = A A.) Problem 6.10. Show that for an orthogonal projection P we have dim Ran(P ) = tr(P ), where we set tr(P ) = ∞ if (6.26) is infinite (for one and hence all ONB by the previous problem). Problem 6.11. Show that for K ∈ C we have |K| = sj φj , . φj , j √ where |K| = K ∗ K. Conclude that K p = (tr(|A|p ))1/p .
  • 157. 6.4. Relatively compact operators and Weyl’s theorem 145 Problem 6.12. Show that K : 2 (N) → 2 (N), f (n) → j∈N k(n+j)f (j) is Hilbert–Schmidt with K 2 ≤ c 1 if |k(n)| ≤ c(n), where c(n) is decreasing and summable. 6.4. Relatively compact operators and Weyl’s theorem In the previous section we have seen that the sum of a self-adjoint operator and a symmetric operator is again self-adjoint if the perturbing operator is small. In this section we want to study the influence of perturbations on the spectrum. Our hope is that at least some parts of the spectrum remain invariant. We introduce some notation first. The discrete spectrum σd (A) is the set of all eigenvalues which are discrete points of the spectrum and whose corresponding eigenspace is finite dimensional. The complement of the dis- crete spectrum is called the essential spectrum σess (A) = σ(A)σd (A). If A is self-adjoint, we might equivalently set σd (A) = {λ ∈ σp (A)| rank(PA ((λ − ε, λ + ε))) ∞ for some ε 0}, (6.27) respectively, σess (A) = {λ ∈ R| rank(PA ((λ − ε, λ + ε))) = ∞ for all ε 0}. (6.28) Example. For a self-adjoint compact operator K we have by Theorem 6.6 that σess (K) ⊆ {0}, (6.29) where equality holds if and only if H is infinite dimensional. Let A be self-adjoint. Note that if we add a multiple of the identity to A, we shift the entire spectrum. Hence, in general, we cannot expect a (rel- atively) bounded perturbation to leave any part of the spectrum invariant. Next, if λ0 is in the discrete spectrum, we can easily remove this eigenvalue with a finite rank perturbation of arbitrarily small norm. In fact, consider A + εPA ({λ0 }). (6.30) Hence our only hope is that the remainder, namely the essential spectrum, is stable under finite rank perturbations. To show this, we first need a good criterion for a point to be in the essential spectrum of A. Lemma 6.17 (Weyl criterion). A point λ is in the essential spectrum of a self-adjoint operator A if and only if there is a sequence ψn such that ψn = 1, ψn converges weakly to 0, and (A − λ)ψn → 0. Moreover, the sequence can be chosen orthonormal. Such a sequence is called a singular Weyl sequence.
  • 158. 146 6. Perturbation theory for self-adjoint operators Proof. Let ψn be a singular Weyl sequence for the point λ0 . By Lemma 2.16 we have λ0 ∈ σ(A) and hence it suffices to show λ0 ∈ σd (A). If λ0 ∈ σd (A), we can find an ε 0 such that Pε = PA ((λ0 − ε, λ0 + ε)) is finite rank. ˜ ˜ Consider ψn = Pε ψn . Clearly (A − λ0 )ψn = Pε (A − λ0 )ψn ≤ (A − ˜ λ0 )ψn → 0 and Lemma 6.8 (iii) implies ψn → 0. However, ˜ ψn − ψn 2 = dµψn (λ) R(λ−ε,λ+ε) 1 ≤ (λ − λ0 )2 dµψn (λ) ε2 R(λ−ε,λ+ε) 1 ≤ 2 (A − λ0 )ψn 2 ε ˜ and hence ψn → 1, a contradiction. 1 1 Conversely, if λ0 ∈ σess (A), consider Pn = PA ([λ − n , λ − n+1 ) ∪ (λ + 1 1 n+1 , λ + n ]). Then rank(Pnj ) 0 for an infinite subsequence nj . Now pick ψj ∈ Ran Pnj . Now let K be a self-adjoint compact operator and ψn a singular Weyl sequence for A. Then ψn converges weakly to zero and hence (A + K − λ)ψn ≤ (A − λ)ψn + Kψn → 0 (6.31) since (A − λ)ψn → 0 by assumption and Kψn → 0 by Lemma 6.8 (iii). Hence σess (A) ⊆ σess (A + K). Reversing the roles of A + K and A shows σess (A + K) = σess (A). In particular, note that A and A + K have the same singular Weyl sequences. Since we have shown that we can remove any point in the discrete spec- trum by a self-adjoint finite rank operator, we obtain the following equivalent characterization of the essential spectrum. Lemma 6.18. The essential spectrum of a self-adjoint operator A is pre- cisely the part which is invariant under compact perturbations. In particular, σess (A) = σ(A + K). (6.32) K∈C(H),K ∗ =K There is even a larger class of operators under which the essential spec- trum is invariant. Theorem 6.19 (Weyl). Suppose A and B are self-adjoint operators. If RA (z) − RB (z) ∈ C(H) (6.33) for one z ∈ ρ(A) ∩ ρ(B), then σess (A) = σess (B). (6.34)
  • 159. 6.4. Relatively compact operators and Weyl’s theorem 147 Proof. In fact, suppose λ ∈ σess (A) and let ψn be a corresponding singular Weyl sequence. Then 1 RA (z) (RA (z) − )ψn = (A − λ)ψn λ−z z−λ 1 and thus (RA (z)− λ−z )ψn → 0. Moreover, by our assumption we also have 1 (RB (z) − λ−z )ψn → 0 and thus (B − λ)ϕn → 0, where ϕn = RB (z)ψn . Since lim ϕn = lim RA (z)ψn = |λ − z|−1 = 0 n→∞ n→∞ 1 1 (since (RA (z) − λ−z )ψn = λ−z RA (z)(A − λ)ψn → 0), we obtain a singular Weyl sequence for B, showing λ ∈ σess (B). Now interchange the roles of A and B. As a first consequence note the following result: Theorem 6.20. Suppose A is symmetric with equal finite defect indices. Then all self-adjoint extensions have the same essential spectrum. Proof. By Lemma 2.29 the resolvent difference of two self-adjoint extensions is a finite rank operator if the defect indices are finite. In addition, the following result is of interest. Lemma 6.21. Suppose RA (z) − RB (z) ∈ C(H) (6.35) for one z ∈ ρ(A)∩ρ(B). Then this holds for all z ∈ ρ(A)∩ρ(B). In addition, if A and B are self-adjoint, then f (A) − f (B) ∈ C(H) (6.36) for all f ∈ C∞ (R). Proof. If the condition holds for one z, it holds for all since we have (using both resolvent formulas) RA (z ) − RB (z ) = (1 − (z − z )RB (z ))(RA (z) − RB (z))(1 − (z − z )RA (z )). Let A and B be self-adjoint. The set of all functions f for which the claim holds is a closed ∗-subalgebra of C∞ (R) (with sup norm). Hence the claim follows from Lemma 4.4. Remember that we have called K relatively compact with respect to A if KRA (z) is compact (for one and hence for all z) and note that the resolvent difference RA+K (z) − RA (z) is compact if K is relatively compact.
  • 160. 148 6. Perturbation theory for self-adjoint operators In particular, Theorem 6.19 applies if B = A + K, where K is relatively compact. For later use observe that the set of all operators which are relatively compact with respect to A forms a linear space (since compact operators do) and relatively compact operators have A-bound zero. Lemma 6.22. Let A be self-adjoint and suppose K is relatively compact with respect to A. Then the A-bound of K is zero. Proof. Write KRA (λi) = (KRA (i))((A + i)RA (λi)) and observe that the first operator is compact and the second is normal and converges strongly to 0 (cf. Problem 3.7). Hence the claim follows from Lemma 6.3 and the discussion after Lemma 6.8 (since RA is normal). In addition, note the following result which is a straightforward conse- quence of the second resolvent formula. Lemma 6.23. Suppose A is self-adjoint and B is symmetric with A-bound less then one. If K is relatively compact with respect to A, then it is also relatively compact with respect to A + B. Proof. Since B is A bounded with A-bound less than one, we can choose a z ∈ C such that BRA (z) 1 and hence BRA+B (z) = BRA (z)(I + BRA (z))−1 (6.37) shows that B is also A + B bounded and the result follows from KRA+B (z) = KRA (z)(I − BRA+B (z)) (6.38) since KRA (z) is compact and BRA+B (z) is bounded. Problem 6.13. Let A and B be self-adjoint operators. Suppose B is rel- atively bounded with respect to A and A + B is self-adjoint. Show that if |B|1/2 RA (z) is Hilbert–Schmidt for one z ∈ ρ(A), then this is true for all z ∈ ρ(A). Moreover, |B|1/2 RA+B (z) is also Hilbert–Schmidt and RA+B (z) − RA (z) is trace class. d2 Problem 6.14. Show that A = − dx2 + q(x), D(A) = H 2 (R) is self-adjoint if q ∈ L∞ (R). Show that if −u (x) + q(x)u(x) = zu(x) has a solution for which u and u are bounded near +∞ (or −∞) but u is not square integrable near +∞ (or −∞), then z ∈ σess (A). (Hint: Use u to construct a Weyl sequence by restricting it to a compact set. Now modify your construction to get a singular Weyl sequence by observing that functions with disjoint support are orthogonal.)
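A heuristic way to see Weyl's theorem at work is to truncate a discrete model: the free Jacobi matrix (1's on the off-diagonals) has essential spectrum [-2, 2], and a finite rank (hence compact) diagonal perturbation can only produce finitely many eigenvalues outside this set. The sketch below (NumPy; the truncation size and the perturbation values are illustrative, and a finite truncation can only mimic the essential spectrum) compares the two spectra:

```python
import numpy as np

# Truncated free Jacobi matrix: eigenvalues fill (-2, 2) densely as N grows.
N = 1000
J = np.diag(np.ones(N - 1), 1) + np.diag(np.ones(N - 1), -1)

# Rank-3 (hence compact) diagonal perturbation localized at the first three sites.
V = np.zeros(N)
V[:3] = [5.0, -4.0, 3.0]
Jp = J + np.diag(V)

for name, M in (("free", J), ("perturbed", Jp)):
    ev = np.linalg.eigvalsh(M)
    outside = ev[(ev < -2 - 1e-6) | (ev > 2 + 1e-6)]
    print(f"{name:9s}: eigenvalues outside [-2,2]: {np.round(outside, 3)}")
```

Only a handful of eigenvalues leave [-2, 2] after the perturbation; the bulk of the spectrum, the stand-in for the essential spectrum, is unchanged, as Theorem 6.19 predicts.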
  • 161. 6.5. Relatively form bounded operators and the KLMN theorem 149 6.5. Relatively form bounded operators and the KLMN theorem In Section 6.1 we have considered the case where the operators A and B have a common domain on which the operator sum is well-defined. In this section we want to look at the case were this is no longer possible, but where it is still possible to add the corresponding quadratic forms. Under suitable conditions this form sum will give rise to an operator via Theorem 2.13. d 2 Example. Let A be the self-adjoint operator A = − dx2 , D(A) = {f ∈ H 2 [0, 1]|f (0) = f (1) = 0} in the Hilbert space L2 (0, 1). If we want to add a potential represented by a multiplication operator with a real-valued (measurable) function q, then we already have seen that q will be relatively bounded if q ∈ L2 (0, 1). Hence, if q ∈ L2 (0, 1), we are out of luck with the theory developed so far. On the other hand, if we look at the corresponding quadratic forms, we have Q(A) = {f ∈ H 1 [0, 1]|f (0) = f (1) = 0} and Q(q) = D(|q|1/2 ). Thus we see that Q(A) ⊂ Q(q) if q ∈ L1 (0, 1). In summary, the operators can be added if q ∈ L2 (0, 1) while the forms can be added under the less restrictive condition q ∈ L1 (0, 1). Finally, note that in some drastic cases, there might even be no way to define the operator sum: Let xj be an enumeration of the rational numbers in (0, 1) and set ∞ 1 q(x) = , j=1 2j |x − xj | where the sum is to be understood as a limit in L1 (0, 1). Then q gives rise to a self-adjoint multiplication operator in L2 (0, 1). However, note that D(A) ∩ D(q) = {0}! In fact, let f ∈ D(A) ∩ D(q). Then f is continuous and q(x)f (x) ∈ L2 (0, 1). Now suppose f (xj ) = 0 for some rational number xj ∈ (0, 1). Then by continuity |f (x)| ≥ δ for x ∈ (xj − ε, xj + ε) and q(x)|f (x)| ≥ δ2−j |x − xj |−1/2 for x ∈ (xj − ε, xj + ε) which shows that q(x)f (x) ∈ L2 (0, 1) and hence f must vanish at every rational point. By continuity, we conclude f = 0. Recall from Section 2.3 that every closed semi-bounded form q = qA corresponds to a self-adjoint operator A (Theorem 2.13). Given a self-adjoint operator A ≥ γ and a (hermitian) form q : Q → R with Q(A) ⊆ Q, we call q relatively form bound with respect to qA if there are constants a, b ≥ 0 such that |q(ψ)| ≤ a qA−γ (ψ) + b ψ 2 , ψ ∈ Q(A). (6.39) The infimum of all possible a is called the form bound of q with respect to qA .
  • 162. 150 6. Perturbation theory for self-adjoint operators Note that we do not require that q is associated with some self-adjoint operator (though it will be in most cases). d 2 Example. Let A = − dx2 , D(A) = {f ∈ H 2 [0, 1]|f (0) = f (1) = 0}. Then q(f ) = |f (c)|2 , f ∈ H 1 [0, 1], c ∈ (0, 1), is a well-defined nonnegative form. Formally, one can interpret q as the quadratic form of the multiplication operator with the delta distribution at x = c. But for f ∈ Q(A) = {f ∈ H 1 [0, 1]|f (0) = f (1) = 0} we have by Cauchy–Schwarz c 1 1 |f (c)|2 = 2 Re f (t)∗ f (t)dt ≤ 2 |f (t)∗ f (t)|dt ≤ ε f 2 + f 2 . 0 0 ε Consequently q is relatively bounded with bound 0 and hence qA + q gives rise to a well-defined operator as we will show in the next theorem. The following result is the analog of the Kato–Rellich theorem and is due to Kato, Lions, Lax, Milgram, and Nelson. Theorem 6.24 (KLMN). Suppose qA : Q(A) → R is a semi-bounded closed hermitian form and q a relatively bounded hermitian form with relative bound less than one. Then qA + q defined on Q(A) is closed and hence gives rise to a semi-bounded self-adjoint operator. Explicitly we have qA +q ≥ (1−a)γ −b. Proof. A straightforward estimate shows qA (ψ) + q(ψ) ≥ (1 − a)qa (ψ) − b ψ 2 ≥ ((1 − a)γ − b) ψ 2 ; that is, qA + q is semi-bounded. Moreover, by 1 2 qA (ψ) ≤ |qA (ψ) + q(ψ)| + b ψ 1−a we see that the norms . qA and . qA +q are equivalent. Hence qA + q is closed and the result follows from Theorem 2.13. In the investigation of the spectrum of the operator A + B a key role is played by the second resolvent formula. In our present case we have the following analog. Theorem 6.25. Suppose A − γ ≥ 0 is self-adjoint and let q be a hermitian form with Q(q) ⊆ Q(A). Then the hermitian form q(RA (−λ)1/2 ψ), ψ ∈ H, (6.40) b corresponds to a bounded operator Cq (λ) with Cq (λ) ≤ a for λ a − γ if and only if q is relatively form bound with constants a and b. In particular, the form bound is given by lim Cq (λ) . (6.41) λ→∞
  • 163. 6.5. Relatively form bounded operators and the KLMN theorem 151 Moreover, if a 1, then RqA +q (−λ) = RA (−λ)1/2 (1 − Cq (λ))−1 RA (−λ)1/2 . (6.42) Here RqA +q (z) is the resolvent of the self-adjoint operator corresponding to qA + q. 1/2 Proof. We will abbreviate C = Cq (λ) and RA = RA (−λ)1/2 . If q is form b bounded, we have for λ a − γ that 1/2 1/2 1/2 2 |q(RA ψ)| ≤ a qA−γ (RA ψ) + b RA ψ b 1/2 2 = a ψ, (A − γ + )RA ψ ≤ a ψ a 1/2 and hence q(RA ψ) corresponds to a bounded operator C. The converse is similar. If a 1, then (1 − C)−1 is a well-defined bounded operator and so is 1/2 1/2 R = RA (1 − C)−1 RA . To see that R is the inverse of A1 − λ, where A1 1/2 is the operator associated with qA + q, take ϕ = RA ϕ ∈ Q(A) and ψ ∈ H. ˜ Then sA1 +λ (ϕ, Rψ) = sA+λ (ϕ, Rψ) + s(ϕ, Rψ) 1/2 1/2 = ϕ, (1 + C)−1 RA ψ + ϕ, C(1 + C)−1 RA ψ = ϕ, ψ . ˜ ˜ Taking ϕ ∈ D(A1 ) ⊆ Q(A), we see (A1 + λ)ϕ, Rψ = ϕ, ψ and thus R = RA1 (−λ) (Problem 6.15). Furthermore, we can define Cq (λ) for all z ∈ ρ(A) using Cq (z) = ((A + λ)1/2 RA (−z)1/2 )∗ Cq (λ)(A + λ)1/2 RA (−z)1/2 . (6.43) We will call q relatively form compact if the operator Cq (z) is compact for one and hence all z ∈ ρ(A). As in the case of relatively compact operators we have Lemma 6.26. Suppose A − γ ≥ 0 is self-adjoint and let q be a hermitian form. If q is relatively form compact with respect to qA , then its relative form bound is 0 and the resolvents of qA + q and qA differ by a compact operator. In particular, by Weyl’s theorem, the operators associated with qA and qA + q have the same essential spectrum. b Proof. Fix λ0 a − γ and let λ ≥ λ0 . Consider the operator D(λ) = (A+λ0 ) 1/2 R (−λ)1/2 and note that D(λ) is a bounded self-adjoint operator A with D(λ) ≤ 1. Moreover, D(λ) converges strongly to 0 as λ → ∞ (cf. Problem 3.7). Hence D(λ)C(λ0 ) → 0 by Lemma 6.8 and the same is true for C(λ) = D(λ)C(λ0 )D(λ). So the relative bound is zero by (6.41).
  • 164. 152 6. Perturbation theory for self-adjoint operators Finally, the resolvent difference is compact by (6.42) since (1 + C)−1 = 1 − C(1 + C)−1 . Corollary 6.27. Suppose A−γ ≥ 0 is self-adjoint and let q1 , q2 be hermitian forms. If q1 is relatively bounded with bound less than one and q2 is relatively compact, then the resolvent difference of qA + q1 + q2 and qA + q1 is compact. In particular, the operators associated with qA + q1 and qA + q1 + q2 have the same essential spectrum. Proof. Just observe that Cq1 +q2 = Cq1 + Cq2 and (1 + Cq1 + Cq2 )−1 = (1 + Cq1 )−1 − (1 + Cq1 )−1 Cq2 (1 + Cq1 + Cq2 )−1 . Finally we turn to the special case where q = qB for some self-adjoint operator B. In this case we have CB (z) = (|B|1/2 RA (−z)1/2 )∗ sign(B)|B|1/2 RA (−z)1/2 (6.44) and hence CB (z) ≤ |B|1/2 RA (−z)1/2 2 (6.45) with equality if V ≥ 0. Thus the following result is not too surprising. Lemma 6.28. Suppose A − γ ≥ 0 and B is self-adjoint. Then the following are equivalent: (i) B is A form bounded. (ii) Q(A) ⊆ Q(B). (iii) |B|1/2 RA (z)1/2 is bounded for one (and hence for all) z ∈ ρ(A). Proof. (i) ⇒ (ii) is true by definition. (ii) ⇒ (iii) since |B|1/2 RA (z)1/2 is a closed (Problem 2.9) operator defined on all of H and hence bounded by the closed graph theorem (Theorem 2.8). To see (iii) ⇒ (i), observe |B|1/2 RA (z)1/2 = |B|1/2 RA (z0 )1/2 (A − z0 )1/2 RA (z)1/2 which shows that |B|1/2 RA (z)1/2 is bounded for all z ∈ ρ(A) if it is bounded for one z0 ∈ ρ(A). But then (6.45) shows that (i) holds. Clearly C(λ) will be compact if |B|1/2 RA (z)1/2 is compact. However, 1/2 since RA (z) might be hard to compute, we provide the following more handy criterion. Lemma 6.29. Suppose A − γ ≥ 0 and B is self-adjoint where B is rela- tively form bounded with bound less than one. Then the resolvent difference RA+B (z) − RA (z) is compact if |B|1/2 RA (z) is compact and trace class if |B|1/2 RA (z) is Hilbert–Schmidt.
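The elementary bound |f(c)|^2 ≤ ε‖f′‖^2 + ε^{-1}‖f‖^2 from the example preceding Theorem 6.24 (which is what makes the delta form q(f) = |f(c)|^2 relatively form bounded with bound zero) is easy to test numerically. A sketch (NumPy; the test function, the point c, and the grid are arbitrary choices):

```python
import numpy as np

# Check |f(c)|^2 <= eps*||f'||^2 + (1/eps)*||f||^2 for an f in H^1[0,1] with
# f(0) = f(1) = 0, for several values of eps.
N = 2000
x = np.linspace(0.0, 1.0, N + 1)
h = x[1] - x[0]
f = np.sin(np.pi * x) * np.exp(np.cos(5 * x))      # vanishes at both endpoints
fp = np.gradient(f, x)                             # numerical derivative

norm2 = np.sum(np.abs(f)**2) * h                   # ||f||^2
dnorm2 = np.sum(np.abs(fp)**2) * h                 # ||f'||^2

c = 0.37
fc2 = np.interp(c, x, np.abs(f)**2)
for eps in (0.1, 1.0, 10.0):
    print(f"eps = {eps:4.1f}:  |f(c)|^2 = {fc2:.4f} <= {eps * dnorm2 + norm2 / eps:.4f}")
```

Letting ε → 0 makes the coefficient of ‖f′‖^2 arbitrarily small, which is exactly the statement that the relative form bound is zero.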
  • 165. 6.6. Strong and norm resolvent convergence 153 Proof. Abbreviate RA = RA (−λ), B1 = |B|1/2 , B2 = sign(B)|B|1/2 . Then 1/2 ˜ 1/2 ˜ we have (1 − CB )−1 = 1 − (B1 RA )∗ (1 + CB )−1 B2 RA , where CB = 1/2 1/2 ∗ ˜ B2 RA (B1 RA ) . Hence RA+B − RA = (B1 RA )∗ (1 + CB )−1 B2 RA and the claim follows. Moreover, the second resolvent formula still holds when interpreted suit- ably: Lemma 6.30. Suppose A − γ ≥ 0 and B is self-adjoint. If Q(A) ⊆ Q(B) and qA + qB is a closed semi-bounded form. Then RA+B (z) = RA (z) − (|B|1/2 RA+B (z ∗ ))∗ sign(B)|B|1/2 RA (z) = RA (z) − (|B|1/2 RA (z ∗ ))∗ sign(B)|B|1/2 RA+B (z) (6.46) for z ∈ ρ(A) ∩ ρ(A + B). Here A + B is the self-adjoint operator associated with qA + qB . Proof. Let ϕ ∈ D(A + B) and ψ ∈ H. Denote the right-hand side in (6.46) by R(z) and abbreviate R = R(z), RA = RA (z), B1 = |B|1/2 , B2 = sign(B)|B|1/2 . Then, using sA+B−z (ϕ, ψ) = (A + B + z ∗ )ϕ, ψ , sA+B−z (ϕ, Rψ) = sA+B−z (ϕ, RA ψ) − B1 RA+B (A + B + z ∗ )ϕ, B2 RA ψ ∗ = sA+B−z (ϕ, RA ψ) − sB (ϕ, RA ψ) = sA−z (ϕ, RA ψ) = ϕ, ψ . Thus R = RA+B (z) (Problem 6.15). The second equality follows after ex- changing the roles of A and A + B. It can be shown using abstract interpolation techniques that if B is relatively bounded with respect to A, then it is also relatively form bounded. In particular, if B is relatively bounded, then BRA (z) is bounded and it is not hard to check that (6.46) coincides with (6.4). Consequently A + B defined as operator sum is the same as A + B defined as form sum. Problem 6.15. Suppose A is closed and R is bounded. Show that R = RA (z) if and only if (A − z)∗ ϕ, Rψ = ϕ, ψ for all ϕ ∈ D(A∗ ), ψ ∈ H. Problem 6.16. Let q be relatively form bounded with constants a and b. b Show that Cq (λ) satisfies C(λ) ≤ max(a, λ+γ ) for λ −γ. Furthermore, show that C(λ) decreases as λ → ∞. 6.6. Strong and norm resolvent convergence Suppose An and A are self-adjoint operators. We say that An converges to A in the norm, respectively, strong resolvent sense, if lim RAn (z) = RA (z), respectively, s-lim RAn (z) = RA (z), (6.47) n→∞ n→∞
  • 166. 154 6. Perturbation theory for self-adjoint operators for one z ∈ Γ = CΣ, Σ = σ(A) ∪ n σ(An ). In fact, in the case of strong resolvent convergence it will be convenient to include the case if An s is only defined on some subspace Hn ⊆ H, where we require Pn → 1 for the orthogonal projection onto Hn . In this case RAn (z) (respectively, any other function of An ) has to be understood as RAn (z)Pn , where Pn is the orthogonal projector onto Hn . (This generalization will produce nothing new in the norm case, since Pn → 1 implies Pn = 1 for sufficiently large n.) Using the Stone–Weierstraß theorem, we obtain as a first consequence Theorem 6.31. Suppose An converges to A in the norm resolvent sense. Then f (An ) converges to f (A) in norm for any bounded continuous function f : Σ → C with limλ→−∞ f (λ) = limλ→∞ f (λ). If An converges to A in the strong resolvent sense, then f (An ) converges to f (A) strongly for any bounded continuous function f : Σ → C. Proof. The set of functions for which the claim holds clearly forms a ∗- subalgebra (since resolvents are normal, taking adjoints is continuous even with respect to strong convergence) and since it contains f (λ) = 1 and 1 f (λ) = λ−z0 , this ∗-subalgebra is dense by the Stone–Weierstraß theorem ε (cf. Problem 1.21). The usual 3 argument shows that this ∗-subalgebra is also closed. It remains to show the strong resolvent case for arbitrary bounded con- tinuous functions. Let χn be a compactly supported continuous function s (0 ≤ χm ≤ 1) which is one on the interval [−m, m]. Then χm (An ) → χm (A), s f (An )χm (An ) → f (A)χm (A) by the first part and hence (f (An ) − f (A))ψ ≤ f (An ) (1 − χm (A))ψ + f (An ) (χm (A) − χm (An ))ψ + (f (An )χm (An ) − f (A)χm (A))ψ + f (A) (1 − χm (A))ψ s can be made arbitrarily small since f (.) ≤ f ∞ and χm (.) → I by Theo- rem 3.1. As a consequence, note that the point z ∈ Γ is of no importance, that is, Corollary 6.32. Suppose An converges to A in the norm or strong resolvent sense for one z0 ∈ Γ. Then this holds for all z ∈ Γ. Also, Corollary 6.33. Suppose An converges to A in the strong resolvent sense. Then s eitAn → eitA , t ∈ R, (6.48)
  • 167. 6.6. Strong and norm resolvent convergence 155 and if all operators are semi-bounded by the same bound s e−tAn → e−tA , t ≥ 0. (6.49) Next we need some good criteria to check for norm, respectively, strong, resolvent convergence. Lemma 6.34. Let An , A be self-adjoint operators with D(An ) = D(A). Then An converges to A in the norm resolvent sense if there are sequences an and bn converging to zero such that (An − A)ψ ≤ an ψ + bn Aψ , ψ ∈ D(A) = D(An ). (6.50) Proof. From the second resolvent formula RAn (z) − RA (z) = RAn (z)(A − An )RA (z), we infer (RAn (i) − RA (i))ψ ≤ RAn (i) an RA (i)ψ + bn ARA (i)ψ ≤ (an + bn ) ψ and hence RAn (i) − RA (i) ≤ an + bn → 0. In particular, norm convergence implies norm resolvent convergence: Corollary 6.35. Let An , A be bounded self-adjoint operators with An → A. Then An converges to A in the norm resolvent sense. Similarly, if no domain problems get in the way, strong convergence implies strong resolvent convergence: Lemma 6.36. Let An , A be self-adjoint operators. Then An converges to A in the strong resolvent sense if there is a core D0 of A such that for any ψ ∈ D0 we have Pn ψ ∈ D(An ) for n sufficiently large and An ψ → Aψ. Proof. We begin with the case Hn = H. Using the second resolvent formula, we have (RAn (i) − RA (i))ψ ≤ (A − An )RA (i)ψ → 0 for ψ ∈ (A − i)D0 which is dense, since D0 is a core. The rest follows from Lemma 1.14. ˜ s If Hn ⊂ H, we can consider An = An ⊕ 0 and conclude R ˜ (i) → RA (i) An from the first case. By RAn (i) = RAn (i) − i(1 − Pn ) the same is true for ˜ s RAn (i) since 1 − Pn → 0 by assumption. If you wonder why we did not define weak resolvent convergence, here is the answer: it is equivalent to strong resolvent convergence.
  • 168. 156 6. Perturbation theory for self-adjoint operators Lemma 6.37. Suppose w-limn→∞ RAn (z) = RA (z) for some z ∈ Γ. Then s-limn→∞ RAn (z) = RA (z) also. Proof. By RAn (z) RA (z) we also have RAn (z)∗ RA (z)∗ and thus by the first resolvent formula RAn (z)ψ 2 − RA (z)ψ 2 = ψ, RAn (z ∗ )RAn (z)ψ − RA (z ∗ )RA (z)ψ 1 = ψ, (RAn (z) − RAn (z ∗ ) + RA (z) − RA (z ∗ ))ψ → 0. z − z∗ Together with RAn (z)ψ RA (z)ψ we have RAn (z)ψ → RA (z)ψ by virtue of Lemma 1.12 (iv). Now what can we say about the spectrum? Theorem 6.38. Let An and A be self-adjoint operators. If An converges to A in the strong resolvent sense, we have σ(A) ⊆ limn→∞ σ(An ). If An converges to A in the norm resolvent sense, we have σ(A) = limn→∞ σ(An ). Proof. Suppose the first claim were incorrect. Then we can find a λ ∈ σ(A) and some ε 0 such that σ(An ) ∩ (λ − ε, λ + ε) = ∅. Choose a bounded ε ε continuous function f which is one on (λ − 2 , λ + 2 ) and which vanishes outside (λ − ε, λ + ε). Then f (An ) = 0 and hence f (A)ψ = lim f (An )ψ = 0 for every ψ. On the other hand, since λ ∈ σ(A), there is a nonzero ψ ∈ ε ε Ran PA ((λ − 2 , λ + 2 )) implying f (A)ψ = ψ, a contradiction. To see the second claim, recall that the norm of RA (z) is just one over the distance from the spectrum. In particular, λ ∈ σ(A) if and only if RA (λ + i) 1. So λ ∈ σ(A) implies RA (λ + i) 1, which implies RAn (λ + i) 1 for n sufficiently large, which implies λ ∈ σ(An ) for n sufficiently large. Example. Note that the spectrum can contract if we only have convergence 1 in the strong resolvent sense: Let An be multiplication by n x in L2 (R). Then An converges to 0 in the strong resolvent sense, but σ(An ) = R and σ(0) = {0}. Lemma 6.39. Suppose An converges in the strong resolvent sense to A. If PA ({λ}) = 0, then s-lim PAn ((−∞, λ)) = s-lim PAn ((−∞, λ]) = PA ((−∞, λ)) = PA ((−∞, λ]). n→∞ n→∞ (6.51) Proof. By Theorem 6.31 the spectral measures µn,ψ corresponding to An converge vaguely to those of A. Hence PAn (Ω)ψ 2 = µn,ψ (Ω) together with Lemma A.25 implies the claim.
• 169. 6.6. Strong and norm resolvent convergence 157

Using $P((\lambda_0, \lambda_1)) = P((-\infty, \lambda_1)) - P((-\infty, \lambda_0])$, we also obtain the following.

Corollary 6.40. Suppose $A_n$ converges in the strong resolvent sense to $A$. If $P_A(\{\lambda_0\}) = P_A(\{\lambda_1\}) = 0$, then
$$ \operatorname*{s-lim}_{n\to\infty} P_{A_n}((\lambda_0, \lambda_1)) = \operatorname*{s-lim}_{n\to\infty} P_{A_n}([\lambda_0, \lambda_1]) = P_A((\lambda_0, \lambda_1)) = P_A([\lambda_0, \lambda_1]). \tag{6.52} $$

Example. The following example shows that the requirement $P_A(\{\lambda\}) = 0$ is crucial, even if we have bounded operators and norm convergence. In fact, let $H = \mathbb{C}^2$ and
$$ A_n = \frac{1}{n} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \tag{6.53} $$
Then $A_n \to 0$ and
$$ P_{A_n}((-\infty, 0)) = P_{A_n}((-\infty, 0]) = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \tag{6.54} $$
but $P_0((-\infty, 0)) = 0$ and $P_0((-\infty, 0]) = I$.

Problem 6.17. Show that for self-adjoint operators, strong resolvent convergence is equivalent to convergence with respect to the metric
$$ d(A, B) = \sum_{n\in\mathbb{N}} \frac{1}{2^n} \,\|(R_A(\mathrm{i}) - R_B(\mathrm{i}))\varphi_n\|, \tag{6.55} $$
where $\{\varphi_n\}_{n\in\mathbb{N}}$ is some (fixed) ONB.

Problem 6.18 (Weak convergence of spectral measures). Suppose $A_n \to A$ in the strong resolvent sense and let $\mu_{n,\psi}$, $\mu_\psi$ be the corresponding spectral measures. Show that
$$ \int f(\lambda)\, d\mu_{n,\psi}(\lambda) \to \int f(\lambda)\, d\mu_\psi(\lambda) \tag{6.56} $$
for every bounded continuous $f$. Give a counterexample when $f$ is not continuous.
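The example (6.53)-(6.54) can be reproduced verbatim on a computer; it shows the spectral projections failing to converge precisely because the limit operator has an eigenvalue at the endpoint λ = 0. A NumPy sketch:

```python
import numpy as np

# A_n = diag(1, -1)/n converges to 0 in norm, yet P_{A_n}((-inf, 0]) stays equal
# to diag(0, 1), while for the limit A = 0 the projection onto (-inf, 0] is the identity.
for n in (1, 10, 100):
    An = np.diag([1.0, -1.0]) / n
    ev, U = np.linalg.eigh(An)
    neg = U[:, ev <= 0]
    P = neg @ neg.T                      # spectral projection P_{A_n}((-inf, 0])
    print(f"n = {n:3d}:  P_(-inf,0] =\n{P}")
print("For the limit A = 0 the projection onto (-inf, 0] is the 2x2 identity.")
```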
  • 173. Chapter 7 The free Schr¨dinger o operator 7.1. The Fourier transform We first review some basic facts concerning the Fourier transform which will be needed in the following section. Let C ∞ (Rn ) be the set of all complex-valued functions which have partial derivatives of arbitrary order. For f ∈ C ∞ (Rn ) and α ∈ Nn we set 0 ∂ |α| f ∂α f = , xα = xα1 · · · xαn , |α| = α1 + · · · + αn . (7.1) ∂xα1 · · · ∂xαn 1 n 1 n An element α ∈ Nn is called a multi-index and |α| is called its order. 0 Recall the Schwartz space S(Rn ) = {f ∈ C ∞ (Rn )| sup |xα (∂β f )(x)| ∞, α, β ∈ Nn } 0 (7.2) x ∞ which is dense in L2 (Rn ) (since Cc (Rn ) ⊂ S(Rn ) is). Note that if f ∈ S(Rn ), then the same is true for xα f (x) and (∂α f )(x) for any multi-index α. For f ∈ S(Rn ) we define 1 ˆ F(f )(p) ≡ f (p) = e−ipx f (x)dn x. (7.3) (2π)n/2 Rn Then, Lemma 7.1. The Fourier transform maps the Schwartz space into itself, F : S(Rn ) → S(Rn ). Furthermore, for any multi-index α ∈ Nn and any 0 f ∈ S(Rn ) we have (∂α f )∧ (p) = (ip)α f (p), ˆ (xα f (x))∧ (p) = i|α| ∂α f (p). ˆ (7.4) 161
  • 174. 162 7. The free Schr¨dinger operator o Proof. First of all, by integration by parts, we see ∂ 1 ∂ ( f (x))∧ (p) = e−ipx f (x)dn x ∂xj (2π)n/2 Rn ∂xj 1 ∂ −ipx = − e f (x)dn x (2π)n/2 Rn ∂xj 1 = ipj e−ipx f (x)dn x = ipj f (p). ˆ (2π)n/2 Rn So the first formula follows by induction. Similarly, the second formula follows from induction using 1 (xj f (x))∧ (p) = xj e−ipx f (x)dn x (2π)n/2 Rn 1 ∂ −ipx ∂ ˆ = i e f (x)dn x = i f (p), (2π)n/2 Rn ∂pj ∂pj where interchanging the derivative and integral is permissible by Prob- ˆ lem A.8. In particular, f (p) is differentiable. ˆ To see that f ∈ S(Rn ) if f ∈ S(Rn ), we begin with the observation ˆ ˆ ˆ that f is bounded; in fact, f ∞ ≤ (2π)−n/2 f 1 . But then pα (∂β f )(p) = i−|α|−|β| (∂ α xβ f (x))∧ (p) is bounded since ∂ α xβ f (x) ∈ S(Rn ) if f ∈ S(Rn ). Hence we will sometimes write pf (x) for −i∂f (x), where ∂ = (∂1 , . . . , ∂n ) is the gradient. Two more simple properties are left as an exercise. Lemma 7.2. Let f ∈ S(Rn ). Then (f (x + a))∧ (p) = eiap f (p), ˆ a ∈ Rn , (7.5) 1 ˆ p (f (λx))∧ (p) = n f ( ), λ 0. (7.6) λ λ Next, we want to compute the inverse of the Fourier transform. For this the following lemma will be needed. 2 /2 Lemma 7.3. We have e−zx ∈ S(Rn ) for Re(z) 0 and 2 /2 1 −p2 /(2z) F(e−zx )(p) = e . (7.7) z n/2 √ Here z n/2 has to be understood as ( z)n , where the branch cut of the root is chosen along the negative real axis. Proof. Due to the product structure of the exponential, one can treat each coordinate separately, reducing the problem to the case n = 1.
  • 175. 7.1. The Fourier transform 163 ˆ Let φz (x) = exp(−zx2 /2). Then φz (x)+zxφz (x) = 0 and hence i(pφz (p)+ ˆ ˆ z φz (p)) = 0. Thus φz (p) = cφ1/z (p) and (Problem 7.1) ˆ 1 1 c = φz (0) = √ exp(−zx2 /2)dx = √ 2π R z at least for z 0. However, since the integral is holomorphic for Re(z) 0 by Problem A.10, this holds for all z with Re(z) 0 if we choose the branch cut of the root along the negative real axis. Now we can show Theorem 7.4. The Fourier transform F : S(Rn ) → S(Rn ) is a bijection. Its inverse is given by 1 F −1 (g)(x) ≡ g (x) = ˇ eipx g(p)dn p. (7.8) (2π)n/2 Rn We have F 2 (f )(x) = f (−x) and thus F 4 = I. Proof. Abbreviate φε (x) = exp(−εx2 /2). By dominated convergence we have 1 (f (p))∨ (x) = ˆ ˆ eipx f (p)dn p (2π)n/2 Rn 1 ˆ = lim φε (p)eipx f (p)dn p ε→0 (2π)n/2 Rn 1 = lim φε (p)eipx f (y)e−ipx dn ydn p, ε→0 (2π)n Rn Rn and, invoking Fubini and Lemma 7.2, we further see 1 = lim (φε (p)eipx )∧ (y)f (y)dn y ε→0 (2π)n/2 Rn 1 1 = lim φ1/ε (y − x)f (y)dn y ε→0 (2π)n/2 Rn εn/2 1 √ = lim φ1 (z)f (x + εz)dn z = f (x), ε→0 (2π)n/2 Rn y−x which finishes the proof, where we used the change of coordinates z = √ ε and again dominated convergence in the last two steps. From Fubini’s theorem we also obtain Parseval’s identity 1 ˆ |f (p)|2 dn p = f (x)∗ f (p)eipx dn p dn x ˆ Rn (2π)n/2 Rn Rn = |f (x)|2 dn x (7.9) Rn
  • 176. 164 7. The free Schr¨dinger operator o for f ∈ S(Rn ). Thus, by Theorem 0.26, we can extend F to L2 (Rn ) by setting 1 ˆ f (p) = lim e−ipx f (x)dn x, (7.10) R→∞ (2π)n/2 |x|≤R where the limit is to be understood in L2 (Rn ) (Problem 7.5). If f ∈ L1 (Rn )∩ ˆ L2 (Rn ), we can omit the limit (why?) and f is still given by (7.3). Theorem 7.5. The Fourier transform F extends to a unitary operator F : L2 (Rn ) → L2 (Rn ). Its spectrum is given by σ(F) = {z ∈ C|z 4 = 1} = {1, −1, i, −i}. (7.11) Proof. As already noted, F extends uniquely to a bounded operator on L2 (Rn ). Moreover, the same is true for F −1 . Since Parseval’s identity remains valid by continuity of the norm, this extension is a unitary operator. It remains to compute the spectrum. In fact, if ψn is a Weyl sequence, then (F 2 + z 2 )(F + z)(F − z)ψn = (F 4 − z 4 )ψn = (1 − z 4 )ψn → 0 implies z 4 = 1. Hence σ(F) ⊆ {z ∈ C|z 4 = 1}. We defer the proof for equality to Section 8.3, where we will explicitly compute an orthonormal basis of eigenfunctions. Lemma 7.1 also allows us to extend differentiation to a larger class. Let us introduce the Sobolev space ˆ H r (Rn ) = {f ∈ L2 (Rn )||p|r f (p) ∈ L2 (Rn )}. (7.12) Then, every function in H r (Rn ) has partial derivatives up to order r, which are defined via ∂α f = ((ip)α f (p))∨ , ˆ f ∈ H r (Rn ), |α| ≤ r. (7.13) By Lemma 7.1 this definition coincides with the usual one for every f ∈ S(Rn ) and we have g(x)(∂α f )(x)dn x = g ∗ , (∂α f ) = g (p)∗ , (ip)α f (p) ˆ ˆ Rn = (−1)|α| (ip)α g (p)∗ , f (p) = (−1)|α| ∂α g ∗ , f ˆ ˆ = (−1)|α| (∂α g)(x)f (x)dn x, (7.14) Rn for f, g ∈ H r (Rn ). Furthermore, recall that a function h ∈ L1 (Rn ) satisfy- loc ing ϕ(x)h(x)dn x = (−1)|α| (∂α ϕ)(x)f (x)dn x, ∞ ϕ ∈ Cc (Rn ), (7.15) Rn Rn is also called the weak derivative or the derivative in the sense of distri- butions of f (by Lemma 0.37 such a function is unique if it exists). Hence,
  • 177. 7.1. The Fourier transform 165 choosing g = ϕ in (7.14), we see that H r (Rn ) is the set of all functions hav- ing partial derivatives (in the sense of distributions) up to order r, which are in L2 (Rn ). Finally, we note that on L1 (Rn ) we have Lemma 7.6 (Riemann-Lebesgue). Let C∞ (Rn ) denote the Banach space of all continuous functions f : Rn → C which vanish at ∞ equipped with the sup norm. Then the Fourier transform is a bounded injective map from L1 (Rn ) into C∞ (Rn ) satisfying ˆ f ∞ ≤ (2π)−n/2 f 1. (7.16) ˆ Proof. Clearly we have f ∈ C∞ (Rn ) if f ∈ S(Rn ). Moreover, since S(Rn ) is dense in L1 (Rn ), the estimate 1 1 ˆ sup |f (p)| ≤ sup |e−ipx f (x)|dn x = |f (x)|dn x p (2π)n/2 p Rn (2π)n/2 Rn shows that the Fourier transform extends to a continuous map from L1 (Rn ) into C∞ (Rn ). ˆ To see that the Fourier transform is injective, suppose f = 0. Then Fubini implies 0= ˆ ϕ(x)f (x)dn x = ϕ(x)f (x)dn x ˆ Rn Rn for every ϕ ∈ S(Rn ). Hence Lemma 0.37 implies f = 0. Note that F : L1 (Rn ) → C∞ (Rn ) is not onto (cf. Problem 7.7). Another useful property is the convolution formula. Lemma 7.7. The convolution (f ∗ g)(x) = f (y)g(x − y)dn y = f (x − y)g(y)dn y (7.17) Rn Rn of two functions f, g ∈ L1 (Rn ) is again in L1 (Rn ) and we have Young’s inequality f ∗g 1 ≤ f 1 g 1. (7.18) Moreover, its Fourier transform is given by (f ∗ g)∧ (p) = (2π)n/2 f (p)ˆ(p). ˆ g (7.19) Proof. The fact that f ∗ g is in L1 together with Young’s inequality follows by applying Fubini’s theorem to h(x, y) = f (x − y)g(y). For the last claim
  • 178. 166 7. The free Schr¨dinger operator o we compute 1 (f ∗ g)∧ (p) = e−ipx f (y)g(x − y)dn y dn x (2π)n/2 Rn Rn 1 = e−ipy f (y) e−ip(x−y) g(x − y)dn x dn y Rn (2π)n/2 Rn = e−ipy f (y)ˆ(p)dn y = (2π)n/2 f (p)ˆ(p), g ˆ g Rn where we have again used Fubini’s theorem. In other words, L1 (Rn ) together with convolution as a product is a Banach algebra (without identity). For the case of convolution on L2 (Rn ) see Problem 7.9. √ Problem 7.1. Show that R exp(−x2 /2)dx = 2π. (Hint: Square the inte- gral and evaluate it using polar coordinates.) Problem 7.2. Compute the Fourier transform of the following functions f : R → C: 1 (i) f (x) = χ(−1,1) (x). (ii) f (p) = p2 +k2 , Re(k) 0. Problem 7.3. Suppose f (x) ∈ L1 (R) and g(x) = −ixf (x) ∈ L1 (R). Then ˆ ˆ f is differentiable and f = g . ˆ Problem 7.4. A function f : Rn → C is called spherically symmetric if it is invariant under rotations; that is, f (Ox) = f (x) for all O ∈ SO(Rn ) (equivalently, f depends only on the distance to the origin |x|). Show that the Fourier transform of a spherically symmetric function is again spherically symmetric. Problem 7.5. Show (7.10). (Hint: First suppose f has compact support. ∞ Then there is a sequence of functions fn ∈ Cc (Rn ) converging to f in L2 . The support of these functions can be chosen inside a fixed set and hence this sequence also converges to f in L1 . Thus (7.10) follows for f ∈ L2 with compact support. To remove this restriction, use that the projection onto a ball with radius R converges strongly to the identity as R → ∞.) Problem 7.6. Show that C∞ (Rn ) is indeed a Banach space. Show that S(Rn ) is dense. Problem 7.7. Show that F : L1 (Rn ) → C∞ (Rn ) is not onto as follows: (i) The range of F is dense. (ii) F is onto if and only if it has a bounded inverse. (iii) F has no bounded inverse.
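Both the Gaussian formula (7.7) and the convolution formula (7.19) lend themselves to quick numerical checks in dimension one. The sketch below (NumPy; the parameters z and p0, the Gaussians, and the grid are arbitrary choices, and a periodic FFT on a large box merely stands in for the Fourier transform on R):

```python
import numpy as np

# (1) Gaussian formula (7.7) at a single point p0, for complex z with Re(z) > 0,
#     via a plain Riemann sum for the Fourier integral.
z, p0 = 1.5 + 0.7j, 2.3
x = np.linspace(-20, 20, 200001)
dx = x[1] - x[0]
lhs = np.sum(np.exp(-1j * p0 * x - z * x**2 / 2)) * dx / np.sqrt(2 * np.pi)
rhs = np.exp(-p0**2 / (2 * z)) / np.sqrt(z)        # principal branch, cut along (-inf, 0]
print("Gaussian formula (7.7):", abs(lhs - rhs))

# (2) Convolution of two Gaussians, exp(-x^2) * exp(-2x^2) = sqrt(pi/3) exp(-2x^2/3),
#     computed via the (discrete) convolution theorem underlying (7.19).
L, N = 60.0, 4096
h = L / N
xg = (np.arange(N) - N // 2) * h
f = np.exp(-xg**2)
g = np.exp(-2 * xg**2)
fhat = np.fft.fft(np.fft.ifftshift(f))
ghat = np.fft.fft(np.fft.ifftshift(g))
conv = np.fft.fftshift(np.fft.ifft(fhat * ghat)).real * h
exact = np.sqrt(np.pi / 3) * np.exp(-2 * xg**2 / 3)
print("convolution check     :", np.max(np.abs(conv - exact)))
```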
  • 179. 7.2. The free Schr¨dinger operator o 167 (Hint for (iii): Suppose ϕ is smooth with compact support in (0, 1) and set fm (x) = m eikx ϕ(x − k). Then fm 1 = m ϕ 1 and fm ∞ ≤ const k=1 ˆ since ϕ ∈ S(R) and hence ϕ(p) ≤ const(1 + |p|) −2 ). Problem 7.8. Show that the convolution of two S(Rn ) functions is in S(Rn ). Problem 7.9. Show that the convolution of two L2 (Rn ) functions is in C∞ (Rn ) and we have f ∗ g ∞ ≤ f 2 g 2 . Problem 7.10 (Wiener). Suppose f ∈ L2 (Rn ). Then the set {f (x + a)|a ∈ ˆ Rn } is total in L2 (Rn ) if and only if f (p) = 0 a.e. (Hint: Use Lemma 7.2 and the fact that a subspace is total if and only if its orthogonal complement is zero.) ˆ Problem 7.11. Suppose f (x)ek|x| ∈ L1 (R) for some k 0. Then f (p) has an analytic extension to the strip | Im(p)| k. 7.2. The free Schr¨dinger operator o In Section 2.1 we have seen that the Hilbert space corresponding to one particle in R3 is L2 (R3 ). More generally, the Hilbert space for N particles in Rd is L2 (Rn ), n = N d. The corresponding nonrelativistic Hamilton operator, if the particles do not interact, is given by H0 = −∆, (7.20) where ∆ is the Laplace operator n ∂2 ∆= . (7.21) j=1 ∂x2 j Here we have chosen units such that all relevant physical constants disap- pear; that is, = 1 and the mass of the particles is equal to m = 1 . Be 2 1 aware that some authors prefer to use m = 1; that is, H0 = − 2 ∆. Our first task is to find a good domain such that H0 is a self-adjoint operator. By Lemma 7.1 we have that − ∆ψ(x) = (p2 ψ(p))∨ (x), ˆ ψ ∈ H 2 (Rn ), (7.22) and hence the operator H0 ψ = −∆ψ, D(H0 ) = H 2 (Rn ), (7.23) is unitarily equivalent to the maximally defined multiplication operator (F H0 F −1 )ϕ(p) = p2 ϕ(p), D(p2 ) = {ϕ ∈ L2 (Rn )|p2 ϕ(p) ∈ L2 (Rn )}. (7.24)
  • 180. 168 7. The free Schr¨dinger operator o Theorem 7.8. The free Schr¨dinger operator H0 is self-adjoint and its o spectrum is characterized by σ(H0 ) = σac (H0 ) = [0, ∞), σsc (H0 ) = σpp (H0 ) = ∅. (7.25) Proof. It suffices to show that dµψ is purely absolutely continuous for every ψ. First observe that ˆ |ψ(p)|2 n 1 ˆ ˆ ψ, RH0 (z)ψ = ψ, Rp2 (z)ψ = d p= d˜ψ (r), µ Rn p2 − z R r2 −z where d˜ψ (r) = χ[0,∞) (r)rn−1 µ ˆ |ψ(rω)|2 dn−1 ω dr. S n−1 Hence, after a change of coordinates, we have 1 ψ, RH0 (z)ψ = dµψ (λ), R λ−z where 1 √ dµψ (λ) = χ[0,∞) (λ)λn/2−1 ˆ |ψ( λω)|2 dn−1 ω dλ, 2 S n−1 proving the claim. Finally, we note that the compactly supported smooth functions are a core for H0 . ∞ Lemma 7.9. The set Cc (Rn ) = {f ∈ S(Rn )| supp(f ) is compact} is a core for H0 . Proof. It is not hard to see that S(Rn ) is a core (Problem 7.12) and hence it suffices to show that the closure of H0 |Cc (Rn ) contains H0 |S(Rn ) . To see this, ∞ let ϕ(x) ∈ Cc ∞ (Rn ) which is one for |x| ≤ 1 and vanishes for |x| ≥ 2. Set 1 ∞ ϕn (x) = ϕ( n x). Then ψn (x) = ϕn (x)ψ(x) is in Cc (Rn ) for every ψ ∈ S(Rn ) and ψn → ψ, respectively, ∆ψn → ∆ψ. Note also that the quadratic form of H0 is given by n qH0 (ψ) = |∂j ψ(x)|2 dn x, ψ ∈ Q(H0 ) = H 1 (Rn ). (7.26) j=1 Rn Problem 7.12. Show that S(Rn ) is a core for H0 . (Hint: Show that the closure of H0 |S(Rn ) contains H0 .) Problem 7.13. Show that {ψ ∈ S(R)|ψ(0) = 0} is dense but not a core for d2 H0 = − dx2 .
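The unitary equivalence (7.22)-(7.24) of H0 with multiplication by p^2 can be illustrated with an FFT: for a Gaussian, -ψ'' is known in closed form and can be compared with the inverse Fourier transform of p^2 ψ̂. A sketch (NumPy; box size and grid are illustrative, and the periodic FFT replaces the Fourier transform on R up to negligible truncation error):

```python
import numpy as np

L, N = 40.0, 2048
x = np.linspace(-L / 2, L / 2, N, endpoint=False)
p = 2 * np.pi * np.fft.fftfreq(N, d=L / N)

psi = np.exp(-x**2 / 2)                              # Gaussian test function
minus_dd_exact = (1 - x**2) * np.exp(-x**2 / 2)      # -psi'' computed by hand
minus_dd_fft = np.fft.ifft(p**2 * np.fft.fft(psi)).real   # F^{-1}(p^2 F psi), cf. (7.22)

print("max deviation:", np.max(np.abs(minus_dd_fft - minus_dd_exact)))
```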
  • 181. 7.3. The time evolution in the free case 169 7.3. The time evolution in the free case Now let us look at the time evolution. We have 2 e−itH0 ψ(x) = F −1 e−itp ψ(p). ˆ (7.27) The right-hand side is a product and hence our operator should be express- ible as an integral operator via the convolution formula. However, since 2 e−itp is not in L2 , a more careful analysis is needed. Consider 2 fε (p2 ) = e−(it+ε)p , ε 0. (7.28) Then fε (H0 )ψ → e−itH0 ψ by Theorem 3.1. Moreover, by Lemma 7.3 and the convolution formula we have 1 |x−y|2 − fε (H0 )ψ(x) = e 4(it+ε) ψ(y)dn y (7.29) (4π(it + ε))n/2 Rn and hence 1 |x−y|2 e−itH0 ψ(x) = ei 4t ψ(y)dn y (7.30) (4πit)n/2 Rn for t = 0 and ψ ∈ L1 ∩ L2 . For general ψ ∈ L2 the integral has to be understood as a limit. Using this explicit form, it is not hard to draw some immediate conse- quences. For example, if ψ ∈ L2 (Rn ) ∩ L1 (Rn ), then ψ(t) ∈ C(Rn ) for t = 0 (use dominated convergence and continuity of the exponential) and satisfies 1 ψ(t) ∞ ≤ ψ(0) 1 . (7.31) |4πt|n/2 Thus we have spreading of wave functions in this case. Moreover, it is even possible to determine the asymptotic form of the wave function for large t as follows. Observe x2 −itH0 ei 4t y2 xy e ψ(x) = ei 4t ψ(y)ei 2t dn y (4πit)n/2 Rn n/2 ∧ 1 ix 2 iy 2 x = e 4t e 4t ψ(y) ( ). (7.32) 2it 2t 2 Moreover, since exp(i y )ψ(y) → ψ(y) in L2 as |t| → ∞ (dominated conver- 4t gence), we obtain Lemma 7.10. For any ψ ∈ L2 (Rn ) we have n/2 1 2 x ˆ x e−itH0 ψ(x) − ei 4t ψ( ) → 0 (7.33) 2it 2t in L2 as |t| → ∞.
  • 182. 170 7. The free Schr¨dinger operator o Note that this result is not too surprising from a physical point of view. In fact, if a classical particle starts at a point x(0) = x0 with velocity v = 2p (recall that we use units where the mass is m = 1 ), then we will find it at 2 x = x0 + 2pt at time t. Dividing by 2t, we get 2t = p + x0 ≈ p for large t. x 2t Hence the probability distribution for finding a particle at a point x at time x t should approach the probability distribution for the momentum at p = 2t ; that is, |ψ(x, t)| 2 dn x = |ψ( x )|2 dn x . This could also be stated as follows: 2t (2t)n The probability of finding the particle in a region Ω ⊆ Rn is asymptotically for |t| → ∞ equal to the probability of finding the momentum of the particle 1 in 2t Ω. Next we want to apply the RAGE theorem in order to show that for any initial condition, a particle will escape to infinity. Lemma 7.11. Let g(x) be the multiplication operator by g and let f (p) be ˆ the operator given by f (p)ψ(x) = F −1 (f (p)ψ(p))(x). Denote by L∞ (Rn ) the ∞ bounded Borel functions which vanish at infinity. Then f (p)g(x) and g(x)f (p) (7.34) are compact if f, g ∈ L∞ (Rn ) and (extend to) Hilbert–Schmidt operators if ∞ f, g ∈ L2 (Rn ). Proof. By symmetry it suffices to consider g(x)f (p). Let f, g ∈ L2 . Then 1 ˇ g(x)f (p)ψ(x) = g(x)f (x − y)ψ(y)dn y (2π)n/2 Rn ˇ shows that g(x)f (p) is Hilbert–Schmidt since g(x)f (x − y) ∈ L2 (Rn × Rn ). If f, g are bounded, then the functions fR (p) = χ{p|p2 ≤R} (p)f (p) and gR (x) = χ{x|x2 ≤R} (x)g(x) are in L2 . Thus gR (x)fR (p) is compact and by g(x)f (p) − gR (x)fR (p) ≤ g ∞ f − fR ∞ + g − gR ∞ fR ∞ it tends to g(x)f (p) in norm since f, g vanish at infinity. In particular, this lemma implies that χΩ (H0 + i)−1 (7.35) is compact if Ω ⊆ Rn is bounded and hence lim χΩ e−itH0 ψ 2 =0 (7.36) t→∞ for any ψ ∈ L2 (Rn ) and any bounded subset Ω of Rn . In other words, the particle will eventually escape to infinity since the probability of finding the particle in any bounded set tends to zero. (If ψ ∈ L1 (Rn ), this of course also follows from (7.31).)
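For a Gaussian initial state the free evolution is known in closed form, which makes it easy to test both an FFT implementation of (7.27) and the dispersive estimate (7.31). A one-dimensional sketch (NumPy; the box, the grid, and the times are illustrative choices):

```python
import numpy as np

# psi(x,0) = pi^{-1/4} exp(-x^2/2) evolves under H0 = -d^2/dx^2 into
# psi(x,t) = pi^{-1/4} (1+2it)^{-1/2} exp(-x^2/(2(1+2it))).
L, N = 200.0, 4096
x = np.linspace(-L / 2, L / 2, N, endpoint=False)
p = 2 * np.pi * np.fft.fftfreq(N, d=L / N)

psi0 = np.pi**-0.25 * np.exp(-x**2 / 2)
l1_norm = np.sum(np.abs(psi0)) * (L / N)             # ||psi(0)||_1

for t in (0.5, 2.0, 8.0):
    psi_t = np.fft.ifft(np.exp(-1j * t * p**2) * np.fft.fft(psi0))   # (7.27) via FFT
    exact = np.pi**-0.25 / np.sqrt(1 + 2j * t) * np.exp(-x**2 / (2 * (1 + 2j * t)))
    sup = np.max(np.abs(psi_t))
    bound = l1_norm / np.sqrt(4 * np.pi * t)          # right-hand side of (7.31)
    print(f"t = {t:4.1f}:  FFT vs exact = {np.max(np.abs(psi_t - exact)):.2e}, "
          f"sup|psi(t)| = {sup:.4f} <= {bound:.4f}")
```

The sup norm decays like t^{-1/2}, saturating (7.31) asymptotically, which is the spreading of wave packets described above.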
  • 183. 7.4. The resolvent and Green’s function 171 7.4. The resolvent and Green’s function Now let us compute the resolvent of H0 . We will try to use an approach similar to that for the time evolution in the previous section. However, since it is highly nontrivial to compute the inverse Fourier transform of exp(−εp2 )(p2 − z)−1 directly, we will use a small ruse. Note that ∞ RH0 (z) = ezt e−tH0 dt, Re(z) 0, (7.37) 0 by Lemma 4.1. Moreover, 1 |x−y|2 e−tH0 ψ(x) = e− 4t ψ(y)dn y, t 0, (7.38) (4πt)n/2 Rn by the same analysis as in the previous section. Hence, by Fubini, we have RH0 (z)ψ(x) = G0 (z, |x − y|)ψ(y)dn y, (7.39) Rn where ∞ 1 r2 G0 (z, r) = e− 4t +zt dt, r 0, Re(z) 0. (7.40) 0 (4πt)n/2 The function G0 (z, r) is called Green’s function of H0 . The integral can be evaluated in terms of modified Bessel functions of the second kind as follows: First of all it suffices to consider z 0 since the remaining values will follow r by analytic continuation. Then, making the substitution t = 2√−z es , we obtain ∞ √ n −1 ∞ 1 2 − r +zt 1 −z 2 e 4t dt = e−νs e−x cosh(s) ds 0 (4πt)n/2 4π 2πr −∞ √ n −1 ∞ 1 −z 2 = cosh(−νs)e−x cosh(s) ds, 2π 2πr 0 (7.41) √ where we have abbreviated x = −zr and ν = n − 1. But the last integral 2 is given by the modified Bessel function Kν (x) (see [1, (9.6.24)]) and thus √ n −1 1 −z 2 √ G0 (z, r) = K n −1 ( −zr). (7.42) 2π 2πr 2 Note Kν (x) = K−ν (x) and Kν (x) 0 for ν, x ∈ R. The functions Kν (x) satisfy the differential equation (see [1, (9.6.1)]) d2 1 d ν2 + −1− 2 Kν (x) = 0 (7.43) dx2 x dx x
and have the asymptotics (see [1, (9.6.8) and (9.6.9)])

    K_\nu(x) = \begin{cases} \frac{\Gamma(\nu)}{2}\bigl(\frac{x}{2}\bigr)^{-\nu} + O(x^{-\nu+2}), & \nu > 0,\\[1ex] -\log(\frac{x}{2}) + O(1), & \nu = 0, \end{cases}   (7.44)

for |x| \to 0 and (see [1, (9.7.2)])

    K_\nu(x) = \sqrt{\frac{\pi}{2x}}\, e^{-x}\bigl(1 + O(x^{-1})\bigr)   (7.45)

for |x| \to \infty. For more information see for example [1] or [59].

In particular, G_0(z, r) has an analytic continuation for z \in \mathbb{C}\setminus[0,\infty) = \rho(H_0). Hence we can define the right-hand side of (7.39) for all z \in \rho(H_0) such that

    \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} \varphi(x)\, G_0(z, |x-y|)\, \psi(y)\, d^n y\, d^n x   (7.46)

is analytic for z \in \rho(H_0) and \varphi, \psi \in \mathcal{S}(\mathbb{R}^n) (by Morera's theorem). Since it is equal to \langle\varphi, R_{H_0}(z)\psi\rangle for Re(z) < 0, it is equal to this function for all z \in \rho(H_0), since both functions are analytic in this domain. In particular, (7.39) holds for all z \in \rho(H_0).

If n is odd, we have the case of spherical Bessel functions which can be expressed in terms of elementary functions. For example, we have

    G_0(z, r) = \frac{1}{2\sqrt{-z}}\, e^{-\sqrt{-z}\, r},   n = 1,   (7.47)

and

    G_0(z, r) = \frac{1}{4\pi r}\, e^{-\sqrt{-z}\, r},   n = 3.   (7.48)

Problem 7.14. Verify (7.39) directly in the case n = 1.
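The Bessel function representation (7.42) is easy to check against the elementary expressions (7.47) and (7.48). A small numerical sketch (not from the text; the test values of z and r are arbitrary) using scipy's modified Bessel function of the second kind:

```python
import numpy as np
from scipy.special import kv

# Numerical sanity check of (7.42) against the elementary expressions (7.47)
# and (7.48).  The point z and the radius r are arbitrary test values.

def G0(z, r, n):
    w = np.sqrt(-z + 0j)                 # branch with Re > 0 for z off [0, infinity)
    nu = n/2 - 1
    return (1/(2*np.pi))*(w/(2*np.pi*r))**nu*kv(nu, w*r)

z, r = -2.0 + 0.5j, 1.3
w = np.sqrt(-z + 0j)

print(G0(z, r, 1), np.exp(-w*r)/(2*w))          # n = 1, cf. (7.47)
print(G0(z, r, 3), np.exp(-w*r)/(4*np.pi*r))    # n = 3, cf. (7.48)
```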
Chapter 8

Algebraic methods

8.1. Position and momentum

Apart from the Hamiltonian H_0, which corresponds to the kinetic energy, there are several other important observables associated with a single particle in three dimensions. Using the commutation relations between these observables, many important consequences about these observables can be derived.

First consider the one-parameter unitary group

    (U_j(t)\psi)(x) = e^{-itx_j}\psi(x),   1 \le j \le 3.   (8.1)

For \psi \in \mathcal{S}(\mathbb{R}^3) we compute

    \lim_{t\to 0} i\,\frac{e^{-itx_j}\psi(x) - \psi(x)}{t} = x_j\psi(x)   (8.2)

and hence the generator is the multiplication operator by the j'th coordinate function. By Corollary 5.3 it is essentially self-adjoint on \psi \in \mathcal{S}(\mathbb{R}^3). It is customary to combine all three operators into one vector-valued operator x, which is known as the position operator. Moreover, it is not hard to see that the spectrum of x_j is purely absolutely continuous and given by \sigma(x_j) = \mathbb{R}. In fact, let \varphi_i(x) be an orthonormal basis for L^2(\mathbb{R}). Then \varphi_i(x_1)\varphi_j(x_2)\varphi_k(x_3) is an orthonormal basis for L^2(\mathbb{R}^3) and x_1 can be written as an orthogonal sum of operators restricted to the subspaces spanned by \varphi_j(x_2)\varphi_k(x_3). Each subspace is unitarily equivalent to L^2(\mathbb{R}) and x_1 is given by multiplication with the identity function. Hence the claim follows (or use Theorem 4.14).

Next, consider the one-parameter unitary group of translations

    (U_j(t)\psi)(x) = \psi(x - te_j),   1 \le j \le 3,   (8.3)
  • 186. 174 8. Algebraic methods where ej is the unit vector in the j’th coordinate direction. For ψ ∈ S(R3 ) we compute ψ(x − tej ) − ψ(x) 1 ∂ lim i = ψ(x) (8.4) t→0 t i ∂xj and hence the generator is pj = 1 ∂xj . Again it is essentially self-adjoint i ∂ on ψ ∈ S(R3 ). Moreover, since it is unitarily equivalent to xj by virtue of the Fourier transform, we conclude that the spectrum of pj is again purely absolutely continuous and given by σ(pj ) = R. The operator p is known as the momentum operator. Note that since [H0 , pj ]ψ(x) = 0, ψ ∈ S(R3 ), (8.5) we have d ψ(t), pj ψ(t) = 0, ψ(t) = e−itH0 ψ(0) ∈ S(R3 ); (8.6) dt that is, the momentum is a conserved quantity for the free motion. More generally we have Theorem 8.1 (Noether). Suppose A is a self-adjoint operator which com- mutes with a self-adjoint operator H. Then D(A) is invariant under e−itH , that is, e−itH D(A) = D(A), and A is a conserved quantity, that is, ψ(t), Aψ(t) = ψ(0), Aψ(0) , ψ(t) = e−itH ψ(0) ∈ D(A). (8.7) Proof. By the second part of Lemma 4.5 (with f (λ) = λ and B = e−itH ) we see D(A) = D(e−itH A) ⊆ D(Ae−itH ) = {ψ|e−itH ψ ∈ D(A)} which implies e−itH D(A) ⊆ D(A), and [e−itH , A]ψ = 0 for ψ ∈ D(A). Similarly one has i[pj , xk ]ψ(x) = δjk ψ(x), ψ ∈ S(R3 ), (8.8) which is known as the Weyl relations. In terms of the corresponding unitary groups they read e−ispj e−itxk = eistδjk e−itxj e−ispk . (8.9) The Weyl relations also imply that the mean-square deviation of position and momentum cannot be made arbitrarily small simultaneously: Theorem 8.2 (Heisenberg Uncertainty Principle). Suppose A and B are two symmetric operators. Then for any ψ ∈ D(AB) ∩ D(BA) we have 1 ∆ψ (A)∆ψ (B) ≥ |Eψ ([A, B])| (8.10) 2 with equality if (B − Eψ (B))ψ = iλ(A − Eψ (A))ψ, λ ∈ R{0}, (8.11) or if ψ is an eigenstate of A or B.
  • 187. 8.2. Angular momentum 175 Proof. Let us fix ψ ∈ D(AB) ∩ D(BA) and abbreviate ˆ A = A − Eψ (A), ˆ B = B − Eψ (B). ˆ ˆ Then ∆ψ (A) = Aψ , ∆ψ (B) = Bψ and hence by Cauchy–Schwarz ˆ ˆ | Aψ, Bψ | ≤ ∆ψ (A)∆ψ (B). Now note that ˆˆ 1 ˆ ˆ 1 ˆ ˆ ˆˆ ˆˆ AB = {A, B} + [A, B], {A, B} = AB + B A 2 2 ˆ ˆ where {A, B} and i[A, B] are symmetric. So ˆ ˆ ˆˆ 1 ˆ ˆ 1 | Aψ, Bψ |2 = | ψ, ABψ |2 = | ψ, {A, B}ψ |2 + | ψ, [A, B]ψ |2 2 2 which proves (8.10). ˆ ˆ To have equality if ψ is not an eigenstate, we need Bψ = z Aψ for ˆ ˆ equality in Cauchy–Schwarz and ψ, {A, B}ψ = 0. Inserting the first into ˆ the second requirement gives 0 = (z − z ∗ ) Aψ 2 and shows Re(z) = 0. In the case of position and momentum we have ( ψ = 1) δjk ∆ψ (pj )∆ψ (xk ) ≥ (8.12) 2 and the minimum is attained for the Gaussian wave packets n/4 λ λ 2 −ip ψ(x) = e− 2 |x−x0 | 0x , (8.13) π λ which satisfy Eψ (x) = x0 and Eψ (p) = p0 , respectively, ∆ψ (pj )2 = 2 and 1 ∆ψ (xk )2 = 2λ . Problem 8.1. Check that (8.13) realizes the minimum. 8.2. Angular momentum Now consider the one-parameter unitary group of rotations (Uj (t)ψ)(x) = ψ(Mj (t)x), 1 ≤ j ≤ 3, (8.14) where Mj (t) is the matrix of rotation around ej by an angle of t. For ψ ∈ S(R3 ) we compute 3 ψ(Mi (t)x) − ψ(x) lim i = εijk xj pk ψ(x), (8.15) t→0 t j,k=1 where   1 if ijk is an even permutation of 123, εijk = −1 if ijk is an odd permutation of 123, (8.16) 0 otherwise. 
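Before turning to the angular momentum, here is a short grid-based check (not part of the text) of the uncertainty relation (8.12) and of Problem 8.1: for the Gaussian (8.13) the product of the deviations equals 1/2, for other states it is strictly larger. The sign convention e^{+ip_0 x} together with p = -i d/dx is assumed below, which gives E_\psi(p) = p_0; the sign in the exponent does not affect the deviations.

```python
import numpy as np

# Grid check of the uncertainty relation (8.12) and of Problem 8.1 in one
# dimension.  The convention exp(+i p0 x) with p = -i d/dx is used here.

L, N = 60.0, 4096
x = np.linspace(-L/2, L/2, N, endpoint=False); dx = x[1] - x[0]
p = 2*np.pi*np.fft.fftfreq(N, d=dx)

def deviations(psi):
    psi = psi/np.sqrt(np.sum(np.abs(psi)**2)*dx)      # normalize
    Ex = np.sum(x*np.abs(psi)**2)*dx
    Dx = np.sqrt(np.sum((x - Ex)**2*np.abs(psi)**2)*dx)
    p_psi = np.fft.ifft(p*np.fft.fft(psi))            # p psi = -i psi'
    Ep = np.real(np.sum(np.conj(psi)*p_psi)*dx)
    Dp = np.sqrt(np.sum(np.abs(p_psi)**2)*dx - Ep**2)
    return Dx, Dp

lam, x0, p0 = 2.0, 1.0, 3.0
gauss = (lam/np.pi)**0.25*np.exp(-lam*(x - x0)**2/2 + 1j*p0*x)   # (8.13)
other = x*np.exp(-x**2/2)                                        # some non-Gaussian state

for psi in (gauss, other):
    Dx, Dp = deviations(psi)
    print(Dx*Dp)        # 0.5 for the Gaussian, > 0.5 otherwise
```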
  • 188. 176 8. Algebraic methods Again one combines the three components into one vector-valued operator L = x ∧ p, which is known as the angular momentum operator. Since ei2πLj = I, we see that the spectrum is a subset of Z. In particular, the continuous spectrum is empty. We will show below that we have σ(Lj ) = Z. Note that since [H0 , Lj ]ψ(x) = 0, ψ ∈ S(R3 ), (8.17) we again have d ψ(t), Lj ψ(t) = 0, ψ(t) = e−itH0 ψ(0) ∈ S(R3 ); (8.18) dt that is, the angular momentum is a conserved quantity for the free motion as well. Moreover, we even have 3 [Li , Kj ]ψ(x) = i εijk Kk ψ(x), ψ ∈ S(R3 ), Kj ∈ {Lj , pj , xj }, (8.19) k=1 and these algebraic commutation relations are often used to derive informa- tion on the point spectra of these operators. In this respect the domain x2 D = span{xα e− 2 | α ∈ Nn } ⊂ S(Rn ) 0 (8.20) is often used. It has the nice property that the finite dimensional subspaces x2 Dk = span{xα e− 2 | |α| ≤ k} (8.21) are invariant under Lj (and hence they reduce Lj ). Lemma 8.3. The subspace D ⊂ L2 (Rn ) defined in (8.20) is dense. Proof. By Lemma 1.10 it suffices to consider the case n = 1. Suppose ϕ, ψ = 0 for every ψ ∈ D. Then k 1 −x 2 (itx)j √ ϕ(x)e 2 dx = 0 2π j! j=1 for any finite k and hence also in the limit k → ∞ by the dominated conver- x2 gence theorem. But the limit is the Fourier transform of ϕ(x)e− 2 , which shows that this function is zero. Hence ϕ(x) = 0. Since D is invariant under the unitary groups generated by Lj , the op- erators Lj are essentially self-adjoint on D by Corollary 5.3. Introducing L2 = L2 + L2 + L2 , it is straightforward to check 1 2 3 [L2 , Lj ]ψ(x) = 0, ψ ∈ S(R3 ). (8.22) Moreover, Dk is invariant under L2 and L3 and hence Dk reduces L2 and L3 . In particular, L2 and L3 are given by finite matrices on Dk . Now
  • 189. 8.2. Angular momentum 177 let Hm = Ker(L3 − m) and denote by Pk the projector onto Dk . Since L2 and L3 commute on Dk , the space Pk Hm is invariant under L2 , which shows that we can choose an orthonormal basis consisting of eigenfunctions of L2 for Pk Hm . Increasing k, we get an orthonormal set of simultaneous eigenfunctions whose span is equal to D. Hence there is an orthonormal basis of simultaneous eigenfunctions of L2 and L3 . Now let us try to draw some further consequences by using the commuta- tion relations (8.19). (All commutation relations below hold for ψ ∈ S(R3 ).) Denote by Hl,m the set of all functions in D satisfying L3 ψ = mψ, L2 ψ = l(l + 1)ψ. (8.23) By L2 ≥ 0 and σ(L3 ) ⊆ Z we can restrict our attention to the case l ≥ 0 and m ∈ Z. First introduce two new operators L± = L1 ± iL2 , [L3 , L± ] = ±L± . (8.24) Then, for every ψ ∈ Hl,m we have L3 (L± ψ) = (m ± 1)(L± ψ), L2 (L± ψ) = l(l + 1)(L± ψ); (8.25) that is, L± Hl,m → Hl,m±1 . Moreover, since L2 = L2 ± L3 + L L± , 3 (8.26) we obtain 2 L± ψ = ψ, L L± ψ = (l(l + 1) − m(m ± 1)) ψ (8.27) for every ψ ∈ Hl,m . If ψ = 0, we must have l(l + 1) − m(m ± 1) ≥ 0, which shows Hl,m = {0} for |m| l. Moreover, L± Hl,m → Hl,m±1 is injective unless |m| = l. Hence we must have Hl,m = {0} for l ∈ N0 . Up to this point we know σ(L2 ) ⊆ {l(l + 1)|l ∈ N0 }, σ(L3 ) ⊆ Z. In order to show that equality holds in both cases, we need to show that Hl,m = {0} for l ∈ N0 , m = −l, −l + 1, . . . , l − 1, l. First of all we observe 1 x2 ψ0,0 (x) = 3/4 e− 2 ∈ H0,0 . (8.28) π Next, we note that (8.19) implies [L3 , x± ] = ±x± , x± = x1 ± ix2 , [L± , x± ] = 0, [L± , x ] = ±2x3 , [L2 , x± ] = 2x± (1 ± L3 ) 2x3 L± . (8.29) Hence if ψ ∈ Hl,l , then (x1 ± ix2 )ψ ∈ Hl±1,l±1 . Thus 1 ψl,l (x) = √ (x1 ± ix2 )l ψ0,0 (x) ∈ Hl,l , (8.30) l!
  • 190. 178 8. Algebraic methods respectively, (l + m)! ψl,m (x) = Ll−m ψl,l (x) ∈ Hl,m . (8.31) (l − m)!(2l)! − The constants are chosen such that ψl,m = 1. In summary, Theorem 8.4. There exists an orthonormal basis of simultaneous eigenvec- tors for the operators L2 and Lj . Moreover, their spectra are given by σ(L2 ) = {l(l + 1)|l ∈ N0 }, σ(L3 ) = Z. (8.32) We will give an alternate derivation of this result in Section 10.3. 8.3. The harmonic oscillator Finally, let us consider another important model whose algebraic structure is similar to those of the angular momentum, the harmonic oscillator H = H0 + ω 2 x2 , ω 0. (8.33) We will choose as domain x2 D(H) = D = span{xα e− 2 | α ∈ N3 } ⊆ L2 (R3 ) 0 (8.34) from our previous section. We will first consider the one-dimensional case. Introducing 1 √ 1 d A± = √ ωx √ , D(A± ) = D, (8.35) 2 ω dx we have [A− , A+ ] = 1 (8.36) and H = ω(2N + 1), N = A+ A− , D(N ) = D, (8.37) for any function in D. In particular, note that D is invariant under A± . Moreover, since [N, A± ] = ±A± , (8.38) we see that N ψ = nψ implies N A± ψ = (n ± 1)A± ψ. Moreover, A+ ψ 2 = ψ, A− A+ ψ = (n + 1) ψ 2 , respectively, A− ψ 2 = n ψ 2 , in this case and hence we conclude that σp (N ) ⊆ N0 . If N ψ0 = 0, then we must have A− ψ = 0 and the normalized solution of this last equation is given by ω 1/4 ωx2 ψ0 (x) = e− 2 ∈ D. (8.39) π
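The commutation relations (8.19) and the construction (8.30) lend themselves to a symbolic check. The following sympy sketch (not part of the text; normalizations are omitted) verifies [L_1, L_2] = iL_3 on a generic smooth function and the eigenvalue equations L_3\psi = \psi, L^2\psi = 2\psi for \psi = (x_1 + ix_2)e^{-x^2/2}, that is, for \psi_{1,1} up to a constant.

```python
import sympy as sp

# Symbolic check of [L1, L2] = i L3 (a special case of (8.19)) and of the
# eigenvalue equations for psi = (x1 + i x2) exp(-x^2/2), cf. (8.30).

x1, x2, x3 = sp.symbols('x1 x2 x3', real=True)
X = (x1, x2, x3)

def p(j, f):                              # momentum operator p_j = -i d/dx_j
    return -sp.I*sp.diff(f, X[j])

def L(j, f):                              # angular momentum L_j, L = x ^ p
    k, l = (j + 1) % 3, (j + 2) % 3
    return X[k]*p(l, f) - X[l]*p(k, f)

f = sp.Function('f')(x1, x2, x3)          # generic smooth function
print(sp.simplify(sp.expand(L(0, L(1, f)) - L(1, L(0, f)) - sp.I*L(2, f))))  # 0

psi = (x1 + sp.I*x2)*sp.exp(-(x1**2 + x2**2 + x3**2)/2)
L2psi = sum(L(j, L(j, psi)) for j in range(3))
print(sp.simplify(L(2, psi) - psi))       # L3 psi = 1 * psi
print(sp.simplify(L2psi - 2*psi))         # L^2 psi = l(l+1) psi with l = 1
```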
  • 191. 8.4. Abstract commutation 179 Hence 1 ψn (x) = √ An ψ0 (x) + (8.40) n! is a normalized eigenfunction of N corresponding to the eigenvalue n. More- over, since 1 ω 1/4 √ ωx2 ψn (x) = √ Hn ( ωx)e− 2 (8.41) 2n n! π where Hn (x) is a polynomial of degree n given by n x2 d x2 2 dn −x2 Hn (x) = e 2 x− e− 2 = (−1)n ex e , (8.42) dx dxn we conclude span{ψn } = D. The polynomials Hn (x) are called Hermite polynomials. In summary, Theorem 8.5. The harmonic oscillator H is essentially self-adjoint on D and has an orthonormal basis of eigenfunctions ψn1 ,n2 ,n3 (x) = ψn1 (x1 )ψn2 (x2 )ψn3 (x3 ), (8.43) with ψnj (xj ) from (8.41). The spectrum is given by σ(H) = {(2n + 3)ω|n ∈ N0 }. (8.44) Finally, there is also a close connection with the Fourier transformation. Without restriction we choose ω = 1 and consider only one-dimension. Then it easy to verify that H commutes with the Fourier transformation, FH = HF, (8.45) on D. Moreover, by FA± = iA± F we even infer 1 (−i)n Fψn = √ FAn ψ0 = √ An Fψ0 = (−i)n ψn , + + (8.46) n! n! since Fψ0 = ψ0 by Lemma 7.3. In particular, σ(F) = {z ∈ C|z 4 = 1}. (8.47) 8.4. Abstract commutation The considerations of the previous section can be generalized as follows. First of all, the starting point was a factorization of H according to H = A∗ A (note that A± from the previous section are adjoint to each other when restricted to D). Then it turned out that commuting both operators just corresponds to a shift of H; that is, AA∗ = H + c. Hence one could exploit the close spectral relation of A∗ A and AA∗ to compute both the eigenvalues and eigenvectors.
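The eigenfunctions (8.41) can also be checked numerically. The sketch below (not part of the text; grid parameters are ad hoc) verifies orthonormality and the eigenvalue equation H\psi_n = (2n+1)\omega\psi_n from (8.37) in one dimension, using scipy's (physicists') Hermite polynomials, which agree with (8.42).

```python
import numpy as np
from scipy.special import eval_hermite, factorial

# Orthonormality of the eigenfunctions (8.41) and the eigenvalue equation
# H psi_n = (2n+1) omega psi_n for H = -d^2/dx^2 + omega^2 x^2 in one dimension.

w = 1.5
x = np.linspace(-12, 12, 4001); dx = x[1] - x[0]

def psi(n):
    return (eval_hermite(n, np.sqrt(w)*x)*np.exp(-w*x**2/2)
            *(w/np.pi)**0.25/np.sqrt(2.0**n*factorial(n)))

# orthonormality <psi_m, psi_n> = delta_{mn}
print([[round(np.sum(psi(m)*psi(n))*dx, 6) for n in range(4)] for m in range(4)])

# eigenvalue equation, second derivative by central differences
n = 3
pn = psi(n)
Hpn = -(np.roll(pn, -1) - 2*pn + np.roll(pn, 1))/dx**2 + w**2*x**2*pn
print(np.max(np.abs(Hpn - (2*n + 1)*w*pn)[5:-5]))     # small residual
```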
  • 192. 180 8. Algebraic methods More generally, let A be a closed operator and recall that H0 = A∗ A is a self-adjoint operator (cf. Problem 2.12) with Ker(H0 ) = Ker(A). Similarly, H1 = AA∗ is a self-adjoint operator with Ker(H1 ) = Ker(A∗ ). Theorem 8.6. Let A be a closed operator. The operators H0 = A∗ A Ker(A)⊥ and H1 = AA∗ Ker(A∗ )⊥ are unitarily equivalent. If H0 ψ0 = Eψ0 , ψ0 ∈ D(H0 ), then ψ1 = Aψ0 ∈ D(H1 ) with H1 ψ1 = ψ1 √ and ψ1 = E ψ0 . Moreover, 1 1 RH1 (z) ⊇ (ARH0 (z)A∗ − 1) , RH0 (z) ⊇ (A∗ RH1 (z)A − 1) . (8.48) z z 1/2 Proof. Introducing |A| = H0 , we have the polar decomposition (Prob- lem 3.11) A = U |A|, where U : Ker(A)⊥ → Ker(A∗ )⊥ is unitary. Taking adjoints, we have (Problem 2.3) A∗ = |A|U ∗ and thus H1 = AA∗ = U |A||A|U ∗ = U H0 U ∗ shows the claimed unitary equivalence. The√claims about the eigenvalues are straightforward (for the norm note Aψ0 = EU ψ0 ). To see the connection between the resolvents, abbreviate P1 = PH1 ({0}). Then 1 1 RH1 (z) = RH1 (z)(1 − P1 ) + P1 = U RH0 U ∗ + P1 z z 1 ⊇ U (|H0 |1/2 RH0 |H0 |1/2 − 1)U ∗ + P1 z 1 1 = (ARH0 A∗ + (1 − P1 ) + P1 ) = (ARH0 A∗ + 1) , z z where we have used U U ∗ = 1 − P1 . We will use this result to compute the eigenvalues and eigenfunctions of the hydrogen atom in Section 10.4. In the physics literature this approach is also known as supersymmetric quantum mechanics. d 2 Problem 8.2. Show that H0 = − dx2 + q can formally (i.e., ignoring do- mains) be written as H0 = AA ∗ , where A = − d + φ, if the differential dx equation ψ + qψ = 0 has a positive solution. Compute H1 = A∗ A. (Hint: φ = ψ .) ψ d 2 Problem 8.3. Take H0 = − dx2 + λ, λ 0, and compute H1 . What about domains?
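Theorem 8.6 is easy to see in action numerically. In the sketch below the operator A = d/dx + x is an arbitrary choice made only for this illustration; then, formally, A*A = -d^2/dx^2 + x^2 - 1 and AA* = -d^2/dx^2 + x^2 + 1, and a finite difference discretization shows that the two spectra coincide away from 0.

```python
import numpy as np

# Illustration of Theorem 8.6: for A = d/dx + x (an assumption made only for
# this example) one has A*A = -d^2/dx^2 + x^2 - 1 and AA* = -d^2/dx^2 + x^2 + 1,
# with spectra {0, 2, 4, ...} and {2, 4, 6, ...} respectively.

N, L = 1500, 24.0
x = np.linspace(-L/2, L/2, N); dx = x[1] - x[0]
D2 = (np.diag(np.ones(N-1), -1) - 2*np.eye(N) + np.diag(np.ones(N-1), 1))/dx**2

H0 = -D2 + np.diag(x**2 - 1)        # A*A
H1 = -D2 + np.diag(x**2 + 1)        # AA*

print(np.sort(np.linalg.eigvalsh(H0))[:6])   # approx 0, 2, 4, 6, 8, 10
print(np.sort(np.linalg.eigvalsh(H1))[:5])   # approx 2, 4, 6, 8, 10
```

The zero eigenvalue of A*A has no counterpart for AA*, in line with the fact that Theorem 8.6 only identifies the two operators away from their kernels.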
  • 193. Chapter 9 One-dimensional Schr¨dinger operators o 9.1. Sturm–Liouville operators In this section we want to illustrate some of the results obtained thus far by investigating a specific example, the Sturm–Liouville equation 1 d d τ f (x) = − p(x) f (x) + q(x)f (x) , f, pf ∈ ACloc (I). (9.1) r(x) dx dx The case p = r = 1 can be viewed as the model of a particle in one- dimension in the external potential q. Moreover, the case of a particle in three dimensions can in some situations be reduced to the investigation of Sturm–Liouville equations. In particular, we will see how this works when explicitly solving the hydrogen atom. The suitable Hilbert space is b L2 ((a, b), r(x)dx), f, g = f (x)∗ g(x)r(x)dx, (9.2) a where I = (a, b) ⊆ R is an arbitrary open interval. We require (i) p−1 ∈ L1 (I), positive, loc (ii) q ∈ L1 (I), real-valued, loc (iii) r ∈ L1 (I), positive. loc 181
  • 194. 182 9. One-dimensional Schr¨dinger operators o If a is finite and if p−1 , q, r ∈ L1 ((a, c)) (c ∈ I), then the Sturm–Liouville equation (9.1) is called regular at a. Similarly for b. If it is regular at both a and b, it is called regular. The maximal domain of definition for τ in L2 (I, r dx) is given by D(τ ) = {f ∈ L2 (I, r dx)|f, pf ∈ ACloc (I), τ f ∈ L2 (I, r dx)}. (9.3) It is not clear that D(τ ) is dense unless (e.g.) p ∈ ACloc (I), p , q ∈ L2 (I), loc r−1 ∈ L∞ (I) since C0 (I) ⊂ D(τ ) in this case. We will defer the general loc ∞ case to Lemma 9.4 below. Since we are interested in self-adjoint operators H associated with (9.1), we perform a little calculation. Using integration by parts (twice), we obtain the Lagrange identity (a c d b) d d g ∗ (τ f ) rdy = Wd (g ∗ , f ) − Wc (g ∗ , f ) + (τ g)∗ f rdy, (9.4) c c for f, g, pf , pg ∈ ACloc (I), where Wx (f1 , f2 ) = p(f1 f2 − f1 f2 ) (x) (9.5) is called the modified Wronskian. Equation (9.4) also shows that the Wronskian of two solutions of τ u = zu is constant Wx (u1 , u2 ) = W (u1 , u2 ), τ u1,2 = zu1,2 . (9.6) Moreover, it is nonzero if and only if u1 and u2 are linearly independent (compare Theorem 9.1 below). If we choose f, g ∈ D(τ ) in (9.4), then we can take the limits c → a and d → b, which results in g, τ f = Wb (g ∗ , f ) − Wa (g ∗ , f ) + τ g, f , f, g ∈ D(τ ). (9.7) Here Wa,b (g ∗ , f ) has to be understood as a limit. Finally, we recall the following well-known result from ordinary differ- ential equations. Theorem 9.1. Suppose rg ∈ L1 (I). Then there exists a unique solution loc f, pf ∈ ACloc (I) of the differential equation (τ − z)f = g, z ∈ C, (9.8) satisfying the initial condition f (c) = α, (pf )(c) = β, α, β ∈ C, c ∈ I. (9.9) In addition, f is entire with respect to z.
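As a numerical companion to Theorem 9.1 (not part of the text; the coefficients p, q, r below are sample choices), one can solve (τ − z)u = 0 as a first order system for the pair (f, pf') and confirm that the modified Wronskian (9.5) of two solutions is constant, cf. (9.6).

```python
import numpy as np
from scipy.integrate import solve_ivp

# Solve (tau - z) u = 0 as the first order system (f, pf')' = (w/p, (q - z r) f)
# for two sets of initial data (9.9) and check that the modified Wronskian (9.5)
# of the two solutions is constant, cf. (9.6).  Sample coefficients only.

p = lambda t: 1 + t**2
q = lambda t: np.cos(t)
r = lambda t: 1.0
z = 2.0 + 0.5j

def rhs(t, y):
    f, w = y                      # y = (f, p f')
    return [w/p(t), (q(t) - z*r(t))*f]

a, b = 0.0, 5.0
ts = np.linspace(a, b, 200)
sol1 = solve_ivp(rhs, (a, b), np.array([1, 0], dtype=complex), t_eval=ts,
                 rtol=1e-10, atol=1e-12).y
sol2 = solve_ivp(rhs, (a, b), np.array([0, 1], dtype=complex), t_eval=ts,
                 rtol=1e-10, atol=1e-12).y

W = sol1[0]*sol2[1] - sol1[1]*sol2[0]     # W_x(u1, u2) = p (u1 u2' - u1' u2)
print(np.max(np.abs(W - W[0])))           # essentially zero
```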
  • 195. 9.1. Sturm–Liouville operators 183 Proof. Introducing f 0 u= , v= , pf rg we can rewrite (9.8) as the linear first order system 0 p−1 (x) u − Au = v, A(x) = . q(x) − z r(x) 0 Integrating with respect to x, we see that this system is equivalent to the Volterra integral equation x x α u − Ku = w, (Ku)(x) = A(y)u(y)dy, w(x) = + v(y)dy. c β c We will choose some d ∈ (c, b) and consider the integral operator K in the Banach space C([c, d]). Then for any h ∈ C([c, d]) and x ∈ [c, d] we have the estimate x a1 (x)n |K n (h)(x)| ≤ h , a1 (x) = a(y)dy, a(x) = A(x) , n! c which follows from induction x x |K n+1 (h)(x)| = A(y)K n (h)(y)dy ≤ a(y)|K n (h)(y)|dy c c x a1 (y)n a1 (x)n+1 ≤ h a(y) dy = h . c n! (n + 1)! Hence the unique solution of our integral equation is given by the Neumann series (show this) ∞ u(x) = K n (w)(x). n=0 To see that the solution u(x) is entire with respect to z, note that the partial sums are entire (in fact polynomial) in z and hence so is the limit by uniform convergence with respect to z in compact sets. An analogous argument for d ∈ (a, c) finishes the proof. Note that f, pf can be extended continuously to a regular endpoint. Lemma 9.2. Suppose u1 , u2 are two solutions of (τ − z)u = 0 which satisfy W (u1 , u2 ) = 1. Then any other solution of (9.8) can be written as (α, β ∈ C) x x f (x) = u1 (x) α + u2 g rdy + u2 (x) β − u1 g rdy , c c x x f (x) = u1 (x) α + u2 g rdy + u2 (x) β − u1 g rdy . (9.10) c c Note that the constants α, β coincide with those from Theorem 9.1 if u1 (c) = (pu2 )(c) = 1 and (pu1 )(c) = u2 (c) = 0.
  • 196. 184 9. One-dimensional Schr¨dinger operators o Proof. It suffices to check τ f − z f = g. Differentiating the first equation of (9.10) gives the second. Next we compute (pf ) = (pu1 ) α + u2 g rdy + (pu2 ) β − u1 g rdy − W (u1 , u2 )gr = (q − zr)u1 α + u2 grdy + (q − zr)u2 β − u1 gdy − gr = (q − zr)f − gr which proves the claim. Now we want to obtain a symmetric operator and hence we choose A0 f = τ f, D(A0 ) = D(τ ) ∩ ACc (I), (9.11) where ACc (I) denotes the functions in AC(I) with compact support. This definition clearly ensures that the Wronskian of two such functions vanishes on the boundary, implying that A0 is symmetric by virtue of (9.7). Our first task is to compute the closure of A0 and its adjoint. For this the following elementary fact will be needed. Lemma 9.3. Suppose V is a vector space and l, l1 , . . . , ln are linear func- tionals (defined on all of V ) such that n Ker(lj ) ⊆ Ker(l). Then l = j=1 n j=0 αj lj for some constants αj ∈ C. Proof. First of all it is no restriction to assume that the functionals lj are linearly independent. Then the map L : V → Cn , f → (l1 (f ), . . . , ln (f )) is surjective (since x ∈ Ran(L)⊥ implies n xj lj (f ) = 0 for all f ). Hence j=1 there are vectors fk ∈ V such that lj (fk ) = 0 for j = k and lj (fj ) = 1. Then f − n lj (f )fj ∈ n Ker(lj ) and hence l(f ) − n lj (f )l(fj ) = 0. Thus j=1 j=1 j=1 we can choose αj = l(fj ). Now we are ready to prove Lemma 9.4. The operator A0 is densely defined and its closure is given by A0 f = τ f, D(A0 ) = {f ∈ D(τ ) | Wa (f, g) = Wb (f, g) = 0, ∀g ∈ D(τ )}. (9.12) Its adjoint is given by A∗ f = τ f, 0 D(A∗ ) = D(τ ). 0 (9.13) Proof. We start by computing A∗ and ignore the fact that we do not know 0 whether D(A0 ) is dense for now. By (9.7) we have D(τ ) ⊆ D(A∗ ) and it remains to show D(A∗ ) ⊆ D(τ ). 0 0 If h ∈ D(A∗ ), we must have 0 h, A0 f = k, f , ∀f ∈ D(A0 ),
  • 197. 9.1. Sturm–Liouville operators 185 ˜ ˜ for some k ∈ L2 (I, r dx). Using (9.10), we can find a h such that τ h = k and from integration by parts we obtain b (h(x) − h(x))∗ (τ f )(x)r(x)dx = 0, ˜ ∀f ∈ D(A0 ). (9.14) a ˜ Clearly we expect that h − h will be a solution of τ u = 0 and to prove this, we will invoke Lemma 9.3. Therefore we consider the linear functionals b b l(g) = (h(x) − h(x))∗ g(x)r(x)dx, ˜ lj (g) = uj (x)∗ g(x)r(x)dx, a a on L2 (I, r dx), c where uj are two solutions of τ u = 0 with W (u1 , u2 ) = 0. Then we have Ker(l1 ) ∩ Ker(l2 ) ⊆ Ker(l). In fact, if g ∈ Ker(l1 ) ∩ Ker(l2 ), then x b f (x) = u1 (x) u2 (y)g(y)r(y)dy + u2 (x) u1 (y)g(y)r(y)dy a x is in D(A0 ) and g = τ f ∈ Ker(l) by (9.14). Now Lemma 9.3 implies b (h(x) − h(x) + α1 u1 (x) + α2 u2 (x))∗ g(x)r(x)dx = 0, ˜ ∀g ∈ L2 (I, rdx) c a ˜ and hence h = h + α1 u1 + α2 u2 ∈ D(τ ). Now what if D(A0 ) were not dense? Then there would be some freedom in the choice of k since we could always add a component in D(A0 )⊥ . So suppose we have two choices k1 = k2 . Then by the above calculation, there ˜ ˜ ˜ are corresponding functions h1 and h2 such that h = h1 + α1,1 u1 + α1,2 u2 = ˜ 2 + α2,1 u1 + α2,2 u2 . In particular, h1 − h2 is in the kernel of τ and hence h ˜ ˜ ˜ ˜ k1 = τ h1 = τ h2 = k2 , a contradiction to our assumption. Next we turn to A0 . Denote the set on the right-hand side of (9.12) by D. Then we have D ⊆ D(A∗∗ ) = A0 by (9.7). Conversely, since A0 ⊆ A∗ , 0 0 we can use (9.7) to conclude Wa (f, h) + Wb (f, h) = 0, f ∈ D(A0 ), h ∈ D(A∗ ). 0 ˜ Now replace h by a h ∈ D(A∗ ) which coincides with h near a and vanishes 0 ˜ ˜ identically near b (Problem 9.1). Then Wa (f, h) = Wa (f, h) + Wb (f, h) = 0. Finally, Wb (f, h) = −Wa (f, h) = 0 shows f ∈ D. Example. If τ is regular at a, then Wa (f, g) = 0 for all g ∈ D(τ ) if and only if f (a) = (pf )(a) = 0. This follows since we can prescribe the values of g(a), (pg )(a) for g ∈ D(τ ) arbitrarily. This result shows that any self-adjoint extension of A0 must lie between A0 and A∗ . Moreover, self-adjointness seems to be related to the Wronskian 0 of two functions at the boundary. Hence we collect a few properties first.
  • 198. 186 9. One-dimensional Schr¨dinger operators o Lemma 9.5. Suppose v ∈ D(τ ) with Wa (v ∗ , v) = 0 and suppose there is a ˆ ˆ f ∈ D(τ ) with Wa (v ∗ , f ) = 0. Then, for f, g ∈ D(τ ), we have Wa (v, f ) = 0 ⇔ Wa (v, f ∗ ) = 0 (9.15) and Wa (v, f ) = Wa (v, g) = 0 ⇒ Wa (g ∗ , f ) = 0. (9.16) Proof. For all f1 , . . . , f4 ∈ D(τ ) we have the Pl¨ cker identity u Wx (f1 , f2 )Wx (f3 , f4 ) + Wx (f1 , f3 )Wx (f4 , f2 ) + Wx (f1 , f4 )Wx (f2 , f3 ) = 0 (9.17) which remains valid in the limit x → a. Choosing f1 = v, f2 = f, f3 = ˆ v ∗ , f4 = f , we infer (9.15). Choosing f1 = f, f2 = g ∗ , f3 = v, f4 = f , we ˆ infer (9.16). Problem 9.1. Given α, β, γ, δ, show that there is a function f in D(τ ) restricted to [c, d] ⊆ (a, b) such that f (c) = α, (pf )(c) = β and f (d) = γ, (pf )(c) = δ. (Hint: Lemma 9.2.) d 2 Problem 9.2. Let A0 = − dx2 , D(A0 ) = {f ∈ H 2 [0, 1]|f (0) = f (1) = 0} and B = q, D(B) = {f ∈ L2 (0, 1)|qf ∈ L2 (0, 1)}. Find a q ∈ L1 (0, 1) such that D(A0 ) ∩ D(B) = {0}. (Hint: Problem 0.30.) Problem 9.3. Let φ ∈ L1 (I). Define loc d A± = ± + φ, D(A± ) = {f ∈ L2 (I)|f ∈ AC(I), ±f + φf ∈ L2 (I)} dx and A0,± = A± |ACc (I) . Show A∗ = A and 0,± D(A0,± ) = {f ∈ D(A± )| lim f (x)g(x) = 0, ∀g ∈ D(A )}. x→a,b In particular, show that the limits above exist. Problem 9.4 (Liouville normal form). Show that every Sturm–Liouville equation can be transformed into one with r = p = 1 as follows: Show that b r(t) the transformation U : L2 ((a, b), r dx) → L2 (0, c), c = a p(t) dt, defined via u(x) → v(y), where x r(t) 4 y(x) = dt, v(y) = r(x(y))p(x(y)) u(x(y)), a p(t) is unitary. Moreover, if p, r, p , r ∈ AC(a, b), then −(pu ) + qu = rλu transforms into −v + Qv = λv,
  • 199. 9.2. Weyl’s limit circle, limit point alternative 187 where (pr)1/4 Q=q− p((pr)−1/4 ) . r 9.2. Weyl’s limit circle, limit point alternative Inspired by Lemma 9.5, we make the following definition: We call τ limit circle (l.c.) at a if there is a v ∈ D(τ ) with Wa (v ∗ , v) = 0 such that Wa (v, f ) = 0 for at least one f ∈ D(τ ). Otherwise τ is called limit point (l.p.) at a and similarly for b. Example. If τ is regular at a, it is limit circle at a. Since Wa (v, f ) = (pf )(a)v(a) − (pv )(a)f (a), (9.18) any real-valued v with (v(a), (pv )(a)) = (0, 0) works. Note that if Wa (f, v) = 0, then Wa (f, Re(v)) = 0 or Wa (f, Im(v)) = 0. Hence it is no restriction to assume that v is real and Wa (v ∗ , v) = 0 is trivially satisfied in this case. In particular, τ is limit point if and only if Wa (f, g) = 0 for all f, g ∈ D(τ ). Theorem 9.6. If τ is l.c. at a, then let v ∈ D(τ ) with Wa (v ∗ , v) = 0 and Wa (v, f ) = 0 for some f ∈ D(τ ). Similarly, if τ is l.c. at b, let w be an analogous function. Then the operator A : D(A) → L2 (I, r dx) (9.19) f → τf with D(A) = {f ∈ D(τ )| Wa (v, f ) = 0 if l.c. at a (9.20) Wb (w, f ) = 0 if l.c. at b} is self-adjoint. Moreover, the set D1 = {f ∈ D(τ )| ∃x0 ∈ I : ∀x ∈ (a, x0 ), Wx (v, f ) = 0, (9.21) ∃x1 ∈ I : ∀x ∈ (x1 , b), Wx (w, f ) = 0} is a core for A. Proof. By Lemma 9.5, A is symmetric and hence A ⊆ A∗ ⊆ A∗ . Let g ∈0 D(A∗ ). As in the computation of A0 we conclude Wa (f, g) = Wb (f, g) = 0 for all f ∈ D(A). Moreover, we can choose f such that it coincides with v near a and hence Wa (v, g) = 0. Similarly Wb (w, g) = 0; that is, g ∈ D(A). To see that D1 is a core, let A1 be the corresponding operator and observe that the argument from above, with A1 in place of A, shows A∗ = A.1 The name limit circle, respectively, limit point, stems from the original approach of Weyl, who considered the set of solutions τ u = zu, z ∈ CR, which satisfy Wx (u∗ , u) = 0. They can be shown to lie on a circle which
  • 200. 188 9. One-dimensional Schr¨dinger operators o converges to a circle, respectively, a point, as x → a or x → b (see Prob- lem 9.9). Before proceeding, let us shed some light on the number of possible boundary conditions. Suppose τ is l.c. at a and let u1 , u2 be two solutions of τ u = 0 with W (u1 , u2 ) = 1. Abbreviate j BCx (f ) = Wx (uj , f ), f ∈ D(τ ). (9.22) Let v be as in Theorem 9.6. Then, using Lemma 9.5, it is not hard to see that 1 2 Wa (v, f ) = 0 ⇔ cos(α)BCa (f ) − sin(α)BCa (f ) = 0, (9.23) 1 BCa (v) where tan(α) = − 2. Hence all possible boundary conditions can be BCa (v) parametrized by α ∈ [0, π). If τ is regular at a and if we choose u1 (a) = (pu2 )(a) = 1 and (pu1 )(a) = u2 (a) = 0, then 1 2 BCa (f ) = f (a), BCa (f ) = (pf )(a) (9.24) and the boundary condition takes the simple form sin(α)(pf )(a) − cos(α)f (a) = 0. (9.25) The most common choice of α = 0 is known as the Dirichlet boundary condition f (a) = 0. The choice α = π/2 is known as the Neumann boundary condition (pf )(a) = 0. Finally, note that if τ is l.c. at both a and b, then Theorem 9.6 does not give all possible self-adjoint extensions. For example, one could also choose BCa (f ) = eiα BCb (f ), 1 1 BCa (f ) = eiα BCb (f ). 2 2 (9.26) The case α = 0 gives rise to periodic boundary conditions in the regular case. Next we want to compute the resolvent of A. Lemma 9.7. Suppose z ∈ ρ(A). Then there exists a solution ua (z, x) of (τ − z)u = g which is in L2 ((a, c), r dx) and which satisfies the boundary condition at a if τ is l.c. at a. Similarly, there exists a solution ub (z, x) with the analogous properties near b. The resolvent of A is given by b (A − z)−1 g(x) = G(z, x, y)g(y)r(y)dy, (9.27) a where 1 ub (z, x)ua (z, y), x ≥ y, G(z, x, y) = (9.28) W (ub (z), ua (z)) ua (z, x)ub (z, y), x ≤ y.
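The formula (9.28) can be tested in the simplest regular example. The sketch below (not part of the text) takes τ = -d^2/dx^2 on (0, 1) with Dirichlet conditions at both endpoints and z = -1, where u_a(x) = sinh(kx), u_b(x) = sinh(k(1-x)), k = \sqrt{-z}, and W(u_b, u_a) = k sinh(k), and compares the integral operator (9.27) with a direct finite difference solution of (A − z)f = g.

```python
import numpy as np

# Check of (9.27)/(9.28) for tau = -d^2/dx^2 on (0,1), Dirichlet at both ends.

z = -1.0
k = np.sqrt(-z)
N = 500
x = np.linspace(0, 1, N + 2)[1:-1]; h = x[1] - x[0]
g = np.exp(-(x - 0.3)**2/0.01)                   # arbitrary right-hand side

# (A - z) f = g by finite differences with f(0) = f(1) = 0
A = (np.diag(-np.ones(N-1), -1) + 2*np.eye(N) + np.diag(-np.ones(N-1), 1))/h**2
f_fd = np.linalg.solve(A - z*np.eye(N), g)

# resolvent as an integral operator with the kernel (9.28)
ua, ub = np.sinh(k*x), np.sinh(k*(1 - x))
G = np.where(x[:, None] >= x[None, :],
             ub[:, None]*ua[None, :], ua[:, None]*ub[None, :])/(k*np.sinh(k))
f_int = G @ g * h

print(np.max(np.abs(f_fd - f_int)))              # small discretization error
```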
  • 201. 9.2. Weyl’s limit circle, limit point alternative 189 Proof. Let g ∈ L2 (I, r dx) be real-valued and consider f = (A − z)−1 g ∈ c D(A). Since (τ − z)f = 0 near a, respectively, b, we obtain ua (z, x) by setting it equal to f near a and using the differential equation to extend it to the rest of I. Similarly we obtain ub . The only problem is that ua or ub might be identically zero. Hence we need to show that this can be avoided by choosing g properly. Fix z and let g be supported in (c, d) ⊂ I. Since (τ − z)f = g, we have x b f (x) = u1 (x) α + u2 gr dy + u2 (x) β + u1 gr dy . (9.29) a x ˜ Near a (x c) we have f (x) = αu1 (x) + βu2 (x) and near b (x d) we have b ˜ b f (x) = αu1 (x) + βu2 (x), where α = α + a u2 gr dy and β = β + a u1 gr dy. ˜ ˜ ˜ If f vanishes identically near both a and b, we must have α = β = α = β = 0 ˜ b and thus α = β = 0 and a uj (y)g(y)r(y)dy = 0, j = 1, 2. This case can be avoided by choosing a suitable g and hence there is at least one solution, say ub (z). Now choose u1 = ub and consider the behavior near b. If u2 is not square integrable on (d, b), we must have β = 0 since βu2 = f − αub is. If u2 is ˜ square integrable, we can find two functions in D(τ ) which coincide with ub and u2 near b. Since W (ub , u2 ) = 1, we see that τ is l.c. at a and hence 0 = Wb (ub , f ) = Wb (ub , αub + βu2 ) = β. Thus β = 0 in both cases and we ˜ have x b f (x) = ub (x) α + u2 gr dy + u2 (x) ub gr dy. a x b Now choosing g such that a ub gr dy = 0, we infer the existence of ua (z). Choosing u2 = ua and arguing as before, we see α = 0 and hence x b f (x) = ub (x) ua (y)g(y)r(y)dy + ua (x) ub (y)g(y)r(y)dy a x b = G(z, x, y)g(y)r(y)dy a for any g ∈ L2 (I, r dx). Since this set is dense, the claim follows. c Example. If τ is regular at a with a boundary condition as in the pre- vious example, we can choose ua (z, x) to be the solution corresponding to the initial conditions (ua (z, a), (pua )(z, a)) = (sin(α), cos(α)). In particular, ua (z, x) exists for all z ∈ C. If τ is regular at both a and b, there is a corresponding solution ub (z, x), again for all z. So the only values of z for which (A − z)−1 does not exist must be those with W (ub (z), ua (z)) = 0. However, in this case ua (z, x)
  • 202. 190 9. One-dimensional Schr¨dinger operators o and ub (z, x) are linearly dependent and ua (z, x) = γub (z, x) satisfies both boundary conditions. That is, z is an eigenvalue in this case. In particular, regular operators have pure point spectrum. We will see in Theorem 9.10 below that this holds for any operator which is l.c. at both endpoints. In the previous example ua (z, x) is holomorphic with respect to z and satisfies ua (z, x)∗ = ua (z ∗ , x) (since it corresponds to real initial conditions and our differential equation has real coefficients). In general we have: Lemma 9.8. Suppose z ∈ ρ(A). Then ua (z, x) from the previous lemma can be chosen locally holomorphic with respect to z such that ua (z, x)∗ = ua (z ∗ , x) (9.30) and similarly for ub (z, x). Proof. Since this is a local property near a, we can assume b is regular and choose ub (z, x) such that (ub (z, b), (pub )(z, b)) = (sin(β), − cos(β)) as in the example above. In addition, choose a second solution vb (z, x) such that (vb (z, b), (pvb )(z, b)) = (cos(β), sin(β)) and observe W (ub (z), vb (z)) = 1. If z ∈ ρ(A), z is no eigenvalue and hence ua (z, x) cannot be a multiple of ub (z, x). Thus we can set ua (z, x) = vb (z, x) + m(z)ub (z, x) and it remains to show that m(z) is holomorphic with m(z)∗ = m(z ∗ ). Choosing h with compact support in (a, c) and g with support in (c, b), we have h, (A − z)−1 g = h, ua (z) g ∗ , ub (z) = ( h, vb (z) + m(z) h, ub (z) ) g ∗ , ub (z) (with a slight abuse of notation since ub , vb might not be square integrable). Choosing (real-valued) functions h and g such that h, ub (z) g ∗ , ub (z) = 0, we can solve for m(z): h, (A − z)−1 g − h, vb (z) g ∗ , ub (z) m(z) = . h, ub (z) g ∗ , ub (z) This finishes the proof. d 2 Example. We already know that τ = − dx2 on I = (−∞, ∞) gives rise to the free Schr¨dinger operator H0 . Furthermore, o √ −zx u± (z, x) = e , z ∈ C, (9.31) √ are two linearly independent solutions (for z = 0) and since Re( −z) 0 for z ∈ C[0, ∞), there is precisely one solution (up to a constant multiple)
  • 203. 9.2. Weyl’s limit circle, limit point alternative 191 which is square integrable near ±∞, namely u± . In particular, the only choice for ua is u− and for ub is u+ and we get 1 √ G(z, x, y) = √ e− −z|x−y| (9.32) 2 −z which we already found in Section 7.4. If, as in the previous example, there is only one square integrable solu- tion, there is no choice for G(z, x, y). But since different boundary condi- tions must give rise to different resolvents, there is no room for boundary conditions in this case. This indicates a connection between our l.c., l.p. distinction and square integrability of solutions. Theorem 9.9 (Weyl alternative). The operator τ is l.c. at a if and only if for one z0 ∈ C all solutions of (τ − z0 )u = 0 are square integrable near a. This then holds for all z ∈ C and similarly for b. Proof. If all solutions are square integrable near a, τ is l.c. at a since the Wronskian of two linearly independent solutions does not vanish. Conversely, take two functions v, v ∈ D(τ ) with Wa (v, v ) = 0. By con- ˜ ˜ sidering real and imaginary parts, it is no restriction to assume that v and v are real-valued. Thus they give rise to two different self-adjoint operators ˜ ˜ A and A (choose any fixed w for the other endpoint). Let ua and ua be the ˜ corresponding solutions from above. Then W (ua , ua ) = 0 (since otherwise ˜ ˜ A = A by Lemma 9.5) and thus there are two linearly independent solutions which are square integrable near a. Since any other solution can be written as a linear combination of those two, every solution is square integrable near a. It remains to show that all solutions of (τ − z)u = 0 for all z ∈ C are square integrable near a if τ is l.c. at a. In fact, the above argument ensures ˜ this for every z ∈ ρ(A) ∩ ρ(A), that is, at least for all z ∈ CR. Suppose (τ − z)u = 0 and choose two linearly independent solutions uj , j = 1, 2, of (τ − z0 )u = 0 with W (u1 , u2 ) = 1. Using (τ − z0 )u = (z − z0 )u and (9.10), we have (a c x b) x u(x) = αu1 (x) + βu2 (x) + (z − z0 ) (u1 (x)u2 (y) − u1 (y)u2 (x))u(y)r(y) dy. c Since uj ∈ L2 ((c, b), rdx), we can find a constant M ≥ 0 such that b |u1,2 (y)|2 r(y) dy ≤ M. c
  • 204. 192 9. One-dimensional Schr¨dinger operators o Now choose c close to b such that |z − z0 |M 2 ≤ 1/4. Next, estimating the integral using Cauchy–Schwarz gives x 2 (u1 (x)u2 (y) − u1 (y)u2 (x))u(y)r(y) dy c x x ≤ |u1 (x)u2 (y) − u1 (y)u2 (x)|2 r(y) dy |u(y)|2 r(y) dy c c x ≤ M |u1 (x)|2 + |u2 (x)|2 |u(y)|2 r(y) dy c and hence x x |u(y)|2 r(y) dy ≤ (|α|2 + |β|2 )M + 2|z − z0 |M 2 |u(y)|2 r(y) dy c c x 1 ≤ (|α|2 + |β|2 )M + |u(y)|2 r(y) dy. 2 c Thus x |u(y)|2 r(y) dy ≤ 2(|α|2 + |β|2 )M c and since u ∈ ACloc (I), we have u ∈ L2 ((c, b), r dx) for every c ∈ (a, b). Now we turn to the investigation of the spectrum of A. If τ is l.c. at both endpoints, then the spectrum of A is very simple Theorem 9.10. If τ is l.c. at both endpoints, then the resolvent is a Hilbert– Schmidt operator; that is, b b |G(z, x, y)|2 r(y)dy r(x)dx ∞. (9.33) a a In particular, the spectrum of any self-adjoint extensions is purely discrete and the eigenfunctions (which are simple) form an orthonormal basis. Proof. This follows from the estimate b x b |ub (x)ua (y)|2 r(y)dy + |ub (y)ua (x)|2 r(y)dy r(x)dx a a x b b ≤2 |ua (y)|2 r(y)dy |ub (y)|2 r(y)dy, a a which shows that the resolvent is Hilbert–Schmidt and hence compact. Note that all eigenvalues are simple. If τ is l.p. at one endpoint, this is clear, since there is at most one solution of (τ − λ)u = 0 which is square integrable near this endpoint. If τ is l.c., this also follows since the fact that two solutions of (τ − λ)u = 0 satisfy the same boundary condition implies that their Wronskian vanishes.
  • 205. 9.2. Weyl’s limit circle, limit point alternative 193 If τ is not l.c., the situation is more complicated and we can only say something about the essential spectrum. Theorem 9.11. All self-adjoint extensions have the same essential spec- trum. Moreover, if Aac and Acb are self-adjoint extensions of τ restricted to (a, c) and (c, b) (for any c ∈ I), then σess (A) = σess (Aac ) ∪ σess (Acb ). (9.34) Proof. Since (τ − i)u = 0 has two linearly independent solutions, the defect indices are at most two (they are zero if τ is l.p. at both endpoints, one if τ is l.c. at one and l.p. at the other endpoint, and two if τ is l.c. at both endpoints). Hence the first claim follows from Theorem 6.20. For the second claim restrict τ to the functions with compact support in (a, c) ∪ (c, d). Then, this operator is the orthogonal sum of the operators A0,ac and A0,cb . Hence the same is true for the adjoints and hence the defect indices of A0,ac ⊕ A0,cb are at most four. Now note that A and Aac ⊕ Acb are both self-adjoint extensions of this operator. Thus the second claim also follows from Theorem 6.20. In particular, this result implies that for the essential spectrum only the behaviour near the endpoints a and b is relevant. Another useful result to determine if q is relatively compact is the fol- lowing: Lemma 9.12. Suppose k ∈ L2 ((a, b), r dx). Then kRA (z) is Hilbert– loc Schmidt if and only if b 2 1 kRA (z) 2 = |k(x)|2 Im(G(z, x, x))r(x)dx (9.35) Im(z) a is finite. Proof. From the first resolvent formula we have b G(z, x, y) − G(z , x, y) = (z − z ) G(z, x, t)G(z , t, y)r(t)dt. a Setting x = y and z = z ∗ , we obtain b Im(G(z, x, x)) = Im(z) |G(z, x, t)|2 r(t)dt. (9.36) a Using this last formula to compute the Hilbert–Schmidt norm proves the lemma. d 2 Problem 9.5. Compute the spectrum and the resolvent of τ = − dx2 , I = (0, ∞) defined on D(A) = {f ∈ D(τ )|f (0) = 0}.
  • 206. 194 9. One-dimensional Schr¨dinger operators o Problem 9.6. Suppose τ is given on (a, ∞), where a is a regular endpoint. Suppose there are two solutions u± of τ u = zu satisfying r(x)1/2 |u± (x)| ≤ Ce αx for some C, α 0. Then z is not in the essential spectrum of any self- adjoint operator corresponding to τ . (Hint: You can take any self-adjoint extension, say the one for which ua = u− and ub = u+ . Write down what you expect the resolvent to be and show that it is a bounded operator by comparison with the resolvent from the previous problem.) Problem 9.7. Suppose a is regular and limx→b q(x)/r(x) = ∞. Show that σess (A) = ∅ for every self-adjoint extension. (Hint: Fix some positive con- stant n and choose c ∈ (a, b) such that q(x)/r(x) ≥ n in (c, b) and use Theorem 9.11.) Problem 9.8 (Approximation by regular operators). Fix functions v, w ∈ D(τ ) as in Theorem 9.6. Pick Im = (cm , dm ) with cm ↓ a, dm ↑ b and define Am : D(Am ) → L2 (Im , r dr) , f → τf where D(Am ) = {f ∈ L2 (Im , r dr)| f, pf ∈ AC(Im ), τ f ∈ L2 (Im , r dr), Wcm (v, f ) = Wdm (w, f ) = 0}. Then Am converges to A in the strong resolvent sense as m → ∞. (Hint: Lemma 6.36.) Problem 9.9 (Weyl circles). Fix z ∈ CR and c ∈ (a, b). Introduce W (u, u∗ )x [u]x = ∈R z − z∗ and use (9.4) to show that x [u]x = [u]c + |u(y)|2 r(y)dy, (τ − z)u = 0. c Hence [u]x is increasing and exists if and only if u ∈ L2 ((c, b), r dx). Let u1,2 be two solutions of (τ − z)u = 0 which satisfy [u1 ]c = [u2 ]c = 0 and W (u1 , u2 ) = 1. Then, all (nonzero) solutions u of (τ − z)u = 0 which satisfy [u]b = 0 can be written as u = u2 + m u1 , m ∈ C, up to a complex multiple (note [u1 ]x 0 for x c). Show that [u2 + m u1 ]x = [u1 ]x |m − M (x)|2 − R(x)2 , where W (u2 , u∗ )x 1 M (x) = − W (u1 , u∗ )x 1
  • 207. 9.3. Spectral transformations I 195 and −2 R(x)2 = |W (u2 , u∗ )x |2 + W (u2 , u∗ )x W (u1 , u∗ )x 1 2 1 |z − z ∗ |[u1 ]x −2 = |z − z ∗ |[u1 ]x . Hence the numbers m for which [u]x = 0 lie on a circle which either converges to a circle (if limx→b R(x) 0) or to a point (if limx→b R(x) = 0) as x → b. Show that τ is l.c. at b in the first case and l.p. in the second case. 9.3. Spectral transformations I In this section we want to provide some fundamental tools for investigating the spectra of Sturm–Liouville operators and, at the same time, give some nice illustrations of the spectral theorem. d 2 Example. Consider again τ = − dx2 on I = (−∞, ∞). From Section 7.2 we know that the Fourier transform maps the associated operator H0 to the multiplication operator with p2 in √2 (R). To get multiplication by λ, as in L the spectral theorem, we set p = λ and split the Fourier integral into a positive and negative part, that is, √ i λx (U f )(λ) = R e √ f (x) dx , λ ∈ σ(H0 ) = [0, ∞). (9.37) −i λx f (x) dx Re Then 2 χ[0,∞) (λ) U : L2 (R) → L2 (R, √ dλ) (9.38) j=1 2 λ is the spectral transformation whose existence is guaranteed by the spectral theorem (Lemma 3.4). Note, however, √ the measure is not finite. This that √ can be easily fixed if we replace exp(±i λx) by γ(λ) exp(±i λx). √ Note that in the previous example the kernel e±i λx of the integral trans- form U is just a pair of linearly independent solutions of the underlying differential equation (though no eigenfunctions, since they are not square integrable). More generally, if U : L2 (I, r dx) → L2 (R, dµ), f (x) → u(λ, x)f (x)r(x) dx (9.39) I is an integral transformation which maps a self-adjoint Sturm–Liouville op- erator A to multiplication by λ, then its kernel u(λ, x) is a solution of the underlying differential equation. This formally follows from U Af = λU f
  • 208. 196 9. One-dimensional Schr¨dinger operators o which implies 0= u(λ, x)(τ − λ)f (x)r(x) dx = (τ − λ)u(λ, x)f (x)r(x) dx (9.40) I I and hence (τ − λ)u(λ, .) = 0. Lemma 9.13. Suppose k 2 U : L (I, r dx) → L2 (R, dµj ) (9.41) j=1 is a spectral mapping as in Lemma 3.4. Then U is of the form b U f (x) = u(λ, x)f (x)r(x) dx, (9.42) a where u(λ, x) = (u1 (λ, x), . . . , uk (λ, x)) is measurable and for a.e. λ (with respect to µj ) and each uj (λ, .) is a solution of τ uj = λuj which satisfies the boundary conditions of A (if any). Here the integral has to be understood as b d 2 a dx = limc↓a,d↑b c dx with limit taken in j L (R, dµj ). The inverse is given by k (U −1 F )(x) = uj (λ, x)∗ Fj (λ)dµj (λ). (9.43) j=1 R R Again the integrals have to be understood as R dµj = limR→∞ −R dµj with limits taken in L2 (I, r dx). If the spectral measures are ordered, then the solutions uj (λ), 1 ≤ j ≤ l, are linearly independent for a.e. λ with respect to µl . In particular, for ordered spectral measures we always have k ≤ 2 and even k = 1 if τ is l.c. at one endpoint. 1 Proof. Using Uj RA (z) = λ−z Uj , we have b Uj f (x) = (λ − z)Uj G(z, x, y)f (y)r(y) dy. a If we restrict RA (z) to a compact interval [c, d] ⊂ (a, b), then RA (z)χ[c,d] is Hilbert–Schmidt since G(z, x, y)χ[c,d] (y) is square integrable over (a, b) × (a, b). Hence Uj χ[c,d] = (λ − z)Uj RA (z)χ[c,d] is Hilbert–Schmidt as well and [c,d] by Lemma 6.9 there is a corresponding kernel uj (λ, y) such that b [c,d] (Uj χ[c,d] f )(λ) = uj (λ, x)f (x)r(x) dx. a
  • 209. 9.3. Spectral transformations I 197 c ˆ Now take a larger compact interval [ˆ, d] ⊇ [c, d]. Then the kernels coincide [c,d] c ˆ [ˆ,d] on [c, d], uj (λ, .) = uj (λ, .)χ[c,d] , since we have Uj χ[c,d] = Uj χ[ˆ,d] χ[c,d] . c ˆ In particular, there is a kernel uj (λ, x) such that b Uj f (x) = uj (λ, x)f (x)r(x) dx a for every f with compact support in (a, b). Since functions with compact support are dense and Uj is continuous, this formula holds for any f provided the integral is understood as the corresponding limit. Using the fact that U is unitary, F , U g = U −1 F , g , we see b b Fj (λ)∗ uj (λ, x)g(x)r(x) dx = (U −1 F )(x)∗ g(x)r(x) dx. j R a a Interchanging integrals on the right-hand side (which is permitted at least for g, F with compact support), the formula for the inverse follows. Next, from Uj Af = λUj f we have b b uj (λ, x)(τ f )(x)r(x) dx = λ uj (λ, x)f (x)r(x) dx a a for a.e. λ and every f ∈ D(A0 ). Restricting everything to [c, d] ⊂ (a, b), the above equation implies uj (λ, .)|[c,d] ∈ D(A∗ ) and A∗ uj (λ, .)|[c,d] = cd,0 cd,0 λuj (λ, .)|[c,d] . In particular, uj (λ, .) is a solution of τ uj = λuj . Moreover, if τ is l.c. near a, we can choose c = a and allow all f ∈ D(τ ) which satisfy the boundary condition at a and vanish identically near b. Finally, assume the µj are ordered and fix l ≤ k. Suppose l cj (λ)uj (λ, x) = 0. j=1 Then we have l cj (λ)Fj (λ) = 0, Fj = Uj f, j=1 for every f . Since U is surjective, we can prescribe Fj arbitrarily on σ(µl ), e.g., Fj (λ) = 1 for j = j0 and Fj (λ) = 0 otherwise, which shows cj0 (λ) = 0. Hence the solutions uj (λ, x), 1 ≤ j ≤ l, are linearly independent for λ ∈ σ(µl ) which shows k ≤ 2 since there are at most two linearly independent solutions. If τ is l.c. and uj (λ, x) must satisfy the boundary condition, there is only one linearly independent solution and thus k = 1. Note that since we can replace uj (λ, x) by γj (λ)uj (λ, x) where |γj (λ)| = 1, it is no restriction to assume that uj (λ, x) is real-valued.
  • 210. 198 9. One-dimensional Schr¨dinger operators o For simplicity we will only pursue the case where one endpoint, say a, is regular. The general case can often be reduced to this case and will be postponed until Section 9.6. We choose a boundary condition cos(α)f (a) − sin(α)p(a)f (a) = 0 (9.44) and introduce two solution s(z, x) and c(z, x) of τ u = zu satisfying the initial conditions s(z, a) = sin(α), p(a)s (z, a) = cos(α), c(z, a) = cos(α), p(a)c (z, a) = − sin(α). (9.45) Note that s(z, x) is the solution which satisfies the boundary condition at a; that is, we can choose ua (z, x) = s(z, x). In fact, if τ is not regular at a but only l.c., everything below remains valid if one chooses s(z, x) to be a solution satisfying the boundary condition at a and c(z, x) a linearly independent solution with W (c(z), s(z)) = 1. Moreover, in our previous lemma we have u1 (λ, x) = γa (λ)s(λ, x) and using the rescaling dµ(λ) = |γa (λ)|2 dµa (λ) and (U1 f )(λ) = γa (λ)(U f )(λ), we obtain a unitary map b U : L2 (I, r dx) → L2 (R, dµ), (U f )(λ) = s(λ, x)f (x)r(x)dx (9.46) a with inverse (U −1 F )(x) = s(λ, x)F (λ)dµ(λ). (9.47) R Note, however, that while this rescaling gets rid of the unknown factor γa (λ), it destroys the normalization of the measure µ. For µ1 we know µ1 (R) (if the corresponding vector is normalized), but µ might not even be bounded! In fact, it turns out that µ is indeed unbounded. So up to this point we have our spectral transformation U which maps A to multiplication by λ, but we know nothing about the measure µ. Further- more, the measure µ is the object of desire since it contains all the spectral information of A. So our next aim must be to compute µ. If A has only pure point spectrum (i.e., only eigenvalues), this is straightforward as the following example shows. Example. Suppose E ∈ σp (A) is an eigenvalue. Then s(E, x) is the cor- responding eigenfunction and the same is true for SE (λ) = (U s(E))(λ). In particular, χ{E} (A)s(E, x) = s(E, x) shows SE (λ) = (U χ{E} (A)s(E))(λ) = χ{E} (λ)SE (λ); that is, s(E) 2 , λ = E, SE (λ) = (9.48) 0, λ = 0.
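In the simplest regular example this can be carried out explicitly and tested numerically (an editorial sketch, not part of the text): for τ = -d^2/dx^2 on (0, π) with Dirichlet conditions at both endpoints one has s(z, x) = sin(\sqrt{z}\,x)/\sqrt{z}, eigenvalues E_j = j^2, and weights μ({E_j}) = ‖s(E_j)‖^{-2} = 2j^2/π (cf. (9.49) below), so that unitarity of U from (9.46) amounts to the Parseval identity checked here.

```python
import numpy as np

# Parseval identity ||f||^2 = sum_j mu({E_j}) |(U f)(E_j)|^2 for the Dirichlet
# problem on (0, pi): s(E_j, x) = sin(j x)/j, mu({E_j}) = 2 j^2 / pi.

x = np.linspace(0, np.pi, 20000); dx = x[1] - x[0]
f = x*(np.pi - x)*np.exp(np.sin(3*x))            # arbitrary test function

norm2 = np.sum(f**2)*dx
total = 0.0
for j in range(1, 200):
    s = np.sin(j*x)/j                            # s(E_j, x), E_j = j^2
    Uf = np.sum(s*f)*dx                          # (U f)(E_j)
    total += (2*j**2/np.pi)*Uf**2
print(norm2, total)                              # the two numbers agree
```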
  • 211. 9.3. Spectral transformations I 199 Moreover, since U is unitary, we have b 2 s(E) = s(E, x)2 r(x)dx = SE (λ)2 dµ(λ) = s(E) 4 µ({E}); (9.49) a R that is, µ({E}) = s(E) −2 . In particular, if A has pure point spectrum (e.g., if τ is limit circle at both endpoints), we have ∞ 1 dµ(λ) = 2 dΘ(λ − Ej ), σp (A) = {Ej }∞ , j=1 (9.50) s(Ej ) j=1 where dΘ is the Dirac measure centered at 0. For arbitrary A, the above formula holds at least for the pure point part µpp . In the general case we have to work a bit harder. Since c(z, x) and s(z, x) are linearly independent solutions, W (c(z), s(z)) = 1, (9.51) we can write ub (z, x) = γb (z)(c(z, x) + mb (z)s(z, x)), where cos(α)p(a)ub (z, a) + sin(α)ub (z, a) mb (z) = , z ∈ ρ(A), (9.52) cos(α)ub (z, a) − sin(α)p(a)ub (z, a) is known as the Weyl–Titchmarsh m-function. Note that mb (z) is holo- morphic in ρ(A) and that mb (z)∗ = mb (z ∗ ) (9.53) since the same is true for ub (z, x) (the denominator in (9.52) only vanishes if ub (z, x) satisfies the boundary condition at a, that is, if z is an eigenvalue). Moreover, the constant γb (z) is of no importance and can be chosen equal to one, ub (z, x) = c(z, x) + mb (z)s(z, x). (9.54) Lemma 9.14. The Weyl m-function is a Herglotz function and satisfies b Im(mb (z)) = Im(z) |ub (z, x)|2 r(x) dx, (9.55) a where ub (z, x) is normalized as in (9.54). Proof. Given two solutions u(x), v(x) of τ u = zu, τ v = z v, respectively, it ˆ is straightforward to check x (ˆ − z) z u(y)v(y)r(y) dy = Wx (u, v) − Wa (u, v) a (clearly it is true for x = a; now differentiate with respect to x). Now choose u(x) = ub (z, x) and v(x) = ub (z, x)∗ = ub (z ∗ , x), x −2 Im(z) |ub (z, y)|2 r(y) dy = Wx (ub (z), ub (z)∗ ) − 2 Im(mb (z)), a
  • 212. 200 9. One-dimensional Schr¨dinger operators o and observe that Wx (ub , u∗ ) vanishes as x ↑ b, since both ub and u∗ are in b b D(τ ) near b. Lemma 9.15. Let s(z, x)ub (z, y), y ≥ x, G(z, x, y) = (9.56) s(z, y)ub (z, x), y ≤ x, be the Green function of A. Then s(λ, x) p(x)s (λ, x) (U G(z, x, .))(λ) = and (U p(x)∂x G(z, x, .))(λ) = λ−z λ−z (9.57) for every x ∈ (a, b) and every z ∈ ρ(A). Proof. First of all note that G(z, x, .) ∈ L2 ((a, b), r dx) for every x ∈ (a, b) and z ∈ ρ(A). Moreover, from RA (z)f = U −1 λ−z U f we have 1 b s(λ, x)F (λ) G(z, x, y)f (y)r(y) dy = dµ(λ), (9.58) a R λ−z where F = U f . Here equality is to be understood in L2 , that is, for a.e. x. However, the left-hand side is continuous with respect to x and so is the right-hand side, at least if F has compact support. Since this set is dense, the first equality follows. Similarly, the second follows after differentiating (9.58) with respect to x. Corollary 9.16. We have 1 (U ub (z))(λ) = , (9.59) λ−z where ub (z, x) is normalized as in (9.54). Proof. Choosing x = a in the lemma, we obtain the claim from the first identity if sin(α) = 0 and from the second if cos(α) = 0. Now combining Lemma 9.14 and Corollary 9.16, we infer from unitarity of U that b 1 Im(mb (z)) = Im(z) |ub (z, x)|2 r(x) dx = Im(z) dµ(λ) (9.60) a R |λ − z|2 and since a holomorphic function is determined up to a real constant by its imaginary part, we obtain Theorem 9.17. The Weyl m-function is given by 1 λ mb (z) = d + − dµ(λ), d ∈ R, (9.61) R λ − z 1 + λ2
  • 213. 9.3. Spectral transformations I 201 and 1 d = Re(mb (i)), 2 dµ(λ) = Im(mb (i)) ∞. (9.62) R 1+λ Moreover, µ is given by the Stieltjes inversion formula λ+δ 1 µ(λ) = lim lim Im(mb (λ + iε))dλ, (9.63) δ↓0 ε↓0 π δ where b Im(mb (λ + iε)) = ε |ub (λ + iε, x)|2 r(x) dx. (9.64) a Proof. Choosing z = i in (9.60) shows (9.62) and hence the right-hand side of (9.61) is a well-defined holomorphic function in CR. By 1 λ Im(z) Im( − 2 )= λ−z 1+λ |λ − z|2 its imaginary part coincides with that of mb (z) and hence equality follows. The Stieltjes inversion formula follows as in the case where the measure is bounded. d 2 Example. Consider τ = − dx2 on I = (0, ∞). Then √ sin(α) √ c(z, x) = cos(α) cos( zx) − √ sin( zx) (9.65) z and √ cos(α) √ s(z, x) = sin(α) cos( zx) + √ sin( zx). (9.66) z Moreover, √ ub (z, x) = ub (z, 0)e− −zx (9.67) and thus √ sin(α) − −z cos(α) mb (z) = √ , (9.68) cos(α) + −z sin(α) respectively, √ λ dµ(λ) = dλ. (9.69) π(cos(α)2 + λ sin(α)2 ) 1 Note that if α = 0, we even have |λ−z| dµ(λ) 0 in the previous example and hence 1 mb (z) = − cot(α) + dµ(λ) (9.70) R λ−z in this case (the factor − cot(α) follows by considering the limit |z| → ∞ of both sides). Formally this even follows in the general case by choosing x = a in ub (z, x) = (U −1 λ−z )(x); however, since we know equality only for 1
  • 214. 202 9. One-dimensional Schr¨dinger operators o a.e. x, a more careful analysis is needed. We will address this problem in the next section. Problem 9.10. Show cos(α − β)mb,β (z) + sin(α − β) mb,α (z) = . (9.71) cos(α − β) − sin(α − β)mb,β (z) (Hint: The case β = 0 is (9.52).) Problem 9.11. Let φ0 (x), θ0 (x) be two real-valued solutions of τ u = λ0 u for some fixed λ0 ∈ R such that W (θ0 , φ0 ) = 1. We will call τ quasi-regular at a if the limits lim Wx (φ0 , u(z)), lim Wx (θ0 , u(z)) (9.72) x→a x→a exist for every solution u(z) of τ u = zu. Show that this definition is inde- pendent of λ0 (Hint: Pl¨cker’s identity). Show that τ is quasi-regular at a u if it is l.c. at a. Introduce φ(z, x) = Wa (c(z), φ0 )s(z, x) − Wa (s(z), φ0 )c(z, x), θ(z, x) = Wa (s(z), θ0 )c(z, x) − Wa (c(z), θ0 )s(z, x), (9.73) where c(z, x) and s(z, x) are chosen with respect to some base point c ∈ (a, b), and a singular Weyl m-function Mb (z) such that ψ(z, x) = θ(z, x) + Mb (z)φ(z, x) ∈ L2 (c, b). (9.74) Show that all claims from this section still hold true in this case for the operator associated with the boundary condition Wa (φ0 , f ) = 0 if τ is l.c. at a. 9.4. Inverse spectral theory In this section we want to show that the Weyl m-function (respectively, the corresponding spectral measure) uniquely determines the operator. For simplicity we only consider the case p = r ≡ 1. We begin with some asymptotics for large z away from the spectrum. √ We recall that z always denotes the branch with arg(z) ∈ (−π, π]. We will write c(z, x) = cα (z, x) and s(z, x) = sα (z, x) to display the dependence on α whenever necessary. We first observe (Problem 9.12)
  • 215. 9.4. Inverse spectral theory 203 Lemma 9.18. For α = 0 we have √ 1 √ c0 (z, x) = cosh( −z(x − a)) + O( √ e −z(x−a) ), −z 1 √ 1 √ s0 (z, x) = √ sinh( −z(x − a)) + O( e −z(x−a) ), (9.75) −z z uniformly for x ∈ (a, c) as |z| → ∞. Note that for z ∈ C[0, ∞) this can be written as 1 √ 1 c0 (z, x) = e −z(x−a) (1 + O( √ )), 2 −z 1 √ 1 s0 (z, x) = √ e −z(x−a) (1 + O( )), (9.76) 2 −z z for Im(z) → ∞ and for z = λ ∈ [0, ∞) we have √ 1 c0 (λ, x) = cos( λ(x − a)) + O( √ ), λ 1 √ 1 s0 (λ, x) = √ sin( λ(x − a)) + O( ), (9.77) λ λ as λ → ∞. From this lemma we obtain Lemma 9.19. The Weyl m-function satisfies − cot(α) + O( √1 ), −z α = 0, mb (z) = √ (9.78) − −z + O(1), α = 0, as z → ∞ in any sector | Re(z)| ≤ C| Im(z)|. Proof. As in the proof of Theorem 9.17 we obtain from Lemma 9.15 1 λ G(z, x, x) = d(x) + − s(λ, x)2 dµ(λ). R λ − z 1 + λ2 Hence, since the integrand converges pointwise to 0, dominated convergence (Problem 9.13) implies G(z, x, x) = o(z) as z → ∞ in any sector | Re(z)| ≤ C| Im(z)|. Now solving G(z, x, y) = s(z, x)ub (z, x) for mb (z) and using the asymptotic expansions from Lemma 9.18, we see c(z, x) √ mb (z) = − + o(ze−2 −z(x−a) ) s(z, x) from which the claim follows. Note that assuming q ∈ C k ([a, b)), one can obtain further asymptotic terms in Lemma 9.18 and hence also in the expansion of mb (z). The asymptotics of mb (z) in turn tell us more about L2 (R, dµ).
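In the explicit example (9.68) the asymptotics (9.78) can be checked directly; the following sketch (not part of the text) follows the ray z = iy.

```python
import numpy as np

# Check of (9.78) for tau = -d^2/dx^2 on (0, infinity), using the explicit
# m-function (9.68): m_b(z) = -cot(alpha) + O(1/sqrt(-z)) for alpha != 0,
# while for alpha = 0 the formula gives m_b(z) = -sqrt(-z) exactly.

alpha = 0.7

def m_b(z, alpha):
    w = np.sqrt(-z + 0j)                         # branch with Re > 0
    return (np.sin(alpha) - w*np.cos(alpha))/(np.cos(alpha) + w*np.sin(alpha))

for y in (1e2, 1e4, 1e6, 1e8):
    z = 1j*y
    err = m_b(z, alpha) + 1/np.tan(alpha)        # decays like |z|^{-1/2}
    print(y, abs(err), abs(err)*np.sqrt(y))      # last column tends to 1/sin(alpha)^2

print(m_b(1j*100.0, 0.0) + np.sqrt(-1j*100.0 + 0j))   # identically 0 for alpha = 0
```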
  • 216. 204 9. One-dimensional Schr¨dinger operators o Lemma 9.20. Let 1 λ F (z) = d + − dµ(λ) R λ − z 1 + λ2 be a Herglotz function. Then, for any 0 γ 2, we have ∞ dµ(λ) Im(F (iy)) ∞ ⇐⇒ dy ∞. (9.79) R 1 + |λ|γ 1 yγ Proof. First of all note that we can split F (z) = F1 (z) + F2 (z) according to dµ = χ[−1,1] dµ + (1 + χ[−1,1] )dµ. The part F1 (z) corresponds to a finite measure and does not contribute by Theorem 3.20. Hence we can assume that µ is not supported near 0. Then Fubini shows ∞ ∞ Im(F (iy)) y 1−γ π/2 1 dy = dµ(λ)dy = dµ(λ), 0 yγ 0 R λ2 + y2 sin(γπ/2) R |λ|γ which proves the claim. Here we have used (Problem 9.14) ∞ y 1−γ π/2 dy = . 0 λ2 + y 2 |λ|γ sin(γπ/2) For the case γ = 0 see Theorem 3.20 and for the case γ = 2 see Prob- lem 9.15. Corollary 9.21. We have s(λ, x)s(λ, y) G(z, x, y) = dµ(λ), (9.80) R λ−z where the integrand is integrable. Moreover, for any ε 0 we have √ G(z, x, y) = O(z −1/2+ε e− −z|y−x| ), (9.81) as z → ∞ in any sector | Re(z)| ≤ C| Im(z)|. Proof. The previous lemma implies s(λ, x)2 (1+|λ|)γ dµ(λ) ∞ for γ 1 . 2 This already proves the first part and also the second in the case x = y, and hence the result follows from |λ − z|−1 ≤ const Im(z)−1/2+ε (1 + λ2 )−1/4−ε/2 (Problem 9.13) in any sector | Re(z)| ≤ C| Im(z)|. But the case x = y implies √ ub (z, x) = O(z −1/2+ε e− −z(a−x) ), which in turn implies the x = y case. Now we come to our main result of this section:
  • 217. 9.4. Inverse spectral theory 205 Theorem 9.22. Suppose τj , j = 0, 1, are given on (a, b) and both are regular at a. Moreover, Aj are some self-adjoint operators associated with τj and the same boundary condition at a. Let c ∈ (0, b). Then q0 (x) = q1 (x) for x ∈ (a, c) if and only if for every √ ε 0 we have that m1,b (z) − m0,b (z) = O(e−2(a−ε) Re( −z) ) as z → ∞ along some nonreal ray. Proof. By (9.75) we have s1 (z, x)/s0 (z, x) → 1 as z → ∞ along any nonreal ray. Moreover, (9.81) in the case y = x shows s0 (z, x)u1,b (z, x) → 0 and s1 (z, x)u0,b (z, x) → 0 as well. In particular, the same is true for the difference s1 (z, x)c0 (z, x) − s0 (z, x)c1 (z, x) + (m1,b (z) − m0,b (z))s0 (z, x)s1 (z, x). Since the first two terms cancel for x ∈ (a, c), (9.75) implies m1,b (z) − √ m0,b (z) = O(e−2(a−ε) Re( −z) ). To see the converse, first note that the entire function s1 (z, x)c0 (z, x) − s0 (z, x)c1 (z, x) =s1 (z, x)u0,b (z, x) − s0 (z, x)u1,b (z, x) − (m1,b (z) − m0,b (z))s0 (z, x)s1 (z, x) vanishes as z → ∞ along any nonreal ray for fixed x ∈ (a, c) by the same arguments used before together with the assumption on m1,b (z) − m0,b (z). Moreover, by (9.75) this function has an order of growth ≤ 1/2 and thus by the Phragm´n–Lindel¨f theorem (e.g., [53, Thm. 4.3.4]) is bounded on e o all of C. By Liouville’s theorem it must be constant and since it vanishes along rays, it must be zero; that is, s1 (z, x)c0 (z, x) = s0 (z, x)c1 (z, x) for all z ∈ C and x ∈ (a, c). Differentiating this identity with respect to x and us- ing W (cj (z), sj (z)) = 1 shows s1 (z, x)2 = s0 (z, x)2 . Taking the logarithmic derivative further gives s1 (z, x)/s1 (z, x) = s0 (z, x)/s0 (z, x) and differentiat- ing once more shows s1 (z, x)/s1 (z, x) = s0 (z, x)/s0 (z, x). This finishes the proof since qj (x) = z + sj (z, x)/sj (z, x). Problem 9.12. Prove Lemma 9.18. (Hint: Without loss set a = 0. Now use that √ sin(α) √ c(z, x) = cos(α) cosh( −zx) − √ sinh( −zx) −z x √ 1 −√ sinh( −z(x − y))q(y)c(z, y)dy −z 0 √ by Lemma 9.2 and consider c(z, x) = e− ˜ −zx c(z, x).) Problem 9.13. Show 1 2 |z| ≤√ (9.82) λ−z 1+λ 2 Im(z)
  • 218. 206 9. One-dimensional Schr¨dinger operators o and 1 λ 2 (1 + |z|)|z| − 2 ≤ (9.83) λ−z 1+λ 1 + λ2 Im(z) for any λ ∈ R. (Hint: To obtain the first, search for the maximum as a function of λ (cf. also Problem 3.7). The second then follows from the first.) Problem 9.14. Show ∞ y 1−γ π/2 2 dy = , γ ∈ (0, 2), 0 1+y sin(γπ/2) by proving ∞ eαx π x dx = , α ∈ (0, 1). −∞ 1+e sin(γπ) (Hint: To compute the last integral, use a contour consisting of the straight lines connecting the points −R, R, R + 2πi, −R + 2πi. Evaluate the contour integral using the residue theorem and let R → ∞. Show that the contribu- tions from the vertical lines vanish in the limit and relate the integrals along the horizontal lines.) Problem 9.15. In Lemma 9.20 we assumed 0 γ 2. Show that in the case γ = 2 we have ∞ log(1 + λ2 ) Im(F (iy)) dµ(λ) ∞ ⇐⇒ dy ∞. R 1 + λ2 1 y2 ∞ y −1 log(1+λ2 ) (Hint: 1 λ2 +y 2 dy = 2λ2 .) 9.5. Absolutely continuous spectrum In this section we will show how to locate the absolutely continuous spec- trum. We will again assume that a is a regular endpoint. Moreover, we assume that b is l.p. since otherwise the spectrum is discrete and there will be no absolutely continuous spectrum. In this case we have seen in the Section 9.3 that A is unitarily equivalent to multiplication by λ in the space L2 (R, dµ), where µ is the measure asso- ciated to the Weyl m-function. Hence by Theorem 3.23 we conclude that the set Ms = {λ| lim sup Im(mb (λ + iε)) = ∞} (9.84) ε↓0 is a support for the singularly continuous part and Mac = {λ|0 lim sup Im(mb (λ + iε)) ∞} (9.85) ε↓0
  • 219. 9.5. Absolutely continuous spectrum 207 is a minimal support for the absolutely continuous part. Moreover, σ(Aac ) can be recovered from the essential closure of Mac ; that is, ess σ(Aac ) = M ac . (9.86) Compare also Section 3.2. We now begin our investigation with a crucial estimate on Im(mb (λ+iε)). Set x f (a,x) = |f (y)|2 r(y)dy, x ∈ (a, b). (9.87) a Lemma 9.23. Let −1 ε = (2 s(λ) (1,x) c(λ) (a,x) ) (9.88) and note that since b is l.p., there is a one-to-one correspondence between ε ∈ (0, ∞) and x ∈ (a, b). Then √ s(λ) (a,x) √ 5 − 24 ≤ |mb (λ + iε)| ≤ 5 + 24. (9.89) c(λ) (a,x) Proof. Let x a. Then by Lemma 9.2 u+ (λ + iε, x) = c(λ, x) − mb (λ + iε)s(λ, x) x − iε c(λ, x)s(λ, y) − c(λ, y)s(λ, x) u+ (λ + iε, y)r(y)dy. a Hence one obtains after a little calculation (as in the proof of Theorem 9.9) c(λ) − mb (λ + iε)s(λ) (a,x) ≤ ub (λ + iε) (a,x) + 2ε s(λ) (a,x) c(λ) (a,x) ub (λ + iε) (a,x) . Using the definition of ε and (9.55), we obtain 2 2 c(λ) − mb (λ + iε)s(λ) (a,x) ≤ 4 ub (λ + iε) (a,x) 2 4 ≤ 4 ub (λ + iε) (a,b) =Im(mb (λ + iε)) ε ≤ 8 s(λ) (a,x) c(λ) (a,x) Im(mb (λ + iε)). Combining this estimate with 2 2 c(λ) − mb (λ + iε)s(λ) (a,x) ≥ c(λ) (a,x) − |mb (λ + iε)| s(λ) (a,x) −1 shows (1 − t)2 ≤ 8t, where t = |mb (λ + iε)| s(λ) (a,x) c(λ) (a,x) . We now introduce the concept of subordinacy. A nonzero solution u of τ u = zu is called sequentially subordinate at b with respect to another solution v if u (a,x) lim inf = 0. (9.90) x→b v (a,x)
  • 220. 208 9. One-dimensional Schr¨dinger operators o If the lim inf can be replaced by a lim, the solution is called subordinate. Both concepts will eventually lead to the same results (cf. Remark 9.26 below). We will work with (9.90) since this will simplify proofs later on and hence we will drop the additional sequentially. It is easy to see that if u is subordinate with respect to v, then it is subordinate with respect to any linearly independent solution. In particular, a subordinate solution is unique up to a constant. Moreover, if a solution u of τ u = λu, λ ∈ R, is subordinate, then it is real up to a constant, since both the real and the imaginary parts are subordinate. For z ∈ CR we know that there is always a subordinate solution near b, namely ub (z, x). The following result considers the case z ∈ R. Lemma 9.24. Let λ ∈ R. There is a subordinate solution u(λ) near b if and only if there is a sequence εn ↓ 0 such that mb (λ + iεn ) converges to a limit in R ∪ {∞} as n → ∞. Moreover, cos(α)p(a)u (λ, a) + sin(α)u(λ, a) lim mb (λ + iεn ) = (9.91) n→∞ cos(α)u(λ, a) − sin(α)p(a)u (λ, a) in this case (compare (9.52)). Proof. We will consider the number α fixing the boundary condition as a parameter and write sα (z, x), cα (z, x), mb,α , etc., to emphasize the depen- dence on α. Every solution can (up to a constant) be written as sβ (λ, x) for some β ∈ [0, π). But by Lemma 9.23, sβ (λ, x) is subordinate if and only there is a sequence εn ↓ 0 such that limn→∞ mb,β (λ + iεn ) = ∞ and by (9.71) this is the case if and only if cos(α − β)mb,β (λ + iεn ) + sin(α − β) lim mb,α (λ+iεn ) = lim = cot(α−β) n→∞ n→∞ cos(α − β) − sin(α − β)mb,β (λ + iεn ) is a number in R ∪ {∞}. We are interested in N (τ ), the set of all λ ∈ R for which no subordinate solution exists, that is, N (τ ) = {λ ∈ R|No solution of τ u = λu is subordinate at b} (9.92) and the set Sα (τ ) = {λ| s(λ, x) is subordinate at b}. (9.93) From the previous lemma we obtain Corollary 9.25. We have λ ∈ N (τ ) if and only if lim inf Im(mb (λ + iε)) 0 and lim sup |mb (λ + iε)| ∞. ε↓0 ε↓0 Similarly, λ ∈ Sα (τ ) if and only if lim supε↓0 |mb (λ + iε)| = ∞.
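For the free operator on (0,∞) with a Dirichlet condition at 0 one has u_b(z,x) = e^{−√(−z)x}, c_0(z,x) = cosh(√(−z)x) and s_0(z,x) = sinh(√(−z)x)/√(−z), hence m_b(z) = −√(−z). The following short sketch (illustrative only, not part of the development) evaluates the boundary values entering Corollary 9.25 and recovers N(τ) = (0,∞), in accordance with the Example below.

```python
# Numerical illustration (not from the text): for the free half-line operator with a
# Dirichlet condition at 0, m_b(z) = -sqrt(-z).  Corollary 9.25 then gives N(tau) = (0, oo):
# for lam > 0 one has Im m_b(lam + i0) = sqrt(lam) in (0, oo), while for lam < 0 it vanishes.
import numpy as np

def m_b(z):
    return -np.sqrt(-z)        # principal branch, matching the convention arg(z) in (-pi, pi]

eps = 1e-8
for lam in [-4.0, -1.0, 1.0, 4.0, 9.0]:
    m = m_b(lam + 1j * eps)
    print(f"lam = {lam:5.1f}   Im m_b = {m.imag:.6f}   |m_b| = {abs(m):.6f}")
# lam > 0: Im m_b -> sqrt(lam) > 0 and |m_b| < oo, so lam is in N(tau).
# lam < 0: Im m_b -> 0, so lam is not in N(tau).
```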
  • 221. 9.6. Spectral transformations II 209 Remark 9.26. Since the set, for which the limit limε↓0 mb (λ + iε) does not exist, is of zero spectral and Lebesgue measure (Corollary 3.25), changing the lim in (9.90) to a lim inf will affect N (τ ) only on such a set (which is irrelevant for our purpose). Moreover, by (9.71) the set where the limit exists (finitely or infinitely) is independent of the boundary condition α. Then, as a consequence of the previous corollary, we have Theorem 9.27. The set N (τ ) ⊆ Mac is a minimal support for the absolutely continuous spectrum of H. In particular, ess σac (H) = N (τ ) . (9.94) Moreover, the set Sα (τ ) ⊇ Ms is a minimal support for the singular spectrum of H. Proof. By our corollary we have N (τ ) ⊆ Mac . Moreover, if λ ∈ Mac N (τ ), then either 0 = lim inf Im(mb ) lim sup Im(mb ) or lim sup Re(mb ) = ∞. The first case can only happen on a set of Lebesgue measure zero by Theo- rem 3.23 and the same is true for the second by Corollary 3.25. Similarly, by our corollary we also have Sα (τ ) ⊇ Ms and λ ∈ Sα (τ )Ms happens precisely when lim sup Re(mb ) = ∞ which can only happen on a set of Lebesgue measure zero by Corollary 3.25. Note that if (λ1 , λ2 ) ⊆ N (τ ), then the spectrum of any self-adjoint extension H of τ is purely absolutely continuous in the interval (λ1 , λ2 ). 2 d Example. Consider H0 = − dx2 on (0, ∞) with a Dirichlet boundary con- dition at x = 0. Then it is easy to check H0 ≥ 0 and N (τ0 ) = (0, ∞). Hence σac (H0 ) = [0, ∞). Moreover, since the singular spectrum is supported on [0, ∞)N (τ0 ) = {0}, we see σsc (H0 ) = ∅ (since the singular continuous spec- trum cannot be supported on a finite set) and σpp (H0 ) ⊆ {0}. Since 0 is no eigenvalue, we have σpp (H0 ) = ∅. d 2 Problem 9.16. Determine the spectrum of H0 = − dx2 on (0, ∞) with a general boundary condition (9.44) at a = 0. 9.6. Spectral transformations II In Section 9.3 we have looked at the case of one regular endpoint. In this section we want to remove this restriction. In the case of a regular endpoint (or more generally an l.c. endpoint), the choice of u(λ, x) in Lemma 9.13 was dictated by the fact that u(λ, x) is required to satisfy the boundary condition at the regular (l.c.) endpoint. We begin by showing that in the general case we can choose any pair of linearly independent solutions. We will choose
  • 222. 210 9. One-dimensional Schr¨dinger operators o some arbitrary point c ∈ I and two linearly independent solutions according to the initial conditions c(z, c) = 1, p(c)c (z, c) = 0, s(z, c) = 0, p(c)s (z, c) = 1. (9.95) We will abbreviate c(z, x) s(z, x) = . (9.96) s(z, x) Lemma 9.28. There is measure dµ(λ) and a nonnegative matrix R(λ) with trace one such that U : L2 (I, r dx) → L2 (R, R dµ) b (9.97) f (x) → a s(λ, x)f (x)r(x) dx is a spectral mapping as in Lemma 9.13. As before, the integral has to be b d understood as a dx = limc↓a,d↑b c dx with limit taken in L2 (R, R dµ), where L2 (R, R dµ) is the Hilbert space of all C2 -valued measurable functions with scalar product f, g = f ∗ Rg dµ. (9.98) R The inverse is given by (U −1 F )(x) = s(λ, x)R(λ)F (λ)dµ(λ). (9.99) R Proof. Let U0 be a spectral transformation as in Lemma 9.13 with corre- sponding real solutions uj (λ, x) and measures dµj (x), 1 ≤ j ≤ k. Without loss of generality we can assume k = 2 since we can always choose dµ2 = 0 and u2 (λ, x) such that u1 and u2 are linearly independent. Now define the 2 × 2 matrix C(λ) via u1 (λ, x) c(λ, x) = C(λ) u2 (λ, x) s(λ, x) and note that C(λ) is nonsingular since u1 , u2 as well as s, c are linearly independent. Set d˜ = dµ1 + dµ2 . Then dµj = rj d˜ and we can introduce R = µ µ ˜ ˜ ∗ r1 0 C. By construction R is a (symmetric) nonnegative matrix. More- C 0 r2 ˜ over, since C(λ) is nonsingular, tr(R) is positive a.e. with respect to µ. Thus ˜ we can set R = tr(R) ˜ ˜ ˜ −1 R and dµ = tr(R)−1 d˜ .µ This matrix gives rise to an operator C : L2 (R, R dµ) → L2 (R, dµj ), F (λ) → C(λ)F (λ), j which, by our choice of R dµ, is norm preserving. By CU = U0 it is onto and hence it is unitary (this also shows that L2 (R, R dµ) is a Hilbert space, i.e., complete).
  • 223. 9.6. Spectral transformations II 211 It is left as an exercise to check that C maps multiplication by λ in L2 (R, R dµ) to multiplication by λ in 2 j L (R, dµj ) and the formula for U −1 . Clearly the matrix-valued measure R dµ contains all the spectral in- formation of A. Hence it remains to relate it to the resolvent of A as in Section 9.3 For our base point x = c there are corresponding Weyl m-functions ma (z) and mb (z) such that ua (z) = c(z, x) − ma (z)s(z, x), ub (z) = c(z, x) + mb (z)s(z, x). (9.100) The different sign in front of ma (z) is introduced such that ma (z) will again be a Herglotz function. In fact, this follows using reflection at c, x − c → −(x − c), which will interchange the roles of ma (z) and mb (z). In particular, all considerations from Section 9.3 hold for ma (z) as well. Furthermore, we will introduce the Weyl M -matrix 1 −1 (ma (z) − mb (z))/2 M (z) = . ma (z) + mb (z) (ma (z) − mb (z))/2 ma (z)mb (z) (9.101) Note det(M (z)) = − 1 . Since 4 p(c)ua (z, c) p(c)ub (z, c) ma (z) = − and mb (z) = , (9.102) ua (z, c) ub (z, c) it follows that W (ua (z), ub (z)) = ma (z) + mb (z) and M (z) = G(z, x, x) (p(x)∂x + p(y)∂y )G(z, x, y)/2 lim , x,y→c (p(x)∂x + p(y)∂y )G(z, x, y)/2 p(x)∂x p(y)∂y G(z, x, y) (9.103) where G(z, x, y) is the Green function of A. The limit is necessary since ∂x G(z, x, y) has different limits as y → x from y x, respectively, y x. We begin by showing Lemma 9.29. Let U be the spectral mapping from the previous lemma. Then 1 (U G(z, x, .))(λ) = s(λ, x), λ−z 1 (U p(x)∂x G(z, x, .))(λ) = p(x)s (λ, x) (9.104) λ−z for every x ∈ (a, b) and every z ∈ ρ(A).
  • 224. 212 9. One-dimensional Schr¨dinger operators o Proof. First of all note that G(z, x, .) ∈ L2 ((a, b), r dx) for every x ∈ (a, b) and z ∈ ρ(A). Moreover, from RA (z)f = U −1 λ−z U f we have 1 b 1 G(z, x, y)f (y)r(y)dy = s(λ, x)R(λ)F (λ)dµ(λ) a R λ−z where F = U f . Now proceed as in the proof of Lemma 9.15. With the aid of this lemma we can now show Theorem 9.30. The Weyl M -matrix is given by 1 λ M (z) = D + − R(λ)dµ(λ), Djk ∈ R, (9.105) R λ − z 1 + λ2 and 1 D = Re(M (i)), R(λ)dµ(λ) = Im(M (i)), (9.106) R 1 + λ2 where 1 1 Re(M (z)) = M (z) + M ∗ (z) , Im(M (z)) = M (z) − M ∗ (z) . (9.107) 2 2 Proof. By the previous lemma we have b 1 |G(z, c, y)|2 r(y)dy = R11 (λ)dµ(λ). a R |z − λ|2 Moreover by (9.28), (9.55), and (9.100) we infer b c 1 |G(z, c, y)|2 r(y)dy = |ub (z, c)|2 |ua (z, y)|2 r(y)dy a |W (ua , ub )|2 a b Im(M11 (z)) + |ua (z, c)|2 |ub (z, y)|2 r(y)dy = . c Im(z) Similarly we obtain 1 Im(M22 (z)) R (λ)dµ(λ) = 2 22 R |z − λ| Im(z) and 1 Im(M12 (z)) R (λ)dµ(λ) = 2 12 . R |z − λ| Im(z) Hence the result follows as in the proof of Theorem 9.17. Now we are also able to extend Theorem 9.27. Note that by 1 λ tr(M (z)) = M11 (z) + M22 (z) = d + − dµ(λ) (9.108) R λ − z 1 + λ2 (with d = tr(D) ∈ R) we have that the set Ms = {λ| lim sup Im(tr(M (λ + iε))) = ∞} (9.109) ε↓0
  • 225. 9.6. Spectral transformations II 213 is a support for the singularly continuous part and Mac = {λ|0 lim sup Im(tr(M (λ + iε))) ∞} (9.110) ε↓0 is a minimal support for the absolutely continuous part. Theorem 9.31. The set Na (τ ) ∪ Nb (τ ) ⊆ Mac is a minimal support for the absolutely continuous spectrum of H. In particular, ess σac (H) = Na (τ ) ∪ Nb (τ ) . (9.111) Moreover, the set Sa,α (τ ) ∩ Sb,α (τ ) ⊇ Ms (9.112) α∈[0,π) is a support for the singular spectrum of H. Proof. By Corollary 9.25 we have 0 lim inf Im(ma ) and lim sup |ma | ∞ if and only if λ ∈ Na (τ ) and similarly for mb . Now suppose λ ∈ Na (τ ). Then lim sup |M11 | ∞ since lim sup |M11 | = −1 ∞ is impossible by 0 = lim inf |M11 | = lim inf |ma + mb | ≥ lim inf Im(ma ) 0. Similarly lim sup |M22 | ∞. Moreover, if lim sup |mb | ∞, we also have Im(ma + mb ) lim inf Im(ma ) lim inf Im(M11 ) = lim inf 2 ≥ 0 |ma + mb | lim sup |ma |2 + lim sup |mb |2 and if lim sup |mb | = ∞, we have ma lim inf Im(M22 ) = lim inf Im ≥ lim inf Im(ma ) 0. 1 + ma mb Thus Na (τ ) ⊆ Mac and similarly Nb (τ ) ⊆ Mac . Conversely, let λ ∈ Mac . By Corollary 3.25 we can assume that the limits lim ma and lim mb both exist and are finite after disregarding a set of Lebesgue measure zero. For such λ, lim Im(M11 ) and lim Im(M22 ) both exist and are finite. Moreover, either lim Im(M11 ) 0, in which case lim Im(ma + mb ) 0, or lim Im(M11 ) = 0, in which case |ma |2 Im(mb ) + |mb |2 Im(ma ) 0 lim Im(M22 ) = lim =0 |ma |2 + |mb |2 yields a contradiction. Thus λ ∈ Na (τ ) ∪ Nb (τ ) and the first part is proven. To prove the second part, let λ ∈ Ms . If lim sup Im(M11 ) = ∞, we have lim sup |M11 | = ∞ and thus lim inf |ma + mb | = 0. But this implies that there is some subsequence such that lim mb = − lim ma = cot(α) ∈ R∪{∞}. Similarly, if lim sup Im(M22 ) = ∞, we have lim inf |m−1 +m−1 | = 0 and there a b is some subsequence such that lim m−1 = − lim m−1 = tan(α) ∈ R ∪ {∞}. b a This shows Ms ⊆ α Sa,α (τ ) ∩ Sb,α (τ ).
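As a sanity check of the algebra behind (9.101) and Theorem 9.30, the following sketch (illustrative only) assembles M(z) in the simplest possible case, the free operator on R with base point c = 0, where u_a(z,x) = e^{√(−z)x}, u_b(z,x) = e^{−√(−z)x} and hence, by (9.100), m_a(z) = m_b(z) = −√(−z). It confirms det M(z) = −1/4, that M_11(z) agrees with the diagonal Green's function G_0(z,0,0) = 1/(2√(−z)) as required by (9.103), and that the diagonal entries are Herglotz.

```python
# Quick sketch (not from the text): the Weyl M-matrix (9.101) for the free whole-line
# operator with base point c = 0, built from m_a(z) = m_b(z) = -sqrt(-z).
import numpy as np

def M_matrix(ma, mb):
    return np.array([[-1.0 + 0j, (ma - mb) / 2],
                     [(ma - mb) / 2, ma * mb]]) / (ma + mb)

z = -2.0 + 1.5j                     # a point in the upper half plane
ma = mb = -np.sqrt(-z)
M = M_matrix(ma, mb)
print("det M      =", np.linalg.det(M))           # -0.25
print("M_11       =", M[0, 0])
print("G_0(z,0,0) =", 1 / (2 * np.sqrt(-z)))       # equals M_11
print("Herglotz?  ", M[0, 0].imag > 0, M[1, 1].imag > 0)
```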
  • 226. 214 9. One-dimensional Schr¨dinger operators o Problem 9.17. Show   Im(ma (λ)+mb (λ)) Im(ma (λ)m∗ (λ)) b  |ma (λ)|2 +|m∗ (λ)|2 |ma (λ)|2 +|mb (λ)|2 dλ R(λ)dµac (λ) = b Im(ma (λ)mb (λ)) |ma (λ)| 2 Im(m (λ))+|m (λ)|2 Im(m (λ))  , b b a π |ma (λ)|2 +|mb (λ)|2 |ma (λ)| 2 +|m (λ)|2 b where ma (λ) = limε↓0 ma (λ + iε) and similarly for mb (λ). Moreover, show that the choice of solutions ub (λ, x) c(λ, x) = V (λ) , ua (λ, x) s(λ, x) where 1 1 mb (λ) V (λ) = , ma (λ) + mb (λ) 1 −ma (λ) diagonalizes the absolutely continuous part, 1 Im(ma (λ)) 0 V −1 (λ)∗ R(λ)V (λ)−1 dµac (λ) = dλ. π 0 Im(mb (λ)) 9.7. The spectra of one-dimensional Schr¨dinger operators o In this section we want to look at the case of one-dimensional Schr¨dinger o operators; that is, r = p = 1 on (a, b) = (0, ∞). Recall that d2 H0 = − , D(H0 ) = H 2 (R), (9.113) dx2 is self-adjoint and 2 qH0 (f ) = f , Q(H0 ) = H 1 (R). (9.114) Hence we can try to apply the results from Chapter 6. We begin with a simple estimate: Lemma 9.32. Suppose f ∈ H 1 (0, 1). Then 1 1 1 sup |f (x)|2 ≤ ε |f (x)|2 dx + (1 + ) |f (x)|2 dx (9.115) x∈[0,1] 0 ε 0 for every ε 0. Proof. First note that x 1 |f (x)|2 = |f (c)|2 + 2 Re(f (t)∗ f (t))dt ≤ |f (c)|2 + 2 |f (t)f (t)|dt c 0 1 1 1 ≤ |f (c)|2 + ε |f (t)|2 dt + |f (t)|2 dt 0 ε 0 for any c ∈ [0, 1]. But by the mean value theorem there is a c ∈ (0, 1) such 1 that |f (c)|2 = 0 |f (t)|2 dt.
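The estimate (9.115) can also be confirmed numerically. The following sketch (the test function and the grid are arbitrary choices) evaluates both sides for several values of ε.

```python
# Numerical check (not from the text) of (9.115): sup|f|^2 <= eps*int|f'|^2 + (1+1/eps)*int|f|^2.
import numpy as np

x = np.linspace(0.0, 1.0, 20001)
dx = x[1] - x[0]
f = np.exp(3 * x) * np.sin(7 * x)          # any H^1(0,1) function will do
fp = np.gradient(f, dx)

sup2 = np.max(np.abs(f)) ** 2
int_fp2 = np.sum(np.abs(fp) ** 2) * dx
int_f2 = np.sum(np.abs(f) ** 2) * dx

for eps in [0.1, 0.5, 1.0, 5.0]:
    rhs = eps * int_fp2 + (1 + 1 / eps) * int_f2
    print(f"eps = {eps:4.1f}   sup|f|^2 = {sup2:9.3f} <= {rhs:9.3f} : {sup2 <= rhs}")
```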
  • 227. 9.7. The spectra of one-dimensional Schr¨dinger operators o 215 As a consequence we obtain Lemma 9.33. Suppose q ∈ L2 (R) and loc n+1 sup |q(x)|2 dx ∞. (9.116) n∈Z n Then q is relatively bounded with respect to H0 with bound zero. Similarly, if q ∈ L1 (R) and loc n+1 sup |q(x)|dx ∞. (9.117) n∈Z n Then q is relatively form bounded with respect to H0 with bound zero. n+1 Proof. Let Q be in L2 (R) and abbreviate M = supn∈Z loc n |Q(x)|2 dx. Using the previous lemma, we have for f ∈ H 1 (R) that n+1 2 Qf ≤ |Q(x)f (x)|2 dx ≤ M sup |f (x)|2 n∈Z n n∈Z x∈[n,n+1] n+1 n+1 1 ≤M ε |f (x)|2 dx + (1 + ) |f (x)|2 dx n ε n 1 = M ε f 2 + (1 + ) f 2 . ε Choosing Q = |q| 1/2 , this already proves the form case since f 2 = q (f ). H0 Choosing Q = q and observing qH0 (f ) = f, H0 f ≤ H0 f f for f ∈ H 2 (R) shows the operator case. Hence in both cases H0 + q is a well-defined (semi-bounded) operator defined as operator sum on D(H0 + q) = D(H0 ) = H 2 (R) in the first case and as form sum on Q(H0 + q) = Q(H0 ) = H 1 (R) in the second case. Note also that the first case implies the second one since by Cauchy–Schwarz we have n+1 n+1 |q(x)|dx ≤ |q(x)|2 dx. (9.118) n n This is not too surprising since we already know how to turn H0 + q into a self-adjoint operator without imposing any conditions on q (except for L1 (R)) at all. However, we get at least a simple description of the (form) loc domains and by requiring a bit more, we can even compute the essential spectrum of the perturbed operator. Lemma 9.34. Suppose q ∈ L1 (R). Then the resolvent difference of H0 and H0 + q is trace class. √ Proof. Using G0 (z, x, x) = 1/(2 −z), Lemma 9.12 implies that |q|1/2 RH0 (z) is Hilbert–Schmidt and hence the result follows from Lemma 6.29.
  • 228. 216 9. One-dimensional Schr¨dinger operators o Lemma 9.35. Suppose q ∈ L1 (R) and loc n+1 lim |q(x)|dx = 0. (9.119) |n|→∞ n Then RH0 +q (z) − RH0 (z) is compact and hence σess (H0 + q) = σess (H0 ) = [0, ∞). Proof. By Weyl’s theorem it suffices to show that the resolvent difference is compact. Let qn (x) = q(x)χR[−n,n] (x). Then RH0 +q (z)−RH0 +qn (z) is trace class, which can be shown as in the previous theorem since q−qn has compact support (no information on the corresponding diagonal Green’s function is needed since by continuity it is bounded on every compact set). Moreover, by the proof of Lemma 9.33, qn is form bounded with respect to H0 with m+1 constants a = Mn and b = 2Mn , where Mn = sup|m|≥n m |q(x)|2 dx. Hence by Theorem 6.25 we see RH0 +qn (−λ) = RH0 (−λ)1/2 (1 − Cqn (λ))−1 RH0 (−λ)1/2 , λ 2, with Cqn (λ) ≤ Mn . So we conclude RH0 +qn (−λ) − RH0 (−λ) = −RH0 (−λ)1/2 Cqn (λ)(1 − Cqn (λ))−1 RH0 (−λ)1/2 , λ 2, which implies that the sequence of compact operators RH0 +q (−λ) − RH0 +qn (−λ) converges to RH0 +q (−λ) − RH0 (−λ) in norm, which implies that the limit is also compact and finishes the proof. Using Lemma 6.23, respectively, Corollary 6.27, we even obtain Corollary 9.36. Let q = q1 + q2 where q1 and q2 satisfy the assumptions of Lemma 9.33 and Lemma 9.35, respectively. Then H0 + q1 + q2 is self-adjoint and σess (H0 + q1 + q2 ) = σess (H0 + q1 ). This result applies for example in the case where q2 is a decaying per- turbation of a periodic potential q1 . Finally we turn to the absolutely continuous spectrum. Lemma 9.37. Suppose q = q1 + q2 , where q1 ∈ L1 (0, ∞) and q2 ∈ AC[0, ∞) with q2 ∈ L1 (0, ∞) and limx→∞ q2 (x) = 0. Then there are two solutions u± (λ, x) of τ u = λu, λ 0, of the form u± (λ, x) = (1 + o(1))u0,± (λ, x), u± (λ, x) = (1 + o(1))u0,± (λ, x) (9.120) as x → ∞, where x u0,± (λ, x) = exp ±i λ − q2 (y)dy . (9.121) 0
  • 229. 9.7. The spectra of one-dimensional Schr¨dinger operators o 217 Proof. We will omit the dependence on λ for notational simplicity. More- over, we will choose x so large that Wx (u− , u+ ) = 2i λ − q2 (x) = 0. Write u0,+ (x) u0,+ (x) a+ (x) u(x) = U0 (x)a(x), U0 (x) = , a(x) = . u0,− (x) u0,− (x) a− (x) Then 0 1 u (x) = u(x) q(x) − λ 0 0 0 + a(x) + U0 (x)a (x), q+ (x)u0,+ (x) q− (x)u0,− (x) where q2 (x) q± (x) = q1 (x) ± i . λ − q2 (x) Hence u(x) will solve τ u = λu if 1 q+ (x) q− (x)u0,− (x)2 a (x) = a(x). Wx (u− , u+ ) −q+ (x)u0,+ (x)2 −q− (x) Since the coefficient matrix of this linear system is integrable, the claim follows by a simple application of Gronwall’s inequality. Theorem 9.38 (Weidmann). Let q1 and q2 be as in the previous lemma and suppose q = q1 + q2 satisfies the assumptions of Lemma 9.35. Let H = H0 +q1 +q2 . Then σac (H) = [0, ∞), σsc (H) = ∅, and σp (H) ⊆ (−∞, 0]. Proof. By the previous lemma there is no subordinate solution for λ 0 on (0, ∞) and hence 0 Im(mb (λ+i0)) ∞. Similarly, there is no subordinate solution (−∞, 0) and hence 0 Im(ma (λ + i0)) ∞. Thus the same is true for the diagonal entries Mjj (z) of the Weyl M -matrix, 0 Im(Mjj (λ + i0)) ∞, and hence dµ is purely absolutely continuous on (0, ∞). Since σess (H) = [0, ∞), we conclude σac (H) = [0, ∞) and σsc (H) ⊆ {0}. Since the singular continuous part cannot live on a single point, we are done. Note that the same results hold for operators on [0, ∞) rather than R. Moreover, observe that the conditions from Lemma 9.37 are only imposed near +∞ but not near −∞. The conditions from Lemma 9.35 are only used to ensure that there is no essential spectrum in (−∞, 0). Having dealt with the essential spectrum, let us next look at the discrete spectrum. In the case of decaying potentials, as in the previous theorem, one key question is if the number of eigenvalues below the essential spectrum is finite or not. As preparation, we shall prove Sturm’s comparison theorem:
  • 230. 218 9. One-dimensional Schr¨dinger operators o Theorem 9.39 (Sturm). Let τ0 , τ1 be associated with q0 ≥ q1 on (a, b), respectively. Let (c, d) ⊆ (a, b) and τ0 u = 0, τ1 v = 0. Suppose at each end of (c, d) either Wx (u, v) = 0 or, if c, d ∈ (a, b), u = 0. Then v is either a multiple of u in (c, d) or v must vanish at some point in (c, d). Proof. By decreasing d to the first zero of u in (c, d] (and perhaps flipping signs), we can suppose u 0 on (c, d). If v has no zeros in (c, d), we can suppose v 0 on (c, d) again by perhaps flipping signs. At each endpoint, W (u, v) vanishes or else u = 0, v 0, and u (c) 0 (or u (d) 0). Thus, Wc (u, v) ≤ 0, Wd (u, v) ≥ 0. But this is inconsistent with d Wd (u, v) − Wc (u, v) = (q0 (t) − q1 (t))u(t)v(t) dt, (9.122) c unless both sides vanish. In particular, choosing q0 = q − λ0 and q1 = q − λ1 , this result holds for solutions of τ u = λ0 u and τ v = λ1 v. Now we can prove Theorem 9.40. Suppose q satisfies (9.117) such that H is semi-bounded and Q(H) = H 1 (R). Let λ0 · · · λn · · · be its eigenvalues below the essential spectrum and ψ0 , . . . , ψn , . . . the corresponding eigenfunctions. Then ψn has n zeros. Proof. We first prove that ψn has at least n zeros and then that if ψn has m zeros, then (−∞, λn ] has at least (m + 1) eigenvalues. If ψn has m zeros at x1 , x2 , . . . , xm and we let x0 = a, xm+1 = b, then by Theorem 9.39, ψn+1 must have at least one zero in each of (x0 , x1 ), (x1 , x2 ), . . . , (xm , xm+1 ); that is, ψn+1 has at least m + 1 zeros. It follows by induction that ψn has at least n zeros. On the other hand, if ψn has m zeros x1 , . . . , xm , define ψn (x), xj ≤ x ≤ xj+1 , ηj (x) = j = 0, . . . , m, (9.123) 0 otherwise, where we set x0 = −∞ and xm+1 = ∞. Then ηj is in the form domain m of H and satisfies ηj , Hηj = λn ηj 2 . Hence if η = j=0 cj ηj , then η, Hη = λn η 2 and it follows by Theorem 4.12 (i) that there are at least m + 1 eigenvalues in (−∞, λn ]. Note that by Theorem 9.39, the zeros of ψn interlace the zeros of ψn . The second part of the proof also shows Corollary 9.41. Let H be as in the previous theorem. If the Weyl solution u± (λ, .) has m zeros, then dim Ran(−∞,λ) (H) ≥ m. In particular, λ below the spectrum of H implies that u± (λ, .) has no zeros.
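The oscillation count of Theorem 9.40 is easy to observe numerically. The sketch below (illustrative only; the well −12 sech²(x), the box size and the grid are ad hoc choices) diagonalizes a finite-difference approximation of −d²/dx² + q and counts the sign changes of the eigenfunctions below the essential spectrum; the n-th one should have exactly n zeros.

```python
# Finite-difference illustration (not from the text) of Theorem 9.40.
import numpy as np
from scipy.linalg import eigh_tridiagonal

L, N = 20.0, 4000
x = np.linspace(-L, L, N + 2)[1:-1]        # interior grid points, Dirichlet at +-L
h = x[1] - x[0]
q = -12.0 / np.cosh(x) ** 2                # sample well with a few bound states

diag = 2.0 / h ** 2 + q
off = -np.ones(N - 1) / h ** 2
w, v = eigh_tridiagonal(diag, off, select='v', select_range=(-15.0, 0.0))

for n in range(len(w)):
    psi = v[:, n]
    psi = psi[np.abs(psi) > 1e-8 * np.abs(psi).max()]   # discard noise-level tails
    zeros = int(np.sum(psi[:-1] * psi[1:] < 0))
    print(f"n = {n}:  lambda_n = {w[n]:8.4f}   zeros of psi_n = {zeros}")
```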
  • 231. 9.7. The spectra of one-dimensional Schr¨dinger operators o 219 The equation (τ −λ)u is called oscillating if one solution has an infinite number of zeros. Theorem 9.39 implies that this is then true for all solu- tions. By our previous considerations this is the case if and only if σ(H) has infinitely many points below λ. Hence it remains to find a good oscillation criterion. Theorem 9.42 (Kneser). Consider q on (0, ∞). Then 1 lim inf x2 q(x) − implies nonoscillation of τ near ∞ (9.124) x→∞ 4 and 1 lim sup x2 q(x) − implies oscillation of τ near ∞. (9.125) x→∞ 4 Proof. The key idea is that the equation d2 µ τ0 = − 2 + 2 dx x is of Euler type. Hence it is explicitly solvable with a fundamental system given by q 1 ± µ+ 1 x2 4. There are two cases to distinguish. If µ ≥ −1/4, all solutions are nonoscil- latory. If µ −1/4, one has to take real/imaginary parts and all solutions are oscillatory. Hence a straightforward application of Sturm’s comparison theorem between τ0 and τ yields the result. Corollary 9.43. Suppose q satisfies (9.117). Then H has finitely many eigenvalues below the infimum of the essential spectrum 0 if 1 lim inf x2 q(x) − (9.126) |x|→∞ 4 and infinitely many if 1 lim sup x2 q(x) − . (9.127) |x|→∞ 4 Problem 9.18. Show that if q is relatively bounded with respect to H0 , then necessarily q ∈ L2 (R) and (9.116) holds. Similarly, if q is relatively form loc bounded with respect to H0 , then necessarily q ∈ L1 (R) and (9.117) holds. loc d 2 Problem 9.19. Suppose q ∈ L1 (R) and consider H = − dx2 + q. Show that inf σ(H) ≤ R q(x)dx. In particular, there is at least one eigenvalue ∞ below the essential spectrum if R q(x)dx 0. (Hint: Let ϕ ∈ Cc (R) with ϕ(x) = 1 for |x| ≤ 1 and investigate qH (ϕn ), where ϕn (x) = ϕ(x/n).)
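Kneser's criterion can be illustrated via the oscillation characterization above. The sketch below (the potentials, initial data and parameters are arbitrary choices) integrates −u'' + qu = 0 for q(x) = −a/(1+x²), for which x²q(x) → −a, and counts the zeros of u up to a growing cutoff X; only in the oscillating regime a > 1/4 does the count keep increasing (roughly linearly in log X).

```python
# Numerical illustration (not from the text) of Kneser's criterion (Theorem 9.42).
import numpy as np
from scipy.integrate import solve_ivp

def count_zeros(a, X):
    rhs = lambda x, y: [y[1], (-a / (1 + x ** 2)) * y[0]]   # y = (u, u'), u'' = q u
    event = lambda x, y: y[0]                               # record the zeros of u
    sol = solve_ivp(rhs, (1.0, X), [1.0, 0.0], events=event,
                    rtol=1e-9, atol=1e-12, max_step=X / 50)
    return len(sol.t_events[0])

for X in [1e2, 1e4, 1e6, 1e8]:
    print(f"X = {X:8.0e}   zeros (a = 5):{count_zeros(5.0, X):4d}"
          f"   zeros (a = 0.1):{count_zeros(0.1, X):4d}")
```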
Chapter 10

One-particle Schrödinger operators

10.1. Self-adjointness and spectrum

Our next goal is to apply these results to Schrödinger operators. The Hamiltonian of one particle in d dimensions is given by
    H = H_0 + V,    (10.1)
where V: R^d → R is the potential energy of the particle. We are mainly interested in the case 1 ≤ d ≤ 3 and want to find classes of potentials which are relatively bounded, respectively, relatively compact. To do this, we need a better understanding of the functions in the domain of H_0.

Lemma 10.1. Suppose n ≤ 3 and ψ ∈ H²(Rⁿ). Then ψ ∈ C∞(Rⁿ) and for any a > 0 there is a b > 0 such that
    ‖ψ‖_∞ ≤ a ‖H_0 ψ‖ + b ‖ψ‖.    (10.2)

Proof. The important observation is that (p² + γ²)^{−1} ∈ L²(Rⁿ) if n ≤ 3. Hence, since (p² + γ²)ψ̂ ∈ L²(Rⁿ), the Cauchy–Schwarz inequality
    ‖ψ̂‖_1 = ‖(p² + γ²)^{−1} (p² + γ²)ψ̂(p)‖_1 ≤ ‖(p² + γ²)^{−1}‖ ‖(p² + γ²)ψ̂(p)‖
shows ψ̂ ∈ L¹(Rⁿ). But now everything follows from the Riemann–Lebesgue lemma, that is,
    ‖ψ‖_∞ ≤ (2π)^{−n/2} ‖(p² + γ²)^{−1}‖ (‖p²ψ̂(p)‖ + γ²‖ψ̂(p)‖)
          = (γ/(2π))^{n/2} ‖(p² + 1)^{−1}‖ (γ^{−2}‖H_0 ψ‖ + ‖ψ‖),
which finishes the proof.
  • 234. 222 10. One-particle Schr¨dinger operators o Now we come to our first result. Theorem 10.2. Let V be real-valued and V ∈ L∞ (Rn ) if n 3 and V ∈ ∞ L∞ (Rn ) + L2 (Rn ) if n ≤ 3. Then V is relatively compact with respect to H0 . ∞ In particular, H = H0 + V, D(H) = H 2 (Rn ), (10.3) is self-adjoint, bounded from below and σess (H) = [0, ∞). (10.4) ∞ Moreover, Cc (Rn ) is a core for H. Proof. Our previous lemma shows D(H0 ) ⊆ D(V ). Moreover, invoking Lemma 7.11 with f (p) = (p2 − z)−1 and g(x) = V (x) (note that f ∈ L∞ (Rn ) ∩ L2 (Rn ) for n ≤ 3) shows that V is relatively compact. Since ∞ ∞ Cc (Rn ) is a core for H0 by Lemma 7.9, the same is true for H by the Kato–Rellich theorem. ∞ Observe that since Cc (Rn ) ⊆ D(H0 ), we must have V ∈ L2 (Rn ) if loc D(V ) ⊆ D(H0 ). 10.2. The hydrogen atom We begin with the simple model of a single electron in R3 moving in the external potential V generated by a nucleus (which is assumed to be fixed at the origin). If one takes only the electrostatic force into account, then V is given by the Coulomb potential and the corresponding Hamiltonian is given by γ H (1) = −∆ − , D(H (1) ) = H 2 (R3 ). (10.5) |x| If the potential is attracting, that is, if γ 0, then it describes the hydrogen atom and is probably the most famous model in quantum mechanics. 1 We have chosen as domain D(H (1) ) = D(H0 ) ∩ D( |x| ) = D(H0 ) and by Theorem 10.2 we conclude that H (1) is self-adjoint. Moreover, Theorem 10.2 also tells us σess (H (1) ) = [0, ∞) (10.6) and that H (1) is bounded from below, E0 = inf σ(H (1) ) −∞. (10.7) If γ ≤ 0, we have H (1) ≥ 0 and hence E0 = 0, but if γ 0, we might have E0 0 and there might be some discrete eigenvalues below the essential spectrum.
  • 235. 10.2. The hydrogen atom 223 In order to say more about the eigenvalues of H (1) , we will use the fact that both H0 and V (1) = −γ/|x| have a simple behavior with respect to scaling. Consider the dilation group U (s)ψ(x) = e−ns/2 ψ(e−s x), s ∈ R, (10.8) which is a strongly continuous one-parameter unitary group. The generator can be easily computed: 1 in Dψ(x) = (xp + px)ψ(x) = (xp − )ψ(x), ψ ∈ S(Rn ). (10.9) 2 2 Now let us investigate the action of U (s) on H (1) : H (1) (s) = U (−s)H (1) U (s) = e−2s H0 + e−s V (1) , D(H (1) (s)) = D(H (1) ). (10.10) Now suppose Hψ = λψ. Then ψ, [U (s), H]ψ = U (−s)ψ, Hψ − Hψ, U (s)ψ = 0 (10.11) and hence 1 H − H(s) 0 = lim ψ, [U (s), H]ψ = lim U (−s)ψ, ψ s→0 s s→0 s = ψ, (2H0 + V (1) )ψ . (10.12) Thus we have proven the virial theorem. Theorem 10.3. Suppose H = H0 + V with U (−s)V U (s) = e−s V . Then any normalized eigenfunction ψ corresponding to an eigenvalue λ satisfies 1 λ = − ψ, H0 ψ = ψ, V ψ . (10.13) 2 In particular, all eigenvalues must be negative. This result even has some further consequences for the point spectrum of H (1) . Corollary 10.4. Suppose γ 0. Then σp (H (1) ) = σd (H (1) ) = {Ej−1 }j∈N0 , E0 Ej Ej+1 0, (10.14) with limj→∞ Ej = 0. ∞ Proof. Choose ψ ∈ Cc (R{0}) and set ψ(s) = U (−s)ψ. Then ψ(s), H (1) ψ(s) = e−2s ψ, H0 ψ + e−s ψ, V (1) ψ which is negative for s large. Now choose a sequence sn → ∞ such that we have supp(ψ(sn )) ∩ supp(ψ(sm )) = ∅ for n = m. Then Theorem 4.12 (i) shows that rank(PH (1) ((−∞, 0))) = ∞. Since each eigenvalue Ej has finite multiplicity (it lies in the discrete spectrum), there must be an infinite number of eigenvalues which accumulate at 0.
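The virial theorem can be checked against the explicit hydrogen ground state ψ_{000} ∝ e^{−γr/2} obtained in Theorem 10.9 below. The following sketch (γ is an arbitrary coupling; the radial reduction u(r) = rR(r) from Section 10.4 is used) verifies λ = −⟨ψ, H_0 ψ⟩ = ½⟨ψ, Vψ⟩ = −γ²/4 by quadrature.

```python
# Numerical check (not from the text) of the virial theorem for the hydrogen ground state.
# In the radial l = 0 picture with u(r) = r R(r):  <H0> = int |u'|^2 dr,
# <V> = -gamma int |u|^2 / r dr, and the virial theorem predicts E = -<H0> = <V>/2.
import numpy as np
from scipy.integrate import quad

gamma = 1.7                                     # arbitrary coupling constant
u  = lambda r: r * np.exp(-gamma * r / 2)       # unnormalized radial ground state
du = lambda r: (1 - gamma * r / 2) * np.exp(-gamma * r / 2)

norm = quad(lambda r: u(r) ** 2, 0, np.inf)[0]
T = quad(lambda r: du(r) ** 2, 0, np.inf)[0] / norm                   # <H0>
V = -gamma * quad(lambda r: u(r) ** 2 / r, 0, np.inf)[0] / norm       # <V>

print("E         =", T + V)              # -gamma^2/4
print("-<H0>     =", -T)                 # same value (virial theorem)
print("<V>/2     =", V / 2)              # same value (virial theorem)
print("-g^2/4    =", -gamma ** 2 / 4)
```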
  • 236. 224 10. One-particle Schr¨dinger operators o If γ ≤ 0, we have σd (H (1) ) = ∅ since H (1) ≥ 0 in this case. Hence we have obtained quite a complete picture of the spectrum of H (1) . Next, we could try to compute the eigenvalues of H (1) (in the case γ 0) by solving the corresponding eigenvalue equation, which is given by the partial differential equation γ − ∆ψ(x) − ψ(x) = λψ(x). (10.15) |x| For a general potential this is hopeless, but in our case we can use the rota- tional symmetry of our operator to reduce our partial differential equation to ordinary ones. First of all, it suggests itself to switch from Cartesian coordinates x = (x1 , x2 , x3 ) to spherical coordinates (r, θ, ϕ) defined by x1 = r sin(θ) cos(ϕ), x2 = r sin(θ) sin(ϕ), x3 = r cos(θ), (10.16) where r ∈ [0, ∞), θ ∈ [0, π], and ϕ ∈ (−π, π]. This change of coordinates corresponds to a unitary transform L2 (R3 ) → L2 ((0, ∞), r2 dr) ⊗ L2 ((0, π), sin(θ)dθ) ⊗ L2 ((0, 2π), dϕ). (10.17) In these new coordinates (r, θ, ϕ) our operator reads 1 ∂ 2∂ 1 γ H (1) = − 2 ∂r r + 2 L2 + V (r), V (r) = − , (10.18) r ∂r r r where 1 ∂ ∂ 1 ∂2 L2 = L2 + L2 + L2 = − 1 2 3 sin(θ) − . (10.19) sin(θ) ∂θ ∂θ sin(θ)2 ∂ϕ2 (Recall the angular momentum operators Lj from Section 8.2.) Making the product ansatz (separation of variables) ψ(r, θ, ϕ) = R(r)Θ(θ)Φ(ϕ), (10.20) we obtain the three Sturm–Liouville equations 1 d 2d l(l + 1) − 2 dr r + + V (r) R(r) = λR(r), r dr r2 1 d d m2 − sin(θ) + Θ(θ) = l(l + 1)Θ(θ), sin(θ) dθ dθ sin(θ) d2 − 2 Φ(ϕ) = m2 Φ(ϕ). (10.21) dϕ The form chosen for the constants l(l + 1) and m2 is for convenience later on. These equations will be investigated in the following sections. Problem 10.1. Generalize the virial theorem to the case U (−s)V U (s) = e−αs V , α ∈ R{0}. What about Corollary 10.4?
  • 237. 10.3. Angular momentum 225 10.3. Angular momentum We start by investigating the equation for Φ(ϕ) which is associated with the Sturm–Liouville equation τ Φ = −Φ , I = (0, 2π). (10.22) Since we want ψ defined via (10.20) to be in the domain of H0 (in particular continuous), we choose periodic boundary conditions the Sturm–Liouville equation AΦ = τ Φ, D(A) = {Φ ∈ L2 (0, 2π)| Φ ∈ AC 1 [0, 2π], Φ(0) = Φ(2π), Φ (0) = Φ (2π)}. (10.23) From our analysis in Section 9.1 we immediately obtain Theorem 10.5. The operator A defined via (10.22) is self-adjoint. Its spectrum is purely discrete, that is, σ(A) = σd (A) = {m2 |m ∈ Z}, (10.24) and the corresponding eigenfunctions 1 Φm (ϕ) = √ eimϕ , m ∈ Z, (10.25) 2π form an orthonormal basis for L2 (0, 2π). Note that except for the lowest eigenvalue, all eigenvalues are twice de- generate. We note that this operator is essentially the square of the angular mo- mentum in the third coordinate direction, since in polar coordinates 1 ∂ L3 = . (10.26) i ∂ϕ Now we turn to the equation for Θ(θ): 1 d d m2 τm Θ(θ) = − sin(θ) + Θ(θ), I = (0, π), m ∈ N0 . sin(θ) dθ dθ sin(θ) (10.27) For the investigation of the corresponding operator we use the unitary transform L2 ((0, π), sin(θ)dθ) → L2 ((−1, 1), dx), Θ(θ) → f (x) = Θ(arccos(x)). (10.28) The operator τ transforms to the somewhat simpler form d d m2 τm = − (1 − x2 ) − . (10.29) dx dx 1 − x2
  • 238. 226 10. One-particle Schr¨dinger operators o The corresponding eigenvalue equation τm u = l(l + 1)u (10.30) is the associated Legendre equation. For l ∈ N0 it is solved by the associated Legendre functions [1, (8.6.6)] dm Plm (x) = (−1)m (1 − x2 )m/2 Pl (x), |m| ≤ l, (10.31) dxm where the 1 dl 2 Pl (x) = (x − 1)l , l ∈ N0 , (10.32) 2l l! dxl are the Legendre polynomials [1, (8.6.18)] (Problem 10.2). Moreover, note that the Pl (x) are (nonzero) polynomials of degree l and since τm depends only on m2 , there must be a relation between Plm (x) and Pl−m (x). In fact, (Problem 10.3) (l + m)! m Pl−m (x) = (−1)m P . (10.33) (l − m)! l A second, linearly independent, solution is given by x dt Qm (x) = Plm (x) l . (10.34) 0 (1 − t2 )Plm (t)2 dt x In fact, for every Sturm–Liouville equation, v(x) = u(x) p(t)u(t)2 satisfies τ v = 0 whenever τ u = 0. Now fix l = 0 and note P0 (x) = 1. For m = 0 we have Q0 = arctanh(x) ∈ L2 and so τ0 is l.c. at both endpoints. For m 0 0 we have Qm = (x ± 1)−m/2 (C + O(x ± 1)) which shows that it is not square 0 integrable. Thus τm is l.c. for m = 0 and l.p. for m 0 at both endpoints. In order to make sure that the eigenfunctions for m = 0 are continuous (such that ψ defined via (10.20) is continuous), we choose the boundary condition generated by P0 (x) = 1 in this case: Am f = τm f, D(Am ) = {f ∈ L2 (−1, 1)| f ∈ AC 1 (−1, 1), τm f ∈ L2 (−1, 1), (10.35) limx→±1 (1 − x2 )f (x) = 0 if m = 0}. Theorem 10.6. The operator Am , m ∈ N0 , defined via (10.35) is self- adjoint. Its spectrum is purely discrete, that is, σ(Am ) = σd (Am ) = {l(l + 1)|l ∈ N0 , l ≥ m}, (10.36) and the corresponding eigenfunctions 2l + 1 (l − m)! m ul,m (x) = P (x), l ∈ N0 , l ≥ m, (10.37) 2 (l + m)! l form an orthonormal basis for L2 (−1, 1).
  • 239. 10.3. Angular momentum 227 Proof. By Theorem 9.6, Am is self-adjoint. Moreover, Plm is an eigenfunc- tion corresponding to the eigenvalue l(l + 1) and it suffices to show that the Plm form a basis. To prove this, it suffices to show that the functions Plm (x) are dense. Since (1 − x2 ) 0 for x ∈ (−1, 1), it suffices to show that the functions (1 − x2 )−m/2 Plm (x) are dense. But the span of these functions contains every polynomial. Every continuous function can be approximated by polynomials (in the sup norm, Theorem 0.15, and hence in the L2 norm) and since the continuous functions are dense, so are the polynomials. For the normalization of the eigenfunctions see Problem 10.7, respec- tively, [1, (8.14.13)]. Returning to our original setting, we conclude that the 2l + 1 (l + m)! m Θm (θ) = l P (cos(θ)), |m| ≤ l, (10.38) 2 (l − m)! l form an orthonormal basis for L2 ((0, π), sin(θ)dθ) for any fixed m ∈ N0 . Theorem 10.7. The operator L2 on L2 ((0, π), sin(θ)dθ) ⊗ L2 ((0, 2π)) has a purely discrete spectrum given σ(L2 ) = {l(l + 1)|l ∈ N0 }. (10.39) The spherical harmonics 2l + 1 (l − m)! m Ylm (θ, ϕ) = Θm (θ)Φm (ϕ) = l P (cos(θ))eimϕ , |m| ≤ l, 4π (l + m)! l (10.40) form an orthonormal basis and satisfy L2 Ylm = l(l + 1)Ylm and L3 Ylm = mYlm . Proof. Everything follows from our construction, if we can show that the Ylm form a basis. But this follows as in the proof of Lemma 1.10. Note that transforming the Ylm back to cartesian coordinates gives m 2l + 1 (l − |m|)! ˜ m x3 x1 ± ix2 Yl±m (x) = (−1)m P ( ) , r = |x|, 4π (l + |m|)! l r r (10.41) ˜ where Plm is a polynomial of degree l − m given by dl+m Plm (x) = (1 − x2 )−m/2 Plm (x) = l+m (1 − x2 )l . ˜ (10.42) dx In particular, the Ylm are smooth away from the origin and by construction they satisfy l(l + 1) m − ∆Ylm = Yl . (10.43) r2
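The normalization constants in (10.37) and (10.40) can be cross-checked against a standard implementation of the associated Legendre functions. The sketch below (illustrative only; scipy's lpmv is used for P_l^m) verifies numerically that the u_{l,m} are orthonormal in L²(−1,1).

```python
# Numerical cross-check (not from the text) of the normalization (10.37).
import numpy as np
from math import factorial
from scipy.special import lpmv
from scipy.integrate import quad

def u(l, m, x):
    c = np.sqrt((2 * l + 1) / 2 * factorial(l - m) / factorial(l + m))
    return c * lpmv(m, l, x)            # normalized associated Legendre function

m = 2
for l in range(m, m + 4):
    for lp in range(m, m + 4):
        val, _ = quad(lambda x: u(l, m, x) * u(lp, m, x), -1, 1)
        print(f"<u_{l}{m}, u_{lp}{m}> = {val:8.5f}", end="   ")
    print()                              # identity matrix up to quadrature error
```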
  • 240. 228 10. One-particle Schr¨dinger operators o Problem 10.2. Show that the associated Legendre functions satisfy the differential equation (10.30). (Hint: Start with the Legendre polynomials (10.32) which correspond to m = 0. Set v(x) = (x2 − 1)l and observe (x2 − 1)v (x) = 2lx v(x). Then differentiate this identity l + 1 times using Leibniz’s rule. For the case of the associated Legendre functions, substitute v(x) = (1 − x2 )m/2 u(x) in (10.30) and differentiate the resulting equation once.) Problem 10.3. Show (10.33). (Hint: Write (x2 − 1)l = (x − 1)l (x + 1)l and use Leibniz’s rule.) Problem 10.4 (Orthogonal polynomials). Suppose the monic polynomials Pj (x) = xj + βj xj−1 + . . . are orthogonal with respect to the weight function w(x): b 2 αj , i = j, Pi (x)Pj (x)w(x)dx = a 0, otherwise. Note that they are uniquely determined by the Gram–Schmidt procedure. ¯ −1 Let Pj (x) = αj P (x) and show that they satisfy the three term recurrence relation ¯ ¯ ¯ ¯ aj Pj+1 (x) + bj Pj (x) + aj−1 Pj−1 (x) = xPj (x), where b b aj = ¯ ¯ xPj+1 (x)Pj (x)w(x)dx, bj = ¯ xPj (x)2 w(x)dx. a a Moreover, show αj+1 aj = , bj = βj − βj+1 . αj (Note that w(x)dx could be replaced by a measure dµ(x).) Problem 10.5. Consider the orthogonal polynomials with respect to the weight function w(x) as in the previous problem. Suppose |w(x)| ≤ Ce−k|x| for some C, k 0. Show that the orthogonal polynomials are dense in L2 (R, w(x)dx). (Hint: It suffices to show that f (x)xj w(x)dx = 0 for all j ∈ N0 implies f = 0. Consider the Fourier transform of f (x)w(x) and note that it has an analytic extension by Problem 7.11. Hence this Fourier transform will be zero if, e.g., all derivatives at p = 0 are zero (cf. Prob- lem 7.3).) Problem 10.6. Show l/2 (−1)k (2l − 2k)! Pl (x) = xl−2k . 2l k!(l − k)!(l − 2k)! k=0
  • 241. 10.4. The eigenvalues of the hydrogen atom 229 Moreover, by Problem 10.4 there is a recurrence relation of the form Pl+1 (x) = (˜l + ˜l x)Pl (x) + cl Pl−1 (x). Find the coefficients by comparing the highest a b ˜ powers in x and conclude (l + 1)Pl+1 (x) = (2l + 1)xPl (x) − lPl−1 . Use this to prove 1 2 Pl (x)2 dx = . −1 2l + 1 Problem 10.7. Prove 1 2 (l + m)! Plm (x)2 dx = . −1 2l + 1 (l − m)! 1 (Hint: Use (10.33) to compute −1 Plm (x)Pl−m (x)dx by integrating by parts until you can use the case m = 0 from the previous problem.) 10.4. The eigenvalues of the hydrogen atom Now we want to use the considerations from the previous section to decom- pose the Hamiltonian of the hydrogen atom. In fact, we can even admit any spherically symmetric potential V (x) = V (|x|) with V (r) ∈ L∞ (R) + L2 ((0, ∞), r2 dr) ∞ (10.44) such that Theorem 10.2 holds. The important observation is that the spaces Hl,m = {ψ(x) = R(r)Ylm (θ, ϕ)|R(r) ∈ L2 ((0, ∞), r2 dr)} (10.45) with corresponding projectors 2π π Plm ψ(r, θ, ϕ) = ψ(r, θ , ϕ )Ylm (θ , ϕ ) sin(θ )dθ dϕ Ylm (θ, ϕ) 0 0 (10.46) reduce our operator H = H0 + V . By Lemma 2.24 it suffices to check ∞ this for H restricted to Cc (R3 ), which is straightforward. Hence, again by Lemma 2.24, H = H0 + V = ˜ Hl , (10.47) l,m where ˜ 1 d 2d l(l + 1) Hl R(r) = τl R(r), ˜ τl = − ˜ r + + V (r), r2 dr dr r2 D(Hl ) ⊆ L2 ((0, ∞), r2 dr). (10.48) Using the unitary transformation L2 ((0, ∞), r2 dr) → L2 ((0, ∞)), R(r) → u(r) = rR(r), (10.49)
  • 242. 230 10. One-particle Schr¨dinger operators o our operator transforms to d2 l(l + 1) Al f = τl f, τl = − + + V (r), dr2 r2 D(Al ) = Plm D(H) ⊆ L2 ((0, ∞)). (10.50) It remains to investigate this operator (that its domain is indeed independent of m follows from the next theorem). Theorem 10.8. The domain of the operator Al is given by D(Al ) = {f ∈ L2 (I)| f, f ∈ AC(I), τ f ∈ L2 (I), (10.51) limr→0 (f (r) − rf (r)) = 0 if l = 0}, where I = (0, ∞). Moreover, σess (Al ) = σac (Al ) = [0, ∞), σsc (Al ) = ∅, σp ⊂ (−∞, 0]. (10.52) Proof. By construction of Al we know that it is self-adjoint and satisfies σess (Al ) ⊆ [0, ∞) (Problem 10.8). By Lemma 9.37 we have (0, ∞) ⊆ N∞ (τl ) and hence Theorem 9.31 implies σac (Al ) = [0, ∞), σsc (Al ) = ∅, and σp ⊂ (−∞, 0]. So it remains to compute the domain. We know at least D(Al ) ⊆ D(τ ) and since D(H) = D(H0 ), it suffices to consider the case V = 0. In this case the solutions of −u (r)+ l(l+1) u(r) = 0 are given by u(r) = αrl+1 +βr−l . r2 Thus we are in the l.p. case at ∞ for any l ∈ N0 . However, at 0 we are in the l.p. case only if l 0; that is, we need an additional boundary condition at 0 if l = 0. Since we need R(r) = u(r) to be bounded (such that (10.20) r is in the domain of H0 , that is, continuous), we have to take the boundary condition generated by u(r) = r. Finally let us turn to some explicit choices for V , where the correspond- ing differential equation can be explicitly solved. The simplest case is V = 0. In this case the solutions of l(l + 1) − u (r) + u(r) = zu(r) (10.53) r2 are given by √ √ u(r) = α z −l/2 r jl ( zr) + β z (l+1)/2 r yl ( zr), (10.54) where jl (r) and yl (r) are the spherical Bessel, respectively, spherical Neumann, functions l π 1 d sin(r) jl (r) = J (r) = (−r)l , 2r l+1/2 r dr r l π 1 d cos(r) yl (r) = Y (r) = −(−r)l . (10.55) 2r l+1/2 r dr r
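The following sketch (an illustration; the values of z, l and r are arbitrary) compares the Rayleigh formulas (10.55) for l = 0, 1 with scipy's spherical Bessel functions and checks the small-r behaviour of z^{−l/2} r j_l(√z r) stated in (10.56) below.

```python
# Consistency check (not from the text) of (10.55) and of the small-r behaviour in (10.56).
import numpy as np
from math import factorial
from scipy.special import spherical_jn

r = np.linspace(0.5, 10, 5)
print(np.allclose(spherical_jn(0, r), np.sin(r) / r))                      # l = 0
print(np.allclose(spherical_jn(1, r), np.sin(r) / r**2 - np.cos(r) / r))   # l = 1

z, l = 2.0, 3
for r0 in [1e-2, 1e-3, 1e-4]:
    ua = z ** (-l / 2) * r0 * spherical_jn(l, np.sqrt(z) * r0)
    lead = 2 ** l * factorial(l) / factorial(2 * l + 1) * r0 ** (l + 1)
    print(r0, ua / lead)                 # ratio tends to 1 as r0 -> 0
```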
  • 243. 10.4. The eigenvalues of the hydrogen atom 231 √ √ Note that z −l/2 r jl ( zr) and z (l+1)/2 r yl ( zr) are entire as functions of z √ √ and their Wronskian is given by W (z −l/2 r jl ( zr), z (l+1)/2 r yl ( zr)) = 1. See [1, Sects. 10.1 and 10.3]. In particular, r √ 2l l! ua (z, r) = j ( zr) = l/2 l rl+1 (1 + O(r2 )), z (2l + 1)! √ √ √ √ 1 ub (z, r) = −zr jl (i −zr) + iyl (i −zr) = e− −zr+ilπ/2 (1 + O( )) r (10.56) are the functions which are square integrable and satisfy the boundary con- dition (if any) near a = 0 and b = ∞, respectively. The second case is that of our Coulomb potential γ V (r) = − , γ 0, (10.57) r where we will try to compute the eigenvalues plus corresponding eigenfunc- tions. It turns out that they can be expressed in terms of the Laguerre polynomials ([1, (22.2.13)]) er dj −r j Lj (r) = e r (10.58) j! drj and the generalized Laguerre polynomials ([1, (22.2.12)]) (k) dk Lj (r) = (−1)k Lj+k (r). (10.59) drk (k) Note that the Lj (r) are polynomials of degree j − k which are explicitly given by j (k) j + k ri Lj (r) = (−1)i (10.60) j − i i! i=0 and satisfy the differential equation (Problem 10.9) r y (r) + (k + 1 − r)y (r) + j y(r) = 0. (10.61) Moreover, they are orthogonal in the Hilbert space L2 ((0, ∞), rk e−r dr) (Prob- lem 10.10): ∞ (j+k)! (k) (k) j! , i = j, Lj (r)Lj (r)rk e−r dr = (10.62) 0 0, otherwise. Theorem 10.9. The eigenvalues of H (1) are explicitly given by 2 γ En = − , n ∈ N0 . (10.63) 2(n + 1)
  • 244. 232 10. One-particle Schr¨dinger operators o An orthonormal basis for the corresponding eigenspace is given by the (n+1)2 functions ψn,l,m (x) = Rn,l (r)Ylm (x), |m| ≤ l ≤ n, (10.64) where l γ 3 (n − l)! γr γr − 2(n+1) γr (2l+1) Rn,l (r) = e Ln−l ( ). 2(n + 1)4 (n + l + 1)! n+1 n+1 (10.65) 2 In particular, the lowest eigenvalue E0 = − γ4 is simple and the correspond- γ 3 −γr/2 ing eigenfunction ψ000 (x) = 2 e is positive. Proof. Since all eigenvalues are negative, we need to look at the equation l(l + 1) γ −u (r) + ( − )u(r) = λu(r) r2 r √ ex/2 x for λ 0. Introducing new variables x = 2 −λ r and v(x) = xl+1 u( 2√−λ ), this equation transforms into Kummer’s equation γ xv (x) + (k + 1 − x)v (x) + j v(x) = 0, k = 2l + 1, j = √ − (l + 1). 2 −λ Now let us search for a solution which can be expanded into a convergent power series ∞ v(x) = vi xi , v0 = 1. (10.66) i=0 The corresponding u(r) is square integrable near 0 and satisfies the boundary condition (if any). Thus we need to find those values of λ for which it is square integrable near +∞. Substituting the ansatz (10.66) into our differential equation and com- paring powers of x gives the following recursion for the coefficients (i − j) vi+1 = vi (i + 1)(i + k + 1) and thus i−1 1 −j vi = . i! +k+1 =0 Now there are two cases to distinguish. If j ∈ N0 , then vi = 0 for i j and v(x) is a polynomial; namely −1 j+k (k) v(x) = Lj (x). j
  • 245. 10.4. The eigenvalues of the hydrogen atom 233 In this case u(r) is square integrable and hence an eigenfunction correspond- γ ing to the eigenvalue λj = −( 2(n+1) )2 , n = j + l. This proves the formula for Rn,l (r) except for the normalization which follows from (Problem 10.11) ∞ (k) (j + k)! Lj (r)2 rk+1 e−r dr = (2j + k + 1). (10.67) 0 j! It remains to show that we have found all eigenfunctions, that is, that there are no other square integrable solutions. Otherwise, if j ∈ N, we have vi+1 (1−ε) vi ≥ i+1 for i sufficiently large. Hence by adding a polynomial to v(x) (and perhaps flipping its sign), we can get a function v (x) such that vi ≥ ˜ ˜ (1−ε)i i! for all i. But then v (x) ≥ exp((1 − ε)x) and thus the corresponding ˜ u(r) is not square integrable near +∞. Finally, let us also look at an alternative algebraic approach for com- puting the eigenvalues and eigenfunctions of Al based on the commutation methods from Section 8.4. We begin by introducing d l+1 γ Ql f = − + − , dr r 2(l + 1) D(Ql ) = {f ∈ L2 ((0, ∞))|f ∈ AC((0, ∞)), Ql f ∈ L2 ((0, ∞))}. (10.68) Then (Problem 9.3) Ql is closed and its adjoint is given by d l+1 γ Q∗ f = l + − , dr r 2(l + 1) D(Q∗ ) = {f ∈ L2 ((0, ∞))| f ∈ AC((0, ∞)), Q∗ f ∈ L2 ((0, ∞)), l l (10.69) limx→0,∞ f (x)g(x) = 0, ∀g ∈ D(Ql )}. It is straightforward to check Ker(Ql ) = span{ul,0 }, Ker(Q∗ ) = {0}, l (10.70) where (l+1)+1/2 1 γ γ − 2(l+1) r ul,0 (r) = rl+1 e (10.71) (2l + 2)! l+1 is normalized. Theorem 10.10. The radial Schr¨dinger operator Al satisfies o Al = Q∗ Ql − c2 , l l Al+1 = Ql Q∗ − c2 , l l (10.72) where γ cl = . (10.73) 2(l + 1)
  • 246. 234 10. One-particle Schr¨dinger operators o Proof. Equality is easy to check for f ∈ AC 2 with compact support. Hence Q∗ Ql − c2 is a self-adjoint extension of τl restricted to this set. If l 0, l l there is only one self-adjoint extension and equality follows. If l = 0, we know u0,0 ∈ D(Q∗ Ql ) and since Al is the only self-adjoint extension with l u0,0 ∈ D(Al ), equality follows in this case as well. Hence, as a consequence of Theorem 8.6 we see σ(Al ) = σ(Al+1 )∪{−c2 }, l or, equivalently, σp (Al ) = {−c2 |j ≥ l} j (10.74) if we use that σp (Al ) ⊂ (−∞, 0), which already follows from the virial the- orem. Moreover, using Ql , we can turn any eigenfunction of Hl into one of Hl+1 . However, we only know the lowest eigenfunction ul,0 , which is mapped to 0 by Ql . On the other hand, we can also use Q∗ to turn an l eigenfunction of Hl+1 into one of Hl . Hence Q∗ ul+1,0 will give the second l eigenfunction of Hl . Proceeding inductively, the normalized eigenfunction of Hl corresponding to the eigenvalue −c2 is given by l+j j−1 −1 ul,j = (cl+j − cl+k ) Q∗ Q∗ · · · Q∗ l l+1 l+j−1 ul+j,0 . (10.75) k=0 The connection with Theorem 10.9 is given by 1 Rn,l (r) = ul,n−l (r). (10.76) r Problem 10.8. Let A = n An . Then n σess (An ) ⊆ σess (A). Problem 10.9. Show that the generalized Laguerre polynomials satisfy the differential equation (10.61). (Hint: Start with the Laguerre polynomials (10.58) which correspond to k = 0. Set v(r) = rj e−r and observe r v (r) = (j − r)v(r). Then differentiate this identity j + 1 times using Leibniz’s rule. For the case of the generalized Laguerre polynomials, start with the differential equation for Lj+k (r) and differentiate k times.) Problem 10.10. Show that the differential equation (10.58) can be rewritten in Sturm–Liouville form as d k+1 −r d −r−k er r e u = ju. dr dr We have found one entire solution in the proof of Theorem 10.9. Show that any linearly independent solution behaves like log(r) if k = 0, respectively, like r−k otherwise. Show that it is l.c. at the endpoint r = 0 if k = 0 and l.p. otherwise.
  • 247. 10.5. Nondegeneracy of the ground state 235 Let H = L2 ((0, ∞), rk e−r dr). The operator d k+1 −r d Ak f = τ f = −r−k er r e f, dr dr D(Ak ) = {f ∈ H| f ∈ AC 1 (0, ∞), τk f ∈ H, limr→0 rf (r) = 0 if k = 0} for k ∈ N0 is self-adjoint. Its spectrum is purely discrete, that is, σ(Ak ) = σd (Ak ) = N0 , (10.77) and the corresponding eigenfunctions (k) Lj (r), j ∈ N0 , (10.78) form an orthogonal base for H. (Hint: Compare the argument for the asso- ciated Legendre equation and Problem 10.5.) Problem 10.11. By Problem 10.4 there is a recurrence relation of the form (k) (k) (k) Lj+1 (r) = (˜j + ˜j r)Lj (r) + cj Lj−1 (r). Find the coefficients by comparing a b ˜ the highest powers in r and conclude (k) 1 (k) (k) Lj+1 (r) = (2j + k + 1 − r)Lj (r) − (j + k)Lj−1 (r) . 1+j Use this to prove (10.62) and (10.67). 10.5. Nondegeneracy of the ground state The lowest eigenvalue (below the essential spectrum) of a Schr¨dinger op- o erator is called the ground state. Since the laws of physics state that a quantum system will transfer energy to its surroundings (e.g., an atom emits radiation) until it eventually reaches its ground state, this state is in some sense the most important state. We have seen that the hydrogen atom has a nondegenerate (simple) ground state with a corresponding positive eigen- function. In particular, the hydrogen atom is stable in the sense that there is a lowest possible energy. This is quite surprising since the corresponding classical mechanical system is not — the electron could fall into the nucleus! Our aim in this section is to show that the ground state is simple with a corresponding positive eigenfunction. Note that it suffices to show that any ground state eigenfunction is positive since nondegeneracy then follows for free: two positive functions cannot be orthogonal. To set the stage, let us introduce some notation. Let H = L2 (Rn ). We call f ∈ L2 (Rn ) positive if f ≥ 0 a.e. and f ≡ 0. We call f strictly posi- tive if f 0 a.e. A bounded operator A is called positivity preserving if f ≥ 0 implies Af ≥ 0 and positivity improving if f ≥ 0 implies Af 0. Clearly A is positivity preserving (improving) if and only if f, Ag ≥ 0 ( 0) for f, g ≥ 0.
  • 248. 236 10. One-particle Schr¨dinger operators o Example. Multiplication by a positive function is positivity preserving (but not improving). Convolution with a strictly positive function is positivity improving. We first show that positivity improving operators have positive eigen- functions. Theorem 10.11. Suppose A ∈ L(L2 (Rn )) is a self-adjoint, positivity im- proving and real (i.e., it maps real functions to real functions) operator. If A is an eigenvalue, then it is simple and the corresponding eigenfunction is strictly positive. Proof. Let ψ be an eigenfunction. It is no restriction to assume that ψ is real (since A is real, both real and imaginary part of ψ are eigenfunctions as well). We assume ψ = 1 and denote by ψ± = f ±2 f | the positive and negative parts of ψ. Then by |Aψ| = |Aψ+ − Aψ− | ≤ Aψ+ + Aψ− = A|ψ| we have A = ψ, Aψ ≤ |ψ|, |Aψ| ≤ |ψ|, A|ψ| ≤ A ; that is, ψ, Aψ = |ψ|, A|ψ| and thus 1 ψ+ , Aψ− = ( |ψ|, A|ψ| − ψ, Aψ ) = 0. 4 Consequently ψ− = 0 or ψ+ = 0 since otherwise Aψ− 0 and hence also ψ+ , Aψ− 0. Without restriction ψ = ψ+ ≥ 0 and since A is positivity increasing, we even have ψ = A −1 Aψ 0. So we need a positivity improving operator. By (7.38) and (7.39) both e−tH0 , t 0, and Rλ (H0 ), λ 0, are since they are given by convolution with a strictly positive function. Our hope is that this property carries over to H = H0 + V . Theorem 10.12. Suppose H = H0 + V is self-adjoint and bounded from ∞ below with Cc (Rn ) as a core. If E0 = min σ(H) is an eigenvalue, it is simple and the corresponding eigenfunction is strictly positive. Proof. We first show that e−tH , t 0, is positivity preserving. If we set Vn = V χ{x| |V (x)|≤n} , then Vn is bounded and Hn = H0 + Vn is positivity preserving by the Trotter product formula since both e−tH0 and e−tV are. ∞ Moreover, we have Hn ψ → Hψ for ψ ∈ Cc (Rn ) (note that necessarily sr V ∈ L2 ) and hence Hn → H in the strong resolvent sense by Lemma 6.36. loc s Hence e−tHn → e−tH by Theorem 6.31, which shows that e−tH is at least positivity preserving (since 0 cannot be an eigenvalue of e−tH , it cannot map a positive function to 0).
Next I claim that for ψ positive the closed set
    N(ψ) = {ϕ ∈ L²(Rⁿ) | ϕ ≥ 0, ⟨ϕ, e^{−sH}ψ⟩ = 0 ∀s ≥ 0}
is just {0}. If ϕ ∈ N(ψ), we have by e^{−sH}ψ ≥ 0 that ϕ e^{−sH}ψ = 0. Hence e^{tV_n}ϕ e^{−sH}ψ = 0; that is, e^{tV_n}ϕ ∈ N(ψ). In other words, both e^{tV_n} and e^{−tH} leave N(ψ) invariant and invoking Trotter's formula again, the same is true for
    e^{−t(H−V_n)} = s-lim_{k→∞} (e^{−(t/k)H} e^{(t/k)V_n})^k.
Since e^{−t(H−V_n)} converges strongly to e^{−tH_0}, we finally obtain that e^{−tH_0} leaves N(ψ) invariant, but this operator is positivity improving and thus N(ψ) = {0}.
Now it remains to use (7.37), which shows
    ⟨ϕ, R_H(λ)ψ⟩ = ∫_0^∞ e^{λt} ⟨ϕ, e^{−tH}ψ⟩ dt > 0,   λ < E_0,
for ϕ, ψ positive. So R_H(λ) is positivity improving for λ < E_0. If ψ is an eigenfunction of H corresponding to E_0, it is an eigenfunction of R_H(λ) corresponding to 1/(E_0 − λ) and the claim follows since ‖R_H(λ)‖ = 1/(E_0 − λ).

The assumptions are for example satisfied for the potentials V considered in Theorem 10.2.
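To tie Sections 10.4 and 10.5 together numerically: a finite-difference approximation of the radial l = 0 operator −d²/dr² − γ/r (Dirichlet at r = 0) should produce a lowest eigenvalue close to −γ²/4 with an eigenfunction of one sign. The sketch below (grid, box size and γ are arbitrary choices) does exactly this.

```python
# Finite-difference sketch (not from the text): ground state of the radial l = 0 hydrogen
# operator on (0, R) with Dirichlet conditions; compare with -gamma^2/4 from Theorem 10.9.
import numpy as np
from scipy.linalg import eigh_tridiagonal

gamma = 1.0
R, N = 60.0, 6000
r = np.linspace(0, R, N + 2)[1:-1]          # interior grid points
h = r[1] - r[0]

diag = 2.0 / h ** 2 - gamma / r
off = -np.ones(N - 1) / h ** 2
w, v = eigh_tridiagonal(diag, off, select='i', select_range=(0, 0))   # lowest pair only

print("lowest eigenvalue:", w[0], "   exact value:", -gamma ** 2 / 4)
u0 = v[:, 0]
u0 = u0[np.abs(u0) > 1e-8 * np.abs(u0).max()]        # discard noise-level tail entries
print("ground state changes sign:", bool(np.any(u0[:-1] * u0[1:] < 0)))
```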
Chapter 11

Atomic Schrödinger operators

11.1. Self-adjointness

In this section we want to have a look at the Hamiltonian corresponding to more than one interacting particle. It is given by
    H = −∑_{j=1}^{N} Δ_j + ∑_{j<k}^{N} V_{j,k}(x_j − x_k).    (11.1)
We first consider the case of two particles, which will give us a feeling for how the many particle case differs from the one particle case and how the difficulties can be overcome.

We denote the coordinates corresponding to the first particle by x_1 = (x_{1,1}, x_{1,2}, x_{1,3}) and those corresponding to the second particle by x_2 = (x_{2,1}, x_{2,2}, x_{2,3}). If we assume that the interaction is again of the Coulomb type, the Hamiltonian is given by
    H = −Δ_1 − Δ_2 − γ/|x_1 − x_2|,   D(H) = H²(R⁶).    (11.2)
Since Theorem 10.2 does not allow singularities for n ≥ 3, it does not tell us whether H is self-adjoint or not. Let
    (y_1, y_2) = (1/√2) [[ I, I ], [ −I, I ]] (x_1, x_2).    (11.3)
  • 252. 240 11. Atomic Schr¨dinger operators o Then H reads in this new coordinate system as √ γ/ 2 H = (−∆1 ) + (−∆2 − ). (11.4) |y2 | In particular, it is the sum of a free particle plus a particle in an external Coulomb field. From a physics point of view, the first part corresponds to the center of mass motion and the second part to the relative motion. √ Using that γ/( 2|y2 |) has (−∆2 )-bound 0 in L2 (R3 ), it is not hard to see that the same is true for the (−∆1 − ∆2 )-bound in L2 (R6 ) (details will follow in the next section). In particular, H is self-adjoint and semi-bounded √ for any γ ∈ R. Moreover, you might suspect that γ/( 2|y2 |) is relatively compact with respect to −∆1 − ∆2 in L2 (R6 ) since it is with respect to −∆2 √ in L2 (R6 ). However, this is not true! This is due to the fact that γ/( 2|y2 |) does not vanish as |y| → ∞. Let us look at this problem from the physical view point. If λ ∈ σess (H), this means that the movement of the whole system is somehow unbounded. There are two possibilities for this. First, both particles are far away from each other (such that we can neglect the interaction) and the energy corresponds to the sum of the kinetic energies of both particles. Since both can be arbitrarily small (but positive), we expect [0, ∞) ⊆ σess (H). Secondly, both particles remain close to each other and move together. In the last set of coordinates this corresponds to a bound state of the second operator. Hence we expect [λ0 , ∞) ⊆ σess (H), where λ0 = −γ 2 /8 is the smallest eigenvalue of the second operator if the forces are attracting (γ ≥ 0) and λ0 = 0 if they are repelling (γ ≤ 0). It is not hard to translate this intuitive idea into a rigorous proof. Let ψ1 (y1 ) be a Weyl sequence corresponding to λ ∈ [0, ∞) for −∆1 and let √ ψ2 (y2 ) be a Weyl sequence corresponding to λ0 for −∆2 −γ/( 2|y2 |). Then, ψ1 (y1 )ψ2 (y2 ) is a Weyl sequence corresponding to λ + λ0 for H and thus [λ0 ,√ ⊆ σess (H). Conversely, we have −∆1 ≥ 0, respectively, −∆2 − ∞) γ/( 2|y2 |) ≥ λ0 , and hence H ≥ λ0 . Thus we obtain −γ 2 /8, γ ≥ 0, σ(H) = σess (H) = [λ0 , ∞), λ0 = (11.5) 0, γ ≤ 0. Clearly, the physically relevant information is the spectrum of the operator √ −∆2 − γ/( 2|y2 |) which is hidden by the spectrum of −∆1 . Hence, in order to reveal the physics, one first has to remove the center of mass motion. To avoid clumsy notation, we will restrict ourselves to the case of one atom with N electrons whose nucleus is fixed at the origin. In particular, this implies that we do not have to deal with the center of mass motion
  • 253. 11.1. Self-adjointness 241 encountered in our example above. In this case the Hamiltonian is given by N N N N (N ) H =− ∆j − Vne (xj ) + Vee (xj − xk ), j=1 j=1 j=1 jk D(H (N ) ) = H 2 (R3N ), (11.6) where Vne describes the interaction of one electron with the nucleus and Vee describes the interaction of two electrons. Explicitly we have γj Vj (x) = , γj 0, j = ne, ee. (11.7) |x| We first need to establish the self-adjointness of H (N ) = H0 + V (N ) . This will follow from Kato’s theorem. Theorem 11.1 (Kato). Let Vk ∈ L∞ (Rd ) + L2 (Rd ), d ≤ 3, be real-valued ∞ and let Vk (y (k) ) be the multiplication operator in L2 (Rn ), n = N d, obtained by letting y (k) be the first d coordinates of a unitary transform of Rn . Then Vk is H0 bounded with H0 -bound 0. In particular, H = H0 + Vk (y (k) ), D(H) = H 2 (Rn ), (11.8) k ∞ is self-adjoint and C0 (Rn ) is a core. Proof. It suffices to consider one k. After a unitary transform of Rn we can assume y (1) = (x1 , . . . , xd ) since such transformations leave both the scalar product of L2 (Rn ) and H0 invariant. Now let ψ ∈ S(Rn ). Then 2 Vk ψ ≤ a2 |∆1 ψ(x)|2 dn x + b2 |ψ(x)|2 dn x, Rn Rn d 2 2 where ∆1 = j=1 ∂ /∂ xj , by our previous lemma. Hence we obtain d Vk ψ 2 ≤ a2 | ˆ p2 ψ(p)|2 dn p + b2 ψ 2 j Rn j=1 n ≤ a2 | ˆ p2 ψ(p)|2 dn p + b2 ψ 2 j Rn j=1 2 2 = a H0 ψ + b2 ψ 2 , which implies that Vk is relatively bounded with bound 0. The rest follows from the Kato–Rellich theorem. So V (N ) is H0 bounded with H0 -bound 0 and thus H (N ) = H0 + V (N ) is self-adjoint on D(H (N ) ) = D(H0 ).
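To make the hypothesis $V_k \in L^\infty(\mathbb{R}^d) + L^2(\mathbb{R}^d)$, $d \le 3$, of Kato's theorem concrete for the Coulomb terms (11.7), split $\gamma/|x|$ at the unit sphere: the piece supported inside the ball is square integrable in three dimensions, the remainder is bounded. A quick radial check (an illustration added here, not from the book):

```python
import numpy as np
from scipy.integrate import quad

gamma = 1.0   # coupling constant; its value is irrelevant for the structure of the split

# Singular piece chi_{|x|<=1} * gamma/|x|: its L^2(R^3) norm squared, computed radially.
# The integrand (gamma/r)**2 * 4*pi*r**2 simplifies to the constant 4*pi*gamma**2, i.e.
# the r**(-2) singularity is exactly compensated by the volume element in d = 3.
val, _ = quad(lambda r: 4 * np.pi * gamma**2, 0.0, 1.0)
print("||singular part||_2^2 =", val)          # = 4*pi*gamma**2, finite
# Remainder chi_{|x|>1} * gamma/|x| is bounded by gamma, hence lies in L^infty(R^3).
print("sup of bounded part   =", gamma)
```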
  • 254. 242 11. Atomic Schr¨dinger operators o 11.2. The HVZ theorem The considerations of the beginning of this section show that it is not so easy to determine the essential spectrum of H (N ) since the potential does not decay in all directions as |x| → ∞. However, there is still something we can do. Denote the infimum of the spectrum of H (N ) by λN . Then, let us split the system into H (N −1) plus a single electron. If the single electron is far away from the remaining system such that there is little interaction, the energy should be the sum of the kinetic energy of the single electron and the energy of the remaining system. Hence, arguing as in the two electron example of the previous section, we expect Theorem 11.2 (HVZ). Let H (N ) be the self-adjoint operator given in (11.6). Then H (N ) is bounded from below and σess (H (N ) ) = [λN −1 , ∞), (11.9) where λN −1 = min σ(H (N −1) ) 0. In particular, the ionization energy (i.e., the energy needed to remove one electron from the atom in its ground state) of an atom with N electrons is given by λN − λN −1 . Our goal for the rest of this section is to prove this result which is due to Zhislin, van Winter, and Hunziker and is known as the HVZ theorem. In fact there is a version which holds for general N -body systems. The proof is similar but involves some additional notation. The idea of proof is the following. To prove [λN −1 , ∞) ⊆ σess (H (N ) ), we choose Weyl sequences for H (N −1) and −∆N and proceed according to our intuitive picture from above. To prove σess (H (N ) ) ⊆ [λN −1 , ∞), we will localize H (N ) on sets where one electron is far away from the nucleus whenever some of the others are. On these sets, the interaction term between this electron and the nucleus is decaying and hence does not contribute to the essential spectrum. So it remain to estimate the infimum of the spectrum of a system where one electron does not interact with the nucleus. Since the interaction term with the other electrons is positive, we can finally estimate this infimum by the infimum of the case where one electron is completely decoupled from the rest. We begin with the first inclusion. Let ψ N −1 (x1 , . . . , xN −1 ) ∈ H 2 (R3(N −1) ) such that ψ N −1 = 1, (H (N −1) − λN −1 )ψ N −1 ≤ ε and ψ 1 ∈ H 2 (R3 ) such that ψ 1 = 1, (−∆N − λ)ψ 1 ≤ ε for some λ ≥ 0. Now consider
  • 255. 11.2. The HVZ theorem 243 ψr (x1 , . . . , xN ) = ψ N −1 (x1 , . . . , xN −1 )ψr (xN ), ψr (xN ) = ψ 1 (xN − r). Then 1 1 (H (N ) − λ − λN −1 )ψr ≤ (H (N −1) − λN −1 )ψ N −1 1 ψr + ψ N −1 1 (−∆N − λ)ψr N −1 + (VN − VN,j )ψr , (11.10) j=1 where VN = Vne (xN ) and VN,j = Vee (xN − xj ). Using the fact that (VN − N −1 VN,j )ψ N −1 ∈ L2 (R3N ) and |ψr | → 0 pointwise as |r| → ∞ j=1 1 (by Lemma 10.1), the third term can be made smaller than ε by choosing |r| large (dominated convergence). In summary, (H (N ) − λ − λN −1 )ψr ≤ 3ε (11.11) proving [λN −1 , ∞) ⊆ σess (H (N ) ). The second inclusion is more involved. We begin with a localization formula. Lemma 11.3 (IMS localization formula). Suppose φj ∈ C ∞ (Rn ), 1 ≤ j ≤ m, is such that m φj (x)2 = 1, x ∈ Rn . (11.12) j=1 Then m ∆ψ = φj ∆(φj ψ) + |∂φj |2 ψ , ψ ∈ H 2 (Rn ). (11.13) j=1 Proof. The proof follows from a straightforward computation using the identities j φj ∂k φj = 0 and j ((∂k φj )2 + φj ∂k φj ) = 0 which follow by 2 differentiating (11.12). Now we will choose φj , 1 ≤ j ≤ N , in such a way that, for x outside some ball, x ∈ supp(φj ) implies that the j’th particle is far away from the nucleus. Lemma 11.4. Fix some C ∈ (0, √1 ). There exist smooth functions φj ∈ N C ∞ (Rn , [0, 1]), 1 ≤ j ≤ N , such that (11.12) holds, supp(φj ) ∩ {x| |x| ≥ 1} ⊆ {x| |xj | ≥ C|x|}, (11.14) and |∂φj (x)| → 0 as |x| → ∞. Proof. The open sets Uj = {x ∈ S 3N −1 | |xj | C}
  • 256. 244 11. Atomic Schr¨dinger operators o cover the unit sphere in RN ; that is, N Uj = S 3N −1 . j=1 ˜ By Lemma 0.13 there is a partition of unity φj (x) subordinate to this cover. ˜ Extend φj (x) to a smooth function from R 3N {0} to [0, 1] by ˜ ˜ φj (λx) = φj (x), x ∈ S 3N −1 , λ 0, ˜ and pick a function φ ∈ C ∞ (R3N , [0, 1]) with support inside the unit ball which is 1 in a neighborhood of the origin. Then the ˜ ˜ ˜ φ + (1 − φ)φj φj = N ˜ ˜ ˜ =1 (φ + (1 − φ)φ )2 are the desired functions. The gradient tends to zero since φj (λx) = φj (x) for λ ≥ 1 and |x| ≥ 1 which implies (∂φj )(λx) = λ−1 (∂φj )(x). By our localization formula we have N H (N ) = φj H (N,j) φj + P − K, j=1 N N N K= φ2 Vj + |∂φj |2 , j P = φ2 j Vj, , (11.15) j=1 j=1 =j where N N N (N,j) H =− ∆ − V + Vk, (11.16) =1 =j k , k, =j is the Hamiltonian with the j’th electron decoupled from the rest of the system. Here we have abbreviated Vj (x) = Vne (xj ) and Vj, = Vee (xj − x ). Since K vanishes as |x| → ∞, we expect it to be relatively compact with respect to the rest. By Lemma 6.23 it suffices to check that it is relatively compact with respect to H0 . The terms |∂φj |2 are bounded and vanish at ∞; hence they are H0 compact by Lemma 7.11. However, the terms φ2 Vj j have singularities and will be covered by the following lemma. Lemma 11.5. Let V be a multiplication operator which is H0 bounded with H0 -bound 0 and suppose that χ{x||x|≥R} V RH0 (z) → 0 as R → ∞. Then V is relatively compact with respect to H0 . Proof. Let ψn converge to 0 weakly. Note that ψn ≤ M for some M 0. It suffices to show that V RH0 (z)ψn converges to 0. Choose
$\phi \in C_0^\infty(\mathbb{R}^n, [0,1])$ such that it is one for $|x| \le R$. Note $\phi D(H_0) \subset D(H_0)$. Then
$$\|V R_{H_0}(z)\psi_n\| \le \|(1-\phi)V R_{H_0}(z)\psi_n\| + \|V \phi R_{H_0}(z)\psi_n\|$$
$$\le \|(1-\phi)V R_{H_0}(z)\|\, \|\psi_n\| + a\|H_0 \phi R_{H_0}(z)\psi_n\| + b\|\phi R_{H_0}(z)\psi_n\|.$$
By assumption, the first term can be made smaller than $\varepsilon$ by choosing $R$ large. Next, the same is true for the second term choosing $a$ small since $H_0 \phi R_{H_0}(z)$ is bounded (by Problem 2.9 and the closed graph theorem). Finally, the last term can also be made smaller than $\varepsilon$ by choosing $n$ large since $\phi$ is $H_0$ compact.

So $K$ is relatively compact with respect to $H^{(N)}$. In particular, $H^{(N)} + K$ is self-adjoint on $H^2(\mathbb{R}^{3N})$ and $\sigma_{ess}(H^{(N)}) = \sigma_{ess}(H^{(N)} + K)$. Since the operators $H^{(N,j)}$, $1 \le j \le N$, are all of the form $H^{(N-1)}$ plus one particle which does not interact with the others and the nucleus, we have $H^{(N,j)} - \lambda_{N-1} \ge 0$, $1 \le j \le N$. Moreover, we have $P \ge 0$ and hence
$$\langle\psi, (H^{(N)} + K - \lambda_{N-1})\psi\rangle = \sum_{j=1}^N \langle\phi_j\psi, (H^{(N,j)} - \lambda_{N-1})\phi_j\psi\rangle + \langle\psi, P\psi\rangle \ge 0. \qquad (11.17)$$
Thus we obtain the remaining inclusion
$$\sigma_{ess}(H^{(N)}) = \sigma_{ess}(H^{(N)} + K) \subseteq \sigma(H^{(N)} + K) \subseteq [\lambda_{N-1}, \infty), \qquad (11.18)$$
which finishes the proof of the HVZ theorem.

Note that the same proof works if we add additional nuclei at fixed locations. That is, we can also treat molecules if we assume that the nuclei are fixed in space.

Finally, let us consider the example of helium-like atoms ($N = 2$). By the HVZ theorem and the considerations of the previous section we have
$$\sigma_{ess}(H^{(2)}) = \left[-\frac{\gamma_{ne}^2}{4}, \infty\right). \qquad (11.19)$$
Moreover, if $\gamma_{ee} = 0$ (no electron interaction), we can take products of one-particle eigenfunctions to show that
$$-\gamma_{ne}^2\left(\frac{1}{4n^2} + \frac{1}{4m^2}\right) \in \sigma_p(H^{(2)}(\gamma_{ee} = 0)), \qquad n, m \in \mathbb{N}. \qquad (11.20)$$
In particular, there are eigenvalues embedded in the essential spectrum in this case. Moreover, since the electron interaction term is positive, we see
$$H^{(2)} \ge -\frac{\gamma_{ne}^2}{2}. \qquad (11.21)$$
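The statements (11.19)–(11.21) are easy to explore numerically. The following sketch is an illustration added here (not from the book); the value of $\gamma_{ne}$ is an arbitrary choice. It lists the product eigenvalues (11.20) and marks those lying above the HVZ threshold $-\gamma_{ne}^2/4$, i.e., those embedded in the essential spectrum.

```python
# Helium-like atom with the electron repulsion switched off (gamma_ee = 0).
# Product eigenvalues (11.20):  E(n, m) = -gamma**2 * (1/(4*n**2) + 1/(4*m**2)),
# HVZ threshold (11.19):        -gamma**2 / 4.
gamma = 2.0                      # illustrative value of gamma_ne
threshold = -gamma**2 / 4

for n in range(1, 4):
    for m in range(n, 4):
        E = -gamma**2 * (1 / (4 * n**2) + 1 / (4 * m**2))
        embedded = E > threshold          # lies inside sigma_ess = [threshold, infinity)
        print(f"n={n} m={m}  E={E:+.4f}  embedded={embedded}")

# n = m = 1 gives E = -gamma**2/2, the lower bound (11.21); every state with
# n, m >= 2 lies above the threshold and is embedded in the essential spectrum.
```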
Note that there can be no positive eigenvalues by the virial theorem. This even holds for arbitrary $N$,
$$\sigma_p(H^{(N)}) \subset (-\infty, 0). \qquad (11.22)$$
Chapter 12

Scattering theory

12.1. Abstract theory

In physical measurements one often has the following situation. A particle is shot into a region where it interacts with some forces and then leaves the region again. Outside this region the forces are negligible and hence the time evolution should be asymptotically free. Hence one expects asymptotic states $\psi_\pm(t) = \exp(-itH_0)\psi_\pm(0)$ to exist such that
$$\|\psi(t) - \psi_\pm(t)\| \to 0 \quad \text{as } t \to \pm\infty. \qquad (12.1)$$

[Figure: the trajectory of the interacting state $\psi(t)$ together with its free asymptotes, the incoming state $\psi_-(t)$ and the outgoing state $\psi_+(t)$.]

Rewriting this condition, we see
$$0 = \lim_{t\to\pm\infty} \|e^{-itH}\psi(0) - e^{-itH_0}\psi_\pm(0)\| = \lim_{t\to\pm\infty} \|\psi(0) - e^{itH}e^{-itH_0}\psi_\pm(0)\| \qquad (12.2)$$
and motivated by this, we define the wave operators by
$$D(\Omega_\pm) = \{\psi \in \mathfrak{H} \,|\, \exists \lim_{t\to\pm\infty} e^{itH}e^{-itH_0}\psi\},$$
$$\Omega_\pm\psi = \lim_{t\to\pm\infty} e^{itH}e^{-itH_0}\psi. \qquad (12.3)$$
  • 260. 248 12. Scattering theory The set D(Ω± ) is the set of all incoming/outgoing asymptotic states ψ± and Ran(Ω± ) is the set of all states which have an incoming/outgoing asymptotic state. If a state ψ has both, that is, ψ ∈ Ran(Ω+ ) ∩ Ran(Ω− ), it is called a scattering state. By construction we have Ω± ψ = lim eitH e−itH0 ψ = lim ψ = ψ (12.4) t→±∞ t→±∞ and it is not hard to see that D(Ω± ) is closed. Moreover, interchanging the roles of H0 and H amounts to replacing Ω± by Ω−1 and hence Ran(Ω± ) is ± also closed. In summary, Lemma 12.1. The sets D(Ω± ) and Ran(Ω± ) are closed and Ω± : D(Ω± ) → Ran(Ω± ) is unitary. Next, observe that lim eitH e−itH0 (e−isH0 ψ) = lim e−isH (ei(t+s)H e−i(t+s)H0 ψ) (12.5) t→±∞ t→±∞ and hence Ω± e−itH0 ψ = e−itH Ω± ψ, ψ ∈ D(Ω± ). (12.6) In addition, D(Ω± ) is invariant under exp(−itH0 ) and Ran(Ω± ) is invariant under exp(−itH). Moreover, if ψ ∈ D(Ω± )⊥ , then ϕ, exp(−itH0 )ψ = exp(itH0 )ϕ, ψ = 0, ϕ ∈ D(Ω± ). (12.7) Hence D(Ω± )⊥ is invariant under exp(−itH0 ) and Ran(Ω± )⊥ is invariant under exp(−itH). Consequently, D(Ω± ) reduces exp(−itH0 ) and Ran(Ω± ) reduces exp(−itH). Moreover, differentiating (12.6) with respect to t, we obtain from Theorem 5.1 the intertwining property of the wave operators. Theorem 12.2. The subspaces D(Ω± ), respectively, Ran(Ω± ), reduce H0 , respectively, H, and the operators restricted to these subspaces are unitarily equivalent: Ω± H0 ψ = HΩ± ψ, ψ ∈ D(Ω± ) ∩ D(H0 ). (12.8) It is interesting to know the correspondence between incoming and out- going states. Hence we define the scattering operator S = Ω−1 Ω− , + D(S) = {ψ ∈ D(Ω− )|Ω− ψ ∈ Ran(Ω+ )}. (12.9) Note that we have D(S) = D(Ω− ) if and only if Ran(Ω− ) ⊆ Ran(Ω+ ) and Ran(S) = D(Ω+ ) if and only if Ran(Ω+ ) ⊆ Ran(Ω− ). Moreover, S is unitary from D(S) onto Ran(S) and we have H0 Sψ = SH0 ψ, D(H0 ) ∩ D(S). (12.10) However, note that this whole theory is meaningless until we can show that the domains D(Ω± ) are nontrivial. We first show a criterion due to Cook.
  • 261. 12.1. Abstract theory 249 Lemma 12.3 (Cook). Suppose D(H) ⊆ D(H0 ). If ∞ (H − H0 ) exp( itH0 )ψ dt ∞, ψ ∈ D(H0 ), (12.11) 0 then ψ ∈ D(Ω± ), respectively. Moreover, we even have ∞ (Ω± − I)ψ ≤ (H − H0 ) exp( itH0 )ψ dt (12.12) 0 in this case. Proof. The result follows from t eitH e−itH0 ψ = ψ + i exp(isH)(H − H0 ) exp(−isH0 )ψds (12.13) 0 which holds for ψ ∈ D(H0 ). As a simple consequence we obtain the following result for Schr¨dinger o operators in R3 Theorem 12.4. Suppose H0 is the free Schr¨dinger operator and H = o H0 + V with V ∈ L2 (R3 ). Then the wave operators exist and D(Ω± ) = H. Proof. Since we want to use Cook’s lemma, we need to estimate 2 V ψ(s) = |V (x)ψ(s, x)|2 dx, ψ(s) = exp(isH0 )ψ, R3 for given ψ ∈ D(H0 ). Invoking (7.31), we get 1 V ψ(s) ≤ ψ(s) ∞ V ≤ ψ 1 V , s 0, (4πs)3/2 at least for ψ ∈ L1 (R3 ). Moreover, this implies ∞ 1 V ψ(s) ds ≤ 3/2 ψ 1 V 1 4π and thus any such ψ is in D(Ω+ ). Since such functions are dense, we obtain D(Ω+ ) = H, and similarly for Ω− . By the intertwining property ψ is an eigenfunction of H0 if and only if it is an eigenfunction of H. Hence for ψ ∈ Hpp (H0 ) it is easy to check whether it is in D(Ω± ) or not and only the continuous subspace is of interest. We will say that the wave operators exist if all elements of Hac (H0 ) are asymptotic states, that is, Hac (H0 ) ⊆ D(Ω± ), (12.14) and that they are complete if, in addition, all elements of Hac (H) are scattering states, that is, Hac (H) ⊆ Ran(Ω± ). (12.15)
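The key input in the proof of Theorem 12.4 above is the free dispersive estimate (7.31), $\|e^{-itH_0}\psi\|_\infty \le (4\pi t)^{-n/2}\|\psi\|_1$. Here is a small numerical sketch of its one-dimensional analogue (an illustration added here, not from the book); the wave packet and the grid parameters are arbitrary choices, and the free evolution is computed with the Fourier transform.

```python
import numpy as np

# Free evolution e^{-itH_0}, H_0 = -d^2/dx^2, in one dimension via Fourier space:
# psi_t^hat(p) = exp(-i*t*p**2) * psi_0^hat(p).  We check the dispersive bound
#   sup |psi(t)| <= (4*pi*t)**(-1/2) * ||psi_0||_1.
L, N = 400.0, 2**16
x = np.linspace(-L/2, L/2, N, endpoint=False)
dx = x[1] - x[0]
p = 2 * np.pi * np.fft.fftfreq(N, d=dx)

psi0 = x * np.exp(-x**2)                    # an arbitrary wave packet
norm1 = np.sum(np.abs(psi0)) * dx           # ||psi_0||_1

psi0_hat = np.fft.fft(psi0)
for t in [1.0, 4.0, 16.0]:
    psi_t = np.fft.ifft(np.exp(-1j * t * p**2) * psi0_hat)
    sup = np.abs(psi_t).max()
    bound = (4 * np.pi * t) ** (-0.5) * norm1
    print(f"t={t:5.1f}  sup|psi(t)| = {sup:.5f}   bound = {bound:.5f}")
# the sup norm decays like t**(-1/2) and stays below the bound, which is what
# makes the integrand in Cook's criterion integrable for short range potentials
```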
  • 262. 250 12. Scattering theory If we even have Hc (H) ⊆ Ran(Ω± ), (12.16) they are called asymptotically complete. We will be mainly interested in the case where H0 is the free Schr¨dinger o operator and hence Hac (H0 ) = H. In this latter case the wave operators ex- ist if D(Ω± ) = H, they are complete if Hac (H) = Ran(Ω± ), and they are asymptotically complete if Hc (H) = Ran(Ω± ). In particular, asymptotic completeness implies Hsc (H) = {0} since H restricted to Ran(Ω± ) is uni- tarily equivalent to H0 . Completeness implies that the scattering operator is unitary. Hence, by the intertwining property, kinetic energy is preserved during scattering: ψ− , H0 ψ− = Sψ− , SH0 ψ− = Sψ− , H0 Sψ− = ψ+ , H0 ψ+ (12.17) for ψ− ∈ D(H0 ) and ψ+ = Sψ− . 12.2. Incoming and outgoing states In the remaining sections we want to apply this theory to Schr¨dinger op- o erators. Our first goal is to give a precise meaning to some terms in the intuitive picture of scattering theory introduced in the previous section. This physical picture suggests that we should be able to decompose ψ ∈ H into an incoming and an outgoing part. But how should incoming, respectively, outgoing, be defined for ψ ∈ H? Well, incoming (outgoing) means that the expectation of x2 should decrease (increase). Set x(t)2 = exp(iH0 t)x2 exp(−iH0 t). Then, abbreviating ψ(t) = e−itH0 ψ, d Eψ (x(t)2 ) = ψ(t), i[H0 , x2 ]ψ(t) = 4 ψ(t), Dψ(t) , ψ ∈ S(Rn ), dt (12.18) where D is the dilation operator introduced in (10.9). Hence it is natural to consider ψ ∈ Ran(P± ), P± = PD ((0, ±∞)), (12.19) as outgoing, respectively, incoming, states. If we project a state in Ran(P± ) to energies in the interval (a2 , b2 ), we expect that it cannot be found in a ball of radius proportional to a|t| as t → ±∞ (a is the minimal velocity of the particle, since we have assumed the mass to be two). In fact, we will show below that the tail decays faster then any inverse power of |t|. We first collect some properties of D which will be needed later on. Note FD = −DF (12.20) and hence Ff (D) = f (−D)F. Additionally, we will look for a transforma- tion which maps D to a multiplication operator.
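The commutator identity behind (12.18) is a one-line computation which can also be checked symbolically. A minimal one-dimensional sketch (added here, not from the book); identifying the right-hand side $2(xp + px)$ with $4D$ depends on the normalization of the dilation operator $D$ in (10.9).

```python
import sympy as sp

x = sp.symbols('x', real=True)
f = sp.Function('f')(x)

p  = lambda g: -sp.I * sp.diff(g, x)      # momentum operator p = -i d/dx
H0 = lambda g: -sp.diff(g, x, 2)          # free Hamiltonian H0 = p**2

# i[H0, x^2] applied to a test function ...
lhs = sp.I * (H0(x**2 * f) - x**2 * H0(f))
# ... equals 2*(x p + p x), i.e. four times the symmetrized expression (x p + p x)/2.
rhs = 2 * (x * p(f) + p(x * f))
print(sp.simplify(lhs - rhs))             # -> 0
```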
  • 263. 12.2. Incoming and outgoing states 251 Since the dilation group acts on |x| only, it seems reasonable to switch to polar coordinates x = rω, (t, ω) ∈ R+ × S n−1 . Since U (s) essentially trans- forms r into r exp(s), we will replace r by ρ = log(r). In these coordinates we have U (s)ψ(eρ ω) = e−ns/2 ψ(e(ρ−s) ω) (12.21) and hence U (s) corresponds to a shift of ρ (the constant in front is absorbed by the volume element). Thus D corresponds to differentiation with respect to this coordinate and all we have to do to make it a multiplication operator is to take the Fourier transform with respect to ρ. This leads us to the Mellin transform M : L2 (Rn ) → L2 (R × S n−1 ), (12.22) ∞ 1 n ψ(rω) → (Mψ)(λ, ω) = √ r−iλ ψ(rω)r 2 −1 dr. 2π 0 By construction, M is unitary; that is, |(Mψ)(λ, ω)|2 dλdn−1 ω = |ψ(rω)|2 rn−1 drdn−1 ω, R S n−1 R+ S n−1 (12.23) where dn−1 ω is the normalized surface measure on S n−1 . Moreover, −1 −isλ M U (s)M = e (12.24) and hence M−1 DM = λ. (12.25) From this it is straightforward to show that σ(D) = σac (D) = R, σsc (D) = σpp (D) = ∅ (12.26) and that S(Rn ) is a core for D. In particular we have P+ + P− = I. Using the Mellin transform, we can now prove Perry’s estimate. ∞ Lemma 12.5. Suppose f ∈ Cc (R) with supp(f ) ⊂ (a2 , b2 ) for some a, b 0. For any R ∈ R, N ∈ N there is a constant C such that C χ{x| |x|2a|t|} e−itH0 f (H0 )PD ((±R, ±∞)) ≤ , ±t ≥ 0, (12.27) (1 + |t|)N respectively. Proof. We prove only the + case, the remaining one being similar. Consider ψ ∈ S(Rn ). Introducing ψ(t, x) = e−itH0 f (H0 )PD ((R, ∞))ψ(x) = Kt,x , FPD ((R, ∞))ψ ˆ = Kt,x , PD ((−∞, −R))ψ , where 1 2 Kt,x (p) = n/2 ei(tp −px) f (p2 )∗ , (2π)
  • 264. 252 12. Scattering theory we see that it suffices to show 2 const PD ((−∞, −R))Kt,x ≤ , for |x| 2a|t|, t 0. (1 + |t|)2N Now we invoke the Mellin transform to estimate this norm: −R 2 PD ((−∞, −R))Kt,x = |(MKt,x )(λ, ω)|2 dλdn−1 ω. −∞ S n−1 Since ∞ 1 ˜ (MKt,x )(λ, ω) = f (r)eiα(r) dr (12.28) (2π)(n+1)/2 0 ˜ with f (r) = f (r2 )∗ rn/2−1 ∈ Cc ((a, b)), α(r) = tr2 + rωx − λ log(r). Esti- ∞ mating the derivative of α, we see α (r) = 2tr + ωx − λ/r 0, r ∈ (a, b), for λ ≤ −R and t −R(2εa)−1 , where ε is the distance of a to the support ˜. Hence we can find a constant such that of f |α (r)| ≥ const(1 + |λ| + |t|), ˜ r ∈ supp(f ), for λ ≤ −R, t −R(εa)−1 . Now the method of stationary phase (Prob- lem 12.1) implies const |(MKt,x )(λ, ω)| ≤ (1 + |λ| + |t|)N for λ, t as before. By increasing the constant, we can even assume that it holds for t ≥ 0 and λ ≤ −R. This finishes the proof. ∞ Corollary 12.6. Suppose that f ∈ Cc ((0, ∞)) and R ∈ R. Then the operator PD ((±R, ±∞))f (H0 ) exp(−itH0 ) converges strongly to 0 as t → ∞. Proof. Abbreviating PD = PD ((±R, ±∞)) and χ = χ{x| |x|2a|t|} , we have PD f (H0 )e−itH0 ψ ≤ χeitH0 f (H0 )∗ PD ψ + f (H0 ) (I − χ)ψ since A = A∗ . Taking t → ∞, the first term goes to zero by our lemma and the second goes to zero since χψ → ψ. Problem 12.1 (Method of stationary phase). Consider the integral ∞ I(t) = f (r)eitφ(r) dr −∞ with f ∈ ∞ Cc (R)and a real-valued phase φ ∈ C ∞ (R). Show that |I(t)| ≤ CN t−N for any N ∈ N if |φ (r)| ≥ 1 for r ∈ supp(f ). (Hint: Make a change of variables ρ = φ(r) and conclude that it suffices to show the case φ(r) = r. Now use integration by parts.)
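A numerical illustration of Problem 12.1 (added here, not from the book; the bump function and the phase are arbitrary choices satisfying the hypotheses): with $|\phi'| \ge 1$ on the support of $f$, the oscillatory integral decays very rapidly in $t$.

```python
import numpy as np

# Problem 12.1: with f smooth and compactly supported and |phi'(r)| >= 1 on supp(f),
# the oscillatory integral I(t) = int f(r) exp(i*t*phi(r)) dr decays faster than any
# power of t.  Arbitrary choices below: a standard bump and phi(r) = r + r**3/10.
def f(r):
    out = np.zeros_like(r)
    m = np.abs(r) < 1
    out[m] = np.exp(-1.0 / (1.0 - r[m]**2))      # C_c^infty bump supported in (-1, 1)
    return out

phi = lambda r: r + r**3 / 10                     # phi'(r) = 1 + 3*r**2/10 >= 1

r = np.linspace(-1.0, 1.0, 400001)
dr = r[1] - r[0]
for t in [10.0, 50.0, 250.0]:
    I = np.sum(f(r) * np.exp(1j * t * phi(r))) * dr   # Riemann sum approximation
    print(f"t = {t:6.1f}   |I(t)| = {abs(I):.3e}")
# the magnitudes fall off very quickly with t, consistent with the
# faster-than-any-power bound from the non-stationary phase argument
```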
  • 265. 12.3. Schr¨dinger operators with short range potentials o 253 12.3. Schr¨dinger operators with short range potentials o By the RAGE theorem we know that for ψ ∈ Hc , ψ(t) will eventually leave every compact ball (at least on the average). Hence we expect that the time evolution will asymptotically look like the free one for ψ ∈ Hc if the potential decays sufficiently fast. In other words, we expect such potentials to be asymptotically complete. Suppose V is relatively bounded with bound less than one. Introduce h1 (r) = V RH0 (z)χr , h2 (r) = χr V RH0 (z) , r ≥ 0, (12.29) where χr = χ{x| |x|≥r} . (12.30) The potential V will be called short range if these quantities are integrable. We first note that it suffices to check this for h1 or h2 and for one z ∈ ρ(H0 ). Lemma 12.7. The function h1 is integrable if and only if h2 is. Moreover, hj integrable for one z0 ∈ ρ(H0 ) implies hj integrable for all z ∈ ρ(H0 ). ∞ Proof. Pick φ ∈ Cc (Rn , [0, 1]) such that φ(x) = 0 for 0 ≤ |x| ≤ 1/2 and φ(x) = 0 for 1 ≤ |x|. Then it is not hard to see that hj is integrable if and ˜ only if hj is integrable, where ˜ h1 (r) = V RH0 (z)φr , ˜ h2 (r) = φr V RH0 (z) , r ≥ 1, and φr (x) = φ(x/r). Using [RH0 (z), φr ] = −RH0 (z)[H0 (z), φr ]RH0 (z) = RH0 (z)(∆φr + (∂φr )∂)RH0 (z) and ∆φr = φr/2 ∆φr , ∆φr ∞ ≤ ∆φ 2, ∞ /r respectively, (∂φr ) = φr/2 (∂φr ), ∂φr ∞ ≤ ∂φ ∞ /r2 , we see ˜ ˜ c˜ |h1 (r) − h2 (r)| ≤ h1 (r/2), r ≥ 1. r ˜ 2 is integrable if h1 is. Conversely, Hence h ˜ ˜ ˜ c˜ ˜ c˜ 2c ˜ h1 (r) ≤ h2 (r) + h1 (r/2) ≤ h2 (r) + h2 (r/2) + 2 h1 (r/4) r r r ˜ ˜ shows that h2 is integrable if h1 is. Invoking the first resolvent formula φr V RH0 (z) ≤ φr V RH0 (z0 ) I − (z − z0 )RH0 (z) finishes the proof. As a first consequence note Lemma 12.8. If V is short range, then RH (z) − RH0 (z) is compact.
  • 266. 254 12. Scattering theory Proof. The operator RH (z)V (I−χr )RH0 (z) is compact since (I−χr )RH0 (z) is by Lemma 7.11 and RH (z)V is bounded by Lemma 6.23. Moreover, by our short range condition it converges in norm to RH (z)V RH0 (z) = RH (z) − RH0 (z) as r → ∞ (at least for some subsequence). In particular, by Weyl’s theorem we have σess (H) = [0, ∞). Moreover, V short range implies that H and H0 look alike far outside. Lemma 12.9. Suppose RH (z)−RH0 (z) is compact. Then so is f (H)−f (H0 ) for any f ∈ C∞ (R) and lim (f (H) − f (H0 ))χr = 0. (12.31) r→∞ Proof. The first part is Lemma 6.21 and the second part follows from part (ii) of Lemma 6.8 since χr converges strongly to 0. However, this is clearly not enough to prove asymptotic completeness and we need a more careful analysis. We begin by showing that the wave operators exist. By Cook’s criterion (Lemma 12.3) we need to show that V exp( itH0 )ψ ≤ V RH0 (−1) (I − χ2a|t| ) exp( itH0 )(H0 + I)ψ + V RH0 (−1)χ2a|t| (H0 + I)ψ (12.32) is integrable for a dense set of vectors ψ. The second term is integrable by our short range assumption. The same is true by Perry’s estimate (Lemma 12.5) for the first term if we choose ψ = f (H0 )PD ((±R, ±∞))ϕ. Since vectors of this form are dense, we see that the wave operators exist, D(Ω± ) = H. (12.33) Since H restricted to Ran(Ω∗ ) is unitarily equivalent to H0 , we obtain ± [0, ∞) = σac (H0 ) ⊆ σac (H). Furthermore, by σac (H) ⊆ σess (H) = [0, ∞) we even have σac (H) = [0, ∞). To prove asymptotic completeness of the wave operators, we will need that the (Ω± − I)f (H0 )P± are compact. ∞ Lemma 12.10. Let f ∈ Cc ((0, ∞)) and suppose ψn converges weakly to 0. Then lim (Ω± − I)f (H0 )P± ψn = 0; (12.34) n→∞ that is, (Ω± − I)f (H0 )P± is compact.
  • 267. 12.3. Schr¨dinger operators with short range potentials o 255 Proof. By (12.13) we see ∞ RH (z)(Ω± − I)f (H0 )P± ψn ≤ RH (z)V exp(−isH0 )f (H0 )P± ψn dt. 0 Since RH (z)V RH0 is compact, we see that the integrand RH (z)V exp(−isH0 )f (H0 )P± ψn = RH (z)V RH0 exp(−isH0 )(H0 + 1)f (H0 )P± ψn converges pointwise to 0. Moreover, arguing as in (12.32), the integrand is bounded by an L1 function depending only on ψn . Thus RH (z)(Ω± − I)f (H0 )P± is compact by the dominated convergence theorem. Furthermore, using the intertwining property, we see that ˜ (Ω± − I)f (H0 )P± =RH (z)(Ω± − I)f (H0 )P± − (RH (z) − RH0 (z))f (H0 )P± ˜ is compact by Lemma 6.21, where f (λ) = (λ + 1)f (λ). Now we have gathered enough information to tackle the problem of asymptotic completeness. We first show that the singular continuous spectrum is absent. This is not really necessary, but it avoids the use of Ces`ro means in our main a argument. sc Abbreviate P = PH PH ((a, b)), 0 a b. Since H restricted to Ran(Ω± ) is unitarily equivalent to H0 (which has purely absolutely continu- ous spectrum), the singular part must live on Ran(Ω± )⊥ ; that is, PH Ω± = 0. sc Thus P f (H0 ) = P (I − Ω+ )f (H0 )P+ + P (I − Ω− )f (H0 )P− is compact. Since f (H) − f (H0 ) is compact, it follows that P f (H) is also compact. Choos- ing f such that f (λ) = 1 for λ ∈ [a, b], we see that P = P f (H) is com- pact and hence finite dimensional. In particular σsc (H) ∩ (a, b) is a fi- nite set. But a continuous measure cannot be supported on a finite set, showing σsc (H) ∩ (a, b) = ∅. Since 0 a b are arbitrary, we even have σsc (H) ∩ (0, ∞) = ∅ and by σsc (H) ⊆ σess (H) = [0, ∞), we obtain σsc (H) = ∅. sc pp Observe that replacing PH by PH , the same argument shows that all nonzero eigenvalues are finite dimensional and cannot accumulate in (0, ∞). In summary we have shown Theorem 12.11. Suppose V is short range. Then σac (H) = σess (H) = [0, ∞), σsc (H) = ∅. (12.35) All nonzero eigenvalues have finite multiplicity and cannot accumulate in (0, ∞).
  • 268. 256 12. Scattering theory Now we come to the anticipated asymptotic completeness result of Enß. Choose ψ ∈ Hc (H) = Hac (H) such that ψ = f (H)ψ (12.36) for some f ∈ Cc∞ ((0, ∞)). By the RAGE theorem the sequence ψ(t) con- verges weakly to zero as t → ±∞. Abbreviate ψ(t) = exp(−itH)ψ. Intro- duce ϕ± (t) = f (H0 )P± ψ(t), (12.37) which satisfy lim ψ(t) − ϕ+ (t) − ϕ− (t) = 0. (12.38) t→±∞ Indeed this follows from ψ(t) = ϕ+ (t) + ϕ− (t) + (f (H) − f (H0 ))ψ(t) (12.39) and Lemma 6.21. Moreover, we even have lim (Ω± − I)ϕ± (t) = 0 (12.40) t→±∞ by Lemma 12.10. Now suppose ψ ∈ Ran(Ω± )⊥ . Then 2 ψ = lim ψ(t), ψ(t) t→±∞ = lim ψ(t), ϕ+ (t) + ϕ− (t) t→±∞ = lim ψ(t), Ω+ ϕ+ (t) + Ω− ϕ− (t) . (12.41) t→±∞ By Theorem 12.2, Ran(Ω± )⊥ is invariant under H and thus ψ(t) ∈ Ran(Ω± )⊥ implying 2 ψ = lim ψ(t), Ω ϕ (t) (12.42) t→±∞ = lim P f (H0 )∗ Ω∗ ψ(t), ψ(t) . t→±∞ Invoking the intertwining property, we see ψ 2 = lim P f (H0 )∗ e−itH0 Ω∗ ψ, ψ(t) = 0 (12.43) t→±∞ by Corollary 12.6. Hence Ran(Ω± ) = Hac (H) = Hc (H) and we thus have shown Theorem 12.12 (Enß). Suppose V is short range. Then the wave operators are asymptotically complete.
  • 271. Appendix A Almost everything about Lebesgue integration In this appendix I give a brief introduction to measure theory. Good refer- ences are [7], [32], or [47]. A.1. Borel measures in a nut shell The first step in defining the Lebesgue integral is extending the notion of size from intervals to arbitrary sets. Unfortunately, this turns out to be too much, since a classical paradox by Banach and Tarski shows that one can break the unit ball in R3 into a finite number of (wild – choosing the pieces uses the Axiom of Choice and cannot be done with a jigsaw;-) pieces, rotate and translate them, and reassemble them to get two copies of the unit ball (compare Problem A.1). Hence any reasonable notion of size (i.e., one which is translation and rotation invariant) cannot be defined for all sets! A collection of subsets A of a given set X such that • X ∈ A, • A is closed under finite unions, • A is closed under complements is called an algebra. Note that ∅ ∈ A and that, by de Morgan, A is also closed under finite intersections. If an algebra is closed under countable unions (and hence also countable intersections), it is called a σ-algebra. 259
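For a finite base set one can carry out the closure under complements and finite unions explicitly, which produces the smallest algebra containing a given collection of sets (and for a finite set every algebra is automatically a σ-algebra). A small sketch, added here and not from the book:

```python
def generated_algebra(X, S):
    """Smallest algebra over the finite set X containing the sets in S.
    For finite X this is already the generated sigma-algebra."""
    X = frozenset(X)
    sigma = {frozenset(), X} | {frozenset(A) for A in S}
    changed = True
    while changed:
        changed = False
        current = list(sigma)
        for A in current:
            if X - A not in sigma:                  # closure under complements
                sigma.add(X - A); changed = True
        for A in current:
            for B in current:
                if A | B not in sigma:              # closure under finite unions
                    sigma.add(A | B); changed = True
    return sigma

alg = generated_algebra({1, 2, 3, 4}, [{1}, {2, 3}])
print(len(alg))                                     # 8 = 2**3 sets: atoms {1}, {2,3}, {4}
print(sorted(sorted(A) for A in alg))
```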
  • 272. 260 A. Almost everything about Lebesgue integration Moreover, the intersection of any family of (σ-)algebras {Aα } is again a (σ-)algebra and for any collection S of subsets there is a unique smallest (σ-)algebra Σ(S) containing S (namely the intersection of all (σ-)algebras containing S). It is called the (σ-)algebra generated by S. If X is a topological space, the Borel σ-algebra of X is defined to be the σ-algebra generated by all open (respectively, all closed) sets. Sets in the Borel σ-algebra are called Borel sets. Example. In the case X = Rn the Borel σ-algebra will be denoted by Bn and we will abbreviate B = B1 . Now let us turn to the definition of a measure: A set X together with a σ-algebra Σ is called a measurable space. A measure µ is a map µ : Σ → [0, ∞] on a σ-algebra Σ such that • µ(∅) = 0, ∞ ∞ • µ( j=1 Aj ) = µ(Aj ) if Aj ∩ Ak = ∅ for all j = k (σ-additivity). j=1 It is called σ-finite if there is a countable cover {Xj }∞ of X with µ(Xj ) j=1 ∞ for all j. (Note that it is no restriction to assume Xj ⊆ Xj+1 .) It is called finite if µ(X) ∞. The sets in Σ are called measurable sets and the triple X, Σ, and µ is referred to as a measure space. If we replace the σ-algebra by an algebra A, then µ is called a premea- sure. In this case σ-additivity clearly only needs to hold for disjoint sets An for which n An ∈ A. We will write An A if An ⊆ An+1 (note A = n An ) and An A if An+1 ⊆ An (note A = n An ). Theorem A.1. Any measure µ satisfies the following properties: (i) A ⊆ B implies µ(A) ≤ µ(B) (monotonicity). (ii) µ(An ) → µ(A) if An A (continuity from below). (iii) µ(An ) → µ(A) if An A and µ(A1 ) ∞ (continuity from above). ˜ Proof. The first claim is obvious. The second follows using An = An An−1 ˜ and σ-additivity. The third follows from the second using An = A1 An and ˜n ) = µ(A1 ) − µ(An ). µ(A Example. Let A ∈ P(M ) and set µ(A) to be the number of elements of A (respectively, ∞ if A is infinite). This is the so-called counting measure. Note that if X = N and An = {j ∈ N|j ≥ n}, then µ(An ) = ∞, but µ( n An ) = µ(∅) = 0 which shows that the requirement µ(A1 ) ∞ in the last claim of Theorem A.1 is not superfluous.
  • 273. A.1. Borel measures in a nut shell 261 A measure on the Borel σ-algebra is called a Borel measure if µ(C) ∞ for any compact set C. A Borel measures is called outer regular if µ(A) = inf µ(O) (A.1) A⊆O,O open and inner regular if µ(A) = sup µ(C). (A.2) C⊆A,C compact It is called regular if it is both outer and inner regular. But how can we obtain some more interesting Borel measures? We will restrict ourselves to the case of X = R for simplicity. Then the strategy is as follows: Start with the algebra of finite unions of disjoint intervals and define µ for those sets (as the sum over the intervals). This yields a premeasure. Extend this to an outer measure for all subsets of R. Show that the restriction to the Borel sets is a measure. Let us first show how we should define µ for intervals: To every Borel measure on B we can assign its distribution function   −µ((x, 0]), x 0, µ(x) = 0, x = 0, (A.3) µ((0, x]), x 0,  which is right continuous and nondecreasing. Conversely, given a right con- tinuous nondecreasing function µ : R → R, we can set   µ(b) − µ(a),  A = (a, b], µ(b) − µ(a−), A = [a, b],  µ(A) = (A.4)  µ(b−) − µ(a),  A = (a, b), µ(b−) − µ(a−), A = [a, b),  where µ(a−) = limε↓0 µ(a − ε). In particular, this gives a premeasure on the algebra of finite unions of intervals which can be extended to a measure: Theorem A.2. For every right continuous nondecreasing function µ : R → R there exists a unique regular Borel measure µ which extends (A.4). Two different functions generate the same measure if and only if they differ by a constant. Since the proof of this theorem is rather involved, we defer it to the next section and look at some examples first. Example. Suppose Θ(x) = 0 for x 0 and Θ(x) = 1 for x ≥ 0. Then we obtain the so-called Dirac measure at 0, which is given by Θ(A) = 1 if 0 ∈ A and Θ(A) = 0 if 0 ∈ A.
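A quick illustration of (A.3)/(A.4), added here (not from the book): take the distribution function of Lebesgue measure plus a unit point mass at $1$ and read off the measures of the four kinds of intervals; the left limits are approximated numerically.

```python
# Distribution function of "Lebesgue measure + unit point mass at 1":
#   mu(x) = x + Theta(x - 1)   (right continuous and nondecreasing)
def mu(x):
    return x + (1.0 if x >= 1 else 0.0)

def mu_left(x, eps=1e-9):        # crude numerical stand-in for the left limit mu(x-)
    return mu(x - eps)

# The four cases of (A.4):
print("mu((0,1]) =", mu(1) - mu(0))            # = 2: length 1 plus the point mass at 1
print("mu((0,1)) =", mu_left(1) - mu(0))       # ~ 1: the open interval misses the mass
print("mu([1,1]) =", mu(1) - mu_left(1))       # ~ 1: the measure of the single point {1}
print("mu([0,2]) =", mu(2) - mu_left(0))       # ~ 3: length 2 plus the point mass
```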
  • 274. 262 A. Almost everything about Lebesgue integration Example. Suppose λ(x) = x. Then the associated measure is the ordinary Lebesgue measure on R. We will abbreviate the Lebesgue measure of a Borel set A by λ(A) = |A|. It can be shown that Borel measures on a locally compact second count- able space are always regular ([7, Thm. 29.12]). A set A ∈ Σ is called a support for µ if µ(XA) = 0. A property is said to hold µ-almost everywhere (a.e.) if it holds on a support for µ or, equivalently, if the set where it does not hold is contained in a set of measure zero. Example. The set of rational numbers has Lebesgue measure zero: λ(Q) = 0. In fact, any single point has Lebesgue measure zero, and so has any countable union of points (by countable additivity). Example. The Cantor set is an example of a closed uncountable set of Lebesgue measure zero. It is constructed as follows: Start with C0 = [0, 1] and remove the middle third to obtain C1 = [0, 3 ]∪[ 2 , 1]. Next, again remove 1 3 the middle third’s of the remaining sets to obtain C2 = [0, 1 ]∪[ 2 , 1 ]∪[ 2 , 7 ]∪ 9 9 3 3 9 [ 8 , 1]: 9 C0 C1 C2 . . C3 . Proceeding like this, we obtain a sequence of nesting sets Cn and the limit C = n Cn is the Cantor set. Since Cn is compact, so is C. Moreover, Cn consists of 2n intervals of length 3−n , and thus its Lebesgue measure is λ(Cn ) = (2/3)n . In particular, λ(C) = limn→∞ λ(Cn ) = 0. Using the ternary expansion, it is extremely simple to describe: C is the set of all x ∈ [0, 1] whose ternary expansion contains no one’s, which shows that C is uncountable (why?). It has some further interesting properties: it is totally disconnected (i.e., it contains no subintervals) and perfect (it has no isolated points). Problem A.1 (Vitali set). Call two numbers x, y ∈ [0, 1) equivalent if x − y is rational. Construct the set V by choosing one representative from each equivalence class. Show that V cannot be measurable with respect to any nontrivial finite translation invariant measure on [0, 1). (Hint: How can you build up [0, 1) from translations of V ?)
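The measure computation for the Cantor set example above is easy to reproduce exactly with rational arithmetic (a sketch added here, not from the book):

```python
from fractions import Fraction

# Construct the approximants C_n of the Cantor set as lists of closed intervals
# and check lambda(C_n) = (2/3)**n, which tends to 0 = lambda(C).
intervals = [(Fraction(0), Fraction(1))]                     # C_0 = [0, 1]
for n in range(1, 7):
    next_intervals = []
    for a, b in intervals:
        third = (b - a) / 3
        next_intervals += [(a, a + third), (b - third, b)]   # remove the middle third
    intervals = next_intervals
    total = sum(b - a for a, b in intervals)
    print(n, len(intervals), total, total == Fraction(2, 3) ** n)
# e.g. n = 6: 64 intervals of length 3**-6, total measure (2/3)**6
```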
  • 275. A.2. Extending a premeasure to a measure 263 A.2. Extending a premeasure to a measure The purpose of this section is to prove Theorem A.2. It is rather technical and should be skipped on first reading. In order to prove Theorem A.2, we need to show how a premeasure can be extended to a measure. As a prerequisite we first establish that it suffices to check increasing (or decreasing) sequences of sets when checking whether a given algebra is in fact a σ-algebra: A collection of sets M is called a monotone class if An A implies A ∈ M whenever An ∈ M and An A implies A ∈ M whenever An ∈ M. Every σ-algebra is a monotone class and the intersection of monotone classes is a monotone class. Hence every collection of sets S generates a smallest monotone class M(S). Theorem A.3. Let A be an algebra. Then M(A) = Σ(A). Proof. We first show that M = M(A) is an algebra. Put M (A) = {B ∈ M|A ∪ B ∈ M}. If Bn is an increasing sequence of sets in M (A), then A ∪ Bn is an increasing sequence in M and hence n (A ∪ Bn ) ∈ M. Now A∪ Bn = (A ∪ Bn ) n n shows that M (A) is closed under increasing sequences. Similarly, M (A) is closed under decreasing sequences and hence it is a monotone class. But does it contain any elements? Well, if A ∈ A, we have A ⊆ M (A) implying M (A) = M for A ∈ A. Hence A ∪ B ∈ M if at least one of the sets is in A. But this shows A ⊆ M (A) and hence M (A) = M for any A ∈ M. So M is closed under finite unions. To show that we are closed under complements, consider M = {A ∈ M|XA ∈ M}. If An is an increasing sequence, then XAn is a decreasing sequence and X n An = n XAn ∈ M if An ∈ M and similarly for decreasing sequences. Hence M is a monotone class and must be equal to M since it contains A. So we know that M is an algebra. To show that it is a σ-algebra, let ˜ ˜ An ∈ M be given and put An = k≤n An ∈ M. Then An is increasing and ˜ n An = n An ∈ M. The typical use of this theorem is as follows: First verify some property for sets in an algebra A. In order to show that it holds for any set in Σ(A), it suffices to show that the collection of sets for which it holds is closed under countable increasing and decreasing sequences (i.e., is a monotone class). Now we start by proving that (A.4) indeed gives rise to a premeasure.
  • 276. 264 A. Almost everything about Lebesgue integration Lemma A.4. The interval function µ defined in (A.4) gives rise to a unique σ-finite regular premeasure on the algebra A of finite unions of disjoint in- tervals. Proof. First of all, (A.4) can be extended to finite unions of disjoint inter- vals by summing over all intervals. It is straightforward to verify that µ is well-defined (one set can be represented by different unions of intervals) and by construction additive. To show regularity, we can assume any such union to consist of open intervals and points only. To show outer regularity, replace each point {x} by a small open interval (x+ε, x−ε) and use that µ({x}) = limε↓0 µ(x+ε)− µ(x−ε). Similarly, to show inner regularity, replace each open interval (a, b) by a compact one, [an , bn ] ⊆ (a, b), and use µ((a, b)) = limn→∞ µ(bn )−µ(an ) if an ↓ a and bn ↑ b. It remains to verify σ-additivity. We need to show µ( Ik ) = µ(Ik ) k k whenever In ∈ A and I = k Ik ∈ A. Since each In is a finite union of in- tervals, we can as well assume each In is just one interval (just split In into its subintervals and note that the sum does not change by additivity). Sim- ilarly, we can assume that I is just one interval (just treat each subinterval separately). By additivity µ is monotone and hence n n µ(Ik ) = µ( Ik ) ≤ µ(I) k=1 k=1 which shows ∞ µ(Ik ) ≤ µ(I). k=1 To get the converse inequality, we need to work harder. By outer regularity we can cover each Ik by some open interval Jk such that µ(Jk ) ≤ µ(Ik ) + 2εk . First suppose I is compact. Then finitely many of the Jk , say the first n, cover I and we have n n ∞ µ(I) ≤ µ( Jk ) ≤ µ(Jk ) ≤ µ(Ik ) + ε. k=1 k=1 k=1 Since ε 0 is arbitrary, this shows σ-additivity for compact intervals. By additivity we can always add/subtract the endpoints of I and hence σ- additivity holds for any bounded interval. If I is unbounded, say I = [a, ∞),
  • 277. A.2. Extending a premeasure to a measure 265 then given x 0, we can find an n such that Jn cover at least [0, x] and hence n n µ(Ik ) ≥ µ(Jk ) − ε ≥ µ([a, x]) − ε. k=1 k=1 Since x a and ε 0 are arbitrary, we are done. This premeasure determines the corresponding measure µ uniquely (if there is one at all): Theorem A.5 (Uniqueness of measures). Let µ be a σ-finite premeasure on an algebra A. Then there is at most one extension to Σ(A). Proof. We first assume that µ(X) ∞. Suppose there is another extension µ and consider the set ˜ S = {A ∈ Σ(A)|µ(A) = µ(A)}. ˜ I claim S is a monotone class and hence S = Σ(A) since A ⊆ S by assump- tion (Theorem A.3). Let An A. If An ∈ S, we have µ(An ) = µ(An ) and taking limits ˜ (Theorem A.1 (ii)), we conclude µ(A) = µ(A). Next let An ˜ A and take limits again. This finishes the finite case. To extend our result to the σ-finite case, let Xj X be an increasing sequence such that µ(Xj ) ∞. By the finite case µ(A ∩ Xj ) = µ(A ∩ Xj ) (just restrict µ, µ to Xj ). Hence ˜ ˜ µ(A) = lim µ(A ∩ Xj ) = lim µ(A ∩ Xj ) = µ(A) ˜ ˜ j→∞ j→∞ and we are done. Note that if our premeasure is regular, so will the extension be: Lemma A.6. Suppose µ is a σ-finite measure on the Borel sets B. Then outer (inner) regularity holds for all Borel sets if it holds for all sets in some algebra A generating the Borel sets B. Proof. We first assume that µ(X) ∞. Set µ◦ (A) = inf µ(O) ≥ µ(A) A⊆O,O open and let M = {A ∈ B|µ◦ (A) = µ(A)}. Since by assumption M contains some algebra generating B, it suffices to prove that M is a monotone class. Let An ∈ M be a monotone sequence and let On ⊇ An be open sets such that µ(On ) ≤ µ(An ) + 2ε . Then n ε µ(An ) ≤ µ(On ) ≤ µ(An ) + n . 2
  • 278. 266 A. Almost everything about Lebesgue integration Now if An A, just take limits and use continuity from below of µ to see that On ⊇ An ⊇ A is a sequence of open sets with µ(On ) → µ(A). Similarly if An A, observe that O = n On satisfies O ⊇ A and µ(O) ≤ µ(A) + µ(On A) ≤ µ(A) + ε ε since µ(On A) ≤ µ(On An ) ≤ 2n . Next let µ be arbitrary. Let Xj be a cover with µ(Xj ) ∞. Given A, we can split it into disjoint sets Aj such that Aj ⊆ Xj (A1 = A ∩ X1 , A2 = (AA1 ) ∩ X2 , etc.). By regularity, we can assume Xj open. Thus there are open (in X) sets Oj covering Aj such that µ(Oj ) ≤ µ(Aj ) + 2εj . Then O = j Oj is open, covers A, and satisfies µ(A) ≤ µ(O) ≤ µ(Oj ) ≤ µ(A) + ε. j This settles outer regularity. Next let us turn to inner regularity. If µ(X) ∞, one can show as before that M = {A ∈ B|µ◦ (A) = µ(A)}, where µ◦ (A) = sup µ(C) ≤ µ(A) C⊆A,C compact is a monotone class. This settles the finite case. For the σ-finite case split A again as before. Since Xj has finite measure, there are compact subsets Kj of Aj such that µ(Aj ) ≤ µ(Kj ) + 2εj . Now we need to distinguish two cases: If µ(A) = ∞, the sum j µ(Aj ) will diverge and so will ˜ n = n ⊆ A is compact with j µ(Kj ). Hence K j=1 µ(K˜ n ) → ∞ = µ(A). If µ(A) ∞, the sum j µ(Aj ) will converge and choosing n sufficiently large, we will have ˜ ˜ µ(Kn ) ≤ µ(A) ≤ µ(Kn ) + 2ε. This finishes the proof. So it remains to ensure that there is an extension at all. For any pre- measure µ we define ∞ ∞ µ∗ (A) = inf µ(An ) A ⊆ An , A n ∈ A (A.5) n=1 n=1 where the infimum extends over all countable covers from A. Then the function µ∗ : P(X) → [0, ∞] is an outer measure; that is, it has the properties (Problem A.2) • µ∗ (∅) = 0, • A1 ⊆ A2 ⇒ µ∗ (A1 ) ≤ µ∗ (A2 ), and ∞ ∞ • µ∗ ( n=1 An ) ≤ ∗ n=1 µ (An ) (subadditivity).
  • 279. A.2. Extending a premeasure to a measure 267 Note that µ∗ (A) = µ(A) for A ∈ A (Problem A.3). Theorem A.7 (Extensions via outer measures). Let µ∗ be an outer measure. Then the set Σ of all sets A satisfying the Carath´odory condition e µ∗ (E) = µ∗ (A ∩ E) + µ∗ (A ∩ E), ∀E ⊆ X (A.6) (where A = XA is the complement of A) forms a σ-algebra and µ∗ re- stricted to this σ-algebra is a measure. Proof. We first show that Σ is an algebra. It clearly contains X and is closed under complements. Let A, B ∈ Σ. Applying Carath´odory’s condition e twice finally shows µ∗ (E) =µ∗ (A ∩ B ∩ E) + µ∗ (A ∩ B ∩ E) + µ∗ (A ∩ B ∩ E) + µ∗ (A ∩ B ∩ E) ≥µ∗ ((A ∪ B) ∩ E) + µ∗ ((A ∪ B) ∩ E), where we have used de Morgan and µ∗ (A ∩ B ∩ E) + µ∗ (A ∩ B ∩ E) + µ∗ (A ∩ B ∩ E) ≥ µ∗ ((A ∪ B) ∩ E) which follows from subadditivity and (A ∪ B) ∩ E = (A ∩ B ∩ E) ∪ (A ∩ B ∩ E) ∪ (A ∩ B ∩ E). Since the reverse inequality is just subadditivity, we conclude that Σ is an algebra. Next, let An be a sequence of sets from Σ. Without restriction we can assume that they are disjoint (compare the last argument in proof of ˜ Theorem A.3). Abbreviate An = k≤n An , A = n An . Then for any set E we have µ∗ (An ∩ E) = µ∗ (An ∩ An ∩ E) + µ∗ (An ∩ An ∩ E) ˜ ˜ ˜ = µ∗ (An ∩ E) + µ∗ (An−1 ∩ E) ˜ n = ... = µ∗ (Ak ∩ E). k=1 ˜ Using An ∈ Σ and monotonicity of µ∗ , we infer µ∗ (E) = µ∗ (An ∩ E) + µ∗ (An ∩ E) ˜ ˜ n ≥ µ∗ (Ak ∩ E) + µ∗ (A ∩ E). k=1 Letting n → ∞ and using subadditivity finally gives ∞ µ∗ (E) ≥ µ∗ (Ak ∩ E) + µ∗ (A ∩ E) k=1 ≥ µ∗ (A ∩ E) + µ∗ (B ∩ E) ≥ µ∗ (E) (A.7)
  • 280. 268 A. Almost everything about Lebesgue integration and we infer that Σ is a σ-algebra. Finally, setting E = A in (A.7), we have ∞ ∞ ∗ ∗ ∗ µ (A) = µ (Ak ∩ A) + µ (A ∩ A) = µ∗ (Ak ) k=1 k=1 and we are done. Remark: The constructed measure µ is complete; that is, for any mea- surable set A of measure zero, any subset of A is again measurable (Prob- lem A.4). The only remaining question is whether there are any nontrivial sets satisfying the Carath´odory condition. e Lemma A.8. Let µ be a premeasure on A and let µ∗ be the associated outer measure. Then every set in A satisfies the Carath´odory condition. e Proof. Let An ∈ A be a countable cover for E. Then for any A ∈ A we have ∞ ∞ ∞ µ(An ) = µ(An ∩ A) + µ(An ∩ A ) ≥ µ∗ (E ∩ A) + µ∗ (E ∩ A ) n=1 n=1 n=1 since An ∩ A ∈ A is a cover for E ∩ A and An ∩ A ∈ A is a cover for E ∩ A . Taking the infimum, we have µ∗ (E) ≥ µ∗ (E ∩A)+µ∗ (E ∩A ), which finishes the proof. Thus, as a consequence we obtain Theorem A.2. Problem A.2. Show that µ∗ defined in (A.5) is an outer measure. (Hint for the last property: Take a cover {Bnk }∞ for An such that µ∗ (An ) = k=1 ε ∞ ∞ 2n + k=1 µ(Bnk ) and note that {Bnk }n,k=1 is a cover for n An .) Problem A.3. Show that µ∗ defined in (A.5) extends µ. (Hint: For the cover An it is no restriction to assume An ∩ Am = ∅ and An ⊆ A.) Problem A.4. Show that the measure constructed in Theorem A.7 is com- plete. A.3. Measurable functions The Riemann integral works by splitting the x coordinate into small intervals and approximating f (x) on each interval by its minimum and maximum. The problem with this approach is that the difference between maximum and minimum will only tend to zero (as the intervals get smaller) if f (x) is sufficiently nice. To avoid this problem, we can force the difference to go to zero by considering, instead of an interval, the set of x for which f (x) lies
  • 281. A.3. Measurable functions 269 between two given numbers a b. Now we need the size of the set of these x, that is, the size of the preimage f −1 ((a, b)). For this to work, preimages of intervals must be measurable. A function f : X → Rn is called measurable if f −1 (A) ∈ Σ for every A ∈ Bn . A complex-valued function is called measurable if both its real and imaginary parts are. Clearly it suffices to check this condition for every set A in a collection of sets which generate Bn , since the collection of sets for which it holds forms a σ-algebra by f −1 (Rn A) = Xf −1 (A) and f −1 ( j Aj ) = −1 (A ). jf j Lemma A.9. A function f : X → Rn is measurable if and only if n f −1 (I) ∈ Σ ∀I = (aj , ∞). (A.8) j=1 In particular, a function f : X → Rn is measurable if and only if every component is measurable. Proof. We need to show that B is generated by rectangles of the above form. The σ-algebra generated by these rectangles also contains all open rectangles of the form I = n (aj , bj ). Moreover, given any open set O, j=1 we can cover it by such open rectangles satisfying I ⊆ O. By Lindel¨f’s o theorem there is a countable subcover and hence every open set can be written as a countable union of open rectangles. Clearly the intervals (aj , ∞) can also be replaced by [aj , ∞), (−∞, aj ), or (−∞, aj ]. If X is a topological space and Σ the corresponding Borel σ-algebra, we will also call a measurable function a Borel function. Note that, in particular, Lemma A.10. Let X be a topological space and Σ its Borel σ-algebra. Any continuous function is Borel. Moreover, if f : X → Rn and g : Y ⊆ Rn → Rm are Borel functions, then the composition g ◦ f is again Borel. Sometimes it is also convenient to allow ±∞ as possible values for f , that is, functions f : X → R, R = R ∪ {−∞, ∞}. In this case A ⊆ R is called Borel if A ∩ R is. The set of all measurable functions forms an algebra. Lemma A.11. Let X be a topological space and Σ its Borel σ-algebra. Suppose f, g : X → R are measurable functions. Then the sum f + g and the product f g are measurable. Proof. Note that addition and multiplication are continuous functions from R2 → R and hence the claim follows from the previous lemma.
  • 282. 270 A. Almost everything about Lebesgue integration Moreover, the set of all measurable functions is closed under all impor- tant limiting operations. Lemma A.12. Suppose fn : X → R is a sequence of measurable functions. Then inf fn , sup fn , lim inf fn , lim sup fn (A.9) n∈N n∈N n→∞ n→∞ are measurable as well. Proof. It suffices to prove that sup fn is measurable since the rest follows from inf fn = − sup(−fn ), lim inf fn = supk inf n≥k fn , and lim sup fn = inf k supn≥k fn . But (sup fn )−1 ((a, ∞)) = n fn ((a, ∞)) and we are done. −1 A few immediate consequences are worthwhile noting: It follows that if f and g are measurable functions, so are min(f, g), max(f, g), |f | = max(f, −f ), and f ± = max(±f, 0). Furthermore, the pointwise limit of measurable functions is again measurable. A.4. The Lebesgue integral Now we can define the integral for measurable functions as follows. A mea- surable function s : X → R is called simple if its range is finite; that is, if p s= α j χA j , Aj = s−1 (αj ) ∈ Σ. (A.10) j=1 Here χA is the characteristic function of A; that is, χA (x) = 1 if x ∈ A and χA (x) = 0 otherwise. For a nonnegative simple function we define its integral as p s dµ = αj µ(Aj ∩ A). (A.11) A j=1 Here we use the convention 0 · ∞ = 0. Lemma A.13. The integral has the following properties: (i) A s dµ = X χA s dµ. (ii) S∞ s dµ = ∞ Aj j=1 s dµ, Aj ∩ Ak = ∅ for j = k. j=1 Aj (iii) A α s dµ =α A s dµ, α ≥ 0. (iv) A (s + t)dµ = A s dµ + A t dµ. (v) A ⊆ B ⇒ A s dµ ≤ B s dµ. (vi) s ≤ t ⇒ A s dµ ≤ A t dµ.
  • 283. A.4. The Lebesgue integral 271 Proof. (i) is clear from the definition. (ii) follows from σ-additivity of µ. (iii) is obvious. (iv) Let s = j α j χA j , t = j βj χBj and abbreviate Cjk = (Aj ∩ Bk ) ∩ A. Then, by (ii), (s + t)dµ = (s + t)dµ = (αj + βk )µ(Cjk ) A j,k Cjk j,k = s dµ + t dµ = s dµ + t dµ. j,k Cjk Cjk A A (v) follows from monotonicity of µ. (vi) follows since by (iv) we can write s = j αj χCj , t = j βj χCj where, by assumption, αj ≤ βj . Our next task is to extend this definition to arbitrary positive functions by f dµ = sup s dµ, (A.12) A s≤f A where the supremum is taken over all simple functions s ≤ f . Note that, except for possibly (ii) and (iv), Lemma A.13 still holds for this extension. Theorem A.14 (Monotone convergence). Let fn be a monotone nondecreas- ing sequence of nonnegative measurable functions, fn f . Then fn dµ → f dµ. (A.13) A A Proof. By property (vi), A fn dµ is monotone and converges to some num- ber α. By fn ≤ f and again (vi) we have α≤ f dµ. A To show the converse, let s be simple such that s ≤ f and let θ ∈ (0, 1). Put An = {x ∈ A|fn (x) ≥ θs(x)} and note An A (show this). Then fn dµ ≥ fn dµ ≥ θ s dµ. A An An Letting n → ∞, we see α≥θ s dµ. A Since this is valid for any θ 1, it still holds for θ = 1. Finally, since s ≤ f is arbitrary, the claim follows. In particular f dµ = lim sn dµ, (A.14) A n→∞ A
  • 284. 272 A. Almost everything about Lebesgue integration for any monotone sequence sn f of simple functions. Note that there is always such a sequence, for example, 2n k k k+1 sn (x) = χ −1 (x), Ak = [ , ), A2n = [n, ∞). (A.15) 2n f (Ak ) 2n 2n k=0 By construction sn converges uniformly if f is bounded, since sn (x) = n if 1 f (x) = ∞ and f (x) − sn (x) n if f (x) n + 1. Now what about the missing items (ii) and (iv) from Lemma A.13? Since limits can be spread over sums, the extension is linear (i.e., item (iv) holds) and (ii) also follows directly from the monotone convergence theorem. We even have the following result: Lemma A.15. If f ≥ 0 is measurable, then dν = f dµ defined via ν(A) = f dµ (A.16) A is a measure such that g dν = gf dµ. (A.17) Proof. As already mentioned, additivity of µ is equivalent to linearity of the integral and σ-additivity follows from the monotone convergence theorem: ∞ ∞ ∞ ∞ ν( An ) = ( χAn )f dµ = χAn f dµ = ν(An ). n=1 n=1 n=1 n=1 The second claim holds for simple functions and hence for all functions by construction of the integral. If fn is not necessarily monotone, we have at least Theorem A.16 (Fatou’s lemma). If fn is a sequence of nonnegative mea- surable function, then lim inf fn dµ ≤ lim inf fn dµ. (A.18) A n→∞ n→∞ A Proof. Set gn = inf k≥n fk . Then gn ≤ fn implying gn dµ ≤ fn dµ. A A Now take the lim inf on both sides and note that by the monotone conver- gence theorem lim inf gn dµ = lim gn dµ = lim gn dµ = lim inf fn dµ, n→∞ A n→∞ A A n→∞ A n→∞ proving the claim.
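The canonical approximating sequence (A.15) and the monotone convergence theorem can be seen in action numerically. A sketch added here (not from the book) for $f(x) = x^2$ on $[0,1]$ with Lebesgue measure; the integrals of the simple functions increase to $\int_0^1 x^2\,dx = 1/3$.

```python
import numpy as np

def s_n(f_vals, n):
    """The simple functions of (A.15): s_n = k/2**n where k/2**n <= f < (k+1)/2**n,
    capped at the value n.  Evaluated here on a fixed grid of sample points."""
    capped = np.minimum(f_vals, float(n))
    return np.floor(capped * 2**n) / 2**n

x = np.linspace(0.0, 1.0, 1_000_001)
dx = x[1] - x[0]
f_vals = x**2

for n in [1, 2, 4, 8, 12]:
    integral = np.sum(s_n(f_vals, n)) * dx        # Riemann sum of the simple function
    print(n, integral)
print("limit:", 1 / 3)
# the values increase monotonically (s_n <= s_{n+1} pointwise) towards 1/3
```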
  • 285. A.4. The Lebesgue integral 273 If the integral is finite for both the positive and negative part f ± of an arbitrary measurable function f , we call f integrable and set f dµ = f + dµ − f − dµ. (A.19) A A A The set of all integrable functions is denoted by L1 (X, dµ). Lemma A.17. Lemma A.13 holds for integrable functions s, t. Similarly, we handle the case where f is complex-valued by calling f integrable if both the real and imaginary part are and setting f dµ = Re(f )dµ + i Im(f )dµ. (A.20) A A A Clearly f is integrable if and only if |f | is. Lemma A.18. For any integrable functions f , g we have | f dµ| ≤ |f | dµ (A.21) A A and (triangle inequality) |f + g| dµ ≤ |f | dµ + |g| dµ. (A.22) A A A z∗ Proof. Put α = |z| , where z = Af dµ (without restriction z = 0). Then | f dµ| = α f dµ = α f dµ = Re(α f ) dµ ≤ |f | dµ, A A A A A proving the first claim. The second follows from |f + g| ≤ |f | + |g|. In addition, our integral is well behaved with respect to limiting opera- tions. Theorem A.19 (Dominated convergence). Let fn be a convergent sequence of measurable functions and set f = limn→∞ fn . Suppose there is an inte- grable function g such that |fn | ≤ g. Then f is integrable and lim fn dµ = f dµ. (A.23) n→∞ Proof. The real and imaginary parts satisfy the same assumptions and so do the positive and negative parts. Hence it suffices to prove the case where fn and f are nonnegative. By Fatou’s lemma lim inf fn dµ ≥ f dµ n→∞ A A
  • 286. 274 A. Almost everything about Lebesgue integration and lim inf (g − fn )dµ ≥ (g − f )dµ. n→∞ A A Subtracting A g dµ on both sides of the last inequality finishes the proof since lim inf(−fn ) = − lim sup fn . Remark: Since sets of measure zero do not contribute to the value of the integral, it clearly suffices if the requirements of the dominated convergence theorem are satisfied almost everywhere (with respect to µ). 1 Note that the existence of g is crucial, as the example fn (x) = n χ[−n,n] (x) on R with Lebesgue measure shows. Example. If µ(x) = n αn Θ(x − xn ) is a sum of Dirac measures, Θ(x) centered at x = 0, then f (x)dµ(x) = αn f (xn ). (A.24) n Hence our integral contains sums as special cases. Problem A.5. Show that the set B(X) of bounded measurable functions with the sup norm is a Banach space. Show that the set S(X) of simple functions is dense in B(X). Show that the integral is a bounded linear func- tional on B(X). (Hence Theorem 0.26 could be used to extend the integral from simple to bounded measurable functions.) Problem A.6. Show that the dominated convergence theorem implies (un- der the same assumptions) lim |fn − f |dµ = 0. n→∞ Problem A.7. Let X ⊆ R, Y be some measure space, and f : X × Y → R. Suppose y → f (x, y) is measurable for every x and x → f (x, y) is continuous for every y. Show that F (x) = f (x, y) dµ(y) (A.25) A is continuous if there is an integrable function g(y) such that |f (x, y)| ≤ g(y). Problem A.8. Let X ⊆ R, Y be some measure space, and f : X × Y → R. Suppose y → f (x, y) is measurable for all x and x → f (x, y) is differentiable for a.e. y. Show that F (x) = f (x, y) dµ(y) (A.26) A
  • 287. A.5. Product measures 275 ∂ is differentiable if there is an integrable function g(y) such that | ∂x f (x, y)| ≤ ∂ g(y). Moreover, x → ∂x f (x, y) is measurable and ∂ F (x) = f (x, y) dµ(y) (A.27) A ∂x in this case. A.5. Product measures Let µ1 and µ2 be two measures on Σ1 and Σ2 , respectively. Let Σ1 ⊗ Σ2 be the σ-algebra generated by rectangles of the form A1 × A2 . Example. Let B be the Borel sets in R. Then B2 = B ⊗ B are the Borel sets in R2 (since the rectangles are a basis for the product topology). Any set in Σ1 ⊗ Σ2 has the section property; that is, Lemma A.20. Suppose A ∈ Σ1 ⊗ Σ2 . Then its sections A1 (x2 ) = {x1 |(x1 , x2 ) ∈ A} and A2 (x1 ) = {x2 |(x1 , x2 ) ∈ A} (A.28) are measurable. Proof. Denote all sets A ∈ Σ1 ⊗ Σ2 with the property that A1 (x2 ) ∈ Σ1 by S. Clearly all rectangles are in S and it suffices to show that S is a σ-algebra. Now, if A ∈ S, then (A )1 (x2 ) = (A1 (x2 )) ∈ Σ2 and thus S is closed under complements. Similarly, if An ∈ S, then ( n An )1 (x2 ) = n (An )1 (x2 ) shows that S is closed under countable unions. This implies that if f is a measurable function on X1 ×X2 , then f (., x2 ) is measurable on X1 for every x2 and f (x1 , .) is measurable on X2 for every x1 (observe A1 (x2 ) = {x1 |f (x1 , x2 ) ∈ B}, where A = {(x1 , x2 )|f (x1 , x2 ) ∈ B}). Given two measures µ1 on Σ1 and µ2 on Σ2 , we now want to construct the product measure µ1 ⊗ µ2 on Σ1 ⊗ Σ2 such that µ1 ⊗ µ2 (A1 × A2 ) = µ1 (A1 )µ2 (A2 ), Aj ∈ Σj , j = 1, 2. (A.29) Theorem A.21. Let µ1 and µ2 be two σ-finite measures on Σ1 and Σ2 , respectively. Let A ∈ Σ1 ⊗ Σ2 . Then µ2 (A2 (x1 )) and µ1 (A1 (x2 )) are mea- surable and µ2 (A2 (x1 ))dµ1 (x1 ) = µ1 (A1 (x2 ))dµ2 (x2 ). (A.30) X1 X2 Proof. Let S be the set of all subsets for which our claim holds. Note that S contains at least all rectangles. It even contains the algebra of finite disjoint unions of rectangles. Thus it suffices to show that S is a monotone class. If µ1 and µ2 are finite, measurability and equality of both integrals follow from the monotone convergence theorem for increasing sequences of
  • 288. 276 A. Almost everything about Lebesgue integration sets and from the dominated convergence theorem for decreasing sequences of sets. If µ1 and µ2 are σ-finite, let Xi,j Xi with µi (Xi,j ) ∞ for i = 1, 2. Now µ2 ((A ∩ X1,j × X2,j )2 (x1 )) = µ2 (A2 (x1 ) ∩ X2,j )χX1,j (x1 ) and similarly with 1 and 2 exchanged. Hence by the finite case µ2 (A2 ∩ X2,j )χX1,j dµ1 = µ1 (A1 ∩ X1,j )χX2,j dµ2 (A.31) X1 X2 and the σ-finite case follows from the monotone convergence theorem. Hence we can define µ1 ⊗ µ2 (A) = µ2 (A2 (x1 ))dµ1 (x1 ) = µ1 (A1 (x2 ))dµ2 (x2 ) (A.32) X1 X2 or equivalently, since χA1 (x2 ) (x1 ) = χA2 (x1 ) (x2 ) = χA (x1 , x2 ), µ1 ⊗ µ2 (A) = χA (x1 , x2 )dµ2 (x2 ) dµ1 (x1 ) X1 X2 = χA (x1 , x2 )dµ1 (x1 ) dµ2 (x2 ). (A.33) X2 X1 Additivity of µ1 ⊗ µ2 follows from the monotone convergence theorem. Note that (A.29) uniquely defines µ1 ⊗ µ2 as a σ-finite premeasure on the algebra of finite disjoint unions of rectangles. Hence by Theorem A.5 it is the only measure on Σ1 ⊗ Σ2 satisfying (A.29). Finally we have Theorem A.22 (Fubini). Let f be a measurable function on X1 × X2 and let µ1 , µ2 be σ-finite measures on X1 , X2 , respectively. (i) If f ≥ 0, then f (., x2 )dµ2 (x2 ) and f (x1 , .)dµ1 (x1 ) are both measurable and f (x1 , x2 )dµ1 ⊗ µ2 (x1 , x2 ) = f (x1 , x2 )dµ1 (x1 ) dµ2 (x2 ) = f (x1 , x2 )dµ2 (x2 ) dµ1 (x1 ). (A.34) (ii) If f is complex, then |f (x1 , x2 )|dµ1 (x1 ) ∈ L1 (X2 , dµ2 ), (A.35) respectively, |f (x1 , x2 )|dµ2 (x2 ) ∈ L1 (X1 , dµ1 ), (A.36)
if and only if f ∈ L¹(X_1 × X_2, dµ_1 ⊗ dµ_2). In this case (A.34) holds.

Proof. By Theorem A.21 and linearity the claim holds for simple functions. To see (i), let s_n ↗ f be a sequence of nonnegative simple functions. Then it follows by applying the monotone convergence theorem (twice for the double integrals). For (ii) we can assume that f is real-valued by considering its real and imaginary parts separately. Moreover, splitting f = f⁺ − f⁻ into its positive and negative parts, the claim reduces to (i).

In particular, if f(x_1, x_2) is either nonnegative or integrable, then the order of integration can be interchanged.

Lemma A.23. If µ_1 and µ_2 are σ-finite regular Borel measures, then so is µ_1 ⊗ µ_2.

Proof. Regularity holds for every rectangle and hence also for the algebra of finite disjoint unions of rectangles. Thus the claim follows from Lemma A.6.

Note that we can iterate this procedure.

Lemma A.24. Suppose µ_j, j = 1, 2, 3, are σ-finite measures. Then

    (µ_1 ⊗ µ_2) ⊗ µ_3 = µ_1 ⊗ (µ_2 ⊗ µ_3).    (A.37)

Proof. First of all note that (Σ_1 ⊗ Σ_2) ⊗ Σ_3 = Σ_1 ⊗ (Σ_2 ⊗ Σ_3) is the σ-algebra generated by the rectangles A_1 × A_2 × A_3 in X_1 × X_2 × X_3. Moreover, since

    ((µ_1 ⊗ µ_2) ⊗ µ_3)(A_1 × A_2 × A_3) = µ_1(A_1) µ_2(A_2) µ_3(A_3) = (µ_1 ⊗ (µ_2 ⊗ µ_3))(A_1 × A_2 × A_3),

the two measures coincide on the algebra of finite disjoint unions of rectangles. Hence they coincide everywhere by Theorem A.5.

Example. If λ is Lebesgue measure on R, then λⁿ = λ ⊗ · · · ⊗ λ is Lebesgue measure on Rⁿ. Since λ is regular, so is λⁿ.

Problem A.9. Show that the set of all finite unions of rectangles A_1 × A_2 forms an algebra.

Problem A.10. Let U ⊆ C be a domain, Y be some measure space, and f : U × Y → C. Suppose y → f(z, y) is measurable for every z and z → f(z, y) is holomorphic for every y. Show that

    F(z) = ∫_A f(z, y) dµ(y)    (A.38)

is holomorphic if for every compact subset V ⊂ U there is an integrable function g(y) such that |f(z, y)| ≤ g(y), z ∈ V. (Hint: Use Fubini and Morera.)
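In the simplest situation, finite measures on finite sets, every integral in (A.34) is a weighted sum and the statement of Fubini's theorem can be checked directly. The following Python sketch is only an editorial illustration with arbitrarily chosen weights and a random nonnegative function; it is not taken from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
mu1 = rng.random(5)            # point masses of mu_1 on a 5-point space X_1
mu2 = rng.random(7)            # point masses of mu_2 on a 7-point space X_2
f = rng.random((5, 7))         # a nonnegative function on X_1 x X_2

# integral of f with respect to the product measure mu_1 (x) mu_2
product = np.einsum('i,j,ij->', mu1, mu2, f)
# the two iterated integrals of (A.34)
iterated_12 = np.sum(mu1 * (f @ mu2))    # integrate dmu_2 first, then dmu_1
iterated_21 = np.sum(mu2 * (f.T @ mu1))  # integrate dmu_1 first, then dmu_2

print(product, iterated_12, iterated_21)
assert np.allclose(product, iterated_12) and np.allclose(product, iterated_21)
```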
A.6. Vague convergence of measures

Let µ_n be a sequence of Borel measures. We will say that µ_n converges to µ vaguely if

    ∫_X f dµ_n → ∫_X f dµ    (A.39)

for every f ∈ C_c(X).

We are only interested in the case of Borel measures on R. In this case we have the following equivalent characterization of vague convergence.

Lemma A.25. Let µ_n be a sequence of Borel measures on R. Then µ_n → µ vaguely if and only if the (normalized) distribution functions converge at every point of continuity of µ.

Proof. Suppose µ_n → µ vaguely. Let I be any bounded interval (closed, half closed, or open) with boundary points x_0, x_1. Moreover, choose continuous functions f, g with compact support such that f ≤ χ_I ≤ g. Then we have ∫f dµ ≤ µ(I) ≤ ∫g dµ and similarly for µ_n. Hence

    µ(I) − µ_n(I) ≤ ∫g dµ − ∫f dµ_n ≤ ∫(g − f) dµ + |∫f dµ − ∫f dµ_n|

and

    µ(I) − µ_n(I) ≥ ∫f dµ − ∫g dµ_n ≥ −∫(g − f) dµ − |∫g dµ − ∫g dµ_n|.

Combining both estimates, we see

    |µ(I) − µ_n(I)| ≤ ∫(g − f) dµ + |∫f dµ − ∫f dµ_n| + |∫g dµ − ∫g dµ_n|

and so

    lim sup_{n→∞} |µ(I) − µ_n(I)| ≤ ∫(g − f) dµ.

Choosing f, g such that g − f → χ_{x_0} + χ_{x_1} pointwise, we even get from dominated convergence that

    lim sup_{n→∞} |µ(I) − µ_n(I)| ≤ µ({x_0}) + µ({x_1}),

which proves that the distribution functions converge at every point of continuity of µ.

Conversely, suppose that the distribution functions converge at every point of continuity of µ. To see that in fact µ_n → µ vaguely, let f ∈ C_c(R). Fix some ε > 0 and note that, since f is uniformly continuous, there is a
δ > 0 such that |f(x) − f(y)| ≤ ε whenever |x − y| ≤ δ. Next, choose some points x_0 < x_1 < · · · < x_k such that supp(f) ⊂ (x_0, x_k), µ is continuous at x_j, and x_j − x_{j−1} ≤ δ (recall that a monotone function has at most countably many discontinuities). Furthermore, there is some N such that |µ_n(x_j) − µ(x_j)| ≤ ε/(2k) for all j and n ≥ N. Then

    |∫f dµ_n − ∫f dµ| ≤ Σ_{j=1}^k ∫_{(x_{j−1}, x_j]} |f(x) − f(x_j)| dµ_n(x)
        + Σ_{j=1}^k |f(x_j)| |µ((x_{j−1}, x_j]) − µ_n((x_{j−1}, x_j])|
        + Σ_{j=1}^k ∫_{(x_{j−1}, x_j]} |f(x) − f(x_j)| dµ(x).

Now, for n ≥ N, the first and the last term on the right-hand side are both bounded by (µ((x_0, x_k]) + ε/k)ε and the middle term is bounded by max|f| ε. Thus the claim follows.

Moreover, every bounded sequence of measures has a vaguely convergent subsequence.

Lemma A.26. Suppose µ_n is a sequence of finite Borel measures on R such that µ_n(R) ≤ M. Then there exists a subsequence which converges vaguely to some measure µ with µ(R) ≤ M.

Proof. Let µ_n(x) = µ_n((−∞, x]) be the corresponding distribution functions. By 0 ≤ µ_n(x) ≤ M there is a convergent subsequence for fixed x. Moreover, by the standard diagonal series trick, we can assume that µ_n(x) converges to some number µ(x) for each rational x. For irrational x we set µ(x) = inf_{x_0 > x} {µ(x_0) | x_0 rational}. Then µ(x) is monotone, 0 ≤ µ(x_1) ≤ µ(x_2) ≤ M for x_1 ≤ x_2. Furthermore,

    µ(x−) ≤ lim inf µ_n(x) ≤ lim sup µ_n(x) ≤ µ(x)

shows that µ_n(x) → µ(x) at every point of continuity of µ. So we can redefine µ to be right continuous without changing this last fact.

In the case where the sequence is bounded, (A.39) even holds for a larger class of functions.

Lemma A.27. Suppose µ_n → µ vaguely and µ_n(R) ≤ M. Then (A.39) holds for any f ∈ C_∞(R).

Proof. Split f = f_1 + f_2, where f_1 has compact support and |f_2| ≤ ε. Then

    |∫f dµ − ∫f dµ_n| ≤ |∫f_1 dµ − ∫f_1 dµ_n| + 2εM

and the claim follows.

Example. The example dµ_n(λ) = dΘ(λ − n) shows that in the above claim f cannot be replaced by a bounded continuous function. Moreover, the example dµ_n(λ) = n dΘ(λ − n) also shows that the uniform bound cannot be dropped.
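Numerically, the first example is easy to see: µ_n places unit mass at the point n, so ∫f dµ_n = f(n) tends to 0 for every f ∈ C_c(R) (the vague limit is the zero measure), while for the bounded continuous function f ≡ 1 the integrals stay equal to 1. The short Python sketch below, with ad hoc test functions chosen only for illustration, makes this explicit.

```python
import numpy as np

def integrate_against_mu_n(f, n):
    # mu_n is the unit point mass at n, i.e. d(Theta(. - n)),
    # so integration against mu_n is just evaluation at n
    return f(n)

f_compact = lambda x: np.maximum(1.0 - np.abs(x), 0.0)  # in C_c(R), supported in [-1, 1]
f_bounded = lambda x: 1.0                                # bounded and continuous, but not compactly supported

for n in [1, 5, 25, 125]:
    print(n, integrate_against_mu_n(f_compact, n), integrate_against_mu_n(f_bounded, n))
# the first column of integrals is 0 (the mass escapes the support of f_compact),
# while the second stays equal to 1, so (A.39) fails for this bounded f
```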
A.7. Decomposition of measures

Let µ, ν be two measures on a measure space (X, Σ). They are called mutually singular (in symbols µ ⊥ ν) if they are supported on disjoint sets. That is, there is a measurable set N such that µ(N) = 0 and ν(X∖N) = 0.

Example. Let λ be the Lebesgue measure and Θ the Dirac measure (centered at 0). Then λ ⊥ Θ: Just take N = {0}; then λ({0}) = 0 and Θ(R∖{0}) = 0.

On the other hand, ν is called absolutely continuous with respect to µ (in symbols ν ≪ µ) if µ(A) = 0 implies ν(A) = 0.

Example. The prototypical example is the measure dν = f dµ (compare Lemma A.15). Indeed µ(A) = 0 implies

    ν(A) = ∫_A f dµ = 0    (A.40)

and shows that ν is absolutely continuous with respect to µ. In fact, we will show below that every absolutely continuous measure is of this form.

The two main results will follow as simple consequences of the following result:

Theorem A.28. Let µ, ν be σ-finite measures. Then there exists a unique (a.e.) nonnegative function f and a set N of µ measure zero, such that

    ν(A) = ν(A ∩ N) + ∫_A f dµ.    (A.41)

Proof. We first assume µ, ν to be finite measures. Let α = µ + ν and consider the Hilbert space L²(X, dα). Then

    ℓ(h) = ∫_X h dν

is a bounded linear functional by Cauchy–Schwarz:

    |ℓ(h)|² = |∫_X 1 · h dν|² ≤ ( ∫ |1|² dν ) ( ∫ |h|² dν )
            ≤ ν(X) ∫ |h|² dα = ν(X) ‖h‖².
Hence by the Riesz lemma (Theorem 1.8) there exists a g ∈ L²(X, dα) such that

    ℓ(h) = ∫_X h g dα.

By construction

    ν(A) = ∫ χ_A dν = ∫ χ_A g dα = ∫_A g dα.    (A.42)

In particular, g must be positive a.e. (take A the set where g is negative). Furthermore, let N = {x | g(x) ≥ 1}. Then

    ν(N) = ∫_N g dα ≥ α(N) = µ(N) + ν(N),

which shows µ(N) = 0. Now set

    f = (g/(1 − g)) χ_{N′},    N′ = X∖N.

Then, since (A.42) implies dν = g dα, respectively dµ = (1 − g) dα, we have

    ∫_A f dµ = ∫ χ_A (g/(1 − g)) χ_{N′} dµ = ∫ χ_{A∩N′} g dα = ν(A ∩ N′)

as desired. Clearly f is unique, since if there is a second function f̃, then ∫_A (f − f̃) dµ = 0 for every A shows f − f̃ = 0 a.e.

To see the σ-finite case, observe that X_n ↗ X, µ(X_n) < ∞ and Y_n ↗ X, ν(Y_n) < ∞ implies X_n ∩ Y_n ↗ X and α(X_n ∩ Y_n) < ∞. Hence when restricted to X_n ∩ Y_n, we have sets N_n and functions f_n. Now take N = ⋃ N_n and choose f such that f|_{X_n} = f_n (this is possible since f_{n+1}|_{X_n} = f_n a.e.). Then µ(N) = 0 and

    ν(A ∩ N′) = lim_{n→∞} ν(A ∩ (X_n ∖ N)) = lim_{n→∞} ∫_{A ∩ X_n} f dµ = ∫_A f dµ,

which finishes the proof.

Now the anticipated results follow with no effort:

Theorem A.29 (Lebesgue decomposition). Let µ, ν be two σ-finite measures on a measure space (X, Σ). Then ν can be uniquely decomposed as ν = ν_ac + ν_sing, where ν_ac and ν_sing are mutually singular and ν_ac is absolutely continuous with respect to µ.

Proof. Taking ν_sing(A) = ν(A ∩ N) and dν_ac = f dµ, there is at least one such decomposition. To show uniqueness, first let ν be finite. If there is another one, ν = ν̃_ac + ν̃_sing, then let Ñ be such that µ(Ñ) = 0 and ν̃_sing(X∖Ñ) = 0. Then ν_sing(A) − ν̃_sing(A) = ∫_A (f̃ − f) dµ. In particular, ∫_{A∩N′∩Ñ′} (f̃ − f) dµ = 0 and hence f = f̃ a.e. away from N ∪ Ñ. Since
µ(N ∪ Ñ) = 0, we have f = f̃ a.e. and hence ν_ac = ν̃_ac as well as ν_sing = ν − ν_ac = ν − ν̃_ac = ν̃_sing. The σ-finite case follows as usual.

Theorem A.30 (Radon–Nikodym). Let µ, ν be two σ-finite measures on a measure space (X, Σ). Then ν is absolutely continuous with respect to µ if and only if there is a positive measurable function f such that

    ν(A) = ∫_A f dµ    (A.43)

for every A ∈ Σ. The function f is determined uniquely a.e. with respect to µ and is called the Radon–Nikodym derivative dν/dµ of ν with respect to µ.

Proof. Just observe that in this case ν(A ∩ N) = 0 for every A; that is, ν_sing = 0.

Problem A.11. Let µ be a Borel measure on B and suppose its distribution function µ(x) is differentiable. Show that the Radon–Nikodym derivative equals the ordinary derivative µ′(x).

Problem A.12. Suppose µ and ν are inner regular measures. Show that ν ≪ µ if and only if µ(C) = 0 implies ν(C) = 0 for every compact set.

Problem A.13. Let dν = f dµ. Suppose f > 0 a.e. with respect to µ. Then µ ≪ ν and dµ = f⁻¹ dν.

Problem A.14 (Chain rule). Show that ν ≪ µ is a transitive relation. In particular, if ω ≪ ν ≪ µ, show that

    dω/dµ = (dω/dν)(dν/dµ).

Problem A.15. Suppose ν ≪ µ. Show that for any measure ω we have

    (dω/dµ) dµ = (dω/dν) dν + dζ,

where ζ is a positive measure (depending on ω) which is singular with respect to ν. Show that ζ = 0 if and only if µ ≪ ν.
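On a finite set every measure is just a vector of point masses, and both the Lebesgue decomposition and the Radon–Nikodym derivative can be written down explicitly: the singular part of ν lives on {µ = 0} and f = dν_ac/dµ is a pointwise quotient. The following Python sketch, with arbitrary illustrative weights, verifies (A.41) for a few sets.

```python
import numpy as np

mu = np.array([0.5, 0.0, 1.5, 0.0, 2.0])   # reference measure
nu = np.array([1.0, 0.3, 0.0, 0.7, 4.0])   # measure to be decomposed

N = (mu == 0)                              # a set of mu-measure zero
nu_sing = np.where(N, nu, 0.0)             # nu restricted to N: the singular part
nu_ac = nu - nu_sing                       # lives where mu > 0: the a.c. part

# Radon-Nikodym derivative f = d(nu_ac)/d(mu), defined mu-a.e. (set to 0 on N)
f = np.where(N, 0.0, nu_ac / np.where(N, 1.0, mu))

# check nu(A) = nu(A intersect N) + integral over A of f dmu
for A in [np.array([1, 1, 0, 0, 1], bool), np.array([0, 1, 1, 1, 0], bool)]:
    lhs = nu[A].sum()
    rhs = nu[A & N].sum() + (f * mu)[A].sum()
    print(lhs, rhs)
    assert np.isclose(lhs, rhs)
```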
A.8. Derivatives of measures

If µ is a Borel measure on B and its distribution function µ(x) is differentiable, then the Radon–Nikodym derivative is just the ordinary derivative µ′(x) (Problem A.11). Our aim in this section is to generalize this result to arbitrary regular Borel measures on Bⁿ.

We call

    (Dµ)(x) = lim_{ε↓0} µ(B_ε(x)) / |B_ε(x)|    (A.44)

the derivative of µ at x ∈ Rⁿ provided the above limit exists. (Here B_r(x) ⊂ Rⁿ is a ball of radius r centered at x ∈ Rⁿ and |A| denotes the Lebesgue measure of A ∈ Bⁿ.)

Note that for a Borel measure on B, (Dµ)(x) exists if and only if µ(x) (as defined in (A.3)) is differentiable at x and (Dµ)(x) = µ′(x) in this case.

To compute the derivative of µ, we introduce the upper and lower derivative,

    (D̄µ)(x) = lim sup_{ε↓0} µ(B_ε(x)) / |B_ε(x)|   and   (D̲µ)(x) = lim inf_{ε↓0} µ(B_ε(x)) / |B_ε(x)|.    (A.45)

Clearly µ is differentiable at x if (D̄µ)(x) = (D̲µ)(x) < ∞. First of all note that they are measurable:

Lemma A.31. The upper derivative is lower semicontinuous; that is, the set {x | (D̄µ)(x) > α} is open for every α ∈ R. Similarly, the lower derivative is upper semicontinuous; that is, {x | (D̲µ)(x) < α} is open.

Proof. We only prove the claim for D̄µ, the case D̲µ being similar. Abbreviate

    M_r(x) = sup_{0<ε<r} µ(B_ε(x)) / |B_ε(x)|

and note that it suffices to show that O_r = {x | M_r(x) > α} is open. If x ∈ O_r, there is some ε < r such that

    µ(B_ε(x)) / |B_ε(x)| > α.

Let δ > 0 and y ∈ B_δ(x). Then B_ε(x) ⊆ B_{ε+δ}(y) implying

    µ(B_{ε+δ}(y)) / |B_{ε+δ}(y)| ≥ (ε/(ε+δ))ⁿ µ(B_ε(x)) / |B_ε(x)| > α

for δ sufficiently small. That is, B_δ(x) ⊆ O_r.

In particular, both the upper and lower derivatives are measurable. Next, the following geometric fact about Rⁿ will be needed.

Lemma A.32. Given open balls B_1, . . . , B_m in Rⁿ, there is a subset of disjoint balls B_{j_1}, . . . , B_{j_k} such that

    |⋃_{i=1}^m B_i| ≤ 3ⁿ Σ_{i=1}^k |B_{j_i}|.    (A.46)

Proof. Assume that the balls B_j are ordered by decreasing radius. Start with B_{j_1} = B_1 = B_{r_1}(x_1) and remove all balls from our list which intersect B_{j_1}. Observe that the removed balls are all contained in 3B_1 = B_{3r_1}(x_1). Proceeding like this, we obtain B_{j_1}, . . . , B_{j_k} such that

    ⋃_{i=1}^m B_i ⊆ ⋃_{i=1}^k 3B_{j_i}

and the claim follows since |3B| = 3ⁿ|B|.
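The selection procedure in this proof is a simple greedy algorithm and can be carried out explicitly. The Python sketch below, for a random family of balls in R² chosen purely for illustration, performs the selection and verifies that every original ball lies inside the threefold enlargement of some selected ball of at least its radius.

```python
import numpy as np

rng = np.random.default_rng(1)
centers = rng.uniform(0, 10, size=(30, 2))
radii = rng.uniform(0.2, 1.5, size=30)

order = np.argsort(-radii)                    # process balls by decreasing radius
selected = []
for i in order:
    # keep ball i only if it is disjoint from all previously selected balls
    if all(np.linalg.norm(centers[i] - centers[j]) > radii[i] + radii[j] for j in selected):
        selected.append(i)

# every ball B_r(x) intersects some selected ball B_R(y) with R >= r,
# hence B_r(x) is contained in B_{3R}(y)
for i in range(len(centers)):
    hits = [j for j in selected
            if np.linalg.norm(centers[i] - centers[j]) <= radii[i] + radii[j]]
    j = max(hits, key=lambda k: radii[k])     # the largest selected ball meeting ball i
    assert radii[j] >= radii[i] - 1e-12
    assert np.linalg.norm(centers[i] - centers[j]) + radii[i] <= 3 * radii[j] + 1e-12
print(f"selected {len(selected)} disjoint balls out of {len(centers)}")
```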
Now we can show

Lemma A.33. Let α > 0. For any Borel set A we have

    |{x ∈ A | (D̄µ)(x) > α}| ≤ 3ⁿ µ(A)/α    (A.47)

and

    |{x ∈ A | (D̄µ)(x) > 0}| = 0, whenever µ(A) = 0.    (A.48)

Proof. Let A_α = {x ∈ A | (D̄µ)(x) > α}. We will show

    |K| ≤ 3ⁿ µ(O)/α

for any compact set K and open set O with K ⊆ A_α ⊆ O. The first claim then follows from regularity of µ and the Lebesgue measure. Given fixed K, O, for every x ∈ K there is some r_x such that B_{r_x}(x) ⊆ O and |B_{r_x}(x)| < α⁻¹ µ(B_{r_x}(x)). Since K is compact, we can choose a finite subcover of K. Moreover, by Lemma A.32 we can refine our set of balls such that

    |K| ≤ 3ⁿ Σ_{i=1}^k |B_{r_i}(x_i)| < (3ⁿ/α) Σ_{i=1}^k µ(B_{r_i}(x_i)) ≤ 3ⁿ µ(O)/α.

To see the second claim, observe that

    {x ∈ A | (D̄µ)(x) > 0} = ⋃_{j=1}^∞ {x ∈ A | (D̄µ)(x) > 1/j}

and by the first part |{x ∈ A | (D̄µ)(x) > 1/j}| = 0 for any j if µ(A) = 0.

Theorem A.34 (Lebesgue). Let f be (locally) integrable. Then for a.e. x ∈ Rⁿ we have

    lim_{r↓0} (1/|B_r(x)|) ∫_{B_r(x)} |f(y) − f(x)| dy = 0.    (A.49)

Proof. Decompose f as f = g + h, where g is continuous and ‖h‖₁ < ε (Theorem 0.34) and abbreviate

    D_r(f)(x) = (1/|B_r(x)|) ∫_{B_r(x)} |f(y) − f(x)| dy.
Then, since lim_{r↓0} D_r(g)(x) = 0 (for every x) and D_r(f) ≤ D_r(g) + D_r(h), we have

    lim sup_{r↓0} D_r(f)(x) ≤ lim sup_{r↓0} D_r(h)(x) ≤ (D̄µ)(x) + |h(x)|,

where dµ = |h| dx. This implies

    {x | lim sup_{r↓0} D_r(f)(x) ≥ 2α} ⊆ {x | (D̄µ)(x) ≥ α} ∪ {x | |h(x)| ≥ α}

and using the first part of Lemma A.33 plus |{x | |h(x)| ≥ α}| ≤ α⁻¹ ‖h‖₁, we see

    |{x | lim sup_{r↓0} D_r(f)(x) ≥ 2α}| ≤ (3ⁿ + 1) ε/α.

Since ε is arbitrary, the Lebesgue measure of this set must be zero for every α. That is, the set where the lim sup is positive has Lebesgue measure zero.

The points where (A.49) holds are called Lebesgue points of f.

Note that the balls can be replaced by more general sets: A sequence of sets A_j(x) is said to shrink to x nicely if there are balls B_{r_j}(x) with r_j → 0 and a constant ε > 0 such that A_j(x) ⊆ B_{r_j}(x) and |A_j(x)| ≥ ε|B_{r_j}(x)|. For example, A_j(x) could be some balls or cubes (not necessarily containing x). However, the portion of B_{r_j}(x) which they occupy must not go to zero! For example, the rectangles (0, 1/j) × (0, 2/j) ⊂ R² do shrink nicely to 0, but the rectangles (0, 1/j) × (0, 2/j²) do not.

Lemma A.35. Let f be (locally) integrable. Then at every Lebesgue point we have

    f(x) = lim_{j→∞} (1/|A_j(x)|) ∫_{A_j(x)} f(y) dy    (A.50)

whenever A_j(x) shrinks to x nicely.

Proof. Let x be a Lebesgue point and choose some nicely shrinking sets A_j(x) with corresponding B_{r_j}(x) and ε. Then

    (1/|A_j(x)|) ∫_{A_j(x)} |f(y) − f(x)| dy ≤ (1/(ε|B_{r_j}(x)|)) ∫_{B_{r_j}(x)} |f(y) − f(x)| dy

and the claim follows.

Corollary A.36. Suppose µ is an absolutely continuous Borel measure on R. Then its distribution function is differentiable a.e. and dµ(x) = µ′(x) dx.
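Theorem A.34 is easy to probe numerically in one dimension: for a step function, averages over shrinking balls converge to f(x) away from the jump, while at the jump they converge to the mean of the one-sided limits, so the jump point is not a Lebesgue point. The following Python sketch uses a sampling-based average as a stand-in for the integral; the particular step function and radii are arbitrary illustrative choices.

```python
import numpy as np

f = lambda y: (y >= 0).astype(float)       # indicator of [0, infinity)

def ball_average(f, x, r, n=100_000):
    # approximate (1/|B_r(x)|) * integral of f over B_r(x) by a uniform sample
    y = np.linspace(x - r, x + r, n)
    return f(y).mean()

for x in [0.3, 0.0]:
    print(x, [round(ball_average(f, x, r), 4) for r in [1.0, 0.5, 0.1, 0.01]])
# at x = 0.3 the averages tend to f(0.3) = 1 as r decreases;
# at the jump x = 0 they stay near 1/2, so 0 is not a Lebesgue point of f
```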
As another consequence we obtain

Theorem A.37. Let µ be a Borel measure on Rⁿ. The derivative Dµ exists a.e. with respect to Lebesgue measure and equals the Radon–Nikodym derivative of the absolutely continuous part of µ with respect to Lebesgue measure; that is,

    µ_ac(A) = ∫_A (Dµ)(x) dx.    (A.51)

Proof. If dµ = f dx is absolutely continuous with respect to Lebesgue measure, the claim follows from Theorem A.34. To see the general case, use the Lebesgue decomposition of µ and let N be a support for the singular part with |N| = 0. Then (Dµ_sing)(x) = 0 for a.e. x ∈ Rⁿ ∖ N by the second part of Lemma A.33.

In particular, µ is singular with respect to Lebesgue measure if and only if Dµ = 0 a.e. with respect to Lebesgue measure.

Using the upper and lower derivatives, we can also give supports for the absolutely and singularly continuous parts.

Theorem A.38. The set {x | (Dµ)(x) = ∞} is a support for the singular part and {x | 0 < (Dµ)(x) < ∞} is a support for the absolutely continuous part.

Proof. First suppose µ is purely singular. Let us show that the set O_k = {x | (D̄µ)(x) < k} satisfies µ(O_k) = 0 for every k ∈ N.

Let K ⊂ O_k be compact, and let V_j ⊃ K be some open set such that |V_j ∖ K| ≤ 1/j. For every x ∈ K there is some ε = ε(x) such that B_ε(x) ⊆ V_j and µ(B_ε(x)) ≤ k|B_ε(x)|. By compactness, finitely many of these balls cover K and hence

    µ(K) ≤ Σ_i µ(B_{ε_i}(x_i)) ≤ k Σ_i |B_{ε_i}(x_i)|.

Selecting disjoint balls as in Lemma A.32 further shows

    µ(K) ≤ k 3ⁿ Σ_i |B_{ε_i}(x_i)| ≤ k 3ⁿ |V_j|.

Letting j → ∞, we see µ(K) ≤ k 3ⁿ |K| and by regularity we even have µ(A) ≤ k 3ⁿ |A| for every A ⊆ O_k. Hence µ is absolutely continuous on O_k and since we assumed µ to be singular, we must have µ(O_k) = 0.

Thus (Dµ_sing)(x) = ∞ for a.e. x with respect to µ_sing and we are done.

Finally, we note that these supports are minimal. Here a support M of some measure µ is called a minimal support (it is sometimes also called an essential support) if any subset M_0 ⊆ M which does not support µ (i.e., µ(M_0) = 0) has Lebesgue measure zero.
Lemma A.39. The set M_ac = {x | 0 < (Dµ)(x) < ∞} is a minimal support for µ_ac.

Proof. Suppose M_0 ⊆ M_ac and µ_ac(M_0) = 0. Set M_ε = {x ∈ M_0 | ε < (Dµ)(x)} for ε > 0. Then M_ε ↗ M_0 as ε ↓ 0 and

    |M_ε| = ∫_{M_ε} dx ≤ (1/ε) ∫_{M_ε} (Dµ)(x) dx = (1/ε) µ_ac(M_ε) ≤ (1/ε) µ_ac(M_0) = 0

shows |M_0| = lim_{ε↓0} |M_ε| = 0.

Note that the set M = {x | 0 < (Dµ)(x)} is a minimal support of µ.

Example. The Cantor function is constructed as follows: Take the sets C_n used in the construction of the Cantor set C: C_n is the union of 2ⁿ closed intervals with 2ⁿ − 1 open gaps in between. Set f_n equal to j/2ⁿ on the j-th gap of C_n and extend it to [0, 1] by linear interpolation. Note that, since we are creating precisely one new gap between every old gap when going from C_n to C_{n+1}, the value of f_{n+1} is the same as the value of f_n on the gaps of C_n. In particular, ‖f_n − f_m‖_∞ ≤ 2^{−min(n,m)} and hence we can define the Cantor function as f = lim_{n→∞} f_n. By construction f is a continuous function which is constant on every subinterval of [0, 1] ∖ C. Since C is of Lebesgue measure zero, this set is of full Lebesgue measure and hence f′ = 0 a.e. in [0, 1]. In particular, the corresponding measure, the Cantor measure, is supported on C and is purely singular with respect to Lebesgue measure.

Problem A.16. Show that M = {x | 0 < (Dµ)(x)} is a minimal support of µ.
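The approximants f_n of this example can be generated by the equivalent recursion f_{n+1}(x) = f_n(3x)/2 on [0, 1/3], f_{n+1}(x) = 1/2 on [1/3, 2/3], and f_{n+1}(x) = 1/2 + f_n(3x − 2)/2 on [2/3, 1], starting from f_0(x) = x. The Python sketch below is only an illustration of this recursion (the starting function f_0(x) = x is chosen for convenience); it computes the approximants and checks the uniform Cauchy estimate numerically.

```python
import numpy as np

def cantor(x, n):
    """n-th approximant f_n of the Cantor function, evaluated at an array x in [0, 1]."""
    x = np.asarray(x, dtype=float)
    if n == 0:
        return x
    y = np.empty_like(x)
    left, mid, right = x < 1/3, (x >= 1/3) & (x <= 2/3), x > 2/3
    y[left] = cantor(3 * x[left], n - 1) / 2        # rescale the left third
    y[mid] = 0.5                                    # constant on the new gap
    y[right] = 0.5 + cantor(3 * x[right] - 2, n - 1) / 2
    return y

x = np.linspace(0, 1, 2001)
for n in range(1, 8):
    # the approximants form a uniform Cauchy sequence, |f_{n+1} - f_n| <= 2^{-n}
    print(n, np.max(np.abs(cantor(x, n + 1) - cantor(x, n))))
```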
Bibliographical notes

The aim of this section is not to give a comprehensive guide to the literature, but to document the sources from which I have learned the materials and which I have used during the preparation of this text. In addition, I will point out some standard references for further reading. In some sense all books on this topic are inspired by von Neumann's celebrated monograph [64] and the present text is no exception.

General references for the first part are Akhiezer and Glazman [2], Berthier (Boutet de Monvel) [9], Blank, Exner, and Havlíček [10], Edmunds and Evans [16], Lax [25], Reed and Simon [40], Weidmann [60], [62], or Yosida [66].

Chapter 0: A first look at Banach and Hilbert spaces
As a reference for general background I can warmly recommend Kelly's classical book [26]. The rest is standard material and can be found in any book on functional analysis.

Chapter 1: Hilbert spaces
The material in this chapter is again classical and can be found in any book on functional analysis. I mainly follow Reed and Simon [40], respectively Weidmann [60], with the main difference being that I use orthonormal sets and their projections as the central theme from which everything else is derived. For an alternate problem-based approach see Halmos' book [22].

Chapter 2: Self-adjointness and spectrum
This chapter is still similar in spirit to [40], [60] with some ideas taken from Schechter [48].
Chapter 3: The spectral theorem
The approach via the Herglotz representation theorem follows Weidmann [60]. However, I use projection-valued measures as in Reed and Simon [40] rather than the resolution of the identity. Moreover, I have augmented the discussion by adding material on spectral types and the connections with the boundary values of the resolvent. For a survey containing several recent results see [28].

Chapter 4: Applications of the spectral theorem
This chapter collects several applications from various sources which I have found useful or which are needed later on. Again Reed and Simon [40] and Weidmann [60], [63] are the main references here.

Chapter 5: Quantum dynamics
The material is a synthesis of the lecture notes by Enß [18], Reed and Simon [40], [42], and Weidmann [63].

Chapter 6: Perturbation theory for self-adjoint operators
This chapter is similar to [60] (which contains more results) with the main difference that I have added some material on quadratic forms. In particular, the section on quadratic forms contains, in addition to the classical results, some material which I consider useful but was unable to find (at least not in the present form) in the literature. The prime reference here is Kato's monumental treatise [24] and Simon's book [49]. For further information on trace class operators see Simon's classic [52]. The idea to extend the usual notion of strong resolvent convergence by allowing the approximating operators to live on subspaces is taken from Weidmann [62].

Chapter 7: The free Schrödinger operator
Most of the material is classical. Much more on the Fourier transform can be found in Reed and Simon [41].

Chapter 8: Algebraic methods
This chapter collects some material which can be found in almost any physics textbook on quantum mechanics. My only contribution is to provide some mathematical details. I recommend the classical book by Thirring [58] and the visual guides by Thaller [56], [57].

Chapter 9: One-dimensional Schrödinger operators
One-dimensional models have always played a central role in understanding quantum mechanical phenomena. In particular, general wisdom used to say that Schrödinger operators should have absolutely continuous spectrum plus some discrete point spectrum, while singular continuous spectrum is a pathology that should not occur in examples with bounded V [14, Sect. 10.4].
In fact, a large part of [43] is devoted to establishing the absence of singular continuous spectrum. This was proven wrong by Pearson, who constructed an explicit one-dimensional example with singular continuous spectrum. Moreover, after the appearance of random models, it became clear that such kinds of exotic spectra (singular continuous or dense pure point) are frequently generic. The starting point is often the boundary behaviour of the Weyl m-function and its connection with the growth properties of solutions of the underlying differential equation, the latter being known as Gilbert and Pearson or subordinacy theory. One of my main goals is to give a modern introduction to this theory. The section on inverse spectral theory presents a simple proof for the Borg–Marchenko theorem (in the local version of Simon) from Bennewitz [8]. Again this result is the starting point of almost all other inverse spectral results for Sturm–Liouville equations and should enable the reader to start reading research papers in this area.

Other references with further information are the lecture notes by Weidmann [61] or the classical books by Coddington and Levinson [13], Levitan [29], Levitan and Sargsjan [30], [31], Marchenko [33], Naimark [34], Pearson [37]. See also the recent monographs by Rofe-Beketov and Kholkin [46], Zettl [67] or the recent collection of historic and survey articles [4]. For a nice introduction to random models I can recommend the recent notes by Kirsch [27] or the classical monographs by Carmona and Lacroix [11] or Pastur and Figotin [36]. For the discrete analog of Sturm–Liouville operators, Jacobi operators, see my monograph [54].

Chapter 10: One-particle Schrödinger operators
The presentation in the first two sections is influenced by Enß [18] and Thirring [58]. The solution of the Schrödinger equation in spherical coordinates can be found in any textbook on quantum mechanics. Again I tried to provide some missing mathematical details. Several other explicitly solvable examples can be found in the books by Albeverio et al. [3] or Flügge [19]. For the formulation of quantum mechanics via path integrals I suggest Roepstorff [45] or Simon [50].

Chapter 11: Atomic Schrödinger operators
This chapter essentially follows Cycon, Froese, Kirsch, and Simon [14]. For a recent review see Simon [51].

Chapter 12: Scattering theory
This chapter follows the lecture notes by Enß [18] (see also [17]) using some material from Perry [38]. Further information on mathematical scattering theory can be found in Amrein, Jauch, and Sinha [5], Baumgaertel and Wollenberg [6], Chadan and Sabatier [12], Cycon, Froese, Kirsch, and Simon [14], Newton [35], Pearson [37], Reed and Simon [42], or Yafaev [65].
Appendix A: Almost everything about Lebesgue integration
Most parts follow Rudin's book [47], respectively Bauer [7], with some ideas also taken from Weidmann [60]. I have tried to strip everything down to the results needed here while staying self-contained. Another useful reference is the book by Lieb and Loss [32].
Bibliography

[1] M. Abramovitz and I. A. Stegun, Handbook of Mathematical Functions, Dover, New York, 1972.
[2] N. I. Akhiezer and I. M. Glazman, Theory of Linear Operators in Hilbert Space, Vols. I and II, Pitman, Boston, 1981.
[3] S. Albeverio, F. Gesztesy, R. Høegh-Krohn, and H. Holden, Solvable Models in Quantum Mechanics, 2nd ed., American Mathematical Society, Providence, 2005.
[4] W. O. Amrein, A. M. Hinz, and D. B. Pearson, Sturm–Liouville Theory: Past and Present, Birkhäuser, Basel, 2005.
[5] W. O. Amrein, J. M. Jauch, and K. B. Sinha, Scattering Theory in Quantum Mechanics, W. A. Benjamin Inc., New York, 1977.
[6] H. Baumgaertel and M. Wollenberg, Mathematical Scattering Theory, Birkhäuser, Basel, 1983.
[7] H. Bauer, Measure and Integration Theory, de Gruyter, Berlin, 2001.
[8] C. Bennewitz, A proof of the local Borg–Marchenko theorem, Commun. Math. Phys. 218, 131–132 (2001).
[9] A. M. Berthier, Spectral Theory and Wave Operators for the Schrödinger Equation, Pitman, Boston, 1982.
[10] J. Blank, P. Exner, and M. Havlíček, Hilbert-Space Operators in Quantum Physics, 2nd ed., Springer, Dordrecht, 2008.
[11] R. Carmona and J. Lacroix, Spectral Theory of Random Schrödinger Operators, Birkhäuser, Boston, 1990.
[12] K. Chadan and P. C. Sabatier, Inverse Problems in Quantum Scattering Theory, Springer, New York, 1989.
[13] E. A. Coddington and N. Levinson, Theory of Ordinary Differential Equations, Krieger, Malabar, 1985.
[14] H. L. Cycon, R. G. Froese, W. Kirsch, and B. Simon, Schrödinger Operators, 2nd printing, Springer, Berlin, 2008.
[15] M. Demuth and M. Krishna, Determining Spectra in Quantum Theory, Birkhäuser, Boston, 2005.
[16] D. E. Edmunds and W. D. Evans, Spectral Theory and Differential Operators, Oxford University Press, Oxford, 1987.
[17] V. Enss, Asymptotic completeness for quantum mechanical potential scattering, Comm. Math. Phys. 61, 285–291 (1978).
[18] V. Enß, Schrödinger Operators, lecture notes (unpublished).
[19] S. Flügge, Practical Quantum Mechanics, Springer, Berlin, 1994.
[20] I. Gohberg, S. Goldberg, and N. Krupnik, Traces and Determinants of Linear Operators, Birkhäuser, Basel, 2000.
[21] S. Gustafson and I. M. Sigal, Mathematical Concepts of Quantum Mechanics, Springer, Berlin, 2003.
[22] P. R. Halmos, A Hilbert Space Problem Book, 2nd ed., Springer, New York, 1984.
[23] P. D. Hislop and I. M. Sigal, Introduction to Spectral Theory, Springer, New York, 1996.
[24] T. Kato, Perturbation Theory for Linear Operators, Springer, New York, 1966.
[25] P. D. Lax, Functional Analysis, Wiley-Interscience, New York, 2002.
[26] J. L. Kelly, General Topology, Springer, New York, 1955.
[27] W. Kirsch, An Invitation to Random Schrödinger Operators, in Random Schrödinger Operators, M. Dissertori et al. (eds.), 1–119, Panoramas et Synthèses 25, Société Mathématique de France, Paris, 2008.
[28] Y. Last, Quantum dynamics and decompositions of singular continuous spectra, J. Funct. Anal. 142, 406–445 (1996).
[29] B. M. Levitan, Inverse Sturm–Liouville Problems, VNU Science Press, Utrecht, 1987.
[30] B. M. Levitan and I. S. Sargsjan, Introduction to Spectral Theory, American Mathematical Society, Providence, 1975.
[31] B. M. Levitan and I. S. Sargsjan, Sturm–Liouville and Dirac Operators, Kluwer Academic Publishers, Dordrecht, 1991.
[32] E. Lieb and M. Loss, Analysis, American Mathematical Society, Providence, 1997.
[33] V. A. Marchenko, Sturm–Liouville Operators and Applications, Birkhäuser, Basel, 1986.
[34] M. A. Naimark, Linear Differential Operators, Parts I and II, Ungar, New York, 1967 and 1968.
[35] R. G. Newton, Scattering Theory of Waves and Particles, 2nd ed., Dover, New York, 2002.
[36] L. Pastur and A. Figotin, Spectra of Random and Almost-Periodic Operators, Springer, Berlin, 1992.
[37] D. Pearson, Quantum Scattering and Spectral Theory, Academic Press, London, 1988.
[38] P. Perry, Mellin transforms and scattering theory, Duke Math. J. 47, 187–193 (1987).
[39] E. Prugovečki, Quantum Mechanics in Hilbert Space, 2nd ed., Academic Press, New York, 1981.
[40] M. Reed and B. Simon, Methods of Modern Mathematical Physics I. Functional Analysis, rev. and enl. ed., Academic Press, San Diego, 1980.
[41] M. Reed and B. Simon, Methods of Modern Mathematical Physics II. Fourier Analysis, Self-Adjointness, Academic Press, San Diego, 1975.
[42] M. Reed and B. Simon, Methods of Modern Mathematical Physics III. Scattering Theory, Academic Press, San Diego, 1979.
[43] M. Reed and B. Simon, Methods of Modern Mathematical Physics IV. Analysis of Operators, Academic Press, San Diego, 1978.
[44] J. R. Retherford, Hilbert Space: Compact Operators and the Trace Theorem, Cambridge University Press, Cambridge, 1993.
[45] G. Roepstorff, Path Integral Approach to Quantum Physics, Springer, Berlin, 1994.
[46] F. S. Rofe-Beketov and A. M. Kholkin, Spectral Analysis of Differential Operators. Interplay Between Spectral and Oscillatory Properties, World Scientific, Hackensack, 2005.
[47] W. Rudin, Real and Complex Analysis, 3rd ed., McGraw-Hill, New York, 1987.
[48] M. Schechter, Operator Methods in Quantum Mechanics, North Holland, New York, 1981.
[49] B. Simon, Quantum Mechanics for Hamiltonians Defined as Quadratic Forms, Princeton University Press, Princeton, 1971.
[50] B. Simon, Functional Integration and Quantum Physics, Academic Press, New York, 1979.
[51] B. Simon, Schrödinger operators in the twentieth century, J. Math. Phys. 41:6, 3523–3555 (2000).
[52] B. Simon, Trace Ideals and Their Applications, 2nd ed., American Mathematical Society, Providence, 2005.
[53] E. Stein and R. Shakarchi, Complex Analysis, Princeton University Press, Princeton, 2003.
[54] G. Teschl, Jacobi Operators and Completely Integrable Nonlinear Lattices, Math. Surv. and Mon. 72, Amer. Math. Soc., Rhode Island, 2000.
[55] B. Thaller, The Dirac Equation, Springer, Berlin, 1992.
[56] B. Thaller, Visual Quantum Mechanics, Springer, New York, 2000.
[57] B. Thaller, Advanced Visual Quantum Mechanics, Springer, New York, 2005.
[58] W. Thirring, Quantum Mechanics of Atoms and Molecules, Springer, New York, 1981.
[59] G. N. Watson, A Treatise on the Theory of Bessel Functions, 2nd ed., Cambridge University Press, Cambridge, 1962.
[60] J. Weidmann, Linear Operators in Hilbert Spaces, Springer, New York, 1980.
[61] J. Weidmann, Spectral Theory of Ordinary Differential Operators, Lecture Notes in Mathematics 1258, Springer, Berlin, 1987.
[62] J. Weidmann, Lineare Operatoren in Hilberträumen, Teil 1: Grundlagen, B. G. Teubner, Stuttgart, 2000.
[63] J. Weidmann, Lineare Operatoren in Hilberträumen, Teil 2: Anwendungen, B. G. Teubner, Stuttgart, 2003.
[64] J. von Neumann, Mathematical Foundations of Quantum Mechanics, Princeton University Press, Princeton, 1996.
[65] D. R. Yafaev, Mathematical Scattering Theory: General Theory, American Mathematical Society, Providence, 1992.
[66] K. Yosida, Functional Analysis, 6th ed., Springer, Berlin, 1980.
[67] A. Zettl, Sturm–Liouville Theory, American Mathematical Society, Providence, 2005.
Glossary of notation

AC(I) ... absolutely continuous functions, 84
B = B¹, Bⁿ ... Borel σ-field of Rⁿ, 260
C(H) ... set of compact operators, 128
C(U) ... set of continuous functions from U to C
C∞(U) ... set of functions in C(U) which vanish at ∞
C(U, V) ... set of continuous functions from U to V
C_c^∞(U, V) ... set of compactly supported smooth functions
χ_Ω(·) ... characteristic function of the set Ω
dim ... dimension of a linear space
dist(x, Y) = inf_{y∈Y} ‖x − y‖, distance between x and Y
D(·) ... domain of an operator
e ... exponential function, e^z = exp(z)
E(A) ... expectation of an operator A, 55
F ... Fourier transform, 161
H ... Schrödinger operator, 221
H₀ ... free Schrödinger operator, 167
H^m(a, b) ... Sobolev space, 85
H^m(Rⁿ) ... Sobolev space, 164
hull(·) ... convex hull
H ... a separable Hilbert space
i ... complex unity, i² = −1
I ... identity operator
Im(·) ... imaginary part of a complex number
inf ... infimum
Ker(A) ... kernel of an operator A, 22
L(X, Y) ... set of all bounded linear operators from X to Y, 23
L(X) = L(X, X)
L^p(X, dµ) ... Lebesgue space of p integrable functions, 26
L^p_loc(X, dµ) ... locally p integrable functions, 31
L^p_c(X, dµ) ... compactly supported p integrable functions
L^∞(X, dµ) ... Lebesgue space of bounded functions, 26
L^∞_∞(Rⁿ) ... Lebesgue space of bounded functions vanishing at ∞
ℓ¹(N) ... Banach space of summable sequences, 13
ℓ²(N) ... Hilbert space of square summable sequences, 17
ℓ^∞(N) ... Banach space of bounded sequences, 13
λ ... a real number
m_a(z) ... Weyl m-function, 199
M(z) ... Weyl M-matrix, 211
max ... maximum
M ... Mellin transform, 251
µ_ψ ... spectral measure, 95
N ... the set of positive integers
N₀ = N ∪ {0}
o(x) ... Landau symbol little-o
O(x) ... Landau symbol big-O
Ω ... a Borel set
Ω_± ... wave operators, 247
P_A(·) ... family of spectral projections of an operator A, 96
P_± ... projector onto outgoing/incoming states, 250
Q ... the set of rational numbers
Q(·) ... form domain of an operator, 97
R(I, X) ... set of regulated functions, 112
R_A(z) ... resolvent of A, 74
Ran(A) ... range of an operator A, 22
rank(A) = dim Ran(A), rank of an operator A, 127
Re(·) ... real part of a complex number
ρ(A) ... resolvent set of A, 73
R ... the set of real numbers
S(I, X) ... set of simple functions, 112
S(Rⁿ) ... set of smooth functions with rapid decay, 161
sign(x) ... +1 for x > 0 and −1 for x < 0; sign function
σ(A) ... spectrum of an operator A, 73
σ_ac(A) ... absolutely continuous spectrum of A, 106
σ_sc(A) ... singular continuous spectrum of A, 106
σ_pp(A) ... pure point spectrum of A, 106
σ_p(A) ... point spectrum (set of eigenvalues) of A, 103
σ_d(A) ... discrete spectrum of A, 145
σ_ess(A) ... essential spectrum of A, 145
span(M) ... set of finite linear combinations from M, 14
sup ... supremum
supp(f) ... support of a function f, 7
Z ... the set of integers
z ... a complex number
√z ... square root of z with branch cut along (−∞, 0]
z* ... complex conjugation
A* ... adjoint of A, 59
Ā ... closure of A, 63
f̂ = Ff, Fourier transform of f, 161
f̌ = F⁻¹f, inverse Fourier transform of f, 163
‖·‖ ... norm in the Hilbert space H, 17
‖·‖_p ... norm in the Banach space L^p, 25
⟨·, ·⟩ ... scalar product in H, 17
E_ψ(A) = ⟨ψ, Aψ⟩, expectation value, 56
Δ_ψ(A) = E_ψ(A²) − E_ψ(A)², variance, 56
Δ ... Laplace operator, 167
∂ ... gradient, 162
∂_α ... derivative, 161
⊕ ... orthogonal sum of linear spaces or operators, 45, 79
M^⊥ ... orthogonal complement, 43
A′ ... complement of a set
(λ₁, λ₂) = {λ ∈ R | λ₁ < λ < λ₂}, open interval
[λ₁, λ₂] = {λ ∈ R | λ₁ ≤ λ ≤ λ₂}, closed interval
ψ_n → ψ ... norm convergence, 12
ψ_n ⇀ ψ ... weak convergence, 49
A_n → A ... norm convergence
A_n →s A ... strong convergence, 50
A_n ⇀ A ... weak convergence, 50
A_n →nr A ... norm resolvent convergence, 153
A_n →sr A ... strong resolvent convergence, 153
Index

a.e., see almost everywhere
absolute value of an operator, 99
absolute convergence, 16
absolutely continuous: function, 84; measure, 280
adjoint operator, 47, 59
algebra, 259
almost everywhere, 262
angular momentum operator, 176
B.L.T. theorem, 23
Baire category theorem, 32
Banach algebra, 24
Banach space, 13
Banach–Steinhaus theorem, 33
base, 5
basis, 14: orthonormal, 40; spectral, 93
Bessel: function, 171; spherical, 230
Bessel inequality, 39
Borel: function, 269; measure, 261; regular, 261; set, 260; σ-algebra, 260; transform, 95, 100
boundary condition: Dirichlet, 188; Neumann, 188; periodic, 188
bounded: operator, 23; sesquilinear form, 21
C-real, 83
canonical form of compact operators, 137
Cantor: function, 287; measure, 287; set, 262
Cauchy sequence, 6
Cauchy–Schwarz–Bunjakowski inequality, 18
Cayley transform, 81
Cesàro average, 126
characteristic function, 270
closable: form, 71; operator, 63
closed: form, 71; operator, 63
closed graph theorem, 66
closed set, 5
closure, 5: essential, 104
commute, 115
compact, 8: locally, 10; sequentially, 8
complete, 6, 13
completion, 22
configuration space, 56
conjugation, 83
continuous, 7
convergence, 6
convolution, 165
core, 63
cover, 8
C*-algebra, 48
cyclic vector, 93
dense, 6
dilation group, 223
Dirac measure, 261, 274
Dirichlet boundary condition, 188
discrete topology, 4
distance, 3, 10
distribution function, 261
domain, 22, 56, 58
dominated convergence theorem, 273
eigenspace, 112
eigenvalue, 74: multiplicity, 112
eigenvector, 74
element: adjoint, 48; normal, 48; positive, 48; self-adjoint, 48; unitary, 48
equivalent norms, 20
essential: closure, 104; range, 74; spectrum, 145; supremum, 26
expectation, 55
extension, 59
finite intersection property, 8
first resolvent formula, 75
form, 71: bound, 149; bounded, 21, 72; closable, 71; closed, 71; core, 72; domain, 68, 97; hermitian, 71; nonnegative, 71; semi-bounded, 71
Fourier: series, 41; transform, 126, 161
Friedrichs extension, 70
Fubini theorem, 276
function: absolutely continuous, 84
Gaussian wave packet, 175
gradient, 162
Gram–Schmidt orthogonalization, 42
graph, 63
graph norm, 64
Green's function, 171
ground state, 235
Hamiltonian, 57
harmonic oscillator, 178
Hausdorff space, 5
Heine–Borel theorem, 10
Heisenberg picture, 130
Hellinger-Toeplitz theorem, 67
Herglotz: function, 95; representation theorem, 107
Hermite polynomials, 179
hermitian: form, 71; operator, 58
Hilbert space, 17, 37: separable, 41
Hölder's inequality, 26
HVZ theorem, 242
hydrogen atom, 222
ideal, 48
identity, 24
induced topology, 5
inner product, 17
inner product space, 17
integrable, 273
integral, 270
interior, 6
interior point, 4
intertwining property, 248
involution, 48
ionization, 242
Jacobi operator, 67
Kato–Rellich theorem, 135
kernel, 22
KLMN theorem, 150
l.c., see limit circle
l.p., see limit point
Lagrange identity, 182
Laguerre polynomial, 231: generalized, 231
Lebesgue: decomposition, 281; measure, 262; point, 285
Legendre equation, 226
lemma: Riemann-Lebesgue, 165
limit circle, 187
limit point, 4, 187
Lindelöf theorem, 8
linear: functional, 24, 44; operator, 22
Liouville normal form, 186
localization formula, 243
maximum norm, 12
mean-square deviation, 56
measurable: function, 269; set, 260; space, 260
measure, 260: absolutely continuous, 280; complete, 268; finite, 260; growth point, 99; Lebesgue, 262; minimal support, 286; mutually singular, 280; product, 275; projection-valued, 88; space, 260; spectral, 95; support, 262
Mellin transform, 251
metric space, 3
Minkowski's inequality, 27
mollifier, 30
momentum operator, 174
monotone convergence theorem, 271
multi-index, 161: order, 161
multiplicity: spectral, 94
neighborhood, 4
Neumann: boundary condition, 188; function, spherical, 230; series, 76
Nevanlinna function, 95
Noether theorem, 174
norm, 12: operator, 23
norm resolvent convergence, 153
normal, 11, 91
normalized, 17, 38
normed space, 12
nowhere dense, 32
observable, 55
one-parameter unitary group, 57
ONB, see orthonormal basis
ONS, see orthonormal set
open ball, 4
open set, 4
operator: adjoint, 47, 59; bounded, 23; bounded from below, 70; closable, 63; closed, 63; closure, 63; compact, 128; domain, 22, 58; finite rank, 127; hermitian, 58; Hilbert–Schmidt, 139; linear, 22, 58; nonnegative, 68; normal, 60, 67, 91; positive, 68; relatively bounded, 133; relatively compact, 128; self-adjoint, 59; semi-bounded, 70; strong convergence, 50; symmetric, 58; unitary, 39, 57; weak convergence, 50
orthogonal, 17, 38: complement, 43; polynomials, 228; projection, 43; sum, 45
orthonormal: basis, 40; set, 38
orthonormal basis, 40
oscillating, 219
outer measure, 266
parallel, 17, 38
parallelogram law, 19
parity operator, 98
Parseval's identity, 163
partial isometry, 99
partition of unity, 11
perpendicular, 17, 38
phase space, 56
Plücker identity, 186
polar decomposition, 99
polarization identity, 19, 39, 58
position operator, 173
positivity: improving, 235; preserving, 235
premeasure, 260
probability density, 55
product measure, 275
product topology, 8
projection, 48
pure point spectrum, 106
Pythagorean theorem, 17, 38
quadrangle inequality, 11
quadratic form, 58, see form
Radon–Nikodym: derivative, 282; theorem, 282
RAGE theorem, 129
range, 22: essential, 74
rank, 127
reducing subspace, 80
regulated function, 112
relatively compact, 128
resolution of the identity, 89
resolvent, 74: convergence, 153; formula (first, 75; second, 135); Neumann series, 76; set, 73
Riesz lemma, 44
scalar product, 17
scattering operator, 248
scattering state, 248
Schatten p-class, 141
Schauder basis, 14
Schrödinger equation, 57
Schur criterion, 28
second countable, 5
second resolvent formula, 135
self-adjoint: essentially, 63
semi-metric, 3
separable, 6, 14
series: absolutely convergent, 16
sesquilinear form, 17: bounded, 21; parallelogram law, 21; polarization identity, 21
short range, 253
σ-algebra, 259
σ-finite, 260
simple function, 112, 270
simple spectrum, 94
singular values, 137
singularly continuous: spectrum, 106
Sobolev space, 164
span, 14
spectral: basis, 93 (ordered, 105); mapping theorem, 105; measure (maximal, 105); theorem, 97 (compact operators, 136); vector, 93 (maximal, 105)
spectrum, 73: absolutely continuous, 106; discrete, 145; essential, 145; pure point, 106; singularly continuous, 106
spherical coordinates, 224
spherical harmonics, 227
spherically symmetric, 166
∗-ideal, 48
∗-subalgebra, 48
stationary phase, 252
Stieltjes inversion formula, 95, 114
Stone theorem, 124
Stone's formula, 114
Stone–Weierstraß theorem, 52
strong convergence, 50
strong resolvent convergence, 153
Sturm comparison theorem, 218
Sturm–Liouville equation, 181: regular, 182
subcover, 8
subordinacy, 207
subordinate solution, 208
subspace: reducing, 80
superposition, 56
supersymmetric quantum mechanics, 180
support, 7
Temple's inequality, 120
tensor product, 46
theorem: B.L.T., 23; Baire, 32; Banach–Steinhaus, 33; closed graph, 66; dominated convergence, 273; Fubini, 276; Heine–Borel, 10; Hellinger-Toeplitz, 67; Herglotz, 107; HVZ, 242; Kato–Rellich, 135; KLMN, 150; Lebesgue decomposition, 281; Lindelöf, 8
theorem (continued): monotone convergence, 271; Noether, 174; Pythagorean, 17, 38; Radon–Nikodym, 282; RAGE, 129; Riesz, 44; Schur, 28; spectral, 97; spectral mapping, 105; Stone, 124; Stone–Weierstraß, 52; Sturm, 218; Urysohn, 10; virial, 223; Weierstraß, 14; Weyl, 146; Wiener, 126, 167
topological space, 4
topology: base, 5; product, 8
total, 14
trace, 143: class, 142
triangle inequality, 3, 12: inverse, 3, 12
trivial topology, 4
Trotter product formula, 131
uncertainty principle, 174
uniform boundedness principle, 33
unit vector, 17, 38
unitary group, 57: generator, 57; strongly continuous, 57; weakly continuous, 124
Urysohn lemma, 10
variance, 56
virial theorem, 223
Vitali set, 262
wave: function, 55; operators, 247
weak: Cauchy sequence, 49; convergence, 24, 49; derivative, 85, 164
Weierstraß approximation, 14
Weyl: M-matrix, 211; circle, 194; relations, 174; sequence, 76; singular, 145; theorem, 146
Weyl–Titchmarsh m-function, 199
Wiener theorem, 126
Wronskian, 182
Young's inequality, 165