PART X.1 - Superstring Theory

From First Principles January 2017 – R4.7
Maurice R. TREMBLAY
PART X – SUPERSTRING THEORY
A Calabi-Yau manifold is an example of a smooth space (i.e., it is Ricci flat: $R_{\mu\nu} = 0$) that represents a deformation which, from a space-time point of view, smooths out an orbifold singularity (i.e., an infinite amount of curvature located at each of the points).
Chapter 1

In 1964, Feynman said in his Messenger Lectures (published as The Character of Physical Law): “I think I can safely say that nobody understands quantum mechanics.” Well, that was back in the day when he worked on a quantum theory of gravitation (c.f., Acta Phys. Pol.) and if he were to extend the point particle idea to that of a vibrating string and develop a theory that describes energy and matter as being composed of tiny, wiggling strands of energy that look like strings, then surely he would say: “I think I can safely say that only Ed Witten understands string theory.” So, in no way does this set of slides come close to the height of his intellect, but it is meant to provide at least enough familiarity to study the more established and serious textbooks on the subject (c.f., References) and spearhead some gifted kid!*
Foreword
2017
MRT
As with my other work, nothing of this is new or even developed first hand and, frankly, it is a rearranged compilation of various quotes from various sources (c.f., ibid) that aims to display an abridged yet concise and straightforward mathematical development of string theory and superstrings (and some compactification as a consequence) as I have understood it and wish it to be presented to the layman or to the inquisitive person. Now, as a matter of convention, I have included the setting ħ ≡ c ≡ 1 in most of the equations and ancillary theoretical discussions and I use the summation convention that implies the summation over any repeated indices (typically subscript-superscript) in an equation.
* In order for someone to do active string theory research today, you have to be a very smart and committed person…Seriously!
Anyhow, this is my take on string theory… It will be a work in progress (in three parts probably) for some time to come as I myself mature and develop the skills to make sense of it all. String theory is a tough subject and the mathematical and physical intuition it requires is daunting, but it will be quite a challenging experience to unravel some of it!

Contents of 2-chapter PART X
A History of the Origins of String Theories
The Classical Bosonic String
The Quantum Bosonic String
The Interacting String
Fermions in String Theories
String Quantum Numbers
Anomalies
The Heterotic String
Compactification and N=1 SUSY
Compactification and Chiral Fermions
Compactification and Symmetry Breaking
Epilogue: Quantum Gravity
Appendix I: The Gamma Function
Appendix II: The Beta Function
Appendix III: Feynman’s Take on Gravitation
Appendix IV: Review of Supersymmetry
Appendix V: A Brief Review of Groups and Forms
References
“We propose that the ten-dimensional $E_8\times E_8$ heterotic string is related to an eleven-dimensional theory on the orbifold $R^{10}\times S^1/Z_2$ in the same way that the Type IIA string in ten dimensions is related to $R^{10}\times S^1$. This in particular determines the strong coupling behavior of the ten-dimensional $E_8\times E_8$ theory. […]” Petr Hořava and Edward Witten, Preface to their paper prescribing M-Theory for the first time (1995).


The story of the development of string theories is such a beautiful example of the
disorganized and unpredictable way in which elementary particle theory has evolved that
we cannot resist the temptation to give a brief historical (i.e., mathematical) introduction.
A History of the Origins of String Theories
During the early 1960s, particle physics, or at least the strong-interaction sector which
was receiving the most attention, both theoretical and experimental, was dominated by
S-matrix theory. Its basic ingredients were the general properties of the scattering matrix
or S-matrix (N.B., the S-matrix elements are given by 〈 f |S|i〉 which are the probability
amplitudes for the transition between an initial state i and a final state f ) rather than
fundamental fields described by Lagrangians. The S-matrix was expected to be
determined mainly by the requirements of analyticity, crossing, and unitarity, terms
which we shall now explain.
The seminal two-volume treatise (suitable for an ‘advanced’ graduate-level course) by M. Green, J. Schwarz and E. Witten, Superstring Theory, Vol. I - Introduction, Cambridge University Press (1987), gives, in Section 1.1 entitled The Early Days of Dual Models, a more detailed and systematic study than the rough outline that is presented in this chapter. That first section of their book begins with the paragraph: “In 1900, in the course of trying to fit to experimental data, Planck wrote down his celebrated formula for blackbody radiation (i.e., $I(\nu,T) = (2h\nu^3/c^2)\{1/[\exp(h\nu/k_B T) - 1]\}$). It does not usually happen in physics that an experimental curve is directly related to the fundamentals of a theory; normally they are related by a more or less intricate chain of calculations. But blackbody radiation was a lucky exception to this rule. In fitting to experimental curves, Planck wrote down a formula that directly led […] to the concept of the quantum.”

It is convenient to write the S-matrix as the sum:
$$S = 1 + iA$$
where 1 is the unit matrix representing processes in which the particles do not interact, and A is the scattering amplitude. The analyticity assumption is that A must be an analytic function of the Lorentz invariants that describe the process, with only isolated singularities. For the scattering of spinless particles (c.f., PART VIII – THE STANDARD MODEL: Scattering Experiments):
$$a + b \to c + d$$
the amplitude is an analytic function of the two independent Lorentz invariants that can be formed from the four-momenta (i.e., $p_a \equiv p_a^\mu$, &c.) of the particles:
$$s \equiv (p_a + p_b)^2 = (p_c + p_d)^2$$
and:
$$t \equiv (p_a - p_c)^2 = (p_b - p_d)^2$$
referred to as s-channel and t-channel, respectively. There is also $u = (p_a - p_d)^2 = (p_b - p_c)^2$; s, t, and u are the well-known Mandelstam variables.
[Diagram: the flow of the process a + b → c + d viewed in the s-channel (left) and in the t-channel (right).]

The amplitude will contain poles corresponding to the propagators of single particles. For example, a particle of spin J and mass M coupling to the state a+b will give a pole of the form:
$$A(s,t) = g_1 g_2\,\frac{P_J(z_s)}{s - M^2} + \cdots$$
where $g_1$ and $g_2$ are the coupling constants (see Figure), $P_J$ a Legendre function and $z_s$ is the cosine of the scattering angle in the center-of-mass system. In the simple case when the masses of the external particles are all equal to m, we readily calculate from $t = 2m^2 - 2E^2 + 2|\mathbf{p}|^2\cos\theta$ (c.f., op cit), together with $s = 4E^2$, that $z_s = 1 + 2t/(s - 4m^2)$. The physical region of the variables for this so-called s-channel process is given by the conditions:
$$s \ge 4m^2 \quad\text{and}\quad t \le 0$$
[Figure: Processes giving (Left) an s-channel pole at $s = M^2$, with couplings $g_1$ and $g_2$, and (Right) a t-channel pole at $t = M^2$, with couplings $g'_1$ and $g'_2$.]
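The kinematic statements above can be checked numerically. The following is a minimal sketch, assuming equal masses, the center-of-mass frame, and the metric signature (+, −, −, −); the function name `mandelstam` and the sample values are illustrative and not part of the original slides.

```python
import math


def mandelstam(m, p, cos_theta):
    """Return (s, t, u) for equal-mass 2 -> 2 scattering in the CM frame.

    m: common mass, p: magnitude of the CM three-momentum,
    cos_theta: cosine of the CM scattering angle (units with c = 1).
    """
    energy = math.sqrt(p * p + m * m)
    sin_theta = math.sqrt(1.0 - cos_theta ** 2)
    # Incoming particles along the z axis, outgoing ones rotated by theta.
    pa = (energy, 0.0, 0.0, p)
    pb = (energy, 0.0, 0.0, -p)
    pc = (energy, p * sin_theta, 0.0, p * cos_theta)
    pd = (energy, -p * sin_theta, 0.0, -p * cos_theta)

    def square(v):
        # Minkowski square with signature (+, -, -, -).
        return v[0] ** 2 - v[1] ** 2 - v[2] ** 2 - v[3] ** 2

    def combine(v, w, sign):
        return tuple(a + sign * b for a, b in zip(v, w))

    s = square(combine(pa, pb, +1))
    t = square(combine(pa, pc, -1))
    u = square(combine(pa, pd, -1))
    return s, t, u


if __name__ == "__main__":
    m, p, cos_theta = 1.0, 2.0, 0.3
    s, t, u = mandelstam(m, p, cos_theta)
    print("s + t + u =", s + t + u, "(expected 4 m^2 =", 4 * m * m, ")")
    print("z_s =", 1 + 2 * t / (s - 4 * m * m), "(input cos(theta) =", cos_theta, ")")
```

The sketch also illustrates that only two of the three invariants are independent, since s + t + u equals the sum of the squared external masses.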

Crossing is the assumption that the same scattering amplitude, analytically continued to appropriate regions of the variables, also describes the crossed processes:
$$a + \bar{c} \to \bar{b} + d \quad\text{and}\quad a + \bar{d} \to c + \bar{b}$$
The physical region for the first of these (called the t-channel process because in this case $t \equiv (p_a - p_c)^2 = (p_b - p_d)^2$ gives the total center-of-mass energy) is:
$$t \ge 4m^2 \quad\text{and}\quad s \le 0$$
Crossing requires that the amplitude has poles, analogous to those of A(s,t) above:
$$A(s,t) = g'_1 g'_2\,\frac{P_J(z_t)}{t - M^2} + \cdots$$
The other main ingredient of S-matrix theory is that S is unitary:
$$S S^\dagger = 1$$
in any physical region for any given process. This ensures the conservation of probability (i.e., that the sum over all the possible things that can happen is unity or always 100% every time, if you wish). We note that the sum over all possible intermediate states implied in the matrix multiplication $S^\dagger S = 1$ above automatically couples together an infinite sequence of possible processes. Also, since only those states that are energetically possible can contribute, unitarity requires the existence of new singularities at each new threshold. They are branch points, the first of which occurs at $s = 4m^2$ in our example above. More generally, they arise at values of s equal to the square of the sum of the masses of the particles in the intermediate state.

In order to make calculations in S-matrix theory, use was made of various pole approximations. Clearly, in the neighborhood of a pole (e.g., $s \approx M^2$), that pole will dominate the amplitude. Because of the large number of resonances, it seemed reasonable to guess that a sum over such resonances:
$$A(s,t) = \sum_i g_{i1}\, g_{i2}\,\frac{P_{J_i}(z_s)}{s - M_i^2}$$
might be a fair approximation to the amplitude in the physical s-channel region. Note that above the lowest s-channel threshold the states are unstable, so the masses are complex, $M_i \to M_i - \tfrac{1}{2} i\Gamma_i$, the imaginary part being the decay width $\Gamma_i$.
The t-channel poles, on the other hand, give the forces that dominate forward scattering (i.e., $z_s \cong 1$ so t is near zero). However, a t-channel pole corresponding to a particle of high spin would give an unacceptable contribution at high s, since it behaves like:
$$A(s,t) = g'_1 g'_2\,\frac{P_J(z_t)}{t - M^2} + \cdots \;\to\; A(s,t) \sim P_J(z_t) \sim z_t^{\,J} \sim s^J$$
at fixed t, whereas unitarity can be shown to allow only amplitudes growing no faster than s.

Now, in order to understand the significance of such t-channel poles for the s-channel scattering process, it is necessary to make a partial-wave expansion in the t-channel and to define an analytic continuation of the partial-wave amplitude to complex values of the angular momentum J. The poles then occur along so-called Regge trajectories:
$$J = \alpha(s)$$
The solutions of this equation for integer values of J give the mass-squared of the particle, the value of J being its spin.
For example, a nonrelativistic three-dimensional harmonic oscillator of classical frequency ω has the energy eigenvalues:
$$E_{n,J} = \hbar\omega\left(n + \frac{3}{2}\right) = \hbar\omega\left(2n_r + J + \frac{3}{2}\right)$$
where $n_r$ is the radial quantum number and J is the orbital angular momentum. Although J is quantized (J = 0,1,2,…) this equation can be regarded as giving the mass of the bound states $M_{n,J}$ ($= E_{n,J}$) as a function of J. We can invert it to write the Regge trajectory:
$$J = \frac{M}{\hbar\omega} - 2n_r - \frac{3}{2} \equiv \alpha(M^2)$$
More generally, a Regge trajectory connects particles with the same internal quantum numbers but different spins (see Figure).
[Figure: A typical Regge trajectory $J = \alpha(s)$, plotted as α(s) against s. This trajectory connects a particle of mass-squared $m_1^2$ and spin-1 and a particle of mass-squared $m_2^2$ and spin-2. There is also a spin-0 tachyon state of negative [mass]$^2$ (at $s = -m_0^2$), which must be removed by having zero residue.]
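The inversion of the oscillator spectrum into a trajectory can be sketched in a few lines. This is only an illustration of the formula above; the function names and the choice ħω = 1 are assumptions made for the example.

```python
def oscillator_energy(n_r, J, hbar_omega=1.0):
    """Energy E = hbar*omega*(2 n_r + J + 3/2) of the 3D harmonic oscillator."""
    return hbar_omega * (2 * n_r + J + 1.5)


def regge_alpha(mass, n_r, hbar_omega=1.0):
    """Trajectory J = M/(hbar*omega) - 2 n_r - 3/2 obtained by inverting the spectrum."""
    return mass / hbar_omega - 2 * n_r - 1.5


if __name__ == "__main__":
    # Bound states on the leading trajectory (n_r = 0): the trajectory
    # passes through integer J exactly at the bound-state masses.
    for J in range(4):
        M = oscillator_energy(0, J)
        print(f"J = {J}: M = {M}, alpha(M) = {regge_alpha(M, 0)}")
```

Each value of the radial quantum number gives its own parallel trajectory; the one with $n_r = 0$ lies highest in J for a given mass.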

The t-channel partial wave expansion does not converge in the s-channel physical region, but it is possible to replace the sum over integer J by a contour integral in the complex J plane which is well defined everywhere. For high s, this integral is dominated by the rightmost singularities in the J plane, which under suitable assumptions, will be Regge poles. We therefore obtain an approximation of the form:
$$A(s,t) \sim \sum_i \beta_i(t)\, s^{\alpha_i(t)}$$
showing that the large s amplitude at fixed t is dominated by the leading Regge trajectories (i.e., those with the largest $\alpha_i(t)$). Thus, the high-energy behavior of a given s-channel process is determined by the particles that can be exchanged in the t-channel (i.e., those with the t-channel quantum numbers).
The equations $A(s,t) = \sum_i g_{i1} g_{i2}\,[P_{J_i}(z_s)/(s - M_i^2)]$ and $A(s,t) \sim \sum_i \beta_i(t)\, s^{\alpha_i(t)}$ above are approximate expressions for the scattering amplitude in terms of the s- and t-channel poles, respectively. They have very different analytic forms but, if both are reasonable approximations to the amplitude in part of the physical region, they must be approximately equal to each other, at least in some average sense. This is the assumption of duality (i.e., the resonances approximately equal Regge poles) for hadron scattering:
$$\sum_i g_{i1}\, g_{i2}\,\frac{P_{J_i}(z_s)}{s - M_i^2} \cong \sum_i \beta_i(t)\, s^{\alpha_i(t)}$$

In 1968, Veneziano proposed the following toy model of a scattering amplitude A that satisfies duality:
$$A(s,t) = \frac{\Gamma[-\alpha(s)]\,\Gamma[-\alpha(t)]}{\Gamma[-\alpha(s) - \alpha(t)]} = B[-\alpha(s), -\alpha(t)] = \int_0^1 dx\; x^{-\alpha(s)-1}\,(1-x)^{-\alpha(t)-1}$$
where s and t are the Mandelstam variables from above and B is the combination of Euler Gamma functions, Γ, known as the Beta function. This expression is clearly crossing symmetric under s↔t. Provided the Regge trajectory function α(s) is linear:
$$\alpha(s) = \alpha(0) + \alpha' s$$
where α′ is the slope and α(0) the intercept of the trajectory, it has s- and t-channel poles corresponding to particles of all spins J = 0,1,2,…, whose masses are given by the solutions of:
$$\alpha(M^2) = J$$
(see Figure), because Γ(x) has poles at x = 0,−1,−2,…. Also, since:
$$\frac{\Gamma(x + a)}{\Gamma(x + b)} \xrightarrow[x \to \infty]{} x^{a-b}$$
the Veneziano amplitude also has the Regge asymptotic behavior of $A(s,t) \sim \sum_i \beta_i(t)\, s^{\alpha_i(t)}$ as s→∞ at fixed t. It can be shown that it fully satisfies the duality hypothesis in the sense that the sum over the infinite set of s-channel poles reproduces this Regge behavior.
[Figure: Regge trajectories of the Veneziano model (with scattering amplitude $A(s,t) = \Gamma[-\alpha(s)]\Gamma[-\alpha(t)]/\Gamma[-\alpha(s) - \alpha(t)]$), plotted as α(s) against s, with intercept α(0), slope $\alpha' = d\alpha(s)/ds$, and particle states at α(s) = 1, 2, 3.]
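The three properties just quoted (the Beta-function integral, crossing symmetry, and the Gamma-ratio asymptotics) can be checked numerically. The following is a minimal sketch using only the Python standard library; the function names and the sample arguments are illustrative, and the integral is evaluated only where it converges, i.e., for −α(s) > 0 and −α(t) > 0.

```python
import math


def veneziano(alpha_s, alpha_t):
    """A = Gamma(-alpha_s) Gamma(-alpha_t) / Gamma(-alpha_s - alpha_t), away from poles."""
    return (math.gamma(-alpha_s) * math.gamma(-alpha_t)
            / math.gamma(-alpha_s - alpha_t))


def beta_integral(a, b, steps=200_000):
    """Midpoint-rule estimate of int_0^1 x^(a-1) (1-x)^(b-1) dx, for a, b > 0."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        x = (k + 0.5) * h
        total += x ** (a - 1.0) * (1.0 - x) ** (b - 1.0)
    return total * h


def gamma_ratio_over_power(x, a, b):
    """Gamma(x+a)/Gamma(x+b) divided by x^(a-b); tends to 1 as x grows."""
    return math.exp(math.lgamma(x + a) - math.lgamma(x + b)) / x ** (a - b)


if __name__ == "__main__":
    alpha_s, alpha_t = -1.5, -2.5  # sample values inside the convergence region
    print("Gamma form   :", veneziano(alpha_s, alpha_t))
    print("Beta integral:", beta_integral(-alpha_s, -alpha_t))
    print("crossing     :", veneziano(alpha_t, alpha_s))
    print("near a pole  :", veneziano(1.0 - 1e-6, -0.5))
    print("Gamma ratio  :", gamma_ratio_over_power(1e6, 0.3, 1.7))
```

The near-pole value grows like the inverse distance from α(s) = 1, which is the numerical signature of the spin-1 pole on the trajectory.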

Generalizations to more complicated amplitudes, such as many-particle production processes, were soon made and, for a time, there was much activity trying to fit such dual-model Veneziano amplitudes to experimental data. A year later, in 1969, Virasoro generalized the Veneziano amplitude by using the four-point function:
$$A(s,t) = \frac{\Gamma[-\tfrac{1}{2}\alpha(s)]\;\Gamma[-\tfrac{1}{2}\alpha(t)]\;\Gamma[-\tfrac{1}{2}\alpha(u)]}{\Gamma[-\tfrac{1}{2}\alpha(s) - \tfrac{1}{2}\alpha(t)]\;\Gamma[-\tfrac{1}{2}\alpha(s) - \tfrac{1}{2}\alpha(u)]\;\Gamma[-\tfrac{1}{2}\alpha(t) - \tfrac{1}{2}\alpha(u)]}$$
The next step was to try to deduce these dual-model amplitudes from some underlying dynamics, and the key here was the realization that the spectrum of states is that of a relativistic string. It was then found that the interaction of two open strings led automatically to the existence of closed strings. The dual-string model required a new trajectory to have a slope α′/2, in rough agreement with experiment.
However, a major problem is that the particle poles are all on the real axis, whereas in reality they should lie below this axis if they are to correspond to unstable particles (i.e., if their masses are above the physical threshold). Attempts to correct this by putting branch points into the function α(s) are not successful because they destroy the linearity that is essential to achieve the other satisfactory properties of the model. Thus, dual models came to be seen as just the first-order term in some type of expansion, in which the higher-order terms would ensure unitarity and push the poles off the real axis.

In spite of interesting work on a possible dynamical theory underlying dual models,
and some success with its phenomenology, interest in the dual model of hadrons declined
during the early 1970s. Serious obstacles were that self-consistency required that the
ground state of the closed string must have spin-2, rather than the spin-1 required for the
Pomeron (i.e., a particle that was invented to explain the behavior of elastic-scattering
cross sections), and that the models are only fully satisfactory in higher-dimensional
spaces. Also, the clear evidence for a parton-like behavior in deep inelastic scattering
suggested the existence of pointlike constituents within hadrons and led to the
reinstatement of field theory, with quarks and gluons as the new fundamental fields. The
idea that hadrons are a self-consistent set of states uniquely determined just by the
postulates of S-matrix theory (i.e., the bootstrap hypothesis) became untenable.
The dual model, however, lived on. It survived by abandoning hadron physics and becoming instead a candidate theory of the fundamental particles. All of this stemmed from a suggestion by Scherk and Schwarz (1974) that the spin-2 value for the closed string ground state should be taken as an indication that the theory contains gravity, and that the slope α′ should be taken as $O(E_P^{-2})$ ($E_P$ being the Planck energy), so that only the ground states of the strings, with mass $\ll M_P \equiv E_P/c^2$, are observed in normal physics. Strings then play no direct part in low-energy physics; instead, their role is to improve the chance of obtaining a finite theory of all interactions, including gravity, by modifying the very short-distance behavior. However, higher dimensionality is required and it is hoped that a suitable compactification scheme, ideally one forced by the theory rather than selected by us to produce the required results, will lead to something like the standard supersymmetry (SUSY) model for $E \ll E_P$.

The ultraviolet (i.e., large-momentum or short-distance) infinities of field theory (i.e., the realm of Renormalization Theory) have their origin in the supposed pointlike nature of elementary particles, and are indeed related to the well-known divergence of the electromagnetic self-energy of a classical point charge. It is, therefore, natural to wonder whether elementary particles should be considered as extended objects rather than as points. (N.B., We are not thinking of extension in the sense that the proton, for example, has finite size owing to the spatial distribution of its pointlike constituents, the quarks. Rather, we are introducing size as an intrinsic property of even elementary objects.)
We recall from $S_p = -m\int d\tau\,\sqrt{g_{\mu\nu}\,\dot{x}^\mu \dot{x}^\nu}$ of the PART IX – SUPERSYMMETRY: The Newtonian Limit chapter that the action for a point particle is proportional to path length (see Figure):
$$S_p = -m\int d\tau\,\sqrt{\dot{x}\cdot\dot{x}}$$
where $\dot{x} \equiv dx/d\tau$ (N.B., x is shorthand for $x^\mu$) and τ is the parameter used to label positions on the trajectory. Minimization of this action with respect to variations in $x^\mu(\tau)$ yields $\partial[\dot{x}^\mu/\sqrt{\dot{x}\cdot\dot{x}}]/\partial\tau = 0$. If we choose τ as the proper time, this gives the expected result (c.f., $\ddot{x}^\rho + \Gamma^\rho_{\mu\nu}\,\dot{x}^\mu \dot{x}^\nu = 0$ with no gravitational fields, i.e., $g_{\mu\nu} = \eta_{\mu\nu}$ so $\Gamma^\rho_{\mu\nu} = 0$). The momentum conjugate to $x^\mu$ is given by:
$$p^\mu = \frac{\delta L}{\delta \dot{x}_\mu} = \frac{m\,\dot{x}^\mu}{\sqrt{\dot{x}\cdot\dot{x}}}$$
from which we obtain (N.B., p is also shorthand for $p^\mu$):
$$p\cdot p = m^2$$
[Figure: A particle trajectory $x^\mu = x^\mu(\tau)$ between $\tau_1$ and $\tau_2$, drawn against the axes $x^0$, $x^1$, $x^2$.]

We now generalize these results to a one-dimensional extended object (i.e., a string). We require the string to have a finite length, so it can be either an open string, with two free ends, or a closed string, which has no free ends but loops around on itself. The trajectories followed by such strings are two-dimensional surfaces in space-time (see Figure).
[Figure: Parameters τ and σ label the time and space directions, respectively, so that $\sigma_1$ labels a particular point on the string. (Left) A typical trajectory $x^\mu(\tau,\sigma)$ for an open string. (Right) A typical trajectory $x^\mu(\tau,\sigma)$ for a closed string. Both are drawn against the axes $x^0$, $x^1$, $x^2$ between $\tau_1$ and $\tau_2$.]
These surfaces can be labeled by two parameters τ and σ such that the space-time coordinate is:
$$x^\mu = x^\mu(\tau,\sigma)$$
where µ = 0,1,…,(D−1) with D being the dimension (e.g., D=4 is 4D). We will normally think of τ as labeling the time and σ the space direction. For the open string, we choose $0 \le \sigma \le \pi$ and, correspondingly, for the closed string we impose that $x^\mu(\tau,\sigma+\pi) = x^\mu(\tau,\sigma)$.

The obvious generalization of the path length $S_p = -m\int d\tau\,\sqrt{\dot{x}\cdot\dot{x}}$ above is to take the action of a string to be proportional to the area of the surface mapped out as it moves between $\tau_1$ and $\tau_2$. Thus, we put, where we reinsert the constants ħ and c for clarity:
$$L = -\frac{1}{2\pi\alpha'\hbar c}\sqrt{(\dot{x}\cdot x')^2 - (\dot{x}\cdot\dot{x})(x'\cdot x')} \quad\text{and}\quad S = \frac{1}{c}\int_{\tau_1}^{\tau_2} L\; d\sigma\, d\tau$$
where $x' \equiv \partial x/\partial\sigma$ and where α′ is a parameter that will later turn out to be the slope of the Regge trajectory associated with the string. Alternatively, $(2\pi\alpha'\hbar c)^{-1}$ represents the tension $T_o$ in the string. Up to factors of ħ and c, the string length is the square root of α′:
$$\ell_s = \hbar c\,\sqrt{\alpha'} = \sqrt{\frac{\hbar c}{2\pi T_o}}$$
The classical equation of motion for the closed string, obtained by minimizing S, is (c.f., the Euler-Lagrange equation of PART VIII – THE STANDARD MODEL: Field Equations):
$$\frac{\partial}{\partial\tau}\left(\frac{\delta L}{\delta \dot{x}_\mu}\right) + \frac{\partial}{\partial\sigma}\left(\frac{\delta L}{\delta x'_\mu}\right) = 0$$
or, explicitly:
$$\frac{\partial}{\partial\tau}\left[\frac{(\dot{x}\cdot x')\,x'^\mu - (x'\cdot x')\,\dot{x}^\mu}{L}\right] + \frac{\partial}{\partial\sigma}\left[\frac{(\dot{x}\cdot x')\,\dot{x}^\mu - (\dot{x}\cdot\dot{x})\,x'^\mu}{L}\right] = 0$$
This is a very complicated equation of motion! (N.B., here $L \to \sqrt{(\dot{x}\cdot x')^2 - (\dot{x}\cdot\dot{x})(x'\cdot x')}$!)

This last equation above can be simplified if we use the freedom of reparametrization invariance to require that the σ and τ directions be orthogonal (see Figure):
$$\dot{x}\cdot x' = 0$$
and also scale σ,τ so that the distance moved when σ changes by dσ is equal to that moved when τ changes by dτ = dσ. Recalling that these distances are spacelike and timelike, respectively, we can write this condition as:
$$\dot{x}\cdot\dot{x} + x'\cdot x' = 0$$
[Figure: Illustrating orthonormal coordinates on the open string trajectory, with lines of constant $\tau_1$, $\tau_2$, $\tau_3$ and constant $\sigma_1$, $\sigma_2$, $\sigma_3$.]
When these two last conditions above are satisfied, we are said to be using an orthonormal or conformal gauge. In such a gauge, the equation of motion above becomes the familiar wave equation:
$$\ddot{x}^\mu = x''^\mu$$
For some purposes it is useful to remove the remaining freedom in τ and σ by putting:
$$\tau = (\text{constant})\,(n_\mu x^\mu)$$
where n is a constant vector. One particular choice of n, in which it has the components:
$$n^\mu = \frac{1}{\sqrt{2}}\,[1, 0, 0, \ldots, -1]$$
so it is on the light-cone:
$$n\cdot n = 0$$
leads to the so-called light-cone or transverse gauge.

With $\ddot{x}^\mu = x''^\mu$ above, we can write the general solution for a closed string in the form:
$$x^\mu(\tau,\sigma) = X_L^\mu(\tau+\sigma) + X_R^\mu(\tau-\sigma)$$
where $X_L^\mu$, $X_R^\mu$ are arbitrary, twice-differentiable, functions that represent waves traveling in opposite directions (left and right, respectively) around the closed string. The boundary condition $x^\mu(\tau,\sigma+\pi) = x^\mu(\tau,\sigma)$ above implies that they can be written in the form:
$$\frac{1}{\sqrt{2\alpha'}}\,X_L^\mu = \frac{1}{2}q^\mu + \alpha_0^\mu(\tau+\sigma) + \frac{i}{2}\sum_{n\neq 0}\frac{1}{n}\,\alpha_n^{\mu L}\,e^{-2in(\tau+\sigma)}$$
and:
$$\frac{1}{\sqrt{2\alpha'}}\,X_R^\mu = \frac{1}{2}q^\mu + \alpha_0^\mu(\tau-\sigma) + \frac{i}{2}\sum_{n\neq 0}\frac{1}{n}\,\alpha_n^{\mu R}\,e^{-2in(\tau-\sigma)}$$
where $q^\mu$, $\alpha_0^\mu$, and $\alpha_n^{\mu\,L,R}$ are all Lorentz vectors, with $q^\mu$ and $\alpha_0^\mu$ being real (N.B., these are the center-of-mass position and momentum of the string, respectively) and with:
$$(\alpha_n^{L,R})^* = \alpha_{-n}^{L,R}$$
(N.B., The Fourier decompositions $X_L^\mu/\sqrt{2\alpha'}$ and $X_R^\mu/\sqrt{2\alpha'}$ above will be essential to quantize the string. Then the αs will have an interpretation as creation/annihilation operators and the inclusion of the factors $\sqrt{2\alpha'}$ and 1/n will give an appropriate normalization to the $\alpha_n$ parameters.)

The apparent simplicity of the solution $x^\mu(\tau,\sigma) = X_L^\mu(\tau+\sigma) + X_R^\mu(\tau-\sigma)$ is spoilt by the above constraints (i.e., $\dot{x}\cdot x' = 0$ and $\dot{x}\cdot\dot{x} + x'\cdot x' = 0$) which become:
$$\dot{X}_L\cdot\dot{X}_L = \dot{X}_R\cdot\dot{X}_R = 0$$
Using $X_{L,R}^\mu/\sqrt{2\alpha'}$ above, these equations imply that the so-called Virasoro generators defined by:
$$L_n^L \equiv -\frac{1}{2}\sum_{k=-\infty}^{+\infty}\alpha_k^L\cdot\alpha_{n-k}^L \quad\text{and}\quad L_n^R \equiv -\frac{1}{2}\sum_{k=-\infty}^{+\infty}\alpha_k^R\cdot\alpha_{n-k}^R$$
must be zero, that is:
$$L_n^L = L_n^R = 0 \quad (\text{for all } n)$$
The closed string that we have considered here is oriented in the sense that changing σ → −σ generally changes the solution. It is possible to consider an unoriented closed string on which we impose invariance under σ → −σ. Clearly, from $x^\mu(\tau,\sigma) = X_L^\mu(\tau+\sigma) + X_R^\mu(\tau-\sigma)$ above, this requires:
$$X_L^\mu = X_R^\mu \quad (\text{unoriented closed string})$$

The open string can be treated in a similar way, but a new problem arises: minimization of the action does not lead to the Euler-Lagrange equation of motion above because of the contributions of boundary terms at σ = 0 and π. These stem from the integration by parts. In order to remove the boundary terms, we require that open strings satisfy the boundary conditions:
$$x'^\mu(\tau,0) = x'^\mu(\tau,\pi) = 0$$
Equation $\dot{x}\cdot\dot{x} + x'\cdot x' = 0$ above then implies that $\dot{x}\cdot\dot{x} = 0$ at σ = 0 and σ = π (i.e., that the ends of the string move with the velocity of light!)
Given $x'^\mu(\tau,0) = x'^\mu(\tau,\pi) = 0$ above we can proceed as before, except that for open strings we need the standing-wave solution:
$$\frac{1}{\sqrt{2\alpha'}}\,x^\mu(\tau,\sigma) = q^\mu + \alpha_0^\mu\,\tau + i\sum_{n\neq 0}\frac{1}{n}\,\alpha_n^\mu\,e^{-in\tau}\cos(n\sigma)$$
where again $q^\mu$ and $\alpha_0^\mu$ are real and:
$$(\alpha_n^\mu)^* = \alpha_{-n}^\mu$$
As before, the constraint equations (i.e., $\dot{x}\cdot x' = 0$ and $\dot{x}\cdot\dot{x} + x'\cdot x' = 0$) imply that:
$$L_n \equiv -\frac{1}{2}\sum_{k=-\infty}^{+\infty}\alpha_k\cdot\alpha_{n-k} = 0$$
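A concrete solution makes these conditions easier to picture. The following is a minimal sketch, not taken from the slides, that uses the rigidly rotating open string $x^\mu = (\tau, \cos\sigma\cos\tau, \cos\sigma\sin\tau)$ in three space-time dimensions as an assumed example and checks the conformal-gauge constraints, the wave equation, the open-string boundary conditions, and the light-like motion of the ends. The derivatives are written out analytically and the metric signature is taken as (+, −, −).

```python
import math


def dot(v, w):
    """Minkowski product with signature (+, -, -)."""
    return v[0] * w[0] - v[1] * w[1] - v[2] * w[2]


# Rotating open string: x = (tau, cos(sigma) cos(tau), cos(sigma) sin(tau)).
def x_dot(tau, sigma):
    """First derivative with respect to tau."""
    return (1.0, -math.cos(sigma) * math.sin(tau), math.cos(sigma) * math.cos(tau))


def x_prime(tau, sigma):
    """First derivative with respect to sigma."""
    return (0.0, -math.sin(sigma) * math.cos(tau), -math.sin(sigma) * math.sin(tau))


def x_ddot(tau, sigma):
    """Second derivative with respect to tau."""
    return (0.0, -math.cos(sigma) * math.cos(tau), -math.cos(sigma) * math.sin(tau))


def x_pprime(tau, sigma):
    """Second derivative with respect to sigma."""
    return (0.0, -math.cos(sigma) * math.cos(tau), -math.cos(sigma) * math.sin(tau))


def constraint_residuals(tau, sigma):
    """Return (xdot.xprime, xdot.xdot + xprime.xprime); both vanish in conformal gauge."""
    xd, xp = x_dot(tau, sigma), x_prime(tau, sigma)
    return dot(xd, xp), dot(xd, xd) + dot(xp, xp)


if __name__ == "__main__":
    tau, sigma = 0.7, 1.1
    print("constraints       :", constraint_residuals(tau, sigma))
    print("wave equation     :", [a - b for a, b in zip(x_ddot(tau, sigma), x_pprime(tau, sigma))])
    print("x' at sigma = 0   :", x_prime(tau, 0.0))
    print("xdot.xdot at end  :", dot(x_dot(tau, 0.0), x_dot(tau, 0.0)))
```

The last line prints a value that vanishes up to rounding, which is the statement that the string ends move at the velocity of light.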

It is possible to simplify the constraint equations if we work in the light-cone gauge, defined by $\tau = (\text{constant})(n_\mu x^\mu)$ and $n^\mu = (1/\sqrt{2})[1,0,0,\ldots,-1]$ above. For this purpose we introduce D-dimensional light-cone coordinates:
$$x^\pm \equiv \frac{1}{\sqrt{2}}\,(x^0 \pm x^{D-1}), \qquad \mathbf{x} \equiv (x^1, x^2, \ldots, x^{D-2})$$
where bold-face quantities are vectors in the (D−2)-dimensional transverse space, with components $x^1, x^2, \ldots, x^{D-2}$. To put things into perspective, this reduces to the $x^\pm \equiv (1/\sqrt{2})(x^0 \pm x^3)$ and $x^1$, $x^2$ light-cone coordinates ($x^1$, $x^2$ being transverse coordinates) when D=4 is taken to be physical space-time. Then any vectors x and y satisfy:
$$x\cdot y = x^+ y^- + x^- y^+ - \mathbf{x}\cdot\mathbf{y}$$
Also, from $\tau = (\text{constant})(n_\mu x^\mu)$ and $n^\mu = (1/\sqrt{2})[1,0,0,\ldots,-1]$ above:
$$\tau = (\text{constant})\,x^+$$
and hence in $x^\mu(\tau,\sigma)/\sqrt{2\alpha'}$ above, $q^+$ and $\alpha_n^+$ (n≠0) are zero. Thus, the constraint $L_n \equiv -\tfrac{1}{2}\sum_k \alpha_k\cdot\alpha_{n-k} = 0$ above becomes an equation for $\alpha_n^-$:
$$\alpha_n^- = \frac{1}{2\alpha_0^+}\sum_{k=-\infty}^{+\infty}\boldsymbol{\alpha}_k\cdot\boldsymbol{\alpha}_{n-k}$$
showing that the only degrees of freedom are $\alpha_0^+$, $q^-$, and the transverse modes $\boldsymbol{\alpha}_k$.
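The light-cone decomposition of the scalar product is easy to confirm numerically. The sketch below assumes the signature (+, −, …, −) and stores a D-vector as a Python list `[x0, x1, ..., x(D-1)]`; the function names are illustrative only.

```python
import math
import random


def minkowski_dot(x, y):
    """Covariant product x.y with signature (+, -, ..., -)."""
    return x[0] * y[0] - sum(a * b for a, b in zip(x[1:], y[1:]))


def to_light_cone(x):
    """Split x into (x_plus, x_minus, transverse components)."""
    plus = (x[0] + x[-1]) / math.sqrt(2.0)
    minus = (x[0] - x[-1]) / math.sqrt(2.0)
    return plus, minus, x[1:-1]


def light_cone_dot(x, y):
    """Product computed as x+ y- + x- y+ minus the transverse dot product."""
    xp, xm, xt = to_light_cone(x)
    yp, ym, yt = to_light_cone(y)
    return xp * ym + xm * yp - sum(a * b for a, b in zip(xt, yt))


def light_cone_vector(dim):
    """The constant vector n = (1/sqrt 2) [1, 0, ..., 0, -1] in `dim` dimensions."""
    n = [0.0] * dim
    n[0] = 1.0 / math.sqrt(2.0)
    n[-1] = -1.0 / math.sqrt(2.0)
    return n


if __name__ == "__main__":
    random.seed(0)
    dim = 26
    x = [random.uniform(-1, 1) for _ in range(dim)]
    y = [random.uniform(-1, 1) for _ in range(dim)]
    n = light_cone_vector(dim)
    print("x.y covariant :", minkowski_dot(x, y))
    print("x.y light-cone:", light_cone_dot(x, y))
    print("n.n           :", minkowski_dot(n, n))
    print("n.x and x+    :", minkowski_dot(n, x), to_light_cone(x)[0])
```

The last line shows why the gauge choice $\tau \propto n_\mu x^\mu$ amounts to $\tau \propto x^+$.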

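Similarly, for the closed string, $\tau = (\text{constant})\,x^+$ above (N.B., recall that we also defined $x^\pm \equiv (1/\sqrt{2})(x^0 \pm x^3)$ above) implies that $(\alpha_n^+)^{L,R} = 0$ (for all n≠0), so the constraints give:
$$(\alpha_n^-)^L = \frac{1}{2\alpha_0^+}\sum_k \boldsymbol{\alpha}_k^L\cdot\boldsymbol{\alpha}_{n-k}^L \quad (\text{Left movers})$$
and
$$(\alpha_n^-)^R = \frac{1}{2\alpha_0^+}\sum_k \boldsymbol{\alpha}_k^R\cdot\boldsymbol{\alpha}_{n-k}^R \quad (\text{Right movers})$$
showing that the only degrees of freedom are $\alpha_0^+$, $q^-$, and the transverse modes $(\boldsymbol{\alpha}_k)^{L,R}$.
We calculate the momentum as for a point particle:
$$p^\mu = \frac{\delta L}{\delta \dot{x}_\mu} = \frac{\dot{x}^\mu}{2\pi\alpha'}$$
in the conformal gauge. The total momentum of the string is obtained by integrating $p^\mu$ over the whole string:
$$P^\mu = \int_0^\pi d\sigma\, p^\mu = \frac{1}{2\pi\alpha'}\,2\,(2\alpha')^{1/2}\,\alpha_0^\mu\,\pi = \sqrt{\frac{2}{\alpha'}}\;\alpha_0^\mu$$
for the closed string, and:
$$P^\mu = \frac{1}{\sqrt{2\alpha'}}\,\alpha_0^\mu = \frac{1}{2}\sqrt{\frac{2}{\alpha'}}\;\alpha_0^\mu$$
for the open string.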

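These equations determine the mass of the string. For the closed string we have:
$$M^2 = P\cdot P = \frac{2}{\alpha'}\,\alpha_0\cdot\alpha_0 = -\frac{2}{\alpha'}\sum_{n=1}^{\infty}\left(\alpha_{-n}^L\cdot\alpha_n^L + \alpha_{-n}^R\cdot\alpha_n^R\right)$$
where to obtain the last line we have used the zero-frequency (n = 0) constraints contained in $L_n^L = L_n^R = 0$ (for all n) above (N.B., $L_n^L = L_n^R = 0$ also implies that the L and R contributions to the mass are identical). For the open string we find, similarly:
$$M^2 = \frac{1}{2\alpha'}\,\alpha_0\cdot\alpha_0 = -\frac{1}{\alpha'}\sum_{n=1}^{\infty}\alpha_{-n}\cdot\alpha_n$$
In the light-cone gauge, these last two equations for the $M^2$ of closed and open strings become:
$$M^2 = \frac{2}{\alpha'}\sum_{n=1}^{\infty}\left(\boldsymbol{\alpha}_{-n}^L\cdot\boldsymbol{\alpha}_n^L + \boldsymbol{\alpha}_{-n}^R\cdot\boldsymbol{\alpha}_n^R\right) \quad (\text{closed})$$
and:
$$M^2 = \frac{1}{\alpha'}\sum_{n=1}^{\infty}\boldsymbol{\alpha}_{-n}\cdot\boldsymbol{\alpha}_n \quad (\text{open})$$
respectively.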

Before we proceed to quantize the string, it is of interest to introduce an equivalent classical model defined by the action (c.f., $S = \int_\tau L\, d\sigma\, d\tau$ above):
$$S = -\frac{1}{4\pi\alpha'}\int d\xi^0 d\xi^1\,\sqrt{h}\;h^{ab}\,\frac{\partial x}{\partial \xi^a}\cdot\frac{\partial x}{\partial \xi^b}$$
where $h = -\det(h_{ab})$ with $h_{ab}$ the metric on the two-dimensional world-sheet of the string. The indices a,b take values 0,1 (N.B., $\xi^0$, $\xi^1$ here play the roles of τ,σ used earlier). We can think of the action S above as describing a set of D scalar fields $x^\mu(\xi^0,\xi^1)$ on a two-dimensional space-time, with background metric $h_{ab}$ (i.e., the usual space-time variables have become scalar fields). Such an interpretation suggests several possible generalizations. One fairly obvious possibility is to add to S above the action of Einstein’s gravity on the two-dimensional world-sheet:
$$S^{(D=2)} = \int d\xi^0 d\xi^1\,\sqrt{h}\;R^{(D=2)}(h_{ab})$$
where $R^{(D=2)}(h_{ab}) \equiv h^{ab}R_{ab}^{(D=2)}$ is the scalar curvature obtained from $h_{ab}$ and where $R_{ab}^{(D=2)} = \partial_b\Gamma^c_{ac} - \partial_c\Gamma^c_{ab} - \Gamma^c_{ab}\Gamma^d_{cd} + \Gamma^c_{ad}\Gamma^d_{bc}$ with $\Gamma^c_{ab} = \tfrac{1}{2}h^{cd}(\partial_a h_{bd} + \partial_b h_{ad} - \partial_d h_{ab})$. However, in two dimensions, such a term is a constant that does not affect the dynamics!
The equivalence of $S = \int_\tau L\, d\sigma\, d\tau$ and S above, as classical theories, can be demonstrated by finding the Euler-Lagrange equation corresponding to variations in $h_{ab}$, solving it, and using the solution to eliminate $h_{ab}$ from S above, thereby recovering the action $S = \int_\tau L\, d\sigma\, d\tau$. However, such an equivalence does not in general hold in quantum theory because of the quantum fluctuations about the minimum solution for $h_{ab}$.

More can be said about the nature of these fluctuations by noting that $h_{ab}$, being symmetric, is defined by three independent functions of $\xi^0$, $\xi^1$. The freedom to reparametrize (i.e., reparametrization invariance) allows us to introduce two new arbitrary functions:
$$\xi^a \to \xi'^a = \xi'^a(\xi^0,\xi^1)$$
for a = 0,1, which can be chosen so that $h_{ab}$ has the form:
$$h_{ab} = \phi(\xi^0,\xi^1)\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} = \phi(\xi^0,\xi^1)\,\eta_{ab}$$
and so the only remaining degree of freedom is the scale (or conformal) factor $\phi(\xi^0,\xi^1)$. When we insert this $h_{ab}$ into S above, the factor φ clearly cancels (since $\sqrt{h} = \phi$ and $h^{ab} = \eta^{ab}/\phi$) and so plays no role in the classical equations. In fact, with $h_{ab}$ above, we can write S as:*
$$S = -\frac{1}{4\pi\alpha'}\int d\xi^0 d\xi^1\;\eta^{ab}\,\frac{\partial x}{\partial \xi^a}\cdot\frac{\partial x}{\partial \xi^b}$$
* Green, Schwarz & Witten (c.f., § 1.3.3), in their notation, use $h_{\alpha\beta} = \eta_{\alpha\beta}\exp\phi$ where $\exp\phi$ is an unknown conformal factor. Their free field action reduces to $S = -(T/2)\int d^2\sigma\;\eta_{\mu\nu}\,\eta^{\alpha\beta}\,\partial_\alpha X^\mu\,\partial_\beta X^\nu$ (with $d^2\sigma \equiv d\tau\, d\sigma$, and $X^\mu = X^\mu(\tau,\sigma)$ is their label for string position).
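The cancellation of the conformal factor can be verified directly for a 2×2 metric. This is a minimal sketch, assuming a world-sheet metric stored as nested tuples and a positive value of φ; the function names are illustrative.

```python
import math


def conformally_flat_metric(phi):
    """h_ab = phi * eta_ab with eta = diag(1, -1)."""
    return ((phi, 0.0), (0.0, -phi))


def density_times_inverse(h):
    """Return sqrt(-det h) * h^{ab}, the combination that enters the action."""
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    root = math.sqrt(-det)
    inverse = ((h[1][1] / det, -h[0][1] / det),
               (-h[1][0] / det, h[0][0] / det))
    return tuple(tuple(root * entry for entry in row) for row in inverse)


if __name__ == "__main__":
    # The combination equals eta^{ab} = diag(1, -1) for every positive phi.
    for phi in (0.1, 1.0, 2.5, 40.0):
        print(phi, density_times_inverse(conformally_flat_metric(phi)))
```

Because the output is the same for every φ, the classical action cannot depend on the conformal factor, which is the statement made above.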

In quantum theory it is necessary to do a path integral over all metrics and all trajectories $x(\xi^a)$. This requires us to define suitable weights for the path integrals and to remove the divergences that arise. A. M. Polyakov (1981) showed that S above is equivalent to $S = \int_\tau L\, d\sigma\, d\tau$ if, and only if, the dimension of space-time is D=26! This cancellation of divergences, which seems to be essential for a consistent string theory, is referred to as the cancellation of the conformal anomaly, where Lorentz invariance is only established in the critical dimension for bosonic strings:
$$D = 26$$
In the above discussion we have implicitly assumed that the background (i.e., D=26) space-time is flat (i.e., $g_{\mu\nu} = \eta_{\mu\nu}$); we have ignored gravity! Since the string automatically contains gravity, this is inconsistent. The correct requirement for conformal invariance to hold is that the background metric should satisfy an equation that, in lowest order, reduces to Einstein’s equation $R_{\mu\nu} - \tfrac{1}{2}R\,g_{\mu\nu} = -\kappa^2 T_{\mu\nu}$ (c.f., PART IX – SUPERSYMMETRY: The Inclusion of Matter or Appendix III: Feynman’s Take on Gravitation for a field theory derivation of gravitation).

The quantization of a classical system is not always a uniquely defined procedure. The classical limit can be regarded as the first term of an expansion in powers of ħ and there is clearly an infinite number of possible expansions that have the same zeroth-order limit. Several methods of quantizing the classical string, based on the analogy with point-particle models, have been used, and the results agree provided that certain consistency conditions are satisfied.
Here we use the canonical quantization method (as opposed to light-cone gauge quantization) in which we regard $x^\mu(\tau,\sigma)$ and its conjugate momentum $p^\mu(\tau,\sigma) = \dot{x}^\mu/2\pi\alpha'$ as operators that satisfy the equal-time commutation relation:
$$[x^\mu(\tau,\sigma),\, p^\nu(\tau,\sigma')] = -i\,g^{\mu\nu}\,\delta(\sigma - \sigma')$$
Using the Fourier expansion $x^\mu(\tau,\sigma)/\sqrt{2\alpha'} = q^\mu + \alpha_0^\mu\,\tau + i\sum_{n\neq 0}(1/n)\,\alpha_n^\mu \exp(-in\tau)\cos(n\sigma)$ for the open string, we see that this relation requires:
$$[q^\mu, \alpha_0^\nu] = -i\,g^{\mu\nu}$$
and:
$$[\alpha_m^\mu, \alpha_n^\nu] = -m\,\delta_{m+n,0}\;g^{\mu\nu}$$
all the other commutators being zero. Using $\alpha_m^\mu = (\alpha_{-m}^\mu)^\dagger$ we can replace $[\alpha_m^\mu, \alpha_n^\nu]$ above by:
$$[\alpha_m^{\mu\dagger}, \alpha_n^\nu] = n\,\delta_{mn}\;g^{\mu\nu}$$
which shows that we can regard $\alpha_n$ and its adjoint $\alpha_n^\dagger$, for positive n, as annihilation and creation operators, respectively.

We shall now impose the constraint $L_n \equiv -\tfrac{1}{2}\sum_{k=-\infty}^{+\infty}\alpha_k\cdot\alpha_{n-k} = 0$ of The Classical Bosonic String chapter by working in the light-cone gauge (N.B., there is also the Lorentz covariant quantization approach which will not be used here). Since there is no longitudinal motion, this means that only the transverse operators $\alpha_n^{\hat\nu}$ used in the canonical quantization method (where $\hat\nu = 1,2,\ldots,D-2$) represent independent degrees of freedom, and for these:
$$[\alpha_m^{\hat\mu\dagger}, \alpha_n^{\hat\nu}] = -n\,\delta_{mn}\,\delta^{\hat\mu\hat\nu}$$
assuming the background is flat (i.e., $g_{\mu\nu} = \eta_{\mu\nu}$). It should be noted, however, that this choice of gauge breaks the explicit Lorentz invariance of the theory, which is the source of a difficulty that we shall encounter below.
The vacuum state, |0〉, can be defined as the eigenstate of all the annihilation operators, $\alpha_n^{\hat\mu}$ (n positive), that has zero eigenvalues:
$$\alpha_n^{\hat\mu}\,|0\rangle = 0 \quad (n = 1, 2, \ldots)$$
Other states of the system can be obtained by acting on |0〉 with the creation operators $\alpha_{-1}^{\hat\nu}$, $\alpha_{-2}^{\hat\nu}$, &c.

We can calculate the mass of the resulting states by using the mass operator given
by $M^2 = (1/2\alpha')\,\alpha_0\cdot\alpha_0$ of The Classical Bosonic String chapter. However, when we
relate this to the $\alpha_n$ operators by means of the constraints (i.e., to obtain the analog of
$M^2 = -(1/\alpha')\sum_n\alpha_{-n}\cdot\alpha_n$), we meet a new problem. The operators must be normal ordered
(i.e., all the annihilation operators must be placed to the right of all the creation
operators). Since the order is irrelevant for classical quantities, and each change of
order introduces a constant as in $[\alpha_m^{\hat\mu\dagger},\alpha_n^{\hat\nu}] = -n\,\delta_{mn}\,\delta^{\hat\mu\hat\nu}$ above, the reordering required to
produce the quantized expression introduces an arbitrary constant. Thus, instead of
$M^2 = -(1/\alpha')\sum_n\alpha_{-n}\cdot\alpha_n$ of The Classical Bosonic String chapter we write:
$$M^2 = \frac{1}{\alpha'}\,[N - \alpha(0)]$$
where the number operator N is defined to be:
$$N = \sum_{n=1}^{\infty}\boldsymbol{\alpha}_{-n}\cdot\boldsymbol{\alpha}_n$$
in the light-cone gauge, and where we have called the unknown constant α(0). The
reason for this choice is that, if we rewrite $M^2 = (1/\alpha')[N-\alpha(0)]$ above in the form:
$$N = \alpha(0) + \alpha' M^2$$
it becomes reminiscent of the linear Regge trajectory of slope α′ and intercept α(0)
as in α(s)=α(0)+α′s of the A History of the Origins of String Theories chapter, if N is
interpreted as the angular momentum.

In order to study the angular momenta of the states, we need the angular-momentum
operators:
$$M^{\mu\nu} = \int_0^{\pi} d\sigma\,(x^\mu p^\nu - x^\nu p^\mu)$$
which are generalizations of the angular momentum in three dimensions discussed in
PART IX – SUPERSYMMETRY: The SUSY Algebra chapter or Appendix IV: Review of
Supersymmetry. In The Classical Bosonic String chapter, we used the expansion
$[1/\sqrt{2\alpha'}]\,x^\mu(\tau,\sigma) = q^\mu + \alpha_0^\mu\,\tau + i\sum_{n\neq 0}(1/n)\,\alpha_n^\mu\exp(-in\tau)\cos(n\sigma)$ and this becomes:
$$M^{\mu\nu} = q^\mu\alpha_0^\nu - q^\nu\alpha_0^\mu + i\sum_{n=1}^{\infty}\frac{1}{n}\,(\alpha_{-n}^\mu\alpha_n^\nu - \alpha_{-n}^\nu\alpha_n^\mu)$$
We can use this expression to calculate explicitly the commutation relations of the
different components of $M^{\mu\nu}$. These do not turn out to be exactly what one would expect
for angular momentum. In particular, when we use (±) light-cone gauge coordinates
$x^\pm \equiv (1/\sqrt{2})(x^0 \pm x^{D-1})$, and calculate the commutation relations between the components $M^{-\hat\mu}$
(where $\hat\mu$ are the transverse coordinates as in $[\alpha_m^{\hat\mu\dagger},\alpha_n^{\hat\nu}] = -n\,\delta_{mn}\,\delta^{\hat\mu\hat\nu}$ above), we obtain:
$$[M^{-\hat\mu},\,M^{-\hat\nu}] = -\frac{2\alpha'}{(\alpha_0^+)^2}\sum_{m=1}^{\infty}\Delta_m\,[\alpha_{-m}^{\hat\mu}\alpha_m^{\hat\nu} - \alpha_{-m}^{\hat\nu}\alpha_m^{\hat\mu}]$$
instead of zero, which would be expected if $M^{\mu\nu}$ obeyed the usual angular-momentum
commutation relations:
$$[M^{\mu\nu},\,M^{\rho\sigma}] = \eta^{\mu\rho}M^{\nu\sigma} + \text{cyclic permutations}$$

The quantity $\Delta_m$ in $[M^{-\hat\mu},M^{-\hat\nu}]$ above turns out to be:
$$\Delta_m \equiv m\left(\frac{26-D}{12}\right) + \frac{1}{m}\left[\frac{D-26}{12} + 2[1-\alpha(0)]\right]$$
So it appears that by working in a particular Lorentz frame (i.e., using the light-cone
gauge) we have lost Lorentz invariance and that our string theory is inconsistent (i.e.,
since ∆m ≠0). However, this problem does not arise if we impose the conditions:
$$D = 26 \quad\text{and}\quad \alpha(0) = 1$$
because then ∆m ≡0 and the required commutation relations hold.
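The vanishing of ∆m for every mode number can be checked directly. The following Python sketch evaluates the expression for ∆m above with exact rational arithmetic; the function names and the cut-off `max_mode` are assumptions introduced here for illustration only.

```python
from fractions import Fraction


def delta_m(m, D, intercept):
    """Anomaly coefficient for mode number m >= 1:

    Delta_m = m*(26 - D)/12 + (1/m)*[(D - 26)/12 + 2*(1 - alpha(0))]
    """
    m, D, a = Fraction(m), Fraction(D), Fraction(intercept)
    return m * (26 - D) / 12 + ((D - 26) / 12 + 2 * (1 - a)) / m


def lorentz_algebra_closes(D, intercept, max_mode=50):
    """True if Delta_m vanishes for all modes 1..max_mode (hypothetical cut-off)."""
    return all(delta_m(m, D, intercept) == 0 for m in range(1, max_mode + 1))


if __name__ == "__main__":
    for D, a in [(26, 1), (26, 0), (10, 1), (25, 1)]:
        print(f"D={D}, alpha(0)={a}: closes={lorentz_algebra_closes(D, a)}")
```

Because the m and 1/m terms must vanish separately, only the pair D=26, α(0)=1 passes the check.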
The existence of these consistency conditions is a remarkable new feature of strings
as compared to point particles. The masses of the states, and the dimension of space-
time in which they can exist, are uniquely specified. The ground state, with N=0, is,
according to N=α(0)+α′M², a tachyon (i.e., it has negative [mass]²):
$$M^2 = -\frac{1}{\alpha'}$$
where we are always taking α′, which gives the string tension, to be positive. The first
excited state:
$$|1^{\hat\mu}\rangle = \alpha_{-1}^{\hat\mu}\,|0\rangle$$
is massless and has spin-1. It has the expected 24 transverse degrees of freedom
associated with a massless vector particle in D=26 dimensions.
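As a small numerical illustration of the open-string mass formula M² = [N − α(0)]/α′ with α(0)=1, the Python sketch below lists the first few levels. The function names and the default units α′=1 are assumptions made here, not part of the original development.

```python
from fractions import Fraction


def open_string_mass_squared(level, alpha_prime=1, intercept=1):
    """M^2 = (N - alpha(0)) / alpha' for the open bosonic string."""
    return Fraction(level - intercept) / Fraction(alpha_prime)


def transverse_degrees_of_freedom(D=26):
    """Number of transverse directions available to a massless vector in D dimensions."""
    return D - 2


if __name__ == "__main__":
    for N in range(4):
        print(f"N={N}: M^2 = {open_string_mass_squared(N)} (units of 1/alpha')")
    print("transverse d.o.f.:", transverse_degrees_of_freedom())
```

The level N=0 gives the tachyon, N=1 the massless vector, and every higher level is spaced by 1/α′.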

This last remark offers a quick way of understanding the reason for the constraint α(0)
=1 above. If the states in $|1^{\hat\mu}\rangle = \alpha_{-1}^{\hat\mu}|0\rangle$ had a finite mass there would be an inconsistency
because their longitudinal degrees of freedom would be missing. And given the value of
α(0) we can understand why we need D=26. We first write $M^2 = (1/\alpha')\sum_n\boldsymbol{\alpha}_{-n}\cdot\boldsymbol{\alpha}_n$ (for an
open string) of The Classical Bosonic String chapter in the form:
$$M^2 = \frac{1}{2\alpha'}\sum_{n=1}^{\infty}(\boldsymbol{\alpha}_{-n}\cdot\boldsymbol{\alpha}_n + \boldsymbol{\alpha}_n\cdot\boldsymbol{\alpha}_{-n})$$
that is, with the two possible orderings included equally. If we regard this equation as the
definition of the quantum operator M², and put the $\boldsymbol{\alpha}_n$ into normal order, we find:
$$M^2 \;\text{“=”}\; \frac{1}{\alpha'}\sum_{n=1}^{\infty}\boldsymbol{\alpha}_{-n}\cdot\boldsymbol{\alpha}_n + \frac{1}{2\alpha'}\sum_{n=1}^{\infty}[\alpha_n^{\hat\mu},\,\alpha_{-n}^{\hat\mu}] \;\text{“=”}\; \frac{1}{\alpha'}N + \frac{D-2}{2\alpha'}\sum_{n=1}^{\infty} n$$
where in the last equation we have used $[\alpha_m^{\hat\mu\dagger},\alpha_n^{\hat\nu}] = -n\,\delta_{mn}\,\delta^{\hat\mu\hat\nu}$ above with $\alpha_m^\dagger = \alpha_{-m}$.
These supposed equalities (i.e., “=”) have been put in quotes because strictly they are
meaningless, since the final sum does not converge. It can, however, be given a
meaning by using zeta-function regularization.

To this end, we note that the Riemann zeta-function ζ(s) is given by:
$$\zeta(s) = \sum_{n=1}^{\infty} n^{-s} \qquad (\text{for } \mathrm{Re}\,s > 1)$$
and can also be represented by:
$$\zeta(s) = \frac{1}{\Gamma(s)}\int_0^{\infty} dt\; t^{s-1}\,\frac{e^{-t}}{1-e^{-t}}$$
for Re s>1. This last integral can be analytically continued to Re s<0 by doing the
integration by parts. In particular, we find that:
$$\zeta(-1) = -\frac{1}{12}$$
Hence, $M^2$ “=” $(1/\alpha')N + [(D-2)/2\alpha']\sum_n n$ above becomes:
$$M^2 = \frac{1}{\alpha'}N - \frac{D-2}{24\alpha'}$$
so to get α(0)=1 we require D=26 (c.f., N=α(0)+α′M² above).
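The value −1/12 can be made plausible numerically without the analytic continuation. The sketch below swaps in a different technique, an exponential cut-off, as a cross-check on the zeta-function result: the regulated sum Σ n·exp(−εn) equals 1/ε² − 1/12 + O(ε²), so subtracting the divergent 1/ε² piece leaves the same finite part. The function names and the cut-off scheme are assumptions introduced here.

```python
import math
from fractions import Fraction


def regulated_sum(eps):
    """sum_{n>=1} n*exp(-eps*n), in closed form x/(1-x)^2 with x = exp(-eps)."""
    one_minus_x = -math.expm1(-eps)  # accurate 1 - exp(-eps) for small eps
    return math.exp(-eps) / one_minus_x ** 2


def finite_part(eps):
    """Regulated sum with the divergent 1/eps^2 term removed."""
    return regulated_sum(eps) - 1.0 / eps ** 2


def intercept_from_dimension(D):
    """alpha(0) = -(D - 2)/2 * zeta(-1) = (D - 2)/24, using zeta(-1) = -1/12."""
    return Fraction(D - 2, 2) * Fraction(1, 12)


if __name__ == "__main__":
    for eps in (0.5, 0.1, 0.01):
        print(f"eps={eps}: finite part = {finite_part(eps):.8f}")
    print("alpha(0) for D=26:", intercept_from_dimension(26))
```

As ε decreases the finite part settles toward −0.0833…, and the intercept equals 1 only for D=26.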

It is important to know whether these conditions on D and α(0) are essential or are
merely a consequence of the choice of the method of quantization. Since they stem from
the breaking of Lorentz invariance in $[\alpha_m^{\hat\mu\dagger},\alpha_n^{\hat\nu}] = -n\,\delta_{mn}\,\delta^{\hat\mu\hat\nu}$ above, it is necessary to see
what happens if we quantize in a manifestly Lorentz-invariant manner. However,
problems then arise through the occurrence of states of negative norm called ghosts. To
understand the origin of these ghosts we start with the vacuum state, |0〉, which has
positive norm:
$$\langle 0|0\rangle = 1$$
Then we consider the one-particle timelike excitation:
$$|1^0\rangle \equiv \alpha_{-1}^0\,|0\rangle$$
for which:
$$\langle 1^0|1^0\rangle = \langle 0|(\alpha_{-1}^0)^\dagger\,\alpha_{-1}^0|0\rangle = \langle 0|\alpha_1^0\,\alpha_{-1}^0|0\rangle = -\langle 0|0\rangle$$
from $[\alpha_m^\mu,\alpha_n^\nu] = -m\,\delta_{m+n,0}\,g^{\mu\nu}$ above. Thus $|1^0\rangle$ is a ghost state with negative norm.
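The sign of the norm for a general level-one state follows from the same commutator. A minimal Python sketch, assuming a real polarization vector ε and the metric signature (+, −, …, −) implied by the transverse commutator above (both the helper names and the restriction to real ε are assumptions made here):

```python
def minkowski_metric(D):
    """Diagonal entries of g^{mu nu} with signature (+, -, ..., -)."""
    return [1] + [-1] * (D - 1)


def level_one_norm(polarization, metric):
    """Norm of eps_mu alpha_{-1}^mu |0>, from [alpha_1^mu, alpha_{-1}^nu] = -g^{mu nu}."""
    return -sum(g * e * e for g, e in zip(metric, polarization))


if __name__ == "__main__":
    D = 26
    g = minkowski_metric(D)
    timelike = [1] + [0] * (D - 1)
    spacelike = [0, 1] + [0] * (D - 2)
    print("timelike norm:", level_one_norm(timelike, g))    # negative: a ghost
    print("spacelike norm:", level_one_norm(spacelike, g))  # positive
```

Only the timelike direction produces a negative norm, which is why the ghosts are tied to the manifestly covariant treatment.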

Since negative norm states are meaningless (e.g., they would give problems with
unitarity because they imply a negative probability), we must somehow exclude them
from the physical spectrum. Now, from $L_n \equiv -\tfrac{1}{2}\sum_{k=-\infty}^{\infty}\alpha_k\cdot\alpha_{n-k} = 0$ of The Classical Bosonic
String chapter, physical states have to be eigenstates of the Virasoro generators $L_n$ with
zero eigenvalues. This condition can only be imposed for positive n, and it suffices to
remove the ghosts provided that:
$$D = 26 \quad\text{and}\quad \alpha(0) = 1$$
or:
$$D < 26 \quad\text{and}\quad \alpha(0) \le 1$$
Thus, the conditions for the removal of ghosts are somewhat less restrictive than just D=
26 and α(0)=1 alone. However, D=26 appears to be necessary when we consider
interacting strings in order to remove the conformal anomaly we mentioned earlier. So
there is general agreement that the natural dimension for the quantized bosonic string is
D=26.

We now briefly discuss the closed string, for which essentially the same procedure can
be followed. If we start from the light-cone gauge expression
$M^2 = (2/\alpha')\sum_n(\boldsymbol{\alpha}^L_{-n}\cdot\boldsymbol{\alpha}^L_n + \boldsymbol{\alpha}^R_{-n}\cdot\boldsymbol{\alpha}^R_n)$
(for a closed string) of The Classical Bosonic String chapter, we again find that we
require D=26, and the mass spectrum becomes:
$$M^2 = \frac{2}{\alpha'}\,(N^L + N^R - 2)$$
where $N^L$ and $N^R$ are number operators for the left- and right-moving sectors:
$$N^{L,R} \equiv \sum_{n=1}^{\infty}\boldsymbol{\alpha}^{L,R}_{-n}\cdot\boldsymbol{\alpha}^{L,R}_n$$
which, as before, are required to be equal by the constraints. The slope of the Regge
trajectory for the closed string is thus α′/2, and the zero mass state now has spin-2
($N^L = N^R = 1$).
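The closed-string levels can be tabulated in the same spirit. The sketch below uses the mass formula above together with the stated trajectory (slope α′/2, spin 2 at zero mass) to assign a leading spin to each level; the function names and the units α′=1 are illustrative assumptions.

```python
from fractions import Fraction


def closed_string_mass_squared(n_left, n_right, alpha_prime=1):
    """M^2 = (2/alpha') * (N_L + N_R - 2); the constraints require N_L == N_R."""
    if n_left != n_right:
        raise ValueError("the constraints require N_L == N_R")
    return Fraction(2 * (n_left + n_right - 2)) / Fraction(alpha_prime)


def leading_spin(mass_squared, alpha_prime=1):
    """Spin on the leading trajectory: J = 2 + (alpha'/2) * M^2."""
    return 2 + Fraction(alpha_prime) / 2 * mass_squared


if __name__ == "__main__":
    for N in range(4):
        m2 = closed_string_mass_squared(N, N)
        print(f"N_L=N_R={N}: M^2 = {m2}, leading spin = {leading_spin(m2)}")
```

The ground state is again tachyonic (M² = −4/α′), and the massless level carries spin 2.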
This concludes our discussion of the quantized bosonic string. We have found that it
belongs most naturally in D=26 dimensions. The open string then has a zero-mass
spin-1 (vector) boson while the closed string has a zero-mass spin-2 boson. So there
are already hints that string theories may contain both Yang-Mills and gravity gauge
fields! But the occurrence of the tachyons (i.e., ground states with negative [mass]²) is
unsatisfactory. In string theories, we need to impose supersymmetry (c.f., Appendix IV:
Review of Supersymmetry) to eliminate the tachyons! Finally, we remind the reader that if
1/α′ in $M^2 = (1/\alpha')[N-\alpha(0)]$ above is $O(M_P^2)$ then the excited states of the strings are all
expected to have mass $O(M_P)$ and hence be unobservable in low-energy particle
physics.

In this introduction to the classical and quantum string we have only discussed
free (bosonic) strings. We must now introduce interactions between the strings. A proper
treatment of this topic presumably requires a quantum field theory of strings, with
operators that create and annihilate strings, analogous to those employed in a field
theory of particles.
Here we shall proceed by noting that the chief practical consequence of a field theory
of particles is the Feynman diagrams, including the rules for their evaluation. The most
primitive Feynman diagram has two particles coming together to form a third (see
Figure 1), and more complex diagrams can be built up by joining such primitive
diagrams together.
Figure 1: A Feynman diagram for the process a + b → c.

By analogy, we expect that the corresponding theory of strings will have a basic
diagram in which two strings join to form a new string. Various possibilities, involving
both open and closed strings, are illustrated in Figure 2.
All the other scattering processes can then be obtained by combining together these
basic diagrams.
Figure 2: The world-sheets corresponding to (Left) two open strings joining to make an open string,
(Middle) two closed strings forming a closed string, and (Right) two open strings joining to make a
closed string. Relations between the couplings g, κ, and κ′ are discussed in remark (4) in the slides.

For example, just as we can use the particle diagram of Figure 1 twice in order to
obtain the single-particle exchange contributions to particle scattering shown in
Figure 3 - Left and Middle, so we can use two diagrams like Figure 2 - Left to obtain
the open-string contribution to open-string scattering shown in Figure 3 - Right.
Figure 3: Diagrams (Left) and (Middle) show particle scattering due to single particle exchange; (Right)
shows the corresponding open-string amplitude, which includes (Left) and (Middle).

We should note, however, some very important differences between the string and
particle diagrams:
1. In contrast to point-particle field theories there is apparently no freedom in string
interactions, because there is no particular point that can be identified with a vertex.
Indeed, every part of a string diagram is simply a freely propagating string in some frame
of reference, so there is no Lorentz-invariant way of specifying the space-time point at
which the two strings join. We see this by reference to Figure 4.
If we choose the time axis t to be horizontal, the strings join at P. On the other hand, if
we view the system from a different reference frame in which time is measured along t′,
then P is just one end of a freely propagating string.
Figure 4: Showing how the intersecting point depends on the choice of time axis.
2. Because of remark (1), the potentially ultraviolet-divergent regions of integration, which
in Feynman diagrams occur when vertices are close together (i.e., at high momentum
transfer), are missing from string diagrams. This is of course the reason why we hope
that the field theory of an extended object like a string may be finite.

3. There are many fewer string diagrams. For example, the string diagram of Figure 3 -
Right contains the particle exchange diagram of Figure 3 - Left corresponding to the
exchange of all the particles that are described by the string X (e.g., the infinite sum of
states $A(s,t) = \sum_i g_{i1}\,g_{i2}\,P_{J_i}(z_s)/(s-M_i^2)$ of the A History of the Origins of String Theories
chapter). Further, Figure 3 - Right can be regarded as describing strings a and b joining
to form a single string X, which then splits again, or as strings a and c joining to form the
string Y, which subsequently splits. Thus, like A(s,t)=Γ[−α(s)]Γ[−α(t)]/Γ[−α(s)−α(t)] of
the A History of the Origins of String Theories chapter, the single string diagram contains
both of the particle exchange diagrams Left and Middle of Figure 3. So, whereas the
Feynman diagrams, Figure 3 - Left and Middle, have to be added, both are included in
the single string diagram of Figure 3 - Right. Similarly, in Figure 5, the one-loop string
diagram in Left includes both of the Feynman diagrams shown in Middle and Right.
Figure 5: Showing how one string loop diagram (Left) contains two different types of Feynman loop
diagrams (Middle) and (Right).

4. The fact that the string diagrams can be interpreted in these different ways means
that the various coupling constants are related. Thus, initially the theory may appear to
allow three coupling constants g, κ, and κ ′, associated with the three string interactions
shown in Figure 2. However, if we consider Figure 6 we see that (reading from left to
right) it is a closed+open→closed+open diagram, with coupling constant κκ′. Hence,
the coupling constants κ and κ ′ must be equal.
Figure 6: An open, open, closed, closed scattering diagram that relates κ and κ ′.
If we are hoping to identify one of the states of the closed string with the graviton, this
equality is of course essential to ensure that open and closed strings all fall at the same
rate in a gravitational field.

The other relation that we obtain is, however, new. To derive it we consider the contri-
bution to the scattering of two open strings by the one-loop diagram of Figure 7.
Figure 7: A one-loop diagram for the scattering of two open strings. This diagram relates the gauge and
gravitational coupling constants, g and κ respectively.
If we interpret this as going from left to right the intermediate state is two open
strings, so it is:
$$\sim g^4$$
whereas if we view it as going from bottom to top it is one closed string and hence:
$$\sim \kappa'^{\,2}$$
Thus, we can relate g⁴ to κ² = κ′²:
$$g^4 = \kappa^2\,\alpha'^{\,n}$$
where the power of α′ is n=(D/2)−3, chosen to make the dimensions correct. Note
that in D dimensions g has the dimensionality of [length]^((D/2)−2), while κ has [length]^((D/2)−1)
and α′ has [length]².
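The power n can be recovered by simple dimension counting. A short Python sketch, using only the length dimensions quoted above (the dictionary layout and function names are assumptions made for illustration):

```python
from fractions import Fraction


def length_dimensions(D):
    """Powers of [length] carried by g, kappa and alpha' in D dimensions."""
    D = Fraction(D)
    return {"g": D / 2 - 2, "kappa": D / 2 - 1, "alpha_prime": Fraction(2)}


def alpha_prime_power(D):
    """Solve 4*[g] = 2*[kappa] + n*[alpha'] for the exponent n."""
    dims = length_dimensions(D)
    return (4 * dims["g"] - 2 * dims["kappa"]) / dims["alpha_prime"]


if __name__ == "__main__":
    for D in (4, 10, 26):
        print(f"D={D}: n = {alpha_prime_power(D)}")
```

For every D the result agrees with n=(D/2)−3; in particular n=10 for D=26.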

The argument above was originally used by C. Lovelace in the context of hadronic
string theories to show that unitarity required the existence of closed strings, and hence
of the pomeron (i.e., a Regge trajectory postulated in 1961 to explain the slowly rising
cross section of hadronic collisions at high energies), with a coupling given by $g^4 = \kappa^2\alpha'^{\,n}$
above (and do recall that n=(D/2)−3). In the present context, it shows instead the
necessity for gravity. The relation between the gravitational coupling κ and the gauge
coupling g given in $g^4 = \kappa^2\alpha'^{\,n}$ above does not immediately imply anything about the
four-dimensional couplings because they have different dimensions, and so the two sides of
the equation will be scaled by the compactification volume in different ways.
We shall see later (c.f., The Heterotic String chapter) that the most promising theories
are based on a single closed string that contains both the gauge vector particles and the
graviton. Then $g^4 = \kappa^2\alpha'^{\,n}$ above is replaced by:
$$g^2 = \frac{\kappa^2}{\alpha'}$$
This relation is dimensionally correct for any value of D and so both sides scale in the
same way on compactification. Thus, provided α′ is of the order of κ², the theory
predicts gauge couplings that are of O(1) as required.
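The claim that the closed-string relation is dimensionally correct for any D can be checked with the same counting. A minimal sketch, assuming the length dimensions of g, κ and α′ quoted earlier for D dimensions (the function name is hypothetical):

```python
from fractions import Fraction


def closed_string_relation_is_consistent(D):
    """Compare the powers of [length] on both sides of g^2 = kappa^2 / alpha'."""
    D = Fraction(D)
    g_dim = D / 2 - 2        # [g] = length^(D/2 - 2)
    kappa_dim = D / 2 - 1    # [kappa] = length^(D/2 - 1)
    alpha_prime_dim = 2      # [alpha'] = length^2
    return 2 * g_dim == 2 * kappa_dim - alpha_prime_dim


if __name__ == "__main__":
    print(all(closed_string_relation_is_consistent(D) for D in range(2, 40)))
```

Both sides carry [length]^(D−4), so no extra power of α′ is needed, in contrast with the open-string relation.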

The evaluation of a scattering amplitude relies on the freedom in the choice of the
world-sheet metric, which was mentioned at the end of The Classical Bosonic String
chapter. To understand why this is so, recall that the action is expressed by:
$$S = -\frac{1}{4\pi\alpha'}\int d\xi^0\,d\xi^1\,\sqrt{h}\;h^{ab}\,\frac{\partial x}{\partial\xi^a}\cdot\frac{\partial x}{\partial\xi^b}$$
as an integral over the two-dimensional world-sheet of the string. For scattering
processes, such world sheets contain portions corresponding to the incoming and
outgoing strings that extend to infinity. However, we can make a conformal rescaling of
the metric, $h_{ab} \to \exp(\varphi)\,h_{ab}$, and by a suitable choice of the function φ it is possible to map
these surfaces into compact surfaces on which the external strings appear as point
vertices (see Figure 6). Each of these points gives a certain vertex factor in the integral.
Figure 6: The scattering diagram for two closed strings can be mapped into the surface of a sphere.
We shall not discuss the mathematical expressions that lie behind the diagrams that
follow, for which the reader should consult such a reference as M. Green, J. H. Schwarz,
and E. Witten, Superstring Theory, Vols 1 and 2, Cambridge University Press (1987), but
there is one important feature that we need to examine.

The next stage in the evaluation of the amplitude is to reparametrize the surface so
that the metric takes a simple form. For example, in the case of the planar diagram for
the scattering of two closed strings shown in Figure 6, we can use the standard metric
on the surface of a sphere. With this choice, the path integral over the metric
disappears, and what remains is just a two-dimensional field theory in which the
diagrams can readily be evaluated. However, there are two potential difficulties with the
above procedure.
The first is that the conformal factor φ does not, in general, disappear from the
quantum theory because of the anomaly noted towards the end of The Classical
Bosonic String chapter. Here, we meet yet another reason why we can use the bosonic
string consistently only in D=26 dimensions, where, as we recall, this anomaly vanishes.
The second difficulty is that the reparametrization necessary to put the metric into a
simple form is only possible globally on a surface of genus zero (i.e., topologically
equivalent to a sphere). More complicated surfaces that are invariant under
reparametrization introduce additional problems because it is necessary to include in the
path integral all those paths that are genuinely different (i.e., those that are not related to
each other by reparametrization).

As an example of the second difficulty, consider the one-loop amplitudes for the closed
strings. Using an analogous argument to that which allows one to reduce the world-
sheet in Figure 6 to the surface of a sphere, we can reduce the world-sheet of a one-
loop diagram to the surface of a torus. To define such a torus, we use a complex
variable z as the coordinate of the two-dimensional surface, and identify points z and z+
mλ1 +nλ2, where m and n are integers and λ1 and λ2 are complex numbers whose ratio:
$$\tau = \frac{\lambda_1}{\lambda_2}$$
is not real. By suitable ordering, we can choose:
$$\mathrm{Im}\,\tau > 0$$
The significance of τ is that under reparametrization it only changes by a so-called
modular transformation:
$$\tau \to \frac{a\tau + b}{c\tau + d}$$
where a, b, c, and d are integers related by ad−bc=1. Thus, evaluation of one-loop
diagrams requires integration over the upper-half τ-plane, subject to the identification of
points related by such modular transformations (i.e., values of τ related by
τ→(aτ+b)/(cτ+d) above must only be counted once). This modular invariance of the theory is
crucial for showing that the SUSY theories of the next chapters are finite, because if we
had to integrate over the whole of the upper-half τ-plane the integral would diverge.
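To make "counted once" concrete, every τ in the upper half-plane can be moved by modular transformations into a standard region, |Re τ| ≤ ½ and |τ| ≥ 1. The Python sketch below does this with the two elementary moves τ→τ+1 and τ→−1/τ; the choice of this particular region and the function names are assumptions introduced here, a sketch rather than the procedure used in the references.

```python
def modular_transform(tau, a, b, c, d):
    """Apply tau -> (a*tau + b)/(c*tau + d) for integers with a*d - b*c = 1."""
    if a * d - b * c != 1:
        raise ValueError("need a*d - b*c = 1")
    return (a * tau + b) / (c * tau + d)


def to_fundamental_domain(tau, max_steps=10_000):
    """Map tau (Im tau > 0) to |Re tau| <= 1/2, |tau| >= 1 using T and S moves."""
    if tau.imag <= 0:
        raise ValueError("tau must lie in the upper half-plane")
    for _ in range(max_steps):
        # T-type move: shift the real part into [-1/2, 1/2]
        tau = complex(tau.real - round(tau.real), tau.imag)
        if abs(tau) >= 1.0:
            return tau
        # S move: inversion, which increases Im tau when |tau| < 1
        tau = -1.0 / tau
    raise RuntimeError("did not converge")


if __name__ == "__main__":
    tau = complex(0.3, 0.7)
    image = modular_transform(tau, 2, 1, 1, 1)
    print(to_fundamental_domain(tau))
    print(to_fundamental_domain(image))  # same point: tau and its image are identified
```

Two values of τ related by a modular transformation reduce to the same representative, so integrating over the standard region counts each torus once.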

Finally, we note some elementary results of these calculations. The planar diagram for
the scattering of two open strings (c.f., Figure 3 – Right)
gives the Veneziano amplitude of the A History of the Origins of String Theories chapter:
$$A(s,t) = \frac{\Gamma[-\alpha(s)]\,\Gamma[-\alpha(t)]}{\Gamma[-\alpha(s)-\alpha(t)]}$$
The spin-1 vector particles of the open bosonic string couple exactly as they would in a
local gauge theory to lowest order in α′, and similarly the spin-2 particle in the closed
bosonic string couples like the graviton. So, although neither local gauge invariance nor
local coordinate invariance is used as an ingredient, they both emerge as predictions of
string theory (N.B., to lowest order)! In particular, it must be emphasized that whereas
point field theory does not permit gravity, string theory not only permits it, but seems to
require it!
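The Veneziano amplitude is easy to evaluate numerically away from its poles. The following Python sketch assumes the linear trajectory α(s)=α(0)+α′s with α(0)=1 and units α′=1; the function names and default values are illustrative assumptions.

```python
import math


def trajectory(x, intercept=1.0, slope=1.0):
    """Linear Regge trajectory alpha(x) = alpha(0) + alpha' * x."""
    return intercept + slope * x


def veneziano(s, t, intercept=1.0, slope=1.0):
    """A(s,t) = Gamma(-a_s) * Gamma(-a_t) / Gamma(-a_s - a_t), away from the poles."""
    a_s = trajectory(s, intercept, slope)
    a_t = trajectory(t, intercept, slope)
    return math.gamma(-a_s) * math.gamma(-a_t) / math.gamma(-a_s - a_t)


if __name__ == "__main__":
    print(veneziano(-2.5, -3.5))          # a regular point
    print(veneziano(-3.5, -2.5))          # same value: s <-> t symmetry
    print(veneziano(-1.0 + 1e-6, -3.5))   # large: close to the pole at alpha(s) = 0
```

The symmetry under s↔t reflects the fact that one string diagram contains both particle-exchange channels, and the pole at α(s)=0 sits at s=−1/α′, the tachyon mass found earlier.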

The (complete) Gamma function Γ(n) is defined to be an extension of the factorial to
complex and real number arguments. It is related to the factorial by Γ(n)=(n−1)!.
The Gamma function can be defined as a definite integral (Euler’s integral form):
$$\Gamma(n) = \int_0^{\infty} dt\; t^{n-1}\,e^{-t} \qquad (n > 0)$$
with the recursion formula:
$$\Gamma(n+1) = n\,\Gamma(n) \quad\text{and}\quad \Gamma(n+1) = n! \;\;\text{if}\;\; n = 0, 1, 2, \ldots \;\;\text{where}\;\; 0! = 1$$
For n<0 the Gamma function can be defined by using Γ(n)=Γ(n+1)/n.
Special values:
$$\Gamma(1) = 1, \qquad \Gamma(½) = \sqrt{\pi} \qquad \&\qquad \Gamma(m+½) = \frac{1\cdot 3\cdot 5\cdots(2m-1)}{2^m}\,\sqrt{\pi} \quad (m = 1, 2, 3, \ldots)$$
The graph is:
[Graph of Γ(n) versus n.]
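These special values can be confirmed against the Gamma function in the Python standard library. The helper names below are assumptions made for this sketch.

```python
import math


def gamma_half_integer(m):
    """Gamma(m + 1/2) = [1*3*5*...*(2m-1) / 2^m] * sqrt(pi) for m = 1, 2, 3, ..."""
    odd_product = 1
    for k in range(1, 2 * m, 2):  # 1, 3, ..., 2m-1
        odd_product *= k
    return odd_product / 2 ** m * math.sqrt(math.pi)


def gamma_below_zero(n):
    """Extend to a negative non-integer n in (-1, 0) via Gamma(n) = Gamma(n+1)/n."""
    return math.gamma(n + 1) / n


if __name__ == "__main__":
    print(math.gamma(5), math.factorial(4))           # Gamma(n) = (n-1)!
    print(math.gamma(0.5), math.sqrt(math.pi))        # Gamma(1/2) = sqrt(pi)
    print(gamma_half_integer(3), math.gamma(3.5))     # half-integer formula
    print(gamma_below_zero(-0.5), math.gamma(-0.5))   # recursion below zero
```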

The Beta function B(m,n) is the name used by Legendre & Whittaker and Watson (1990)
for the Beta integral defined by:
$$B(m,n) = \int_0^1 dt\; t^{m-1}\,(1-t)^{n-1} \qquad (m, n > 0)$$
Relationship with Gamma function:
$$B(m,n) = \frac{\Gamma(m)\,\Gamma(n)}{\Gamma(m+n)}$$
Properties:
$$B(m,n) = B(n,m)$$
$$B(m,n) = 2\int_0^{\pi/2} d\theta\,\sin^{2m-1}\theta\,\cos^{2n-1}\theta$$
and
$$B(m,n) = \int_0^{\infty} dt\,\frac{t^{m-1}}{(1+t)^{m+n}}$$
The graph is:
[Graphs of B(m,n) versus n and versus m.]
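The relationship between the Beta integral and the Gamma function can be verified numerically. The sketch below compares the Gamma-function expression with a simple midpoint-rule estimate of the defining integral; the function names, the midpoint rule and the step count are assumptions chosen for illustration.

```python
import math


def beta_from_gamma(m, n):
    """B(m,n) = Gamma(m) * Gamma(n) / Gamma(m+n)."""
    return math.gamma(m) * math.gamma(n) / math.gamma(m + n)


def beta_from_integral(m, n, steps=20_000):
    """Midpoint-rule estimate of int_0^1 t^(m-1) (1-t)^(n-1) dt (best for m, n >= 1)."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * h
        total += t ** (m - 1) * (1.0 - t) ** (n - 1)
    return h * total


if __name__ == "__main__":
    print(beta_from_gamma(2, 3), beta_from_integral(2, 3))   # both close to 1/12
    print(beta_from_gamma(2.5, 1.5), beta_from_gamma(1.5, 2.5))  # symmetry
```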

Appendix III: Feynman’s Take on Gravitation
“Now I will show you that I too can write equations that nobody can understand.” Richard Feynman,
Conference on Relativistic Theories of Gravitation, Jablonna / Warsaw (July, 1962).
Contents
Introduction
A Field Approach to Gravitation
The Characteristics of Gravitational Phenomena
The Spin of the Graviton
Amplitudes and Polarizations in Electrodynamics
Amplitudes for Exchange of a Graviton
Interpretation of the Terms in the Amplitudes
The Lagrangian for the Gravitational Field
The Equations for the Gravitational Field
The Stress-Energy Tensor for Scalar Matter
Amplitudes for Scattering for Scalar Matter
Detailed Properties of Plane Waves
The Self Energy of the Gravitational Field
The Bilinear Terms of the Stress-Energy Tensor
Formulation of a Theory Correct to all Orders
Invariants and Infinitesimal Transformations
Lagrangian of the Theory Correct to all Orders
Einstein Equation for the Stress-Energy Tensor
The Connection between Curvature and Matter
The Field Equations of Gravity
Classical Particles in a Gravitational Field
Matter Fields in a Gravitational Field
Coupling between Matter Fields and Gravity
Radiation of Gravitons with Particle Decays
Article: Quantum Theory of Gravitation, R. P. Feynman, Conference on Relativistic Theories of Gravitation (1962)

Introduction
In this Appendix we outline the key developments within R. P. Feynman’s book Feynman
Lectures on Gravitation (F. B. Morinigo and W. G. Wagner [editors]; edited by B. Hatfield;
foreword by J. Preskill and K. S. Thorne), Addison-Wesley (1995) [FLoG 95], based on
lectures given at Caltech during the 1962-63 academic year. Why? Well, because
Feynman’s unique insights are by themselves enough to provide a qualitative survey of
the difficulties in formulating satisfactory quantum gravity theories and, moreover, it is
fabulous to see his development of the Einstein equations, which is paramount:
“[Feynman’s] approach, presented in lectures 3-6, develops the theory of a
massless spin-2 field (the graviton) coupled to the energy-momentum tensor
of matter, and demonstrates that the effort to make the theory self-consistent
leads inevitably to Einstein’s general relativity.” Foreword, p. ix
and:
“Feynman’s whole approach to general relativity is shaped by his desire to
arrive at a quantum theory of gravitation as straightforwardly as possible.”
Foreword, p. x
Please forgive the literary shortcuts in learning but I too want to get right into the key
developments without having to re-invent what the References or the original text can
surely fill in regarding the consequences of the historical research surrounding the study
of gravity or, for that matter, the nuances of Einstein’s theory of General Relativity, when
actually the goal is to formulate Einstein’s equation Rµν −½Rgµν =−κ 2Tµν and delve into
the formulation of a quantum theory of gravitation from scratch à la Feynman.

A Field Approach to Gravitation
The fundamental law of gravitation was discovered by Newton: gravitational forces
are proportional to masses and they follow an inverse-square law. The law was later
modified by Einstein in order to make it relativistic. The changes that are needed to
make the theory relativistic are very fundamental; we know that the masses of particles
are not constants in relativity, so that a fundamental question is: How does the mass
change in relativity affect the law of gravitation? Well, Einstein’s gravitational theory
resulted in beautiful relations connecting gravitational phenomena with the geometry of
space [-time]. The apparent similarity of gravitational forces and electric forces, for
example, in that they both follow inverse-square laws, which every kid can understand,
made every one of these kids dream that when he grew up, he would find the way of
geometrizing electrodynamics. Unfortunately, there is no successful unified field theory
that combines gravitation and electrodynamics. [Even Einstein himself couldn’t do it!]
The phenomenon of gravitation adds another field to the mix; it is a new field which
was left out of previous considerations, and it is only one of thirty or so. Explaining
gravitational behavior amounts to explaining three percent of the total number of known
fields… We imagine that in some small region of the universe, say a planet such as
Venus, we have scientists who know all about the fields of the universe, who know just
what we do about nucleons, mesons, &c., but who do not know about gravitation. And
suddenly, an amazing new experiment is performed, which shows that two large neutral
masses attract each other with a very, very tiny force! Wow… What would the Venus-
ians do with such an amazing extra experimental fact to be explained? They would
probably try to interpret it in terms of the field theories which are familiar to them.

The Characteristics of Gravitational Phenomena
Let us review some of the experimental facts which a Venusian theoretical physicist
would have to play with in constructing a theory for the amazing new effect…
First of all there is the fact that the attraction follows an inverse-square law. Then
there is the fact that the force is proportional to the masses of the objects. From
various experiments, the conclusion is that the ratio of the inertial masses to the
gravitational masses is a constant to an accuracy of one part in 10⁸, for many substances,
from oxygen to lead. Now, an accuracy of one part in 10⁸ tells us that the ratio of the inertial
and gravitational mass of the binding energy is constant to within one part in 10⁶. At
these accuracies we also have a check on the gravitational behavior of antimatter. There
is absolutely no evidence that would require us to assume that matter and antimatter
differ in their gravitational behavior. Also, all the evidence, experimental and even a little
theoretical, seems to indicate that it is the energy content which is involved in
gravitation, and therefore, since matter and antimatter both represent positive energies,
gravitation makes no distinction. The same goes for particle and antiparticle altogether.
The second amazing thing about gravitation is how weak it is! We mean that
gravitational forces are very weak compared to the other forces that exist between
particles! For example, the ratio of electrical to gravitational force between two electrons
is FElec /FGrav =4.17×10⁴². In other words, gravitation is really weak. All other fields that we
know about are so much stronger than gravity that we have a feeling that gravitation
would probably never be explained as some correction, some left-over terms until now
neglected, in a theory that might unify all other fields that we know about.
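The quoted ratio can be reproduced from the Coulomb and Newton force laws, since the separation cancels between two inverse-square forces. The sketch below uses SI values of the constants as commonly tabulated (CODATA); these numbers and the function name are assumptions brought in here and do not come from the lectures.

```python
import math

# SI values (CODATA 2018), assumed here for illustration
E_CHARGE = 1.602176634e-19      # elementary charge, C
M_ELECTRON = 9.1093837015e-31   # electron mass, kg
G_NEWTON = 6.67430e-11          # gravitational constant, m^3 kg^-1 s^-2
EPSILON_0 = 8.8541878128e-12    # vacuum permittivity, F/m


def electric_to_gravitational_ratio(charge=E_CHARGE, mass=M_ELECTRON):
    """F_Elec / F_Grav for two identical particles; the 1/r^2 factors cancel."""
    return charge ** 2 / (4.0 * math.pi * EPSILON_0 * G_NEWTON * mass ** 2)


if __name__ == "__main__":
    print(f"F_Elec / F_Grav = {electric_to_gravitational_ratio():.3e}")
```

With these inputs the ratio comes out at about 4.17×10⁴², in line with the figure quoted above.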

So, in the following [slides] we shall start to construct a quantum theory of gravitation!
[It is like playing LEGO using tensor calculus and the action principle for gravitational
matter in a Riemannian-based geometry of 4D space-time such as to establish a relation
between Riemannian geometry and the stress-energy tensor while all along contem-
plating the individually assembled parts with an eye for consistency]. It might be well for
us to keep in mind whether there would be any observable effects of such a theory.
Let us first consider gravitation as a perturbation on the hydrogen atom. Evidently, an
extra attraction between the electron and proton produces a small change in the energy
of bound hydrogen; we can calculate this energy change from perturbation theory and
get a value, ε. Now, the time dependent wave function of a hydrogen atom goes as ψ =
exp(−iEt), with E of such a size that the frequency is something like 10¹⁶ cycles per
second. Now, in order to observe any effects due to ε, we should have to wait for a time
until the true wave function should differ from the unperturbed wave function by some-
thing like 2π in phase. But the magnitude of ε is so small that the phase difference would
be only 43 seconds (of phase) in a time equal to 100 times the age of the universe T. So
gravitational effects in atoms are unobservable! [Not that they can’t actually happen!]
Let us consider another possibility, an atom held together by gravity alone. For example,
we might have two neutrons in a bound state. When we calculate the Bohr radius (i.e., a₀
=h²/mq² with m the mass [of arbitrary mass!]) of such an atom, we find that it would be
10⁸ light years, and that the atomic binding energy would be 10⁻⁷⁰ Ry (i.e., Ry is a Rydberg
– R=mq⁴/2h² with q the charge). There is then little hope of ever observing gravitational
effects on systems which are simple enough to be calculable in quantum mechanics!

A prediction of the quantum theory of gravitation would be that the force would be
mediated by the virtual exchange of some particle, which is usually called the graviton.
We might therefore expect that under certain circumstances we might see some
gravitons, as we have been able to observe photons. We as humans observe gravity, in
that we know we are pulled to the earth, but classical gravitational waves have not as
yet been observed; this is not inconsistent with what we expect – gravitation is so weak
that no experiment that we could perform today [circa 1962] would be anything near
sensitive enough to measure gravitational radiation waves, at least, those which are
expected to exist from the strongest sources that we might consider, such as rapidly
rotating binary stars. And the quantum aspect of gravitational waves is a million times
further removed from detectability; there is apparently no hope of ever observing a
graviton!
Let us return to a construction of a theory of gravitation, as our friends the Venusians
might go about it. In general we expect that there would be two schools of thought about
what to do with the new phenomenon. These are: that gravitation is a new field, number
31; and that gravitation is a consequence of something that we know, but that we have
not calculated correctly…
The Spin of the Graviton
The second point of view leads to one possibility, which is that gravitation may be some attraction due to fluctuations (e.g., we know that all molecules attract one another by a force which at long distances goes like $1/r^6$, which we understand in terms of dipole moments induced by fluctuations in the charge distribution of molecules) in something, we do not know just what, perhaps having to do with charge.

Now onto the first point. We assume that Venusian scientists know the general properties of field theories; they are now searching for a field having the characteristics of gravity. Gravity has the following properties: that large masses attract with a force proportional to the inertia, and to the inverse square of the distance; also that the mass and inertia represent the energy content, since the binding energies of atoms and nuclei have a gravitational behavior identical to the rest energies. We can imagine that one group of field theorists has attempted to interpret gravity in terms of known particles and has failed. Another group of field theorists has begun to deduce some of the properties of a new field that might behave as gravity.

In the first place, the fact that gravity is found to have a long range automatically means that the interaction energy depends on separation as 1/r. There is no other possibility in a field theory. The field is carried by the exchange of a particle, which henceforth we shall call the graviton. It must have mass m=0 so that the force proportional to $1/r^2$ results from the interaction. The next guess we must make before we can write down a field theory is: What is the spin of the graviton? If the spin were ½, or half-integral, we would run into difficulties (e.g., FLoG 95 – Chapter 2 considers this for the exchange of neutrinos, which have half-integral spin – the single exchange does not result in a static force because after a single exchange the neutrino source is no longer in the same state it was initially), in that there could be no interference between the amplitude of a single exchange and that of no exchange. Thus, the spin of the graviton must be integral, some number in the sequence 0, 1, 2, 3, 4, … Any of these spins would give an interaction proportional to 1/r, since any radial dependence is determined exclusively by the mass.
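The statement that the radial dependence is fixed by the mass alone can be made explicit with the standard static-limit Fourier transform of a massive propagator. This is a textbook result quoted here as a reminder, not a step taken in the source:

```latex
\int \frac{d^3k}{(2\pi)^3}\,
     \frac{e^{i\mathbf{k}\cdot\mathbf{r}}}{\mathbf{k}^2 + m^2}
  \;=\; \frac{e^{-mr}}{4\pi r}
  \;\xrightarrow{\;m\,\to\,0\;}\; \frac{1}{4\pi r}
```

Only for $m = 0$ does the exponential cutoff disappear, leaving an energy going as $1/r$ and a force going as $1/r^2$, whatever the spin of the exchanged particle.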

A spin-1 theory would be essentially the same as electrodynamics. There is nothing to forbid the existence of two spin-1 fields, but gravity can’t be one of them, because one consequence of spin 1 is that likes repel and unlikes attract. This is in fact a property of all odd-spin theories; conversely, it is also found that even spins lead to attractive forces, so that we need to consider only spin 0 and 2, and perhaps 4 if 2 fails; there is no need to work out the more complicated theories until the simpler ones are found [to be] inadequate.

The rejection of spin-0 theories of gravitation is made on the basis of the gravitational behavior of the binding energies. Experimental evidence on gravity suggests that the force is greater if the gases are hotter. We know what happens in electrodynamics: the electric forces are unchanged by random motions of the particles. Now the interaction energy is proportional to the expectation value of the [relativistic] operator $\gamma_t$, which is $(1-v^2/c^2)^{-1/2}$. Since the potential [i.e., $V(r)$] resulting from this operator is not velocity dependent, the proportionality factor must go as $(1-v^2/c^2)^{1/2}$. This means that the interaction energy resulting from the operator 1, corresponding to the spin-0 field, would be proportional to $(1-v^2/c^2)^{1/2}$. In other words, the spin-0 theory would predict that the attraction between masses of hot gas would be smaller than for cool gas. In a similar way, it can be shown that the spin-2 theory leads to an interaction energy which has $(1-v^2/c^2)^{1/2}$ in the denominator, in agreement with the experimental results on the gravitational effect of binding energies. Thus, the spin-0 theory is out, and we need a spin-2 particle in order to have a theory in which the attraction will be proportional to the energy content!
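The three velocity dependences quoted above can be tabulated with a small sketch. The function name and the choice of $\beta = v/c = 0.6$ are illustrative assumptions; the factors themselves are the ones stated in the text for spin 0, 1 and 2.

```python
import math

def velocity_factor(spin, beta):
    """Factor multiplying the static interaction energy for a source moving at
    speed beta = v/c, as quoted in the text for spin 0, 1 and 2 exchange."""
    root = math.sqrt(1.0 - beta**2)
    if spin == 0:
        return root          # weaker attraction for hot (fast) matter
    if spin == 1:
        return 1.0           # unchanged, as for electric forces
    if spin == 2:
        return 1.0 / root    # stronger attraction, tracks the energy content
    raise ValueError("only spins 0, 1 and 2 are discussed here")

for spin in (0, 1, 2):
    print(spin, velocity_factor(spin, 0.6))
```

Only the spin-2 factor grows with the kinetic energy of the source, which is the behavior the binding-energy evidence calls for.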

Our program is now to construct a spin-2 particle theory in analogy to the other field theories that we have. Let us proceed and guess at the correct theory in analogy with electrodynamics.

Amplitudes and Polarizations in Electrodynamics

In the theories of scalar, vector, and tensor fields (another way of denoting spins 0, 1, and 2) the fields are described by scalar, vector, or tensor potential functions:

Spin-0    X      Scalar potential
Spin-1    Aµ     Vector potential
Spin-2    hµν    Tensor potential (symmetric)

Another theory would result from assuming that the tensor h ≡ hµν is antisymmetric; it would not lead to something resembling gravity, but rather something resembling electromagnetism; the six independent components of the antisymmetric tensor h[µν] would appear as two space vectors.

The source of electromagnetism is the vector current $j_\mu$, which is related to the vector potential $A_\mu$ by the equation (with $k^2 = \sum_\sigma k_\sigma k^\sigma$):

$$A_\mu = -\frac{1}{k^2}\,j_\mu$$

Here we have taken the Fourier transform and use the momentum-space representation (i.e., the d’Alembertian operator $\Box = \sum_\rho \partial_\rho\partial^\rho = (\partial/\partial t)^2 - \nabla^2$ is simply $k^2$ in momentum space). The calculation of amplitudes in electromagnetism is made with the help of propagators connecting currents in the manner symbolized by diagrams. We compute amplitudes for such processes as a function of the relativistic invariants, and restrict our answer as demanded by rules of momentum and energy conservation. The guts of electromagnetism are contained in the specification of the interaction between a current and the field as $\sum_\mu j_\mu A^\mu$; in terms of the sources, this becomes an interaction between two currents:

$$-\sum_\mu j'_\mu\,\frac{1}{k^2}\,j^\mu$$

For a particular choice of coordinate axes, the vector $k_\mu = \sum_\nu \eta_{\mu\nu}k^\nu$ may be expressed as:

$$k^\mu = [\omega, k_3, 0, 0]$$

and $k_\mu = [\omega, -k_3, 0, 0]$. (N.B., We – not me but Feynman – use an index ordering 4,3,2,1 such that $x^\mu = [t,z,y,x]$, $A^\mu = [A_4, A_3, A_2, A_1]$ and $\eta_{\mu\nu} = \mathrm{diag}(+1,-1,-1,-1)$.) Then the current-current interaction when the exchanged particle has a four momentum $k^\mu$ is given by:

$$-\sum_\mu j'_\mu\,\frac{1}{k^2}\,j^\mu = -\frac{1}{(\omega)(\omega) + (k_3)(-k_3) + (0)(0) + (0)(0)}\sum_\mu j'_\mu j^\mu = -\frac{1}{\omega^2 - k_3^2}\,(j'_4 j_4 - j'_3 j_3 - j'_2 j_2 - j'_1 j_1)$$

[Diagram with labels kµ and φ. N.B., This diagram assumes that matter is represented by a scalar function φ.]

The conservation of charge, which states that the four divergence of the current is zero, in momentum space becomes simply the restriction:

$$k_\mu j^\mu = 0$$

where we now sum over repeated indices. In the particular coordinate system we have chosen, this restriction connects the third and fourth component of the currents by:

$$\omega\,j_4 - k_3\,j_3 = 0 \quad\text{or}\quad j_3 = \frac{\omega}{k_3}\,j_4$$

If we insert this expression for $j_3$ into the amplitude $-j'_\mu(1/k^2)j^\mu = -[1/(\omega^2 - k_3^2)](j'_4 j_4 - j'_3 j_3 - j'_2 j_2 - j'_1 j_1)$ above, we find that:

$$-j'_\mu\,\frac{1}{k^2}\,j^\mu = \frac{1}{k_3^2}\,j'_4 j_4 + \frac{1}{\omega^2 - k_3^2}\,(j'_1 j_1 + j'_2 j_2)$$
Now we may give an interpretation to the two terms of this last equation. The fourth
component of the current j4 is simply the charge density; in the situation that we have
stationary charges it is the only non-zero component. The first term is independent of the
frequency ω; when we take the inverse Fourier transform in order to convert this to a
space-interaction, we find that it represents an instantaneous Coulomb potential (i.e., $[\mathrm{F.T.}]^{-1}(j'_4 j_4/k_3^2) = (e^2/4\pi r)\,\delta(t-t')$). This is always the leading term in the limit of small
velocities. The term appears instantaneously, but this is only because the separation we
have made into two terms is not manifestly covariant. The total interaction is indeed a
covariant quantity; the second term represents corrections to the instantaneous
Coulomb interaction.
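The separation into an instantaneous Coulomb term and a retarded transverse term can be checked numerically. The sketch below is an illustration only: the helper names, the random test currents and the values of ω and k3 are assumptions, and the components are stored in the order (4, 3, 2, 1) used in the text.

```python
import random

def covariant_amplitude(jp, j, omega, k3):
    """-j'_mu (1/k^2) j^mu with components ordered (4, 3, 2, 1)."""
    k2 = omega**2 - k3**2
    return -(jp[0]*j[0] - jp[1]*j[1] - jp[2]*j[2] - jp[3]*j[3]) / k2

def split_amplitude(jp, j, omega, k3):
    """Instantaneous Coulomb term plus the retarded transverse term."""
    coulomb = jp[0]*j[0] / k3**2
    transverse = (jp[3]*j[3] + jp[2]*j[2]) / (omega**2 - k3**2)
    return coulomb + transverse

def conserved_current(rng, omega, k3):
    """Random current obeying omega*j4 - k3*j3 = 0."""
    j4, j2, j1 = (rng.uniform(-1, 1) for _ in range(3))
    return [j4, omega*j4/k3, j2, j1]

rng = random.Random(0)
omega, k3 = 1.3, 0.7
j = conserved_current(rng, omega, k3)
jp = conserved_current(rng, omega, k3)
difference = covariant_amplitude(jp, j, omega, k3) - split_amplitude(jp, j, omega, k3)
print(abs(difference) < 1e-12)
```

The two forms agree only because both currents are conserved; dropping the constraint in `conserved_current` breaks the equality, which is the sense in which the split is a rewriting of one covariant quantity.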

The interaction between two currents always involves virtual photons. We can learn
something about the properties of real photons by looking at the poles of the interaction
amplitude, which occur for ω=±k3. Of course, any photon that has a physical effect may
be considered as a virtual photon, since it is not observed unless it interacts, so that
observed photons never really have ω=±k3. There are however no difficulties in passing
to the limit; physically, we know of photons that come from the moon, or the sun, for
which the fractional difference between ω and k3 is very, very small. If we consider that
we also observe photons from distant galaxies which are millions of light years away, we
see that it must make physical sense to think we are so close to the pole that for these
there can’t be any physical effect not like the pole term. The residue of the pole term at
ω=k3 is the sum of two terms each of which is the product of two factors. It seems that
there is one kind of photon which interacts with j1 and j′1, and another kind of photon that
interacts with j2 and j′2. In the usual language, we describe this by saying that there are
two independent polarizations for photons.

The circular polarizations are just linear combinations of plane polarized photons, corresponding to a separation of the sum of products ( j′2 j2 + j′1 j1) in a different basis:

$$j'_1 j_1 + j'_2 j_2 = \frac{1}{\sqrt2}(j_1 + i j_2)\,\frac{1}{\sqrt2}(j'_1 + i j'_2)^* + \frac{1}{\sqrt2}(j_1 - i j_2)\,\frac{1}{\sqrt2}(j'_1 - i j'_2)^*$$
Still we see that there are two kinds of photons. The circularly polarized photons
rotate into themselves, so to speak, because they change only by a phase as the
coordinates are rotated by an angle θ; the phases are exp(iθ) and exp(−iθ).

The quantum mechanical rules describing the behavior of systems under rotations tell
us that systems having this property are in a state of unique angular momentum; the
photons that change in phase as exp(+iθ) have an angular momentum projection +1, and
the others that change in phase as exp(−iθ) have a projection −1.
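Both statements, the change of basis and the pure phase picked up under a rotation, can be verified with complex arithmetic. In this sketch the sample components and the passive rotation convention (rotating the 1-2 axes by θ) are assumptions; with the opposite convention the two phases simply trade places.

```python
import cmath
import math

def circular_components(j1, j2):
    """Return the two circular combinations (j1 + i j2)/sqrt2 and (j1 - i j2)/sqrt2."""
    s = math.sqrt(2)
    return (j1 + 1j*j2)/s, (j1 - 1j*j2)/s

def rotate(j1, j2, theta):
    """Components of the same transverse vector after rotating the 1-2 axes by theta."""
    c, s = math.cos(theta), math.sin(theta)
    return c*j1 + s*j2, -s*j1 + c*j2

j1, j2, jp1, jp2 = 0.3, -1.1, 0.8, 0.5
plus, minus = circular_components(j1, j2)
pplus, pminus = circular_components(jp1, jp2)
linear_sum = jp1*j1 + jp2*j2
circular_sum = plus*pplus.conjugate() + minus*pminus.conjugate()

theta = 0.4
rplus, rminus = circular_components(*rotate(j1, j2, theta))
phase_plus = rplus/plus      # a pure phase, exp(-i theta) in this convention
phase_minus = rminus/minus   # the opposite phase, exp(+i theta)
```

Each circular combination is multiplied by a single phase and never mixes with the other, which is what "rotate into themselves" means here.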
We might expect that if the photons are objects of spin 1, there might be a third kind of photon having a spin projection 0. However, it can be shown that for a relativistic theory of particles with zero rest mass, only two projection states are allowed, having the maximum and minimum values of projection along the direction of propagation. This is a general result, good for particles with any spin, which has been proven by E. Wigner.

The actual polarization of a photon is perhaps best defined in terms of the projection of the vector potential in specific directions, such as $\varepsilon_\mu$ (N.B., $\varepsilon_\mu$ is a unit vector). The interaction of such a photon with a current $j_\mu$ (i.e., the amplitude to absorb or emit such a photon), is given by:

$$-\varepsilon^\mu j_\mu \equiv (\text{projection of } j_\mu \text{ along } \varepsilon_\mu)$$

We shall write down the amplitudes for the exchange of a graviton by simple analogy to
electrodynamics. We shall have to pay particular attention to the instantaneous,
nonrelativistic terms, since only these are apparent in the present experimental
observation of gravity. The full theory gives us both the instantaneous terms (analogous
to the Coulomb interaction), and the corrections which appear as retarded waves; we
will have to separate out these retardation effects for calculations of observable effects.
Amplitudes for Exchange of a Graviton

We assume the d’Alembertian in momentum space is $k^2$; by simple analogy with $A_\mu = -(1/k^2)\,j_\mu$ of the previous chapter (i.e., $A_\mu$ is a field four-vector and $j_\mu$ is the source) we expect the field tensor $h_{\mu\nu}$ to be related to its source $T_{\mu\nu}$ as follows:

$$h_{\mu\nu} = \frac{1}{k^2}\,T_{\mu\nu}$$

What can the interaction be? Since electrodynamics relates currents, let us guess that the source tensors appear in the interaction energy as:

$$\text{Interaction} = T'_{\mu\nu}\,\frac{1}{k^2}\,T^{\mu\nu}$$
It is our job now to ascribe particular characteristics to the tensor T ≡ Tµν so that the characteristics of gravity are reproduced. It is possible a priori that the tensor T involves gradients, that is, the vector k. If only gradients are involved, the resulting theory has no monopoles; the simplest objects would be dipoles. We want the tensor T to be such that in the nonrelativistic limit, the energy densities appear in analogy to the charge densities j4.

As is well known, we have in electromagnetism a stress tensor whose component T44 is precisely the energy density of the electromagnetic field. It is therefore quite likely that there is some general tensor whose component T44 is the total energy density; this will give the Newtonian law of gravity in the limit of small velocities, an interaction energy:

$$-\frac{T'_{44}T_{44}}{k_3^2}$$

Then, in order to have a correct relativistic theory, it must follow that the amplitude involves the complete tensor T ≡ Tµν, as we have guessed in $T'_{\mu\nu}(1/k^2)T^{\mu\nu}$ above.
There is one point which we haven’t mentioned about this tensor T. The trace of a symmetric tensor is an invariant quantity, not necessarily zero. Thus, in computing from a symmetric tensor of nonzero trace we might get a theory which is a mixture of spin 0 and spin 2. If we write a theory using such a tensor, we will find, when we come to separating the interaction into its polarizations, that there are apparently three polarizations instead of the two allowed to a massless spin-2 particle. To be more explicit, we may have besides the interaction $T'_{\mu\nu}(1/k^2)T^{\mu\nu}$ another possible invariant form proportional to $T'^{\mu}{}_{\mu}(1/k^2)T^{\nu}{}_{\nu}$. We shall try to adjust the proportions of these two so that no real gravitons of angular momentum zero would be exchanged.

We write explicitly the various terms as follows:

$$T'_{\mu\nu}\,\frac{1}{k^2}\,T^{\mu\nu} = \frac{1}{\omega^2 - k_3^2}\,(T'_{44}T_{44} - 2T'_{43}T_{43} - 2T'_{42}T_{42} - 2T'_{41}T_{41} + 2T'_{23}T_{23} + 2T'_{31}T_{31} + 2T'_{21}T_{21} + T'_{33}T_{33} + T'_{22}T_{22} + T'_{11}T_{11})$$

In electrodynamics, we had obtained a simplification by using the law of conservation of charge. Here, we obtain a simplification by using the law of energy conservation, which is expressed in momentum space by:

$$k^\mu T_{\mu\nu} = 0$$

In our usual coordinate system, such that k1 and k2 are zero, this relates the 4 index to the 3 index of our tensor by:

$$\omega\,T_{4\nu} = k_3\,T_{3\nu}$$

When we use this relation to eliminate the index 3, we find that the amplitude separates into an instantaneous part, having a typical denominator $k_3^2$, and a retarded part, with a denominator $\omega^2 - k_3^2$. For the instantaneous term we get:

$$-\frac{1}{k_3^2}\left[T'_{44}T_{44}\left(1 - \frac{\omega^2}{k_3^2}\right) - 2T'_{41}T_{41} - 2T'_{42}T_{42}\right]$$

and for the retarded term:

$$\frac{1}{\omega^2 - k_3^2}\,(T'_{11}T_{11} + T'_{22}T_{22} + 2T'_{21}T_{21})$$

The transverse components of the tensor are presumably independent, so that this last result (i.e., the retarded part) represents a sum of three independent products, or three polarizations. We see that this theory contains a mixture of spin 0 and spin 2. To eliminate the spin 0 part we must add to our amplitude a term of the form:

$$\alpha\,T'^{\mu}{}_{\mu}\,\frac{1}{k^2}\,T^{\nu}{}_{\nu}$$

In the retarded term, this adds pieces of the tensor as follows:

$$\alpha\,\frac{1}{\omega^2 - k_3^2}\,(T'_{11} + T'_{22})(T_{11} + T_{22})$$

We can adjust the parameter α so that the retarded term contains only a sum of two independent products. The proper value of α is −1/2, to make the retarded term equal to:

$$\frac{1}{\omega^2 - k_3^2}\left[\frac12\,(T'_{11} - T'_{22})(T_{11} - T_{22}) + 2T'_{12}T_{12}\right]$$

There are then two directions of the polarization, which are generated by these combinations of the tensor elements:

$$\frac{1}{\sqrt2}\,(T_{11} - T_{22}) \quad\text{and}\quad \sqrt2\,T_{12}$$

The different normalization is a result of the symmetry of our tensor. We can restore some of the symmetry by writing √2T12 = (1/√2)(T12 +T21).
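The role of α = −1/2 can be made concrete with a short numerical sketch. The dictionary layout, helper names and random test values are illustrative assumptions; only the transverse components T11, T22 and T12 enter.

```python
import random

def retarded_numerator(Tp, T, alpha):
    """T'11 T11 + T'22 T22 + 2 T'12 T12 + alpha (T'11 + T'22)(T11 + T22).
    Tp and T are dicts keyed by '11', '22', '12' (transverse components only)."""
    return (Tp['11']*T['11'] + Tp['22']*T['22'] + 2*Tp['12']*T['12']
            + alpha*(Tp['11'] + Tp['22'])*(T['11'] + T['22']))

def two_polarization_form(Tp, T):
    """Sum of two products built from (T11 - T22)/sqrt2 and sqrt2 * T12."""
    return 0.5*(Tp['11'] - Tp['22'])*(T['11'] - T['22']) + 2*Tp['12']*T['12']

rng = random.Random(0)
keys = ('11', '22', '12')
T = {key: rng.uniform(-1, 1) for key in keys}
Tp = {key: rng.uniform(-1, 1) for key in keys}
mismatch = abs(retarded_numerator(Tp, T, -0.5) - two_polarization_form(Tp, T))

# A pure-trace (pressure-like, spin-0) transverse source: T11 = T22, T12 = 0.
pressure = {'11': 1.0, '22': 1.0, '12': 0.0}
trace_exchange = retarded_numerator(pressure, pressure, -0.5)
```

With α = −1/2 a pure-trace source exchanges nothing in the retarded term, whereas with α = 0 the same source gives a nonzero contribution: that is the spin-0 piece being removed.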

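A possible plane-wave solution representing our graviton is therefore:

$$h_{\mu\nu} = \varepsilon_{\mu\nu}\,e^{i k_\sigma x^\sigma}$$

where the polarization tensor $\varepsilon_{\mu\nu}$ has the following nonzero components:

$$\varepsilon_{11} = \frac{1}{\sqrt2},\ \ \varepsilon_{22} = -\frac{1}{\sqrt2} \quad\text{or}\quad \varepsilon_{12} = \varepsilon_{21} = \frac{1}{\sqrt2}$$

Our interaction is in general:

$$\text{Interaction} = T'_{\mu\nu}\,\frac{1}{k^2}\,T^{\mu\nu} - \frac12\,T'^{\mu}{}_{\mu}\,\frac{1}{k^2}\,T^{\nu}{}_{\nu}$$

and can be written:

$$T'^{\tau\sigma}\,P_{\tau\sigma,\mu\nu}\,T^{\mu\nu}$$

where the propagator for the graviton is given by:

$$P_{\tau\sigma,\mu\nu} = \frac12\,(\eta_{\tau\mu}\eta_{\sigma\nu} + \eta_{\tau\nu}\eta_{\sigma\mu} - \eta_{\tau\sigma}\eta_{\mu\nu})\,\frac{1}{k^2}$$

We shall usually prefer for simplicity to keep the propagator as a simple factor $1/k^2$ and to represent the interaction by virtual gravitons, emitted from the source with amplitude:

$$h_{\mu\nu} = \frac{1}{k^2}\left(T_{\mu\nu} - \frac12\,\eta_{\mu\nu}T^{\sigma}{}_{\sigma}\right)$$

and with a coupling $h_{\mu\nu}T'^{\mu\nu}$ for absorption.

The amplitude for emission of a real graviton of polarization $\varepsilon_{\sigma\tau}$, if $\varepsilon^{\sigma}{}_{\sigma} = 0$ as in ε11 = 1/√2, ε22 = −1/√2, or ε12 = ε21 = 1/√2 above, is given by the inner product $\varepsilon_{\sigma\tau}T^{\sigma\tau}$.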
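That the propagator form reproduces the two-term interaction can be confirmed by brute-force index contraction. The sketch below drops the common factor $1/k^2$; the helper names, the index order 0..3 standing for (t, x, y, z), and the random symmetric test tensors are assumptions for illustration.

```python
import random

ETA_DIAG = [1.0, -1.0, -1.0, -1.0]   # eta = diag(+1, -1, -1, -1)
IDX = range(4)

def eta(m, n):
    return ETA_DIAG[m] if m == n else 0.0

def propagator_numerator(t, s, m, n):
    """k^2 * P_{ts,mn} = (eta_tm eta_sn + eta_tn eta_sm - eta_ts eta_mn) / 2."""
    return 0.5*(eta(t, m)*eta(s, n) + eta(t, n)*eta(s, m) - eta(t, s)*eta(m, n))

def contract_with_propagator(Tp_up, T_up):
    """T'^{ts} (k^2 P_{ts,mn}) T^{mn} for tensors with upper indices."""
    return sum(Tp_up[t][s]*propagator_numerator(t, s, m, n)*T_up[m][n]
               for t in IDX for s in IDX for m in IDX for n in IDX)

def direct_form(Tp_up, T_up):
    """T'_{mn} T^{mn} - (1/2) T'^m_m T^n_n, lowering indices with eta."""
    full = sum(ETA_DIAG[m]*ETA_DIAG[n]*Tp_up[m][n]*T_up[m][n] for m in IDX for n in IDX)
    trace_p = sum(ETA_DIAG[m]*Tp_up[m][m] for m in IDX)
    trace = sum(ETA_DIAG[m]*T_up[m][m] for m in IDX)
    return full - 0.5*trace_p*trace

def random_symmetric(rng):
    a = [[rng.uniform(-1, 1) for _ in IDX] for _ in IDX]
    return [[0.5*(a[m][n] + a[n][m]) for n in IDX] for m in IDX]

rng = random.Random(1)
Tp, T = random_symmetric(rng), random_symmetric(rng)
gap = abs(contract_with_propagator(Tp, T) - direct_form(Tp, T))
```

The agreement relies on the source tensors being symmetric, which is why the two η products in the propagator contribute equally.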

Interpretation of the Terms in the Amplitudes

The polarization of a graviton is a tensor quantity [i.e., εµν ]. We may visualize this with pictures similar to those we use in describing stresses; we draw arrows indicating the direction to be associated with surfaces normal to the axes. In the plane perpendicular to the direction of propagation we have the two stresses in Figure 1. These are the only two possible quadrupole stresses [i.e., Figure 1 (a)&(b)]; the stresses representable by all arrows pointing towards the origin (or away from the origin) are something like a fluid pressure, which has spin 0. The stresses (actually rotations) representable by all arrows pointing in a clockwise (or counter-clockwise) direction correspond to spin 1.

[Figure 1 (a), (b): the two quadrupole stresses in the 1-2 plane. Figure 2 (a): the stress of Figure 1 (b) referred to axes rotated by 45°. Figure 2 (b): spin-0, spin-1, and spin-2 rotations in the 1-2 plane from 0° to 360°, with the phases exp(2iθ) and exp(−2iθ) marked.]

The stresses represented by Figure 1 (b) may be referred to axes which are 45° from
the original axes; in this case the picture is Figure 2 (a) which is nothing but the same
stress of Figure 1 (a), rotated by 45°. From this we find that these two polarizations turn
into each other with a rotation of axes by 45°. As we rotate by 90°, each polarization
returns to itself; the arrows are reversed, but we must think of an oscillating time
dependence associated with the polarization. Continuing in this way, we see a complete
360° rotation corresponds to two complete cycles of phase – the spin is 2! There are two
orthogonal linear combinations of these two polarizations, whose rotational phase
change behaves as exp(2iθ) and exp(−2iθ) (see Figure 2 (b) for a general way of
representing spin rotations in 1-2 phase space). This is simply a different separation of
the retarded term; with a little trial and error we can simply display the two parts:
$$\frac14\,(T'_{11} - T'_{22} + 2iT'_{12})(T_{11} - T_{22} - 2iT_{12}) + \frac14\,(T'_{11} - T'_{22} - 2iT'_{12})(T_{11} - T_{22} + 2iT_{12})$$

That these have the character of spin 2, projection ±2 tensors is obvious when we compare the form of these products of harmonic polynomials; we know that (x±iy)(x±iy) are evidently of spin 2 and projection ±2; these products are xx−yy±2ixy, which have the same structure as our terms in the equation above. Thus we conclude that with the choice α=−1/2, our gravitons have only two possible polarizations! This is possibly the correct theory, equivalent to a field theory of spin 2 which our field theorists Pauli and Fierz (1939) have already worked out and expressed in terms of field Lagrangians.
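The spin-2 character of the combinations $T_{11} - T_{22} \pm 2iT_{12}$ can be checked by rotating a symmetric transverse tensor. The sample tensor, the angle and the passive rotation convention below are assumptions; with the opposite convention the two phases are interchanged.

```python
import cmath
import math

def rotate_tensor(T, theta):
    """Components of a symmetric 2x2 tensor after rotating the 1-2 axes by theta."""
    c, s = math.cos(theta), math.sin(theta)
    R = [[c, s], [-s, c]]
    return [[sum(R[a][i]*T[i][j]*R[b][j] for i in range(2) for j in range(2))
             for b in range(2)] for a in range(2)]

def helicity_combinations(T):
    """T11 - T22 + 2i T12 and T11 - T22 - 2i T12."""
    base = T[0][0] - T[1][1]
    return base + 2j*T[0][1], base - 2j*T[0][1]

T = [[0.7, 0.2], [0.2, -0.4]]
theta = 0.3
plus, minus = helicity_combinations(T)
rplus, rminus = helicity_combinations(rotate_tensor(T, theta))
phase_plus = rplus/plus      # exp(-2i theta) in this convention
phase_minus = rminus/minus   # exp(+2i theta)

half_turn = rotate_tensor(T, math.pi)   # a 180 degree rotation restores the tensor
```

The phase advances twice as fast as the rotation angle, so a 180° turn already brings the tensor back to itself, in line with the two complete cycles per 360° described in the text.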

We are approaching the spin-2 theory from analogies to a spin-1 theory; thus we have without explanation assumed the existence of graviton plane waves, since the photon plane waves are represented by poles of the propagator, and the graviton propagator also has poles at ω=±k3. But the experimental evidence is lacking – we have observed neither gravitons nor even classical gravity waves!

There are some problems at present completely disregarded to which we shall later turn our attention. The sources of electromagnetism are conserved, and energy, which is the source of gravity, is conserved. But this is conservation of a different character, since the photon is uncharged, and hence is not a source of itself, whereas the graviton has an energy content equal to $E_G = \hbar\omega$, and therefore it is itself a source of gravitons! (i.e., gravitons – like gluons g1…g8 in QCD – will interact with matter and with themselves!) We speak of this as the nonlinearity of the gravitational field.

In electromagnetism, we were able to deduce field equations (i.e., Maxwell’s equations), which are inconsistent if charge is not conserved. We have so far avoided discussion of a field equation for gravity by worrying only about amplitudes, and not the fields themselves. Also, we need yet to discuss whether the theory we can write will be dependent on a gauge, and whether we can at all write a field equation corresponding to Maxwell’s ∂Fµν/∂xν = jµ.

There are some physical consequences of our theory that may be discussed without
field equations, by simply considering the form of the interaction. We write the complete
expression corresponding to α=−1/2:
$$T'_{\mu\nu}\,\frac{1}{k^2}\,T^{\mu\nu} - \frac12\,T'^{\mu}{}_{\mu}\,\frac{1}{k^2}\,T^{\nu}{}_{\nu} = \underbrace{-\frac{1}{2k_3^2}\left[T'_{44}T_{44}\left(1 - \frac{\omega^2}{k_3^2}\right) + T'_{44}(T_{11} + T_{22}) + T_{44}(T'_{11} + T'_{22}) - 4T'_{41}T_{41} - 4T'_{42}T_{42}\right]}_{\text{Instantaneous}} + \underbrace{\frac12\,\frac{1}{\omega^2 - k_3^2}\left[(T'_{11} - T'_{22})(T_{11} - T_{22}) + 4T'_{12}T_{12}\right]}_{\text{Retarded}}$$

(N.B., If you so desire, the term $(\omega^2/k_3^2)\,T'_{44}T_{44}$ may be replaced either by $T'_{43}T_{43}$ or by $\tfrac12(T_{44}T'_{33} + T_{33}T'_{44})$.) We have already discussed the retarded term and its polarizations. Let us now look at the first term. The tensor T is the stress density; for slow particles, the space components are of order v/c, so that the Newtonian law is represented by only one of the products, T′44T44. The other products are somewhat analogous to magnetism. Note that in the separation they appear as instantaneous terms. The retardation effects, the travelling waves, appear only at even powers of v/c.

The Lagrangian for the Gravitational Field

We shall now study our theory in terms of a Lagrangian, studying the fields themselves rather than simply the amplitudes. We first review the situation in electrodynamics. There, the action is:

$$S_{\rm EM} = -\int dV\left[\frac14\left(\frac{\partial A_\mu}{\partial x^\nu} - \frac{\partial A_\nu}{\partial x^\mu}\right)\left(\frac{\partial A^\mu}{\partial x_\nu} - \frac{\partial A^\nu}{\partial x_\mu}\right) + j_\mu A^\mu\right]$$

It is from such a Lagrangian that we eventually deduce field equations; we want the gravitational analogue of $A_\mu = -(1/k^2)\,j_\mu$. It is not difficult to guess at the form of the second term, or coupling term:

$$-\kappa\,h_{\mu\nu}T^{\mu\nu}$$

Now the analogy for the terms involving the derivatives of $h_{\mu\nu}$ is not so obvious; there are simply too many indices which can be permuted in too many ways. We will have to write a general form of the Lagrangian, as a sum over all the ways of writing the field derivatives, putting arbitrary coefficients in front of each term, as follows:

$$a\,\frac{\partial h_{\mu\nu}}{\partial x^\sigma}\frac{\partial h^{\mu\nu}}{\partial x_\sigma} + b\,\frac{\partial h_{\mu\nu}}{\partial x^\sigma}\frac{\partial h^{\mu\sigma}}{\partial x_\nu} + c\,\frac{\partial h^{\mu}{}_{\mu}}{\partial x^\nu}\frac{\partial h^{\sigma}{}_{\sigma}}{\partial x_\nu} + \dots$$

Our theory will not be complete until we have invented some criterion for assigning values to the coefficients a, b, c, d, e, ….

Perhaps we can make a guess by some analogy to electromagnetism. If we vary the Lagrangian in $S_{\rm EM}$ above with respect to A (i.e., to first order in $A_\mu$), we arrive at a differential equation connecting the fields and the current:

$$\frac{\partial}{\partial x_\nu}\left(\frac{\partial A_\mu}{\partial x^\nu}\right) - \frac{\partial}{\partial x_\nu}\left(\frac{\partial A_\nu}{\partial x^\mu}\right) = j_\mu$$

For economy of writing, we shall henceforth indicate such differentiations (i.e., gradients) by simply indicating the index of the coordinates after a comma; the equation above becomes, for the first ∂/∂xν differential in the parenthesis above:

$$\frac{\partial}{\partial x_\nu}A_{\mu,\nu} - \frac{\partial}{\partial x_\nu}A_{\nu,\mu} = j_\mu$$

and then for the next ∂/∂xν differential operating on the parenthesis:

$$A_{\mu,\nu}{}^{,\nu} - A_{\nu,\mu}{}^{,\nu} = j_\mu$$

The conservation of charge is expressed by taking the divergence of $j_\mu$ equal to zero. But we may notice that the Maxwell equations for the field are not consistent unless there is charge conservation, and that the divergence of the expression on the left hand side of $A_{\mu,\nu}{}^{,\nu} - A_{\nu,\mu}{}^{,\nu} = j_\mu$ above is identically zero. With the correct electromagnetic Lagrangian, then, the conservation of charge can be deduced as a consequence of the field equations. The left hand side satisfies an identity: its divergence, and hence that of $j_\mu$, is zero:

$$(A_{\mu,\nu}{}^{,\nu} - A_{\nu,\mu}{}^{,\nu})^{,\mu} = A_{\mu,\nu}{}^{,\nu\mu} - A_{\nu,\mu}{}^{,\nu\mu} = j_\mu{}^{,\mu} = 0$$

A similar requirement serves to define the size of the coefficients a, b, c, d, e, … relative to each other. We will write a general Lagrangian, deduce the differential equations by varying it, and then demand that since the divergence of the tensor T vanishes, the field quantities that are equal to it should have a divergence which vanishes identically. This will result in a unique assignment for the values of the coefficients. We will carry out the algebra explicitly, adjusting the coefficients so that the field equations are consistent only if:

$$T^{\mu\nu}{}_{,\nu} = 0$$

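The Equations for the Gravitational Field

We begin by writing down all the possible products of derivatives of our field tensor $h_{\mu\nu}$. At each step, considerable simplification results if we use the symmetry of $h_{\mu\nu}$ in order to combine terms. If the two tensor indices are different from the derivative index, we have two possible products:

$$h_{\mu\nu,\sigma}\,h^{\mu\nu,\sigma} \quad\text{and}\quad h_{\mu\nu,\sigma}\,h^{\mu\sigma,\nu}$$

If there are two indices which are equal, we may have three possible products:

$$h^{\mu\nu}{}_{,\nu}\,h_{\mu\sigma}{}^{,\sigma},\quad h^{\mu\nu}{}_{,\nu}\,h^{\sigma}{}_{\sigma,\mu} \quad\text{and}\quad h^{\sigma}{}_{\sigma,\nu}\,h^{\mu}{}_{\mu}{}^{,\nu}$$

Not all of these five products are necessary; $h_{\mu\nu,\sigma}h^{\mu\sigma,\nu}$ may be omitted because it can be converted to $h^{\mu\nu}{}_{,\nu}h_{\mu\sigma}{}^{,\sigma}$ by integration by parts. This leaves only four independent products of derivatives. That is, we assume a graviton action (with a $-\kappa\,h_{\mu\nu}T^{\mu\nu}$ interaction term) of the form:

$$S_{\rm G} = \int d\tau\,\left(a\,h^{\mu\nu,\sigma}h_{\mu\nu,\sigma} + b\,h^{\mu\nu}{}_{,\nu}h_{\mu\sigma}{}^{,\sigma} + c\,h^{\mu\nu}{}_{,\nu}h^{\sigma}{}_{\sigma,\mu} + d\,h^{\sigma}{}_{\sigma,\nu}h^{\mu}{}_{\mu}{}^{,\nu} - \kappa\,h_{\mu\nu}T^{\mu\nu}\right)$$

We now vary this sum of four products with respect to $h_{\alpha\beta}$ in order to obtain a differential equation relating the field derivatives with the source tensor $T_{\alpha\beta}$. The result is (N.B., remember that $\delta h_{\alpha\beta}$ is symmetric in α, β, so only the symmetric part of its coefficient must be zero):

$$2a\,h_{\alpha\beta,\sigma}{}^{,\sigma} + b\,(h_{\alpha\sigma,\beta}{}^{,\sigma} + h_{\beta\sigma,\alpha}{}^{,\sigma}) + c\,(h^{\sigma}{}_{\sigma,\alpha\beta} + \eta_{\alpha\beta}\,h^{\mu\nu}{}_{,\nu\mu}) + 2d\,\eta_{\alpha\beta}\,h^{\sigma}{}_{\sigma,\mu}{}^{,\mu} = -\kappa\,T_{\alpha\beta}$$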

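We take the derivative of each of these with respect to the index β, then the requirement that the divergence of the left should be identically zero gives the equation:

$$2a\,h_{\alpha\beta,\sigma}{}^{,\sigma\beta} + b\,h_{\alpha\sigma,\beta}{}^{,\sigma\beta} + b\,h_{\beta\sigma,\alpha}{}^{,\sigma\beta} + c\,h^{\sigma}{}_{\sigma,\alpha\beta}{}^{,\beta} + c\,h^{\mu\nu}{}_{,\nu\mu\alpha} + 2d\,h^{\sigma}{}_{\sigma,\mu\alpha}{}^{,\mu} = 0$$

We now gather terms of the same ilk, and set the coefficients equal to zero; this involves occasional flipping of indices, and interchanging index labels:

$$h_{\alpha\beta,\sigma}{}^{,\sigma\beta}\,(2a + b) = 0,\quad h_{\beta\sigma,\alpha}{}^{,\sigma\beta}\,(b + c) = 0 \quad\text{and}\quad h^{\sigma}{}_{\sigma,\alpha\beta}{}^{,\beta}\,(c + 2d) = 0$$

If we choose a scale for our units, such that a=1/2, we obtain:

$$a = \frac12,\quad b = -1,\quad c = 1 \quad\text{and}\quad d = -\frac12$$

So the above field equations are $h_{\alpha\beta,\sigma}{}^{,\sigma} - h_{\alpha\sigma,\beta}{}^{,\sigma} - h_{\beta\sigma,\alpha}{}^{,\sigma} + h^{\sigma}{}_{\sigma,\alpha\beta} + \eta_{\alpha\beta}\,h^{\mu\nu}{}_{,\nu\mu} - \eta_{\alpha\beta}\,h^{\sigma}{}_{\sigma,\mu}{}^{,\mu} = -\kappa\,T_{\alpha\beta}$. Presumably, we have now obtained the correct Lagrangian for the gravity field. As a consequence of this Lagrangian, we shall eventually get a field equation. For now the action is:

$$S_{\rm G} = \int d\tau\left(\frac12\,h^{\mu\nu,\sigma}h_{\mu\nu,\sigma} - h^{\mu\nu}{}_{,\nu}h_{\mu\sigma}{}^{,\sigma} + h^{\mu\nu}{}_{,\nu}h^{\sigma}{}_{\sigma,\mu} - \frac12\,h^{\sigma}{}_{\sigma,\nu}h^{\mu}{}_{\mu}{}^{,\nu} - \kappa\,h_{\mu\nu}T^{\mu\nu}\right)$$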
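The coefficient conditions can be tested in momentum space, where each derivative becomes a factor of k (the common factor $i^2$ is dropped). The sketch below is an illustration under stated assumptions: the function names, the sample momentum and the sample symmetric field are invented for the check, and indices 0..3 stand for (t, x, y, z) with η = diag(+1, −1, −1, −1).

```python
ETA = [1.0, -1.0, -1.0, -1.0]   # diagonal of eta = diag(+1, -1, -1, -1)

def field_operator(h, k_up, a, b, c, d):
    """Momentum-space left-hand side G_{alpha beta} of the field equations.
    h: symmetric 4x4 list with lower indices; k_up: components k^mu."""
    n = range(4)
    k_lo = [ETA[m]*k_up[m] for m in n]
    k2 = sum(k_up[m]*k_lo[m] for m in n)
    kh = [sum(k_up[s]*h[al][s] for s in n) for al in n]   # k^sigma h_{alpha sigma}
    khk = sum(k_up[al]*kh[al] for al in n)                # k^mu k^nu h_{mu nu}
    tr = sum(ETA[m]*h[m][m] for m in n)                   # h^sigma_sigma
    G = [[0.0]*4 for _ in n]
    for al in n:
        for be in n:
            eta_ab = ETA[al] if al == be else 0.0
            G[al][be] = (2*a*k2*h[al][be]
                         + b*(k_lo[be]*kh[al] + k_lo[al]*kh[be])
                         + c*(k_lo[al]*k_lo[be]*tr + eta_ab*khk)
                         + 2*d*eta_ab*k2*tr)
    return G

def divergence(G, k_up):
    """k^alpha G_{alpha beta}."""
    return [sum(k_up[al]*G[al][be] for al in range(4)) for be in range(4)]

k = [2.0, 1.0, 0.0, 0.0]
h = [[1.0, 0.5, 0.0, 0.0],
     [0.5, 2.0, 0.0, 0.0],
     [0.0, 0.0, -1.0, 0.3],
     [0.0, 0.0, 0.3, 0.7]]
good = max(abs(x) for x in divergence(field_operator(h, k, 0.5, -1.0, 1.0, -0.5), k))
bad = max(abs(x) for x in divergence(field_operator(h, k, 1.0, 1.0, 1.0, 1.0), k))
```

For a = 1/2, b = −1, c = 1, d = −1/2 the divergence vanishes for any field and momentum, while a generic choice such as a = b = c = d = 1 leaves a nonzero remainder.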

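The manipulation of these tensor quantities becomes increasingly tedious in the work that is to follow; in order not to get bogged down in the algebra of many indices, some simplifying tricks may be developed. It is not necessarily obvious at this point that the definitions we are about to make are useful; the justification comes in our later use.

We now define a ‘bar’ operator on an arbitrary second rank tensor by:

$$\bar X_{\mu\nu} = \frac12\,(X_{\mu\nu} + X_{\nu\mu}) - \frac12\,\eta_{\mu\nu}\,X^{\sigma}{}_{\sigma}$$

For a symmetric tensor such as $h_{\mu\nu}$, the rule is simpler because the two terms in the first parenthesis are equal:

$$\bar h_{\mu\nu} = h_{\mu\nu} - \frac12\,\eta_{\mu\nu}\,h^{\sigma}{}_{\sigma} \quad\text{and}\quad \bar{\bar h}_{\mu\nu} = \bar h_{\mu\nu} - \frac12\,\eta_{\mu\nu}\,\bar h^{\sigma}{}_{\sigma} = h_{\mu\nu}$$

and notice that the ‘bar’ operation is its own reciprocal for symmetric tensors. Define also the use of the unindexed tensor symbol to represent its trace:

$$h^{\sigma}{}_{\sigma} = \mathrm{Tr}(h) = h \quad\text{and}\quad \bar h^{\sigma}{}_{\sigma} = -h$$

With these notations, the field equations $h_{\alpha\beta,\sigma}{}^{,\sigma} - h_{\alpha\sigma,\beta}{}^{,\sigma} - h_{\beta\sigma,\alpha}{}^{,\sigma} + h^{\sigma}{}_{\sigma,\alpha\beta} + \eta_{\alpha\beta}\,h^{\mu\nu}{}_{,\nu\mu} - \eta_{\alpha\beta}\,h^{\sigma}{}_{\sigma,\mu}{}^{,\mu} = -\kappa\,T_{\alpha\beta}$ may be written as follows, in a symmetric version (the second term being understood as symmetrized in µ and ν):

$$h_{\mu\nu,\sigma}{}^{,\sigma} - 2\,\bar h_{\mu\sigma,\nu}{}^{,\sigma} = -\kappa\,\bar T_{\mu\nu}$$

To get a relation for $T_{\mu\nu}$, we simply ‘bar’ both sides of this equation.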
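The two properties of the ‘bar’ operation used here, that it is its own inverse on symmetric tensors and that it flips the sign of the trace, follow from $\eta^{\mu\nu}\eta_{\mu\nu} = 4$ and are easy to confirm numerically. The helper names and the sample tensor are illustrative assumptions; the tensor is stored with lower indices.

```python
ETA_DIAG = [1.0, -1.0, -1.0, -1.0]   # eta = diag(+1, -1, -1, -1)

def trace(X):
    """X^sigma_sigma for a tensor given with lower indices X_{mu nu}."""
    return sum(ETA_DIAG[m]*X[m][m] for m in range(4))

def bar(X):
    """bar X_{mu nu} = (X_{mu nu} + X_{nu mu})/2 - eta_{mu nu} X^sigma_sigma / 2."""
    tr = trace(X)
    return [[0.5*(X[m][n] + X[n][m]) - (0.5*ETA_DIAG[m]*tr if m == n else 0.0)
             for n in range(4)] for m in range(4)]

h = [[1.0, 0.2, 0.0, -0.3],
     [0.2, 0.5, 0.1, 0.0],
     [0.0, 0.1, -0.7, 0.4],
     [-0.3, 0.0, 0.4, 0.9]]
h_bar = bar(h)
h_bar_bar = bar(h_bar)
```

Applying `bar` twice returns the original symmetric tensor, and `trace(h_bar)` equals `-trace(h)`, which is why barring both sides of a field equation loses no information.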

We next look for something analogous to the gauge invariance properties of electrodynamics to simplify the solution of $h_{\mu\nu,\sigma}{}^{,\sigma} - 2\,\bar h_{\mu\sigma,\nu}{}^{,\sigma} = -\kappa\,\bar T_{\mu\nu}$ above. In electrodynamics, the field equation $A_{\mu,\nu}{}^{,\nu} - A^{\nu}{}_{,\nu\mu} = j_\mu$ resulted in the possibility of describing fields equally well in terms of a new four-vector $A'_\mu$, obtainable from $A_\mu$ by addition of a gradient of a scalar function X, that is $A'_\mu = A_\mu + X_{,\mu}$. What might be the analogous property of a tensor field? We guess that the following might hold (we have to be careful to keep our tensors symmetric!), that is the substitution of:

$$h'_{\mu\nu} = h_{\mu\nu} + X_{\mu,\nu} + X_{\nu,\mu}$$

into the left hand side of $h_{\mu\nu,\sigma}{}^{,\sigma} - 2\,\bar h_{\mu\sigma,\nu}{}^{,\sigma} = -\kappa\,\bar T_{\mu\nu}$ above will not alter the form of that equation. Using this property of gauge invariance, it will be simpler to obtain equations for the fields in a definite gauge which is more appropriate, something like the Lorentz gauge of electrodynamics. By analogy with the choice $A^{\nu}{}_{,\nu} = 0$ we shall make the corresponding choice (which we shall call a Lorentz condition):

$$\bar h_{\mu\sigma}{}^{,\sigma} = 0$$

This results in field equations relating the ‘bar’ of T to the field:

$$h_{\mu\nu,\sigma}{}^{,\sigma} = -k^2\,h_{\mu\nu} = -\kappa\,\bar T_{\mu\nu}$$

or, solving, $h_{\mu\nu} = (\kappa/k^2)\,\bar T_{\mu\nu}$. It follows immediately that the amplitude of interaction of such an $h_{\mu\nu}$ with another source $T'^{\mu\nu}$ from the $\kappa\,h_{\mu\nu}T'^{\mu\nu}$ in the Lagrangian is $\kappa^2\,T'^{\mu\nu}(1/k^2)\,\bar T_{\mu\nu}$. Thus, this produces precisely what we have found before in discussing the amplitudes directly!
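The guessed gauge invariance can be checked directly in momentum space: a field of the pure-gauge form $k_\mu X_\nu + k_\nu X_\mu$ should be annihilated by the left-hand side of the field equations. The sketch below uses the unbarred form of those equations with a = 1/2, b = −1, c = 1, d = −1/2 and drops overall factors of i; the function names and sample vectors are assumptions made for the check.

```python
ETA_DIAG = [1.0, -1.0, -1.0, -1.0]   # eta = diag(+1, -1, -1, -1)
IDX = range(4)

def field_operator(h, k_up):
    """Momentum-space left-hand side of the field equations (overall sign dropped):
    k^2 h_ab - k_b k^s h_as - k_a k^s h_bs + k_a k_b h + eta_ab (k^m k^n h_mn - k^2 h)."""
    k_lo = [ETA_DIAG[m]*k_up[m] for m in IDX]
    k2 = sum(k_up[m]*k_lo[m] for m in IDX)
    kh = [sum(k_up[s]*h[a][s] for s in IDX) for a in IDX]
    khk = sum(k_up[a]*kh[a] for a in IDX)
    tr = sum(ETA_DIAG[m]*h[m][m] for m in IDX)
    return [[k2*h[a][b] - k_lo[b]*kh[a] - k_lo[a]*kh[b] + k_lo[a]*k_lo[b]*tr
             + (ETA_DIAG[a]*(khk - k2*tr) if a == b else 0.0)
             for b in IDX] for a in IDX]

def gauge_shift(k_up, X_lo):
    """Pure-gauge field k_a X_b + k_b X_a, the image of X_{a,b} + X_{b,a}."""
    k_lo = [ETA_DIAG[m]*k_up[m] for m in IDX]
    return [[k_lo[a]*X_lo[b] + k_lo[b]*X_lo[a] for b in IDX] for a in IDX]

k = [2.0, 1.0, 0.5, -0.3]
X = [0.4, -1.2, 0.7, 0.1]
residual = max(abs(v) for row in field_operator(gauge_shift(k, X), k) for v in row)
```

Because the operator is linear, a vanishing result on the pure-gauge piece means that $h_{\mu\nu}$ and $h'_{\mu\nu}$ satisfy the same equation with the same source.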

We would now like to deduce some useful general properties of the fields, by using the properties of the Lagrangian density. For the gravitational field, we define at this point the coupling constant and normalization of plane waves that we will use from now on. We will let:

$$\kappa = \sqrt{8\pi G_N}$$

Here, $G_N$ is the usual gravitational constant in natural units (ħ≡c≡1); the square root is included in the definition so that the constant κ is analogous to the electron charge e of electrodynamics, rather than to its square. The factor √(8π) serves to eliminate irrelevant factors from the most useful formulæ. To represent plane-wave gravitons, we shall use fields:

$$h_{\mu\nu} = \varepsilon_{\mu\nu}\,e^{i k\cdot x}$$

where $k \equiv k^\mu = [\omega, k_3, 0, 0]$ and with the polarization tensor $\varepsilon_{\mu\nu}$ normalized in such a way that:

$$\varepsilon_{\mu\nu}\,\varepsilon^{\mu\nu} = 1$$

The action that describes the total system of gravity field, matter, and coupling between matter and gravitons, has the following form:

$$S_{\rm G} \equiv S_g + S_m = \underbrace{\frac12\int d\tau\,\left(h_{\mu\nu,\sigma}\,\bar h^{\mu\nu,\sigma} - 2\,\bar h_{\mu\sigma}{}^{,\sigma}\,\bar h^{\mu\nu}{}_{,\nu}\right)}_{\text{Fields}}\ \underbrace{-\ \kappa\int d\tau\,h_{\mu\nu}T^{\mu\nu}}_{\text{Coupling term}}\ +\ \underbrace{S_m}_{\text{Matter}}$$

We may from the Lagrangians of fields deduce some important properties, for
example, we can understand why gravitation is attractive for likes and unlikes, whereas
in electricity likes repel and unlikes attract. It can be shown that this property is inherent
in the sign of the Lagrangian, so that if we change S→−S, the force changes sign. The
sign of the coupling constant κ or e or g makes no difference, since it appears as a
square in any diagram which represents a correction to the energy; always two vertices
are involved. We can change the sign of the energy corresponding to a diagram only if
we can introduce a factor i at each vertex, for example if we are to use fields iφ rather
than φ.
However, the fields φ must represent appropriate plane waves which are consistently defined so that the standing waves in a large box have positive energies, and the quantum mechanical oscillators which represent these standing waves behave properly. Scalar fields have plane waves:

$$\phi = a\,e^{i k\cdot x}$$

The amplitude a for a quantum field appears as the coordinate of a quantum mechanical oscillator. If the kinetic energies of such oscillators, which are proportional to $(da/dt)^2$, are to represent positive energies, we must write our theory in a consistent way, and the replacement φ → iφ would be wrong.

For electromagnetic waves, it is the components in the transverse direction, perpendicular to the direction of propagation, which are restricted by a similar consideration. A negative sign appears in the associated energy because the energy involves the space indices in the dot product of two vectors, which we have defined as $A_\mu B^\mu = A_4B_4 - (A_3B_3 + A_2B_2 + A_1B_1)$. The sign of the Coulomb forces comes from the sign of the time components in the Lagrangian. For the gravity waves, it is again transverse components that are restricted, but in contracting over two indices (or any even number of indices) the signs cancel out, the sign of the time component h44 is the opposite of the electrical case and we have attractions!

Before we can calculate observable effects and make predictions other than the inverse-square rule and the fact that likes attract, with a force proportional to the energy, we must specify how the matter defines the stress tensor Tµν. We shall first carry out in some detail calculations based on the simplest assumption, that matter can be represented by a scalar function φ.
84
2017
MRT
How does one go about generating a stress density from a scalar field φ? The
Lagrangian presumably depends on the fields and their derivatives:
),( ,
ii
νψψLL =
The Stress-Energy Tensor for Scalar Matter
The 44 component of a stress tensor Tµν should represent the energy density, which is
the Hamiltonian.Therefore, by a generalization of the usual classical prescriptionfor gene-
rating a Hamiltonian from a Lagrangian, H=q(∂L/∂q)−L, the following rule is obtained:⋅⋅⋅⋅ ⋅⋅⋅⋅
where the symbol δ µ
ν is the Kronecker delta, which is 1 if µ =ν, and zero if µ ≠ν. This
rule is not generally correct. For one thing, it does not necessarily result in an expression
symmetric in µ and ν.
L
L µ
ν
µ
ν
µ
ν δ
ψ
ψ −
∂
∂
= i
i
T
,
,
If the tensor Tµν =½(T(µν) +T[µν]) is not symmetric,T(µν) (e.g., it is antisymmetric–which
is represented by T[µν]), the resulting theory is pathological (e.g., there is no way to
define angular momentum in the field). The conservation of energy is confused
because the divergence involves terms which are no longer equal (i.e., Tµν
,ν ≠Tνµ
,ν).

In our particular scalar case, the rule $T^{\mu}{}_{\nu} = \psi^{i}{}_{,\nu}(\partial L/\partial\psi^{i}{}_{,\mu}) - \delta^{\mu}{}_{\nu}L$ does turn out to give a satisfactory symmetric form. We have a Lagrangian and action (e.g., for scalar matter represented by φ – c.f., PART VIII – THE STANDARD MODEL: Field Equations chapter):
$$S_\phi = \frac{1}{2}\int dV\,(\phi^{,\sigma}\phi_{,\sigma} - m^2\phi^2)$$
which results in the following stress tensor:
$$T_{\mu\nu} = \phi_{,\mu}\phi_{,\nu} - \frac{1}{2}\eta_{\mu\nu}\,\phi^{,\sigma}\phi_{,\sigma} + \frac{1}{2}m^2\phi^2\,\eta_{\mu\nu}$$
With this stress tensor Tµν for scalar matter, the coupling term in the Lagrangian becomes:
$$-\kappa\,h_{\mu\nu}T^{\mu\nu} = -\kappa\left[h^{\mu\nu}\phi_{,\mu}\phi_{,\nu} - \frac{1}{2}h^{\mu\nu}\eta_{\mu\nu}\,(\phi^{,\sigma}\phi_{,\sigma} - m^2\phi^2)\right]$$
In terms of our ‘bar’ notation, this is written as:
$$-\kappa\left[\bar h^{\mu\nu}\phi_{,\mu}\phi_{,\nu} + \frac{1}{2}m^2 h\,\phi^2\right]$$
where $h = h^{\sigma}{}_{\sigma}$. We may now use such a coupling term to generate amplitudes for scattering by exchange of a graviton!
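The equality of the two forms of the coupling bracket can be checked numerically. The sketch below is only an illustration: it assumes the bar operation h̄µν = hµν − ½ηµν h used earlier in these slides, stores the time index "4" in slot 0, and uses random numbers as stand-ins for hµν and φ,µ (the helper names are hypothetical).

```python
import random

# Diagonal of the Minkowski metric in the slides' sign convention,
# A.B = A4B4 - (A1B1 + A2B2 + A3B3). Slot 0 stands for the time index "4"
# (an indexing choice made for this sketch only).
ETA = [1.0, -1.0, -1.0, -1.0]
IDX = range(4)

def trace(t):
    """t^sigma_sigma for a tensor t given with two lower indices."""
    return sum(ETA[i] * t[i][i] for i in IDX)

def bar(t):
    """Bar operation: t_mu_nu - (1/2) eta_mu_nu t^sigma_sigma."""
    tr = trace(t)
    return [[t[i][j] - (0.5 * ETA[i] * tr if i == j else 0.0) for j in IDX] for i in IDX]

def contract(t, v):
    """t^{mu nu} v_mu v_nu, raising both indices of t with the metric."""
    return sum(ETA[i] * ETA[j] * t[i][j] * v[i] * v[j] for i in IDX for j in IDX)

def coupling_brackets(h, dphi, phi, m):
    """Bracket of the coupling term, written without and with the bar notation."""
    dd = sum(ETA[i] * dphi[i] ** 2 for i in IDX)      # phi^{,sigma} phi_{,sigma}
    plain = contract(h, dphi) - 0.5 * trace(h) * (dd - m**2 * phi**2)
    barred = contract(bar(h), dphi) + 0.5 * m**2 * trace(h) * phi**2
    return plain, barred

random.seed(1)
h = [[0.0] * 4 for _ in IDX]
for i in IDX:
    for j in range(i, 4):
        h[i][j] = h[j][i] = random.uniform(-1.0, 1.0)  # random symmetric h_mu_nu
dphi = [random.uniform(-1.0, 1.0) for _ in IDX]        # stand-in for phi_{,mu}

plain, barred = coupling_brackets(h, dphi, phi=0.7, m=1.3)
h_twice_barred = bar(bar(h))                           # the bar is its own inverse in 4 dimensions
```

The two brackets agree to rounding error, and applying the bar twice returns the original tensor, which is why a bar can be moved from one factor of a contraction to the other.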

Amplitudes for Scattering for Scalar Matter

The amplitude for scattering corresponding to an exchange of a single graviton (G) may now be written, since we know the form of the propagator and, at each vertex, the coupling term in the Lagrangian, $-\kappa[h^{\mu\nu}\phi_{,\mu}\phi_{,\nu} - \tfrac{1}{2}h^{\mu\nu}\eta_{\mu\nu}(\phi^{,\sigma}\phi_{,\sigma} - m^2\phi^2)]$, from the chapter The Stress-Energy Tensor for Scalar Matter. We replace the gradients of the scalar fields φ by components of the four-momenta in the momentum representation:
$$\phi_{,\nu} = i\,p_\nu$$
and label the particle momenta ⁿp (for n = 1…4), so that the coupling becomes for one vertex:
$$2\kappa\left[\underline{{}^{1}p_\mu\,{}^{2}p_\nu} - \frac{1}{2}\eta_{\mu\nu}\,({}^{1}p_\sigma\,{}^{2}p^{\sigma} - m^2)\right]$$
We have written an ‘underline’ under the products pµ pν to remind us that we must use an appropriate symmetrized version, since hµν is symmetric. Explicitly, this means:
$$\underline{A_\mu B_\nu} \equiv \frac{1}{2}(A_\mu B_\nu + A_\nu B_\mu)$$
which could also be written as A(µ Bν). The complete expression for the amplitude is then:
$$4\kappa^2\left[\underline{{}^{1}p_\mu\,{}^{2}p_\nu} - \frac{1}{2}\eta_{\mu\nu}\,({}^{1}p_\sigma\,{}^{2}p^{\sigma} - m^2)\right]\frac{1}{q^2}\left[\underline{{}^{3}p^{\mu}\,{}^{4}p^{\nu}} - \frac{1}{2}\eta^{\mu\nu}m^2\right]$$
These abbreviations (e.g., underline) will result in simplifications of the algebraic manipulations in the more complicated calculations which are to come, so it is worthwhile to use them.

[Figure: one-graviton exchange diagram. The particle lines ¹p, ²p and ³p, ⁴p are joined by a graviton line G carrying momentum q, with a coupling κ at each vertex.]
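The vertex factor on one side of the propagator can be put in a shorter 'bar' form. The lines below are a sketch of that algebra, not taken from the slides; they assume the bar operation $\bar T_{\mu\nu} = T_{\mu\nu} - \tfrac{1}{2}\eta_{\mu\nu}T^{\sigma}{}_{\sigma}$, the value $\eta_{\mu\nu}\eta^{\mu\nu} = 4$, and the shorthand ${}^{3}p\cdot{}^{4}p$ for ${}^{3}p_\sigma\,{}^{4}p^{\sigma}$ (symmetrization in µν understood):

```latex
\begin{aligned}
T_{\mu\nu} &= {}^{3}p_\mu\,{}^{4}p_\nu - \tfrac{1}{2}\eta_{\mu\nu}\,({}^{3}p\cdot{}^{4}p - m^2),\\
T^{\sigma}{}_{\sigma} &= {}^{3}p\cdot{}^{4}p - 2\,({}^{3}p\cdot{}^{4}p - m^2) = -\,{}^{3}p\cdot{}^{4}p + 2m^2,\\
\bar T_{\mu\nu} &= T_{\mu\nu} - \tfrac{1}{2}\eta_{\mu\nu}\,T^{\sigma}{}_{\sigma}
               = {}^{3}p_\mu\,{}^{4}p_\nu - \tfrac{1}{2}\eta_{\mu\nu}\,m^2 .
\end{aligned}
```

Because the graviton propagator carries one bar, only one of the two vertex factors takes this shortened form.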

Our theory has given us an expression for the amplitude of gravitational scattering of
one particle by another. In order to compute something of observable size, we must go
to very large masses, and in order to observe something not given by the Newtonian
law, we need velocities close to the speed of light. We can, for example, calculate the
deflection of a small mass of very high velocity (v ≈c) as it goes by a star such as the
sun. Here, we need to justify the replacement of a sum of amplitudes from all the
particles in the star by a single amplitude corresponding to a mass m; the replacement is
an approximation, but it gives the correct answer to first order. The
deflection is a factor (1 + v²/c²) larger than the prediction of Newtonian theory.
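To get a feel for the size of this factor, the sketch below multiplies the Newtonian small-angle deflection 2GM/(bv²) by (1 + v²/c²). The Newtonian formula is a standard result assumed here rather than derived in these slides, and the solar constants and the function name are illustrative assumptions.

```python
import math

C = 299_792_458.0          # speed of light, m/s
GM_SUN = 1.32712440018e20  # solar gravitational parameter GM, m^3/s^2 (assumed value)
R_SUN = 6.957e8            # nominal solar radius, m (assumed value)

def deflection_angle(v, b=R_SUN, gm=GM_SUN):
    """Small-angle deflection (radians) of a fast scalar particle passing a mass
    at impact parameter b: Newtonian value times the factor (1 + v^2/c^2)."""
    newtonian = 2.0 * gm / (b * v * v)
    return newtonian * (1.0 + (v / C) ** 2)

RAD_TO_ARCSEC = 180.0 / math.pi * 3600.0
grazing = deflection_angle(C) * RAD_TO_ARCSEC  # limit v -> c at the solar limb
```

In the limit v → c the factor is 2, which for a grazing trajectory at the solar limb gives about 1.75 seconds of arc with the constants assumed above.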
The preceding result cannot be said to correspond to the deflection of light by the sun,
because a photon is not a scalar particle, hence cannot be represented by our scalar
mass field φ. For the scattering of two identical particles, the amplitude should include
an exchange term, but for the case of the star, the particles are clearly not identical.
Our theory has not as yet considered the possibility that we might add a piece of zero
divergence to our stress tensor Tµν; this would correspond to a different distribution in
space of the mass and stress. Even for scalar matter we do have a real ambiguity in the
description of Tµν. This difficulty also arises in electrodynamics when we attempt to write
a coupling of photons to charged vector mesons.

Detailed Properties of Plane Waves

We can study the properties of the gravitational waves in the absence of matter; by varying the Lagrangian we obtain the equation (c.f., $h_{\mu\nu,\sigma}{}^{,\sigma} - 2\,\underline{\bar h_{\mu\sigma,\nu}}{}^{,\sigma} = -\kappa\,\bar T_{\mu\nu}$ of the chapter The Equations for the Gravitational Field):
$$h_{\mu\nu,\sigma}{}^{,\sigma} - 2\,\underline{\bar h_{\mu\sigma,\nu}}{}^{,\sigma} = 0$$
which is analogous to Maxwell’s equation in empty space. If we use plane waves:
$$h_{\mu\nu} = \varepsilon_{\mu\nu}\,e^{i q\cdot x}$$
the equation becomes:
$$q^2\varepsilon_{\mu\nu} - q_\nu q^{\sigma}\bar\varepsilon_{\sigma\mu} - q_\mu q^{\sigma}\bar\varepsilon_{\sigma\nu} = 0$$
We are interested in the cases q² ≠ 0 and q² = 0. If q² ≠ 0 we may divide through by q² and rearrange things so that:
$$\varepsilon_{\mu\nu} = q_\nu\left(\frac{1}{q^2}\,q^{\sigma}\bar\varepsilon_{\sigma\mu}\right) + q_\mu\left(\frac{1}{q^2}\,q^{\sigma}\bar\varepsilon_{\sigma\nu}\right)$$
This separation has explicitly expressed εµν as a symmetrized gradient of a vector:
$$\varepsilon_{\mu\nu} = \chi_{\mu,\nu} + \chi_{\nu,\mu}$$
We have discussed previously how gauge invariance of the gravitational field means that the addition of a term of this form makes no difference in the physics. It is therefore always possible to add a piece to εµν so that εµν = 0. We shall call these waves with q² ≠ 0 gauge waves; they have no physical effects and can always be removed by a gauge transformation.

If q² = 0, then $q^2\varepsilon_{\mu\nu} - q_\nu q^{\sigma}\bar\varepsilon_{\sigma\mu} - q_\mu q^{\sigma}\bar\varepsilon_{\sigma\nu} = 0$ above implies that:
$$q^{\sigma}\bar\varepsilon_{\sigma\mu} = 0$$
That is, free waves must satisfy the Lorentz gauge condition. It is not just a matter of choosing $\bar h^{\mu\nu}{}_{,\nu} = 0$ for convenience, as in cases in which the wave is not free. This has its electromagnetic analogue: for photons, qµεµ must be zero.

We may deduce the actual form of the polarization tensor εµν in a system of coordinates such that the momentum vector is qµ = [ω, ω, 0, 0] (i.e., with nonzero components along the 4 and 3 directions only). If we choose:
$$\varepsilon'_{\mu\nu} = \varepsilon_{\mu\nu} + q_\mu\chi_\nu + q_\nu\chi_\mu$$
and demand that ε′µν should have components only in the transverse direction, we obtain a system of equations which may be solved to get the answer:
$$\varepsilon'_{11} = -\varepsilon'_{22} = \frac{1}{\sqrt{2}}\qquad\text{and}\qquad\varepsilon'_{12} = \varepsilon'_{21} = \frac{1}{\sqrt{2}}$$
To arrive at this result, note that the $q^{\sigma}\bar\varepsilon_{\sigma\mu} = 0$ equation above implies that ε̄µ4 = −ε̄µ3, so that only the components 4, 1, and 2 are independent. The 4 components can be removed, if desired, by the transformation ε′µν = εµν + qµχν + qνχµ above. For example, ε′14 = ε14 + ωχ1, so choose χ1 = −ε14/ω and χ2 = −ε24/ω. Then ε′43 = ε43 + ωχ4 − ωχ3, so choose χ3 − χ4 = −ε34/ω; then ε′43 = ε̄′43 = ε̄′44 = ε̄′33. We make ε′44 = ε44 + 2ωχ4 = 0 by choosing χ4 = −ε44/2ω. Then, since ε̄′44 is also zero, the trace ε′σσ is zero, as is therefore ε′33 and ε′11 + ε′22. Therefore only components with µ, ν = 1 or 2 in ε′µν survive, and for them ε′11 = −ε′22. There are then only two linearly independent normalized combinations for the polarization tensor ε′µν, namely the ε′11 = −ε′22 = 1/√2 and ε′12 = ε′21 = 1/√2 components above.
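The properties claimed for the two surviving polarization states can be verified directly. This is a minimal sketch under stated assumptions: slot 0 holds the time index "4", slots 1 and 2 the transverse directions, slot 3 the direction of propagation, and the names `eps_plus` and `eps_cross` are labels chosen here, not notation from the slides.

```python
import math

# Index layout assumed for this sketch: 0 = time (the slides' index 4),
# 1 and 2 = transverse directions, 3 = direction of propagation.
ETA = [1.0, -1.0, -1.0, -1.0]          # diagonal of the Minkowski metric
OMEGA = 2.0                            # arbitrary frequency
q_up = [OMEGA, 0.0, 0.0, OMEGA]        # q^mu, chosen so that q^2 = 0

s = 1.0 / math.sqrt(2.0)
eps_plus = [[0.0] * 4 for _ in range(4)]
eps_cross = [[0.0] * 4 for _ in range(4)]
eps_plus[1][1], eps_plus[2][2] = s, -s       # eps'_11 = -eps'_22 = 1/sqrt(2)
eps_cross[1][2] = eps_cross[2][1] = s        # eps'_12 =  eps'_21 = 1/sqrt(2)

def trace(e):
    """eps^sigma_sigma for a tensor with two lower indices."""
    return sum(ETA[i] * e[i][i] for i in range(4))

def contract_q(e):
    """q^sigma eps_{sigma mu} for each mu."""
    return [sum(q_up[sg] * e[sg][mu] for sg in range(4)) for mu in range(4)]

def inner(a, b):
    """a_{mu nu} b^{mu nu}, raising indices with the diagonal metric."""
    return sum(ETA[i] * ETA[j] * a[i][j] * b[i][j] for i in range(4) for j in range(4))

q_squared = sum(ETA[i] * q_up[i] ** 2 for i in range(4))
```

Both tensors are traceless (so the barred and unbarred tensors coincide), transverse to q, normalized to one, and orthogonal to each other.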

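The amplitude for Compton scattering of a graviton by a particle of mass m corresponds to the diagrams in the Figure. The graviton polarization is represented by the tensor εµν; for scalar mass, the momentum components at each vertex are ¹pµ, ¹pν + ¹qν = ²pν + ²qν, and ²pµ. In terms of these quantities, we have (for the Diagram on the Left):
$$4\kappa^2\;{}^{2}\varepsilon^{\mu\nu}\left[{}^{2}p_\mu\,({}^{2}p_\nu + {}^{2}q_\nu) - \frac{1}{2}\eta_{\mu\nu}m^2\right]\frac{1}{({}^{1}p + {}^{1}q)^2 - m^2}\;{}^{1}\varepsilon^{\alpha\beta}\left[{}^{1}p_\alpha\,({}^{1}p_\beta + {}^{1}q_\beta) - \frac{1}{2}\eta_{\alpha\beta}m^2\right]$$
The propagator we have written is that which is appropriate to a scalar particle. Some simplifications in the formula result from the restrictions on the plane waves, q² = 0 and qσ ε̄σµ = 0.

[Figure: the two graviton Compton-scattering diagrams, the direct term on the left and the Exchange Term on the right. Each shows the particle lines ¹p and ²p, the graviton lines ¹q (polarization ¹εαβ) and ²q (polarization ²εµν), and a coupling κ at each vertex; the direct term carries the propagator 1/[(¹p + ¹q)² − m²].]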

From gauge invariance we expect that a change of 1εµν to 1εµν +1qµ aν +1qν aµ would
have no effect on the Compton amplitude. A direct substitution shows that this is not true.
Our result is unsatisfactory and incomplete… What is wrong?
In the Compton scattering of photons by electrons, there is a third diagram (see
Figure) not analogous to one of the pair in the previous Figure. It corresponds to the
quadratic (or second-order) coupling, in A², which appears in the Lagrangian to make the
theory gauge invariant. In analogy to the situation in electrodynamics, we might suspect
that in considering only the pair of diagrams (see previous Figure), we have made an
approximation to the truth by linearizing. The existence of an amplitude with quadratic
coupling (see Figure) may be deduced in electrodynamics by demanding that the gauge
substitution ε′1 =ε1 +qa should lead to no change in the amplitude to a given order. The
procedure is simply to equate the terms of the same order of the amplitudes obtained
with ε1 and ε′1, with coefficients in front of each term which are to be determined. It may
be possible to deduce the form of the quadratic graviton terms in an analogous fashion,
but this has not yet been done because the self-coupling of the graviton makes things
very complicated in second order.
[Figure: the quadratic-coupling diagram, with external lines ¹p, ²p, ¹q, ²q and vertex factor κ².]

The Self Energy of the Gravitational Field

In developing and making modifications in our field theory, we have neglected to check that the theory is internally consistent. We have written down a total Lagrangian having a field term, a matter term, and a coupling term. We have arrived at a field equation by arranging that the divergence of the stress-energy tensor should be identically zero. This procedure is evidently incorrect, since we have written a stress tensor which did not include the energy of the gravitational field itself! Thus our present theory is physically untenable, since the energy of the matter is not conserved…

We shall attempt to correct this theoretical deficiency by searching for a new tensor to be added to our old Tµν, which might fix things up so that:
$$(T^{\mu\nu} + \chi^{\mu\nu})_{,\nu} = 0$$
and at the same time the self energy of the field is correctly taken into account. How do we find this term? We know we are attempting to describe a nonlinear effect: the gravitational field is produced by energies, and the energy of the field is a source of more field… We shall insist that the field equations come out of the variation of some action, and shall ask ourselves: What kind of a term must we add to the Lagrangian to get a term like χµν to go into an equation of motion:
$$h_{\mu\nu,\sigma}{}^{,\sigma} - 2\,\underline{\bar h_{\mu\sigma,\nu}}{}^{,\sigma} = -\kappa\,(\bar T_{\mu\nu} + \bar\chi_{\mu\nu})$$
and such that (Tµν + χµν),ν = 0 above is satisfied? What might χµν look like if it represents a kind of gravitational energy? Undoubtedly, at least in part, it is proportional to squares of field strengths; that is, the product of two gradients of the potential. Perhaps χµν is a sum of terms like $h^{\mu\sigma}{}_{,\tau}\,h^{\nu\tau}{}_{,\sigma}$ + &c., each with two hs and two derivatives.

We shall insist that our equations be deducible from a variational principle such as that of Least Action. When we vary the products, we reduce the number of hs, so the Lagrangian to be varied needs a new term, of third order in hµν, which we shall call F3; we shall try to arrange things so that the variation of F3 leads to the term χµν:
$$\frac{\delta F_3}{\delta h_{\mu\nu}} = \kappa\,\chi^{\mu\nu}$$
The algebraic character of F3 must be that it involves products of three hs and has two derivative indices. A typical term of F3 might be:
$$F_3 = a\,h^{\mu\nu}\,h^{\sigma\tau}{}_{,\mu}\,h_{\sigma\tau,\nu} + \cdots$$
When we write all possible such products, we find there are 24. We may further reduce the number by noting that certain terms can be reduced to combinations of others by integrating twice by parts; these considerations lead us to write down 18 different and independent forms. We wind up therefore with an expression for χµν in terms of hs and 18 independent constants.

The next procedure is evident. We attempt to determine the constants by demanding that (Tµν + χµν),ν = 0. This results in a set of many more than 18 equations for the 18 constants. However, it turns out that they are all consistent, and the 18 constants are unique. When we have done this, we shall have an improved theory which correctly takes into account the energy of the gravitational field itself to second order in hµν.

The Bilinear Terms of the Stress-Energy Tensor

Our present theory is linear in the sense that we have written down an equation relating the gravitational fields hµν to a stress tensor Tµν:
$$h_{\mu\nu,\sigma}{}^{,\sigma} - 2\,\underline{\bar h_{\mu\sigma,\nu}}{}^{,\sigma} = -\kappa\,\bar T_{\mu\nu}$$
But we have specified Tµν in terms of matter alone, as though it were unaffected by gravity, and as though the gravitational field energy were not itself a source of fields. The effects of gravity on matter that we want to include may be illustrated by considering what may happen as we bring two masses 1 and 2 together, in the presence of a third object. Part of the work done may go into heating up the third object, so that energy is not conserved by considering only the masses 1 and 2 and the fields they generate. The nonlinear effect due to field energies is conceptually more familiar; we have calculated the fields due to mass sources as a first approximation; the next approximation is to include the first-order fields as sources, and so approach a self-consistent solution.

We shall construct a new stress tensor from our old one by adding a term which shall be derivable from a neglected piece of the Lagrangian, which we denote by F3, by a variation:
$${}^{\rm new}T^{\mu\nu} = {}^{\rm old}T^{\mu\nu} + \chi^{\mu\nu}$$
where:
$$\kappa\,\chi^{\mu\nu} = \frac{\delta F_3[h]}{\delta h_{\mu\nu}}$$
and hope that the difficulties will be removed, at least to a higher order in hµν.

Since we are attempting to construct χµν so that it fixes up the failure of oldTµν to conserve energy, ${}^{\rm old}T^{\mu\nu}{}_{,\nu} \neq 0$, we get a hint as to the structure of χµν by taking the divergence of oldTµν. The divergence of χµν should cancel it out, at least to the first nonvanishing order. To compute this divergence, we first rewrite oldTµν for a moving particle in a new form which looks at first unfamiliar but allows easier manipulations. In terms of an integral over a scalar parameter which might as well be the proper time s (N.B., we also indicate derivatives with respect to s by dots):
$${}^{\rm old}T^{\mu\nu}(x) = m_o\int ds\,\delta^4[x - z(s)]\,\dot z^{\mu}\dot z^{\nu}$$
That this expression for oldTµν is equivalent to what we have used previously may be checked by comparing the corresponding terms of the action:
$$\kappa\int d^4x\;{}^{\rm old}T^{\mu\nu}(x)\,h_{\mu\nu}(x) = \kappa\,m_o\int ds\,h_{\mu\nu}(z)\,\dot z^{\mu}\dot z^{\nu}$$
There is a simple physical way to interpret the significance of the δ-function in the integral for oldTµν above; it simply says that there is no interaction energy except where the particle actually is!

It is perhaps easier to understand how appropriate these forms are by reviewing our familiar electrodynamics in the same language; the interaction term of the Lagrangian is a volume integral of −jµAµ, and jµ is related to the particle velocity by $j^{\mu}(x) = e\int ds\,\delta^4[x - z(s)]\,\dot z^{\mu}$, so that the action is $S_{\rm int} = -e\int ds\,A_\mu(z)\,\dot z^{\mu}$. The parallelism with our gravitational-field expressions (i.e., the integral for oldTµν above and the relation below it) is evident.

We compute the divergence of oldTµν from the integral for it given above. We first recognize that the δ-function is symmetric in x and z, so that the derivatives with respect to xν in the divergence may be replaced by minus derivatives with respect to zν. Then we make use of the identity $\dot z^{\nu}(\partial/\partial z^{\nu})\,f[z(s)] = (d/ds)\,f[z(s)]$ and integrate by parts in s to obtain for the divergence of oldTµν:
$${}^{\rm old}T^{\mu\nu}{}_{,\nu} = m_o\int ds\,\delta^4[x - z(s)]\,\ddot z^{\mu}$$
We see that this divergence is a density of acceleration. At this point we shall assume that we have already included correctly in our Lagrangian all energies other than gravitation, so that the acceleration z̈µ represents the effect of gravity as given by the equation of motion:
$$g_{\mu\sigma}\,\ddot z^{\mu} = -\frac{1}{2}\,(g_{\mu\sigma,\nu} + g_{\nu\sigma,\mu} - g_{\mu\nu,\sigma})\,\dot z^{\mu}\dot z^{\nu} = -\,{}^{z}\Gamma_{\mu\nu,\sigma}\,\dot z^{\mu}\dot z^{\nu}$$
where Γµν,σ are the Christoffel symbols of the first kind and the left-superscript z is to remind us to which variable the indices refer. We now multiply the divergence by gµσ(x), and replace $g_{\mu\sigma}\ddot z^{\mu}$ by $-{}^{z}\Gamma_{\mu\nu,\sigma}\,\dot z^{\mu}\dot z^{\nu}$. We note that because of the δ-function, zΓµν,σ has the same effect as xΓµν,σ. This means that the Christoffel symbol can be pulled out of the integral sign, leaving us an expression involving the divergence and the original tensor:
$$g_{\sigma\tau}(x)\;{}^{\rm old}T^{\tau\nu}{}_{,\nu}(x) = -\Gamma_{\mu\nu,\sigma}(x)\;{}^{\rm old}T^{\mu\nu}(x)$$
This is an exact equation which oldTµν must satisfy. But for now we only use it to first order in h.

We may separate gτσ into its parts ητσ + 2κhτσ, and obtain an equation which tells us that the divergence of oldTµν starts with a linear term in the coupling constant κ:
$${}^{\rm old}T_{\sigma}{}^{\nu}{}_{,\nu} = -\Gamma_{\mu\nu,\sigma}\;{}^{\rm old}T^{\mu\nu} - 2\kappa\,h_{\sigma\tau}\;{}^{\rm old}T^{\tau\nu}{}_{,\nu}$$
since the Christoffel symbol involves derivatives which make the zero-order part ηµν of gµν play no role. When we compare this equation to the requirement that the new tensor newTµν should have zero divergence:
$${}^{\rm new}T^{\mu\nu}{}_{,\nu} = {}^{\rm old}T^{\mu\nu}{}_{,\nu} + \chi^{\mu\nu}{}_{,\nu}$$
and if we assume that χµν itself is bilinear in the fields, we see that the divergence of χµν should be:
$$\chi_{\sigma}{}^{\nu}{}_{,\nu} = \Gamma_{\mu\nu,\sigma}\;{}^{\rm old}T^{\mu\nu} + O(\kappa^2) + \cdots$$
Knowing the divergence does not determine χµν for us. We have an additional requirement in that we expect to deduce χµν from a variation of F3 with respect to hµν. If we construct F3 as a sum over all possible independent products involving trilinear products of field components and two derivative indices, these two requirements determine F3 uniquely. We shall not carry out here the determination of the 18 constants, but quote the result of a lot of hard algebraic labor:
$$F_3 = -\kappa\left[h^{\alpha\beta}\,\bar h^{\gamma\delta}\,h_{\alpha\beta,\gamma\delta} + h_{\gamma}{}^{\beta}\,h^{\gamma\alpha}\,h_{\alpha\beta,\delta}{}^{,\delta} - 2\,h^{\alpha\beta}\,h_{\beta\delta}\,\bar h^{\gamma\delta}{}_{,\alpha\gamma} + 2\,\bar h_{\alpha\beta}\,\bar h^{\alpha\sigma}{}_{,\sigma}\,\bar h^{\beta\tau}{}_{,\tau} + \left(\frac{1}{2}h^{\alpha\beta}h_{\alpha\beta} + \frac{1}{4}h^{\alpha}{}_{\alpha}h^{\beta}{}_{\beta}\right)\bar h^{\sigma\tau}{}_{,\sigma\tau}\right]$$

Formulation of a Theory Correct to all Orders

We have succeeded in the task we had set before us in the beginning, to develop a field theory of gravitation in analogy to our other well-known field theories, which would adequately describe all the known characteristics of gravitational phenomena. Thus, our fictional Venusian viewpoint has been fruitful. There are yet some loose ends in our theory; we might conceive that the hard-working among the Venusian theorists might be dissatisfied with a theory which left third-order effects unspecified, and some of them might pursue the chase after functions F4 and then F5 to be added to the Lagrangian integral to make the theory consistent to higher orders. This approach is an incredibly laborious procedure to calculate unobservable corrections, so we shall not emulate our fictional Venusians in this respect.

We shall therefore search for the complete functional F[h]:
$$F = F_2 + F_3 + F_4 + F_5 + \cdots$$
which shall be defined by the requirement that the resulting equation of motion:
$$\frac{\delta F}{\delta h_{\mu\nu}} = \kappa\,T^{\mu\nu}$$
shall automatically imply the divergence property of Tµν (i.e., $g_{\sigma\tau}\,{}^{\rm old}T^{\tau\nu}{}_{,\nu} = -\Gamma_{\mu\nu,\sigma}\,{}^{\rm old}T^{\mu\nu}$ of the chapter The Bilinear Terms of the Stress-Energy Tensor).

The functional F must therefore satisfy the following differential functional equation:
$$g_{\sigma\tau}\left(\frac{\delta F}{\delta h_{\tau\nu}}\right)_{,\nu} + \Gamma_{\mu\nu,\sigma}\left(\frac{\delta F}{\delta h_{\mu\nu}}\right) = 0$$
which we are to solve. This is in general an exceedingly difficult problem, and there is no procedure (that I know) for generating the solutions. We shall have to rely on our ingenuity in devising functionals which are solutions in the sense that they satisfy δF/δhµν = κTµν above when plugged in. There is no general solution whose leading terms are the F2 and F3 we have deduced by other methods. There is, however, an evident simplest solution (involving the smallest number of derivatives of the gµν – just two). We choose it! When this is done, we shall have arrived at a theory which is identical to Einstein’s! At that point, then, we shall abandon the Venusian point of view and proceed to study the theory of gravitation from the terrestrial point of view developed by Einstein himself.

Invariants and Infinitesimal Transformations

To solve the problem of constructing solutions to satisfy $g_{\sigma\tau}(\delta F/\delta h_{\tau\nu})_{,\nu} + \Gamma_{\mu\nu,\sigma}(\delta F/\delta h_{\mu\nu}) = 0$ (c.f., Formulation of a Theory Correct to all Orders chapter) we shall convert that equation to an equivalent statement of a property of F. We first note that it is a vector equation. If we take the dot product of the equation with an arbitrary vector Aσ(x), and integrate over all space, we deduce an equation which looks a little different:
$$\int d\tau\left[A^{\sigma}(x)\,g_{\sigma\tau}(x)\left(\frac{\delta F}{\delta h_{\tau\nu}}\right)_{,\nu} + A^{\sigma}(x)\,\Gamma_{\tau\nu,\sigma}\left(\frac{\delta F}{\delta h_{\tau\nu}}\right)\right] = 0$$
If F satisfies this for arbitrary Aσ, then the differential functional equation is implied. We may now integrate by parts the first term in the integrand, so as to get rid of the gradient with respect to ν. We deduce that:
$$\int\langle d\tau\rangle\,\frac{\delta F}{\delta h_{\tau\nu}}\left\{-[A^{\sigma}(x)\,g_{\sigma\tau}(x)]_{,\nu} + \Gamma_{\tau\nu,\sigma}\,A^{\sigma}(x)\right\} = 0$$
We have put ‘angular brackets 〈⋅⋅⋅⋅〉’ around dτ to remind us that we are to take the average of the integral and the corresponding integral having τ and ν interchanged: since hτν is symmetric, a meaningful mathematical identity is obtained only when the quantity in braces multiplying δF/δhτν is also symmetric in τ and ν.

We can interpret the equation $\int\langle d\tau\rangle(\delta F/\delta h_{\tau\nu})\{-(A^{\sigma}g_{\sigma\tau})_{,\nu} + \Gamma_{\tau\nu,\sigma}A^{\sigma}\} = 0$ (with gστ = gστ(x) and Aσ = Aσ(x)) in another way. We note that if we make a first order change in h, say let hτν change to hτν + ξτν, the value of F changes as follows:
$$F[h_{\tau\nu} + \xi_{\tau\nu}] = F[h_{\tau\nu}] + \int d\tau\;\xi_{\tau\nu}\,\frac{\delta F}{\delta h_{\tau\nu}} + \cdots$$
Therefore, our equation tells us that for ξτν infinitesimal, and of the form appearing in the braces above, F remains unchanged.

Let the field tensor hµν be changed by an infinitesimal transformation Aσ into the tensor h′µν. We express h′µν according to the rule implied in the braces as follows (N.B., we must remember to symmetrize in τν and to use the explicit expression for Γτν,σ):
$$h'_{\tau\nu} = h_{\tau\nu} - \underline{g_{\tau\sigma}A^{\sigma}{}_{,\nu}} - \frac{1}{2}A^{\sigma}g_{\tau\nu,\sigma}$$
where, recall, the underline denotes the symmetrized product, $\underline{A_\mu B_\nu} = \tfrac{1}{2}(A_\mu B_\nu + A_\nu B_\mu)$. For convenience we let −κAσ = ζσ and write the equation in terms of gτν instead of hτν as follows:
$$g'_{\tau\nu} = g_{\tau\nu} + g_{\tau\sigma}\zeta^{\sigma}{}_{,\nu} + g_{\nu\sigma}\zeta^{\sigma}{}_{,\tau} + \zeta^{\sigma}g_{\tau\nu,\sigma}$$
Our problem is this: To find a form for F, a functional of gµν, such that under this infinitesimal transformation of the gµν to g′µν, the F is unchanged to first order in ζσ for any ζσ(x).
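The step from the integrand of the symmetrized integral to the change in hτν can be spelled out. The lines below are a sketch of the algebra, not shown on the slide; they use only the first-kind Christoffel symbol Γτν,σ = ½(gτσ,ν + gνσ,τ − gτν,σ), the underline for symmetrization in τ and ν, and ξτν as the name for the change in hτν:

```latex
\begin{aligned}
\xi_{\tau\nu} &= -\underline{(A^{\sigma}g_{\sigma\tau})_{,\nu}} + \Gamma_{\tau\nu,\sigma}A^{\sigma}\\
&= -\underline{g_{\tau\sigma}A^{\sigma}{}_{,\nu}}
   - \tfrac{1}{2}A^{\sigma}\,(g_{\sigma\tau,\nu} + g_{\sigma\nu,\tau})
   + \tfrac{1}{2}A^{\sigma}\,(g_{\tau\sigma,\nu} + g_{\nu\sigma,\tau} - g_{\tau\nu,\sigma})\\
&= -\underline{g_{\tau\sigma}A^{\sigma}{}_{,\nu}} - \tfrac{1}{2}A^{\sigma}g_{\tau\nu,\sigma}.
\end{aligned}
```

With gτν = ητν + 2κhτν and ζσ = −κAσ, multiplying ξτν by 2κ gives gτσζσ,ν + gνσζσ,τ + ζσgτν,σ, which is the change in gτν under the transformation.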

We now proceed to find the desired invariant form of F. We shall find it useful to define a matrix which is the reciprocal of gµν, using superscripts as indices rather than subscripts:
$$g^{\mu\nu}g_{\nu\sigma} = \delta^{\mu}{}_{\sigma}$$
The symbol $\delta^{\mu}{}_{\sigma}$ is now a true Kronecker delta, which is 1 if µ = σ, and zero if µ ≠ σ. The reciprocal of a matrix A′ = A + B, if B is infinitesimal, is given by the expression:
$$\frac{1}{A'} = \frac{1}{A} - \frac{1}{A}B\frac{1}{A} + \frac{1}{A}B\frac{1}{A}B\frac{1}{A} - \cdots$$
Since ζσ is infinitesimal, we may easily construct the reciprocal of g′σν according to this last rule:
$$g'^{\alpha\beta} = g^{\alpha\beta} - g^{\alpha\nu}\zeta^{\beta}{}_{,\nu} - g^{\beta\nu}\zeta^{\alpha}{}_{,\nu} - g^{\alpha\tau}g^{\beta\nu}g_{\tau\nu,\sigma}\,\zeta^{\sigma}$$
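The expansion of the reciprocal of A + B can be checked numerically. The following is a minimal sketch with 2×2 matrices; the particular matrices and the helper names are arbitrary choices made for illustration.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def combine(a, b, scale=1.0):
    """Entrywise a + scale * b."""
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def inverse2(m):
    """Exact inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def max_abs_diff(a, b):
    """Largest entrywise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

A = [[2.0, 1.0], [1.0, 3.0]]                 # arbitrary invertible matrix
B = [[3e-4, -2e-4], [5e-4, 1e-4]]            # small perturbation

a_inv = inverse2(A)
exact = inverse2(combine(A, B))
aba = matmul(matmul(a_inv, B), a_inv)        # (1/A) B (1/A)
ababa = matmul(matmul(aba, B), a_inv)        # (1/A) B (1/A) B (1/A)

first_order = combine(a_inv, aba, -1.0)
second_order = combine(first_order, ababa, +1.0)
err1 = max_abs_diff(exact, first_order)      # remaining error is of order B^2
err2 = max_abs_diff(exact, second_order)     # remaining error is of order B^3
```

Keeping one more term of the series reduces the error by roughly another factor of the size of B, as expected for a power series in B.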
We next examine briefly one invariant which may easily be found, in order to
understand the methods, and in the next chapter construct a more complicated invariant
which leads us to our complete theory.

Let us consider how the determinant of a matrix changes as we change the matrix a little. We use the following expression for the determinant:
$$\det A = e^{{\rm Tr}(\ln A)}$$
We notice that it is trivially true in the case that the matrix is in diagonal form:
$$\det A = A_{11}A_{22}A_{33}\cdots = e^{\ln A_{11} + \ln A_{22} + \ln A_{33} + \cdots} = e^{{\rm Tr}(\ln A)}$$
We now apply the above rule to compute the determinant of A + B, where B is an infinitesimal matrix. What we need is the matrix logarithm (i.e., ln ≡ log_e) of A + B; the proper expansion is:
$$\det(A + B) = \det\left[A\left(1 + \frac{1}{A}B\right)\right] = \det A\cdot\det\left(1 + \frac{1}{A}B\right) = \det A\cdot e^{{\rm Tr}\ln\left(1 + \frac{1}{A}B\right)} = \det A\cdot e^{{\rm Tr}\left(\frac{1}{A}B\right)}$$
where the last equality holds to first order in B. We now use this rule to compute the determinant of g′σν and take the logarithm of the resultant expression:
$$\ln(-\det g') = \ln(-\det g) + 2\,\zeta^{\sigma}{}_{,\sigma} + \zeta^{\sigma}g_{\nu\tau,\sigma}\,g^{\nu\tau}$$
The product of g matrices in the last term may be related to the determinant as follows:
$$g^{\nu\tau}g_{\nu\tau,\sigma} = [\ln(-\det g)]_{,\sigma}$$
What we have achieved is a new expression involving ζσ and its gradients together with numbers, not matrices.
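Both determinant statements can be tested on small matrices. The sketch below is illustrative only; the matrices are arbitrary and the helper names are hypothetical.

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix."""
    (a, b), (c, d) = m
    return a * d - b * c

def inverse2(m):
    """Exact inverse of a 2x2 matrix."""
    (a, b), (c, d) = m
    det = det2(m)
    return [[d / det, -b / det], [-c / det, a / det]]

def trace_of_product(a, b):
    """Tr(a b) for square matrices stored as nested lists."""
    n = len(a)
    return sum(a[i][k] * b[k][i] for i in range(n) for k in range(n))

# Diagonal case: det A = exp(Tr ln A) reduces to a product of the diagonal entries.
diagonal = [2.0, 3.0, 5.0]
det_from_logs = math.exp(sum(math.log(x) for x in diagonal))

# Perturbed case: det(A + B) compared with det A * exp(Tr(A^{-1} B)).
A = [[2.0, 1.0], [1.0, 3.0]]
B = [[3e-4, -2e-4], [5e-4, 1e-4]]
exact = det2([[A[i][j] + B[i][j] for j in range(2)] for i in range(2)])
approx = det2(A) * math.exp(trace_of_product(inverse2(A), B))
```

The change in the determinant is first order in B, while the discrepancy between `exact` and `approx` is second order, consistent with the expansion being used only to first order.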

We set:
$$C = \ln(-\det g)$$
and rewrite the resulting equation as:
$$C' = C + 2\,\zeta^{\sigma}{}_{,\sigma} + C_{,\sigma}\,\zeta^{\sigma}$$
If this expression were a pure derivative, we could integrate over all space to obtain our invariant. The form of the last two terms suggests exp(C/2) as an integrating factor. Therefore, we search for an invariant of the form exp(αC′), adjusting α properly. Since ζσ is infinitesimal, an expansion keeping the first term only gives:
$$e^{\alpha C'} = e^{\alpha C + \alpha(2\zeta^{\sigma}{}_{,\sigma} + C_{,\sigma}\zeta^{\sigma})} = e^{\alpha C} + \alpha\,e^{\alpha C}\,(2\,\zeta^{\sigma}{}_{,\sigma} + C_{,\sigma}\,\zeta^{\sigma})$$
The second term of this expression has a form which may be converted to a pure derivative; we note that:
$$(e^{\alpha C}\zeta^{\sigma})_{,\sigma} = e^{\alpha C}\,\zeta^{\sigma}{}_{,\sigma} + \alpha\,C_{,\sigma}\,\zeta^{\sigma}\,e^{\alpha C}$$
which is the same as the second term of the expansion for exp(αC′) above for α = 1/2. When we integrate the expansion for exp(αC′) over all space, for α = 1/2 the integral of the second term vanishes, and we are left with the equality:
$$\int d\tau\;e^{C'/2} = \int d\tau\;e^{C/2}$$
The invariant solution, in terms of the matrix gσν, is therefore:
$${}^{0}F = \int d\tau\,\sqrt{-\det g}$$
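The invariance can be illustrated in a one-dimensional toy version, where the transformation reads g′ = g + 2gζ,x + ζg,x and the invariant is the integral of √g. This is a sketch under assumptions that are not in the slides: a positive metric g(x) = 1 + x² on the interval [0, 1] and a shift ζ(x) = ε sin²(πx) that vanishes at both ends.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0

def g(x):
    """Toy one-dimensional metric, positive everywhere."""
    return 1.0 + x * x

def dg(x):
    """Derivative g_{,x} of the toy metric."""
    return 2.0 * x

def change_in_invariant(eps):
    """Integral of sqrt(g') minus integral of sqrt(g) for zeta = eps * sin^2(pi x)."""
    def zeta(x):
        return eps * math.sin(math.pi * x) ** 2
    def dzeta(x):
        return eps * math.pi * math.sin(2.0 * math.pi * x)
    def g_new(x):
        # one-dimensional form of g' = g + g zeta_{,x} + g zeta_{,x} + zeta g_{,x}
        return g(x) + 2.0 * g(x) * dzeta(x) + zeta(x) * dg(x)
    before = simpson(lambda x: math.sqrt(g(x)), 0.0, 1.0)
    after = simpson(lambda x: math.sqrt(g_new(x)), 0.0, 1.0)
    return after - before

small = change_in_invariant(1e-3)
double = change_in_invariant(2e-3)
```

The change in the integral is far smaller than ε and roughly quadruples when ε is doubled, so it is of second order: the first-order change is a pure derivative and integrates to zero.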

Lagrangian of the Theory Correct to all Orders

The invariant 0F of the previous chapter is indeed a solution of the differential functional equation $g_{\sigma\tau}(\delta F/\delta h_{\tau\nu})_{,\nu} + \Gamma_{\mu\nu,\sigma}(\delta F/\delta h_{\mu\nu}) = 0$ (c.f., Formulation of a Theory Correct to all Orders chapter) but is not the solution needed for our theory, since it involves no derivatives. In this chapter, we shall construct the solution needed by our theory by an analogous method. The success of the trickery hinges on finding a perfect divergence which can be integrated over all space.

The starting point is again the equation $g'_{\tau\nu} = g_{\tau\nu} + g_{\tau\sigma}\zeta^{\sigma}{}_{,\nu} + g_{\nu\sigma}\zeta^{\sigma}{}_{,\tau} + \zeta^{\sigma}g_{\tau\nu,\sigma}$ (c.f., Invariants and Infinitesimal Transformations chapter) involving ζσ and its first derivatives. The principle is this: We wish to find combinations of the gµν and their derivatives which involve no ζ (or at least a perfect differential) when transformed. We have in this equation first derivatives of ζ. If we form g′µν,τ, second derivatives like $\zeta^{\sigma}{}_{,\tau\nu}$ appear, &c. It looks like ever increasing complexity. But if the highest derivative is $\zeta^{\sigma}{}_{,\tau\nu}$ and it occurs in only one term, isolated, we can eliminate it by subtracting the term with τ, ν reversed. To begin then we form g′µν,τ, which yields second derivatives of ζ of the form $\zeta^{\sigma}{}_{,\tau\nu}$, but there are two of them, $\zeta^{\sigma}{}_{,\nu\tau}$ and $\zeta^{\sigma}{}_{,\mu\tau}$. We try to get it to one by combining it with other derivatives like g′µτ,ν. It turns out that we can get rid of two terms, but an equal number of new terms appears, so no simplification is achieved. But when we consider the third possible ordering, g′τν,µ, then we obtain by additions and subtractions a new equation in which two of the terms can be added, because they are the same. One difficulty is that as we take derivatives of products the number of terms increases so much, e.g.:
$$g'_{\mu\nu,\tau} = g_{\mu\nu,\tau} + g_{\mu\sigma,\tau}\zeta^{\sigma}{}_{,\nu} + g_{\nu\sigma,\tau}\zeta^{\sigma}{}_{,\mu} + g_{\mu\sigma}\zeta^{\sigma}{}_{,\nu\tau} + g_{\nu\sigma}\zeta^{\sigma}{}_{,\mu\tau} + \zeta^{\sigma}g_{\mu\nu,\sigma\tau} + \zeta^{\sigma}{}_{,\tau}\,g_{\mu\nu,\sigma}$$

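But when all is added and subtracted and all indices flipped, and symmetrized we find:
$$\Gamma'_{\mu\nu,\tau} = \Gamma_{\mu\nu,\tau} + \Gamma_{\mu\sigma,\tau}\zeta^{\sigma}{}_{,\nu} + \Gamma_{\sigma\nu,\tau}\zeta^{\sigma}{}_{,\mu} + \Gamma_{\mu\nu,\sigma}\zeta^{\sigma}{}_{,\tau} + \zeta^{\sigma}\Gamma_{\mu\nu,\tau,\sigma} + g_{\tau\sigma}\zeta^{\sigma}{}_{,\mu\nu}$$
so only one $\zeta^{\sigma}{}_{,\mu\nu}$ appears. We must now get the gτσ away by multiplying by the reciprocal matrix. First we introduce a new notation that simplifies the handling. We let:
$$g^{\tau\lambda}\Gamma_{\mu\nu,\tau} = \frac{1}{2}\,g^{\tau\lambda}\,(g_{\mu\tau,\nu} + g_{\nu\tau,\mu} - g_{\mu\nu,\tau}) = \Gamma^{\lambda}{}_{\mu\nu}$$
which is the Christoffel symbol of the second kind (or holonomic connection coefficients). If we multiply Γ′µν,τ by gτλ in order to isolate the remaining second derivative, the equation becomes, in terms of the Christoffel symbols of the second kind:
$$\Gamma'^{\lambda}{}_{\mu\nu} = \Gamma^{\lambda}{}_{\mu\nu} + \Gamma^{\lambda}{}_{\mu\sigma}\zeta^{\sigma}{}_{,\nu} + \Gamma^{\lambda}{}_{\sigma\nu}\zeta^{\sigma}{}_{,\mu} - \Gamma^{\sigma}{}_{\mu\nu}\zeta^{\lambda}{}_{,\sigma} + \zeta^{\sigma}\Gamma^{\lambda}{}_{\mu\nu,\sigma} + \zeta^{\lambda}{}_{,\mu\nu}$$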
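This is automatically symmetric in µν. To go further we differentiate again. If we differentiate this last equation with respect to a new index ρ, and subtract the corresponding equation having ρ and ν interchanged, only the following terms are not cancelled in the subtraction:
$$\begin{aligned}\Gamma'^{\lambda}{}_{\mu\nu,\rho} - \Gamma'^{\lambda}{}_{\mu\rho,\nu} = {}&\Gamma^{\lambda}{}_{\mu\nu,\rho} + \Gamma^{\lambda}{}_{\mu\sigma,\rho}\zeta^{\sigma}{}_{,\nu} + \Gamma^{\lambda}{}_{\sigma\nu,\rho}\zeta^{\sigma}{}_{,\mu} - \Gamma^{\sigma}{}_{\mu\nu,\rho}\zeta^{\lambda}{}_{,\sigma} + \Gamma^{\lambda}{}_{\mu\nu,\sigma}\zeta^{\sigma}{}_{,\rho}\\ &+ \zeta^{\sigma}\Gamma^{\lambda}{}_{\mu\nu,\sigma\rho} + \Gamma^{\lambda}{}_{\sigma\nu}\zeta^{\sigma}{}_{,\mu\rho} - \Gamma^{\sigma}{}_{\mu\nu}\zeta^{\lambda}{}_{,\sigma\rho} - (\nu\leftrightarrow\rho)\end{aligned}$$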
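As a concrete instance of the Christoffel symbol of the second kind, Γλµν = ½gλτ(gµτ,ν + gντ,µ − gµν,τ), the sketch below evaluates it by finite differences. The example metric (the unit 2-sphere with coordinates θ, φ), the step size and the function names are assumptions made for this illustration.

```python
import math

def metric(x):
    """Metric of the unit 2-sphere in coordinates x = (theta, phi): diag(1, sin^2 theta)."""
    return [[1.0, 0.0], [0.0, math.sin(x[0]) ** 2]]

def inverse2(m):
    """Exact inverse of a 2x2 matrix (the reciprocal matrix g^{mu nu})."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def d_metric(x, k, step=1e-5):
    """Central-difference estimate of g_{ij,k}, the derivative along coordinate k."""
    xp = list(x)
    xm = list(x)
    xp[k] += step
    xm[k] -= step
    gp, gm = metric(xp), metric(xm)
    return [[(gp[i][j] - gm[i][j]) / (2.0 * step) for j in range(2)] for i in range(2)]

def christoffel(x):
    """Gamma^lam_{mu nu} = (1/2) g^{lam tau} (g_{mu tau,nu} + g_{nu tau,mu} - g_{mu nu,tau})."""
    n = 2
    g_inv = inverse2(metric(x))
    dg = [d_metric(x, k) for k in range(n)]   # dg[k][i][j] = g_{ij,k}
    gamma = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for lam in range(n):
        for mu in range(n):
            for nu in range(n):
                gamma[lam][mu][nu] = 0.5 * sum(
                    g_inv[lam][tau] * (dg[nu][mu][tau] + dg[mu][nu][tau] - dg[tau][mu][nu])
                    for tau in range(n)
                )
    return gamma

theta = 1.0
G = christoffel([theta, 0.3])   # G[lam][mu][nu], with index 0 = theta and 1 = phi
```

For this metric the only nonzero symbols are Γθφφ = −sin θ cos θ and Γφθφ = Γφφθ = cot θ, and the numerical values reproduce them, including the symmetry in the two lower indices.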

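The trick is now to get rid of the two double derivatives. These come multiplied by Γs. But in the equation for $\Gamma'^{\lambda}{}_{\mu\nu}$ above we have an expression which yields just $\zeta^{\sigma}{}_{,\mu\nu}$. We use it to supply counter terms to those in our last equation for $\Gamma'^{\lambda}{}_{\mu\nu,\rho} - \Gamma'^{\lambda}{}_{\mu\rho,\nu}$. This can be accomplished by taking the product of two equations such as that for $\Gamma'^{\lambda}{}_{\mu\nu}$ above; we see that the indices in the Γs of one term are the same as those of the ζ in the other term, so that by taking a product of two $\Gamma'^{\lambda}{}_{\mu\nu}$ equations, one having the set of indices λρσ, the other σµν, replacing λµν, and adding to the equation for $\Gamma'^{\lambda}{}_{\mu\nu,\rho} - \Gamma'^{\lambda}{}_{\mu\rho,\nu}$, the second derivatives cancel.

We introduce a new quantity $R^{\lambda}{}_{\mu\nu\rho}$ defined as follows:
$$R^{\lambda}{}_{\mu\nu\rho} = \Gamma^{\lambda}{}_{\mu\nu,\rho} + \Gamma^{\lambda}{}_{\rho\sigma}\Gamma^{\sigma}{}_{\mu\nu} - \Gamma^{\lambda}{}_{\mu\rho,\nu} - \Gamma^{\lambda}{}_{\nu\sigma}\Gamma^{\sigma}{}_{\mu\rho}$$
Note that this tensor is explicitly antisymmetric in ρ and ν. In terms of this, the equation we finally obtain is:
$$R'^{\lambda}{}_{\mu\nu\rho} = R^{\lambda}{}_{\mu\nu\rho} + R^{\lambda}{}_{\mu\sigma\rho}\zeta^{\sigma}{}_{,\nu} + R^{\lambda}{}_{\mu\nu\sigma}\zeta^{\sigma}{}_{,\rho} + R^{\lambda}{}_{\sigma\nu\rho}\zeta^{\sigma}{}_{,\mu} - R^{\sigma}{}_{\mu\nu\rho}\zeta^{\lambda}{}_{,\sigma} + \zeta^{\sigma}R^{\lambda}{}_{\mu\nu\rho,\sigma}$$
What must now be done is to treat this last equation as we treated $g'_{\tau\nu} = g_{\tau\nu} + g_{\tau\sigma}\zeta^{\sigma}{}_{,\nu} + g_{\nu\sigma}\zeta^{\sigma}{}_{,\tau} + \zeta^{\sigma}g_{\tau\nu,\sigma}$ (c.f., Invariants and Infinitesimal Transformations chapter), which is of the same form, except that it is the tensor $R^{\lambda}{}_{\mu\nu\rho}$ which is involved rather than gτν. A procedure entirely analogous to the one used before leads to the answer for the invariant quantity F:
$$F = -\frac{1}{2\kappa^2}\int d\tau\;g^{\mu\nu}R^{\lambda}{}_{\mu\nu\lambda}\,\sqrt{-\det(g_{\alpha\beta})}$$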

Einstein Equation for the Stress-Energy Tensor

The functional $F = -[1/(2\kappa^2)]\int d\tau\,g^{\mu\nu}R^{\lambda}{}_{\mu\nu\lambda}\sqrt{-\det(g_{\alpha\beta})}$, which we have just deduced in the last chapter, results in a Venusian theory of gravitation which is identical to that developed by Einstein (N.B., it is commonly written S_E = −[1/(2κ²)]∫dτ R√(−g) with κ² = 8πG_N/c⁴). If we make an expansion of the functional F when the gravitational fields are weak, we obtain as the leading terms the F2 and F3 functionals of our earlier theory. We may say therefore that our Venusian viewpoint has succeeded in its aim to construct a self-consistent theory of gravitation by means of successive logical steps guessed at by analogy but without apparently demanding a superhumanly keen intuition.

Einstein himself, of course, arrived at the same Lagrangian but without the help of a developed field theory, and I [Feynman talking here] must admit that I have no idea of how he ever guessed at the final result. We have had troubles enough in arriving at the theory – but I feel as though he had done it while swimming underwater, blindfolded, and with his hands tied behind his back! Nevertheless, now that we have arrived at an equivalent theory, we shall abandon the Venusian viewpoint and discuss the terrestrial point of view due to Einstein.

We shall use the following standard notation for three tensors deduced from $R^{\lambda}{}_{\mu\nu\rho}$ by multiplying by gαβ and contracting:
$$g_{\tau\lambda}R^{\lambda}{}_{\mu\nu\rho} = R_{\tau\mu\nu\rho}\ \ (\text{antisymmetric in }\tau\mu\text{ and in }\nu\rho),\qquad R^{\lambda}{}_{\mu\nu\lambda} = R_{\mu\nu},\qquad\text{and}\qquad g^{\mu\nu}R_{\mu\nu} = R$$
The quantity Rτµνρ is a tensor (i.e., the Riemann tensor). It is antisymmetric under an exchange of ν and ρ, also antisymmetric under an interchange of τ and µ, and symmetric if the pair τµ is interchanged with the pair νρ. Rµν (i.e., the Ricci tensor) is symmetric.

The variation of the functional F = −[1/(2κ²)]∫dτ √(−g) R (where g = det(gµν)) with respect to gµν yields (c.f., δF/δhµν = κTµν ‒ Formulation of a Theory Correct to all Orders chapter):
$$\frac{\delta F}{\delta g_{\mu\nu}} = -\frac{1}{2\kappa^2}\,\frac{\delta(\sqrt{-g}\,R)}{\delta g_{\mu\nu}} = \frac{1}{2\kappa^2}\,\sqrt{-g}\left(R^{\mu\nu} - \frac{1}{2}g^{\mu\nu}R\right)$$
The last quantity is the stress-energy tensor of our theory, and it satisfies the following equation:
$$g_{\sigma\tau}T^{\tau\nu}{}_{,\nu} = -\Gamma_{\mu\nu,\sigma}T^{\mu\nu}\qquad\text{or}\qquad T^{\sigma\nu}{}_{,\nu} = -\Gamma^{\sigma}{}_{\mu\nu}T^{\mu\nu}$$
if substituted for Tµν, as we required it to do. That is, the full equations of the gravitational field to all orders are:
$$\sqrt{-g}\left(R^{\mu\nu} - \frac{1}{2}g^{\mu\nu}R\right) = \kappa^2\,T^{\mu\nu}$$
where Tµν is our matter energy tensor. This is the equation Einstein obtained. QED

The Connection between Curvature and Matter

We can develop formulæ that allow us to compare the physically meaningful
difference between a tensor at one point and its value at neighboring points, that is,
formulæ describing the variation of the tensor in a way which cancels out the changes in
components induced by arbitrary choices of coordinates. For example, we compare a vector at
xµ to another at a point an infinitesimal displacement dxµ away by carrying one of the
vectors, keeping it constant (i.e., keeping it parallel to itself), to the other point. Doing so,
we eventually end up with the curvature tensor (c.f., PART II – MODERN PHYSICS):

$$ A^{\lambda}{}_{;\nu\tau} - A^{\lambda}{}_{;\tau\nu} = R^{\lambda}{}_{\mu\nu\tau}\,A^{\mu} $$

where we use a semicolon to denote covariant differentiation (commas are for gradients).
The fact that the curvature tensor appears as connecting the second covariant
derivatives serves as a clue that enables us to give another useful geometrical picture of
curvature. The noncommuting property of the second derivative represents a limit of
differences in the vector as we move first a displacement along ν and then along τ, or
first along τ and then along ν. If the coordinates are flat, for a constant vector there is no
difference. If we have curved space, as we take these displacements in different orders
we find a different resulting vector. We can understand how the gradients of the
Christoffel symbols Γ get into the act. Carrying out the algebra, we arrive at the result:

$$ A^{\lambda}{}_{;\nu\tau} - A^{\lambda}{}_{;\tau\nu} = \left(\Gamma^{\lambda}{}_{\mu\nu,\tau} - \Gamma^{\lambda}{}_{\mu\tau,\nu} + \Gamma^{\lambda}{}_{\tau\rho}\Gamma^{\rho}{}_{\mu\nu} - \Gamma^{\lambda}{}_{\nu\rho}\Gamma^{\rho}{}_{\mu\tau}\right)A^{\mu} = R^{\lambda}{}_{\mu\nu\tau}\,A^{\mu} \qquad \text{QED} $$

The curvature tensor Rλµντ itself may be shown to satisfy the Bianchi identity:

$$ R^{\lambda}{}_{\mu\nu\tau;\rho} + R^{\lambda}{}_{\mu\tau\rho;\nu} + R^{\lambda}{}_{\mu\rho\nu;\tau} = 0 $$

Connecting now this curvature tensor to the sources of gravity (i.e., matter and energy)
is done by contracting the first and last indices of the curvature tensor to obtain the Ricci
tensor (i.e., Rµν =Rλµνλ). This is the only way to contract the curvature tensor once. Now,
the generalized law of energy and momentum conservation says that the contracted
covariant derivative, that is, the covariant divergence of the stress-energy tensor, must
be zero (i.e., Tµν;ν =0). We now search for a form involving the Ricci tensor in such a way
that its contracted covariant derivative is identically zero. The answer comes from
contracting the Bianchi identity above twice. The contraction in the indices λτ results in
an expression involving the Ricci tensors (the middle term changes sign because the
curvature tensor is antisymmetric in its last two indices):

$$ R_{\mu\nu;\rho} - R_{\mu\rho;\nu} + R^{\lambda}{}_{\mu\rho\nu;\lambda} = 0 $$

Contracting now in µν we get:

$$ R_{;\rho} - R^{\mu}{}_{\rho;\mu} - R^{\lambda}{}_{\rho;\lambda} = 0 $$

So that the tensor quantity which has zero covariant divergence is:

$$ \left(R^{\lambda}{}_{\rho} - \frac{1}{2}\,g^{\lambda}{}_{\rho}\,R\right)_{;\lambda} = 0 $$

It was Einstein’s guess that this quantity is precisely the stress-energy tensor. To write
this in Einstein's form, we simply raise one index to write a doubly contravariant tensor:

$$ G^{\mu\nu} = R^{\mu\nu} - \frac{1}{2}\,g^{\mu\nu}R = \kappa^{2}\,T^{\mu\nu} \qquad \text{and} \qquad G^{\mu\nu}{}_{;\nu} = 0 $$

This is the complete law of gravitation of Einstein. Gµν is called the Einstein tensor.
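The step from the twice-contracted Bianchi identity to the divergence-free combination can be spelled out as follows. This is a short worked step, assuming only that the two contracted Ricci terms differ by a relabeling of the dummy index and that the mixed metric is the Kronecker delta:

```latex
\begin{aligned}
0 &= R_{;\rho} - R^{\mu}{}_{\rho;\mu} - R^{\lambda}{}_{\rho;\lambda}
   = R_{;\rho} - 2\,R^{\lambda}{}_{\rho;\lambda} \\[4pt]
\Longrightarrow\quad
0 &= R^{\lambda}{}_{\rho;\lambda} - \tfrac{1}{2}\,\delta^{\lambda}{}_{\rho}\,R_{;\lambda}
   = \left(R^{\lambda}{}_{\rho} - \tfrac{1}{2}\,\delta^{\lambda}{}_{\rho}\,R\right)_{;\lambda}
\end{aligned}
```

The factor of one half in the Einstein tensor is therefore fixed by the Bianchi identity itself and is not a free choice.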

The Field Equations of Gravity

We have discovered a tensor, called the curvature tensor, defined by what happens as
we carry a vector about in our space. Since it is a tensor, we can use it to generate
quantities to be used in writing down covariant equations. We get no physics by just
writing down covariant equations, however; we must specify the connection of the
equations to the real world of matter. What Einstein did was simply to guess what the
connection was. There is no way to deduce the connection from more fundamental principles.

The physical hint that may help us is that gravity couples to the energy density, so that,
since the energy density in relativity is the 44 component of a second rank tensor, we
need a second rank tensor in the equations. The curvature is a fourth rank tensor, so
we contract it once and use the Ricci tensor. It was Einstein’s first guess that the
stress-energy tensor was simply equal to the Ricci tensor, κ²Tµν =Rµν. However, another
choice is possible; we can add to the Ricci tensor a tensor multiple of the scalar
curvature R (i.e., the contracted Ricci tensor). This is what Einstein finally chose:

$$ R_{\mu\nu} - \frac{1}{2}\,g_{\mu\nu}R = \kappa^{2}\,T_{\mu\nu} $$

There is good reason why this choice is better. If we take the covariant divergence of
this equation, the answer is identically zero. This means that the law of conservation of
energy is a consequence simply of the form of the equation above.

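To write down an action principle relativistically, we need an integral which is a scalar
invariant. We choose the action for the gravitational field to be:

$$ S_g \equiv S_E = -\frac{1}{2\kappa^{2}}\int d^4x\,\sqrt{-g}\,R = -\frac{1}{2\kappa^{2}}\int d^4x\,\mathcal{R} \qquad (\mathcal{R} \equiv \sqrt{-g}\,R) $$

where κ² =8πGN, d4x=dxdydzdt, R=gµνRµν (N.B., Rλµνλ=Rµν and gλτ Rλµνρ=Rτµνρ) and g=
det gµν. The action Sg is a scalar because R is a scalar and √(−g)d4x is a scalar. The
curvature tensor Rµν appears when we take the variation of Sg with respect to gµν:

$$ \frac{\delta S_g}{\delta g_{\mu\nu}} = \frac{1}{2\kappa^{2}}\sqrt{-g}\left(R^{\mu\nu} - \frac{1}{2}\,g^{\mu\nu}R\right) $$

It is because of this that we can use the integral of R as the action of the gravitational
part of the complete problem.
We now write down the integrand of the action for the fields directly in terms of the
metric tensor gµν :

$$ -\sqrt{-g}\,R = -\frac{\sqrt{-g}}{4}\,g_{\nu\lambda,\sigma}\,g_{\mu\rho,\tau}\left(g^{\nu\lambda}g^{\mu\rho}g^{\sigma\tau} - g^{\nu\mu}g^{\lambda\rho}g^{\sigma\tau} + 2\,g^{\nu\mu}g^{\lambda\tau}g^{\sigma\rho} - 2\,g^{\tau\mu}g^{\rho\sigma}g^{\nu\lambda}\right) + \left[\sqrt{-g}\left(g^{\sigma\nu}g^{\rho\mu} - g^{\mu\nu}g^{\rho\sigma}\right)g_{\nu\sigma,\mu}\right]_{,\rho} $$

It will also be sufficient to write down the action as the integral of the first term:

$$ \delta S_g = -\frac{1}{2\kappa^{2}}\,\delta\!\int d^4x\,\sqrt{-g}\,g^{\mu\nu}\left(\Gamma^{\rho}{}_{\nu\sigma}\Gamma^{\sigma}{}_{\rho\mu} - \Gamma^{\rho}{}_{\mu\nu}\Gamma^{\sigma}{}_{\rho\sigma}\right) $$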

We now want to show that if the functional:

$$ S_g = \int d^4x\,\Sigma[g_{\mu\nu}] $$

is invariant under coordinate transformations, then the covariant divergence of the
variation of Sg with respect to gµν is identically zero. Under the infinitesimal
transformation to contravariant primed coordinates, xµ →x′µ:

$$ x^{\mu} \rightarrow x'^{\mu} + h^{\mu}(x') $$

the change of gµν is given by:

$$ g_{\mu\nu}(x) \rightarrow g'_{\mu\nu}(x') = g_{\mu\nu}(x') + h^{\rho}{}_{,\nu}\,g_{\mu\rho}(x') + h^{\rho}{}_{,\mu}\,g_{\rho\nu}(x') + h^{\rho}\,g_{\mu\nu,\rho}(x') $$

In terms of the new coordinates, dropping the primes on the integration variables, the
invariant action is:

$$ S_g = \int d^4x'\,\Sigma[g'_{\mu\nu}] = \int d^4x\,\Sigma[g_{\mu\nu}] + \int d^4x\,\frac{\delta\Sigma}{\delta g_{\mu\nu}}\left(h^{\rho}{}_{,\nu}\,g_{\mu\rho} + h^{\rho}{}_{,\mu}\,g_{\rho\nu} + h^{\rho}\,g_{\mu\nu,\rho}\right) $$

When we do an integration by parts on the second term of this expression, we convert it
to an expression involving the functional derivatives of the function Σ. We set it equal to
zero, since we know that the change in the action must be zero for any hρ, because of
the character of Σ:

$$ \frac{\partial}{\partial x^{\mu}}\left(\frac{\delta\Sigma}{\delta g_{\mu\nu}}\,g_{\rho\nu}\right) - \frac{1}{2}\,\frac{\delta\Sigma}{\delta g_{\mu\nu}}\,\frac{\partial g_{\mu\nu}}{\partial x^{\rho}} = 0 $$
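The integration by parts that leads to the condition on Σ can be written out in three lines. This is a sketch that assumes the displacement hρ vanishes on the boundary of the integration region, so that surface terms drop out, and that δΣ/δgµν is symmetric in µν:

```latex
\begin{aligned}
0 = \delta S_g
 &= \int d^4x\,\frac{\delta\Sigma}{\delta g_{\mu\nu}}
    \left(h^{\rho}{}_{,\nu}\,g_{\mu\rho} + h^{\rho}{}_{,\mu}\,g_{\rho\nu} + h^{\rho}\,g_{\mu\nu,\rho}\right) \\
 &= \int d^4x\,\frac{\delta\Sigma}{\delta g_{\mu\nu}}
    \left(2\,h^{\rho}{}_{,\mu}\,g_{\rho\nu} + h^{\rho}\,g_{\mu\nu,\rho}\right) \\
 &= -2\int d^4x\;h^{\rho}\left[\frac{\partial}{\partial x^{\mu}}
    \left(\frac{\delta\Sigma}{\delta g_{\mu\nu}}\,g_{\rho\nu}\right)
    - \frac{1}{2}\,\frac{\delta\Sigma}{\delta g_{\mu\nu}}\,
      \frac{\partial g_{\mu\nu}}{\partial x^{\rho}}\right]
\end{aligned}
```

Since hρ is arbitrary, the square bracket must vanish at every point, which is the stated condition.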

Let us denote by 𝒢µν the variation of 2κ²Sg with respect to gµν:

$$ \mathcal{G}^{\mu\nu} = 2\kappa^{2}\,\frac{\delta S_g}{\delta g_{\mu\nu}} $$

The quantity 𝒢µν is a contravariant tensor density of second rank. With this definition,
the ∂[(δΣ/δgµν)gνρ]/∂xµ − ½(δΣ/δgµν)(∂gµν /∂xρ)=0 equation above becomes:

$$ \left(g_{\rho\mu}\,\mathcal{G}^{\mu\nu}\right)_{,\nu} - \frac{1}{2}\,g_{\mu\nu,\rho}\,\mathcal{G}^{\mu\nu} = 0 $$

which is equivalent to the statement that the covariant divergence of 𝒢µν is zero:

$$ \mathcal{G}^{\mu\nu}{}_{;\nu} = 0 $$

In the Formulation of a Theory Correct to all Orders chapter, we worked with functional
equations (c.f., gτσ (δF/δhτν ),ν +Γµν,σ (δF/δhµν )=0) of the same form as Sg =−[1/(2κ²)]∫d4x
⋅R√(−g) above; the solutions to these equations are really tensor densities and not
tensors. The stress-energy tensor density 𝒯µν satisfies the equation:

$$ \mathcal{T}^{\mu\nu}{}_{,\nu} = -\Gamma^{\mu}{}_{\sigma\rho}\,\mathcal{T}^{\sigma\rho} $$

where:

$$ \mathcal{T}^{\mu\nu} = \sqrt{-g}\;T^{\mu\nu} $$

but the stress-energy tensor Tµν satisfies the following:

$$ T^{\mu\nu}{}_{,\nu} = -\Gamma^{\mu}{}_{\sigma\rho}\,T^{\sigma\rho} - \frac{1}{2g}\,g_{,\tau}\,T^{\mu\tau} $$

Classical Particles in a Gravitational Field

We discuss how one writes down a general law of physics, one which describes not only
the gravity fields, but also the matter. We assume that it can be deduced from a principle
of least action; the mathematical statement is that the variation of the action is zero:

$$ \delta S = \delta\!\int d^4x\,\mathcal{L}(g_{\mu\nu}, A_{\mu}, \ldots) = 0 $$

The Lagrangian density ℒ contains various kinds of fields, for example, the gravity tensor
field gµν, the electromagnetic field Aµ, and, if matter is scalar, a scalar matter field φ. When
we vary this action with respect to the various fields, we get the equations of propagation
for the corresponding fields. We have written down one piece of this action; let us denote
what is left over by a quantity Sm which depends on the matter fields φ and electromagnetic
fields Aµ and all other fields that we know of. When we take the variation of:

$$ S = S_g + S_m = -\frac{1}{2\kappa^{2}}\int d^4x\,\sqrt{-g}\,R + S_m $$

with respect to gµν , we get the following equation:

$$ \frac{\delta S}{\delta g_{\mu\nu}} = \frac{1}{2\kappa^{2}}\sqrt{-g}\left(R^{\mu\nu} - \frac{1}{2}\,g^{\mu\nu}R\right) + \frac{\delta S_m}{\delta g_{\mu\nu}} $$

The stress-energy tensor density of matter 𝒯µν must be the variational derivative of Sm:

$$ \mathcal{T}^{\mu\nu} = -2\,\frac{\delta S_m}{\delta g_{\mu\nu}} $$

if 𝒯µν is to be the source of the gravitational field.
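To see why the factor −2 is the right normalization, one can set the total variation to zero and substitute the definition of the matter tensor density. The lines below are a worked step under the sign conventions used on this slide, with the density written as √(−g) times the tensor:

```latex
\begin{aligned}
0 = \frac{\delta S}{\delta g_{\mu\nu}}
  &= \frac{1}{2\kappa^{2}}\sqrt{-g}\left(R^{\mu\nu} - \tfrac{1}{2}\,g^{\mu\nu}R\right)
     - \tfrac{1}{2}\,\mathcal{T}^{\mu\nu} \\[4pt]
\Longrightarrow\quad
\sqrt{-g}\left(R^{\mu\nu} - \tfrac{1}{2}\,g^{\mu\nu}R\right) &= \kappa^{2}\,\mathcal{T}^{\mu\nu}
\quad\Longrightarrow\quad
R^{\mu\nu} - \tfrac{1}{2}\,g^{\mu\nu}R = \kappa^{2}\,T^{\mu\nu}
\qquad \left(\mathcal{T}^{\mu\nu} = \sqrt{-g}\,T^{\mu\nu}\right)
\end{aligned}
```

This reproduces the field equations of gravity, so the action principle and Einstein's guess agree.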

As a starting point for the action term Sm, it is useful to consider classical limits. A
convenient method to generate such generalized formulæ is to go back to the locally
falling (freely falling) tangent coordinate system, and figure out how to add in factors of
gµν and Rµν to make the thing invariant. For example, a free particle under no force has
an action:

$$ S_m = -\frac{m_o}{2}\int ds\,\frac{dz^{\mu}}{ds}\,\frac{dz_{\mu}}{ds} $$

This example illustrates the procedure for guessing; it has usually been found to be
fruitful. We write down things as they are in flat coordinates, change to curvilinear
coordinates, and see where the gµνs come in. It is often obvious which general form will
reduce to the flat space results. If zµ(s) is the orbit of a particle, freely falling, the term Sm
of the action is:

$$ S_m = -\frac{m_o}{2}\int d^4x\int ds\;\delta^4[x - z(s)]\;g_{\mu\nu}\,\frac{dz^{\mu}}{ds}\,\frac{dz^{\nu}}{ds} $$

The stress-energy tensor density 𝒯µν is obtained by variation of this Sm with respect to
gµν , which gives:

$$ \mathcal{T}^{\mu\nu} = m_o\int ds\;\delta^4[x - z(s)]\;\frac{dz^{\mu}}{ds}\,\frac{dz^{\nu}}{ds} $$

The analogy to results in electromagnetism is so strong that this result is not
surprising. There are no troubles of inconsistency with this 𝒯µν ; since we have
started from an invariant form, this 𝒯µν satisfies the covariant divergence condition.

Matter Fields in a Gravitational Field

Next, we prepare the passage to quantum theory. If scalar particles are described by a
scalar field φ, then the appropriate action term in flat coordinates is:

$$ S_m = \frac{1}{2}\int d^4x\left(\phi^{,\nu}\phi_{,\nu} - m^{2}\phi^{2}\right) $$

The generalization to curvilinear coordinates may be easily made; we guess that:

$$ S_m = \frac{1}{2}\int d^4x\,\sqrt{-g}\left(g^{\mu\nu}\phi_{,\mu}\phi_{,\nu} - m^{2}\phi^{2}\right) $$

This form is evidently invariant under arbitrary coordinate transformations, which is one
requirement, and it reduces to the flat result in flat space. We can, however, write down
other terms which are perfectly good invariants, quadratic in the fields φ, involving the
curvature; the action could contain proportions α and β of the following terms:

$$ -\alpha\int d^4x\,\sqrt{-g}\,R\,\phi^{2} + \beta\int d^4x\,\sqrt{-g}\,R^{\mu\nu}\phi_{,\mu}\phi_{,\nu} $$

We see that the action we write down is not unique. The first term we have written
down must be there, since it is the only one that reduces to the flat result. Now, since
derivatives are more complicated things than just fields, the β term is more complicated
because it contains four derivatives, two in the fields and two in the Rµν. The α term
has only two, however, both on the gµν field. I [Feynman] do not know of any
satisfactory way to determine α, and feel that the action for a scalar field is ambiguous.
The significance of a term like the β one has to do with whether a particle can feel the
gravity field over a region large enough to sense the local curvature.

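Let us proceed to obtain the equations of motion of the matter field φ. Starting from Sm
=½∫d4x √(−g)(gµν φ,µφ,ν −m²φ²) above, we may use the following variations of the
reciprocal matrix and square root of the determinant:

$$ \delta g^{\mu\nu} = -g^{\mu\sigma}g^{\nu\rho}\,\delta g_{\rho\sigma} \qquad \text{and} \qquad \delta\left(\sqrt{-g}\right) = \frac{1}{2}\sqrt{-g}\;g^{\rho\sigma}\,\delta g_{\rho\sigma} $$

to get the following expression for 𝒯µν :

$$ \mathcal{T}^{\mu\nu} = -2\,\frac{\delta S_m}{\delta g_{\mu\nu}} = \sqrt{-g}\,\phi^{;\mu}\phi^{;\nu} - \frac{1}{2}\sqrt{-g}\,g^{\mu\nu}\left(\phi^{;\rho}\phi_{;\rho} - m^{2}\phi^{2}\right) - 2\alpha\,\phi^{2}\sqrt{-g}\left(R^{\mu\nu} - \frac{1}{2}\,g^{\mu\nu}R\right) - 4\alpha\,\phi\,\phi_{,\sigma}\,\frac{\delta\left(\sqrt{-g}\,R\right)}{\delta g_{\mu\nu,\sigma}} $$

Next, we vary with respect to the field φ and set the variation equal to zero, to obtain
something which is an analogue of the Klein-Gordon equation:

$$ \left(\sqrt{-g}\,g^{\mu\nu}\phi_{,\nu}\right)_{,\mu} + \sqrt{-g}\,m^{2}\phi + 2\alpha\sqrt{-g}\,R\,\phi = 0 $$

We obtain an equation in which tensors appear by dividing through by the scalar density
√(−g):

$$ \frac{1}{\sqrt{-g}}\left(\sqrt{-g}\,g^{\mu\nu}\phi_{,\nu}\right)_{,\mu} + m^{2}\phi + 2\alpha R\,\phi = 0 $$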

Using the fact that the contracted Christoffel symbol:

$$ \Gamma^{\nu}{}_{\mu\nu} = \frac{1}{2}\,g^{\nu\tau}\left(g_{\mu\tau,\nu} + g_{\nu\tau,\mu} - g_{\mu\nu,\tau}\right) = \frac{1}{2}\,g^{\nu\tau}g_{\nu\tau,\mu} \equiv \frac{1}{2}\left[\ln(-g)\right]_{,\mu} = \frac{1}{\sqrt{-g}}\left(\sqrt{-g}\right)_{,\mu} $$

and for a scalar function φ, covariant gradients are the same as ordinary gradients:

$$ \phi_{;\nu} = \phi_{,\nu} $$

we see that our last equation becomes:

$$ \phi^{;\mu}{}_{;\mu} + \left(m^{2} + 2\alpha R\right)\phi = 0 $$

The relation to the Klein-Gordon equation may be seen by considering the case α =0;
the ordinary d’Alembertian has simply been replaced by its covariant analogue, the
covariant d’Alembertian.
The preceding steps have given us a definite theory, since we can specify that,
like Einstein’s gravitation theory, matter moves along a geodesic equation:

$$ \frac{d^{2}z^{\mu}}{ds^{2}} + \Gamma^{\mu}{}_{\rho\sigma}\,\frac{dz^{\rho}}{ds}\,\frac{dz^{\sigma}}{ds} = 0 $$

where zµ(s) is the orbit of a freely falling particle, and that the source tensor is given by:

$$ \mathcal{T}^{\mu\nu} = m_o\int ds\;\delta^4[x - z(s)]\;\frac{dz^{\mu}}{ds}\,\frac{dz^{\nu}}{ds} $$

Also, ds is given by the interval (ds)² =gµν dxµdxν and it was Einstein’s idea that the
complete description of gravity could always be specified by a metric tensor gµν .
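The passage from the divided-through field equation to the covariant d'Alembertian rests on the contracted Christoffel symbol. A short worked step, assuming only that identity and the fact that the covariant derivative of the metric vanishes, is:

```latex
\begin{aligned}
\phi^{;\mu}{}_{;\mu}
 &= \left(g^{\mu\nu}\phi_{,\nu}\right)_{;\mu}
  = \left(g^{\mu\nu}\phi_{,\nu}\right)_{,\mu} + \Gamma^{\mu}{}_{\mu\sigma}\,g^{\sigma\nu}\phi_{,\nu} \\
 &= \left(g^{\mu\nu}\phi_{,\nu}\right)_{,\mu}
    + \frac{1}{\sqrt{-g}}\left(\sqrt{-g}\right)_{,\sigma} g^{\sigma\nu}\phi_{,\nu}
  = \frac{1}{\sqrt{-g}}\left(\sqrt{-g}\,g^{\mu\nu}\phi_{,\nu}\right)_{,\mu}
\end{aligned}
```

In flat coordinates √(−g) = 1 and the expression reduces to the ordinary d'Alembertian of φ.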

We are now ready again to make a quantum theory, after having a theory from
Einstein’s point of view. The theory is more complete than when we were discussing the
Venusian viewpoint – we have the complete Lagrangians including interactions with
matter correct to all orders. If we restrict our attention to a universe consisting only of
gravity fields and scalar matter, the field theory is obtained by considering expansions in
terms of a coupling constant:

$$ g_{\mu\nu} = \eta_{\mu\nu} + 2\kappa\,h_{\mu\nu} $$

In the Lagrangian, the terms which are quadratic correspond simply to the propagators;
the terms involving products of two φs and one h, and terms involving three hs and also
two φs, correspond to diagrams such as are shown in the Figure. In this way, we
have arrived at a prescription for calculating quantum mechanical amplitudes for the
motion of matter after having started out from a geometrical point of view.

[Figure: a vertex of order κ joining two φ lines and one h line, and a vertex of order κ³ joining two φ lines and three h lines.]

In considering the terms in the action, we might consider why the field term might not
include a certain proportion Λ of ∫d4x √(−g). This would be an integral proportional to the
volume of the universe, presumably a constant. The resulting equation for the field
behaves somewhat as though the gravitons had a mass and a universal source. The
observation of the extremely long range of gravity forces makes it rather pointless to
introduce such a term, even though it might lead to a consistent theory. The equations of
motion that come from it are:

$$ G_{\mu\nu} = \Lambda\,g_{\mu\nu} + \kappa^{2}\,T_{\mu\nu} $$

The constant Λ is known as the cosmological constant. Einstein wanted a closed
universe, so he assigned to it a value which made steady-state solutions of such a
universe possible. This he later referred to as his Great Mistake; had he chosen it equal
to zero, he would have concluded that the universe must be expanding (or contracting).
It was only later that Hubble discovered that the faraway galaxies were receding and the
universe was expanding. Ever since this change in Einstein’s theory of the universe was
introduced, cosmology has been hounded by difficulties in the determination of the
cosmological constant. I [Feynman] agree with Einstein’s second guessing and think Λ=
0 is most likely.

Coupling between Matter Fields and Gravity

In the Matter Fields in a Gravitational Field chapter, we have written down the action
terms corresponding to the propagation of free particles and fields. All that is left over
from the complete action may be considered as a coupling between the fields, and we
may proceed to compute various processes by perturbation theory. There needs to be
no apology for the use of perturbations, since gravity is far weaker than other fields for
which perturbation theory seems to make extremely accurate predictions. The known
parts of the general action are the following:

$$ S = S_g + S_m + S_{\phi} = -\frac{1}{2\kappa^{2}}\int d^4x\,\sqrt{-g}\,R + \frac{1}{2}\int d^4x\,\sqrt{-g}\left(g^{\mu\nu}\phi_{,\mu}\phi_{,\nu} - m^{2}\phi^{2}\right) - \alpha\int d^4x\,\sqrt{-g}\,R\,\phi^{2} $$

The first simplification we shall make is to set the coefficient α equal to zero. Leaving
such a term in the action ordinarily worsens many problems of divergence that we shall
encounter later, and it increases the labor of computing. The second step is to pull out
the term representing the propagator of the fields by introducing the expansion:

$$ g_{\mu\nu} = \eta_{\mu\nu} + 2\kappa\,h_{\mu\nu} $$

When we write the action in terms of the field gµν and scalar matter fields φ, we get:

$$ \text{Action} = \int d^4x\,F^{2}[h_{\mu\nu}] + \int d^4x\,I[h_{\mu\nu},\phi] + \int d^4x\,M[\phi] $$

where:

$$ F^{2}[h_{\mu\nu}] = \frac{1}{2}\left(h_{\mu\nu,\lambda}\,\bar{h}^{\mu\nu,\lambda} - 2\,\bar{h}_{\mu\lambda}{}^{,\lambda}\,\bar{h}^{\mu\nu}{}_{,\nu}\right) \qquad \text{and} \qquad M[\phi] = \frac{1}{2}\left(\eta^{\mu\nu}\phi_{,\mu}\phi_{,\nu} - m^{2}\phi^{2}\right) $$

with the bar operation defined by X̄µν =½(Xµν + Xνµ)−½ηµν Xσσ . The variation of the functional I[hµν ,φ] with respect to the
field hµν or φ represents the source terms in the differential equations of the fields. These
may be written as follows, in space and in momentum representations:
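$$ \phi^{,\nu}{}_{,\nu} + m^{2}\phi = \frac{\delta I}{\delta\phi} \qquad \rightarrow \qquad \phi = -\frac{1}{k^{2} - m^{2} + i\varepsilon}\left(\frac{\delta I}{\delta\phi}\right) $$

and

$$ -h_{\mu\nu,\lambda}{}^{,\lambda} + \bar{h}_{\mu\lambda,\nu}{}^{,\lambda} + \bar{h}_{\nu\lambda,\mu}{}^{,\lambda} = \kappa\,\bar{S}_{\mu\nu} $$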
where:

$$ S^{\mu\nu} = -\frac{1}{\kappa}\left(\frac{\delta I}{\delta h_{\mu\nu}}\right) $$
Note that Sµν is what we called newTµν in The Bilinear Terms of the Stress-Energy
Tensor chapter (c.f., newTµν =oldTµν + χµν where κχ µν =δ F 3[h]/δhµν).

How do we go on from here? Because of the careful design of the original action, as
an invariant integral, it can be shown that the ordinary divergence of the source tensor
Sµν is identically zero. In the momentum representation:

$$ k^{\nu}S_{\mu\nu} = 0 $$

The source tensor contains approximately both the matter sources and the gravity
sources. Because of the freedom we have in choosing a gauge, we can make the barred
tensor h̄µν divergenceless, and thus obtain a solution:

$$ k^{\nu}\bar{h}_{\mu\nu} = 0 \qquad \rightarrow \qquad k^{2}h_{\mu\nu} = \kappa\,\bar{S}_{\mu\nu} $$

so that:

$$ h_{\mu\nu} = \frac{\kappa}{k^{2} - i\varepsilon}\,\bar{S}_{\mu\nu} $$

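The tensor above is not simply an unknown source tensor, but it is now well defined in
terms of the original action (c.f., the action −[1/(2κ²)]∫d4x √(−g)R+½∫d4x √(−g)(gµνφ,µφ,ν −
m²φ²)−α ∫d4x √(−g)Rφ² above with the expansion gµν =ηµν +2κhµν ), so that the equations
are properly consistent and energy is conserved. Once we have an expansion in powers
of the coupling constant κ, then we can proceed by the familiar rules of perturbation
theory to compute all the diagrams to any given order in κ. The key expansions are
those of the reciprocal matrix gµν and of √(−g). The first is easily written down by analogy with the expansion
of (1+ x)⁻¹ when x is a small number. We have:

$$ g^{\mu\nu} = \left(\eta_{\mu\nu} + 2\kappa\,h_{\mu\nu}\right)^{-1} = \eta^{\mu\nu} - 2\kappa\,h^{\mu\nu} + 4\kappa^{2}\,h^{\mu}{}_{\sigma}h^{\sigma\nu} - 8\kappa^{3}\,h^{\mu}{}_{\sigma}h^{\sigma}{}_{\tau}h^{\tau\nu} + \cdots $$

where one must remember the flat-space summation convention (i.e., Σµ Aµ Bµ = A4 B4 −
(A3 B3+ A2 B2 + A1 B1) where ηµν =diag(+1,−1,−1,−1)). The formula for √(−g)=√(−det gµν) may
be calculated by means of the tricks of the Invariants and Infinitesimal Transformations
chapter (i.e., using det{A[1+(1/A)B]}=det A exp{Tr ln[1+(1/A)B]} with gµν =ηµσ (δσν +2κhσν)):

$$ \sqrt{-\det g_{\mu\nu}} = \sqrt{-\det\eta_{\mu\sigma}}\;\exp\left\{\frac{1}{2}\,\mathrm{Tr}\left[\ln\left(\delta^{\sigma}{}_{\nu} + 2\kappa\,h^{\sigma}{}_{\nu}\right)\right]\right\} $$

$$ \phantom{\sqrt{-\det g_{\mu\nu}}} = \exp\left\{\frac{1}{2}\,\mathrm{Tr}\left[2\kappa\,h^{\rho}{}_{\nu} - \frac{1}{2}(2\kappa)^{2}h^{\rho}{}_{\sigma}h^{\sigma}{}_{\nu} + \frac{1}{3}(2\kappa)^{3}h^{\rho}{}_{\sigma}h^{\sigma}{}_{\tau}h^{\tau}{}_{\nu} + \cdots\right]\right\} $$

$$ \phantom{\sqrt{-\det g_{\mu\nu}}} = \exp\left\{\frac{1}{2}\left[2\kappa\,h^{\rho}{}_{\rho} - \frac{1}{2}(2\kappa)^{2}h^{\rho}{}_{\sigma}h^{\sigma}{}_{\rho} + \frac{1}{3}(2\kappa)^{3}h^{\rho}{}_{\sigma}h^{\sigma}{}_{\tau}h^{\tau}{}_{\rho} + \cdots\right]\right\} $$

and finally get for √(−g):

$$ \sqrt{-g} \equiv \sqrt{-\det g_{\mu\nu}} = 1 + \kappa\,h^{\rho}{}_{\rho} - \kappa^{2}\left(\bar{h}^{\rho}{}_{\lambda}h^{\lambda}{}_{\rho}\right) + \cdots $$

where the bar denotes h̄ρλ =hρλ −½δρλ hσσ, so that the κ² term also contains the square of the trace.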
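The two key expansions can be checked numerically. The sketch below uses only the standard library; the sample perturbation `H` and the helper names are made up for illustration. It compares the exact √(−det g) and the exact inverse of gµν = ηµν + 2κhµν with the series through order κ², writing the second-order term of √(−g) as ½(trace)² minus the trace of the square, which is the barred combination:

```python
import math

ETA = [1.0, -1.0, -1.0, -1.0]  # diagonal of η = diag(+1, −1, −1, −1); η is its own inverse

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n, result = len(a), 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def matmul(x, y):
    return [[sum(x[r][k] * y[k][c] for k in range(4)) for c in range(4)] for r in range(4)]

def metric(h, kappa):
    """g_{μν} = η_{μν} + 2κ h_{μν} for a symmetric lower-index h."""
    return [[(ETA[r] if r == c else 0.0) + 2.0 * kappa * h[r][c] for c in range(4)]
            for r in range(4)]

def mixed(h):
    """Mixed components h^ρ_ν = η^{ρσ} h_{σν}."""
    return [[ETA[r] * h[r][c] for c in range(4)] for r in range(4)]

def sqrt_minus_g_series(h, kappa):
    """1 + κ h^ρ_ρ − κ² (h^ρ_λ h^λ_ρ − ½ (h^ρ_ρ)²), i.e. the barred second-order term."""
    m = mixed(h)
    trace = sum(m[i][i] for i in range(4))
    trace_sq = sum(matmul(m, m)[i][i] for i in range(4))
    return 1.0 + kappa * trace - kappa ** 2 * (trace_sq - 0.5 * trace ** 2)

def inverse_metric_series(h, kappa):
    """g^{μν} ≈ η^{μν} − 2κ h^{μν} + 4κ² h^{μσ} h_σ^ν, built as (1 − 2κM + 4κ²M²) η."""
    m = mixed(h)
    mm = matmul(m, m)
    return [[((1.0 if r == c else 0.0) - 2.0 * kappa * m[r][c] + 4.0 * kappa ** 2 * mm[r][c])
             * ETA[c] for c in range(4)] for r in range(4)]

# Hypothetical sample perturbation (symmetric, entries of order one).
H = [[0.30, 0.10, -0.20, 0.05],
     [0.10, -0.40, 0.25, 0.00],
     [-0.20, 0.25, 0.15, -0.10],
     [0.05, 0.00, -0.10, 0.20]]
KAPPA = 1e-3

exact_sqrt = math.sqrt(-det(metric(H, KAPPA)))
series_sqrt = sqrt_minus_g_series(H, KAPPA)
product = matmul(metric(H, KAPPA), inverse_metric_series(H, KAPPA))
inverse_error = max(abs(product[r][c] - (1.0 if r == c else 0.0))
                    for r in range(4) for c in range(4))
```

Both residuals, `abs(exact_sqrt - series_sqrt)` and `inverse_error`, should be of order κ³, since the series are truncated after the κ² terms.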

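Plugging these expressions for √(−g) and for gµν into the action, we get explicit forms
for the coupling of matter and gravity; the result from ½∫d4x √(−g)(gµν φ,µφ,ν −m²φ²) of the
action is:

$$ S_m = \frac{1}{2}\int d^4x\left[\left(\eta^{\mu\nu} - 2\kappa\,h^{\mu\nu} + 4\kappa^{2}h^{\mu}{}_{\sigma}h^{\sigma\nu} + \cdots\right)\phi_{,\mu}\phi_{,\nu} - m^{2}\phi^{2}\right]\left[1 + \kappa\,h^{\rho}{}_{\rho} - \kappa^{2}\left(\bar{h}^{\rho}{}_{\lambda}h^{\lambda}{}_{\rho}\right) + \cdots\right] $$

or better yet:

$$ S_m = \frac{1}{2}\int d^4x\left(\phi^{,\mu}\phi_{,\mu} - m^{2}\phi^{2}\right) - \kappa\int d^4x\,h^{\mu\nu}\left(\overline{\phi_{,\mu}\phi_{,\nu}} + \frac{1}{2}\,m^{2}\phi^{2}\,\eta_{\mu\nu}\right) - \kappa^{2}\int d^4x\left[\frac{1}{2}\,\bar{h}^{\rho}{}_{\lambda}h^{\lambda}{}_{\rho}\left(\phi^{,\mu}\phi_{,\mu} - m^{2}\phi^{2}\right) - 2\,h^{\mu\rho}\,\bar{h}_{\rho}{}^{\nu}\,\phi_{,\mu}\phi_{,\nu}\right] + \cdots $$

Here the bar is the operation X̄µν =½(Xµν + Xνµ)−½ηµν Xσσ.
The lowest order terms involve the interaction of two φs and one h in κ. This corresponds
to a vertex such as is shown in the Figure. At each vertex, we require that the linear
momenta be conserved. This rule comes from the volume integration in the action – there
is no contribution unless the total phase of the term is equal to zero.

[Figure: a vertex of order κ joining two φ lines, with momenta p1 and p2, to one graviton line h.]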

We write plane wave solutions as:

$$ \phi = e^{\,i\,p\cdot x} $$

and:

$$ h_{\mu\nu} = \varepsilon_{\mu\nu}\,e^{\,i\,q\cdot x} $$

in terms of the polarization tensor εµν, the amplitude at a first order vertex is:

$$ -2\kappa\left[\varepsilon^{\mu\nu}\;{}^{1}p_{\mu}\,{}^{2}p_{\nu} - \frac{1}{2}\,\varepsilon^{\rho}{}_{\rho}\left({}^{1}p^{\sigma}\,{}^{2}p_{\sigma} - m^{2}\right)\right] $$

Any diagram which involves only such vertices may now be calculated by simply
plugging in the proper amplitudes at each vertex and the particle and graviton
propagators between vertices, just as in electrodynamics.

Let us look at the next order. The terms shown in the action involving products of two
hs and two φs in κ² mean that two lines and two wiggles come together at a junction, as in the
Figure – Outermost Left. There are also terms coming from the expansion of the first
term of the action −[1/(2κ²)]∫d4x √(−g)R+½∫d4x √(−g)(gµν φ,µφ,ν −m²φ²)−α ∫d4x √(−g)Rφ²
above, involving products of three hs, corresponding to diagrams in which three wiggles
come together at a point, as in the Figure – Left. The profusion of implicit sums in three
indices results in terms which are very, very long when written out explicitly. For
example, one of the terms in which three wiggles come together is hµν,ρhµρhνσ,σ ; when
this is translated to momenta and polarization components, we get terms corresponding
to all permutations of the three gravitons:

$$ q^{(a)\rho}\,\varepsilon^{(a)\mu\nu}\,\varepsilon^{(b)}_{\mu\rho}\,q^{(c)\sigma}\,\varepsilon^{(c)}_{\nu\sigma} + q^{(b)\rho}\,\varepsilon^{(b)\mu\nu}\,\varepsilon^{(a)}_{\mu\rho}\,q^{(c)\sigma}\,\varepsilon^{(c)}_{\nu\sigma} + q^{(b)\rho}\,\varepsilon^{(b)\mu\nu}\,\varepsilon^{(c)}_{\mu\rho}\,q^{(a)\sigma}\,\varepsilon^{(a)}_{\nu\sigma} + \cdots $$

This complexity goes with a single vertex, which is always one half of an amplitude;
when we compound these complexities, as for example in computing a diagram such as
shown in the Figure – Right and Figure – Outermost Right, we may get as many as 10⁸
terms.

[Figure: (Outermost Left) a κ² vertex joining two φ lines and two h lines; (Left) a κ³ vertex joining three gravitons a, b and c; (Right and Outermost Right) compound diagrams built from several κ vertices.]

Radiation of Gravitons with Particle Decays

The coupling of gravitons to matter is so weak that there is truly no hope of observing
quantum gravitational effects associated with particle events. In this sense, the
calculations we are about to do are absolutely irrelevant in practice. Still, we have
proposed a definite theory, and these unobservable processes are the simplest effects
that our theory predicts; they would be observable and important if the coupling were
stronger.
There are many one-graviton diagrams in the decay of a particle. For the illustration
we take that of the Figure (i.e., a→b+c). The amplitude at the a-graviton vertex is given
by:

$$ -2\kappa\,\varepsilon^{\mu\nu}\left[{}^{1}p_{\mu}\,{}^{2}p_{\nu} - \frac{1}{2}\,\eta_{\mu\nu}\left({}^{1}p^{\sigma}\,{}^{2}p_{\sigma} - m^{2}\right)\right] $$

where the preceding superscripts 1 and 2 denote the matter particle before and after the
vertex. After the emission, the particle a propagates with momentum pa −k to the decay
vertex, hence ²pσ=(pa −k)σ.

[Figure: decay a → b + c, with a graviton of momentum k emitted from particle a at a κ vertex; p1 and p2 label the momenta of a before and after the emission.]

If we let the decay amplitude be represented by a quantity A depending on the
momenta of the three particles {a,b,c} whose trajectories enter the black box in the
previous Figure, the expression for the amplitude is:

$$ -2\kappa\,\varepsilon^{\mu\nu}\left[p^{a}_{\mu}\left(p^{a} - k\right)_{\nu} - \frac{1}{2}\,\eta_{\mu\nu}\left(p^{a}\cdot\left(p^{a} - k\right) - m_{a}^{2}\right)\right]\frac{1}{\left(p^{a} - k\right)^{2} - m_{a}^{2}}\;A\left(p^{a} - k,\,p^{b},\,p^{c}\right) $$

For our purposes, the exact nature of the amplitude A is unimportant; it represents
whatever would have been there without the graviton.
The amplitude above is large only when the denominator of the propagator (i.e., 1/[(pa−k)²−ma²]) has a
very small value, that is, when k is very much smaller than pa, so the propagation
corresponds to a nearly free particle. In the limiting case of weak gravitons, the process
is identical to that of braking radiation, bremsstrahlung emission of weak photons; it is
also closely related to the classical limit, since it depends on how the charge (mass)
currents move. The denominator is −2pa ⋅k, and in the limit that the frequency ω of k
is very small, we may set k=0 in the numerator. If we factor out κ/ω, the second factor
in the amplitude has a definite limit depending only on the direction of the graviton, its
polarization, and the decay amplitude:

$$ \frac{\kappa}{\omega}\cdot\frac{p^{a}_{\mu}\,p^{a}_{\nu}\,\varepsilon^{\mu\nu}}{p^{a}\cdot k/\omega}\cdot A\left(p^{a},\,p^{b},\,p^{c}\right) $$

There are three similar diagrams, corresponding to emission of a graviton from any of
the three particles {a,b,c}. A diagram corresponding to the graviton coming out of the black box
(see previous Figure) may readily be shown to be much smaller in magnitude; what
happens is that there is no nearly free particle to propagate, hence no small
denominator to enhance the term. If we neglect this term and all higher orders, we find
that the amplitude to emit a graviton is:

$$ \frac{\kappa}{\omega}\cdot a\cdot A\left(p^{a},\,p^{b},\,p^{c}\right) $$

where:

$$ a = \sum_{i}\,(-)^{i}\;\frac{p^{i}_{\mu}\,p^{i}_{\nu}\,\varepsilon^{\mu\nu}}{p^{i}\cdot k/\omega} $$

with i representing the particle joined to the graviton vertex and where (−)i is a factor
which is +1 for an incoming particle and −1 for an outgoing particle. The quantity a is a
kinematic and geometric factor.
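The factor a is easy to evaluate numerically. The sketch below is illustrative only: the masses, the graviton direction and the helper names are assumptions, not values from the slides. It evaluates a for a two-body decay a → b + c at rest, and also checks that a is unchanged when the polarization is shifted by kµξν + ξµkν, which follows from momentum conservation among the three particles:

```python
import math

ETA = (1.0, -1.0, -1.0, -1.0)  # η = diag(+1, −1, −1, −1)

def dot(p, q):
    """Minkowski product p·q of two four-vectors with upper indices."""
    return sum(ETA[m] * p[m] * q[m] for m in range(4))

def lower(p):
    """Lower the index of a four-vector: p_μ = η_{μν} p^ν."""
    return [ETA[m] * p[m] for m in range(4)]

def emission_factor(eps_upper, k, particles):
    """a = Σ_i (±1) ε^{μν} p_μ p_ν / (p·k/ω), with +1 incoming and −1 outgoing."""
    omega = k[0]
    total = 0.0
    for sign, p in particles:
        p_low = lower(p)
        contraction = sum(eps_upper[m][n] * p_low[m] * p_low[n]
                          for m in range(4) for n in range(4))
        total += sign * contraction / (dot(p, k) / omega)
    return total

def two_body_decay(mass_a, mass_b, mass_c):
    """Momenta for a → b + c in the rest frame of a, with b and c along ±z."""
    pmag = math.sqrt((mass_a ** 2 - (mass_b + mass_c) ** 2)
                     * (mass_a ** 2 - (mass_b - mass_c) ** 2)) / (2.0 * mass_a)
    p_a = [mass_a, 0.0, 0.0, 0.0]
    p_b = [math.hypot(mass_b, pmag), 0.0, 0.0, pmag]
    p_c = [math.hypot(mass_c, pmag), 0.0, 0.0, -pmag]
    return [(+1.0, p_a), (-1.0, p_b), (-1.0, p_c)]

def gauge_shift(eps_upper, k, xi):
    """Return ε^{μν} + k^μ ξ^ν + ξ^μ k^ν."""
    return [[eps_upper[m][n] + k[m] * xi[n] + xi[m] * k[n] for n in range(4)]
            for m in range(4)]

# Illustrative numbers: unit parent mass, equal daughter masses, graviton along x.
PARTICLES = two_body_decay(1.0, 0.3, 0.3)
OMEGA = 1e-3
K = [OMEGA, OMEGA, 0.0, 0.0]
# "Plus" polarization transverse to x: ε^{22} = +1, ε^{33} = −1.
EPS = [[0.0] * 4 for _ in range(4)]
EPS[2][2], EPS[3][3] = 1.0, -1.0

a_value = emission_factor(EPS, K, PARTICLES)
a_shifted = emission_factor(gauge_shift(EPS, K, [0.7, -0.2, 0.4, 1.1]), K, PARTICLES)
```

For these sample numbers the parent at rest contributes nothing and the two daughters contribute equally. `a_value` and `a_shifted` agree to rounding error, which is the momentum-space counterpart of the statement that the source tensor is divergenceless.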

To compute a transition rate we square the amplitude, insert a density-of-state factor
k²dkdΩ/(2π)³, and a normalization factor which is π/(2Ei) where Ei is the energy of each
particle. The final result is:

$$ P = \frac{\kappa^{2}}{4\pi}\,a^{2}\,\frac{\omega^{2}\,d\omega\,d\Omega}{4\pi\,\omega^{3}} = \frac{\kappa^{2}}{16\pi^{2}}\,a^{2}\,\frac{1}{\omega}\,d\omega\,d\Omega $$

giving the probability of graviton emission per disintegration. The factor κ² =8πGN makes
the rate extremely low, so low that the odds are very much against there being a
measurable graviton recoil in any cloud chamber or hydrogen chamber or spark
chamber decay ever recorded. The inverse energy factor 1/ω makes the quantity large
for extremely low graviton energies ω (i.e., EG =hω); however, the fact is almost
irrelevant, since κ²/ω becomes near 1 only for energies so low that the wavelength of the
graviton would exceed the radius of the universe [as known in 1962-63] by some factor
such as 10³⁹.

Although we have worked out the theory assuming scalar particles [such as the spin-0
φ], in the low-energy limit the answer is the same no matter what the spin of the particles
may be. This is because in the low energy limit only the mass currents, the movement of
the masses, are relevant. Our answer, of course, has an infrared divergence, so that the
probability for the emission of a graviton (if its energy is irrelevant) appears infinite. The
trouble is no more serious than the infrared divergence for the emission of low energy
photons – and the troubles may be cured by the same tricks as in low-energy
bremsstrahlung.

Appendix IV: Review of Supersymmetry*
Contents
Introduction
A Brief Introduction to Supersymmetry
Supersymmetry
Supersymmetric Actions
Superspace
“Readers whose intellectual curiosity is not excited by this are advised to put this book aside permanently
and watch television instead of reading it.” D. Z. Freedman and A. Van Proeyen, Supergravity, Cambridge
University Press, 2012, P. 111.
* This Appendix is taken from M. Kaku, Introduction to Superstrings, Springer-Verlag, 1988, Appendix - §A.4 and M. Kaku,
Quantum Field Theory – A Modern Introduction, Oxford University Press, 1993, pp. 663-677.

Introduction

In the late 1960s, after the success of the SU(3) description of quarks, physicists tried to
construct the master group that would allow a synthesis of an internal symmetry group
(e.g., SU(3)) and the Lorentz L+ or Poincaré group P. They sought a Master Group M
that was a nontrivial union of an internal group U and the Poincaré group:

$$ M \supset U \otimes P $$

Intense interest was sparked in groups like SU(6,6) or U(12) that combined both the
internal and space-time groups. However, only later was the no-go theorem discovered,
which seemed to doom all these ambitious efforts. The celebrated Coleman-Mandula
theorem (1967) showed that this program was impossible – there are no unitary finite-
dimensional representations of a noncompact group! This meant that, because of this no-
go theorem, unitary representations of the particles were necessarily infinite dimensional.
These groups possessed nonphysical properties, such as an infinite number of particles
in each irreducible representation, or a continuous spectrum of masses for each
irreducible representation. So the conclusion was that either: 1) the group M has
continuous masses, or 2) the group M has an infinite number of particles in each
irreducible representation. As a consequence, after a period of brief enthusiasm, the no-
go theorem doomed all these naïve efforts to build a Master Group for all particle
interactions.

Years later, however, it was discovered that there was a loophole in the no-go
theorem! It turns out that supergroups or graded Lie groups allow for an evasion of the
no-go theorem. It was possible to evade this no-go theorem (the most comprehensible
version being the Coleman-Mandula theorem) because it made an implicit assumption:
that the parameters θi of the Master Group were all c-numbers! However, if the θi could
be anticommuting, then the no-go theorem could be evaded. By this, we mean that the
work of S. Lie and E. Cartan concerned only continuous simple groups where the
parameters θi were real. However, if we allow these parameters to be Grassmann-valued
(i.e., a Grassmann-valued number θi anticommutes with all other Grassmann-valued
numbers and commutes with ordinary real or complex numbers), we can extend the
classical groups to the Super Groups.
In turn, super groups eventually lead to superstrings, which have revived efforts to
build Master Groups containing all known particle interactions, including gravity. Thus,
supersymmetry holds the promise of being a fundamental symmetry of physics.
Supersymmetry has a long and interesting history. Apparently, the first known mention of
a supersymmetric group was by H. Miyazawa who discovered the Super (Unitary) Group
SU(N/M) in 1966. His motivation was to find a Master Group that could combine both
internal groups and noncompact space-time groups in a nontrivial fashion. Supergroups,
in fact, are the only known way in which to avoid the Coleman-Mandula theorem, which
forbids naïve unions of compact and noncompact groups. Unfortunately, this important
work was largely ignored by the physics community.

A Brief Introduction to Supersymmetry

Two large infinite classes of groups we will be interested in are the OSp(N/M) and the
SU(N/M). Let us begin with the orthogonal group O(N), which preserves the invariant:

$$ \mathrm{O}(N):\quad x_i\,x_i = \text{invariant} $$

and the symplectic group Sp(M), which preserves the form:

$$ \mathrm{Sp}(M):\quad \theta_m\,C_{mn}\,\theta_n = \text{invariant} $$

where the C-matrices are real antisymmetric matrices because the θi are Grassmann-
valued (i.e., the Grassmann variables θi are independent elements of an algebra which
contains the real numbers; they anticommute with each other, θiθj =−θjθi, so that the
anticommutator {θi,θj}=0, but commute with ordinary numbers).
The orthosymplectic group is now defined as the group that preserves the sum:

$$ \mathrm{OSp}(N/M):\quad x_i\,x_i + \theta_m\,C_{mn}\,\theta_n = \text{invariant} $$

Note that the orthosymplectic group obviously contains the product:

$$ \mathrm{OSp}(N/M) \supset \mathrm{O}(N) \otimes \mathrm{Sp}(M) $$

The simplest way to exhibit the matrix representation of this group is to use the block
diagonal form:

$$ \mathrm{OSp}(N/M) = \begin{pmatrix} \mathrm{O}(N) & A \\ B & \mathrm{Sp}(M) \end{pmatrix} $$

with simple restrictions on the A and B matrices.
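Why the C-matrices must be antisymmetric can be seen with a toy Grassmann algebra. The class below is a minimal sketch written for this illustration and is not taken from any library. It stores an element as a map from sorted tuples of generator indices to coefficients, and picks up a sign for every transposition needed to sort a product:

```python
class Grassmann:
    """Element of a Grassmann algebra: {sorted tuple of generator indices: coefficient}."""

    def __init__(self, terms=None):
        # Drop vanishing coefficients so that equal elements compare equal.
        self.terms = {key: value for key, value in (terms or {}).items() if value != 0}

    @staticmethod
    def generator(index):
        """The anticommuting generator θ_index."""
        return Grassmann({(index,): 1})

    def __add__(self, other):
        out = dict(self.terms)
        for key, value in other.terms.items():
            out[key] = out.get(key, 0) + value
        return Grassmann(out)

    def scale(self, number):
        """Multiply by an ordinary (commuting) number."""
        return Grassmann({key: number * value for key, value in self.terms.items()})

    def __mul__(self, other):
        out = {}
        for key_a, value_a in self.terms.items():
            for key_b, value_b in other.terms.items():
                if set(key_a) & set(key_b):
                    continue  # a repeated generator gives θ_i θ_i = 0
                merged, sign = list(key_a + key_b), 1
                # Bubble sort: every swap of neighbouring generators flips the sign.
                for i in range(len(merged)):
                    for j in range(len(merged) - 1 - i):
                        if merged[j] > merged[j + 1]:
                            merged[j], merged[j + 1] = merged[j + 1], merged[j]
                            sign = -sign
                key = tuple(merged)
                out[key] = out.get(key, 0) + sign * value_a * value_b
        return Grassmann(out)

    def __eq__(self, other):
        return self.terms == other.terms

def bilinear(c_matrix, thetas):
    """The form θ_m C_{mn} θ_n for a list of generators."""
    total = Grassmann()
    for m, theta_m in enumerate(thetas):
        for n, theta_n in enumerate(thetas):
            total = total + (theta_m * theta_n).scale(c_matrix[m][n])
    return total

theta1, theta2 = Grassmann.generator(1), Grassmann.generator(2)
symmetric_form = bilinear([[0, 1], [1, 0]], [theta1, theta2])
antisymmetric_form = bilinear([[0, 1], [-1, 0]], [theta1, theta2])
```

With a symmetric C the form vanishes identically, because θ1θ2 and θ2θ1 cancel, whereas an antisymmetric C leaves twice C12 times θ1θ2. Only antisymmetric matrices can therefore define a nontrivial invariant for anticommuting variables.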

Similarly, the superunitary groups can be defined as the groups that preserve the
complex form:

$$ \mathrm{SU}(N/M):\quad (x^{i})^{*}\,\delta_{ij}\,x^{j} + (\theta^{m})^{*}\,g_{mn}\,\theta^{n} = \text{invariant} $$

where the metric gmn =±δmn. The bosonic decomposition of the group is given by:

$$ \mathrm{SU}(N/M) \supset \mathrm{SU}(N) \otimes \mathrm{SU}(M) \otimes \mathrm{U}(1) $$

Let us now write the generators of OSp(1/4) as the set:

$$ M_A = \{P_{\mu},\,M_{\mu\nu},\,Q_{\alpha}\} $$

which have the commutation relations:

$$ [M_A, M_B]_{\pm} = f_{AB}{}^{C}\,M_C $$

where ‘−’ means the commutator [A,B]=AB−BA and ‘+’ means the anticommutator {A,B}
=AB+BA and fABC are structure constants of the super group. Written out explicitly, the
commutators involving the supersymmetry generator Qα are:

$$ \{Q_{\alpha}, Q_{\beta}\} = 2\,(\gamma^{\mu}C)_{\alpha\beta}\,P_{\mu}, \qquad [Q_{\alpha}, P_{\mu}] = 0, \qquad \text{and} \qquad [Q_{\alpha}, M_{\mu\nu}] = (\Sigma_{\mu\nu})_{\alpha}{}^{\beta}\,Q_{\beta} $$

with γ µ being the Dirac matrices and Σµν=½i[γµ,γν ]. What we want to do hereafter is to
find an explicit representation of these generators, in the same way that:

$$ P_{\mu} = -i\,\partial_{\mu} $$

is the generator of translations in x-space.

Now we must enlarge the concept of space-time to include the supersymmetric partner
of the x-coordinate. Let us define superspace as the space created by the pair {xµ,θα}
where θα is a Grassmann number. Let us define the supersymmetry generator:

$$ Q_{\alpha} = \frac{\partial}{\partial\bar{\theta}^{\alpha}} - (\gamma^{\mu}\theta)_{\alpha}\,\partial_{\mu} $$

where θ is a Grassmann number also. We choose this particular representation because
the anticommutator between two such generators yields a displacement, as it should:

$$ \{Q_{\alpha}, Q_{\beta}\} = -2\,(\gamma^{\mu}C)_{\alpha\beta}\;i\,\partial_{\mu} $$

Notice that ε̄ Q (N.B., the supersymmetric parameter ε is an anticommuting spinor and
ε̄ =ε†γ 0) makes the following supersymmetry transformations on superspace:

$$ x^{\mu} \rightarrow x^{\mu} - \bar{\varepsilon}\,\gamma^{\mu}\theta \qquad \text{and} \qquad \theta^{\alpha} \rightarrow \theta^{\alpha} + \varepsilon^{\alpha} $$

Notice also that we can construct a covariant derivative operator:

$$ D_{\alpha} = \frac{\partial}{\partial\bar{\theta}^{\alpha}} + (\gamma^{\mu}\theta)_{\alpha}\,\partial_{\mu} $$

This operator anticommutes with the supersymmetry generator:

$$ \{Q_{\alpha}, D_{\beta}\} = 0 $$

This is very important because it allows us to place restrictions on the
representations of supersymmetry without destroying the symmetry. This permits us
to extract the irreducible representations from the reducible ones!

Let us now try to construct invariant actions under supersymmetry. Let us define the superfield V(x,θ) as the most general power expansion in this superspace. Then a representation of supersymmetry is given by:
$$\delta V(x,\theta) = V(x+\delta x,\theta+\delta\theta) - V(x,\theta) = \bar\varepsilon^\alpha Q_\alpha V(x,\theta)$$
Notice that this definition proves that the product of two superfields is also a superfield (i.e., V1V2 = V3). Thus, we can construct a large set of representations of supersymmetry by this simple product rule. Now let us calculate the explicit transformation of the fields. We will sometimes find it useful to break up the four-component spinor into two two-component spinors, because of the identity O(4) = SU(2)⊗SU(2). Using undotted and dotted indices $a$ and $\dot a$, each taking the values 1,2, we will write a Majorana four-spinor in terms of its SU(2)⊗SU(2) content:
$$\chi_\alpha = \begin{pmatrix}\chi_a \\ \bar\chi^{\dot a}\end{pmatrix} \qquad\text{and}\qquad \bar\chi^\alpha = [\,\chi^a,\ -\bar\chi_{\dot b}\,]$$
If we invert this, we find, in the purely imaginary Majorana representation of the spinors:
$$\chi_a = \tfrac{1}{2}(1+\gamma_5)\chi \qquad\text{and}\qquad \bar\chi^{\dot a} = \tfrac{1}{2}(1-\gamma_5)\chi$$
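with a modified Weyl representation of the Dirac matrices (i.e., the Weyl basis):
[Displayed equation: the 2×2-block forms of the Dirac matrices γµ (built from σµ and σ̄µ), of γ5 = γ0γ1γ2γ3, and of the charge-conjugation matrix C (built from the ε tensors with undotted and dotted indices); the individual matrix entries are not legible in this copy.]
where $\varepsilon_{ab} = \varepsilon^{ab} = -\varepsilon_{\dot a\dot b} = -\varepsilon^{\dot a\dot b}$ with $\varepsilon_{12} = +1$ (i.e., $\varepsilon_{12} = \varepsilon^{12} = -\varepsilon_{\dot 1\dot 2} = -\varepsilon^{\dot 1\dot 2} = 1$).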
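In this notation, the covariant derivatives can be written as:
$$D_a = \frac{\partial}{\partial\theta^a} - i(\sigma^\mu)_{a\dot a}\,\bar\theta^{\dot a}\,\partial_\mu \qquad\text{and}\qquad \bar D_{\dot a} = \frac{\partial}{\partial\bar\theta^{\dot a}} - i\,\theta^a(\sigma^\mu)_{a\dot a}\,\partial_\mu$$
where $\sigma^\mu = [\mathbf{1},\sigma^i] = (\sigma^\mu)_{a\dot b}$ with $\bar\sigma^\mu = [\mathbf{1},-\sigma^i] = (\bar\sigma^\mu)^{\dot a b} = (\sigma^\mu)^{b\dot a}$ and with σi the Pauli matrices.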
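The real vector superfield V can be decomposed as:
$$V(x,\theta,\bar\theta) = C - i\theta\chi - i\bar\chi'\bar\theta - \tfrac{1}{2}i\theta^2(M - iN) - \theta\sigma^\mu\bar\theta A_\mu$$
$$\qquad\qquad + i\bar\theta^2\theta\left(\lambda - \tfrac{1}{2}i\,\not\!\partial\chi'\right) - i\theta^2\bar\theta\left(\lambda' - \tfrac{1}{2}i\,\not\!\partial\chi\right) - \tfrac{1}{2}\theta^2\bar\theta^2\left(D + \tfrac{1}{2}\Box C\right)$$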
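with $\Box \equiv \partial_\mu\partial^\mu = (\partial/\partial t)^2 - \nabla^2$ and the slash means $\not\! a \equiv \gamma^\mu a_\mu$. We can now read off the supersymmetry transformation parametrized by ζ on these 16 fields:
$$\delta C = \bar\zeta\gamma_5\chi,\qquad \delta\chi = (M + \gamma_5 N)\zeta - i\gamma^\mu(A_\mu + \gamma_5\partial_\mu C)\zeta,\qquad \delta M = \bar\zeta(\lambda - i\not\!\partial\chi),$$
$$\delta N = \bar\zeta\gamma_5(\lambda - i\not\!\partial\chi),\qquad \delta A_\mu = i\bar\zeta\gamma_\mu\lambda + \bar\zeta\partial_\mu\chi,\qquad \delta\lambda = -i\Sigma^{\mu\nu}\zeta\,\partial_\mu A_\nu \qquad\text{and}\qquad \delta D = -i\bar\zeta\not\!\partial\gamma_5\lambda$$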
We call it a vector superfield because it contains a vector particle in its represen-
tation (N.B., not because the superfield itself is a vector field under the Lorentz
group). In general, vector fields can be complex and they are reducible. To form
irreducible representations, we will find it convenient to place constraints on them
that do not destroy their supersymmetric nature. The constraints must therefore
commute with the supersymmetry generator.
Notice that, because Dα anticommutes with the supersymmetry generator, we can
impose this derivative on a superfield and still get a representation of supersymmetry.
Let us now try to construct various representations of supersymmetry based on this
simple principle. We can impose:
$$\bar D_{\dot a}\phi = 0$$
A superfield that satisfies this constraint is called a chiral superfield. Notice that it has half the number of fields of the original superfield but that it still transforms under the group correctly. A chiral superfield has the decomposition:
$$\phi(x,\theta) = A + 2\theta\psi - \theta^2 F$$
The variation of this superfield can also be read off the variation:
$$\delta\phi = -i[\zeta Q + \bar Q\bar\zeta,\ \phi]$$
We obtain:
$$\delta A = 2\zeta\psi,\qquad \delta\psi = -\zeta F - i\partial_\mu A\,\sigma^\mu\bar\zeta \qquad\text{and}\qquad \delta F = -2i\partial_\mu\psi\,\sigma^\mu\bar\zeta$$
We could also try other combinations of constraints, such as:
$$D_a\phi = 0$$
on a chiral superfield. We find, however, that the combination of the two constraints imposed simultaneously implies that φ is a constant.
Another constraint might be:
$$D^a D_a\phi = 0$$
This yields the linear multiplet and, unfortunately, actions based on this are usually equivalent to actions based on chiral superfields, so we learn nothing new.
Yet another constraint might be:
$$[D_a,\bar D_{\dot b}]\phi = 0$$
This again gets back to the chiral superfield.
Finally, we might also try:
$$\bar D^2 D_a\phi = 0$$
for real φ. This actually yields an entirely new superfield, the Yang-Mills superfield, which is used to build the Yang-Mills action.
In summary, the only new fields that transform as irreducible representations of
supersymmetry are the vector superfield, the chiral superfield, and the Yang-Mills
superfield. The other combinations that one might try are either empty or redundant to
the original set.

Now let us discuss the problem of forming an invariant action by defining Grassmann integration. Integration over these Grassmann variables must be carefully defined. Ordinary integration over real numbers, of course, is translation invariant:
$$\int_{-\infty}^{\infty} dx\,\phi(x) = \int_{-\infty}^{\infty} dx\,\phi(x+c)$$
where c is a real displacement. We would like Grassmann-valued integration to have the same property:
$$\int d\theta\,\phi(\theta) = \int d\theta\,\phi(\theta+c)$$
If we power expand this function φ(θ) in a Taylor series, the series stops at the linear term because θ² = 0, so we have the linear expression φ(θ) = a + bθ. If we define I0 = ∫dθ and I1 = ∫dθ θ then translation invariance forces us to have:
$$\int d\theta\,\phi(\theta) = aI_0 + bI_1 = (a + bc)I_0 + bI_1$$
Thus we must have I0 = 0 and I1 can be normalized to one (i.e., I1 = 1), or:
$$\int d\theta = 0 \qquad\text{and}\qquad \int d\theta\,\theta = 1$$
In other words, we have the curious-looking identity:
$$\int d\theta = \frac{\partial}{\partial\theta}$$
With these identities, we can show that:
$$\int\prod_{i=1}^{N} d\bar\theta_i\,d\theta_i\ \exp\Big(\sum_{i,j=1}^{N}\bar\theta_i A_{ij}\theta_j\Big) = \det A_{ij}$$
Thus, in general, invariant actions can be formed (c.f., equations above for $V(x,\theta,\bar\theta)$ and φ(x,θ)):
$$\int d^4x\,d^4\theta\ V(x,\theta,\bar\theta) \to D\text{-term} \qquad\text{and}\qquad \int d^4x\,d^2\theta\ \phi(x,\theta) \to F\text{-term}$$
This first integral only selects out the D-term of the vector superfield. The second integral only selects out the F-term of the chiral superfield. In general, we call these F and D terms. We can check that these are invariant actions:
$$\delta\int d^8x\ V = \int d^8x\ \varepsilon^\alpha Q_\alpha V = 0$$
This is because the integral of a total derivative, in either x or θ space, is zero.
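The Grassmann integration rules can be checked mechanically. The following is a minimal sketch, not taken from the slides: a tiny Grassmann algebra class `G` (a hypothetical name) stores each element as a dictionary from sorted tuples of generator indices to coefficients, implements the Berezin integral as the left derivative, and then verifies I0 = 0, I1 = 1, translation invariance, and the Gaussian determinant formula for N = 2. The ordering of the measure is an assumption: each pair is integrated so that the integral of θ̄_i θ_i equals +1; the opposite ordering would flip the sign once per pair.

```python
from fractions import Fraction
from itertools import product

class G:
    """Grassmann-algebra element: {sorted tuple of generator indices: coefficient}."""
    def __init__(self, terms=None):
        self.terms = {k: v for k, v in (terms or {}).items() if v != 0}

    @staticmethod
    def gen(i):
        return G({(i,): 1})

    @staticmethod
    def const(c):
        return G({(): c})

    def __add__(self, other):
        out = dict(self.terms)
        for k, v in other.terms.items():
            out[k] = out.get(k, 0) + v
        return G(out)

    def __mul__(self, other):
        out = {}
        for (ka, va), (kb, vb) in product(self.terms.items(), other.terms.items()):
            if set(ka) & set(kb):            # theta_i * theta_i = 0
                continue
            seq, sign = list(ka + kb), 1
            for i in range(len(seq)):        # bubble sort: one sign flip per swap
                for j in range(len(seq) - 1 - i):
                    if seq[j] > seq[j + 1]:
                        seq[j], seq[j + 1] = seq[j + 1], seq[j]
                        sign = -sign
            out[tuple(seq)] = out.get(tuple(seq), 0) + sign * va * vb
        return G(out)

    def scale(self, c):
        return G({k: c * v for k, v in self.terms.items()})

    def d(self, i):
        """Left derivative d/d(theta_i), which doubles as the Berezin integral over theta_i."""
        out = {}
        for k, v in self.terms.items():
            if i in k:
                p = k.index(i)
                key = k[:p] + k[p + 1:]
                out[key] = out.get(key, 0) + (-1) ** p * v
        return G(out)

    def scalar(self):
        return self.terms.get((), 0)

# One variable: phi(theta) = a + b*theta, shifted by an independent Grassmann constant c.
a, b = 3, 5
theta, c = G.gen(0), G.gen(1)
phi = G.const(a) + theta.scale(b)
phi_shifted = G.const(a) + (theta + c).scale(b)
I0 = G.const(1).d(0).scalar()
I1 = theta.d(0).scalar()
translation_invariant = phi.d(0).terms == phi_shifted.d(0).terms

# Gaussian integral for N = 2: generators 0, 2 play theta-bar_1,2 and 1, 3 play theta_1,2.
A = [[2, 3], [5, 7]]
tb, th = [G.gen(0), G.gen(2)], [G.gen(1), G.gen(3)]
S = G()
for i in range(2):
    for j in range(2):
        S = S + (tb[i] * th[j]).scale(A[i][j])
expS = G.const(1) + S + (S * S).scale(Fraction(1, 2))   # higher powers vanish
gaussian = expS.d(0).d(1).d(2).d(3).scalar()
det_A = A[0][0] * A[1][1] - A[0][1] * A[1][0]
print(I0, I1, translation_invariant, gaussian, det_A)
```

The exponential series stops at second order because only four generators are available, which is the same mechanism that makes superfield expansions terminate.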

Now, let us try to write down simple invariant actions based on F and D invariant terms. The simplest invariant action is called the Wess-Zumino model (c.f., PART IX – SUPERSYMMETRY: The Wess-Zumino Model):
$$S = \int d^8x\ \bar\phi\phi + \left[\int d^6x\left(\mu\phi + \frac{1}{2}m\phi^2 + \frac{\lambda}{3!}\phi^3\right) + \text{h.c.}\right]$$
Written out in components after performing the θ integration, it contains:
$$S = \int d^4x\left[-\frac{1}{2}(\partial_\mu A)^2 - \frac{1}{2}(\partial_\mu B)^2 - \frac{1}{2}\bar\chi\gamma^\mu\partial_\mu\chi + \frac{1}{2}F^2 + \frac{1}{2}G^2\right]$$
Notice that we have now constructed an action with an irreducible representation of supersymmetry with a spin-0 and spin-½ multiplet: (½,0).

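To construct the (1,½) multiplet, we need the following construction for the Maxwell action, given by:
$$S = \int d^4x\,d^2\theta\ W^A W_A$$
where:
$$W_a = \bar D^2 D_a V,\qquad \bar W_{\dot a} = D^2\bar D_{\dot a}V \qquad\text{and}\qquad \bar D_{\dot b}W_a = 0$$
where V is a real vector supermultiplet, which transforms as:
$$\delta V = \Lambda - \bar\Lambda$$
V is real, but Λ is chiral (i.e., Λ+ =−Λ). Under this transformation, we find that:
$$\delta W_a = 0$$
so the action is trivially invariant under both supersymmetry and U(1) gauge invariance. Notice that the vector supermultiplet contains the Maxwell field Aµ, while the chiral supermultiplet Λ contains the gauge parameter λ. Written out in components, this equals:
$$S = \int d^4x\left[-\frac{1}{4}F_{\mu\nu}^2 - \frac{1}{2}\bar\psi\gamma^\mu\partial_\mu\psi + \frac{1}{2}D^2\right]$$
which is invariant under:
$$\delta A_\mu = \bar\varepsilon\gamma_\mu\psi,\qquad \delta\psi = \left(-\frac{1}{2}\Sigma^{\mu\nu}F_{\mu\nu} + \gamma_5 D\right)\varepsilon \qquad\text{and}\qquad \delta D = \bar\varepsilon\gamma_5\gamma^\mu\partial_\mu\psi$$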

The next multiplet we wish to investigate is the (2,3/2) multiplet. Historically, it was
thought that the Rarita-Schwinger theory was fundamentally flawed because it permitted
no consistent couplings to other fields. However, physicists neglected to couple the Rari-
ta-Schwinger field with the graviton. The inconsistencies all disappear for this multiplet.
So, after having reviewed supersymmetry up to the Wess-Zumino action, we review
the steps it took to get there! After the discovery by Miyazawa of the supergroup SU(N/M)
in 1966, supersymmetry was rediscovered in 1971, from two entirely different
approaches. In the first, the Neveu-Schwarz-Ramond superstring was found to possess
a new anticommuting gauge symmetry. From this, Gervais and Sakita then wrote down
the first supersymmetric action, the two-dimensional superstring action. The second
approach was that of Gol’fand and Likhtman, who were looking for a generalization of
the usual space-time algebra and found the super Poincaré algebra. In 1972, Volkov and
Akulov found a nonlinear supersymmetric theory. And finally in 1974, Wess and Zumino
wrote down the first four-dimensional point-particle field theory action.
Although a wide variety of supersymmetric actions were then discovered in the 1970s,
for many years supersymmetry was considered a mathematical oddity, since none of the
known subatomic particles had supersymmetric partners. However, its possible
application to quantum physics came when attempts were made to iron out inconsistencies of GUT theories such as the theoretical problems surrounding the hierarchy problem (i.e., renormalization effects will inevitably mix the two mass scales in the theory, the GUT scale $M_X^2$ and the electroweak energy scale $M_W^2$).

One appealing solution to the hierarchy problem is to include supersymmetry, both local
and global. There are powerful nonrenormalization theorems in supersymmetric theories
that show that higher order interactions do not renormalize the mass scale (i.e., we do
not have to fine-tune these parameters to each other in perturbation theory). One fine-
tuning at the beginning is enough. This does not explain where this original fine tuning
came from; it only explains why higher-loop diagrams do not mix the two mass scales.
There are, however, many other reasons for examining supersymmetric theories. One
of the main problems in building unified field theories is the inability to find a gauge
group that can combine the particle spectrum with quantum gravity. The problem is the
no-go theorem, which states that a group that nontrivially combines both the Lorentz
group and a compact Lie group cannot have finite dimensional, unitary representations.
This means that attempts to build a Master Group that combines both gravity and the particle spectrum face an insurmountable difficulty.
So, what we will do now is review the key aspects of supersymmetry again, this time beginning from first principles. The intent here is to home in on the key developments and rederive the results we have just obtained, following a field-theoretic construction rather than the group-theoretically motivated one we have just seen. We begin with the way out of the no-go theorem: there is a way to evade the Coleman-Mandula theorem, and that is to use supersymmetry!
Supersymmetry

Since anticommuting Grassmann numbers (i.e., a Grassmann number is a mathematical construction which allows a path integral representation for fermionic fields and is defined by θiθj = −θjθi or {θi,θj} = 0) were never contemplated in the original derivation, the no-go theorem breaks down. The Coleman-Mandula theorem never analyzed a nontrivial symmetry that mixes bosonic and fermionic fields and places both in the same multiplet:
$$\text{Bosons} \leftrightarrow \text{Fermions}$$
Thus there exists a supersymmetry (charge) operator Q that converts a boson state |B〉 into a fermion state |F〉 by the operation:
$$Q\,|B\rangle = |F\rangle$$
As a consequence, electrons can appear in the same multiplet as the Maxwell field. In
fact, there is the possibility of placing all the known particles found in nature into the
same multiplet.
Perhaps one of the most remarkable aspects of supersymmetry is that it yields field
theories that are finite to all orders in perturbation theory. In particular, N=4 super Yang-
Mills theory, and certain versions of the N=2 super Yang-Mills theory, are finite to all
orders (i.e., Z=1 for all renormalization constants). This is a surprising result, which
indicates the power of supersymmetry in eliminating many, if not all, of the divergences
of certain quantum field theories.

Yet another attractive feature of the theory is that once supersymmetry becomes a
local gauge symmetry, it inevitably becomes a theory of gravity. This new theory, called
supergravity, has a new set of Ward identities that render the theory much more
convergent than ordinary gravity. In fact, the largest supergravity theory, which has SO(8)
symmetry, is almost big enough to accommodate all the elementary particles. Although
supergravity is not as divergent as ordinary gravity, the theory still is not finite. Local
supersymmetry, by itself, is not powerful enough to cancel all the divergences of the
theory. Second, the group SO(8) cannot (without extra bound states) include all the
particles of the Standard Model. To remedy some of these problems, we will have to go
to yet another, more powerful theory, the superstring theory.

We would first like to show that supersymmetry forces us to have equal numbers of bosons and fermions. The simplest example is the Hamiltonian:
$$H = \omega_a\,a^\dagger a + \omega_b\,b^\dagger b$$
where we have bosonic and fermionic harmonic oscillators that obey:
$$[a,a^\dagger] = \{b,b^\dagger\} = 1$$
The supersymmetric operator Q mentioned above is defined as:
$$Q \equiv b^\dagger a + a^\dagger b$$
If a†|0〉 is a one-boson state, then Qa†|0〉 is a one-fermion state, and vice versa. Q obeys the following identity:
$$[Q,H] = (\omega_a - \omega_b)\,Q$$
If ωa = ωb = ω, then the supersymmetric operator Q commutes with the Hamiltonian and:
$$\{Q,Q^\dagger\} = \frac{2}{\omega}H$$
These identities show that Q and Q† form a closed algebra with the Hamiltonian if the
fermions and bosons have equal energy. The unusual feature of these identities is
that the supersymmetric generator Q in some sense, is the square root of the
Hamiltonian. Furthermore, this highlights the fact that supersymmetry closes on
space-time transformations. In this sense, it is radically different from the other
symmetries that, say, treat space-time and isospin as entirely unrelated.
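These oscillator identities can be checked directly on Fock states. The sketch below is an illustration under stated assumptions, not the author's construction: a state is a dictionary {(n, f): amplitude} with n bosons and f = 0 or 1 fermions, Q is taken to be the Hermitian combination b†a + a†b, and the function names (`a`, `a_dag`, `b`, `b_dag`, `combine`, `Q`, `H`, `same`) are hypothetical. It confirms that Q commutes with H and that {Q,Q†} = 2Q² = (2/ω)H when the two frequencies are equal, and that the empty vacuum is annihilated by both Q and H. For unequal frequencies this particular Q gives [Q,H] = (ωa − ωb)(b†a − a†b), so the commutator still vanishes only at ωa = ωb.

```python
from math import sqrt, isclose

def a(s):       # bosonic annihilation: a|n,f> = sqrt(n)|n-1,f>
    return {(n - 1, f): sqrt(n) * c for (n, f), c in s.items() if n > 0}

def a_dag(s):   # bosonic creation: a†|n,f> = sqrt(n+1)|n+1,f>
    return {(n + 1, f): sqrt(n + 1) * c for (n, f), c in s.items()}

def b(s):       # fermionic annihilation: b|n,1> = |n,0>, b|n,0> = 0
    return {(n, 0): c for (n, f), c in s.items() if f == 1}

def b_dag(s):   # fermionic creation: b†|n,0> = |n,1>, b†|n,1> = 0
    return {(n, 1): c for (n, f), c in s.items() if f == 0}

def combine(s, t, ws=1.0, wt=1.0):
    """Linear combination ws*s + wt*t of two states."""
    out = {}
    for k, c in s.items():
        out[k] = out.get(k, 0.0) + ws * c
    for k, c in t.items():
        out[k] = out.get(k, 0.0) + wt * c
    return out

def Q(s):       # Q = b†a + a†b (assumed Hermitian form)
    return combine(b_dag(a(s)), a_dag(b(s)))

def H(s, wa, wb):   # H = wa a†a + wb b†b
    return combine(a_dag(a(s)), b_dag(b(s)), wa, wb)

def same(s, t):
    return all(isclose(s.get(k, 0.0), t.get(k, 0.0), abs_tol=1e-9)
               for k in set(s) | set(t))

w = 1.5
basis = [{(n, f): 1.0} for n in range(5) for f in (0, 1)]
commutes = all(same(Q(H(s, w, w)), H(Q(s), w, w)) for s in basis)
closes = all(same(combine(Q(Q(s)), {}, 2.0), combine(H(s, w, w), {}, 2.0 / w))
             for s in basis)
vacuum = {(0, 0): 1.0}
vacuum_ok = same(Q(vacuum), {}) and same(H(vacuum, w, w), {})
one_fermion = Q(a_dag(vacuum))      # Q a†|0> should be the one-fermion state |0,1>
print(commutes, closes, vacuum_ok, one_fermion)
```

Working on states rather than truncated matrices avoids the edge effects that a finite matrix for a and a† would introduce.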
Supersymmetric Actions

Another unusual consequence is that the energy of the vacuum must be zero in order to have supersymmetry. To see this, take the vacuum expectation value of both sides of the previous {Q,Q†} = (2/ω)H equation. In order to have a supersymmetric vacuum, we must have:
$$Q\,|0\rangle = 0$$
However, this implies that:
$$\langle 0|\,H\,|0\rangle = 0$$
so that the vacuum must have zero energy.
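The step from the algebra to the vanishing vacuum energy can be spelled out in one line. The following is a short worked version, assuming only the relation {Q,Q†} = (2/ω)H quoted above and a normalizable vacuum state:

```latex
\langle 0|H|0\rangle
  = \frac{\omega}{2}\,\langle 0|\left(QQ^\dagger + Q^\dagger Q\right)|0\rangle
  = \frac{\omega}{2}\left(\big\|Q^\dagger|0\rangle\big\|^2 + \big\|Q|0\rangle\big\|^2\right) \;\ge\; 0 ,
\qquad
\langle 0|H|0\rangle = 0 \iff Q|0\rangle = Q^\dagger|0\rangle = 0 .
```

Read in the other direction, a strictly positive vacuum energy would signal that the vacuum is not invariant under supersymmetry.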

To use supersymmetry to construct new actions, let us examine the very first and simplest supersymmetric action that was discovered in 1971. This is the action found by Gervais and Sakita that describes the Neveu-Schwarz-Ramond superstring with Lagrangian:
$$\mathcal{L} = i\bar\psi^a\gamma^\mu\partial_\mu\psi^a + \partial_\mu\phi^a\,\partial^\mu\phi^a$$
which is defined in two dimensions for real, Majorana spinors (and where a is an additional vector index). The action is invariant under*:
$$\delta\phi^a = \bar\varepsilon\psi^a \qquad\text{and}\qquad \delta\psi^a = -i\gamma^\mu\partial_\mu\phi^a\,\varepsilon$$
Notice that the supersymmetric parameter ε is an anticommuting spinor. This means that many of the classical theorems concerning Lie groups and Lie algebras no longer hold. Second, the fermions and bosons have the same index a (i.e., they must transform under the same representation of some group). Third, we notice that if we commute two successive variations of the fields, we find:
$$[\delta_1,\delta_2]\,\phi^a = \bar\varepsilon_1\gamma^\mu\varepsilon_2\,P_\mu\phi^a - \bar\varepsilon_2\gamma^\mu\varepsilon_1\,P_\mu\phi^a$$
These commutation relations mean that there exists a spinor operator Qα whose anticommutation relations with itself yield the translation operator Pµ. This generalizes the discussion earlier about Q forming a closed algebra with H. Now, we find that the supersymmetric generator forms a closed algebra with the vector Pµ.
* Factors of i = √(−1) will seem to be off here and there from now on compared to the first part of this Appendix since now we use the Wess-Bagger convention. Take note of this (or Freedman's convention) when reading through supersymmetry papers or books.

The previous action was written down in only two dimensions. To obtain a four-dimensional theory, we now study the free Wess-Zumino action (1974), where we again have Majorana spinors:
$$S = \frac{1}{2}\int d^4x\left[(\partial_\mu A)^2 + (\partial_\mu B)^2 + i\bar\psi\gamma^\mu\partial_\mu\psi + F^2 + G^2\right]$$
This action is invariant under:
$$\delta A = \bar\varepsilon\psi,\qquad \delta B = \bar\varepsilon\gamma_5\psi,$$
$$\delta F = i\bar\varepsilon\gamma^\mu\partial_\mu\psi,\qquad \delta G = i\bar\varepsilon\gamma_5\gamma^\mu\partial_\mu\psi,$$
$$\text{and}\qquad \delta\psi = -i\gamma^\mu(\partial_\mu A + \gamma_5\partial_\mu B)\varepsilon - (F + \gamma_5 G)\varepsilon$$
This action contains equal numbers of fermions and bosons, as desired. There are four components within the off-shell Majorana field ψ, and four boson fields A, B, F, and G. The supersymmetric algebra is thus linear. However, because F and G are auxiliary fields, we can eliminate them from the action from the very start. After this seemingly trivial elimination, the resulting action no longer has equal numbers of fermions and bosons! The action is still invariant under a modified form of supersymmetry, although it is no longer linear. By taking two such nonlinear supersymmetric variations, we find that the algebra does not close. The term that breaks the closure of the algebra is proportional to the equations of motion. The algebra then closes on-shell. It is therefore more convenient to retain these auxiliary fields in order to maintain a complete off-shell, linear algebra.

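Supersymmetry also generalizes to gauge theories. For example, the following gauge action with a Maxwell field and a Majorana spinor is invariant under global supersymmetry. It is the supersymmetric counterpart of QED:
$$S = \int d^4x\left[-\frac{1}{4}F_{\mu\nu}^2 + \frac{i}{2}\bar\psi\gamma^\mu\partial_\mu\psi + \frac{1}{2}D^2\right]$$
The fields transform under:
$$\delta A_\mu = i\bar\varepsilon\gamma_\mu\psi,\qquad \delta D = i\bar\varepsilon\gamma^\mu\partial_\mu\psi \qquad\text{and}\qquad \delta\psi = \left(-\frac{i}{2}\Sigma^{\mu\nu}F_{\mu\nu} - \gamma_5 D\right)\varepsilon$$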
Once again, we have equal numbers of fermions and bosons off-shell. Off-shell, the
Majorana field has 4 components, while the Aµ field has 3 components (because one is
eliminated by gauge fixing), and the D field has one component. We therefore have 4+4
fermions and bosons. On-shell, we also have the same number of fermions and bosons.
The ψ field now only has 2 components, the D field disappears, and the Aµ field has 2
components, so we are left with 2+2 fields on-shell. When we generalize this to non-
Abelian gauge transformations, we will find that the fermionic ψ must transform in the
adjoint representation, the same representation as the gauge fields, since they all
belong in the same multiplet.

So far, we have been exploring actions written totally in terms of their component fields. This, however, becomes prohibitively difficult as we go to non-Abelian and gravitational theories since the number of fields rapidly escalates for higher supersymmetric actions. In order to systematically generate new supersymmetric actions, we now turn to a new formalism, the superspace formalism, which is perhaps one of the most beautiful ways in which to compress the blizzard of indices that often appears in supersymmetric theories. This construction postulates the existence of four anticommuting coordinates θα that form the superpartner of the usual space-time coordinates xµ via:
$$\{x^\mu,\ \theta_\alpha\}$$
Supersymmetry, acting on the superspace coordinates, makes the following transformation:
$$x^\mu \to x^\mu + i\bar\varepsilon\gamma^\mu\theta \qquad\text{and}\qquad \theta^\alpha \to \theta^\alpha + \varepsilon^\alpha$$
In practice, the use of complex Dirac spinors leads to reducible representations of supersymmetry. In order to find irreducible representations, we will find it more convenient to use Majorana or Weyl spinors. We will therefore split the four-component spinor into two smaller spinors as follows:
$$\theta_\alpha \equiv \begin{pmatrix}\theta_a \\ \bar\theta^{\dot a}\end{pmatrix}$$
Superspace

In this formalism, we will take a modified Weyl representation of the Dirac matrices:
$$\gamma^\mu = \begin{pmatrix} 0 & \sigma^\mu \\ \bar\sigma^\mu & 0 \end{pmatrix} \qquad\text{and}\qquad \gamma_5 = \begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix}$$
where:
$$\sigma^\mu = [\mathbf{1},\sigma^i] = (\sigma^\mu)_{a\dot b} \qquad\text{and}\qquad \bar\sigma^\mu = [\mathbf{1},-\sigma^i] = (\bar\sigma^\mu)^{\dot a b} = (\sigma^\mu)^{b\dot a}$$
Then the typical spinor breaks up as follows:
$$\psi = \begin{pmatrix}\phi_a \\ \bar\chi^{\dot a}\end{pmatrix} \qquad\text{and}\qquad \bar\psi = \psi^\dagger\gamma^0 = [\,\chi^a,\ \bar\phi_{\dot a}\,]$$
In this representation, the spinors become reducible (i.e., the four-spinor ψ has now been broken up into two two-spinors φ and χ, each of which forms a two-dimensional representation of the Lorentz group). The Lorentz group generators can be obtained by multiplying the old generators Mµν = ½Σµν by the chiral projection operator P± = ½(1±iγ5). In this way, we can split the original 4×4 Lorentz generator into two distinct 2×2 blocks. Each two-spinor then transforms under a 2×2 complex representation of the Lorentz group, which we can show is SL(2,C), the set of 2×2 complex matrices with unit determinant. We use the fact that O(3,1)~SL(2,C).
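The two algebraic facts used in this passage can be verified numerically. The sketch below is an illustration only, written with plain Python lists so that it needs no external packages; all helper names are assumptions. It builds γµ from σµ = [1, σi] and σ̄µ = [1, −σi], checks the Clifford algebra {γµ,γν} = 2ηµν with signature (+,−,−,−), checks that a diagonal γ5 with entries ∓i anticommutes with every γµ so that ½(1 ± iγ5) are projectors (the overall sign of γ5 used here is an assumption and does not affect these checks), and finally illustrates O(3,1)~SL(2,C) by showing that X → M X M† with det M = 1 preserves det X = t² − x² − y² − z².

```python
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def add(A, B, s=1):
    return [[x + s * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def dagger(A):
    return [[complex(A[j][i]).conjugate() for j in range(len(A))] for i in range(len(A[0]))]

def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

I2 = [[1, 0], [0, 1]]
pauli = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
sigma = [I2] + pauli                                                   # [1, sigma^i]
sigma_bar = [I2] + [[[-x for x in row] for row in s] for s in pauli]   # [1, -sigma^i]

def block(top_right, bottom_left):
    """4x4 matrix with 2x2 blocks [[0, top_right], [bottom_left, 0]]."""
    return [[0, 0] + top_right[0], [0, 0] + top_right[1],
            bottom_left[0] + [0, 0], bottom_left[1] + [0, 0]]

gamma = [block(sigma[m], sigma_bar[m]) for m in range(4)]
eta = [1, -1, -1, -1]

def scaled_identity(c):
    return [[c if i == j else 0 for j in range(4)] for i in range(4)]

clifford_ok = all(
    add(matmul(gamma[m], gamma[n]), matmul(gamma[n], gamma[m]))
    == scaled_identity(2 * eta[m] if m == n else 0)
    for m in range(4) for n in range(4))

g5 = [[-1j, 0, 0, 0], [0, -1j, 0, 0], [0, 0, 1j, 0], [0, 0, 0, 1j]]
g5_anticommutes = all(
    add(matmul(g5, g), matmul(g, g5)) == scaled_identity(0) for g in gamma)
P_plus = [[0.5 * ((1 if i == j else 0) + 1j * g5[i][j]) for j in range(4)] for i in range(4)]
projector_ok = matmul(P_plus, P_plus) == P_plus

# SL(2,C) acting on X = t*1 + x*sigma_1 + y*sigma_2 + z*sigma_3.
t, x, y, z = 3, 1, 2, 1
X = [[t + z, x - 1j * y], [x + 1j * y, t - z]]
M = [[2, 1j], [0, 0.5]]                      # det M = 1 (hypothetical example matrix)
X_new = matmul(matmul(M, X), dagger(M))
interval = t * t - x * x - y * y - z * z
print(clifford_ok, g5_anticommutes, projector_ok, det2(X), det2(X_new), interval)
```

Because the determinant of X is the Minkowski interval, every unit-determinant complex matrix M induces a Lorentz transformation of (t, x, y, z).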

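Let the two-spinor $\theta_a$ transform as the fundamental representation of SL(2,C), where a = 1,2. The complex conjugates of these matrices generate an inequivalent representation of SL(2,C). We will label these two-spinors as $\bar\theta_{\dot a}$, where $\dot a$ = 1,2 and the dot reminds us that the two-spinor transforms under the inequivalent complex conjugate representation of SL(2,C).
We take the conjugation of spinors as follows:
$$(\theta_a)^* = \bar\theta_{\dot a} \qquad\text{and}\qquad (\theta^a)^* = \bar\theta^{\dot a}$$
Raising and lowering in this two-dimensional space is done via:
$$\varepsilon^{12} = \varepsilon_{12} = -\varepsilon^{\dot 1\dot 2} = -\varepsilon_{\dot 1\dot 2} = +1$$
so that:
$$\psi^a = \varepsilon^{ab}\psi_b \qquad\text{and}\qquad \bar\psi_{\dot a} = \bar\psi^{\dot b}\,\varepsilon_{\dot b\dot a}$$
and also
$$\psi_a = \psi^b\,\varepsilon_{ba} \qquad\text{and}\qquad \bar\psi^{\dot a} = \varepsilon^{\dot a\dot b}\,\bar\psi_{\dot b}$$
Invariants under each of the two groups are given by:
$$\chi\phi = \chi^a\phi_a = -\chi_a\phi^a \qquad\text{and}\qquad \bar\chi\bar\phi = \bar\chi_{\dot a}\bar\phi^{\dot a} = -\bar\chi^{\dot a}\bar\phi_{\dot a}$$
and also
$$\theta^2 = \theta^a\theta_a \qquad\text{and}\qquad \bar\theta^2 = \bar\theta_{\dot a}\bar\theta^{\dot a}$$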

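In this way, the standard invariant $\bar\psi\psi$ for four-spinors decomposes as:
$$\bar\psi\psi = [\,\chi^a,\ \bar\phi_{\dot a}\,]\begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix}\begin{pmatrix}\phi_a\\ \bar\chi^{\dot a}\end{pmatrix} = \chi\phi + \bar\phi\bar\chi$$
We also have:
$$\bar\psi_1\gamma^\mu\psi_2 = \chi_1\sigma^\mu\bar\chi_2 + \bar\phi_1\bar\sigma^\mu\phi_2$$
In two-spinor notation, the supersymmetric transformation in superspace (i.e., $x^\mu \to x^\mu + i\bar\varepsilon\gamma^\mu\theta$ and $\theta^\alpha \to \theta^\alpha + \varepsilon^\alpha$) is written as:
$$x^\mu \to x^\mu + i\theta\sigma^\mu\bar\varepsilon - i\varepsilon\sigma^\mu\bar\theta,\qquad \theta^a \to \theta^a + \varepsilon^a \qquad\text{and}\qquad \bar\theta_{\dot a} \to \bar\theta_{\dot a} + \bar\varepsilon_{\dot a}$$
Given this superspace transformation, we can now extract the operators that generate this transformation:
$$Q_a = \frac{\partial}{\partial\theta^a} - i(\sigma^\mu\bar\theta)_a\,\partial_\mu \qquad\text{and}\qquad \bar Q_{\dot a} = -\frac{\partial}{\partial\bar\theta^{\dot a}} + i(\theta\sigma^\mu)_{\dot a}\,\partial_\mu$$
The supersymmetric algebra now reads:
$$\{Q_a,\bar Q_{\dot b}\} = 2(\sigma^\mu)_{a\dot b}P_\mu,\qquad \{Q_a,Q_b\} = \{\bar Q_{\dot a},\bar Q_{\dot b}\} = 0,$$
$$[Q_a,M_{\mu\nu}] = \frac{1}{2}(\Sigma_{\mu\nu})_a{}^{b}\,Q_b,\qquad [\bar Q_{\dot a},M_{\mu\nu}] = -\frac{1}{2}\bar Q_{\dot b}\,(\bar\Sigma_{\mu\nu})^{\dot b}{}_{\dot a},$$
$$[Q_a,P_\mu] = 0 \qquad\text{and}\qquad [\bar Q_{\dot a},P_\mu] = 0$$
where Σµν = ½[γµ,γν].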
Using superspace methods, let us now construct a few representations of supersymmetry. We begin by constructing a vector superfield $V(x,\theta,\bar\theta)$ that is a function of superspace. Under a supersymmetric transformation, it transforms as:
$$\delta V(x,\theta) = i[\varepsilon Q + \bar Q\bar\varepsilon,\ V]$$
These superfields have many nice properties. The most important is that the product of two superfields is again a superfield (i.e., V1V2 = V3). Although this product rule is simple in superspace, written in component form it is highly nontrivial. Superspace thus gives a way of generating new representations of supersymmetry from old ones.
By expanding $V(x,\theta,\bar\theta)$ as a power series in θ and $\bar\theta$, we find that the series terminates after reaching the fourth power of the spinor because of its Grassmann nature. Since V is real, the most general parametrization is given by a Taylor expansion in the Grassmann variables:
$$V(x,\theta,\bar\theta) = C - i\theta\chi + i\bar\chi\bar\theta - \frac{i}{2}\theta^2(M - iN) + \frac{i}{2}\bar\theta^2(M + iN) - \theta\sigma^\mu\bar\theta A_\mu$$
$$\qquad\qquad - i\theta^2\bar\theta\left(\bar\lambda - \frac{i}{2}\sigma^\mu\partial_\mu\chi\right) + i\bar\theta^2\theta\left(\lambda - \frac{i}{2}\sigma^\mu\partial_\mu\bar\chi\right) - \frac{1}{2}\theta^2\bar\theta^2\left(D + \frac{1}{2}\partial^2 C\right)$$
This is called a vector superfield because it contains an ordinary vector field Aµ. The vector superfield has 8 fermionic fields contained within λ and χ as well as 8 bosonic fields contained within C, D, M, N, and Aµ. As such, we have an equal number of fermions and bosons, as required.

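Under a supersymmetric transformation, we have:
$$V(x,\theta,\bar\theta) \to V(x + i\theta\sigma^\mu\bar\varepsilon - i\varepsilon\sigma^\mu\bar\theta,\ \theta + \varepsilon,\ \bar\theta + \bar\varepsilon)$$
By power expanding the previous equation for $V(x,\theta,\bar\theta)$ and then equating coefficients, we can calculate the variation of all the fields within the vector superfield:
$$\delta C = \bar\varepsilon\gamma_5\chi,\qquad \delta\chi = (M + \gamma_5 N)\varepsilon - i\gamma^\mu(A_\mu + \gamma_5\partial_\mu C)\varepsilon,$$
$$\delta M = \bar\varepsilon(\lambda - i\not\!\partial\chi),\qquad \delta N = \bar\varepsilon\gamma_5(\lambda - i\not\!\partial\chi),$$
$$\delta A_\mu = i\bar\varepsilon\gamma_\mu\lambda + \bar\varepsilon\partial_\mu\chi,\qquad \delta\lambda = -i\Sigma^{\mu\nu}\varepsilon\,\partial_\mu A_\nu - \gamma_5\varepsilon D \qquad\text{and}\qquad \delta D = -i\bar\varepsilon\not\!\partial\gamma_5\lambda$$
We can introduce a new derivative operator, which differs from the supersymmetry generators by the relative sign of its two terms:
$$D_a = \frac{\partial}{\partial\theta^a} + i(\sigma^\mu\bar\theta)_a\,\partial_\mu \qquad\text{and}\qquad \bar D_{\dot a} = -\frac{\partial}{\partial\bar\theta^{\dot a}} - i(\theta\sigma^\mu)_{\dot a}\,\partial_\mu$$
The importance of this derivative operator is that it anticommutes with the supersymmetric generators. Because $D_a$ anticommutes with the supersymmetric generator, we can apply it at will on any superfield to form a constraint, such as:
$$\bar D_{\dot a}\phi = 0$$
This constraint does not spoil the transformation of φ under supersymmetry because $\bar D_{\dot a}$ anticommutes with the supersymmetry generator. A field that satisfies this constraint is called a chiral superfield (N.B., whereas an antichiral superfield satisfies $D_a\phi = 0$).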
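The anticommutation property can be checked in a few lines. The following worked sketch assumes the Wess-Bagger-type forms $Q_a = \partial_a - i(\sigma^\mu\bar\theta)_a\partial_\mu$, $\bar Q_{\dot b} = -\bar\partial_{\dot b} + i(\theta\sigma^\mu)_{\dot b}\partial_\mu$ and $\bar D_{\dot b} = -\bar\partial_{\dot b} - i(\theta\sigma^\mu)_{\dot b}\partial_\mu$, with the shorthand $\partial_a \equiv \partial/\partial\theta^a$ and $\bar\partial_{\dot b} \equiv \partial/\partial\bar\theta^{\dot b}$ (notation introduced here for brevity). Only the cross terms between a Grassmann derivative and the matching coordinate survive:

```latex
\{Q_a,\bar D_{\dot b}\}
  = -\,i\,\{\partial_a,\ \theta^c\}\,(\sigma^\mu)_{c\dot b}\,\partial_\mu
    \;+\; i\,(\sigma^\mu)_{a\dot c}\,\{\bar\theta^{\dot c},\ \bar\partial_{\dot b}\}\,\partial_\mu
  = -\,i(\sigma^\mu)_{a\dot b}\,\partial_\mu + i(\sigma^\mu)_{a\dot b}\,\partial_\mu = 0 ,
\\[4pt]
\{Q_a,\bar Q_{\dot b}\}
  = +\,i(\sigma^\mu)_{a\dot b}\,\partial_\mu + i(\sigma^\mu)_{a\dot b}\,\partial_\mu
  = 2i(\sigma^\mu)_{a\dot b}\,\partial_\mu .
```

The two contributions cancel for the covariant derivative and add for the supersymmetry generators, which is why the relative sign between the two terms of D must be opposite to that of Q; the second line is proportional to a translation, as the algebra requires.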

We can write down the solution to the chiral constraint equation as:
$$\phi(x,\theta,\bar\theta) = e^{-i\theta\not\partial\bar\theta}\,\phi(x,\theta) \qquad\text{and}\qquad \bar\phi(x,\theta,\bar\theta) = e^{\,i\theta\not\partial\bar\theta}\,\bar\phi(x,\bar\theta)$$
The problem of finding chiral superfields then reduces to the simpler problem of power expanding φ(x,θ), which terminates after only three terms:
$$\phi(x,\theta) = A + 2\theta\psi - \theta^2 F$$
Once again, the numbers of fermion and boson fields are equal. ψ contains four components, while A and F are complex scalar fields with four components in all. Written out explicitly, the variation of the fields is given by:
$$\delta A = 2\varepsilon\psi,\qquad \delta\psi = -\varepsilon F - i\partial_\mu A\,\sigma^\mu\bar\varepsilon \qquad\text{and}\qquad \delta F = -2i\partial_\mu\psi\,\sigma^\mu\bar\varepsilon$$
Given these vector and chiral superfields, we can now construct superfield actions that are manifestly supersymmetric. There are two ways in which supersymmetric invariant actions can be constructed, one for vector fields and the other for chiral fields. For vector fields, we simply integrate over all eight x, θ, and $\bar\theta$ variables. This integration selects out the D-term that appears in the variation of the vector field (N.B., the variation of the D-term, $\delta D = -i\bar\varepsilon\not\!\partial\gamma_5\lambda$ above, is a total derivative and hence always integrates to zero; that is why D-terms are always invariant). For chiral fields, we only integrate over the six variables x and θ. This selects out the F-term (N.B., the variation of the F-term, $\delta F = -2i\partial_\mu\psi\,\sigma^\mu\bar\varepsilon$ above, is also a total derivative).
The simplest action based on superspace is the Wess-Zumino action, given as:
$$S = \frac{1}{2}\int d^8x\ \bar\phi\phi - \left[\int d^6x\left(\mu\phi + \frac{1}{2}m\phi^2 + \frac{\lambda}{3!}\phi^3\right) + \text{h.c.}\right]$$
By integrating out the Grassmann variables, we retrieve the free Wess-Zumino action:
$$S = \frac{1}{2}\int d^4x\left[(\partial_\mu A)^2 + (\partial_\mu B)^2 + i\bar\psi\gamma^\mu\partial_\mu\psi + F^2 + G^2\right]$$
which we obtained earlier.
Not only can we find a superspace formulation of the Wess-Zumino model, we can
also find the superspace formulation of gauge theory! To introduce gauge invariance, we
first notice that the vector field V contains Aµ, which shall form the basis of a gauge
theory. The variation of V, in turn, looks very much like a chiral superfield Λ, which
contains the combination ∂µ A. What we want, therefore, is a real vector field V that
transforms as:
$$\delta V = -\frac{i}{2}(\Lambda - \bar\Lambda)$$
where Λ is a chiral superfield. This variation contains the U(1) symmetry transformation δAµ ~ ∂µA.

Now we wish to construct the counterpart of the Maxwell tensor Fµν, which is invariant under the δAµ ~ ∂µA transformation. Let us define a chiral superfield Wa:
$$W_a \equiv \frac{i}{4}\bar D^2 D_a V \qquad\text{and}\qquad \bar D_{\dot a}W_a = 0$$
where the last identity is important because it shows that Wa is a chiral superfield. One can show that the Maxwell tensor is contained within Wa and it can also be shown that:
$$\delta W_a = 0$$
where we have used the fact that both Λ and Wa are chiral superfields.
Our gauge-invariant action is therefore:
$$\frac{1}{16}\int d^4x\,d^2\theta\ W^a W_a$$
which is invariant under both gauge and supersymmetry. Next, we will show that this
yields the correct U(1) action when we perform the integration. In general, this
integration is rather lengthy. But we can circumvent this by using up many of the
degrees of freedom in the gauge transformation so that the theory is defined only in
terms of the important fields.

Since A and F within the chiral field Λ are complex, we will find it convenient to redefine these fields as A → A+iB and F → F+iG. Then under this gauge transformation, we have:
$$C \to C + B,\qquad \chi \to \chi + \psi,\qquad M + iN \to M + iN - F - iG,$$
$$A_\mu \to A_\mu + \partial_\mu A,\qquad \lambda \to \lambda \qquad\text{and}\qquad D \to D$$
so that the A field is the gauge parameter associated with the gauge field Aµ. We can use B, ψ, F, and G to eliminate C, χ, M, and N. This leaves us with a reduced vector multiplet:
$$V = [\,0,\ 0,\ 0,\ 0,\ A_\mu,\ \lambda,\ D\,]$$
We call this the Wess-Zumino gauge, where we have partially used up the gauge degree of freedom within the chiral superfields, leaving us with only the gauge multiplet that includes the Maxwell field Aµ. Placing the chiral superfield Wa into the action, we obtain the original super-Maxwell theory:
$$S = \int d^4x\left[-\frac{1}{4}F_{\mu\nu}^2 + \frac{i}{2}\bar\psi\gamma^\mu\partial_\mu\psi + \frac{1}{2}D^2\right]$$
which we also obtained earlier.

Now for a result that has not been obtained earlier. The generalization of the prior construction to the full non-Abelian theory is also somewhat straightforward. Let us define V = V^a τ^a, where the τ^a matrices generate some Lie group. The vector superfield V will transform as:
$$\delta V = ig(\Lambda - \Lambda^\dagger)$$
for some chiral superfield Λ. Then we also have:
$$e^{-2V} \to e^{-i\Lambda^\dagger}\,e^{-2V}\,e^{\,i\Lambda}$$
Then define:
$$W_a = -\frac{i}{8}\,\bar D\bar D\,e^{-2V}D_a\,e^{2V}$$
If we include matter fields within a chiral superfield, then:
$$\phi \to e^{-i\Lambda}\phi \qquad\text{and}\qquad \bar\phi \to \bar\phi\,e^{\,i\Lambda^\dagger}$$
Then the coupling to the matter fields arises through:
$$\bar\phi\,e^{-2V}\phi$$
Let us now put everything together. The most general coupling between a superfield V and a matter superfield φ is given by:
$$\mathcal{L} = \frac{1}{16}\,\mathrm{Tr}\int d^4x\,d^2\theta\ W^a W_a + \text{h.c.} + \frac{1}{8}\int d^4x\,d^4\theta\ \bar\phi\,e^{-2V}\phi$$

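After performing the Grassmann integrations, the Lagrangian density equals:
$$\mathcal{L} = \int d^4x\ \mathrm{Tr}\Big[-\frac{1}{4}F_{\mu\nu}F^{\mu\nu} + \frac{i}{2}\bar\lambda\gamma^\mu\nabla_\mu\lambda + \frac{1}{2}D^2 + \frac{1}{2}\nabla_\mu A\nabla^\mu A + \frac{1}{2}\nabla_\mu B\nabla^\mu B + \frac{i}{2}\bar\psi\sigma^\mu\nabla_\mu\psi$$
$$\qquad\qquad + \frac{1}{2}F^2 + \frac{1}{2}G^2 - i[A,B]D - i\bar\psi[\lambda,A] - i\bar\psi\gamma_5[\lambda,B]\Big]$$
where the matter field is in the adjoint representation of the group.
We also have the freedom of adding the most general renormalizable self-interaction of the φ field, which is at most cubic:
$$\int d^4x\,d^2\theta\left(\lambda_i\phi_i + \frac{1}{2}m_{ij}\phi_i\phi_j + \frac{1}{3}g_{ijk}\phi_i\phi_j\phi_k\right) + \text{h.c.}$$
where the terms in the interaction must be gauge invariant.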

P.D.B. Collins, A.D. Martin, E.J. Squires, Particle Physics and Cosmology, Wiley, 1989.
University of Durham, England
This readable introduction to particle physics and cosmology discusses the interaction of these two fundamental branches of physics
and considers recent advances beyond the standard models. Eight chapters comprise a brief introduction to the gauge theories of
the strong and the electroweak interactions, the so-called grand unified theories, and general relativity. Ten more chapters address
recent concepts such as composite fermions and bosons, supersymmetry, quantum gravity, supergravity, and strings theories, and
relate them to modern cosmology and experimental astronomy.
B. Zwiebach, A First Course in String Theory, 1-st Edition, Cambridge University Press, 2006 (2-nd Edition/2009 now available)
Massachusetts Institute of Technology
An accessible introduction to string theory, this book provides a detailed and self-contained demonstration of the main concepts
involved. The first part deals with basic ideas, reviewing special relativity and electromagnetism while introducing the concept of extra
dimensions. D-branes and the classical dynamics of relativistic strings are discussed next, and the quantization of open and closed
bosonic strings in the light-cone gauge, along with a brief introduction to superstrings. The second part begins with a detailed study of
D-branes followed by string thermodynamics. It discusses possible physical applications, and covers T-duality of open and closed
strings, electromagnetic fields on D-branes, Born/Infeld electrodynamics, covariant string quantization and string interactions.
Primarily aimed as a textbook for advanced undergraduate and beginning graduate courses, it will also be ideal for a wide range of
scientists and mathematicians who are curious about string theory.
K. Becker, M. Becker and J. H. Schwarz, String Theory and M-Theory, Cambridge University Press, 2007
Texas A&M University and California Institute of Technology
String theory is one of the most exciting and challenging areas of modern theoretical physics. This book guides the reader from the
basics of string theory to recent developments. It introduces the basics of perturbative string theory, world-sheet supersymmetry,
space-time supersymmetry, conformal field theory and the heterotic string, before describing modern developments, including D-
branes, string dualities and M-theory. It then covers string geometry and flux compactifications, applications to cosmology and
particle physics, black holes in string theory and M-theory, and the microscopic origin of black-hole entropy. It concludes with Matrix
theory, the AdS/CFT duality and its generalizations. This book is ideal for graduate students and researchers in modern string theory,
and will make an excellent textbook for a one-year course on string theory.
L. E. Ibáñez and A. M. Uranga, String Theory and Particle Physics, Cambridge University Press, 2012
Universidad Autónoma de Madrid and Instituto de Física Teórica, IFT/UAM-CSIC
String theory is one of the most active branches of theoretical physics and has the potential to provide a unified description of all
known particles and interactions. This book is a systematic introduction to the subject, focused on the detailed description of how
string theory is connected to the real world of particle physics. Aimed at graduate students and researchers working in high energy
physics, it provides explicit models of physics beyond the Standard Model. No prior knowledge of string theory is required as all
necessary material is provided in the introductory chapters. The book provides particle phenomenologists with the information
needed to understand string theory model building and describes in detail several alternative approaches to model building,
such as heterotic string compactifications, intersecting D-brane models, D-branes at singularities and F-theory.
References / Study Guide

Even before string theory,
especially as physics developed
in the 20th century, it turned
out that the equations that
really work in describing nature
with the most generality and
greatest simplicity are very
elegant and subtle.
Edward Witten

[Figure: web of dualities linking the five D = 10 string theories, namely Type I (SO(32)), Type IIA, Type IIB, Heterotic SO(32) (Type HO) and Heterotic E8⊗E8 (Type HE), to D = 11 supergravity (SUGRA) and M-theory.]
