Toward a theory of chaos

Tutorials and Reviews

International Journal of Bifurcation and Chaos, Vol. 13, No. 11 (2003) 3147–3233
© World Scientific Publishing Company

A. SENGUPTA
Department of Mechanical Engineering,
Indian Institute of Technology Kanpur,
Kanpur 208016, India
osegu@iitk.ac.in

Received February 23, 2001; Revised September 19, 2002

This paper formulates a new approach to the study of chaos in discrete dynamical systems based on the notions of inverse ill-posed problems, set-valued mappings, generalized and multivalued inverses, graphical convergence of a net of functions in an extended multifunction space [Sengupta & Ray, 2000], and the topological theory of convergence. Order, chaos and complexity are described as distinct components of this unified mathematical structure, which can be viewed as an application of the theory of convergence in topological spaces to increasingly nonlinear mappings, with the boundary between order and complexity in the topology of graphical convergence being the region in Multi(X) that is susceptible to chaos. The paper uses results from the discretized spectral approximation in neutron transport theory [Sengupta, 1988, 1995] and concludes that the numerically exact results obtained by this approximation of the Case singular eigenfunction solution are due to the graphical convergence of the Poisson and conjugate Poisson kernels to the Dirac delta and the principal value multifunctions, respectively. In Multi(X), the continuous spectrum is shown to reduce to a point spectrum, and we introduce a notion of latent chaotic states to interpret superposition over generalized eigenfunctions. Along with these latent states, the spectral theory of nonlinear operators is used to conclude that nature supports complexity to attain efficiently a multiplicity of states that otherwise would remain unavailable to it.

Keywords: Chaos; complexity; ill-posed problems; graphical convergence; topology; multifunctions.

Prologue

1. Generally speaking, the analysis of chaos is extremely difficult. While a general definition for chaos applicable to most cases of interest is still lacking, mathematicians agree that for the special case of iteration of transformations there are three common characteristics of chaos:

   1. Sensitive dependence on initial conditions,
   2. Mixing,
   3. Dense periodic points.

   [Peitgen, Jurgens & Saupe, 1992]

2. The study of chaos is a part of a larger program of study of so-called “strongly” nonlinear system. . . . Linearity means that the rule that determines what a piece of a system is going to do next is not influenced by what it is doing now. More precisely this is intended in a differential or incremental sense: For a linear spring, the increase of its tension is proportional to the increment whereby it is stretched, with the ratio of these increments exactly independent of how much it has already been stretched. Such a spring can be stretched arbitrarily far . . . . Accordingly no real spring is linear. The mathematics of linear objects is particularly felicitous. As it happens, linear objects enjoy an identical, simple geometry. The simplicity of this geometry always allows a relatively easy mental image to capture the essence of a problem, with the technicality, growing with the number of parts, basically a detail. The historical prejudice against nonlinear problems is that no so simple nor universal geometry usually exists.

   Mitchell Feigenbaum’s Foreword (pp. 1–7) in [Peitgen et al., 1992]

3. The objective of this symposium is to explore the impact of the emerging science of chaos on various disciplines and the broader implications for science and society. The characteristic of chaos is its universality and ubiquity. At this meeting, for example, we have scholars representing mathematics, physics, biology, geophysics and geophysiology, astronomy, medicine, psychology, meteorology, engineering, computer science, economics and social sciences.¹ Having so many disciplines meeting together, of course, involves the risk that we might not always speak the same language, even if all of us have come to talk about “chaos”.

   Opening address of Heitor Gurgulino de Souza, Rector United Nations University, Tokyo [de Souza, 1997]

4. The predominant approach (of how the different fields of science relate to one other) is reductionist: Questions in physical chemistry can be understood in terms of atomic physics, cell biology in terms of how biomolecules work . . . . We have the best of reasons for taking this reductionist approach: it works. But shortfalls in reductionism are increasingly apparent (and) there is something to be gained from supplementing the predominantly reductionist approach with an integrative agenda. This special section on complex systems is an initial scan (where) we have taken a “complex system” to be one whose properties are not fully explained by an understanding of its component parts. Each Viewpoint author² was invited to define “complex” as it applied to his or her discipline.

   [Gallagher & Appenzeller, 1999]

5. One of the most striking aspects of physics is the simplicity of its laws. Maxwell’s equations, Schroedinger’s equations, and Hamilton mechanics can each be expressed in a few lines. . . . Everything is simple and neat except, of course, the world. Every place we look outside the physics classroom we see a world of amazing complexity. . . . So why, if the laws are so simple, is the world so complicated? To us complexity means that we have structure with variations. Thus a living organism is complicated because it has many different working parts, each formed by variations in the working out of the same genetic coding. Chaos is also found very frequently. In a chaotic world it is hard to predict which variation will arise in a given place and time. A complex world is interesting because it is highly structured. A chaotic world is interesting because we do not know what is coming next. Our world is both complex and chaotic. Nature can produce complex structures even in simple situations and obey simple laws even in complex situations.

   [Goldenfeld & Kadanoff, 1999]

6. Where chaos begins, classical science stops. For as long as the world has had physicists inquiring into the laws of nature, it has suffered a special ignorance about disorder in the atmosphere, in the turbulent sea, in the fluctuations in the wildlife populations, in the oscillations of the heart and the brain. But in the 1970s a few scientists began to find a way through disorder. They were mathematicians, physicists, biologists, chemists . . . (and) the insights that emerged led directly into the natural world: the shapes of clouds, the paths of lightning, the microscopic intertwining of blood vessels, the galactic clustering of stars. . . . Chaos breaks across the lines that separate scientific disciplines, (and) has become a shorthand name for a fast growing movement that is reshaping the fabric of the scientific establishment.

   [Gleick, 1987]

7. order → complexity → chaos.

   [Waldrop, 1992]

¹ A partial listing of papers is as follows: Chaos and Politics: Application of Nonlinear Dynamics to Socio-Political Issues; Chaos in Society: Reflections on the Impact of Chaos Theory on Sociology; Chaos in Neural Networks; The Impact of Chaos on Mathematics; The Impact of Chaos on Physics; The Impact of Chaos on Economic Theory; The Impact of Chaos on Engineering; The Impact of Chaos on Biology; Dynamical Disease: And The Impact of Nonlinear Dynamics and Chaos on Cardiology and Medicine.

² The eight Viewpoint articles are titled: Simple Lessons from Complexity; Complexity in Chemistry; Complexity in Biological Signaling Systems; Complexity and the Nervous System; Complexity, Pattern, and Evolutionary Trade-Offs in Animal Aggregation; Complexity in Natural Landform Patterns; Complexity and Climate, and Complexity and the Economy.


8. Our conclusions based on these examples seem simple: At present chaos is a philosophical term, not a rigorous mathematical term. It may be a subjective notion illustrating the present day limitations of the human intellect or it may describe an intrinsic property of nature such as the “randomness” of the sequence of prime numbers. Moreover, chaos may be undecidable in the sense of Gödel in that no matter what definition is given for chaos, there is some example of chaos which cannot be proven to be chaotic from the definition.

   [Brown & Chua, 1996]

9. My personal feeling is that the definition of a “fractal” should be regarded in the same way as the biologist regards the definition of “life”. There is no hard and fast definition, but just a list of properties characteristic of a living thing . . . . Most living things have most of the characteristics on the list, though there are living objects that are exceptions to each of them. In the same way, it seems best to regard a fractal as a set that has properties such as those listed below, rather than to look for a precise definition which will certainly exclude some interesting cases.

   [Falconer, 1990]

10. Dynamical systems are often said to exhibit chaos without a precise definition of what this means.

   [Robinson, 1999]

1. Introduction

The purpose of this paper is to present a unified, self-contained mathematical structure and physical understanding of the nature of chaos in a discrete dynamical system and to suggest a plausible explanation of why natural systems tend to be chaotic. The somewhat extensive quotations with which we begin above bear testimony both to the increasingly significant (and perhaps all-pervasive) role of nonlinearity in the world today and to our imperfect state of understanding of its manifestations. The list of papers at both the UN Conference [de Souza, 1997] and in Science [Gallagher & Appenzeller, 1999] is noteworthy if only to justify the observation of Gleick [1987] that “chaos seems to be everywhere”. Even as everybody appears to be finding chaos and complexity in all likely and unlikely places, and possibly because of it, it is necessary that we have a mathematically clear physical understanding of these notions that are supposedly reshaping our view of nature. This paper is an attempt to contribute to this goal. To make this account essentially self-contained we include here, as far as this is practical, the basics of the background material needed to understand the paper in the form of Tutorials and an extended Appendix.

The paradigm of chaos of the kneading of the dough is considered to provide an intuitive basis of the mathematics of chaos [Peitgen et al., 1992], and one of our fundamental objectives here is to recount the mathematical framework of this process in terms of the theory of ill-posed problems arising from non-injectivity [Sengupta, 1997], maximal ill-posedness, and graphical convergence of functions [Sengupta & Ray, 2000]. A natural mathematical formulation of the kneading of the dough in the form of stretch-cut-and-paste and stretch-cut-and-fold operations is in the ill-posed problem arising from the increasing non-injectivity of the function $f$ modeling the kneading operation.

Begin Tutorial 1: Functions and Multifunctions

A relation, or correspondence, between two sets $X$ and $Y$, written $M\colon X \rightrightarrows Y$, is basically a rule that associates subsets of $X$ to subsets of $Y$; this is often expressed as $(A, B) \in M$ where $A \subset X$ and $B \subset Y$ and $(A, B)$ is an ordered pair of sets. The domain

$$D(M) \overset{\text{def}}{=} \{A \subset X : (\exists Z \in M)(\pi_X(Z) = A)\}$$

and range

$$R(M) \overset{\text{def}}{=} \{B \subset Y : (\exists Z \in M)(\pi_Y(Z) = B)\}$$

of $M$ are, respectively, the collection of subsets of $X$ that correspond under $M$ to subsets of $Y$, and the collection of subsets of $Y$ so obtained; here $\pi_X$ and $\pi_Y$ are the projections of $Z$ on $X$ and $Y$, respectively. Equivalently, $D(M) = \{x \in X : M(x) \neq \emptyset\}$ and $R(M) = \bigcup_{x \in D(M)} M(x)$. The inverse $M^{-}$ of $M$ is the relation

$$M^{-} = \{(B, A) : (A, B) \in M\}$$

so that $M^{-}$ assigns $A$ to $B$ iff $M$ assigns $B$ to $A$. In general, a relation may assign many elements in its range to a single element from its domain; of

[Figure 1, panels (a)–(d): graphic not recoverable from the source.]

Fig. 1. Functional and non-functional relations between two sets $X$ and $Y$: while $f$ and $g$ are functional relations, $M$ is not. (a) $f$ and $g$ are both injective and surjective (i.e. they are bijective), (b) $g$ is bijective but $f$ is only injective and $f^{-1}(\{y_2\}) := \emptyset$, (c) $f$ is not 1:1, $g$ is not onto, while (d) $M$ is not a function but is a multifunction.
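
The distinctions drawn in Fig. 1 can be tried out on small finite sets. The following is only a minimal sketch, not the paper's construction: the sets `X`, `Y`, the relation `f` and the helper names are hypothetical, chosen to mimic the spirit of panels (c) and (d), and a relation is represented simply as a Python set of ordered pairs.

```python
# Hypothetical finite example; relations are sets of ordered pairs (x, y).

def inverse(rel):
    """Inverse relation: swap the components of every ordered pair."""
    return {(y, x) for (x, y) in rel}

def is_functional(rel):
    """True if no first component is paired with two different second components."""
    seen = {}
    for x, y in rel:
        if seen.setdefault(x, y) != y:
            return False
    return True

def is_injective(rel):
    """A functional relation is 1:1 exactly when its inverse is also functional."""
    return is_functional(rel) and is_functional(inverse(rel))

def is_surjective(rel, codomain):
    """True if every element of the codomain is hit by the relation."""
    return {y for (_, y) in rel} == set(codomain)

X = {"x1", "x2", "x3"}
Y = {"y1", "y2", "y3"}

# f is a function that is neither 1:1 (x1 and x2 share y1) nor onto (y3 is missed).
f = {("x1", "y1"), ("x2", "y1"), ("x3", "y2")}

# Its inverse sends y1 to both x1 and x2, so it is a multifunction, not a function.
M = inverse(f)

print(is_functional(f), is_injective(f), is_surjective(f, Y))
print(is_functional(M))
```

In this representation the inverse of a relation always exists as a relation; what fails for a non-injective `f` is only that the inverse is no longer functional.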

especial significance are functional relations $f$³ that can assign only a unique element in $R(f)$ to any element in $D(f)$. Figure 1 illustrates the distinction between arbitrary and functional relations $M$ and $f$. This difference between functions (or maps) and multifunctions is basic to our development and should be fully understood. Functions can again be classified as injections (or 1:1) and surjections (or onto). $f\colon X \to Y$ is said to be injective (or one-to-one) if $x_1 \neq x_2 \Rightarrow f(x_1) \neq f(x_2)$ for all $x_1, x_2 \in X$, while it is surjective (or onto) if $Y = f(X)$. $f$ is bijective if it is both 1:1 and onto.

Associated with a function $f\colon X \to Y$ is its inverse $f^{-1}\colon Y \supseteq R(f) \to X$ that exists on $R(f)$ iff $f$ is injective. Thus when $f$ is bijective, $f^{-1}(y) := \{x \in X : y = f(x)\}$ exists for every $y \in Y$; in fact $f$ is bijective iff $f^{-1}(\{y\})$ is a singleton for each $y \in Y$. Non-injective functions are not at all rare; if anything, they are very common even for linear maps and it would be perhaps safe to conjecture that they are overwhelmingly predominant in the nonlinear world of nature. Thus for example, the simple linear homogeneous differential equation with constant coefficients of order $n > 1$ has $n$ linearly independent solutions so that the operator $D^n$ of $D^n(y) = 0$ has an $n$-dimensional null space. Inverses of non-injective, and in general non-bijective, functions will be denoted by $f^{-}$. If $f$ is not injective then

$$A \subset f^{-}f(A) \overset{\text{def}}{=} \operatorname{sat}(A)$$

where $\operatorname{sat}(A)$ is the saturation of $A \subseteq X$ induced by $f$; if $f$ is not surjective then

$$f f^{-}(B) := B \cap f(X) \subseteq B.$$

If $A = \operatorname{sat}(A)$, then $A$ is said to be saturated, and $B \subseteq R(f)$ whenever $f f^{-}(B) = B$. Thus for non-injective $f$, $f^{-}f$ is not an identity on $X$ just as $f f^{-}$ is not $1_Y$ if $f$ is not surjective. However the set of relations

$$f f^{-} f = f, \qquad f^{-} f f^{-} = f^{-} \tag{1}$$

that is always true will be of basic significance in this work. Following are some equivalent statements

³ We do not distinguish between a relation and its graph although technically they are different objects. Thus although a functional relation, strictly speaking, is the triple $(X, f, Y)$ written traditionally as $f\colon X \to Y$, we use it synonymously with the graph $f$ itself. Parenthetically, the word functional in this paper is not necessarily employed for a scalar-valued function, but is used in a wider sense to distinguish between a function and an arbitrary relation (that is a multifunction). Formally, whereas an arbitrary relation from $X$ to $Y$ is a subset of $X \times Y$, a functional relation must satisfy an additional restriction that requires $y_1 = y_2$ whenever $(x, y_1) \in f$ and $(x, y_2) \in f$. In this subset notation, $(x, y) \in f \Leftrightarrow y = f(x)$.


on the injectivity and surjectivity of functions $f\colon X \to Y$.

(Injec) $f$ is 1:1 $\Leftrightarrow$ there is a function $f_L\colon Y \to X$ called the left inverse of $f$, such that $f_L f = 1_X$ $\Leftrightarrow$ $A = f^{-}f(A)$ for all subsets $A$ of $X$ $\Leftrightarrow$ $f(\bigcap A_i) = \bigcap f(A_i)$.

(Surjec) $f$ is onto $\Leftrightarrow$ there is a function $f_R\colon Y \to X$ called the right inverse of $f$, such that $f f_R = 1_Y$ $\Leftrightarrow$ $B = f f^{-}(B)$ for all subsets $B$ of $Y$.

As we are primarily concerned with non-injectivity of functions, saturated sets generated by equivalence classes of $f$ will play a significant role in our discussions. A relation $E$ on a set $X$ is said to be an equivalence relation if it is⁴

(ER1) Reflexive: $(\forall x \in X)(xEx)$.
(ER2) Symmetric: $(\forall x, y \in X)(xEy \Rightarrow yEx)$.
(ER3) Transitive: $(\forall x, y, z \in X)(xEy \wedge yEz \Rightarrow xEz)$.

Equivalence relations group together unequal elements $x_1 \neq x_2$ of a set as equivalent according to the requirements of the relation. This is expressed as $x_1 \sim x_2 \pmod{E}$ and will be represented here by the shorthand notation $x_1 \sim_E x_2$, or even simply as $x_1 \sim x_2$ if the specification of $E$ is not essential. Thus for a non-injective map if $f(x_1) = f(x_2)$ for $x_1 \neq x_2$, then $x_1$ and $x_2$ can be considered to be equivalent to each other since they map onto the same point under $f$; thus $x_1 \sim_f x_2 \Leftrightarrow f(x_1) = f(x_2)$ defines the equivalence relation $\sim_f$ induced by the map $f$. Given an equivalence relation $\sim$ on a set $X$ and an element $x \in X$ the subset

$$[x] \overset{\text{def}}{=} \{y \in X : y \sim x\}$$

is called the equivalence class of $x$; thus $x \sim y \Leftrightarrow [x] = [y]$. In particular, equivalence classes generated by $f\colon X \to Y$, $[x]_f = \{x_\alpha \in X : f(x_\alpha) = f(x)\}$, will be a cornerstone of our analysis of chaos generated by the iterates of non-injective maps, and the equivalence relation $\sim_f := \{(x, y) : f(x) = f(y)\}$ generated by $f$ is uniquely defined by the partition that $f$ induces on $X$. Of course as $x \sim x$, $x \in [x]$. It is a simple matter to see that any two equivalence classes are either disjoint or equal so that the equivalence classes generated by an equivalence relation on $X$ form a disjoint cover of $X$. The quotient set of $X$ under $\sim$, denoted by $X/{\sim} := \{[x] : x \in X\}$, has the equivalence classes $[x]$ as its elements; thus $[x]$ plays a dual role either as subsets of $X$ or as elements of $X/{\sim}$. The rule $x \mapsto [x]$ defines a surjective function $Q\colon X \to X/{\sim}$ known as the quotient map.

Example 1.1. Let

$$S^1 = \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 1\}$$

be the unit circle in $\mathbb{R}^2$. Consider $X = [0, 1]$ as a subspace of $\mathbb{R}$, define a map

$$q\colon X \to S^1, \qquad s \mapsto (\cos 2\pi s, \sin 2\pi s), \quad s \in X,$$

from $\mathbb{R}$ to $\mathbb{R}^2$, and let $\sim$ be the equivalence relation on $X$

$$s \sim t \Leftrightarrow (s = t) \vee (s = 0, t = 1) \vee (s = 1, t = 0).$$

If we bend $X$ around till its ends touch, the resulting circle represents the quotient set $Y = X/{\sim}$ whose points are equivalent under $\sim$ as follows

$$[0] = \{0, 1\} = [1], \qquad [s] = \{s\} \text{ for all } s \in (0, 1).$$

Thus $q$ is bijective for $s \in (0, 1)$ but two-to-one for the special values $s = 0$ and $1$, so that for $s, t \in X$,

$$s \sim t \Leftrightarrow q(s) = q(t).$$

This yields a bijection $h\colon X/{\sim} \to S^1$ such that

$$q = h \circ Q$$

defines the quotient map $Q\colon X \to X/{\sim}$ by $h([s]) = q(s)$ for all $s \in [0, 1]$. The situation is illustrated by the commutative diagram of Fig. 2 that appears as an integral component in a different and more general context in Sec. 2. It is to be noted that commutativity of the diagram implies that if a given equivalence relation $\sim$ on $X$ is completely determined by $q$ that associates the partitioning equivalence classes in $X$ to unique points in $S^1$, then $\sim$ is identical to the equivalence relation that is induced by $Q$ on $X$. Note that a larger size of the equivalence classes can be obtained by considering $X = \mathbb{R}_+$ for which $s \sim t \Leftrightarrow |s - t| \in \mathbb{Z}_+$.

End Tutorial 1
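
The notions of Tutorial 1 (the multivalued inverse $f^{-}$, saturation, the relations (1), and the equivalence classes of Example 1.1) can be checked on finite data. What follows is a minimal sketch under stated assumptions: the squaring map on a few integers, the helper names, and the 9-point grid standing in for $X = [0, 1]$ are all hypothetical choices, and coordinates on the circle are rounded to 12 places so that $q(0)$ and $q(1)$ compare equal in floating point.

```python
import math

def image(f, A):
    """f(A) for a function stored as a dict {x: f(x)}."""
    return {f[x] for x in A}

def preimage(f, B):
    """Multivalued inverse f^-(B) = {x : f(x) in B}."""
    return {x for x in f if f[x] in B}

def saturation(f, A):
    """sat(A) = f^- f(A), the union of the classes [x]_f that meet A."""
    return preimage(f, image(f, A))

def classes(f):
    """The partition of the domain into equivalence classes [x]_f."""
    fibres = {}
    for x, y in f.items():
        fibres.setdefault(y, set()).add(x)
    return [frozenset(c) for c in fibres.values()]

# Hypothetical non-injective, non-surjective map: x -> x^2 on {-3,...,3}.
f = {x: x * x for x in range(-3, 4)}
A = {1, 2}
B = {1, 2, 3}

print(saturation(f, A))            # A is strictly contained in sat(A)
print(image(f, preimage(f, B)))    # f f^-(B) = B ∩ f(X), a subset of B

# Example 1.1 on a grid: q(s) = (cos 2πs, sin 2πs), s = 0, 1/8, ..., 1.
def q(s):
    return (round(math.cos(2 * math.pi * s), 12),
            round(math.sin(2 * math.pi * s), 12))

q_map = {i / 8: q(i / 8) for i in range(9)}
print(sorted(sorted(c) for c in classes(q_map)))
```

On the grid, the only class with more than one element is the pair of end points, which is the discrete counterpart of $[0] = \{0, 1\} = [1]$.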

⁴ An alternate useful way of expressing these properties for a relation $R$ on $X$ is
(ER1) $R$ is reflexive iff $1_X \subseteq R$
(ER2) $R$ is symmetric iff $R = R^{-1}$
(ER3) $R$ is transitive iff $R \circ R \subseteq R$,
with $R$ an equivalence relation only if $R \circ R = R$.

[Figure 2: graphic not recoverable from the source.]

Fig. 2. The quotient map $Q$.

One of the central concepts that we consider and employ in this work is the inverse $f^{-}$ of a nonlinear, non-injective, function $f$; here the equivalence classes $[x]_f = f^{-}f(x)$ of $x \in X$ are the saturated subsets of $X$ that partition $X$. While a detailed treatment of this question in the form of the nonlinear ill-posed problem and its solution is given in Sec. 2 [Sengupta, 1997], it is sufficient to point out here from Figs. 1(c) and 1(d), that the inverse of a non-injective function is not a function but a multifunction while the inverse of a multifunction is a non-injective function. Hence one has the general result that

$$\begin{aligned} f \text{ is a non-injective function} &\iff f^{-} \text{ is a multifunction}, \\ f \text{ is a multifunction} &\iff f^{-} \text{ is a non-injective function}. \end{aligned} \tag{2}$$

The inverse of a multifunction $M\colon X \rightrightarrows Y$ is a generalization of the corresponding notion for a function $f\colon X \to Y$ such that

$$M^{-}(y) \overset{\text{def}}{=} \{x \in X : y \in M(x)\}$$

leads to

$$M^{-}(B) = \{x \in X : M(x) \cap B \neq \emptyset\}$$

for any $B \subseteq Y$, while a more restricted inverse that we shall not be concerned with is given as $M^{+}(B) = \{x \in X : M(x) \subseteq B\}$. Obviously, $M^{+}(B) \subseteq M^{-}(B)$. A multifunction is injective if $x_1 \neq x_2 \Rightarrow M(x_1) \cap M(x_2) = \emptyset$, and in common with functions, it is true that $M\left(\bigcup_{\alpha \in D} A_\alpha\right) = \bigcup_{\alpha \in D} M(A_\alpha)$ and $M\left(\bigcap_{\alpha \in D} A_\alpha\right) \subseteq \bigcap_{\alpha \in D} M(A_\alpha)$ where $D$ is an index set. The following illustrates the difference between the two inverses of $M$. Let $X$ be a set that is partitioned into two disjoint $M$-invariant subsets $X_1$ and $X_2$. If $x \in X_1$ (or $x \in X_2$) then $M(x)$ represents that part of $X_1$ (or of $X_2$) that is realized immediately after one application of $M$, while $M^{-}(x)$ denotes the possible precursors of $x$ in $X_1$ (or of $X_2$) and $M^{+}(B)$ is that subset of $X$ whose image lies in $B$ for any subset $B \subset X$.

In this paper the multifunctions that we shall be explicitly concerned with arise as the inverses of non-injective maps.

The second major component of our theory is the graphical convergence of a net of functions to a multifunction. In Tutorial 2 below, we replace for the sake of simplicity and without loss of generality, the net (which is basically a sequence where the index set is not necessarily the positive integers; thus every sequence is a net but the family⁵ indexed, for example, by $\mathbb{Z}$, the set of all integers, is a net and not a sequence) with a sequence and provide the necessary background and motivation for the concept of graphical convergence.

Begin Tutorial 2: Convergence of Functions

This Tutorial reviews the inadequacy of the usual notions of convergence of functions either to limit functions or to distributions and suggests the motivation and need for introduction of the notion of graphical convergence of functions to multifunctions. Here, we follow closely the exposition of Korevaar [1968], and use the notation $(f_k)_{k=1}^{\infty}$ to denote real or complex valued functions on a bounded or unbounded interval $J$.

A sequence of piecewise continuous functions $(f_k)_{k=1}^{\infty}$ is said to converge to the function $f$, notation $f_k \to f$, on a bounded or unbounded interval $J$⁶

(1) Pointwise if

$$f_k(x) \to f(x) \quad \text{for all } x \in J,$$

⁵ A function $\chi\colon D \to X$ will be called a family in $X$ indexed by $D$ when reference to the domain $D$ is of interest, and a net when it is required to focus attention on its values in $X$.

⁶ Observe that it is not being claimed that $f$ belongs to the same class as $(f_k)$. This is the single most important cornerstone on which this paper is based: the need to “complete” spaces that are topologically “incomplete”. The classical high-school example of the related problem of having to enlarge, or extend, spaces that are not big enough is the solution space of algebraic equations with real coefficients like $x^2 + 1 = 0$.


i.e. given any arbitrary real number $\varepsilon > 0$ there exists a $K \in \mathbb{N}$ that may depend on $x$, such that $|f_k(x) - f(x)| < \varepsilon$ for all $k \geq K$.

(2) Uniformly if

$$\sup_{x \in J} |f(x) - f_k(x)| \to 0 \quad \text{as } k \to \infty,$$

i.e. given any arbitrary real number $\varepsilon > 0$ there exists a $K \in \mathbb{N}$, such that $\sup_{x \in J} |f_k(x) - f(x)| < \varepsilon$ for all $k \geq K$.

(3) In the mean of order $p \geq 1$ if $|f(x) - f_k(x)|^p$ is integrable over $J$ for each $k$ and

$$\int_J |f(x) - f_k(x)|^p \to 0 \quad \text{as } k \to \infty.$$

For $p = 1$, this is the simple case of convergence in the mean.

(4) In the mean $m$-integrally if it is possible to select indefinite integrals

$$f_k^{(-m)}(x) = \pi_k(x) + \int_c^x dx_1 \int_c^{x_1} dx_2 \cdots \int_c^{x_{m-1}} dx_m\, f_k(x_m)$$

and

$$f^{(-m)}(x) = \pi(x) + \int_c^x dx_1 \int_c^{x_1} dx_2 \cdots \int_c^{x_{m-1}} dx_m\, f(x_m)$$

such that for some arbitrary real $p \geq 1$,

$$\int_J \left|f^{(-m)} - f_k^{(-m)}\right|^p \to 0 \quad \text{as } k \to \infty,$$

where the polynomials $\pi_k(x)$ and $\pi(x)$ are of degree $< m$, and $c$ is a constant to be chosen appropriately.

(5) Relative to test functions $\varphi$ if $f\varphi$ and $f_k\varphi$ are integrable over $J$ and

$$\int_J (f_k - f)\varphi \to 0 \quad \text{for every } \varphi \in C_0^{\infty}(J) \text{ as } k \to \infty,$$

where $C_0^{\infty}(J)$ is the class of infinitely differentiable continuous functions that vanish throughout some neighborhood of each of the end points of $J$. For an unbounded $J$, a function is said to vanish in some neighborhood of $+\infty$ if it vanishes on some ray $(r, \infty)$.

While pointwise convergence does not imply any other type of convergence, uniform convergence on a bounded interval implies all the other convergences.

It is to be observed that apart from pointwise and uniform convergences, all the other modes listed above represent some sort of an averaged contribution of the entire interval $J$ and are therefore not of much use when pointwise behavior of the limit $f$ is necessary. Thus while limits in the mean are not unique, oscillating functions are tamed by $m$-integral convergence for adequately large values of $m$, and convergence relative to test functions, as we see below, can be essentially reduced to $m$-integral convergence. On the contrary, our graphical convergence (which may be considered as a pointwise biconvergence with respect to both the direct and inverse images of $f$, just as usual pointwise convergence is with respect to its direct image only) allows a sequence (in fact, a net) of functions to converge to an arbitrary relation, unhindered by external influences such as the effects of integrations and test functions. To see how this can indeed matter, consider the following

Example 1.2. Let $f_k(x) = \sin kx$, $k = 1, 2, \ldots$ and let $J$ be any bounded interval of the real line. Then 1-integrally we have

$$f_k^{(-1)}(x) = -\frac{1}{k}\cos kx = -\frac{1}{k} + \int_0^x \sin kx_1\, dx_1,$$

which obviously converges to 0 uniformly (and therefore in the mean) as $k \to \infty$. And herein lies the point: even though we cannot conclude about the exact nature of $\sin kx$ as $k$ increases indefinitely (except that its oscillations become more and more pronounced), we may very definitely state that $\lim_{k \to \infty} (\cos kx)/k = 0$ uniformly. Hence from

$$f_k^{(-1)}(x) \to 0 = 0 + \int_0^x \lim_{k \to \infty} \sin kx_1\, dx_1$$

it follows that

$$\lim_{k \to \infty} \sin kx = 0 \tag{3}$$

1-integrally.

Continuing with the same sequence of functions, we now examine its test-functional convergence with respect to $\varphi \in C_0^1(-\infty, \infty)$ that vanishes for all $x \notin (\alpha, \beta)$. Integrating by parts,

$$\int_{-\infty}^{\infty} f_k\varphi = \int_\alpha^\beta \varphi(x_1)\sin kx_1\, dx_1 = -\frac{1}{k}\Big[\varphi(x_1)\cos kx_1\Big]_\alpha^\beta + \frac{1}{k}\int_\alpha^\beta \varphi'(x_1)\cos kx_1\, dx_1 .$$

[Figure 3, panels (a)–(c): graphic not recoverable from the source.]

Fig. 3. Incompleteness of function spaces. (a) demonstrates the classic example of non-completeness of the space of real-valued continuous functions leading to the complete spaces $L^n[a, b]$ whose elements are equivalence classes of functions with $f \sim g$ iff the Lebesgue integral $\int_a^b |f - g|^n = 0$. (b) and (c) illustrate distributional convergence of the functions $f_k(x)$ of Eq. (5) to the Dirac delta $\delta(x)$ leading to the complete space of generalized functions. In comparison, note that the space of continuous functions in the uniform metric $C[a, b]$ is complete which suggests the importance of topologies in determining convergence properties of spaces.
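
The behavior summarized in Fig. 3 can be probed numerically. The sketch below is illustrative only and rests on assumptions not in the paper: the interval is taken as $[a, b] = [-1, 1]$, integrals are replaced by simple sums on a dyadic grid, and the test function `phi` is a standard smooth bump chosen for convenience. It checks that the ramps $f_k$ of Eq. (5) stay a fixed distance apart in the sup metric while approaching each other in the mean, and that the pairing of the box functions $\delta_k$ of panel (b) with `phi` approaches `phi(0)`.

```python
import math

def f(k, x):
    """Ramp of Eq. (5): 0 for x <= 0, k*x on [0, 1/k], 1 for x >= 1/k."""
    return min(max(k * x, 0.0), 1.0)

# Dyadic grid on the assumed interval [a, b] = [-1, 1], step 1/4096.
H = 1.0 / 4096
XS = [-1.0 + i * H for i in range(8193)]

def sup_dist(j, k):
    """Grid approximation of the uniform distance sup |f_j - f_k|."""
    return max(abs(f(j, x) - f(k, x)) for x in XS)

def mean_dist(j, k):
    """Riemann-sum approximation of the mean distance, integral of |f_j - f_k|."""
    return H * sum(abs(f(j, x) - f(k, x)) for x in XS)

def phi(x):
    """Hypothetical smooth test function supported on [-1, 1]."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def delta_pairing(k, n=1000):
    """Midpoint-rule value of integral delta_k * phi = k * integral_0^{1/k} phi."""
    h = (1.0 / k) / n
    return k * h * sum(phi((i + 0.5) * h) for i in range(n))

for k in (4, 16, 64):
    print(k, sup_dist(k, 2 * k), mean_dist(k, 2 * k))

for k in (10, 100, 1000):
    print(k, delta_pairing(k), phi(0.0))
```

The sup distance between $f_k$ and $f_{2k}$ does not shrink with $k$, which is the numerical face of the sequence failing to be Cauchy in the uniform metric, whereas the mean distance decays like $1/(4k)$.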

The first integrated term is 0 due to the conditions on $\varphi$ while the second also vanishes because $\varphi \in C_0^1(-\infty, \infty)$. Hence

$$\int_{-\infty}^{\infty} f_k\varphi \to 0 = \int_\alpha^\beta \lim_{k \to \infty} \varphi(x_1)\sin kx_1\, dx_1$$

for all $\varphi$, and leading to the conclusion that

$$\lim_{k \to \infty} \sin kx = 0 \tag{4}$$

test-functionally.

This example illustrates the fact that if $\operatorname{Supp}(\varphi) = [\alpha, \beta] \subseteq J$,⁷ integrating by parts a sufficiently large number of times so as to wipe out the pathological behavior of $(f_k)$ gives

$$\int_J f_k\varphi = \int_\alpha^\beta f_k\varphi = -\int_\alpha^\beta f_k^{(-1)}\varphi' = \cdots = (-1)^m \int_\alpha^\beta f_k^{(-m)}\varphi^{(m)}$$

where $f_k^{(-m)}(x) = \pi_k(x) + \int_c^x dx_1 \int_c^{x_1} dx_2 \cdots \int_c^{x_{m-1}} dx_m\, f_k(x_m)$ is an $m$-times arbitrary indefinite integral of $f_k$. If now it is true that $\int_\alpha^\beta f_k^{(-m)} \to \int_\alpha^\beta f^{(-m)}$, then it must also be true that $f_k^{(-m)}\varphi^{(m)}$ converges in the mean to $f^{(-m)}\varphi^{(m)}$ so that

$$\int_\alpha^\beta f_k\varphi = (-1)^m \int_\alpha^\beta f_k^{(-m)}\varphi^{(m)} \to (-1)^m \int_\alpha^\beta f^{(-m)}\varphi^{(m)} = \int_\alpha^\beta f\varphi .$$

In fact the converse also holds leading to the following Equivalences between $m$-convergence in the mean and convergence with respect to test-functions [Korevaar, 1968].

Type 1 Equivalence. If $f$ and $(f_k)$ are functions on $J$ that are integrable on every interior subinterval, then the following are equivalent statements.

(a) For every interior subinterval $I$ of $J$ there is an integer $m_I \geq 0$, and hence a smallest integer $m \geq 0$, such that certain indefinite integrals $f_k^{(-m)}$ of the functions $f_k$ converge in the mean on $I$ to an indefinite integral $f^{(-m)}$; thus $\int_I |f_k^{(-m)} - f^{(-m)}| \to 0$.

(b) $\int_J (f_k - f)\varphi \to 0$ for every $\varphi \in C_0^{\infty}(J)$.

A significant generalization of this Equivalence is obtained by dropping the restriction that the limit object $f$ be a function. The need for this generalization arises because metric function spaces are

⁷ By definition, the support (or supporting interval) of $\varphi(x) \in C_0^{\infty}[\alpha, \beta]$ is $[\alpha, \beta]$ if $\varphi$ and all its derivatives vanish for $x \leq \alpha$ and $x \geq \beta$.


known not to be complete: Consider the sequence of functions [Fig. 3(a)]

$$f_k(x) = \begin{cases} 0, & \text{if } a \leq x \leq 0 \\ kx, & \text{if } 0 \leq x \leq \dfrac{1}{k} \\ 1, & \text{if } \dfrac{1}{k} \leq x \leq b \end{cases} \tag{5}$$

which is not Cauchy in the uniform metric $\rho(f_j, f_k) = \sup_{a \leq x \leq b} |f_j(x) - f_k(x)|$ but is Cauchy in the mean $\rho(f_j, f_k) = \int_a^b |f_j(x) - f_k(x)|\,dx$, or even pointwise. However in either case, $(f_k)$ cannot converge in the respective metrics to a continuous function and the limit is a discontinuous unit step function

$$\Theta(x) = \begin{cases} 0, & \text{if } a \leq x \leq 0 \\ 1, & \text{if } 0 < x \leq b \end{cases}$$

with graph $([a, 0], 0) \cup ((0, b], 1)$, which is also integrable on $[a, b]$. Thus even if the limit of the sequence of continuous functions is not continuous, both the limit and the members of the sequence are integrable functions. This Riemann integration is not sufficiently general, however, and this type of integrability needs to be replaced by a much weaker condition resulting in the larger class of the Lebesgue integrable complete space of functions $L[a, b]$.⁸

The functions in Fig. 3(b),

$$\delta_k(x) = \begin{cases} k, & \text{if } 0 < x < \dfrac{1}{k} \\ 0, & x \in [a, b] - \left(0, \dfrac{1}{k}\right), \end{cases}$$

can be associated with the arbitrary indefinite integrals

$$\Theta_k(x) \overset{\text{def}}{=} \delta_k^{(-1)}(x) = \begin{cases} 0, & a \leq x \leq 0 \\ kx, & 0 < x < \dfrac{1}{k} \\ 1, & \dfrac{1}{k} \leq x \leq b \end{cases}$$

of Fig. 3(c), which, as noted above, converge in the mean to the unit step function $\Theta(x)$; hence $\int_{-\infty}^{\infty} \delta_k\varphi \equiv \int_\alpha^\beta \delta_k\varphi = -\int_\alpha^\beta \delta_k^{(-1)}\varphi' \to -\int_0^\beta \varphi'(x)\,dx = \varphi(0)$. But there can be no functional relation $\delta(x)$ for which $\int_\alpha^\beta \delta(x)\varphi(x)\,dx = \varphi(0)$ for all $\varphi \in C_0^1[\alpha, \beta]$, so that unlike in the case in Type 1 Equivalence, the limit in the mean $\Theta(x)$ of the indefinite integrals $\delta_k^{(-1)}(x)$ cannot be expressed as the indefinite integral $\delta^{(-1)}(x)$ of some function $\delta(x)$ on any interval containing the origin. This leads to the second more general type of equivalence.

Type 2 Equivalence. If $(f_k)$ are functions on $J$ that are integrable on every interior subinterval, then the following are equivalent statements.

(a) For every interior subinterval $I$ of $J$ there is an integer $m_I \geq 0$, and hence a smallest integer $m \geq 0$, such that certain indefinite integrals $f_k^{(-m)}$ of the functions $f_k$ converge in the mean on $I$ to an integrable function $\Theta$ which, unlike in Type 1 Equivalence, need not itself be an indefinite integral of some function $f$.

⁸ Both Riemann and Lebesgue integrals can be formulated in terms of the so-called step functions $s(x)$, which are piecewise constant functions with values $(\sigma_i)_{i=1}^{I}$ on a finite number of bounded subintervals $(J_i)_{i=1}^{I}$ (which may reduce to a point or may not contain one or both of the end points) of a bounded or unbounded interval $J$, with integral $\int_J s(x)\,dx \overset{\text{def}}{=} \sum_{i=1}^{I} \sigma_i |J_i|$. While the Riemann integral of a bounded function $f(x)$ on a bounded interval $J$ is defined with respect to sequences of step functions $(s_j)_{j=1}^{\infty}$ and $(t_j)_{j=1}^{\infty}$ satisfying $s_j(x) \leq f(x) \leq t_j(x)$ on $J$ with $\int_J (s_j - t_j) \to 0$ as $j \to \infty$ as $R\!\int_J f(x)\,dx = \lim \int_J s_j(x)\,dx = \lim \int_J t_j(x)\,dx$, the less restrictive Lebesgue integral is defined for arbitrary functions $f$ over bounded or unbounded intervals $J$ in terms of Cauchy sequences of step functions $\int_J |s_i - s_k| \to 0$, $i, k \to \infty$, converging to $f(x)$ as

$$s_j(x) \to f(x) \quad \text{pointwise almost everywhere on } J,$$

to be

$$\int_J f(x)\,dx \overset{\text{def}}{=} \lim_{j \to \infty} \int_J s_j(x)\,dx .$$

That the Lebesgue integral is more general (and therefore is the proper candidate for completion of function spaces) is illustrated by the example of the function defined over $[0, 1]$ to be 0 on the rationals and 1 on the irrationals, for which an application of the definitions verifies that while the Riemann integral is undefined, the Lebesgue integral exists and has value 1. The Riemann integral of a bounded function over a bounded interval exists and is equal to its Lebesgue integral. Because it involves a larger family of functions, all integrals in integral convergences are to be understood in the Lebesgue sense.

3156 A. Sengupta

(b) $c_k(\varphi) = \int_J f_k \varphi \to c(\varphi)$ for every $\varphi \in C_0^{\infty}(J)$.

Since we are now given that $\int_I f_k^{(-m)}(x)\,dx \to \int_I \Psi(x)\,dx$, it must also be true that $f_k^{(-m)} \varphi^{(m)}$ converges in the mean to $\Psi \varphi^{(m)}$ whence

\[
\int_J f_k \varphi = (-1)^m \int_I f_k^{(-m)} \varphi^{(m)} \to (-1)^m \int_I \Psi \varphi^{(m)} \neq (-1)^m \int_I f^{(-m)} \varphi^{(m)} .
\]

The natural question that arises at this stage is then: What is the nature of the relation (not function any more) Ψ(x)? For this it is now stipulated, despite the non-equality in the equation above, that as in the mean m-integral convergence of $(f_k)$ to a function f,

\[
\Theta(x) := \lim_{k \to \infty} \delta_k^{(-1)}(x) \overset{\text{def}}{=} \int_{-\infty}^{x} \delta(x')\,dx' \qquad (6)
\]

defines the non-functional relation (“generalized function”) δ(x) integrally as a solution of the integral equation (6) of the first kind; hence formally$^{9}$

\[
\delta(x) = \frac{d\Theta}{dx} \qquad (7)
\]

End Tutorial 2

The above tells us that the “delta function” is not a function but its indefinite integral is the piecewise continuous function Θ obtained as the mean (or pointwise) limit of a sequence of non-differentiable functions with the integral of $d\Theta_k(x)/dx$ being preserved for all $k \in \mathbb{Z}_+$. What then is the delta (and not its integral)? The answer to this question is contained in our multifunctional extension Multi(X, Y) of the function space Map(X, Y) considered in Sec. 3. Our treatment of ill-posed problems is used to obtain an understanding and interpretation of the numerical results of the discretized spectral approximation in neutron transport theory [Sengupta, 1988, 1995]. The main conclusions are the following: In a one-dimensional discrete system that is governed by the iterates of a nonlinear map, the dynamics is chaotic if and only if the system evolves to a state of maximal ill-posedness. The analysis is based on the non-injectivity, and hence ill-posedness, of the map; this may be viewed as a mathematical formulation of the stretch-and-fold and stretch-cut-and-paste kneading operations of the dough that are well-established artifacts in the theory of chaos, and the concept of maximal ill-posedness helps in obtaining a physical understanding of the nature of chaos. We do this through the fundamental concept of the graphical convergence of a sequence (generally a net) of functions [Sengupta & Ray, 2000] that is allowed to converge graphically, when the conditions are right, to a set-valued map or multifunction. Since ill-posed problems naturally lead to multifunctional inverses through functional generalized inverses [Sengupta, 1997], it is natural to seek solutions of ill-posed problems in multifunctional space Multi(X, Y) rather than in spaces of functions Map(X, Y); here Multi(X, Y) is an extension of Map(X, Y) that is generally larger than the smallest dense extension $\mathrm{Multi}_{|}(X, Y)$.

Feedback and iteration are natural processes by which nature evolves itself. Thus almost every process of evolution is a self-correction process by which the system proceeds from the present to the future through a controlled mechanism of input and evaluation of the past. Evolution laws are inherently nonlinear and complex; here complexity is to be understood as the natural manifestation of the nonlinear laws that govern the evolution of the system.

This paper presents a mathematical description of complexity based on [Sengupta, 1997] and [Sengupta & Ray, 2000] and is organized as follows. In Sec. 1, we follow [Sengupta, 1997] to give an overview of ill-posed problems and their solution that forms the foundation of our approach. Sections 2 to 4 apply these ideas by defining a chaotic dynamical system as a maximally ill-posed problem; by doing this we are able to overcome the limitations of the three Devaney characterizations of chaos [Devaney, 1989] that apply to the specific case of iteration of transformations in a metric space, and the resulting graphical convergence of functions to multifunctions is the basic tool of our approach. Section 5 analyzes graphical convergence in Multi(X) for the discretized spectral approximation

9
The observant reader cannot have failed to notice how mathematical ingenuity successfully transferred the “troubles” of $(\delta_k)_{k=1}^{\infty}$ to the sufficiently differentiable benevolent receptor ϕ so as to be able to work backward, via the resultant trouble free $(\delta_k^{(-m)})_{k=1}^{\infty}$, to the final object δ. This necessarily hides the true character of δ to allow only a view of its integral manifestation on functions. This unfortunately is not general enough in the strongly nonlinear physical situations responsible for chaos, and is the main reason for constructing the multifunctional extension of function spaces that we use.


of neutron transport theory, which suggests a natural link between ill-posed problems and spectral theory of nonlinear operators. This seems to offer an answer to the question of why a natural system should increase its complexity, and eventually tend toward chaoticity, by becoming increasingly nonlinear.

2. Ill-Posed Problem and Its Solution

This section based on [Sengupta, 1997] presents a formulation and solution of ill-posed problems arising out of the non-injectivity of a function f: X → Y between topological spaces X and Y. A workable knowledge of this approach is necessary as our theory of chaos leading to the characterization of chaotic systems as being a maximally ill-posed state of a dynamical system is a direct application of these ideas and can be taken to constitute a mathematical representation of the familiar stretch-cut-and-paste and stretch-and-fold paradigms of chaos. The problem of finding an x ∈ X for a given y ∈ Y from the functional relation f(x) = y is an inverse problem that is ill-posed (or, the equation f(x) = y is ill-posed) if any one or more of the following conditions are satisfied.

(IP1) f is not injective. This non-uniqueness problem of the solution for a given y is the single most significant criterion of ill-posedness used in this work.

(IP2) f is not surjective. For a y ∈ Y, this is the existence problem of the given equation.

(IP3) When f is bijective, the inverse $f^{-1}$ is not continuous, which means that small changes in y may lead to large changes in x.

A problem f(x) = y for which a solution exists, is unique, and for which small changes in the data y lead to only small changes in the solution x is said to be well-posed or properly posed. This means that f(x) = y is well-posed if f is bijective and the inverse $f^{-1}$: Y → X is continuous; otherwise the equation is ill-posed or improperly posed. It is to be noted that the three criteria are not, in general, independent of each other. Thus if f represents a bijective, bounded linear operator between Banach spaces X and Y, then the inverse mapping theorem guarantees that the inverse $f^{-1}$ is continuous. Hence ill-posedness depends not only on the algebraic structures of X, Y, f but also on the topologies of X and Y.

Example 2.1. As a non-trivial example of an inverse problem, consider the heat equation

\[
\frac{\partial \theta(x, t)}{\partial t} = c^2 \frac{\partial^2 \theta(x, t)}{\partial x^2}
\]

for the temperature distribution θ(x, t) of a one-dimensional homogeneous rod of length L satisfying the initial condition $\theta(x, 0) = \theta_0(x)$, 0 ≤ x ≤ L, and boundary conditions θ(0, t) = 0 = θ(L, t), 0 ≤ t ≤ T, having the Fourier sine-series solution

\[
\theta(x, t) = \sum_{n=1}^{\infty} A_n \sin\left(\frac{n\pi}{L} x\right) e^{-\lambda_n^2 t} \qquad (8)
\]

where $\lambda_n = (c\pi/a)n$ and

\[
A_n = \frac{2}{L} \int_0^a \theta_0(x') \sin\left(\frac{n\pi}{L} x'\right) dx'
\]

are the Fourier expansion coefficients. While the direct problem evaluates θ(x, t) from the differential equation and initial temperature distribution $\theta_0(x)$, the inverse problem calculates $\theta_0(x)$ from the integral equation

\[
\theta_T(x) = \frac{2}{L} \int_0^a k(x, x')\,\theta_0(x')\,dx', \qquad 0 \le x \le L,
\]

when this final temperature $\theta_T$ is known, and

\[
k(x, x') = \sum_{n=1}^{\infty} \sin\left(\frac{n\pi}{L} x\right) \sin\left(\frac{n\pi}{L} x'\right) e^{-\lambda_n^2 T}
\]

is the kernel of the integral equation. In terms of the final temperature the distribution becomes

\[
\theta_T(x) = \sum_{n=1}^{\infty} B_n \sin\left(\frac{n\pi}{L} x\right) e^{-\lambda_n^2 (t - T)} \qquad (9)
\]

with Fourier coefficients

\[
B_n = \frac{2}{L} \int_0^a \theta_T(x') \sin\left(\frac{n\pi}{L} x'\right) dx' .
\]

In $L_2[0, a]$, Eqs. (8) and (9) at t = T and t = 0 yield respectively

\[
\|\theta_T(x)\|_2^2 = \frac{L}{2} \sum_{n=1}^{\infty} A_n^2 e^{-2\lambda_n^2 T} \le e^{-2\lambda_1^2 T} \|\theta_0\|_2^2 \qquad (10)
\]

\[
\|\theta_0\|_2^2 = \frac{L}{2} \sum_{n=1}^{\infty} B_n^2 e^{2\lambda_n^2 T} . \qquad (11)
\]

The last two equations differ from each other in the significant respect that whereas Eq. (10) shows


that the direct problem is well-posed according to (IP3), Eq. (11) means that in the absence of similar bounds the inverse problem is ill-posed.$^{10}$

Example 2.2. Consider the Volterra integral equation of the first kind

\[
y(x) = \int_a^x r(x')\,dx' = Kr
\]

where y, r ∈ C[a, b] and K: C[0, 1] → C[0, 1] is the corresponding integral operator. Since the differential operator D = d/dx under the sup-norm $\|r\| = \sup_{0 \le x \le 1} |r(x)|$ is unbounded, the inverse problem r = Dy for a differentiable function y on [a, b] is ill-posed, see Example 6.1. However, y = Kr becomes well-posed if y is considered to be in $C^1[0, 1]$ with norm $\|y\| = \sup_{0 \le x \le 1} |Dy|$. This illustrates the importance of the topologies of X and Y in determining the ill-posed nature of the problem when this is due to (IP3).

Ill-posed problems in nonlinear mathematics of type (IP1) arising from the non-injectivity of f can be considered to be a generalization of non-uniqueness of solutions of linear equations as, for example, in eigenvalue problems or in the solution of a system of linear algebraic equations with a larger number of unknowns than the number of equations. In both cases, for a given y ∈ Y, the solution set of the equation f(x) = y is given by

\[
f^{-}(y) = [x]_f = \{x' \in X : f(x') = f(x) = y\} .
\]

A significant point of difference between linear and nonlinear problems is that unlike the special importance of 0 in linear mathematics, there are no preferred elements in nonlinear problems; this leads to a shift of emphasis from the null space of linear problems to equivalence classes for nonlinear equations. To motivate the role of equivalence classes, let us consider the null spaces in the following linear problems.

(a) Let f: $\mathbb{R}^2 \to \mathbb{R}$ be defined by f(x, y) = x + y, (x, y) ∈ $\mathbb{R}^2$. The null space of f is generated by the equation y = −x on the x–y plane, and the graph of f is the plane passing through the lines ρ = x and ρ = y. For each ρ ∈ $\mathbb{R}$ the equivalence classes $f^{-}(\rho) = \{(x, y) \in \mathbb{R}^2 : x + y = \rho\}$ are lines on the graph parallel to the null set.

(b) For a linear operator A: $\mathbb{R}^n \to \mathbb{R}^m$, m < n, satisfying (1) and (2), the problem Ax = y reduces A to echelon form with rank r less than min{m, n}, when the given equations are consistent. The solution, however, produces a generalized inverse leading to a set-valued inverse $A^{-}$ of A for which the inverse images of y ∈ R(A) are multivalued because of the non-trivial null space of A introduced by assumption (1). Specifically, a null space of dimension n − r is generated by the free variables $\{x_j\}_{j=r+1}^{n}$ which are arbitrary: this is ill-posedness of type (1). In addition, m − r rows of the row reduced echelon form of A have all 0 entries that introduce restrictions on m − r coordinates $\{y_i\}_{i=r+1}^{m}$ of y which are now related to $\{y_i\}_{i=1}^{r}$: this illustrates ill-posedness of type (2). Inverse ill-posed problems therefore generate multivalued solutions through a generalized inverse of the mapping.

(c) The eigenvalue problem

\[
\left(\frac{d^2}{dx^2} + \lambda^2\right) y = 0, \qquad y(0) = 0 = y(1)
\]

has the following equivalence class of 0

\[
[0]_{D^2} = \{\sin(\pi m x)\}_{m=0}^{\infty}, \qquad D^2 = \left(\frac{d^2}{dx^2} + \lambda^2\right),
\]

as its eigenfunctions corresponding to the eigenvalues $\lambda_m = \pi m$.

Ill-posed problems are primarily of interest to us explicitly as non-injective maps f, that is under the condition of (IP1). The two other conditions (IP2) and (IP3) are not as significant and play only an implicit role in the theory. In its application to iterative systems, the degree of non-injectivity of f, defined as the number of its injective branches, increases with iteration of the map. A necessary (but not sufficient) condition for chaos to occur is the increasing non-injectivity of f that is expressed descriptively in the chaos literature as stretch-and-fold or stretch-cut-and-paste operations. This increasing non-injectivity, which we discuss in the following sections, is what causes a dynamical system to tend toward chaoticity. Ill-posedness arising from non-surjectivity of (injective) f in the form of regularization [Tikhonov & Arsenin, 1977] has received wide attention in the literature of ill-posed problems; this however is not of much significance in our work.

10
Recall that for a linear operator continuity and boundedness are equivalent concepts.
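The contrast between Eqs. (10) and (11) of Example 2.1 can be made concrete by comparing the factor $e^{-\lambda_n^2 T}$ that damps the n-th Fourier mode in the direct problem with the factor $e^{\lambda_n^2 T}$ by which the inverse problem amplifies a perturbation of the same mode in the final data. The following is a minimal sketch under assumed illustrative values c = 1, a = 1 and T = 0.1, none of which come from the paper; `mode_factors` is a hypothetical helper name.

```python
import math


def mode_factors(n, c=1.0, a=1.0, T=0.1):
    """Return (damping, amplification) of the n-th sine mode over time T.

    lambda_n = (c*pi/a)*n as in Example 2.1; the parameter values are
    illustrative assumptions only.
    """
    lam = (c * math.pi / a) * n
    return math.exp(-lam ** 2 * T), math.exp(lam ** 2 * T)


for n in (1, 2, 5, 10):
    damp, amp = mode_factors(n)
    print(f"n={n:2d}  direct: {damp:.3e}  inverse: {amp:.3e}")
```

The damping factor never exceeds 1, which is the bound expressed by Eq. (10), whereas the amplification factor grows without bound in n, so that no estimate of the same kind is available for Eq. (11).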


[Figure 4: panels (a) and (b)]

Fig. 4. (a) Moore–Penrose generalized inverse. The decomposition of X and Y into the four fundamental subspaces of A comprising the null space N(A), the column (or range) space R(A), the row space $R(A^T)$ and $N(A^T)$, the complement of R(A) in Y, is a basic result in the theory of linear equations. The Moore–Penrose inverse takes advantage of the geometric orthogonality of the row space $R(A^T)$ and N(A) in $\mathbb{R}^n$ and that of the column space and $N(A^T)$ in $\mathbb{R}^m$. (b) When X and Y are not inner-product spaces, a non-injective inverse can be defined by extending f to Y − R(f) suitably as shown by the dashed curve, where $g(x) := r_1 + ((r_2 - r_1)/r_1) f(x)$ for all x ∈ D(f) was taken to be a good definition of an extension that replicates f in Y − R(f); here $x_1 \sim x_2$ under both f and g, and $y_1 \sim y_2$ under {f, g} just as b is equivalent to b in the Moore–Penrose case. Note that {f, g} and $\{f^{-}, g^{-}\}$ are both multifunctions on X and Y, respectively. Our inverse G, introduced later in this section, is however injective with G(Y − R(f)) := 0.

Begin Tutorial 3: Generalized Inverse

In this Tutorial, we take a quick look at the equation a(x) = y, where a: X → Y is a linear map that need not be either one-one or onto. Specifically, we will take X and Y to be the Euclidean spaces $\mathbb{R}^n$ and $\mathbb{R}^m$ so that a has a matrix representation $A \in \mathbb{R}^{m \times n}$ where $\mathbb{R}^{m \times n}$ is the collection of m × n matrices with real entries. The inverse $A^{-1}$ exists and is unique iff m = n and rank(A) = n; this is the situation depicted in Fig. 1(a). If A is neither one-one nor onto, then we need to consider the multifunction $A^{-}$, a functional choice of which is known as the generalized inverse G of A. A good introductory text for generalized inverses is [Campbell & Mayer, 1979]. Figure 4(a) introduces the following definition of the Moore–Penrose generalized inverse $G_{MP}$.

Definition 2.1 (Moore–Penrose Inverse). If a: $\mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation with matrix representation $A \in \mathbb{R}^{m \times n}$ then the Moore–Penrose inverse $G_{MP} \in \mathbb{R}^{n \times m}$ of A (we will use the same notation $G_{MP}$: $\mathbb{R}^m \to \mathbb{R}^n$ for the inverse of the map a) is the noninjective map defined in terms of the row and column spaces of A, row(A) = $R(A^T)$, col(A) = R(A), as

\[
G_{MP}(y) \overset{\text{def}}{=} \begin{cases} (a|_{\mathrm{row}(A)})^{-1}(y), & \text{if } y \in \mathrm{col}(A) \\ 0, & \text{if } y \in N(A^T) . \end{cases} \qquad (12)
\]

Note that the restriction $a|_{\mathrm{row}(A)}$ of a to $R(A^T)$ is bijective so that the inverse $(a|_{\mathrm{row}(A)})^{-1}$ is well-defined. The role of the transpose matrix appears naturally, and the $G_{MP}$ of Eq. (12) is the unique matrix that satisfies the conditions

\[
A G_{MP} A = A, \quad G_{MP} A G_{MP} = G_{MP}, \quad (G_{MP} A)^T = G_{MP} A, \quad (A G_{MP})^T = A G_{MP} \qquad (13)
\]

that follow immediately from the definition (12); hence $G_{MP} A$ and $A G_{MP}$ are orthogonal projections$^{11}$ onto the subspaces $R(A^T) = R(G_{MP})$ and R(A), respectively. Recall that the range space $R(A^T)$ of $A^T$ is the same as the row space row(A) of A, and R(A) is also known as the column space of A, col(A).

11
A real matrix A is an orthogonal projector iff $A^2 = A$ and $A = A^T$.
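The four conditions of Eq. (13) and the projector property of footnote 11 can be checked in exact rational arithmetic. The sketch below uses the 4 × 5 matrix A of Example 2.3 and builds a generalized inverse from a rank factorization A = BC as $G = C^T (C C^T)^{-1} (B^T B)^{-1} B^T$; this factorization route, the choice of B and C, and the helper names `matmul`, `transpose` and `inv2` are assumptions of this illustration rather than the method used in the paper.

```python
from fractions import Fraction as F


def matmul(P, Q):
    """Product of two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]


def transpose(P):
    return [list(col) for col in zip(*P)]


def inv2(M):
    """Inverse of a 2 x 2 matrix with Fraction entries."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


A = [[F(v) for v in row] for row in [[1, -3, 2, 1, 2],
                                     [3, -9, 10, 2, 9],
                                     [2, -6, 4, 2, 4],
                                     [2, -6, 8, 1, 7]]]

# Rank factorization A = B C (rank 2): rows 1 and 2 of A span the row space,
# row 3 = 2 * row 1 and row 4 = row 2 - row 1.
C = [A[0], A[1]]
B = [[F(1), F(0)], [F(0), F(1)], [F(2), F(0)], [F(-1), F(1)]]
assert matmul(B, C) == A

Bt, Ct = transpose(B), transpose(C)
G = matmul(matmul(Ct, inv2(matmul(C, Ct))), matmul(inv2(matmul(Bt, B)), Bt))

AG, GA = matmul(A, G), matmul(G, A)
penrose = (matmul(AG, A) == A,      # A G A = A
           matmul(GA, G) == G,      # G A G = G
           transpose(GA) == GA,     # (G A)^T = G A
           transpose(AG) == AG)     # (A G)^T = A G
is_projector = matmul(GA, GA) == GA and matmul(AG, AG) == AG
print(penrose, is_projector)
print([str(x) for x in G[0]])
```

Because the arithmetic is exact, the equalities are tested without any tolerance; for larger or full-rank matrices a general-purpose numerical routine would be the more practical choice.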


Example 2.3. For a: $\mathbb{R}^5 \to \mathbb{R}^4$, let

\[
A = \begin{pmatrix} 1 & -3 & 2 & 1 & 2 \\ 3 & -9 & 10 & 2 & 9 \\ 2 & -6 & 4 & 2 & 4 \\ 2 & -6 & 8 & 1 & 7 \end{pmatrix} .
\]

By reducing the augmented matrix (A|y) to the row-reduced echelon form, it can be verified that the null and range spaces of A are three- and two-dimensional, respectively. A basis for the null space of $A^T$ and of the row and column space of A obtained from the echelon form are respectively

\[
\begin{pmatrix} 1 \\ -3 \\ 0 \\ \tfrac{3}{2} \\ \tfrac{1}{2} \end{pmatrix},
\begin{pmatrix} 0 \\ 0 \\ 1 \\ -\tfrac{1}{4} \\ \tfrac{3}{4} \end{pmatrix};
\quad \text{and} \quad
\begin{pmatrix} -2 \\ 0 \\ 1 \\ 0 \end{pmatrix},
\begin{pmatrix} 1 \\ -1 \\ 0 \\ 1 \end{pmatrix};
\quad
\begin{pmatrix} 1 \\ 0 \\ 2 \\ -1 \end{pmatrix},
\begin{pmatrix} 0 \\ 1 \\ 0 \\ 1 \end{pmatrix} .
\]

According to its definition Eq. (12), the Moore–Penrose inverse maps the middle two of the above set to $(0, 0, 0, 0, 0)^T$, and the A-image of the first two (which are respectively $(19, 70, 38, 51)^T$ and $(70, 275, 140, 205)^T$ lying, as they must, in the span of the last two), to the span of $(1, -3, 2, 1, 2)^T$ and $(3, -9, 10, 2, 9)^T$ because a restricted to this subspace of $\mathbb{R}^5$ is bijective. Hence

\[
G_{MP} \left( A \begin{pmatrix} 1 \\ -3 \\ 0 \\ \tfrac{3}{2} \\ \tfrac{1}{2} \end{pmatrix} \;\; A \begin{pmatrix} 0 \\ 0 \\ 1 \\ -\tfrac{1}{4} \\ \tfrac{3}{4} \end{pmatrix} \;\; \begin{pmatrix} -2 \\ 0 \\ 1 \\ 0 \end{pmatrix} \;\; \begin{pmatrix} 1 \\ -1 \\ 0 \\ 1 \end{pmatrix} \right)
= \begin{pmatrix} 1 & 0 & 0 & 0 \\ -3 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ \tfrac{3}{2} & -\tfrac{1}{4} & 0 & 0 \\ \tfrac{1}{2} & \tfrac{3}{4} & 0 & 0 \end{pmatrix} .
\]

The second matrix on the left is invertible as its rank is 4. This gives

\[
G_{MP} = \begin{pmatrix}
\tfrac{9}{275} & -\tfrac{1}{275} & \tfrac{18}{275} & -\tfrac{2}{55} \\[2pt]
-\tfrac{27}{275} & \tfrac{3}{275} & -\tfrac{54}{275} & \tfrac{6}{55} \\[2pt]
-\tfrac{10}{143} & \tfrac{6}{143} & -\tfrac{20}{143} & \tfrac{16}{143} \\[2pt]
\tfrac{238}{3575} & -\tfrac{57}{3575} & \tfrac{476}{3575} & -\tfrac{59}{715} \\[2pt]
-\tfrac{129}{3575} & \tfrac{106}{3575} & -\tfrac{258}{3575} & \tfrac{47}{715}
\end{pmatrix} \qquad (14)
\]

as the Moore–Penrose inverse of A that readily verifies all the four conditions of Eqs. (13). The basic point here is that, as in the case of a bijective map, $G_{MP} A$ and $A G_{MP}$ are identities on the row and column spaces of A that define its rank. For later use, when we return to this example for a simpler inverse G, given below are the orthonormal bases of the four fundamental subspaces with respect to which $G_{MP}$ is a representation of the generalized inverse of A; these calculations were done by MATLAB. The basis for

(a) the column space of A consists of the first two columns of the eigenvectors of $A A^T$:

\[
\left( -\tfrac{1633}{2585}, -\tfrac{363}{892}, \tfrac{3317}{6387}, \tfrac{363}{892} \right)^T, \qquad
\left( -\tfrac{929}{1435}, \tfrac{709}{1319}, \tfrac{346}{6299}, -\tfrac{709}{1319} \right)^T
\]

(b) the null space of $A^T$ consists of the last two columns of the eigenvectors of $A A^T$:

\[
\left( -\tfrac{3185}{8306}, \tfrac{293}{2493}, -\tfrac{3185}{4153}, \tfrac{1777}{3547} \right)^T, \qquad
\left( \tfrac{323}{1732}, \tfrac{533}{731}, \tfrac{323}{866}, \tfrac{1037}{1911} \right)^T
\]

(c) the row space of A consists of the first two columns of the eigenvectors of $A^T A$:

\[
\left( \tfrac{421}{13823}, \tfrac{44}{14895}, -\tfrac{569}{918}, -\tfrac{659}{2526}, \tfrac{1036}{1401} \right), \qquad
\left( \tfrac{661}{690}, \tfrac{412}{1775}, \tfrac{59}{2960}, -\tfrac{1523}{10221}, -\tfrac{303}{3974} \right)
\]

(d) the null space of A consists of the last three columns of $A^T A$:


\[
\left( -\tfrac{571}{15469}, -\tfrac{369}{776}, \tfrac{149}{25344}, -\tfrac{291}{350}, -\tfrac{389}{1365} \right), \quad
\left( -\tfrac{281}{1313}, \tfrac{956}{1489}, \tfrac{875}{1706}, -\tfrac{1279}{2847}, \tfrac{409}{1473} \right), \quad
\left( \tfrac{292}{1579}, -\tfrac{876}{1579}, \tfrac{203}{342}, \tfrac{621}{4814}, \tfrac{1157}{2152} \right)
\]

The matrices $Q_1$ and $Q_2$ with these eigenvectors $(x_i)$ satisfying $\|x_i\| = 1$ and $(x_i, x_j) = 0$ for $i \neq j$ as their columns are orthogonal matrices with the simple inverse criterion $Q^{-1} = Q^T$.

End Tutorial 3

The basic issue in the solution of the inverse ill-posed problem is its reduction to a well-posed one when restricted to suitable subspaces of the domain and range of A. Considerations of geometry leading to their decomposition into orthogonal subspaces are only an additional feature that is not central to the problem: recall from Eq. (1) that any function f must necessarily satisfy the more general set-theoretic relations $f f^{-} f = f$ and $f^{-} f f^{-} = f^{-}$ of Eq. (13) for the multiinverse $f^{-}$ of f: X → Y. The second distinguishing feature of the MP-inverse is that it is defined, by a suitable extension, on all of Y and not just on f(X) which is perhaps more natural. The availability of orthogonality in inner-product spaces allows this extension to be made in an almost normal fashion. As we shall see below the additional geometric restriction of Eq. (13) is not essential to the solution process, and in fact, only results in a less canonical form of the inverse.

Begin Tutorial 4: Topological Spaces

This Tutorial is meant to familiarize the reader with the basic principles of a topological space. A topological space (X, U) is a set X with a class$^{12}$ U of distinguished subsets, called open sets of X, that satisfy

(T1) The empty set ∅ and the whole X belong to U
(T2) Finite intersections of members of U belong to U
(T3) Arbitrary unions of members of U belong to U.

Example 2.4

(1) The smallest topology possible on a set X is its indiscrete topology when the only open sets are ∅ and X; the largest is the discrete topology where every subset of X is open (and hence also closed).
(2) In a metric space (X, d), let $B_\varepsilon(x, d) = \{y \in X : d(x, y) < \varepsilon\}$ be an open ball at x. Any subset U of X such that for each x ∈ U there is a d-ball $B_\varepsilon(x, d) \subseteq U$ in U, is said to be an open set of (X, d). The collection of all these sets is the topology induced by d. The topological space (X, U) is then said to be associated with (induced by) (X, d).
(3) If ∼ is an equivalence relation on a set X, the set of all saturated sets $[x]_\sim = \{y \in X : y \sim x\}$ is a topology on X; this topology is called the topology of saturated sets.
We argue in Sec. 4.2 that this constitutes the defining topology of a chaotic system.
(4) For any subset A of the set X, the A-inclusion topology on X consists of ∅ and every superset of A, while the A-exclusion topology on X consists of all subsets of X − A. Thus A is open in the inclusion topology and closed in the exclusion, and in general every open set of one is closed in the other.
The special cases of the a-inclusion and a-exclusion topologies for A = {a} are defined in a similar fashion.
(5) The cofinite and cocountable topologies in which the open sets of an infinite (resp. uncountable) set X are respectively the complements of finite and countable subsets, are examples of topologies with some unusual properties that are covered in Appendix A.1. If X is itself finite (respectively, countable), then its cofinite (respectively, cocountable) topology is the discrete topology consisting of all its subsets. It is therefore useful to adopt the convention, unless stated to the contrary, that cofinite and cocountable spaces are respectively infinite and uncountable.

In the space (X, U), a neighborhood of a point x ∈ X is a nonempty subset N of X that contains an open set U containing x; thus N ⊆ X is a

12
In this sense, a class is a set of sets.


neighborhood of x iff

\[
x \in U \subseteq N \qquad (15)
\]

for some $U \in \mathcal{U}$. The largest open set that can be used here is Int(N) (where, by definition, Int(A) is the largest open set that is contained in A) so that the above neighborhood criterion for a subset N of X can be expressed in the equivalent form

\[
N \subseteq X \text{ is a } \mathcal{U}\text{-neighborhood of } x \text{ iff } x \in \mathrm{Int}_{\mathcal{U}}(N) \qquad (16)
\]

implying that a subset of (X, U) is a neighborhood of all its interior points, so that $N \in \mathcal{N}_x \Rightarrow N \in \mathcal{N}_y$ for all y ∈ Int(N). The collection of all neighborhoods of x

\[
\mathcal{N}_x \overset{\text{def}}{=} \{N \subseteq X : x \in U \subseteq N \text{ for some } U \in \mathcal{U}\} \qquad (17)
\]

is the neighborhood system at x, and the subcollection U of the topology used in this equation constitutes a neighborhood (local) base or basic neighborhood system, at x, see Definition A.1.1 of Appendix A.1. The properties

(N1) x belongs to every member N of $\mathcal{N}_x$,
(N2) The intersection of any two neighborhoods of x is another neighborhood of x: $N, M \in \mathcal{N}_x \Rightarrow N \cap M \in \mathcal{N}_x$,
(N3) Every superset of any neighborhood of x is a neighborhood of x: $(M \in \mathcal{N}_x) \wedge (M \subseteq N) \Rightarrow N \in \mathcal{N}_x$,

that characterize $\mathcal{N}_x$ completely are a direct consequence of the definitions (15), (16) that may also be stated as

(N0) Any neighborhood $N \in \mathcal{N}_x$ contains another neighborhood U of x that is a neighborhood of each of its points: $((\forall N \in \mathcal{N}_x)(\exists U \in \mathcal{N}_x)(U \subseteq N)) : (\forall y \in U \Rightarrow U \in \mathcal{N}_y)$.

Property (N0) in fact serves as the defining characteristic of an open set, and U can be identified with the largest open set Int(N) contained in N; hence a set G in a topological space is open iff it is a neighborhood of each of its points. Accordingly if $\mathcal{N}_x$ is a given class of subsets of X associated with each x ∈ X satisfying (N1)–(N3), then (N0) defines the special class of neighborhoods G

\[
\mathcal{U} = \{G \in \mathcal{N}_x : x \in B \subseteq G \text{ for all } x \in G \text{ and some basic nbd } B \in \mathcal{N}_x\} \qquad (18)
\]

as the unique topology on X that contains a basic neighborhood of each of its points, for which the neighborhood system at x coincides exactly with the assigned collection $\mathcal{N}_x$; compare with Definition A.1.1. Neighborhoods in topological spaces are a generalization of the familiar notion of distances of metric spaces that quantifies “closeness” of points of X.

A neighborhood of a non-empty subset A of X that will be needed later on is defined in a similar manner: N is a neighborhood of A iff A ⊆ Int(N), that is A ⊆ U ⊆ N; thus the neighborhood system at A, given by $\mathcal{N}_A = \bigcap_{a \in A} \mathcal{N}_a := \{G \subseteq X : G \in \mathcal{N}_a \text{ for every } a \in A\}$, is the class of common neighborhoods of each point of A.

Some examples of neighborhood systems at a point x in X are the following:

(1) In an indiscrete space (X, U), X is the only neighborhood of every point of the space; in a discrete space any set containing x is a neighborhood of the point.
(2) In an infinite cofinite (or uncountable cocountable) space, every neighborhood of a point is an open neighborhood of that point.
(3) In the topology of saturated sets under the equivalence relation ∼, the neighborhood system at x consists of all supersets of the equivalence class $[x]_\sim$.
(4) Let x ∈ X. In the x-inclusion topology, $\mathcal{N}_x$ consists of all the non-empty open sets of X which are the supersets of {x}. For a point y ≠ x of X, $\mathcal{N}_y$ are the supersets of {x, y}.

For any given class ${}_{\mathrm{T}}\mathcal{S}$ of subsets of X, a unique topology $\mathcal{U}({}_{\mathrm{T}}\mathcal{S})$ can always be constructed on X by taking all finite intersections ${}_{\mathrm{T}}\mathcal{S}_{\wedge}$ of members of ${}_{\mathrm{T}}\mathcal{S}$ followed by arbitrary unions ${}_{\mathrm{T}}\mathcal{S}_{\wedge\vee}$ of these finite intersections. $\mathcal{U}({}_{\mathrm{T}}\mathcal{S}) := {}_{\mathrm{T}}\mathcal{S}_{\wedge\vee}$ is the smallest topology on X that contains ${}_{\mathrm{T}}\mathcal{S}$ and is said to be generated by ${}_{\mathrm{T}}\mathcal{S}$. For a given topology U on X satisfying $\mathcal{U} = \mathcal{U}({}_{\mathrm{T}}\mathcal{S})$, ${}_{\mathrm{T}}\mathcal{S}$ is a subbasis, and ${}_{\mathrm{T}}\mathcal{S}_{\wedge} := {}_{\mathrm{T}}\mathcal{B}$ a basis, for the topology U; for more on topological basis, see Appendix A.1. The topology generated by a subbase essentially builds not from the collection ${}_{\mathrm{T}}\mathcal{S}$ itself but from the finite intersections ${}_{\mathrm{T}}\mathcal{S}_{\wedge}$ of its subsets; in comparison the base generates a topology directly from a collection ${}_{\mathrm{T}}\mathcal{S}$ of subsets by forming their unions. Thus whereas any class of subsets can be used as a subbasis, a given collection must meet certain qualifications to pass the test of a base for a topology: these and related topics are covered in Appendix A.1. Different subbases, therefore, can be used to generate different topologies on the same set X as the following examples for the case of


X = $\mathbb{R}$ demonstrate; here (a, b), [a, b), (a, b] and [a, b], for a ≤ b ∈ $\mathbb{R}$, are the usual open-closed intervals in $\mathbb{R}$.$^{13}$ The subbases ${}_{\mathrm{T}}\mathcal{S}_1 = \{(a, \infty), (-\infty, b)\}$, ${}_{\mathrm{T}}\mathcal{S}_2 = \{[a, \infty), (-\infty, b)\}$, ${}_{\mathrm{T}}\mathcal{S}_3 = \{(a, \infty), (-\infty, b]\}$ and ${}_{\mathrm{T}}\mathcal{S}_4 = \{[a, \infty), (-\infty, b]\}$ give the respective bases ${}_{\mathrm{T}}\mathcal{B}_1 = \{(a, b)\}$, ${}_{\mathrm{T}}\mathcal{B}_2 = \{[a, b)\}$, ${}_{\mathrm{T}}\mathcal{B}_3 = \{(a, b]\}$ and ${}_{\mathrm{T}}\mathcal{B}_4 = \{[a, b]\}$, a ≤ b ∈ $\mathbb{R}$, leading to the standard (usual), lower limit (Sorgenfrey), upper limit, and discrete (take a = b) topologies on $\mathbb{R}$. Bases of the type (a, ∞) and (−∞, b) provide the right and left ray topologies on $\mathbb{R}$.

This feasibility of generating different topologies on a set can be of great practical significance because open sets determine convergence characteristics of nets and continuity characteristics of functions, thereby making it possible for nature to play around with the structure of its working space in its kitchen to its best possible advantage.$^{14}$

Here are a few essential concepts and terminology for topological spaces.

Definition 2.2 (Boundary, Closure, Interior). The boundary of A in X is the set of points x ∈ X such that every neighborhood N of x intersects both A and X − A:

\[
\mathrm{Bdy}(A) \overset{\text{def}}{=} \{x \in X : (\forall N \in \mathcal{N}_x)((N \cap A \neq \emptyset) \wedge (N \cap (X - A) \neq \emptyset))\} \qquad (19)
\]

where $\mathcal{N}_x$ is the neighborhood system of Eq. (17) at x.

The closure of A is the set of all points x ∈ X such that each neighborhood of x contains at least one point of A that may be x itself. Thus the set

\[
\mathrm{Cl}(A) \overset{\text{def}}{=} \{x \in X : (\forall N \in \mathcal{N}_x)(N \cap A \neq \emptyset)\} \qquad (20)
\]

of all points in X adherent to A is given by the union of A with its boundary.

The interior of A

\[
\mathrm{Int}(A) \overset{\text{def}}{=} \{x \in X : (\exists N \in \mathcal{N}_x)(N \subseteq A)\} \qquad (21)
\]

consisting of those points of X that are in A but not in its boundary, Int(A) = A − Bdy(A), is the largest open subset of X that is contained in A. Hence it follows that Int(Bdy(A)) = ∅, the boundary of A is the intersection of the closures of A and X − A, and a subset N of X is a neighborhood of x iff x ∈ Int(N).

The three subsets Int(A), Bdy(A) and exterior of A defined as Ext(A) := Int(X − A) = X − Cl(A), are pairwise disjoint and have the full space X as their union.

Definition 2.3 (Derived and Isolated sets). Let A be a subset of X. A point x ∈ X (which may or may not be a point of A) is a cluster point of A if every neighborhood $N \in \mathcal{N}_x$ contains at least one point of A different from x. The derived set of A

\[
\mathrm{Der}(A) \overset{\text{def}}{=} \{x \in X : (\forall N \in \mathcal{N}_x)(N \cap (A - \{x\}) \neq \emptyset)\} \qquad (22)
\]

is the set of all cluster points of A. The complement of Der(A) in A

\[
\mathrm{Iso}(A) \overset{\text{def}}{=} A - \mathrm{Der}(A) = \mathrm{Cl}(A) - \mathrm{Der}(A) \qquad (23)
\]

are the isolated points of A to which no proper sequence in A converges, that is there exists a neighborhood of any such point that contains no other point of A so that the only sequence that converges to a ∈ Iso(A) is the constant sequence (a, a, a, . . .).

Clearly,

\[
\mathrm{Cl}(A) = A \cup \mathrm{Der}(A) = A \cup \mathrm{Bdy}(A) = \mathrm{Iso}(A) \cup \mathrm{Der}(A) = \mathrm{Int}(A) \cup \mathrm{Bdy}(A)
\]

with the last two being disjoint unions, and A is closed iff A contains all its cluster points, Der(A) ⊆ A, iff A contains its closure. Hence

\[
A = \mathrm{Cl}(A) \Leftrightarrow \mathrm{Cl}(A) = \{x \in A : ((\exists N \in \mathcal{N}_x)(N \subseteq A)) \vee ((\forall N \in \mathcal{N}_x)(N \cap (X - A) \neq \emptyset))\} .
\]
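On a finite set the operators of Definitions 2.2 and 2.3 can be computed directly from the open sets. The following sketch is an illustration only: the three-point set `X`, the topology `opens`, the subset `A` and all function names are hypothetical choices made here. Since every neighborhood of x contains an open set containing x, it is enough to test the open sets that contain x.

```python
X = frozenset({1, 2, 3})
# An assumed topology on X: it contains the empty set and X and is
# closed under unions and intersections.
opens = [frozenset(), frozenset({1}), frozenset({1, 2}), X]


def open_nbds(x):
    """Open sets containing x; they form a neighborhood base at x."""
    return [U for U in opens if x in U]


def interior(A):
    # Some neighborhood of x lies inside A, as in Eq. (21).
    return frozenset(x for x in X if any(U <= A for U in open_nbds(x)))


def closure(A):
    # Every neighborhood of x meets A, as in Eq. (20).
    return frozenset(x for x in X if all(bool(U & A) for U in open_nbds(x)))


def boundary(A):
    # Every neighborhood of x meets both A and X - A, as in Eq. (19).
    return frozenset(x for x in X
                     if all(bool(U & A) and bool(U & (X - A)) for U in open_nbds(x)))


def derived(A):
    # Every neighborhood of x meets A in a point other than x, as in Eq. (22).
    return frozenset(x for x in X if all(bool(U & (A - {x})) for U in open_nbds(x)))


A = frozenset({1})
isolated = A - derived(A)
print(sorted(interior(A)), sorted(closure(A)), sorted(boundary(A)),
      sorted(derived(A)), sorted(isolated))
```

For this choice of A the open point 1 is dense in X, so the closure is the whole space while the interior and the boundary split it into two disjoint pieces.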

13
By definition, an interval I in a totally ordered set X is a subset of X with the property
\[ (x_1, x_2 \in I) \wedge (x_3 \in X : x_1 \prec x_3 \prec x_2) \Rightarrow x_3 \in I \]
so that any element of X lying between two elements of I also belongs to I.
14
Although we do not pursue this point of view here, it is nonetheless tempting to speculate that the answer to the question
“Why does the entropy of an isolated system increase?” may be found by exploiting this line of reasoning that seeks to explain
the increase in terms of a visible component associated with the usual topology as against a different latent workplace topology
that governs the dynamics of nature.
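The construction of a topology from a subbasis described in this Tutorial (finite intersections first, arbitrary unions afterwards) can be carried out exhaustively on a small finite set. In the sketch below the set, the subbasis and the helper `generate_topology` are hypothetical; the empty intersection is taken to be X and the empty union to be ∅, a convention assumed here so that (T1) holds.

```python
from itertools import combinations


def generate_topology(X, subbasis):
    """Return the topology on the finite set X generated by `subbasis`."""
    X = frozenset(X)
    sub = [frozenset(S) for S in subbasis]
    # Step 1: finite intersections of subbasis members give a basis
    # (the empty intersection is taken to be X).
    basis = {X}
    for r in range(1, len(sub) + 1):
        for combo in combinations(sub, r):
            basis.add(frozenset.intersection(*combo))
    # Step 2: unions of basis members give the topology
    # (the empty union is the empty set).
    basis = list(basis)
    topology = {frozenset()}
    for r in range(1, len(basis) + 1):
        for combo in combinations(basis, r):
            topology.add(frozenset().union(*combo))
    return topology


T = generate_topology({1, 2, 3}, [{1, 2}, {2, 3}])
print(sorted(sorted(U) for U in T))
```

The brute-force enumeration is exponential in the size of the subbasis and is meant only to make the two-step construction explicit; changing the subbasis on the same set X produces a different topology.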


Comparison of Eqs. (19) and (22) also makes it clear that Bdy(A) ⊆ Der(A). The special case of A = Iso(A) with Der(A) ⊆ X − A is important enough to deserve a special mention:

Definition 2.4 (Donor set). A proper, nonempty subset A of X such that Iso(A) = A with Der(A) ⊆ X − A will be called self-isolated or donor. Thus sequences eventually in a donor set converge only in its complement; this is the opposite of the characteristic of a closed set where all converging sequences eventually in the set must necessarily converge in it. A closed-donor set with a closed neighbor has no derived or boundary sets, and will be said to be isolated in X.

Example 2.5. In an isolated set sequences converge, if they have to, simultaneously in the complement (because it is donor) and in it (because it is closed). Convergent sequences in such a set can only be constant sequences. Physically, if we consider adherents to be contributions made by the dynamics of the corresponding sequences, then an isolated set is secluded from its neighbor in the sense that it neither receives any contributions from its surroundings, nor does it give away any. In this light and terminology, a closed set is a selfish set (recall that a set A is closed in X iff every convergent net of X that is eventually in A converges in A; conversely a set is open in X iff the only nets that converge in A are eventually in it), whereas a set with a derived set that intersects itself and its complement may be considered to be neutral. Appendix A.3 shows the various possibilities for the derived set and boundary of a subset A of X.

Some useful properties of these concepts for a subset A of a topological space X are the following.
(a) BdyX (X) = ∅,
(b) Bdy(A) = Cl(A) ∩ Cl(X − A),
(c) Int(A) = X − Cl(X − A) = A − Bdy(A) = Cl(A) − Bdy(A),
(d) Int(A) ∩ Bdy(A) = ∅,
(e) X = Int(A) ∪ Bdy(A) ∪ Int(X − A),
(f) Int(A) = ∪{G ⊆ X : G is an open set of X contained in A} (24)
(g) Cl(A) = ∩{F ⊆ X : F is a closed set of X containing A} (25)

A straightforward consequence of property (b) is that the boundary of any subset A of a topological space X is closed in X; this significant result may also be demonstrated as follows. If x ∈ X is not in the boundary of A there is some neighborhood N of x that does not intersect both A and X − A. For each point y ∈ N , N is a neighborhood of that point that does not meet A and X − A simultaneously so that N is contained wholly in X − Bdy(A). We may now take N to be open without any loss of generality implying thereby that X − Bdy(A) is an open set of X from which it follows that Bdy(A) is closed in X.
Further material on topological spaces relevant to our work can be found in Appendix A.3.

End Tutorial 4

Working in a general topological space, we now recall the solution of an ill-posed problem f (x) = y [Sengupta, 1997] that leads to a multifunctional inverse f − through the generalized inverse G. Let f : (X, U) → (Y, V) be a (nonlinear) function between two topological spaces (X, U) and (Y, V) that is neither one-one nor onto. Since f is not one-one, X can be partitioned into disjoint equivalence classes with respect to the equivalence relation x1 ∼ x2 ⇔ f (x1 ) = f (x2 ). Picking a representative member from each of the classes (this is possible by the Axiom of Choice; see the following Tutorial) produces a basic set XB of X; it is basic as it corresponds to the row space in the linear matrix example which is all that is needed for taking an inverse. XB is the counterpart of the quotient set X/∼ of Sec. 1, with the important difference that whereas the points of the quotient set are the equivalence classes of X, XB is a subset of X with each of the classes contributing a point to XB . It then follows that fB : XB → f (X) is the bijective restriction a|row(A) that reduces the original ill-posed problem to a well-posed one with XB and f (X) corresponding respectively to the row and column spaces of A, and fB⁻¹ : f (X) → XB is the basic inverse from which the multiinverse f − is obtained through G, which in turn corresponds to the Moore–Penrose inverse GMP . The topological considerations (obviously not for inner product spaces


that applies to the Moore–Penrose inverse) needed to complete the solution are discussed below and in Appendix A.1.

Begin Tutorial 5: Axiom of Choice and Zorn’s Lemma
Since some of our basic arguments depend on it, this Tutorial contains a short description of the Axiom of Choice that has been described as “one of the most important, and at the same time one of the most controversial, principles of mathematics”. What this axiom states is this: For any set X there exists a function fC : P0 (X) → X such that fC (Aα ) ∈ Aα for every non-empty subset Aα of X; here P0 (X) is the class of all subsets of X except ∅. Thus, if X = {x1 , x2 , x3 } is a three element set, a possible choice function is given by

fC ({x1 , x2 , x3 }) = x3 , fC ({x1 , x2 }) = x1 ,
fC ({x2 , x3 }) = x3 , fC ({x3 , x1 }) = x3 ,
fC ({x1 }) = x1 , fC ({x2 }) = x2 , fC ({x3 }) = x3 .

It must be appreciated that the axiom is only an existence result that asserts every set to have a choice function, even when nobody knows how to construct one in a specific case. Thus, for example, how does one pick out the isolated irrationals √2 or π from the uncountable reals? There is no doubt that they do exist, for we can construct a right-angled triangle with sides of length 1 or a circle of radius 1. The axiom tells us that these choices are possible even though we do not know how exactly to do it; all that can be stated with confidence is that we can actually pick up rationals arbitrarily close to these irrationals.
The axiom of choice is essentially meaningful when X is infinite as illustrated in the last two examples. This is so because even when X is denumerable, it would be physically impossible to make an infinite number of selections either all at a time or sequentially: the Axiom of Choice nevertheless tells us that this is possible. The real strength and utility of the Axiom however is when X and some or all of its subsets are uncountable as in the case of the choice of the single element π from the reals. To see this more closely in the context of maps that we are concerned with, let f : X → Y be a non-injective, onto map. To construct a functional right inverse fr : Y → X of f , we must choose, for each y ∈ Y one representative element xrep from the set f − (y) and define fr (y) to be that element according to f ◦ fr (y) = f (xrep ) = y. If there is no preferred or natural way to make this choice, the axiom of choice allows us to make an arbitrary selection from the infinitely many that may be possible from f − (y). When a natural choice is indeed available, as for example in the case of the initial value problem y′(x) = x; y(0) = α0 on [0, a], the definite solution α0 + x²/2 may be selected from the infinitely many ∫₀ˣ x dx = α + x²/2, 0 ≤ x ≤ a that are permissible, and the axiom of choice sanctions this selection. In addition, each y ∈ Y gives rise to the family of solution sets Ay = {f − (y) : y ∈ Y } and the real power of the axiom is its assertion that it is possible to make a choice fC (Ay ) ∈ Ay on every Ay simultaneously; this permits the choice on every Ay of the collection to be made at the same time.

Pause Tutorial 5

Figure 5 shows our formulation and solution [Sengupta, 1997] of the inverse ill-posed problem f (x) = y. In sub-diagram X − XB − f (X), the surjection p : X → XB is the counterpart of the quotient map Q of Fig. 2 that is known in the present context as the identification of X with XB (as it identifies each saturated subset of X with its representative point in XB ), with the space (XB , FT{U; p}) carrying the identification topology FT{U; p} being known as an identification space. By sub-diagram Y − XB − f (X), the image f (X) of f gets the subspace topology15 IT{j; V} from (Y, V) by the inclusion j : f (X) → Y when its open sets are generated as, and only as, j⁻¹(V ) = V ∩ f (X) for V ∈ V. Furthermore if the bijection fB connecting XB and f (X) (which therefore acts as a 1 : 1 correspondence between their points, implying that these sets are set-theoretically identical

15
In a subspace A of X, a subset UA of A is open iff UA = A ∩ U for some open set U of X. The notion of subspace topology can be formalized with the help of the inclusion map i : A → (X, U) that puts every point of A back to where it came from, thus
UA = {UA = A ∩ U : U ∈ U}
= {i− (U ) : U ∈ U}.
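The closure, interior and boundary relations (a) to (g) of Tutorial 4 and the subspace topology of footnote 15 can be checked mechanically on a small finite space. The following is only a minimal sketch: the four-point set `X`, the topology `U`, and all function names are illustrative assumptions, not objects taken from the paper.

```python
from itertools import chain, combinations

def powerset(s):
    """All subsets of s, as frozensets."""
    s = list(s)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))]

# Hypothetical finite space: X with a topology U (closed under union/intersection).
X = frozenset({1, 2, 3, 4})
U = {frozenset(), frozenset({1}), frozenset({1, 2}),
     frozenset({3, 4}), frozenset({1, 3, 4}), X}

def interior(A):
    """Property (f): union of all open sets contained in A."""
    result = frozenset()
    for G in U:
        if G <= A:
            result |= G
    return result

def closure(A):
    """Property (g): intersection of all closed sets containing A."""
    result = X
    for G in U:
        F = X - G                      # closed sets are complements of open sets
        if A <= F:
            result &= F
    return result

def boundary(A):
    """Property (b): Bdy(A) = Cl(A) ∩ Cl(X − A)."""
    return closure(A) & closure(X - A)

def subspace_topology(A):
    """Footnote 15: open sets of the subspace A are the traces A ∩ G, G in U."""
    return {A & G for G in U}
```

Looping over `powerset(X)` and comparing `interior(A)` with `X - closure(X - A)`, `A - boundary(A)` and `closure(A) - boundary(A)` reproduces property (c) for every subset of this particular space; it is a check on one example, not a proof.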


[Figure 5: commutative diagram of the spaces X, XB , f (X) and Y ]

Fig. 5. Solution of ill-posed problem f (x) = y, f : X → Y . G : Y → XB , a generalized inverse of f because of f Gf = f and Gf G = G which follows from the commutativity of the diagrams, is a functional selection of the multi-inverse f − : (Y, V) –→→ (X, U). The injective and surjective restrictions of f will be topologically denoted by their generic notations e and q, respectively.

except for their names) is image continuous, then by Theorem A.2.1 of Appendix A.2, so is the association q = fB ◦ p : X → f (X) that associates saturated sets of X with elements of f (X); this makes f (X) look like an identification space of X by assigning to it the topology FT{U; q}. On the other hand if fB happens to be preimage continuous, then XB acquires, by Theorem A.2.2, the initial topology IT{e; V} by the embedding e : XB → Y that embeds XB into Y through j ◦ fB , making it look like a subspace of Y .16 In this dual situation, fB has the highly interesting topological property of being simultaneously image and preimage continuous when the open sets of XB and f (X) (which are simply the fB⁻¹-images of the open sets of f (X) which, in turn, are the fB -images of these saturated open sets) can be considered to have been generated by fB , and are respectively the smallest and largest collection of subsets of X and Y that makes fB ini(tial-fi)nal continuous [Sengupta, 1997]. A bijective ininal function such as fB is known as a homeomorphism and ininality for functions that are neither 1 : 1 nor onto is a generalization of homeomorphism for bijections; refer to Eqs. (A.47) and (A.48) for a set-theoretic formulation of this distinction. A homeomorphism f : (X, U) → (Y, V) renders the homeomorphic spaces (X, U) and (Y, V) topologically indistinguishable which may be considered to be identical in as far as their topological properties are concerned.

Remark. It may be of some interest here to speculate on the significance of ininality in our work. Physically, a map f : (X, U) → (Y, V) between two spaces can be taken to represent an interaction between them and the algebraic and topological characters of f determine the nature of this interaction. A simple bijection merely sets up a correspondence, that is an interaction, between every member of X with some member of Y , whereas a continuous map establishes the correspondence among the special category of “open” sets. Open sets, as we see in Appendix A.1, are the basic ingredients in the theory of convergence of sequences, nets and filters, and the characterization of open sets in terms of convergence, namely that a set G in X is open in it if every net or sequence that converges in X to a point in G is eventually in G, see Appendix A.1, may be interpreted to mean that such sets represent groupings of elements that require membership of the group before permitting an element to belong to it; an open set unlike its complement the closed or selfish set, however, does not forbid a net that has been eventually in it to settle down in its selfish neighbor, who nonetheless will never allow such a situation to develop in its own territory. An ininal map forces these well-defined and definite groups in (X, U) and (Y, V) to interact with each other through f ; this is not possible with simple continuity as there may be open sets in X that are not derived from those of Y and non-open sets in Y whose inverse images are open in X. It is our hypothesis that the driving force behind the evolution of a system represented by the input–output relation f (x) = y is the attainment of the ininal triple state (X, f, Y ) for the system. A preliminary analysis of this hypothesis is to be found in Sec. 4.2.
For ininality of the interaction, it is therefore necessary to have

FT{U; f } = IT{j; V}
IT{f ; V} = FT{U; p} ; (26)

in what follows we will refer to the injective and surjective restrictions of f by their generic topological symbols of embedding e and association q, respectively. What are the topological characteristics of f

16
A surjective function is an association iff it is image continuous and an injective function is an embedding iff it is preimage
continuous.
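For finite spaces the two topologies that appear in footnote 16 and in Eq. (26) can be computed directly. The sketch below assumes the standard readings FT{U; q} = {V ⊆ Y : q⁻(V) ∈ U} (final, or identification, topology) and IT{e; V} = {e⁻(V) : V ∈ V} (initial topology); the sets, maps and function names are hypothetical and chosen only for illustration.

```python
from itertools import chain, combinations

def powerset(s):
    """All subsets of s, as frozensets."""
    s = list(s)
    return [frozenset(c) for c in
            chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))]

def preimage(func, V, domain):
    """Inverse image of V under a map given as a dict on `domain`."""
    return frozenset(x for x in domain if func[x] in V)

def final_topology(q, X, U, Y):
    """Largest topology on Y making q continuous: V is open iff q^-(V) is open in X."""
    return {V for V in powerset(Y) if preimage(q, V, X) in U}

def initial_topology(e, XB, V_top):
    """Smallest topology on XB making e continuous: the preimages of open sets of Y."""
    return {preimage(e, V, XB) for V in V_top}

# Hypothetical example: a surjection q with saturated sets {1, 2}, {3}, {4}.
X = frozenset({1, 2, 3, 4})
U = {frozenset(), frozenset({1}), frozenset({1, 2}),
     frozenset({3, 4}), frozenset({1, 3, 4}), X}
Y = frozenset({"a", "b", "c"})
q = {1: "a", 2: "a", 3: "b", 4: "c"}

FT = final_topology(q, X, U, Y)

# An injection e of a two-point set into Y, with Y carrying the topology FT.
XB = frozenset({1, 3})
e = {1: "a", 3: "b"}
IT = initial_topology(e, XB, FT)
```

Here `FT` keeps only those subsets of `Y` whose preimages are saturated open sets of `X`, which is the sense in which `q` behaves as an association; `IT` is the trace of that topology pulled back along `e`.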


in order that the requirements of Eq. (26) be met? From Appendix A.1, it should be clear by superposing the two parts of Fig. 21 over each other that given q : (X, U) → (f (X), FT{U; q}) in the first of these equations, IT{j; V} will equal FT{U; q} iff j is an ininal open inclusion and Y receives FT{U; f }. In a similar manner, preimage continuity of e requires p to be open ininal and f to be preimage continuous if the second of Eq. (26) is to be satisfied. Thus under the restrictions imposed by Eq. (26), the interaction f between X and Y must be such as to give X the smallest possible topology of f -saturated sets and Y the largest possible topology of images of all these sets: f , under these conditions, is an ininal transformation. Observe that a direct application of parts (b) of Theorems A.2.1 and A.2.2 to Fig. 5 implies that Eq. (26) is satisfied iff fB is ininal, that is iff it is a homeomorphism. Ininality of f is simply a reflection of this as it is neither 1 : 1 nor onto.
The f - and p-images of each saturated set of X are singletons in Y (these saturated sets in X arose, in the first place, as f − ({y}) for y ∈ Y ) and in XB , respectively. This permits the embedding e = j ◦ fB to give XB the character of a virtual subspace of Y just as i makes f (X) a real subspace. Hence the inverse images p− (xr ) = f − (e(xr )) with xr ∈ XB , and q − (y) = f − (i(y)) with y = fB (xr ) ∈ f (X) are the same, and are just the corresponding f − images via the injections e and i, respectively. G, a left inverse of e, is a generalized inverse of f . G is a generalized inverse because the two set-theoretic defining requirements of f Gf = f and Gf G = G for the generalized inverse are satisfied, as Fig. 5 shows, in the following forms

jfB Gf = f ,   GjfB G = G .

In fact the commutativity embodied in these equalities is self evident from the fact that e = ifB is a left inverse of G, that is eG = 1Y . On putting back XB into X by identifying each point of XB with the set it came from yields the required set-valued inverse f − , and G may be viewed as a functional selection of the multiinverse f − .

[Figure 6: graph of f on [0, 1]]

Fig. 6. The function
\[
f(x) = \begin{cases} 2x, & 0 \le x < 3/8 \\ 3/4, & 3/8 \le x \le 5/8 \\ 7/6 - 2x/3, & 5/8 < x \le 1. \end{cases}
\]

An injective branch of a function f in this work refers to the restrictions fB and its associated inverse fB⁻¹ .
The following example of an inverse ill-posed problem will be useful in fixing the notations introduced above. Let f on [0, 1] be the function shown in Fig. 6.
Then f (x) = y is well-posed for [0, 1/4), and ill-posed in [1/4, 1]. There are two injective branches of f in {[1/4, 3/8) ∪ (5/8, 1]}, and f is constant ill-posed in [3/8, 5/8]. Hence the basic component fB of f can be taken to be fB (x) = 2x for x ∈ [0, 3/8) having the inverse fB⁻¹ (y) = y/2 with y ∈ [0, 3/4]. The generalized inverse is obtained by taking [0, 3/4] as a subspace of [0, 1], while the multiinverse f − follows by associating with every point of the basic domain [0, 1]B = [0, 3/8], the respective equivalent points [3/8]f = [3/8, 5/8] and [x]f = {x, 7/4 − 3x} for x ∈ [1/4, 3/8). Thus the inverses G and f − of f are17
\[
G(y) = \begin{cases} y/2, & y \in [0, 3/4] \\ 0, & y \in (3/4, 1] \end{cases},
\]
17
If y ∉ R(f ) then f − ({y}) := ∅ which is true for any subset of Y − R(f ). However from the set-theoretic definition of natural numbers that requires 0 := ∅, 1 = {0}, 2 = {0, 1} to be defined recursively, it follows that f − (y) can be identified with 0 whenever y is not in the domain of f − . Formally, the successor set A+ = A ∪ {A} of A can be used to write 0 := ∅, 1 = 0+ = 0 ∪ {0}, 2 = 1+ = 1 ∪ {1} = {0} ∪ {1}, 3 = 2+ = 2 ∪ {2} = {0} ∪ {1} ∪ {2}, etc. Then the set of natural numbers N is defined to be the intersection of all the successor sets, where a successor set S is any set that contains ∅ and A+ whenever A belongs to S. Observe how in the successor notation, countable union of singleton integers recursively define the corresponding sum of integers.
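The function of Fig. 6 and its two inverses can be written out and the defining identities f Gf = f and Gf G = G checked on a grid of rational points. This is a minimal sketch under stated assumptions: exact arithmetic via `fractions.Fraction`, the value 0 for y outside R(f) following the convention of footnote 17, and a tagged tuple (an illustrative choice, not the paper's notation) to represent the interval [3/8, 5/8].

```python
from fractions import Fraction as Fr

def f(x):
    """The piecewise function of Fig. 6 on [0, 1]."""
    if 0 <= x < Fr(3, 8):
        return 2 * x
    if x <= Fr(5, 8):
        return Fr(3, 4)
    if x <= 1:
        return Fr(7, 6) - Fr(2, 3) * x
    raise ValueError("x must lie in [0, 1]")

def G(y):
    """Generalized inverse: the basic inverse y/2 on [0, 3/4], extended by 0."""
    return y / 2 if 0 <= y <= Fr(3, 4) else Fr(0)

def f_minus(y):
    """Multi-inverse f^-: a set of points, or a tagged interval at y = 3/4."""
    if 0 <= y < Fr(1, 2):
        return {y / 2}
    if y < Fr(3, 4):
        return {y / 2, Fr(7, 4) - Fr(3, 2) * y}
    if y == Fr(3, 4):
        return ("interval", Fr(3, 8), Fr(5, 8))
    return {Fr(0)}          # y outside R(f): identified with 0 (footnote 17)

grid = [Fr(k, 64) for k in range(65)]
fGf_ok = all(f(G(f(x))) == f(x) for x in grid)
GfG_ok = all(G(f(G(y))) == G(y) for y in grid)
```

On this grid both `fGf_ok` and `GfG_ok` evaluate to `True`, and each element returned by `f_minus(y)` for y in [0, 3/4) maps back to y under `f`, which is the sense in which G is a functional selection of f − .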


\[
f^{-}(y) = \begin{cases}
y/2, & y \in [0, 1/2) \\
\{\, y/2,\; 7/4 - 3y/2 \,\}, & y \in [1/2, 3/4) \\
[3/8, 5/8], & y = 3/4 \\
0, & y \in (3/4, 1],
\end{cases}
\]
which shows that f − is multivalued. In order to avoid cumbersome notations, an injective branch of f will always refer to a representative basic branch fB , and its “inverse” will mean either fB⁻¹ or G.

Example 2.3 (Revisited). The row reduced echelon form of the augmented matrix (A|b) of Example 2.3 is
\[
(A|b) \to \left(\begin{array}{ccccc|c}
1 & -3 & 0 & 3/2 & 1/2 & 5b_1/2 - b_2/2 \\
0 & 0 & 1 & -1/4 & 3/4 & -3b_1/4 + b_2/4 \\
0 & 0 & 0 & 0 & 0 & -2b_1 + b_3 \\
0 & 0 & 0 & 0 & 0 & b_1 - b_2 + b_4
\end{array}\right) \qquad (27)
\]
The multifunctional solution x = A− b, with b any element of Y = R⁴ not necessarily in the image of a, is
\[
x = A^{-}b = Gb + x_2 \begin{pmatrix} 3 \\ 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}
+ x_4 \begin{pmatrix} -3/2 \\ 0 \\ 1/4 \\ 1 \\ 0 \end{pmatrix}
+ x_5 \begin{pmatrix} -1/2 \\ 0 \\ -3/4 \\ 0 \\ 1 \end{pmatrix},
\]
with its multifunctional character arising from the arbitrariness of the coefficients x2 , x4 and x5 . The generalized inverse
\[
G = \begin{pmatrix}
5/2 & -1/2 & 0 & 0 \\
0 & 0 & 0 & 0 \\
-3/4 & 1/4 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix} : Y \to X_B \qquad (28)
\]
is the unique matrix representation of the functional inverse aB⁻¹ : a(R⁵) → XB extended to Y defined according to18
\[
g(b) = \begin{cases} a_B^{-1}(b), & \text{if } b \in R(a) \\ 0, & \text{if } b \in Y - R(a), \end{cases} \qquad (29)
\]
that bears comparison with the basic inverse
\[
A_B^{-1}(b^{*}) = \begin{pmatrix}
5/2 & -1/2 & 0 & 0 \\
0 & 0 & 0 & 0 \\
-3/4 & 1/4 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix} \times \begin{pmatrix} b_1 \\ b_2 \\ 2b_1 \\ b_2 - b_1 \end{pmatrix} : a(R^5) \to X_B
\]
between the two-dimensional column and row spaces of A which is responsible for the particular solution of Ax = b. Thus G is simply AB⁻¹ acting on its domain a(X) considered a subspace of Y , suitably extended to the whole of Y . That it is indeed a generalized inverse is readily seen through the matrix multiplications GAG and AGA that can be verified to reproduce G and A, respectively. Comparison of Eqs. (12) and (29) shows that the Moore–Penrose inverse differs from ours through the geometrical constraints imposed in its definition, Eqs. (13). Of course, this results in a more complex inverse (14) as compared to our very simple (28); nevertheless it is true that both the inverses satisfy
\[
E((E(G_{MP}))^{T}) = \begin{pmatrix}
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0
\end{pmatrix} = E((E(G))^{T})
\]
where E(A) is the row-reduced echelon form of A. The canonical simplicity of Eq. (28) as compared to Eq. (14) is a general feature that suggests a more natural choice of bases by the map a than the orthogonal set imposed by Moore and Penrose. This is to be expected since the MP inverse, governed by Eq. (13), is a subset of our less restricted inverse

18
See footnote 17 for a justification of the definition when b is not in R(a).
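The claim that G of Eq. (28) is a generalized inverse can be verified by direct multiplication. In the sketch below the matrix `A` is an assumption: it is reconstructed from the two rows (1, −3, 2, 1, 2) and (3, −9, 10, 2, 9) quoted for the row space together with the dependencies b3 = 2b1 and b4 = b2 − b1 read off from Eq. (27), since Example 2.3 itself lies outside this section. `matmul` is a hypothetical helper, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction as Fr

# Assumed reconstruction of A (rows 3 and 4 follow from b3 = 2*b1, b4 = b2 - b1).
A = [[1, -3, 2, 1, 2],
     [3, -9, 10, 2, 9],
     [2, -6, 4, 2, 4],
     [2, -6, 8, 1, 7]]

# The generalized inverse G of Eq. (28).
G = [[Fr(5, 2), Fr(-1, 2), 0, 0],
     [0, 0, 0, 0],
     [Fr(-3, 4), Fr(1, 4), 0, 0],
     [0, 0, 0, 0],
     [0, 0, 0, 0]]

def matmul(P, Q):
    """Plain row-by-column product of two matrices given as lists of rows."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)]
            for row in P]

AG = matmul(A, G)
AGA = matmul(AG, A)      # should reproduce A
GAG = matmul(G, AG)      # should reproduce G

# A right-hand side in the image of A, and the particular solution G b.
b = [[1], [5], [2], [4]]
x_particular = matmul(G, b)

# The null-space vectors quoted in the text, as columns.
null_vectors = [[3, 1, 0, 0, 0], [-6, 0, 1, 4, 0], [-2, 0, -3, 0, 4]]
```

With this `A`, the product `AG` has first two rows equal to those of the 4 × 4 identity, consistent with the row-reduced form quoted in the text, and `matmul(A, x_particular)` returns `b`.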


described by only the first two of (13); more specifically the difference is made clear in Fig. 4(a) which shows that for any b ∉ R(A), only GMP (b⊥ ) = 0 as compared to G(b) = 0. This seems to imply that introducing extraneous topological considerations into the purely set-theoretic inversion process may not be a recommended way of inverting, and the simple bases comprising the row and null spaces of A and Aᵀ (that are mutually orthogonal just as those of the Moore–Penrose) are a better choice for the particular problem Ax = b than the general orthonormal bases that the MP inverse introduces. These “good” bases, with respect to which the generalized inverse G has a considerably simpler representation, are obtained in a straightforward manner from the row-reduced forms of A and Aᵀ . These bases are

(a) The column space of A is spanned by the columns (1, 3, 2, 2)ᵀ and (1, 5, 2, 4)ᵀ of A that correspond to the basic columns containing the leading 1’s in the row-reduced form of A,
(b) The null space of Aᵀ is spanned by the solutions (−2, 0, 1, 0)ᵀ and (1, −1, 0, 1)ᵀ of the equation Aᵀ b = 0,
(c) The row space of A is spanned by the rows (1, −3, 2, 1, 2) and (3, −9, 10, 2, 9) of A corresponding to the non-zero rows in the row-reduced form of A,
(d) The null space of A is spanned by the solutions (3, 1, 0, 0, 0), (−6, 0, 1, 4, 0), and (−2, 0, −3, 0, 4) of the equation Ax = 0.

The main differences between the natural “good” bases and the MP-bases that are responsible for the difference in the form of inverses, is that the latter have the additional restrictions of being orthogonal to each other (recall the orthogonality property of the Q-matrices), and the more severe of basis vectors mapping onto basis vectors according to Axi = σi bi , i = 1, . . . , r, where the {xi }, i = 1, . . . , n, and {bj }, j = 1, . . . , m, are the eigenvectors of Aᵀ A and AAᵀ respectively and (σi ), i = 1, . . . , r, are the positive square roots of the non-zero eigenvalues of Aᵀ A (or of AAᵀ ), with r denoting the dimension of the row or column space. This is considered as a serious restriction as the linear combination of the basis {bj } that Axi should otherwise have been equal to, allows a greater flexibility in the matrix representation of the inverse that shows up in the structure of G. These are, in fact, quite general considerations in the matrix representation of linear operators; thus the basis that diagonalizes an n × n matrix (when this is possible) is not the standard “diagonal” orthonormal basis of Rⁿ , but a problem-dependent, less canonical, basis consisting of the n eigenvectors of the matrix. The 0-rows of the inverse of Eq. (28) result from the three-dimensional null-space variables x2 , x4 and x5 , while the 0-columns come from the two-dimensional image-space dependency of b3 , b4 on b1 and b2 , that is from the last two zero rows of the reduced echelon form (27) of the augmented matrix.
We will return to this theme of the generation of a most appropriate problem-dependent topology for a given space in the more general context of chaos in Sec. 4.2.
In concluding this introduction to generalized inverses we note that the inverse G of f comes very close to being a right inverse: thus even though AG ≠ 1₂ its row-reduced form
\[
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}
\]
is to be compared with the corresponding less satisfactory
\[
\begin{pmatrix}
1 & 0 & 2 & -1 \\
0 & 1 & 0 & 1 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}
\]
representation of AGMP .

3. Multifunctional Extension of Function Spaces
The previous section has considered the solution of ill-posed problems as multifunctions and has shown how this solution may be constructed. Here we introduce the multifunction space Multi| (X) as the first step toward obtaining a smallest dense extension Multi(X) of the function space Map(X). Multi| (X) is basic to our theory of chaos [Sengupta & Ray, 2000] in the sense that a chaotic state of a system can be fully described by such an indeterminate multifunctional state. In fact, multifunctions also enter in a natural way in describing the spectrum of nonlinear functions that we consider in Sec. 6; this is required to complete the construction of the smallest extension Multi(X) of the function space Map(X). The main tool in obtaining the


space Multi| (X) from Map(X) is a generalization of the technique of pointwise convergence of continuous functions to (discontinuous) functions. In the analysis below, we consider nets instead of sequences as the spaces concerned, like the topology of pointwise convergence, may not be first countable, Appendix A.1.

3.1. Graphical convergence of a net of functions
Let (X, U) and (Y, V) be Hausdorff spaces and (fα )α∈D : X → Y be a net of piecewise continuous functions, not necessarily with the same domain or range, and suppose that for each α ∈ D there is a finite set Iα = {1, 2, . . . Pα } such that fα− has Pα functional branches possibly with different domains; obviously Iα is a singleton iff f is injective. For each α ∈ D, define functions (gαi )i∈Iα : Y → X such that

fα gαi fα = f^I_αi ,   i = 1, 2, . . . Pα ,

where f^I_αi is a basic injective branch of fα on some subset of its domain: gαi f^I_αi = 1X on D(f^I_αi ), f^I_αi gαi = 1Y on D(gαi ) for each i ∈ Iα . The use of nets and filters is dictated by the fact that we do not assume X and Y to be first countable. In the application to the theory of dynamical systems that follows, X and Y are compact subsets of R when the use of sequences suffices.
In terms of the residual and cofinal subsets Res(D) and Cof(D) of a directed set D (Definition A.1.7), with x and y in the equations below being taken to belong to the required domains, define subsets D− of X and R− of Y as

D− = {x ∈ X : ((fν (x))ν∈D converges in (Y, V))} (30)
R− = {y ∈ Y : (∃i ∈ Iν )((gνi (y))ν∈D converges in (X, U))} (31)

Thus,
D− is the set of points of X on which the values of a given net of functions (fα )α∈D converge pointwise in Y . Explicitly, this is the subset of X on which subnets19 in Map(X, Y ) combine to form a net of functions that converge pointwise to a limit function F : D− → Y .
R− is the set of points of Y on which the values of the nets in X generated by the injective branches of (fα )α∈D converge pointwise in Y . Explicitly, this is the subset of Y on which subnets of injective branches of (fα )α∈D in Map(Y, X) combine to form a net of functions that converge pointwise to a family of limit functions G : R− → X. Depending on the nature of (fα )α∈D , there may be more than one R− with a corresponding family of limit functions on each of them. To simplify the notation, we will usually let G : R− → X denote all the limit functions on all the sets R− .
If we consider cofinal rather than residual subsets of D then corresponding D+ and R+ can be expressed as

D+ = {x ∈ X : ((fν (x))ν∈Cof(D) converges in (Y, V))} (32)
R+ = {y ∈ Y : (∃i ∈ Iν )((gνi (y))ν∈Cof(D) converges in (X, U))} . (33)

It is to be noted that the conditions D+ = D− and R+ = R− are necessary and sufficient for the Kuratowski convergence to exist. Since D+ and R+ differ from D− and R− only in having cofinal subsets of D in place of residual ones, and since residual sets are also cofinal, it follows that D− ⊆ D+ and R− ⊆ R+ . The sets D− and R− serve for the convergence of a net of functions just as D+ and R+ are for the convergence of subnets of the nets (adherence). The latter sets are needed when subsequences are to be considered as sequences in their own right as, for example, in dynamical systems theory in the case of ω-limit sets.
As an illustration of these definitions, consider the sequence of injective functions on the interval [0, 1] fn (x) = 2ⁿ x, for x ∈ [0, 1/2ⁿ ], n = 0, 1, 2 . . . . Then D0.2 is the set {0, 1, 2} and only D0 is eventual in D. Hence D− is the single point set {0}. On the other hand Dy is eventual in D for all y and R− is [0, 1].

Definition 3.1 (Graphical Convergence of a net of functions). A net of functions (fα )α∈D : (X, U) → (Y, V) is said to converge graphically if either D− ≠ ∅ or R− ≠ ∅; in this case let F : D− → Y and G : R− → X be the entire collection of limit functions. Because of the assumed Hausdorffness of X and Y , these limits are well defined.
The graph of the graphical limit M of the net (fα ) : (X, U) → (Y, V), denoted by fα →G M, is the

19
A subnet is the generalized uncountable equivalent of a subsequence; for the technical definition, see Appendix A.1.
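The illustration fn(x) = 2ⁿx on [0, 1/2ⁿ] can be made concrete by listing, for a given point, the indices n at which the point lies in the domain of fn. The sketch below is illustrative only: `index_set` and `g` are hypothetical names, the index range is truncated at a finite `n_max`, and exact fractions are used so that 0.2 is represented as 1/5.

```python
from fractions import Fraction as Fr

def index_set(x, n_max):
    """Indices n <= n_max for which x lies in the domain [0, 2**-n] of f_n."""
    return [n for n in range(n_max + 1) if 0 <= x <= Fr(1, 2 ** n)]

def g(n, y):
    """Inverse branch g_n(y) = y / 2**n of f_n(x) = 2**n * x, for y in [0, 1]."""
    return Fr(y) / 2 ** n

# D_x for x = 0.2 stops after n = 2, so the net is not eventually defined there;
# for x = 0 every index occurs, which is why D_- reduces to the single point {0}.
D_02 = index_set(Fr(1, 5), 20)
D_0 = index_set(0, 20)

# Every y in [0, 1] lies in the domain of every g_n, and g_n(y) shrinks to 0,
# consistent with R_- = [0, 1] and the limit G(y) = 0.
tail_value = g(40, 1)
```

Since `D_02` is the finite list `[0, 1, 2]` while `D_0` contains every index up to `n_max`, only x = 0 has a residual (eventual) index set, matching the statement that D− = {0}.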


subset of D− × R− that is the union of the graphs of the function F and the multifunction G−

GM = GF ∪ GG−

where

GG− = {(x, y) ∈ X × Y : (y, x) ∈ GG ⊆ Y × X}.

Begin Tutorial 6: Graphical Convergence
The following two examples are basic to the understanding of the graphical convergence of functions to multifunctions and were the examples that motivated our search of an acceptable technique that did not require vertical portions of limit relations to disappear simply because they were non-functions: the disturbing question that needed an answer was how not to mathematically sacrifice these extremely significant physical components of the limiting correspondences. Furthermore, it appears to be quite plausible to expect a physical interaction between two spaces X and Y to be a consequence of both the direct interaction represented by f : X → Y and also the inverse interaction f − : Y –→→ X, and our formulation of pointwise biconvergence is a formalization of this idea. Thus the basic examples (1) and (2) below produce multifunctions instead of discontinuous functions that would be obtained by the usual pointwise limit.

Example 3.1
(1)
\[
f_n(x) = \begin{cases} 0, & -1 \le x \le 0 \\ nx, & 0 \le x \le 1/n \\ 1, & 1/n \le x \le 1 \end{cases} \; : [-1, 1] \to [0, 1]
\]
\[
g_n(y) = \frac{y}{n} : [0, 1] \to \left[0, \frac{1}{n}\right]
\]
Then
\[
F(x) = \begin{cases} 0, & -1 \le x \le 0 \\ 1, & 0 < x \le 1 \end{cases} \quad \text{on } D_- = D_+ = [-1, 0] \cup (0, 1]
\]
G(y) = 0 on R− = [0, 1] = R+ .

The graphical limit is ([−1, 0], 0) ∪ (0, [0, 1]) ∪ ((0, 1], 1).
(2) fn (x) = nx for x ∈ [0, 1/n] gives gn (y) = y/n : [0, 1] → [0, 1/n]. Then

F (x) = 0 on D− = {0} = D+ ,
G(y) = 0 on R− = [0, 1] = R+ .

The graphical limit is (0, [0, 1]).
In these examples that we consider to be the prototypes of graphical convergence of functions to multifunctions, G(y) = 0 on R− because gn (y) → 0 for all y ∈ R− . Compare the graphical multifunctional limits with the corresponding usual pointwise functional limits characterized by discontinuity at x = 0. Two more examples from Sengupta and Ray [2000] that illustrate this new convergence principle tailored specifically to capture one-to-many relations are shown in Fig. 7 which also provides an example in Fig. 7(c) of a function whose iterates do not converge graphically because in this case both the sets D− and R− are empty. The power of graphical convergence in capturing multifunctional limits is further demonstrated by the example of the sequence (sin nπx), n = 1, 2, . . . , that converges to 0 both 1-integrally and test-functionally, Eqs. (3) and (4).
It is necessary to understand how the concepts of eventually in and frequently in of Appendix A.2 apply in examples (a) and (b) of Fig. 7. In these two examples we have two subsequences one each for the even indices and the other for the odd. For a point-to-point functional relation, this would mean that the sequence frequents the adherence set adh(x) of the sequence (xn ) but does not converge anywhere as it is not eventually in every neighborhood of any point. For a multifunctional limit however it is possible, as demonstrated by these examples, for the subsequences to be eventually in every neighborhood of certain subsets common to the eventual limiting sets of the subsequences; this intersection of the subsequential limits is now


[Figure 7: four panels. (a) and (b): sequences of functions with separate subsequences for n even and n odd; (c): 12 iterates of −0.05 + x − x²; (d): 12 iterates of 0.7 + x − x², with the points a, c and the level α marked on the axes.]

Fig. 7. The graphical limits are:
(a) F (x) = 1 for 0 ≤ x ≤ 1 and F (x) = 0 for 1 < x ≤ 2, on D− = [0, 1] ∪ (1, 2], and G(y) = 1 on R− = [0, 1]. Also G = 1 on R+ = [0, 3/2] and G = 1 on R+ = [−1/2, 1].
(b) F (x) = 1 on D− = {0} and G(y) = 0 on R− = {1}. Also F (x) = −1/2, 0, 1, 3/2 respectively on D+ = (0, 3], {2}, {0}, (0, 2) and G(y) = 0, 0, 2, 3 respectively on R+ = (−1/2, 1], [1, 3/2), [0, 3/2), [−1/2, 0).
(c) For f (x) = −0.05 + x − x² , no graphical limit as D− = ∅ = R− .
(d) For f (x) = 0.7 + x − x² , F (x) = α on D− = [a, c], G1 (y) = a and G2 (y) = c on R− = (−∞, α]. Notice how the two fixed points and their equivalent images define the converged limit rectangular multi. As in example (1) one has D− = D+ ; also R− = R+ .
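The behaviour summarized in panels (c) and (d) of Fig. 7 can be reproduced by iterating the two quadratic maps numerically. The sketch below assumes that α is the positive fixed point √0.7 of f(x) = 0.7 + x − x², that a = −α is the other fixed point, and that c = 1 + α is the second preimage of a (which follows from f(x) = f(1 − x)); the caption does not state these values explicitly, and the function names are illustrative.

```python
import math

def iterate(f, x, n):
    """Apply f to x a total of n times."""
    for _ in range(n):
        x = f(x)
    return x

def f_c(x):
    """Panel (c): no real fixed point, since x = f_c(x) would need x*x = -0.05."""
    return -0.05 + x - x * x

def f_d(x):
    """Panel (d): fixed points at +sqrt(0.7) and -sqrt(0.7)."""
    return 0.7 + x - x * x

alpha = math.sqrt(0.7)        # attracting fixed point, |f_d'(alpha)| < 1
a, c = -alpha, 1 + alpha      # assumed endpoints of D_- = [a, c]

# Iterates started inside (a, c) settle on alpha.
limit_from_half = iterate(f_d, 0.5, 200)

# Each step of f_c lowers x by 0.05 + x*x, so its orbit decreases monotonically.
orbit_c = [iterate(f_c, 0.3, k) for k in range(13)]
```

The orbit of `f_c` never settles because each step decreases the value by at least 0.05, which is the numerical counterpart of D− = ∅ = R− in panel (c); in panel (d), `limit_from_half` agrees with `alpha` to within floating-point tolerance.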

defined to be the limit of the original sequence. A similar situation obtains, for example, in the solution of simultaneous equations: The solution of the equation a11 x1 + a12 x2 = b1 for one of the variables x2 say with a12 ≠ 0, is the set represented by the straight line x2 = m1 x1 + c1 for all x1 in its domain, while for a different set of constants a21 , a22 and b2 the solution is the entirely different set x2 = m2 x1 + c2 , under the assumption that m1 ≠ m2 and c1 ≠ c2 . Thus even though the individual equations (subsequences) of the simultaneous set of equations (sequence) may have distinct solutions (limits), the solution of the equations is their common point of intersection.
Considered as sets in X × Y , the discussion of convergence of a sequence of graphs fn : X → Y would be incomplete without a mention of the convergence of a sequence of sets under the Hausdorff metric that is so basic in the study of fractals. In this case, one talks about the convergence of a sequence of compact subsets of the metric space Rⁿ so that the sequences, as also the limit points that


are the fractals, are compact subsets of $\mathbb{R}^n$. Let $\mathcal{K}$ denote the collection of all nonempty compact subsets of $\mathbb{R}^n$. Then the Hausdorff metric $d_H$ between two sets on $\mathcal{K}$ is defined to be

$$d_H(E, F) = \max\{\delta(E, F),\ \delta(F, E)\}, \qquad E, F \in \mathcal{K},$$

where

$$\delta(E, F) = \max_{x \in E} \min_{y \in F} \|x - y\|_2$$

is the non-symmetric 2-norm distance in $\mathbb{R}^n$. The power and utility of the Hausdorff distance is best understood in terms of the dilations $E + \varepsilon := \bigcup_{x \in E} D_\varepsilon(x)$ of a subset $E$ of $\mathbb{R}^n$ by $\varepsilon$, where $D_\varepsilon(x)$ is a closed ball of radius $\varepsilon$ at $x$; physically a dilation of $E$ by $\varepsilon$ is a closed $\varepsilon$-neighborhood of $E$. Then a fundamental property of $d_H$ is that $d_H(E, F) \le \varepsilon$ iff both $E \subseteq F + \varepsilon$ and $F \subseteq E + \varepsilon$ hold simultaneously, which leads [Falconer, 1990] to the interesting consequence that

If $(F_n)_{n=1}^{\infty}$ and $F$ are nonempty compact sets, then $\lim_{n \to \infty} F_n = F$ in the Hausdorff metric iff $F_n \subseteq F + \varepsilon$ and $F \subseteq F_n + \varepsilon$ eventually. Furthermore if $(F_n)_{n=1}^{\infty}$ is a decreasing sequence of elements of a filter-base in $\mathbb{R}^n$, then the nonempty and compact limit set $F$ is given by

$$\lim_{n \to \infty} F_n = F = \bigcap_{n=1}^{\infty} F_n .$$

Note that since $\mathbb{R}^n$ is Hausdorff, the assumed compactness of $F_n$ ensures that they are also closed in $\mathbb{R}^n$; $F$, therefore, is just the adherent set of the filter-base. In the deterministic algorithm for the generation of fractals by the so-called iterated function system (IFS) approach, $F_n$ is the inverse image by the $n$th iterate of a non-injective function $f$ having a finite number of injective branches and converging graphically to a multifunction. Under the conditions stated above, the Hausdorff metric ensures convergence of any class of compact subsets in $\mathbb{R}^n$. It appears eminently plausible that our multifunctional graphical convergence on Map($\mathbb{R}^n$) implies Hausdorff convergence on $\mathbb{R}^n$: in fact pointwise biconvergence involves simultaneous convergence of image and preimage nets on $Y$ and $X$, respectively. Thus confining ourselves to the simpler case of pointwise convergence, if $(f_\alpha)_{\alpha \in D}$ is a net of functions in Map(X, Y), then the following theorem expresses the link between convergence in Map(X, Y) and in $Y$.

Theorem 3.1. A net of functions $(f_\alpha)_{\alpha \in D}$ converges to a function $f$ in (Map(X, Y), $\mathcal{T}$) in the topology of pointwise convergence iff $(f_\alpha)$ converges pointwise to $f$ in the sense that $f_\alpha(x) \to f(x)$ in $Y$ for every $x$ in $X$.

Proof. Necessity. First consider $f_\alpha \to f$ in (Map(X, Y), $\mathcal{T}$). For an open neighborhood $V$ of $f(x)$ in $Y$ with $x \in X$, let $B(x; V)$ be a local neighborhood of $f$ in (Map(X, Y), $\mathcal{T}$), see Eq. (A.6) in Appendix A.1. By assumption of convergence, $(f_\alpha)$ must eventually be in $B(x; V)$, implying that $f_\alpha(x)$ is eventually in $V$. Hence $f_\alpha(x) \to f(x)$ in $Y$.

Sufficiency. Conversely, if $f_\alpha(x) \to f(x)$ in $Y$ for every $x \in X$, then for a finite collection of points $(x_i)_{i=1}^{I}$ of $X$ ($X$ may itself be uncountable) and corresponding open sets $(V_i)_{i=1}^{I}$ in $Y$ with $f(x_i) \in V_i$, let $B((x_i)_{i=1}^{I}; (V_i)_{i=1}^{I})$ be an open neighborhood of $f$. From the assumed pointwise convergence $f_\alpha(x_i) \to f(x_i)$ in $Y$ for $i = 1, 2, \ldots, I$, it follows that $(f_\alpha(x_i))$ is eventually in $V_i$ for every $(x_i)_{i=1}^{I}$. Because $D$ is a directed set, the existence of a residual applicable globally for all $i = 1, 2, \ldots, I$ is assured, leading to the conclusion that $f_\alpha(x_i) \in V_i$ eventually for every $i = 1, 2, \ldots, I$. Hence $f_\alpha \in B((x_i)_{i=1}^{I}; (V_i)_{i=1}^{I})$ eventually; this completes the demonstration that $f_\alpha \to f$ in (Map(X, Y), $\mathcal{T}$), and thus of the proof.

End Tutorial 6

3.2. The extension Multi$_|$(X, Y) of Map(X, Y)

In this section we show how the topological treatment of pointwise convergence of functions to functions given in Example A.1.1 of Appendix A.1 can be generalized to generate the boundary Multi$_|$(X, Y) between Map(X, Y) and Multi(X, Y); here $X$ and $Y$ are Hausdorff spaces and Map(X, Y) and Multi(X, Y) are respectively the sets of all functional and non-functional relations between $X$ and $Y$. The generalization we seek defines neighborhoods of $f \in$ Map(X, Y) to consist of those functional relations in Multi(X, Y) whose images at any point $x \in X$ not only lie arbitrarily close to $f(x)$ (this generates the usual topology of pointwise convergence $\mathcal{T}_Y$ of Example A.1.1) but whose inverse images at $y = f(x) \in Y$ contain points arbitrarily close to $x$. Thus the graph of $g$ must not only lie close enough to $f(x)$ at $x$ in $V$, but must additionally be such that $g^-(y)$ has at least one branch in $U$ about $x$; thus $g$ is constrained to cling to $f$ as the number of points on the graph of $f$ increases


[Figure 8: two panels, (a) and (b).]

Fig. 8. The power of graphical convergence, illustrated for Example 3.1 (1), shows a local neighborhood of the functions $x$ and $2x$ in (a) and (b) at the four points $(x_i)_{i=1}^{4}$ with corresponding neighborhoods $(U_i)_{i=1}^{4}$ and $(V_i)_{i=1}^{4}$ at $(x_i, f(x_i))$ in $\mathbb{R}$ in the $X$ and $Y$ directions respectively, see Eqs. (34) and (A.6) for the notations. (a) shows a function $g$ in a pointwise neighborhood of $f$ determined by the open sets $V_i$, while (b) shows $g$ in a graphical neighborhood of $f$ due to both $U_i$ and $V_i$. A comparison of these figures demonstrates how the graphical neighborhood forces functions close to $f$ to remain closer to it than if they were in its pointwise neighborhood. This property is clearly visible in (a) where $g$, if it were to be in a graphical neighborhood of $f$, would be more faithful to it by having to be also in $U_2$ and $U_4$. Thus in this case not only must the images $f(x_i^j) \to f(x_i)$ as $V_i$ decreases, but also the preimages $x_i^j \to x_i$ with shrinking $U_i$. It is this simultaneous convergence of both images and preimages at every $x$ that makes graphical convergence a natural candidate for multifunctional convergence of functions.
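The contrast drawn in Fig. 8 between a pointwise and a graphical neighborhood can be imitated numerically. The sketch below is only an illustration under stated assumptions: the neighborhoods are taken to be symmetric open intervals $V_i = (f(x_i) - \varepsilon, f(x_i) + \varepsilon)$ and $U_i = (x_i - \delta, x_i + \delta)$, the candidate function $g$ is assumed continuous, and the existence of a preimage of $y_i = f(x_i)$ in $U_i$ is detected by a sign change on a grid. All function names are hypothetical.

```python
def in_pointwise_nbhd(g, f, xs, eps):
    """g(x_i) lies in V_i = (f(x_i) - eps, f(x_i) + eps) for every x_i."""
    return all(abs(g(x) - f(x)) < eps for x in xs)

def has_preimage_in(g, y, lo, hi, samples=2001):
    """Heuristic test that g(x) = y has a solution in the open interval (lo, hi).

    Assumes g is continuous and looks for an exact hit or a sign change
    of g - y between consecutive interior grid points.
    """
    step = (hi - lo) / (samples + 1)
    prev = None
    for k in range(1, samples + 1):
        h = g(lo + k * step) - y
        if h == 0.0 or (prev is not None and prev * h < 0.0):
            return True
        prev = h
    return False

def in_graphical_nbhd(g, f, xs, eps, delta):
    """Pointwise condition plus: g^-(f(x_i)) meets U_i = (x_i - delta, x_i + delta)."""
    return in_pointwise_nbhd(g, f, xs, eps) and all(
        has_preimage_in(g, f(x), x - delta, x + delta) for x in xs
    )

if __name__ == "__main__":
    f = lambda x: x
    g_far = lambda x: x + 0.08    # within eps vertically, but preimages shifted by 0.08
    g_near = lambda x: x + 0.03   # preimages shifted by only 0.03
    xs, eps, delta = [0.2, 0.4, 0.6, 0.8], 0.1, 0.05
    for name, g in (("g_far", g_far), ("g_near", g_near)):
        print(name, in_pointwise_nbhd(g, f, xs, eps), in_graphical_nbhd(g, f, xs, eps, delta))
```

With these choices both shifted functions pass the pointwise test, but only the one whose preimages stay inside the intervals $U_i$ passes the graphical test, which is the extra constraint the caption describes.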

with convergence and, unlike in the situation of simple pointwise convergence, no gaps in the graph of the limit object is permitted not only, as in Example A.1.1, on the domain of $f$, but simultaneously on its range too. We call the resulting generated topology the topology of pointwise biconvergence on Map(X, Y), to be denoted by $\mathcal{T}$. Thus for any given integer $I \ge 1$, the generalization of Eq. (A.6) gives for $i = 1, 2, \ldots, I$, the open sets of (Map(X, Y), $\mathcal{T}$) to be

$$B((x_i), (V_i); (y_i), (U_i)) = \{ g \in \mathrm{Map}(X, Y) : (g(x_i) \in V_i) \wedge (g^-(y_i) \cap U_i \neq \emptyset),\ i = 1, 2, \ldots, I \}, \tag{34}$$

where $(x_i)_{i=1}^{I}$, $(V_i)_{i=1}^{I}$ are as in that example, $(y_i)_{i=1}^{I} \in Y$, and the corresponding open sets $(U_i)_{i=1}^{I}$ in $X$ are chosen arbitrarily.$^{20}$ A local base at $f$, for $(x_i, y_i) \in G_f$, is the set of functions of (34) with $y_i = f(x_i)$, and the collection of all local bases

$$B_\alpha = B((x_i)_{i=1}^{I_\alpha}, (V_i)_{i=1}^{I_\alpha}; (y_i)_{i=1}^{I_\alpha}, (U_i)_{i=1}^{I_\alpha}), \tag{35}$$

for every choice of $\alpha \in D$, is a base ${}_{T}\mathcal{B}$ of (Map(X, Y), $\mathcal{T}$). Here the directed set $D$ is used as an indexing tool because, as pointed out in Example A.1.1, the topology of pointwise convergence is not first countable.

In a manner similar to Eq. (34), the open sets of (Multi(X, Y), $\widehat{\mathcal{T}}$), where Multi(X, Y) are multifunctions with only countably many values in $Y$ for every point of $X$ (so that we exclude continuous regions from our discussion except for the "vertical lines" of Multi$_|$(X, Y)), can be defined as

$$\widehat{B}((x_i), (V_i); (y_i), (U_i)) = \{ G \in \mathrm{Multi}(X, Y) : (G(x_i) \cap V_i \neq \emptyset) \wedge (G^-(y_i) \cap U_i \neq \emptyset) \}, \tag{36}$$

where

$$G^-(y) = \{ x \in X : y \in G(x) \}$$

and $(x_i)_{i=1}^{I} \in \mathcal{D}(\mathcal{M})$, $(V_i)_{i=1}^{I}$; $(y_i)_{i=1}^{I} \in \mathcal{R}(\mathcal{M})$, $(U_i)_{i=1}^{I}$ are chosen as in the above. The topology $\widehat{\mathcal{T}}$ of Multi(X, Y) is generated by the collection of

20 Equation (34) is essentially the intersection of the pointwise topologies (A.6) due to $f$ and $f^-$.
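For finite sets the inverse $G^-(y) = \{x \in X : y \in G(x)\}$ and the membership condition of Eq. (36) can be written out directly. The following is a minimal sketch, assuming a multifunction stored as a dictionary from points to sets of values and neighborhoods given as finite sets; the representation and the names `inverse_image` and `in_basic_open_set` are illustrative assumptions, not the paper's construction.

```python
def inverse_image(G, y):
    """G^-(y) = {x : y in G(x)} for a multifunction stored as {x: set_of_values}."""
    return {x for x, values in G.items() if y in values}

def in_basic_open_set(G, xs, Vs, ys, Us):
    """Finite analogue of Eq. (36): G(x_i) meets V_i and G^-(y_i) meets U_i for all i."""
    images_ok = all(G.get(x, set()) & V for x, V in zip(xs, Vs))
    preimages_ok = all(inverse_image(G, y) & U for y, U in zip(ys, Us))
    return images_ok and preimages_ok

if __name__ == "__main__":
    square = {-1: {1}, 0: {0}, 1: {1}}   # the function x -> x^2 on {-1, 0, 1}
    root = {0: {0}, 1: {-1, 1}}          # its multifunctional inverse
    print(inverse_image(square, 1))      # both -1 and 1 map to 1
    print(inverse_image(root, -1))       # only 1 has -1 among its values
    print(in_basic_open_set(root, xs=[1], Vs=[{1, 2}], ys=[-1], Us=[{1}]))
```

A single-valued function is the special case in which every `G[x]` is a singleton, so the same test covers both Eq. (34) and Eq. (36) in this finite setting.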


all local bases $\widehat{B}_\alpha$ for every choice of $\alpha \in D$, and it is not difficult to see from Eqs. (34) and (36) that the restriction $\widehat{\mathcal{T}}|_{\mathrm{Map}(X, Y)}$ of $\widehat{\mathcal{T}}$ to Map(X, Y) is just $\mathcal{T}$.

Henceforth $\widehat{\mathcal{T}}$ and $\mathcal{T}$ will be denoted by the same symbol $\mathcal{T}$, and convergence in the topology of pointwise biconvergence in (Multi(X, Y), $\mathcal{T}$) will be denoted by $\rightrightarrows$, with the notation being derived from Theorem 3.1.

Definition 3.2 (Functionization of a multifunction). A net of functions $(f_\alpha)_{\alpha \in D}$ in Map(X, Y) converges in (Multi(X, Y), $\mathcal{T}$), $f_\alpha \rightrightarrows \mathcal{M}$, if it biconverges pointwise in (Map(X, Y), $\mathcal{T}^*$). Such a net of functions will be said to be a functionization of $\mathcal{M}$.

Theorem 3.2. Let $(f_\alpha)_{\alpha \in D}$ be a net of functions in Map(X, Y). Then

$$f_\alpha \overset{G}{\longrightarrow} \mathcal{M} \iff f_\alpha \rightrightarrows \mathcal{M}.$$

Proof. If $(f_\alpha)$ converges graphically to $\mathcal{M}$ then either $D_-$ or $R_-$ is non-empty; let us assume both of them to be so. Then the sequence of functions $(f_\alpha)$ converges pointwise to a function $F$ on $D_-$ and to functions $G$ on $R_-$, and the local basic neighborhoods of $F$ and $G$ generate the topology of pointwise biconvergence.

Conversely, for pointwise biconvergence on $X$ and $Y$, $R_-$ and $D_-$ must be non-empty.

Observe that the boundary of Map(X, Y) in the topology of pointwise biconvergence is a "line parallel to the $Y$-axis". We denote this closure of Map(X, Y) as

Definition 3.3. Multi$_|$((X, Y), $\mathcal{T}$) = Cl(Map((X, Y), $\mathcal{T}$)).

The sense in which Multi$_|$(X, Y) is the smallest closed topological extension of $M$ = Map(X, Y) is the following, refer to Theorem A.1.4 and its proof. Let $(M, \mathcal{T}_0)$ be a topological space and suppose that

$$\widehat{M} = M \cup \{\widehat{m}\}$$

is obtained by adjoining an extra point to $M$; here $M$ = Map(X, Y) and $\widehat{m} \in \mathrm{Cl}(M)$ is the multifunctional limit in $\widehat{M}$ = Multi$_|$(X, Y). Treat all open sets of $M$ generated by local bases of the type (35) with finite intersection property as a filter-base ${}_{F}\mathcal{B}$ on $X$ that induces a filter $\mathcal{F}$ on $M$ (by forming supersets of all elements of ${}_{F}\mathcal{B}$; see Appendix A.1) and thereby the filter-base

$${}_{F}\widehat{\mathcal{B}} = \{ \widehat{B} = B \cup \{\widehat{m}\} : B \in {}_{F}\mathcal{B} \}$$

on $\widehat{M}$; this filter-base at $\widehat{m}$ can also be obtained independently from Eq. (36). Obviously ${}_{F}\widehat{\mathcal{B}}$ is an extension of ${}_{F}\mathcal{B}$ on $\widehat{M}$ and ${}_{F}\mathcal{B}$ is the filter induced on $M$ by ${}_{F}\widehat{\mathcal{B}}$. We may also consider the filter-base to be a topological base on $M$ that defines a coarser topology $\mathcal{T}$ on $M$ (through all unions of members of ${}_{F}\mathcal{B}$) and hence the topology

$$\widehat{\mathcal{T}} = \{ \widehat{G} = G \cup \{\widehat{m}\} : G \in \mathcal{T} \}$$

on $\widehat{M}$ to be the topology associated with $\widehat{\mathcal{F}}$. A finer topology on $\widehat{M}$ may be obtained by adding to $\widehat{\mathcal{T}}$ all the discarded elements of $\mathcal{T}_0$ that do not satisfy FIP. It is clear that $\widehat{m}$ is on the boundary of $M$ because every neighborhood of $\widehat{m}$ intersects $M$ by construction; thus $(M, \mathcal{T})$ is dense in $(\widehat{M}, \widehat{\mathcal{T}})$, which is the required topological extension of $(M, \mathcal{T})$.

In the present case, a filter-base at $f \in$ Map(X, Y) is the neighborhood system ${}_{F}\mathcal{B}_f$ at $f$ given by decreasing sequences of neighborhoods $(V_k)$ and $(U_k)$ of $f(x)$ and $x$, respectively, and the filter $\widehat{\mathcal{F}}$ is the neighborhood filter $\mathcal{N}_f \cup G$ where $G \in$ Multi$_|$(X, Y). We shall present an alternate, and perhaps more intuitively appealing, description of graphical convergence based on the adherence set of a filter in Sec. 4.1.

As more serious examples of the graphical convergence of a net of functions to a multifunction than those considered above, Fig. 9 shows the first four iterates of the tent map

$$t(x) = \begin{cases} 2x, & 0 \le x < \tfrac{1}{2} \\ 2(1 - x), & \tfrac{1}{2} \le x \le 1 \end{cases} \qquad (t^1 = t)$$

defined on $[0, 1]$ and the sine map $f_n = |\sin(2^{n-1} \pi x)|$, $n = 1, \ldots, 4$, with domain $[0, 1]$.

These examples illustrate the important generalization that periodic points may be replaced by the more general equivalence classes where a sequence of functions converges graphically; this generalization based on the ill-posed interpretation of dynamical systems is significant for non-iterative systems as in the second example above. The equivalence classes of the tent map for its two fixed points 0 and 2/3 generated by the first four iterates are

$$[0]_4 = \left\{ 0, \tfrac{1}{8}, \tfrac{1}{4}, \tfrac{3}{8}, \tfrac{1}{2}, \tfrac{5}{8}, \tfrac{3}{4}, \tfrac{7}{8}, 1 \right\}$$


[Figure 9: two panels on [0, 1] × [0, 1]. (a) First 4 iterates of tent map. (b) Graph of first 4 |sine| maps.]

Fig. 9. The first four iterates of (a) tent and (b) $|\sin(2^{n-1} \pi x)|$ maps show the formal similarity of the dynamics of these functions. It should be noted, as shown in Fig. 7, that although $(\sin(n \pi x))_{n=1}^{\infty}$ fails to converge at any point other than 0 and 1, the subsequence $(\sin(2^{n-1} \pi x))_{n=1}^{\infty}$ does converge graphically on a set dense in $[0, 1]$.
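The counting statements about the tent map iterates of Fig. 9 can be verified with exact rational arithmetic. The sketch below is a hedged illustration using Python's `fractions` module; the helper names are hypothetical. It uses two elementary facts: on its $j$th injective branch $t^m$ equals $2^m x - (j - 1)$ for odd $j$ and $j - 2^m x$ for even $j$, and every $y \in [0, 1]$ has the two tent-map preimages $y/2$ and $1 - y/2$.

```python
from fractions import Fraction

def tent(x):
    """Tent map on [0, 1], evaluated exactly on Fractions."""
    return 2 * x if x < Fraction(1, 2) else 2 * (1 - x)

def tent_iter(x, m):
    """m-th iterate t^m(x)."""
    for _ in range(m):
        x = tent(x)
    return x

def fixed_points(m):
    """Fixed point of t^m on each of its 2^m injective branches j = 1..2^m."""
    pts = []
    for j in range(1, 2 ** m + 1):
        if j % 2 == 1:
            pts.append(Fraction(j - 1, 2 ** m - 1))  # increasing branch
        else:
            pts.append(Fraction(j, 2 ** m + 1))      # decreasing branch
    return pts

def preimages(y, m):
    """Sorted list of all x in [0, 1] with t^m(x) = y."""
    xs = {Fraction(y)}
    for _ in range(m):
        xs = {x / 2 for x in xs} | {1 - x / 2 for x in xs}
    return sorted(xs)

if __name__ == "__main__":
    print(fixed_points(2))                 # fixed points of the second iterate
    print(preimages(0, 4))                 # points sent to 0 by the fourth iterate
    print(preimages(Fraction(2, 3), 4))    # points sent to 2/3 by the fourth iterate
```

The points sent to 0 by the fourth iterate are the multiples of 1/8, and those sent to 2/3 are the multiples of 1/24 that are not multiples of 1/8, in agreement with the two classes quoted in the text.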

$$\left[\tfrac{2}{3}\right]_4 = \left\{ c,\ \tfrac{1}{8} \mp c,\ \tfrac{1}{4} \mp c,\ \tfrac{3}{8} \mp c,\ \tfrac{1}{2} \mp c,\ \tfrac{5}{8} \mp c,\ \tfrac{3}{4} \mp c,\ \tfrac{7}{8} \mp c,\ 1 - c \right\}$$

where $c = 1/24$. If the moduli of the slopes of the graphs passing through these equivalent fixed points are greater than 1 then the graphs converge to multifunctions, and when these slopes are less than 1 the corresponding graphs converge to constant functions. It is to be noted that the number of equivalent fixed points in a class increases with the number of iterations $k$ as $2^{k-1} + 1$; this increase in the degree of ill-posedness is typical of discrete chaotic systems and can be regarded as a paradigm of chaos generated by the convergence of a family of functions.

The $m$th iterate $t^m$ of the tent map has $2^m$ fixed points corresponding to the $2^m$ injective branches of $t^m$

$$x_{mj} = \begin{cases} \dfrac{j - 1}{2^m - 1}, & j = 1, 3, \ldots, (2^m - 1) \\[2ex] \dfrac{j}{2^m + 1}, & j = 2, 4, \ldots, 2^m \end{cases} \qquad t^m(x_{mj}) = x_{mj}, \quad j = 1, 2, \ldots, 2^m .$$

Let $X_m$ be the collection of these $2^m$ fixed points (thus $X_1 = \{0, 2/3\}$), and denote by $[X_m]$ the set of the equivalent points, one coming from each of the injective branches, for each of the fixed points: thus

$$D_- = [X_1] = \left\{ [0], \left[\tfrac{2}{3}\right] \right\}$$

$$[X_2] = \left\{ [0], \left[\tfrac{2}{5}\right], \left[\tfrac{2}{3}\right], \left[\tfrac{4}{5}\right] \right\}$$

and $D_+ = \bigcup_{m=1}^{\infty} [X_m]$ is a non-empty countable set dense in $X$ at each of which the graphs of the sequence $(t^m)$ converge to a multifunction. New sets $[X_n]$ will be formed by subsequences of the higher iterates $t^n$ for $m = in$ with $i = 1, 2, \ldots$ where these subsequences remain fixed. For example, the fixed points 2/5 and 4/5 produced respectively by the second and fourth injective branches of $t^2$ are also fixed for the seventh and thirteenth branches of $t^4$. For the shift map $2x \bmod 1$ on $[0, 1]$, $D_- = \{[0], [1]\}$ where $[0] = \bigcup_{m=1}^{\infty} \{(i - 1)/2^m : i = 1, 2, \ldots, 2^m\}$ and $[1] = \bigcup_{m=1}^{\infty} \{i/2^m : i = 1, 2, \ldots, 2^m\}$.

It is useful to compare the graphical convergence of $(\sin(\pi n x))_{n=1}^{\infty}$ to $[0, 1]$ at 0 and to 0 at 1 with the usual integral and test-functional convergences to 0; note that the point 1/2, for example, belongs to $D_+$ and not to $D_- = \{0, 1\}$ because it is frequented by even $n$ only. However for the subsequence $(f_{2^{m-1}})_{m \in \mathbb{Z}_+}$, 1/2 is in $D_-$ because if the graph of $f_{2^{m-1}}$ passes through $(1/2, 0)$ for some $m$, then so do the graphs for all higher values. Therefore $[0] = \bigcup_{m=1}^{\infty} \{ i/2^{m-1} : i = 0, 1, \ldots, 2^{m-1} \}$ is the equivalence class of $(f_{2^{m-1}})_{m=1}^{\infty}$ and this sequence converges to $[-1, 1]$ on this set. Thus our extension Multi(X) is distinct from the distributional extension of function spaces with respect to test functions, and is able to correctly generate the pathological behavior of the limits that are so crucially vital in producing chaos.

4. Discrete Chaotic Systems are Maximally Ill-posed

The above ideas apply to the development of a criterion for chaos in discrete dynamical systems that is based on the limiting behavior of the graphs of a sequence of functions $(f_n)$ on $X$, rather than on the values that the sequence generates as is customary. For the development of the maximality of ill-posedness criterion of chaos, we need to refresh ourselves with the following preliminaries.

Resume Tutorial 5: Axiom of Choice and Zorn's Lemma

Let us recall from the first part of this Tutorial that for nonempty subsets $(A_\alpha)_{\alpha \in D}$ of a nonempty set $X$, the Axiom of Choice ensures the existence of a set $A$ such that $A \cap A_\alpha$ consists of a single element for every $\alpha$. The choice axiom has far-reaching consequences and a few equivalent statements, one of which, Zorn's lemma, will be used immediately in the following and is the topic of this resumed Tutorial. The beauty of the Axiom, and of its equivalents, is that they assert the existence of mathematical objects that, in general, cannot be demonstrated, and it is often believed that Zorn's lemma is one of the most powerful tools that a mathematician has available to him that is "almost indispensable in many parts of modern pure mathematics" with significant applications in nearly all branches of contemporary mathematics. This "lemma" talks about maximal (as distinct from "maximum") elements of a partially ordered set, a set in which some notion of $x_1$ "preceding" $x_2$ for two elements of the set has been defined.

A relation $\preceq$ on a set $X$ is said to be a partial order (or simply an order) if it is (compare with the properties (ER1)–(ER3) of an equivalence relation, Tutorial 1)

(OR1) Reflexive, that is $(\forall x \in X)(x \preceq x)$.
(OR2) Antisymmetric: $(\forall x, y \in X)(x \preceq y \wedge y \preceq x \Rightarrow x = y)$.
(OR3) Transitive, that is $(\forall x, y, z \in X)(x \preceq y \wedge y \preceq z \Rightarrow x \preceq z)$. Any notion of order on a set $X$ in the sense of one element of $X$ preceding another should possess at least this property.

The relation is a preorder $\preceq$ if it is only reflexive and transitive, that is if only (OR1) and (OR3) are true. If the hypothesis of (OR2) is also satisfied by a preorder, then this $\preceq$ induces an equivalence relation $\sim$ on $X$ according to $(x \preceq y) \wedge (y \preceq x) \Leftrightarrow x \sim y$ that evidently is actually a partial order iff $x \sim y \Leftrightarrow x = y$. For any element $[x] \in X/\!\sim$ of the induced quotient space, let $\le$ denote the generated order in $X/\!\sim$ so that

$$x \preceq y \iff [x] \le [y] ;$$

then $\le$ is a partial order on $X/\!\sim$. If every two elements of $X$ are comparable, in the sense that either $x_1 \preceq x_2$ or $x_2 \preceq x_1$ for all $x_1, x_2 \in X$, then $X$ is said to be a totally ordered set or a chain. A totally ordered subset $(C, \preceq)$ of a partially ordered set $(X, \preceq)$ with the ordering induced from $X$ is known as a chain in $X$ if

$$C = \{ x \in X : (\forall c \in X)(c \preceq x \vee x \preceq c) \} . \tag{37}$$

The most important class of chains that we are concerned with in this work is that on the subsets $\mathcal{P}(X)$ of a set $(X, \subseteq)$ under the inclusion order; Eq. (37), as we shall see in what follows, defines a family of chains of nested subsets in $\mathcal{P}(X)$. Thus while the relation $\preceq$ in $\mathbb{Z}$ defined by $n_1 \preceq n_2 \Leftrightarrow |n_1| \le |n_2|$ with $n_1, n_2 \in \mathbb{Z}$ preorders $\mathbb{Z}$, it is not a partial order because although $-n \preceq n$ and $n \preceq -n$ for any $n \in \mathbb{Z}$, it does not follow that $-n = n$. A common example of partial order on a set of sets, for example on the power set $\mathcal{P}(X)$ of a set $X$ (see footnote 23), is the inclusion relation $\subseteq$: the ordered set $\mathcal{X} = (\mathcal{P}(\{x, y, z\}), \subseteq)$ is partially ordered but not totally ordered because, for example, $\{x, y\} \not\subseteq \{y, z\}$, or $\{x\}$ is not comparable to $\{y\}$ unless $x = y$; however $\mathcal{C} = \{\emptyset, \{x\}, \{x, y\}\}$ does represent one of the many possible chains of $\mathcal{X}$. Another useful example of partial order is the following: Let $X$ and $(Y, \le)$ be sets with $\le$ ordering $Y$, and consider $f, g \in$ Map(X, Y) with


$\mathcal{D}(f), \mathcal{D}(g) \subseteq X$. Then

$$\begin{aligned} (\mathcal{D}(f) \subseteq \mathcal{D}(g))\,(f = g|_{\mathcal{D}(f)}) &\iff f \preceq g \\ (\mathcal{D}(f) = \mathcal{D}(g))\,(\mathcal{R}(f) \subseteq \mathcal{R}(g)) &\iff f \preceq g \\ (\forall x \in \mathcal{D}(f) = \mathcal{D}(g))\,(f(x) \le g(x)) &\iff f \preceq g \end{aligned} \tag{38}$$

define partial orders on Map(X, Y). In the last case, the order is not total because any two functions whose graphs cross at some point in their common domain cannot be ordered by the given relation, while in the first any $f$ whose graph does not coincide with that of $g$ on the common domain is not comparable to it by this relation.

Let $(X, \preceq)$ be a partially ordered set and let $A$ be a subset of $X$. An element $a_+ \in (A, \preceq)$ is said to be a maximal element of $A$ with respect to $\preceq$ if

$$(\forall a \in (A, \preceq))(a_+ \preceq a) \Rightarrow a = a_+ , \tag{39}$$

that is, iff there is no $a \in A$ with $a \neq a_+$ and $a \succ a_+$.$^{21}$ Expressed otherwise, this implies that an element $a_+$ of a subset $A \subseteq (X, \preceq)$ is maximal in $(A, \preceq)$ iff it is true that

$$(a \preceq a_+ \in A)\ (\text{for every } a \in (A, \preceq) \text{ comparable to } a_+) ; \tag{40}$$

thus $a_+$ in $A$ is a maximal element of $A$ iff it is strictly greater than every other comparable element of $A$. This of course does not mean that each element $a$ of $A$ satisfies $a \preceq a_+$ because every pair of elements of a partially ordered set need not be comparable: in a totally ordered set there can be at most one maximal element. In comparison, an element $a_\infty$ of a subset $A \subseteq (X, \preceq)$ is the unique maximum (largest, greatest, last) element of $A$ iff

$$(a \preceq a_\infty \in A)\ (\text{for every } a \in (A, \preceq)) , \tag{41}$$

implying that $a_\infty$ is the element of $A$ that is strictly larger than every other element of $A$. As in the case of the maximal, although this also does not require all elements of $A$ to be comparable to each other, it does require $a_\infty$ to be larger than every element of $A$. The dual concepts of minimal and minimum can be similarly defined by essentially reversing the roles of $a$ and $b$ in relational expressions like $a \preceq b$.

The last concept needed to formalize Zorn's lemma is that of an upper bound: For a subset $(A, \preceq)$ of a partially ordered set $(X, \preceq)$, an element $u$ of $X$ is an upper bound of $A$ in $X$ iff

$$(a \preceq u \in (X, \preceq))\ (\text{for every } a \in (A, \preceq)) \tag{42}$$

which requires the upper bound $u$ to be larger than all members of $A$, with the corresponding lower bounds of $A$ being defined in a similar manner. Of course, it is again not necessary that the elements of $A$ be comparable to each other, and it should be clear from Eqs. (41) and (42) that when an upper bound of a set is in the set itself, then it is the maximum element of the set. If the upper (lower) bounds of a subset $(A, \preceq)$ of a set $(X, \preceq)$ have a least (greatest) element, then this smallest upper bound (largest lower bound) is called the least upper bound (greatest lower bound) or supremum (infimum) of $A$ in $X$. Combining Eqs. (41) and (42) then yields

$$\begin{aligned} \sup_X A &= \{ a_\leftarrow \in \Omega_A : a_\leftarrow \preceq u\ \ \forall u \in (\Omega_A, \preceq) \} \\ \inf_X A &= \{ {}_{\to}a \in \Lambda_A : l \preceq {}_{\to}a\ \ \forall l \in (\Lambda_A, \preceq) \} \end{aligned} \tag{43}$$

where $\Omega_A = \{ u \in X : (\forall a \in A)(a \preceq u) \}$ and $\Lambda_A = \{ l \in X : (\forall a \in A)(l \preceq a) \}$ are the sets of all upper and lower bounds of $A$ in $X$. Equation (43) may be expressed in the equivalent but more transparent form as

$$\begin{aligned} a_\leftarrow = \sup_X A &\iff (a \in A \Rightarrow a \preceq a_\leftarrow) \wedge (a_0 \prec a_\leftarrow \Rightarrow a_0 \prec b \preceq a_\leftarrow \text{ for some } b \in A) \\ {}_{\to}a = \inf_X A &\iff (a \in A \Rightarrow {}_{\to}a \preceq a) \wedge ({}_{\to}a \prec a_1 \Rightarrow {}_{\to}a \preceq b \prec a_1 \text{ for some } b \in A) \end{aligned} \tag{44}$$

to imply that $a_\leftarrow$ (${}_{\to}a$) is the upper (lower) bound of $A$ in $X$ which precedes (succeeds) every other upper (lower) bound of $A$ in $X$. Notice that uniqueness in the definitions above is a direct consequence of the uniqueness of greatest and least elements of a set. It must be noted that whereas maximal and maximum are properties of the particular subset and have nothing to do with anything outside it, upper and lower bounds of a set are defined only with respect to a superset that may contain it.

The following example, besides being useful in Zorn's lemma, is also of great significance in fixing some of the basic ideas needed in our future arguments involving classes of sets ordered by the inclusion relation.

Example 4.1. Let $\mathcal{X} = \mathcal{P}(\{a, b, c\})$ be ordered by the inclusion relation $\subseteq$. The subset $\mathcal{A} = \mathcal{P}(\{a, b, c\}) - \{a, b, c\}$ has three maximals $\{a, b\}$, $\{b, c\}$ and $\{c, a\}$ but no maximum as there is no

21 If $\preceq$ is an order relation in $X$ then the strict relation $\prec$ in $X$ corresponding to $\preceq$, given by $x \prec y \Leftrightarrow (x \preceq y) \wedge (x \neq y)$, is not an order relation because unlike $\preceq$, $\prec$ is not reflexive even though it is both transitive and asymmetric.


$A_\infty \in \mathcal{A}$ satisfying $A \subseteq A_\infty$ for every $A \in \mathcal{A}$, while $\mathcal{P}(\{a, b, c\}) - \emptyset$ has the three minimals $\{a\}$, $\{b\}$ and $\{c\}$ but no minimum. This shows that a subset of a partially ordered set may have many maximals (minimals) without possessing a maximum (minimum), but a subset has a maximum (minimum) iff this is its unique maximal (minimal). If $\mathcal{A} = \{\{a, b\}, \{a, c\}\}$, then every subset of the intersection of the elements of $\mathcal{A}$, namely $\{a\}$ and $\emptyset$, are lower bounds of $\mathcal{A}$, and all supersets in $\mathcal{X}$ of the union of its elements (which in this case is just $\{a, b, c\}$) are its upper bounds. Notice that while the maximal (minimal) and maximum (minimum) are elements of $\mathcal{A}$, upper and lower bounds need not be contained in their sets. In this class $(\mathcal{X}, \subseteq)$ of subsets of a set $X$, $X_+$ is a maximal element of $\mathcal{X}$ iff $X_+$ is not contained in any other subset of $X$, while $X_\infty$ is a maximum of $\mathcal{X}$ iff $X_\infty$ contains every other subset of $X$.

Let $\mathcal{A} := \{A_\alpha \in \mathcal{X}\}_{\alpha \in D}$ be a non-empty subclass of $(\mathcal{X}, \subseteq)$, and suppose that both $\bigcup A_\alpha$ and $\bigcap A_\alpha$ are elements of $\mathcal{X}$. Since each $A_\alpha$ is $\subseteq$-less than $\bigcup A_\alpha$, it follows that $\bigcup A_\alpha$ is an upper bound of $\mathcal{A}$; this is also the smallest of all such bounds because if $U$ is any other upper bound then every $A_\alpha$ must precede $U$ by Eq. (42) and therefore so must $\bigcup A_\alpha$ (because the union of a class of subsets of a set is the smallest that contains each member of the class: $A_\alpha \subseteq U \Rightarrow \bigcup A_\alpha \subseteq U$ for subsets $(A_\alpha)$ and $U$ of $X$). Analogously, since $\bigcap A_\alpha$ is $\subseteq$-less than each $A_\alpha$ it is a lower bound of $\mathcal{A}$; that it is the greatest of all the lower bounds $L$ in $\mathcal{X}$ follows because the intersection of a class of subsets is the largest that is contained in each of the subsets: $L \subseteq A_\alpha \Rightarrow L \subseteq \bigcap A_\alpha$ for subsets $L$ and $(A_\alpha)$ of $X$. Hence the supremum and infimum of $\mathcal{A}$ in $(\mathcal{X}, \subseteq)$ given by

$$A_\leftarrow = \sup_{(\mathcal{X}, \subseteq)} \mathcal{A} = \bigcup_{A \in \mathcal{A}} A \qquad \text{and} \qquad {}_{\to}A = \inf_{(\mathcal{X}, \subseteq)} \mathcal{A} = \bigcap_{A \in \mathcal{A}} A \tag{45}$$

are both elements of $(\mathcal{X}, \subseteq)$. Intuitively, an upper (respectively, lower) bound of $\mathcal{A}$ in $\mathcal{X}$ is any subset of $X$ that contains (respectively, is contained in) every member of $\mathcal{A}$.

The statement of Zorn's lemma and its proof can now be completed in three stages as follows. For Theorem 4.1 below that constitutes the most significant technical first stage, let $g$ be a function on $(X, \preceq)$ that assigns to every $x \in X$ an immediate successor $y \in X$ such that

$$M(x) = \{ y \succ x : \nexists\, x_* \in X \text{ satisfying } x \prec x_* \prec y \}$$

are all the successors of $x$ in $X$ with no element of $X$ lying strictly between $x$ and $y$. Select a representative of $M(x)$ by a choice function $f_C$ such that

$$g(x) = f_C(M(x)) \in M(x)$$

is an immediate successor of $x$ chosen from the many possible in the set $M(x)$. The basic idea in the proof of the first of the three parts is to express the existence of a maximal element of a partially ordered set $X$ in terms of the existence of a fixed point in the set, which follows as a contradiction of the assumed hypothesis that every point in $X$ has an immediate successor. Our basic application of immediate successors in the following will be to classes $\mathcal{X} \subseteq (\mathcal{P}(X), \subseteq)$ of subsets of a set $X$ ordered by inclusion. In this case for any $A \in \mathcal{X}$, the function $g$ can be taken to be the superset

$$g(A) = A \cup f_C(G(A) - A), \qquad \text{where } G(A) = \{ x \in X - A : A \cup \{x\} \in \mathcal{X} \} \tag{46}$$

of $A$. Repeated application of $g$ to $A$ then generates a principal filter, and hence an associated sequence, based at $A$.

Theorem 4.1. Let $(X, \preceq)$ be a partially ordered set that satisfies

(ST1) There is a smallest element $x_0$ of $X$ which has no immediate predecessor in $X$.
(ST2) If $C \subseteq X$ is a totally ordered subset in $X$, then $c_* = \sup_X C$ is in $X$.

Then there exists a maximal element $x_+$ of $X$ which has no immediate successor in $X$.

Proof. Let $T \subseteq (X, \preceq)$ be a subset of $X$. If the conclusion of the theorem is false then the alternative

(ST3) Every element $x \in T$ has an immediate successor $g(x)$ in $T$ $^{22}$

22 This makes $T$, and hence $X$, inductively defined infinite sets. It should be realized that (ST3) does not mean that every member of $T$ is obtained from $g$, but only ensures that the immediate successor of any element of $T$ is also in $T$. The infimum ${}_{\to}T$ of these towers satisfies the additional property of being totally ordered (and is therefore essentially a sequence or net) in $(X, \preceq)$ to which (ST2) can be applied.
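The distinctions made in Example 4.1 between maximal and maximum elements, and between bounds and extrema, can be checked by brute force on the power set of {a, b, c}. The following is a small sketch, assuming subsets are represented as `frozenset` objects ordered by inclusion; the function names are illustrative only.

```python
from itertools import combinations

def power_set(s):
    """All subsets of s as frozensets."""
    items = list(s)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def maximal(A):
    """Members of A not strictly contained in any other member of A."""
    return [a for a in A if not any(a < b for b in A)]

def minimal(A):
    """Members of A that do not strictly contain any other member of A."""
    return [a for a in A if not any(b < a for b in A)]

def maximum(A):
    """The member of A containing every member of A, or None if there is none."""
    for a in A:
        if all(b <= a for b in A):
            return a
    return None

def minimum(A):
    """The member of A contained in every member of A, or None if there is none."""
    for a in A:
        if all(a <= b for b in A):
            return a
    return None

def upper_bounds(A, X):
    """Elements of the ambient class X that contain every member of A."""
    return [u for u in X if all(a <= u for a in A)]

def lower_bounds(A, X):
    """Elements of the ambient class X contained in every member of A."""
    return [l for l in X if all(l <= a for a in A)]

if __name__ == "__main__":
    X = power_set("abc")
    A = [s for s in X if s != frozenset("abc")]
    print(sorted(map(sorted, maximal(A))), maximum(A))
    pair = [frozenset("ab"), frozenset("ac")]
    print(sorted(map(sorted, lower_bounds(pair, X))), sorted(map(sorted, upper_bounds(pair, X))))
```

Note that `maximal` and `maximum` look only at the subclass itself, whereas the bound functions need the ambient class as a second argument, mirroring the remark that bounds are defined only relative to a superset.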


leads, as shown below, to a contradiction that can be resolved only by the conclusion of the theorem. A subset $T$ of $(X, \preceq)$ satisfying conditions (ST1)–(ST3) is sometimes known as a $g$-tower or a $g$-sequence: an obvious example of a tower is $(X, \preceq)$ itself. If

$${}_{\to}T = \bigcap \{ T \in \mathcal{T} : T \text{ is an } x_0\text{-tower} \}$$

is the $(\mathcal{P}(X), \subseteq)$-infimum of the class $\mathcal{T}$ of all sequential towers of $(X, \preceq)$, we show that this smallest sequential tower is in fact a sequential totally ordered chain in $(X, \preceq)$ built from $x_0$ by the $g$-function. Let the subset

$$C_T = \{ c \in X : (\forall t \in {}_{\to}T)(t \preceq c \vee c \preceq t) \} \subseteq X \tag{47}$$

of $X$ be a $g$-chain in ${}_{\to}T$ in the sense that [cf. Eq. (37)] it is that subset of $X$ each of whose elements is comparable with some element of ${}_{\to}T$. The conditions (ST1)–(ST3) for $C_T$ can be verified as follows to demonstrate that $C_T$ is a $g$-tower.

(1) $x_0 \in C_T$, because it is less than each $x \in {}_{\to}T$.
(2) Let $c_\leftarrow = \sup_X C_T$ be the supremum of the chain $C_T$ in $X$ so that by (ST2), $c_\leftarrow \in X$. Let $t \in {}_{\to}T$. If there is some $c \in C_T$ such that $t \preceq c$, then surely $t \preceq c_\leftarrow$. Else, $c \preceq t$ for every $c \in C_T$ shows that $c_\leftarrow \preceq t$ because $c_\leftarrow$ is the smallest of all the upper bounds $t$ of $C_T$. Therefore $c_\leftarrow \in C_T$.
(3) In order to show that $g(c) \in C_T$ whenever $c \in C_T$ it needs to be verified that for all $t \in {}_{\to}T$, either $t \preceq c \Rightarrow t \preceq g(c)$ or $c \preceq t \Rightarrow g(c) \preceq t$. As the former is clearly obvious, we investigate the latter as follows; note that $g(t) \in {}_{\to}T$ by (ST3). The first step is to show that the subset

$$C_g = \{ t \in {}_{\to}T : (\forall c \in C_T)(t \preceq c \vee g(c) \preceq t) \} \tag{48}$$

of ${}_{\to}T$, which is a chain in $X$ (observe the inverse roles of $t$ and $c$ here as compared to that in Eq. (47)), is a tower: Let $t_\leftarrow$ be the supremum of $C_g$ and take $c \in C_T$. If there is some $t \in C_g$ for which $g(c) \preceq t$, then clearly $g(c) \preceq t_\leftarrow$. Else, $t \preceq c$ for each $t \in C_g$ shows that $t_\leftarrow \preceq c$ because $t_\leftarrow$ is the smallest of all the upper bounds $c$ of $C_g$. Hence $t_\leftarrow \in C_g$.

Property (ST3) for $C_g$ follows from a small yet significant modification of the above arguments in which the immediate successor $g(t)$ of $t \in C_g$ formally replaces the supremum $t_\leftarrow$ of $C_g$. Thus given a $c \in C_T$, if there is some $t \in C_g$ for which $g(c) \preceq t$ then $g(c) \prec g(t)$; this combined with $(c = t) \Rightarrow (g(c) = g(t))$ yields $g(c) \preceq g(t)$. On the other hand, $t \prec c$ for every $t \in C_g$ requires $g(t) \preceq c$ as otherwise $(t \prec c) \Rightarrow (c \prec g(t))$ would, from the resulting consequence $t \prec c \prec g(t)$, contradict the assumed hypothesis that $g(t)$ is the immediate successor of $t$. Hence, $C_g$ is a $g$-tower in $X$.

To complete the proof that $g(c) \in C_T$, and thereby the argument that $C_T$ is a tower, we first note that as ${}_{\to}T$ is the smallest tower and $C_g$ is built from it, $C_g \subseteq {}_{\to}T$ must in fact be ${}_{\to}T$ itself. From Eq. (48) therefore, for every $t \in {}_{\to}T$ either $t \preceq g(c)$ or $g(c) \preceq t$, so that $g(c) \in C_T$ whenever $c \in C_T$. This concludes the proof that $C_T$ is actually the tower ${}_{\to}T$ in $X$.

From (ST2), the implication of the chain $C_T$

$$C_T = {}_{\to}T = C_g \tag{49}$$

being the minimal tower ${}_{\to}T$ is that the supremum $t_\leftarrow$ of the totally ordered ${}_{\to}T$ in its own tower (as distinct from in the tower $X$: recall that ${}_{\to}T$ is a subset of $X$) must be contained in itself, that is

$$\sup_{C_T}(C_T) = t_\leftarrow \in {}_{\to}T \subseteq X . \tag{50}$$

This however leads to the contradiction from (ST3) that $g(t_\leftarrow)$ be an element of ${}_{\to}T$, unless of course

$$g(t_\leftarrow) = t_\leftarrow , \tag{51}$$

which because of (49) may also be expressed equivalently as $g(c_\leftarrow) = c_\leftarrow \in C_T$. As the sequential totally ordered set ${}_{\to}T$ is a subset of $X$, Eq. (48) implies that $t_\leftarrow$ is a maximal element of $X$, which allows (ST3) to be replaced by the remarkable inverse criterion that

(ST3′) If $x \in X$ and $w$ precedes $x$, $w \prec x$, then $w \in X$,

that is obviously false for a general tower $T$. In fact, it follows directly from Eq. (39) that under (ST3′) any $x_+ \in X$ is a maximal element of $X$ iff it is a fixed point of $g$ as given by Eq. (51). This proves the theorem and also demonstrates how, starting from a minimum element of a partially ordered set $X$, (ST3) can be used to generate inductively a totally ordered sequential subset of $X$ leading to a maximal $x_+ = c_\leftarrow \in (X, \preceq)$ that is a fixed point of the generating function $g$ whenever the supremum $t_\leftarrow$ of the chain ${}_{\to}T$ is in $X$.

Remark. The proof of this theorem, despite its apparent length and technically involved character,


carries the highly significant underlying message that

Any inductive sequential $g$-construction of an infinite chained tower $C_T$ starting with a smallest element $x_0 \in (X, \preceq)$ such that a supremum $c_\leftarrow$ of the $g$-generated sequential chain $C_T$ in its own tower is contained in itself, must necessarily terminate with a fixed point relation of the type (51) with respect to the supremum. Note from Eqs. (50) and (51) that the role of (ST2) applied to a fully ordered tower is the identification of the maximal of the tower (which depends only on the tower and has nothing to do with anything outside it) with its supremum that depends both on the tower and its complement.

Thus although purely set-theoretic in nature, the filter-base associated with a sequentially totally ordered set may be interpreted to lead to the usual notions of adherence and convergence of filters and thereby of a generated topology for $(X, \preceq)$, see Appendix A.1 and Example A.1.3. This very significant apparent inter-relation between topologies, filters and orderings will form the basis of our approach to the condition of maximal ill-posedness for chaos.

In the second stage of the three-stage programme leading to Zorn's lemma, the tower Theorem 4.1 and the comments of the preceding paragraph are applied at a higher level to a very special class of the power set of a set, the class of all the chains of a partially ordered set, to directly lead to the physically significant

Theorem 4.2 (Hausdorff Maximal Principle). Every partially ordered set $(X, \preceq)$ has a maximal totally ordered subset.²³

Proof. Starting from the base level $(X, \preceq)$, let

$$\mathcal{X} = \{C \in \mathcal{P}(X) : C \text{ is a chain in } (X, \preceq)\} \subseteq \mathcal{P}(X) \tag{52}$$

be the set of all the totally ordered subsets of $(X, \preceq)$. Since $\mathcal{X}$ is a collection of (sub)sets of $X$, we order it by the inclusion relation on $\mathcal{X}$ and use the tower Theorem to demonstrate that $(\mathcal{X}, \subseteq)$ has a maximal element $C_\leftarrow$, which by the definition of $\mathcal{X}$ is the required maximal chain in $(X, \preceq)$.

Let $\mathcal{C}$ be a chain in $\mathcal{X}$ of the chains in $(X, \preceq)$. In order to apply the tower Theorem to $(\mathcal{X}, \subseteq)$ we need to verify hypothesis (ST2) that the smallest

$$C_* = \sup_{\mathcal{X}} \mathcal{C} = \bigcup_{C \in \mathcal{C}} C \tag{53}$$

of the possible upper bounds of $\mathcal{C}$ [see Eq. (45)] is a chain of $(X, \preceq)$. Indeed, if $x_1, x_2 \in X$ are two points of $C_*$ with $x_1 \in C_1$ and $x_2 \in C_2$, then from the $\subseteq$-comparability of $C_1$ and $C_2$ we may choose $x_1, x_2 \in C_1 \supseteq C_2$, say. Thus $x_1$ and $x_2$ are $\preceq$-comparable as $C_1$ is a chain in $(X, \preceq)$; $C_* \in \mathcal{X}$ is therefore a chain in $(X, \preceq)$, which establishes that the supremum of a chain of $(\mathcal{X}, \subseteq)$ is a chain in $(X, \preceq)$.

The tower Theorem 4.1 can now be applied to $(\mathcal{X}, \subseteq)$ with $C_0$ as its smallest element to construct a $g$-sequentially towered fully ordered subset of $\mathcal{X}$ consisting of chains in $X$

$$\mathcal{C}_T = \{C_i \in \mathcal{P}(X) : C_i \subseteq C_j \text{ for } i \le j \in \mathbb{N}\} = {}_{\rightarrow}T \subseteq \mathcal{P}(X)$$

of $(\mathcal{X}, \subseteq)$, consisting of the common elements of all $g$-sequential towers $T \in \mathfrak{T}$ of $(\mathcal{X}, \subseteq)$, that in fact is a principal filter base of chained subsets of $(X, \preceq)$ at $C_0$. The supremum (chain in $X$) $C_\leftarrow$ of $\mathcal{C}_T$ in $\mathcal{C}_T$ must now satisfy, by Theorem 4.1, the fixed point $g$-chain of $X$

$$\sup_{\mathcal{C}_T}(\mathcal{C}_T) = C_\leftarrow = g(C_\leftarrow) \in \mathcal{C}_T \subseteq \mathcal{P}(X),$$

where the chain $g(C) = C \cup f_C(G(C) - C)$ with $G(C) = \{x \in X - C : C \cup \{x\} \in \mathcal{X}\}$, is an immediate successor of $C$ obtained by choosing one point $x = f_C(G(C) - C)$ from the many possible in $G(C) - C$ such that the resulting $g(C) = C \cup \{x\}$ is a strict successor of the chain $C$ with no others lying between it and $C$. Note that $C_\leftarrow \in (\mathcal{X}, \subseteq)$ is

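23 Recall that this means that if there is a totally ordered chain $C$ in $(X, \preceq)$ that succeeds $C_+$, then $C$ must be $C_+$, so that no chain in $X$ can be strictly larger than $C_+$. The notation adopted here and below is the following: If $X = \{x, y\}$ is a non-empty set, then $\mathcal{X} := \mathcal{P}(X) = \{A : A \subseteq X\} = \{\emptyset, \{x\}, \{y\}, \{x, y\}\}$ is the set of subsets of $X$, and $\mathfrak{X} := \mathcal{P}^2(X) = \{\mathcal{A} : \mathcal{A} \subseteq \mathcal{X}\}$, the set of all subsets of $\mathcal{X}$, consists of the 16 elements $\emptyset$, $\{\emptyset\}$, $\{\{x\}\}$, $\{\{y\}\}$, $\{\{x, y\}\}$, $\{\emptyset, \{x\}\}$, $\{\emptyset, \{y\}\}$, $\{\emptyset, \{x, y\}\}$, $\{\{x\}, \{y\}\}$, $\{\{x\}, \{x, y\}\}$, $\{\{y\}, \{x, y\}\}$, $\{\emptyset, \{x\}, \{y\}\}$, $\{\emptyset, \{x\}, \{x, y\}\}$, $\{\emptyset, \{y\}, \{x, y\}\}$, $\{\{x\}, \{y\}, \{x, y\}\}$, and $\mathcal{X}$: an element of $\mathcal{P}^2(X)$ is a subset of $\mathcal{P}(X)$, any element of which is a subset of $X$. Thus if $C = \{0, 1, 2\}$ is a chain in $(X = \{0, 1, 2\}, \le)$, then $\mathcal{C} = \{\{0\}, \{0, 1\}, \{0, 1, 2\}\} \subseteq \mathcal{P}(X)$ and $\mathfrak{C} = \{\{\{0\}\}, \{\{0\}, \{0, 1\}\}, \{\{0\}, \{0, 1\}, \{0, 1, 2\}\}\} \subseteq \mathcal{P}^2(X)$ represent chains in $(\mathcal{P}(X), \subseteq)$ and $(\mathcal{P}^2(X), \subseteq)$, respectively.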


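[Figure 10: flow diagram leading from the base level $(X, \preceq)$ to the Hausdorff level $\mathcal{X} = \{C \subseteq X : C \text{ is a chain in } (X, \preceq)\}$, then through the Tower Theorem 4.1 and the fixed point $C_\leftarrow = g(C_\leftarrow)$ of the Hausdorff Maximal Chain Theorem, and back to the base level by Zorn's Lemma.]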

Fig. 10. Application of Zorn's Lemma to $(X, \preceq)$. Starting with a partially ordered set $(X, \preceq)$, construct:
(a) The one-level higher subset $\mathcal{X} = \{C \in \mathcal{P}(X) : C \text{ is a chain in } (X, \preceq)\}$ of $\mathcal{P}(X)$ consisting of all the totally ordered subsets of $(X, \preceq)$.
(b) The smallest common $g$-sequential totally ordered towered chain $\mathcal{C}_T = \{C_i \in \mathcal{P}(X) : C_i \subseteq C_j \text{ for } i \le j\} \subseteq \mathcal{P}(X)$ of all sequential $g$-towers of $\mathcal{X}$ by Theorem 4.1, which in fact is a principal filter base of totally ordered subsets of $(X, \preceq)$ at the smallest element $C_0$.
(c) Apply Hausdorff Maximal Principle to $(\mathcal{X}, \subseteq)$ to get the subset $\sup_{\mathcal{C}_T}(\mathcal{C}_T) = C_\leftarrow = g(C_\leftarrow) \in \mathcal{C}_T \subseteq \mathcal{P}(X)$ of $(X, \preceq)$ as the supremum of $(\mathcal{X}, \subseteq)$ in $\mathcal{C}_T$. The identification of this supremum as a maximal element of $(\mathcal{X}, \subseteq)$ is a consequence of (ST2) and Eqs. (50), (51) that actually puts the supremum into $\mathcal{X}$ itself.
By returning to the original level $(X, \preceq)$:
(d) Zorn's Lemma finally yields the required maximal element $u \in X$ as an upper bound of the maximal totally ordered subset $(C_\leftarrow, \preceq)$ of $(X, \preceq)$.
The dashed segment denotes the higher Hausdorff $(\mathcal{X}, \subseteq)$ level leading to the base $(X, \preceq)$ Zorn level.
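The sequence of steps summarized in Fig. 10 can be tried out on a small finite poset. The following Python sketch is only an illustration under stated assumptions, not the construction used in the proofs: the divisibility order on {1, ..., 12}, the helper names `comparable`, `extend_chain` and `is_maximal`, and the choice function that picks the smallest admissible element are all hypothetical choices made for this example. It repeatedly applies C → C ∪ {x}, with x drawn from G(C), until the fixed point g(C) = C is reached, and then checks that the top of the resulting maximal chain is a maximal element of the poset.

```python
def comparable(a, b):
    """Divisibility order on positive integers: a and b are comparable
    when one of them divides the other."""
    return a % b == 0 or b % a == 0


def extend_chain(ground, chain):
    """Apply C -> C ∪ {x} until no admissible x remains, i.e. until g(C) = C."""
    chain = set(chain)
    while True:
        # G(C): elements outside C that keep C ∪ {x} totally ordered.
        admissible = [x for x in ground
                      if x not in chain and all(comparable(x, c) for c in chain)]
        if not admissible:
            # For divisibility chains, numeric order agrees with the poset order.
            return sorted(chain)
        chain.add(min(admissible))  # hypothetical choice function f_C


def is_maximal(ground, u):
    """u is maximal if no other element of the ground set is a proper multiple of u."""
    return not any(v != u and v % u == 0 for v in ground)


X = range(1, 13)
C_max = extend_chain(X, {1})   # a maximal chain containing the smallest element 1
top = C_max[-1]                # its upper bound inside the chain
print(C_max, top, is_maximal(X, top))
```

In a finite poset the loop must stop, so the sketch says nothing about the infinite case for which Zorn's Lemma is actually needed; it only mirrors the bookkeeping of the Hausdorff and Zorn levels.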

only one of the many maximal fully ordered subsets possible in $(X, \preceq)$.

With the assurance of the existence of a maximal chain $C_\leftarrow$ among all fully ordered subsets of a partially ordered set $(X, \preceq)$, the arguments are completed by returning to the basic level of $X$.

Theorem 4.3 (Zorn's Lemma). Let $(X, \preceq)$ be a partially ordered set such that every totally ordered subset of $X$ has an upper bound in $X$. Then $X$ has at least one maximal element with respect to its order.

Proof. The proof of this final part is a mere application of the Hausdorff Maximal Principle on the existence of a maximal chain $C_\leftarrow$ in $X$ to the hypothesis of this theorem that $C_\leftarrow$ has an upper bound $u$ in $X$ that quickly leads to the identification of this bound as a maximal element $x_+$ of $X$. Indeed, if there is an element $v \in X$ that is comparable to $u$ and $v \succ u$, then $v$ cannot be in $C_\leftarrow$ as it is necessary for every $x \in C_\leftarrow$ to satisfy $x \preceq u$. Clearly then $C_\leftarrow \cup \{v\}$ is a chain in $(X, \preceq)$ bigger than $C_\leftarrow$, which contradicts the assumed maximality of $C_\leftarrow$ among the chains of $X$.

The sequence of steps leading to Zorn's Lemma, and thence to the maximal of a partially ordered set, is summarized in Fig. 10.

The three examples below of the application of Zorn's Lemma clearly reflect the increasing complexity of the problem considered, with the maximals a point, a subset, and a set of subsets of $X$, so that these are elements of $X$, $\mathcal{P}(X)$ and $\mathcal{P}^2(X)$, respectively.

Example 4.2

(1) Let $X = (\{a, b, c\}, \preceq)$ be a three-point base-


[Figure 11: panels (a) and (b).]

Fig. 11. Tree diagrams of two partially ordered sets where two points are connected by a line iff they are comparable to
each other, with the solid lines linking immediate neighbors and the dashed, dotted and dashed–dotted lines denoting second,
third and fourth generation orderings according to the principle of transitivity of the order relation. There are 8 × 2 chains of
(a) and 7 chains of (b) starting from respective smallest elements with the immediate successor chains shown in solid lines.
The 17 point set X = {0, 1, 2, . . . , 15, 16} in (a) has two maximals but no maximum, while in (b) there is a single maximum
of P({a, b, c}), and three maximals without any maximum for P({a, b, c}) − {a, b, c}. In (a), let A = {1, 3, 4, 7, 9, 10, 15},
B = {1, 3, 4, 6, 7, 13, 15}, C = {1, 3, 4, 10, 11, 16} and D = {1, 3, 4}. The upper bounds of D in A are 7, 10 and 15 without
any supremum (as there is no smallest element of {7, 10, 15}), and the upper bounds of D in B are 7 and 15 with sup B (D) = 7,
while supC (D) = 10. Finally the maximal, maximum and the supremum in A of {1, 3, 4, 7} are all the same illustrating how
the supremum of a set can belong to itself. Observe how the supremum and upper bound of a set are with reference to its
complement in contrast with the maximum and maximal that have nothing to do with anything outside the set.
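The statements about panel (b) can be checked by brute force. The Python sketch below is an illustration only; the function names `power_set`, `maximal_elements`, `maximum`, `immediate_successors` and `maximal_chains` are hypothetical helpers introduced here. It orders P({a, b, c}) by inclusion, lists the chains that climb from ∅ through immediate successors, and compares the maximal elements and the maximum of P({a, b, c}) with those of the same collection after the top element {a, b, c} is removed.

```python
from itertools import combinations


def power_set(ground):
    """All subsets of the ground set, as frozensets."""
    items = sorted(ground)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def maximal_elements(poset):
    """Members with nothing strictly above them (strict inclusion)."""
    return [a for a in poset if not any(a < b for b in poset)]


def maximum(poset):
    """The member that contains every other member, or None if there is none."""
    tops = [a for a in poset if all(b <= a for b in poset)]
    return tops[0] if tops else None


def immediate_successors(a, poset):
    """Members strictly above a with no other member in between."""
    above = [b for b in poset if a < b]
    return [b for b in above if not any(c < b for c in above)]


def maximal_chains(poset, start):
    """Chains built from start by repeatedly taking an immediate successor."""
    succ = immediate_successors(start, poset)
    if not succ:
        return [[start]]
    return [[start] + rest for s in succ for rest in maximal_chains(poset, s)]


P = power_set("abc")
chains = maximal_chains(P, frozenset())
Q = [s for s in P if s != frozenset("abc")]   # remove the top element

print(len(chains))                                   # ways of climbing from ∅
print(maximum(P), maximal_elements(Q), maximum(Q))
```

The six chains correspond to the six orderings in which a, b and c can be adjoined to ∅, and removing the top element leaves three maximal two-point subsets with no maximum among them.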

level ground set ordered lexicographically, that is of X , with
a b c. A chain C of the partially ordered
sup(CT ) = C← = {a, b, c} = g(C← ) ∈ CT ⊆ P(X)
Hausdorff-level set X consisting of subsets of X CT
given by Eq. (52) is, for example, {{a}, {a, b}} and
the six g-sequential chained towers the only maximal element of P(X). Zorn’s Lemma
now assures the existence of a maximal element of
C1 = {∅, {a}, {a, b}, {a, b, c}} , c ∈ X. Observe how the maximal element of (X, )
C2 = {∅, {a}, {a, c}, {a, b, c}} is obtained by going one level higher to X at the
Hausdorff stage and returning to the base level X
C3 = {∅, {b}, {a, b}, {a, b, c}} , at Zorn, see Fig. 10 for a schematic summary of this
C4 = {∅, {b}, {b, c}, {a, b, c}} sequence of steps.
C5 = {∅, {c}, {a, c}, {a, b, c}} , (2) Basis of a vector space. A linearly independent
C6 = {∅, {c}, {b, c}, {a, b, c}} set of vectors in a vector space X that spans the
space is known as the Hamel basis of X. To prove
built from the smallest element ∅ corresponding to the existence of a Hamel basis in a vector space,
the six distinct ways of reaching {a, b, c} from ∅ Zorn’s lemma is invoked as follows.
along the sides of the cube marked on the figure The ground base level of the linearly indepen-
with solid lines, all belong to X ; see Fig. 11(b). dent subsets of X
An example of a tower in (X , ⊆) which is not
a chain is T = {∅, {a}, {b}, {c}, {a, b}, {a, c}, X = {{xij }J ∈ P(X) : Span({xij }J )
j=1 j=1
{b, c}, {a, b, c}}. Hence the common infimum tow- = 0 ⇒ (αj )J = 0 ∀ J ≥ 1} ⊆ P(X) ,
j=1
ered chained subset
with Span({xij }J ) := J αj xij , is such that no
j=1 j=1
CT = {∅, {a, b, c}} =→ T ⊆ P(X) x ∈ X can be expressed as a linear combination of


the elements of X − {x}. X clearly has a smallest Compared to this purely algebraic concept of
element, say {xi1 }, for some non-zero xi1 ∈ X. Let basis in a vector space, is the Schauder basis in
the higher Hausdorff level a normed space which combines topological struc-
ture with the linear in the form of convergence: If
X = {C ∈ P 2 (X) : C is a chain in (X , ⊆)} ⊆ P 2 (X)
a normed vector space contains a sequence (e i )i∈Z+
and collection of the chains with the property that for every x ∈ X there is an
unique sequence of scalars (αi )i∈Z+ such that the
CiK = {{xi1 }, {xi1 , xi2 }, . . . , {xi1 , xi2 , . . . , xiK }}
remainder x − (α1 e1 + α2 e2 + · · · + αI eI ) ap-
∈ P 2 (X) proaches 0 as I → ∞, then the collection (e i ) is
of X comprising linearly independent subsets of X known as a Schauder basis for X.
be g-built from the smallest {xi1 }. Any chain C of (3) Ultrafilter. Let X be a set. The set F S =
X is bounded above by the union C∗ = supX C = {Sα ∈ P(X) : Sα Sβ = ∅, ∀α = β} ⊆ P(X) of all
C∈C C which is a chain in X containing {x i1 }, nonempty subsets of X with finite intersection prop-
thereby verifying (ST2) for X. Application of the erty is known as a filter subbase on X and F B =
tower theorem to X implies that the element {B ⊆ X : B = i∈I⊂D Si }, for I ⊂ D a finite subset
CT = {Ci1 , Ci2 , . . . , Cin , . . .} =→ T ⊆ P 2 (X) of a directed set D, is a filter-base on X associated
with the subbase F S; cf. Appendix A.1. Then the
in X of chains of X is a g-sequential fully ordered filter generated by F S consisting of every superset
towered subset of (X, ⊆) consisting of the com- of the finite intersections B ∈F B of sets of F S is
mon elements of all g-sequential towers of (X, ⊆), the smallest filter that contains the subbase F S and
that in fact is a chained principal ultrafilter on base F B. For notational simplicity, we will denote
(P(X), ⊆) generated by the filter-base {{{x i1 }}} at the subbase F S in the rest of this example simply
{xi1 }, where by S.
T = {Ci1 , Ci2 , . . . , Cjn , Cjn+1 , . . .} Consider the base-level ground set of all filter
subbases on X
for some n ∈ N is an example of non-chained g-
tower whenever (Cjk )∞ is neither contained in nor
k=n S = S ∈ P 2 (X) : R = ∅ for every finite subset of S
contains any member of the (Cik )∞ chain. Haus-
k=1 ∅=R⊆S
dorff’s chain theorem now yields the fixed-point g-
⊆ P 2 (X),
chain C← ∈ X of X

sup(CT ) =C← = {{xi1 }, {xi1 , xi2 }, {xi1 , xi2 , xi3 }, . . .} ordered by inclusion in the sense that S α ⊆ Sβ for
CT all α β ∈ D, and let the higher Hausdorff-level
=g(C← ) ∈ CT ⊆ P 2 (X) ˜
X = {C ∈ P 3 (X) : C is a chain in (S, ⊆)} ⊆ P 3 (X)

as a maximal totally ordered principal filter on X comprising the collection of the totally ordered
that is generated by the filter-base {{x i1 }} at xi1 , chains
whose supremum B = {xi1 , xi2 , . . .} ∈ P(X) is, by
Zorn’s lemma, a maximal element of the base level Cκ = {{Sα }, {Sα , Sβ }, . . . , {Sα , Sβ , . . . , Sκ }}
X . This maximal linearly independent subset of X ∈ P 3 (X)
is the required Hamel basis for X: Indeed, if the
of S be g-built from the smallest {Sα } then an ultra-
span of B is not the whole of X, then Span(B) x,
filter on X is a maximal member S+ of (S, ⊆) in the
with x ∈ Span(B) would, by definition, be a linearly
/
usual sense that any subbase S on X must necessar-
independent set of X strictly larger than B, con-
ily be contained in S+ so that S+ ⊆ S ⇒ S = S+
tradicting the assumed maximality of the later. It
for any S ⊆ P(X) with FIP. The tower theorem
needs to be understood that since the infinite basis
now implies that the element
cannot be classified as being linearly independent,
we have here an important example of the supre- ˜ ˜
CT = {Cα , Cβ , . . . , Cν , . . .} = → T ⊆ P 3 (X)
mum of the maximal chained set not belonging to
the set even though this criterion was explicitly used ˜
of P 4 (X), which is a chain in X of the chains of S,
in the construction process according to (ST2) and is a g-sequential fully ordered towered subset of the
(ST3). ˜
common elements of all sequential towers of ( X, ⊆)


that is a chained principal ultrafilter on (P 2 (X), ⊆) element of (X, ). This sequence is now applied, as
generated by the filter-base {{{Sα }}} at {Sα }, where in Example 4.2(1), to the set of arbitrary relations
˜ Multi(X) on an infinite set X in order to formulate
T = {Cα , Cβ , . . . , Cσ , Cς , . . .},
our definition of chaos that follows.
is an obvious example of non-chained g-tower when- Let f be a noninjective map in Multi(X) and
ever (Cσ ) is neither contained in, nor contains, any P (f ) the number of injective branches of f . Denote
member of the Cα -chain. Hausdorff’s chain theorem by
now yields the fixed-point C˜ ∈ X
←
˜
F = {f ∈ Multi(X) : f is a noninjective function
sup(CT ) = C˜ = {{Sα }, {Sα , Sβ }, {Sα , Sβ , Sγ }, . . .}
˜ ←
˜
CT
on X} ⊆ Multi(X)
= g(C˜ ) ∈ CT ⊆ P 3 (X)
←
˜
the resulting basic collection of noninjective func-
as a maximal totally ordered g-chained towered sub- tions in Multi(X).
set of X that is, by Zorn’s lemma, a maximal ele-
ment of the base level subset S of P 2 (X). C˜ is (i) For every α in some directed set D, let F have
←
a chained principal ultrafilter on (P(X), ⊆) gener- the extension property
ated by the filter-base {{Sα }} at Sα , while S+ = (∀fα ∈ F )(∃fβ ∈ F ) : P (fα ) ≤ P (fβ )
{Sα , Sβ , Sγ , . . .} ∈ P 2 (X) is an (non-principal) ul-
trafilter on X — characterized by the property that (ii) Let a partial order on Multi(X) be defined,
any collection of subsets on X with FIP (that is any for fα , fβ ∈ Map(X) ⊆ Multi(X) by
filter subbase on X) must be contained in the max- P (fα ) ≤ P (fβ ) ⇔ fα fβ , (54)
imal set S+ having FIP — that is not a principal
filter unless Sα is a singleton set {xα }. with P (f ) := 1 for the smallest f , define a par-
tially ordered subset (F, ) of Multi(X). This
What emerges from these applications of Zorn’s is actually a preorder on Multi(X) in which
Lemma is the remarkable fact that infinities (the functions with the same number of injective
dot-dot-dots) can be formally introduced as “limit- branches are equivalent to each other.
ing cases” of finite systems in a purely set-theoretic (iii) Let
context without the need for topologies, metrics or
convergences. The significance of this observation Cν = {fα ∈ Multi(X) : fα fν } ∈ P(F ) ,
will become clear from our discussions on filters and ν ∈ D,
topology leading to Sec. 4.2 below. Also, the obser-
be g-chains of non-injective functions of
vation on the successive iterates of the power sets
Multi(X) and
P(X) in the examples above was to suggest their
anticipated role in the complex evolution of a dy- X = {C ∈ P(F ) : C is a chain in (F, )} ⊆ P(F )
namical system that is expected to play a significant
part in our future interpretation and understanding denote the corresponding Hausdorff level of all
of this adaptive and self-organizing phenomenon of chains of F , with
nature. CT = {Cα , Cβ , . . . , Cν , . . .} =→ T ⊆ P(F )

End Tutorial 5 being a g-sequential in X . By Hausdorff Max-
imal Principle, there is a maximal fixed-point
g-towered chain C← ∈ X of F
From the examples in Tutorial 5, it should be clear
sup(CT ) = C← = {fα , fβ , . . .}
that the sequential steps summarized in Fig. 10 are CT
involved in an application of Zorn’s lemma to show = g(C← ) ∈ CT ⊆ P(F ).
that a partially ordered set has a maximal element
with respect to its order. Thus for a partially or- Zorn’s Lemma applied to this maximal chain yields
dered set (X, ), form the set X of all chains C in its supremum as the maximal element of C ← , and
X. If C+ is a maximal chain of X obtained by the thereby of F . It needs to be appreciated, as in the
Hausdorff Maximal Principle from the chain C of case of the algebraic Hamel basis, that the exis-
all chains of X, then its supremum u is a maximal tence of this maximal non-functional element was


obtained purely set theoretically as the "limit" of a net of functions with increasing nonlinearity, without resorting to any topological arguments. Because it is not a function, this supremum does not belong to the functional $g$-towered chain having it as a fixed point, and this maximal chain does not possess a largest, or even a maximal, element, although it does have a supremum.²⁴ The supremum is a contribution of the inverse functional relations $(f_\alpha^-)$ in the following sense. From Eq. (2), the net of increasingly non-injective functions of Eq. (54) implies a corresponding net of increasingly multivalued functions ordered inversely by the inverse relation $f_\alpha \preceq f_\beta \Leftrightarrow f_\beta^- \preceq f_\alpha^-$. Thus the inverse relations, which are as much an integral part of graphical convergence as are the direct relations, have a smallest element belonging to the multifunctional class. Clearly, this smallest element, as the required supremum of the increasingly non-injective tower of functions defined by Eq. (54), serves to complete the significance of the tower by capping it with a "boundary" element that can be taken to bridge the classes of functional and non-functional relations on $X$.

We are now ready to define a maximally ill-posed problem $f(x) = y$ for $x, y \in X$ in terms of a maximally non-injective map $f$ as follows.

Definition 4.1 (Chaotic map). Let $A$ be a non-empty closed set of a compact Hausdorff space $X$. A function $f \in \mathrm{Multi}(X)$ (equivalently, the sequence of functions $(f_i)$) is maximally non-injective or chaotic on $A$ with respect to the order relation (54) if

(a) for any $f_i$ on $A$ there exists an $f_j$ on $A$ satisfying $f_i \preceq f_j$ for every $j > i \in \mathbb{N}$,
(b) the set $D_+$ consists of a countable collection of isolated singletons.

Definition 4.2 (Maximally ill-posed problem). Let $A$ be a non-empty closed set of a compact Hausdorff space $X$ and let $f$ be a functional relation in $\mathrm{Multi}(X)$. The problem $f(x) = y$ is maximally ill-posed at $y$ if $f$ is chaotic on $A$.

As an example of the application of these definitions, on the dense set $D_+$, the tent map satisfies both the conditions of sensitive dependence on initial conditions and topological transitivity [Devaney, 1989] and is also maximally non-injective; the tent map is therefore chaotic on $D_+$. In contrast, the examples of Secs. 1 and 2 are not chaotic as the maps are not topologically transitive, although the Liapunov exponents, as in the case of the tent map, are positive. Here the $(f^n)$ are identified with the iterates of $f$, and the "fixed point" as one through which graphs of all the functions on residual index subsets pass. When the set of points $D_+$ is dense in $[0, 1]$ and both $D_+$ and $[0, 1] - D_+ = [0, 1] - \bigcup_{i=0}^{\infty} f^{-i}(\mathrm{Per}(f))$ (where $\mathrm{Per}(f)$ denotes the set of periodic points of $f$) are totally disconnected, it is expected that at any point on this complement the behavior of the limit will be similar to that on $D_+$: these points are special as they tie up the iterates on $\mathrm{Per}(f)$ to yield the multifunctions. Therefore in any neighborhood $U$ of a $D_+$-point, there is an $x_0$ at which the forward orbit $\{f^i(x_0)\}_{i \ge 0}$ is chaotic in the sense that

(a) the sequence neither diverges nor does it converge in the image space of $f$ to a periodic orbit of any period, and
(b) the Liapunov exponent given by

$$\lambda(x_0) \overset{\mathrm{def}}{=} \lim_{n \to \infty} \ln \left| \frac{d f^n(x_0)}{dx} \right|^{1/n} = \lim_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} \ln \left| \frac{d f(x_i)}{dx} \right|, \qquad x_i = f^i(x_0),$$

which is a measure of the average slope of an orbit at $x_0$ or equivalently of the average loss of information of the position of a point after one iteration, is positive.

Thus an orbit with positive Liapunov exponent is chaotic if it is not asymptotic (that is neither convergent nor adherent, having no convergent suborbit in the sense of Appendix A.1) to an unstable periodic orbit or to any other limit set on which the dynamics is simple. A basic example of a chaotic orbit is that of an irrational in $[0, 1]$ under the shift map and that of the chaotic set its closure, the full unit interval.

Let $f \in \mathrm{Map}((X, \mathcal{U}))$ and suppose that $A = \{f^j(x_0)\}_{j \in \mathbb{N}}$ is a sequential set corresponding to the orbit $\mathrm{Orb}(x_0) = (f^j(x_0))_{j \in \mathbb{N}}$, and let $f_{R_i}(x_0) = \bigcup_{j \ge i} f^j(x_0)$ be the $i$-residual of the sequence $(f^j(x_0))_{j \in \mathbb{N}}$, with ${}_{\mathrm{F}}\mathcal{B}_{x_0} = \{f_{R_i}(x_0) : \mathrm{Res}(\mathbb{N}) \to X$

24 A similar situation arises in the following more intuitive example. Although the subset $A = \{1/n\}_{n \in \mathbb{Z}_+}$ of the interval $I = [-1, 1]$ has no smallest or minimal elements, it does have the infimum 0. Likewise, although $A$ is bounded below by any element of $[-1, 0)$, it has no greatest lower bound in $[-1, 0) \cup (0, 1]$.


for all $i \in \mathbb{N}\}$ being the decreasingly nested filter-base associated with $\mathrm{Orb}(x_0)$. The so-called $\omega$-limit set of $x_0$ given by

$$\omega(x_0) \overset{\mathrm{def}}{=} \{x \in X : (\exists n_k \in \mathbb{N})(n_k \to \infty)(f^{n_k}(x_0) \to x)\} = \{x \in X : (\forall N \in \mathcal{N}_x)(\forall f_{R_i} \in {}_{\mathrm{F}}\mathcal{B}_{x_0})(f_{R_i}(x_0) \cap N \ne \emptyset)\} \tag{55}$$

is simply the adherence set $\mathrm{adh}(f^j(x_0))$ of the sequence $(f^j(x_0))_{j \in \mathbb{N}}$, see Eq. (A.39); hence Definition A.1.11 of the filter-base associated with a sequence and Eqs. (A.16), (A.24), (A.31) and (A.34) allow us to express $\omega(x_0)$ more meaningfully as

$$\omega(x_0) = \bigcap_{i \in \mathbb{N}} \mathrm{Cl}(f_{R_i}(x_0)). \tag{56}$$

It is clear from the second of Eqs. (55) that for a continuous $f$ and any $x \in X$, $x \in \omega(x_0)$ implies $f(x) \in \omega(x_0)$, so that the entire orbit of $x$ lies in $\omega(x_0)$ whenever $x$ does; this implies that the $\omega$-limit set is positively invariant. It is also closed because the adherent set is a closed set according to Theorem A.1.3. Hence $x_0 \in \omega(x_0) \Rightarrow A \subseteq \omega(x_0)$ reduces the $\omega$-limit set to the closure of $A$ without any isolated points, $A \subseteq \mathrm{Der}(A)$. In terms of Eq. (A.33) involving principal filters, Eq. (56) in this case may be expressed in the more transparent form $\omega(x_0) = \mathrm{Cl}({}_{\mathrm{F}}\mathcal{P}(\{f^j(x_0)\}_{j=0}^{\infty}))$ where the principal filter ${}_{\mathrm{F}}\mathcal{P}(\{f^j(x_0)\}_{j=0}^{\infty})$ at $A$ consists of all supersets of $A = \{f^j(x_0)\}_{j=0}^{\infty}$, and $\omega(x_0)$ represents the adherence set of the principal filter at $A$, see the discussion following Theorem A.1.3. If $A$ represents a chaotic orbit under this condition, then $\omega(x_0)$ is sometimes known as a chaotic set [Alligood et al., 1997]; thus the chaotic orbit infinitely often visits every member of its chaotic set²⁵ which is simply the $\omega$-limit set of a chaotic orbit that is itself contained in its own limit set. Clearly the chaotic set is positive invariant, and from Theorem A.1.3 and its corollary it is also compact. Furthermore, if all (sub)sequences emanating from points $x_0$ in some neighborhood of the set converge to it, then $\omega(x_0)$ is called a chaotic attractor, see [Alligood et al., 1997]. As common examples of chaotic sets that are not attractors mention may be made of the tent map with a peak value larger than 1 at 0.5, and the logistic map with $\lambda \ge 4$ again with a peak value at 0.5 exceeding 1.

It is important that the difference in the dynamical behavior of the system on $D_+$ and its complement be appreciated. At any fixed point $x$ of $f^i$ in $D_+$ (or at its equivalent images in $[x]$) the dynamics eventually gets attached to the (equivalent) fixed point, and the sequence of iterates converges graphically in $\mathrm{Multi}(X)$ to $x$ (or its equivalent points). When $x \notin D_+$, however, the orbit $A = \{f^i(x)\}_{i \in \mathbb{N}}$ is chaotic in the sense that $(f^i(x))$ is not asymptotically periodic and, not being attached to any particular point, the iterates wander about in the closed chaotic set $\omega(x) = \mathrm{Der}(A)$ containing $A$ such that for any given point in the set, some subsequence of the chaotic orbit gets arbitrarily close to it. Such sequences do not converge anywhere but only frequent every point of $\mathrm{Der}(A)$. Thus in the limit of progressively larger iterations there is complete uncertainty of the outcome of an experiment conducted at either of these two categories of initial points; but whereas on $D_+$ this is due to a random choice from a multifunctional set of equally probable outputs as dictated by the specific conditions under which the experiment was conducted at that instant, on its complement the uncertainty is due to the chaotic behavior of the functional iterates themselves. Nevertheless it must be clearly understood that this latter behavior is entirely due to the multifunctional limits at the $D_+$ points which completely determine the behavior of the system on its complement. As an explicit illustration of this situation, recall that for the shift map $2x \bmod 1$ the $D_+$ points are the rationals on $[0, 1]$, and any irrational is represented by a non-terminating and non-repeating decimal so that almost all decimals in $[0, 1]$ in any base contain all possible sequences of any number of digits. For the logistic map, the situation is more complex, however. Here the onset of chaos marking the end of the period doubling sequence at $\lambda_* = 3.5699456$ is signaled by the disappearance of all stable fixed points, Fig. 13(c), with Fig. 13(a) being a demonstration of the stable limits for $\lambda = 3.569$ that show up as convergence of the iterates to constant valued functions (rather than as constant valued inverse functions) at stable fixed points, shown more emphatically in Fig. 12(a). What actually happens at $\lambda_*$ is shown in Fig. 16(a) in the next subsection: the almost vertical lines produced at large, but finite, iterations $i$

25 How does this happen for $A = \{f^i(x_0)\}_{i \in \mathbb{N}}$ that is not the constant sequence $(x_0)$ at a fixed point? As $i \in \mathbb{N}$ increases, points are added to $\{x_0, f(x_0), \ldots, f^I(x_0)\}$ not, as would be the case in a normal sequence, as a piled-up Cauchy tail, but as points generally lying between those already present; recall a typical graph such as that of Fig. 9, for example.
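For the shift map 2x mod(1), the special role of the rationals can be followed exactly on a computer. The Python sketch below is an illustration under stated assumptions: the names `shift` and `orbit_structure` are hypothetical helpers, and exact `Fraction` arithmetic is used because binary floating point would lose one bit per iteration and collapse every orbit to 0. It reports how many steps a rational starting point takes before entering a cycle, and the length of that cycle.

```python
from fractions import Fraction


def shift(x):
    """Shift map x -> 2x mod 1, evaluated exactly on rationals."""
    return (2 * x) % 1


def orbit_structure(x0, max_steps=10_000):
    """Return (preperiod, period) of the orbit of the rational x0."""
    seen = {}          # point -> first iteration index at which it occurred
    x = x0
    for n in range(max_steps):
        if x in seen:
            return seen[x], n - seen[x]
        seen[x] = n
        x = shift(x)
    raise ValueError("no repetition found within max_steps")


for start in (Fraction(1, 7), Fraction(3, 10), Fraction(1, 8)):
    print(start, orbit_structure(start))
```

A rational start can only visit finitely many points (its denominator never grows), so its orbit must eventually repeat; an irrational start has no such constraint, which is the situation the text contrasts with the D+ points.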


[Figure 12 panels: (a) Stable 1-cycle, λ = 2.95, graphical limit at iterate 9001; (b) Stable 2-cycle, λ = 3.4, graphical limit at iterates 9001-9002; (c) Stable 4-cycle, λ = 3.5, graphical limit at iterates 9001-9004; (d) Stable 8-cycle, λ = 3.55, graphical limit at iterates 9001-9008.]

Fig. 12. Fixed points and cycles of logistic map. The isolated fixed point of (b) yields two non-fixed points to which the
iterates converge simultaneously in the sense that the generated sequence converges to one iff it converges to the other. This
suggests that nonlinear dynamics of a system can lead to a situation in which sequences in a Hausdorff space may converge
to more than one point. Since convergence depends on the topology (Corollary to Theorem A.1.5), this may be interpreted
to mean that nonlinearity tends to modify the basic structure of a space. The sequences of points generated by the iterates of
the map are marked on the y-axis of (a)–(c) in italics. The singletons {x} are ω-limit sets of the respective fixed points x and
are generated by the constant sequences (x, x, . . .). Whereas in (a) this is the limit of every point in (0, 1), in the other cases
these fixed points are isolated in the sense of Definition 2.3. The isolated points, however, give rise to sequences that converge
to more than one point in the form of limit cycles as shown in (b)–(d).
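The cycle lengths quoted in the four panels can be reproduced numerically. The Python sketch below is an illustration only, with assumed settings: the function names `logistic` and `detect_period`, the starting point 0.2, the burn-in of 9000 iterations (chosen to match the iterate numbers in the panel titles), the window length and the tolerance are all choices made for this example. After the burn-in it looks for the smallest p for which the iterates repeat with period p.

```python
def logistic(lam, x):
    """One step of the logistic map x -> lam * x * (1 - x)."""
    return lam * x * (1.0 - x)


def detect_period(lam, x0=0.2, burn_in=9000, window=64, tol=1e-9, max_period=16):
    """Smallest p with x[n + p] == x[n] (within tol) after the burn-in, or None."""
    x = x0
    for _ in range(burn_in):          # discard the transient
        x = logistic(lam, x)
    tail = []
    for _ in range(window):           # iterates 9001, 9002, ...
        x = logistic(lam, x)
        tail.append(x)
    for p in range(1, max_period + 1):
        if all(abs(tail[i + p] - tail[i]) < tol for i in range(window - p)):
            return p
    return None                       # no short cycle detected


periods = [detect_period(lam) for lam in (2.95, 3.4, 3.5, 3.55)]
print(periods)
```

A result of `None` would only mean that no cycle of length up to `max_period` was resolved at the given tolerance; it is not by itself evidence of chaos.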

(the multifunctions are generated only in the limit- chaos therefore, λx(1 − x) is chaotic for the values
ing sense of i → ∞ and represent a boundary be- of λ λ∗ that are shown in Fig. 16. We return to
tween functional and non-functional relations on a this case in the following subsection.
set), decrease in magnitude with increasing itera- As an example of chaos in a noniterative sys-
tions until they reduce to points. This gives rise to tem, we investigate the following question: While
a (totally disconnected) Cantor set on the y-axis in maximality of non-injectiveness produced by an in-
contrast with the connected intervals that the mul- creasing number of injective branches is necessary
tifunctional limits at λ λ∗ of Figs. 16(b)–16(d) for a family of functions to be chaotic, is this also
produce. By our characterization Definition 4.1 of sufficient for the system to be chaotic? This is an


[Figure 13 panels: (a) Iterate 9000 of logistic map, order at λ = 3.569; (b) 9000 iterations on logistic map at λ = 3.569; (c) Iterate 9000 of logistic map, "edge of chaos" at λ = λ∗; (d) 9000 iterations on logistic map at λ∗ = 3.5699456; (e) Iterate 9000 of logistic map, chaos at λ = 3.57; (f) 9000 iterations on logistic map at λ = 3.57.]

Fig. 13. Multifunctional and cobweb plots of λx(1 − x). Comparison of the graphs for the three values of λ shown in (a)–(f)
illustrates how the dramatic changes in the character of the former are conspicuously absent in the conventional plots that
display no perceptible distinction between the three cases.
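The Liapunov exponent defined earlier for an orbit, the orbit average of ln |df(x_i)/dx|, can be estimated directly for λx(1 − x). The Python sketch below is a minimal illustration under stated assumptions: the name `liapunov_logistic`, the starting point 0.2, the burn-in and the number of averaged iterates are choices made here, and the floor applied to the slope is only a guard against taking the logarithm of zero. For λ = 2.95 the orbit settles on the stable fixed point, where the slope is 2 − λ, so the estimate should approach ln 0.95 < 0; for λ = 4 the known value is ln 2 > 0.

```python
import math


def liapunov_logistic(lam, x0=0.2, burn_in=1000, n=100_000):
    """Estimate (1/n) * sum of ln|f'(x_i)| along an orbit of x -> lam*x*(1-x)."""
    x = x0
    for _ in range(burn_in):                 # discard the transient
        x = lam * x * (1.0 - x)
    total = 0.0
    for _ in range(n):
        slope = abs(lam * (1.0 - 2.0 * x))   # |f'(x)| for the logistic map
        total += math.log(max(slope, 1e-300))  # guard against log(0)
        x = lam * x * (1.0 - x)
    return total / n


for lam in (2.95, 3.569, 3.57, 4.0):
    print(lam, liapunov_logistic(lam))
```

Close to λ∗ the estimate is small in magnitude and converges slowly, so values obtained there from a finite orbit should be read with caution.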



important question especially in the context of a non-iterative family of functions where fixed points are no longer relevant.

Consider the sequence of functions $(|\sin(\pi n x)|)_{n=1}^{\infty}$. The graphs of the subsequence $|\sin(2^{n-1}\pi x)|$ and of the sequence $(t^n(x))$ on $[0, 1]$ are qualitatively similar in that they both contain $2^{n-1}$ of their functional graphs each on a base of $1/2^{n-1}$. Thus both $(|\sin(2^{n-1}\pi x)|)_{n=1}^{\infty}$ and $(t^n(x))_{n=1}^{\infty}$ converge graphically to the multifunction $[0, 1]$ on the same set of points equivalent to 0. This is sufficient for us to conclude that $(|\sin(2^{n-1}\pi x)|)_{n=1}^{\infty}$, and hence $(|\sin(\pi n x)|)_{n=1}^{\infty}$, is chaotic on the infinite equivalent set $[0]$. While Fig. 9 was a comparison of the first four iterates of the tent and absolute sine maps, Fig. 14 shows the "converged" graphical limits after 17 iterations.

4.1. The chaotic attractor

One of the most fascinating characteristics of chaos in dynamical systems is the appearance of attractors the dynamics on which are chaotic. For a subset $A$ of a topological space $(X, \mathcal{U})$ such that $R(f(A))$ is contained in $A$ (in this section, unless otherwise stated to the contrary, $f(A)$ will denote the graph and not the range (image) of $f$), which ensures that the iteration process can be carried out in $A$, let

$$ f_{\mathrm{R}i}(A) = \bigcup_{j \ge i \in \mathbb{N}} f^j(A) = \bigcup_{j \ge i \in \mathbb{N}} \Bigl( \bigcup_{x \in A} f^j(x) \Bigr) \tag{57} $$

generate the filter-base ${}_{\mathrm{F}}\mathcal{B}$ with $A_i := f_{\mathrm{R}i}(A) \in {}_{\mathrm{F}}\mathcal{B}$ being decreasingly nested, $A_{i+1} \subseteq A_i$ for all $i \in \mathbb{N}$, in accordance with Definition A.1.1. The existence of a maximal chain with a corresponding maximal element as assured by the Hausdorff Maximal Principle and Zorn's Lemma respectively implies a nonempty core of ${}_{\mathrm{F}}\mathcal{B}$. As in Sec. 3 following Definition 3.3, we now identify the filterbase with the neighborhood base at $f^{\infty}$ which allows us to define

$$ \mathrm{Atr}(A_1) \overset{\mathrm{def}}{=} \mathrm{adh}({}_{\mathrm{F}}\mathcal{B}) = \bigcap_{A_i \in {}_{\mathrm{F}}\mathcal{B}} \mathrm{Cl}(A_i) \tag{58} $$

as the attractor of the set $A_1$, where the last equality follows from Eqs. (59) and (20) and the closure is with respect to the topology induced by the neighborhood filter base ${}_{\mathrm{F}}\mathcal{B}$. Clearly the attractor as defined here is the graphical limit of the sequence of functions $(f^i)_{i \in \mathbb{N}}$ which may be verified by reference to Definition A.1.8, Theorem A.1.3 and the proofs of Theorems A.1.4 and A.1.5, together with the directed set Eq. (A.10) with direction (A.11). The basin of attraction of the attractor is $A_1$ because the graphical limit $(D_+, F(D_+)) \cup (G(R_+), R_+)$ of Definition 3.1 may be obtained, as indicated above, by a proper choice of sequences associated with $A$. Note that in the context of iterations of functions, the graphical limit $(D_+, y_0)$ of the sequence $(f^n(x))$ denotes a stable fixed point $x_*$ with image $x_* = f(x_*) = y_0$ to which iterations starting at any point $x \in D_+$ converge. The graphical limits $(x_{i0}, R_+)$ are generated with respect to the class $\{x_{i*}\}$ of points satisfying $f(x_{i0}) = x_{i*}$, $i = 0, 1, 2, \ldots$ equivalent to unstable fixed point $x_* := x_{0*}$ to which inverse iterations starting at any initial point in $R_+$ must converge. Even though only $x_*$ is inverse stable, an equivalent class of graphically converged limit multis is produced at every member of the class $x_{i*} \in [x_*]$, resulting in the far-reaching consequence that every member of the class is as significant as the parent fixed point $x_*$ from which they were born in determining the dynamics of the evolving system. The point to remember about infinite intersections of a collection of sets having finite intersection property, as in Eq. (58), is that this may very well be empty; recall, however, that in a compact space this is guaranteed not to be so. In the general case, if $\mathrm{core}(\mathcal{A}) \neq \emptyset$ then $\mathcal{A}$ is the principal filter at this core, and $\mathrm{Atr}(A_1)$ by Eqs. (58) and (A.33) is the closure of this core, which in this case of topology being induced by the filterbase, is just the core itself. $A_1$, by its very definition, is a positively invariant set as any sequence of graphs converging to $\mathrm{Atr}(A_1)$ must be eventually in $A_1$: the entire sequence therefore lies in $A_1$. Clearly, from Theorem A.3.1 and its corollary, the attractor is a positively invariant compact set. A typical attractor is illustrated by the derived sets in the second column of Fig. 22 which also illustrates that the set of functional relations is open in Multi(X); specifically functional–non-functional correspondences are neutral-selfish related as in Fig. 22, 3–2, with the attracting graphical limit of Eq. (58) forming the boundary of (finitely) many-to-one functions and the one-to-(finitely) many multifunctions.

Equation (58) is to be compared with the image definition of an attractor [Stuart & Humphries, 1996] where $f(A)$ denotes the range and not the graph of $f$. Then Eq. (58) can be used to define a


[Figure 14, panels (a) and (b): (a) 17th iterate of the tent map; (b) graph of |sin(2^16 πx)|. Both are plotted for 0 ≤ x ≤ 0.0008 with vertical range 0 to 1.]

Fig. 14. Similarity in the behavior of the graphs of (a) tent and (b) $|\sin(2^{16}\pi x)|$ maps at 17 iterations demonstrates chaoticity of the latter.
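
The qualitative similarity claimed for Fig. 14 can be checked on the dyadic points where both graphs are expected to touch 0 and 1. This is a minimal sketch under the assumption that the tent map is the standard $t(x) = 1 - |2x - 1|$ on $[0, 1]$; the function names are illustrative and not from the paper.

```python
import math


def tent(x):
    """Standard tent map t(x) = 1 - |2x - 1| on [0, 1] (assumed form)."""
    return 1.0 - abs(2.0 * x - 1.0)


def tent_iterate(x, n):
    """n-th iterate t^n(x) of the tent map."""
    for _ in range(n):
        x = tent(x)
    return x


def abs_sine(x, n):
    """The function |sin(2^(n-1) * pi * x)| on [0, 1]."""
    return abs(math.sin(2 ** (n - 1) * math.pi * x))


def zeros_and_peaks(n):
    """Dyadic points where both graphs should vanish and where both should reach 1."""
    pieces = 2 ** (n - 1)          # number of humps, each on a base of 1/2^(n-1)
    zeros = [k / pieces for k in range(pieces + 1)]
    peaks = [(2 * k + 1) / (2 * pieces) for k in range(pieces)]
    return zeros, peaks


if __name__ == "__main__":
    n = 5
    zeros, peaks = zeros_and_peaks(n)
    worst_zero = max(max(tent_iterate(z, n), abs_sine(z, n)) for z in zeros)
    worst_peak = max(max(1 - tent_iterate(p, n), 1 - abs_sine(p, n)) for p in peaks)
    print(f"n = {n}: {len(peaks)} humps, largest deviation at zeros {worst_zero:.1e}, "
          f"at peaks {worst_peak:.1e}")
```

Both families therefore share the same $2^{n-1}$ bases of width $1/2^{n-1}$, which is the property the text uses to transfer chaoticity from the tent map to the absolute sine family.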

sequence of points $x_k \in A_{n_k}$ and hence the subset

$$ \omega(A) \overset{\mathrm{def}}{=} \{x \in X : (\exists n_k \in \mathbb{N})(n_k \to \infty)(\exists x_k \in A_{n_k})\,(f^{n_k}(x_k) \to x)\} = \{x \in X : (\forall N \in \mathcal{N}_x)(\forall A_i \in \mathcal{A})\,(N \cap A_i \neq \emptyset)\} \tag{59} $$

as the corresponding attractor of $A$ that satisfies an equation formally similar to (58) with the difference that the filter-base $\mathcal{A}$ is now in terms of the image $f(A)$ of $A$, which allows the adherence expression to take the particularly simple form

$$ \omega(A) = \bigcap_{i \in \mathbb{N}} \mathrm{Cl}(f^i(A)). \tag{60} $$

The complementary subset excluded from this definition of $\omega(A)$, as compared to $\mathrm{Atr}(A_1)$, that is required to complete the formalism is given by Eq. (61) below. Observe that the equation for $\omega(A)$ is essentially Eq. (A.15), even though we prefer to use the alternate form of Eq. (A.16) as this brings out more clearly the frequenting nature of the sequence. The basin of attraction

$$ B_f(A) = \{x \in A : \omega(x) \subseteq \mathrm{Atr}(A)\} = \{x \in A : (\exists n_k \in \mathbb{N})(n_k \to \infty)\,(f^{n_k}(x) \to x_* \in \omega(A))\} \tag{61} $$

of the attractor is the smallest subset of $X$ in which sequences generated by $f$ must eventually lie in order to adhere at $\omega(A)$. Comparison of Eqs. (62) with (33) and (61) with (32) shows that $\omega(A)$ can be identified with the subset $R_+$ on the y-axis on which the multifunctional limits $G : R_+ \to X$ of graphical convergence are generated, with its basin of attraction being contained in the $D_+$ associated with the injective branch of $f$ that generates $R_+$. In summary it may be concluded that since definitions (59) and (61) involve both the domain and range of $f$, a description of the attractor in terms of the graph of $f$, like that of Eq. (58), is more pertinent and meaningful as it combines the requirements of both these equations. Thus, for example, as $\omega(A)$ is not the function $G(R_+)$, this attractor does not include the equivalence class of inverse stable points that may be associated with $x_*$, see for example Fig. 15.

From Eq. (59), we may make the particularly simple choice of $(x_k)$ to satisfy $f^{n_k}(x_{-k}) = x$ so that $x_{-k} = f_B^{-n_k}(x)$, where $x_{-k} \in [x_{-k}] := f^{-n_k}(x)$ is the element of the equivalence class of the inverse image of $x$ corresponding to the injective branch $f_B$. This choice is of special interest to us as it is the class that generates the $G$-function on $R_+$ in graphical convergence. This allows us to express $\omega(A)$ as

$$ \omega(A) = \{x \in X : (\exists n_k \in \mathbb{N})(n_k \to \infty)\,(f_B^{-n_k}(x) = x_{-k} \text{ converges in } (X, \mathcal{U}))\}; \tag{62} $$

note that the $x_{-k}$ of this equation and the $x_k$ of Eq. (59) are, in general, quite different points.

A simple illustrative example of the construction of $\omega(A)$ for the positive injective branch of the homeomorphism $(4x^2 - 1)/3$, $-1 \le x \le 1$, is


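[Figure 15: graph of f = (4x² − 1)/3 on −1 ≤ x ≤ 1 (horizontal ticks from −0.8 to 0.8, vertical ticks from −0.4 to 1), with curves labeled f_B, f_B² and f_B³, italic iterate numbers 1–5 on the graphs, and marked points x, x₁, x₂, x₋₁ and x₋₃.]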
Fig. 15. The attractor for $f(x) = (4x^2 - 1)/3$, for $-1 \le x \le 1$. The converging sequences are denoted by arrows on the right, and $(x_k)$ are chosen according to the construction shown. This example demonstrates how although $A \subseteq f(A)$, where $A = [0, 1]$ is the domain of the positive injective branch of $f$, the succeeding images $(f^i(A))_{i \ge 1}$ satisfy the required restriction for iteration, and $A$ in the discussion above can be taken to be $f(A)$; this is permitted as only a finite number of iterates is thereby discarded. It is straightforward to verify that $\mathrm{Atr}(A_1) = (-1, [-0.25, 1]) \cup ((-1, 1), -0.25) \cup (1, [-0.25, 1])$ with $F(x) = -0.25$ on $D_- = (-1, 1) = D_+$ and $G(y) = 1$, and $-1$ on $R_- = [-0.25, 1] = R_+$. By comparison, $\omega(A)$ from either its definition Eq. (59) or from the equivalent intersection expression Eq. (60), is simply the closed interval $R_+ = [-0.25, 1]$. The italicized iterate numbers on the graphs show how the oscillations die out with increasing iterations from $x = \pm 1$ and approach $-0.25$ in all neighborhoods of 0.
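
The convergence described in the caption of Fig. 15 is easy to reproduce. The sketch below is illustrative only (the helper `orbit` and the choice of 200 iterations are assumptions): it iterates $f(x) = (4x^2 - 1)/3$ from several initial points. The fixed points solve $4x^2 - 3x - 1 = 0$, giving $x = -1/4$ with $|f'(-1/4)| = 2/3 < 1$ (stable) and $x = 1$ with $f'(1) = 8/3 > 1$ (unstable).

```python
def f(x):
    """The map f(x) = (4x^2 - 1)/3 of Fig. 15 on [-1, 1]."""
    return (4.0 * x * x - 1.0) / 3.0


def orbit(x0, n):
    """Return the list [x0, f(x0), ..., f^n(x0)]."""
    xs = [x0]
    for _ in range(n):
        xs.append(f(xs[-1]))
    return xs


if __name__ == "__main__":
    # Interior points are drawn to -0.25; the end points x = -1 and x = 1 map to 1.
    for x0 in (-1.0, -0.9, 0.0, 0.5, 0.99, 1.0):
        print(f"x0 = {x0:5.2f} -> f^200(x0) = {orbit(x0, 200)[-1]:.6f}")
```

The constant value $-0.25$ on the open interval together with the value 1 at $x = \pm 1$ is the pointwise picture behind the limit $F(x) = -0.25$ on $D_+ = (-1, 1)$ quoted in the caption.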

shown in Fig. 15, where the arrow-heads denote the converging sequences $f^{n_i}(x_i) \to x$ and $f^{n_i - m}(x_i) \to x_{-m}$ which proves invariance of $\omega(A)$ for a homeomorphic $f$; here continuity of the function and its inverse is explicitly required for invariance. Positive invariance of a subset $A$ of $X$ implies that for any $n \in \mathbb{N}$ and $x \in A$, $f^n(x) = y_n \in A$, while negative invariance assures that for any $y \in A$, $f^{-n}(y) = x_{-n} \in A$. Invariance of $A$ in both the forward and backward directions therefore means that for any $y \in A$ and $n \in \mathbb{N}$, there exists an $x \in A$ such that $f^n(x) = y$. In interpreting this figure, it may be useful to recall from Definition 4.1 that an increasing number of injective branches of $f$ is a necessary, but not sufficient, condition for the occurrence of chaos; thus in Figs. 12(a) and 15, increasing noninjectivity of $f$ leads to constant valued limit functions over a connected $D_+$ in a manner similar to that associated with the classical Gibbs phenomenon in the theory of Fourier series.

Graphical convergence of an increasingly nonlinear family of functions implied by its increasing non-injectivity may now be combined with the requirements of an attractor to lead to the concept of a chaotic attractor to be that on which the dynamics is chaotic in the sense of Definitions 4.1 and 4.2. Hence

Definition 4.3 (Chaotic Attractor). Let $A$ be a positively invariant subset of $X$. The attractor $\mathrm{Atr}(A)$ is chaotic on $A$ if there is sensitive dependence on initial conditions for all $x \in A$. The sensitive dependence manifests itself as multifunctional graphical limits for all $x \in D_+$ and as chaotic orbits when $x \notin D_+$.

The picture of chaotic attractors that emerges from the foregoing discussions and our characterization of chaos of Definition 4.1 is that it is a subset of $X$ that is simultaneously "spiked" multifunctional on the y-axis and consists of a dense collection of singleton domains of attraction on the x-axis. This is illustrated in Fig. 16 which shows some typical chaotic attractors. The first four diagrams (a)–(d) are for the logistic map with (b)–(d) showing the 4-, 2- and 1-piece attractors for λ = 3.575,


3.66, and 3.8, respectively that are in qualitative agreement with the standard bifurcation diagram reproduced in (e). Figures 16(b)–16(d) have the advantage of clearly demonstrating how the attractors are formed by considering the graphically converged limit as the object of study unlike in (e) which shows the values of the 501–1001th iterates of $x_0 = 1/2$ as a function of $\lambda$. The difference in (a) and (b) for a change of $\lambda$ from $\lambda = \lambda_* = 3.5699456$ to 3.575 is significant as $\lambda = \lambda_*$ marks the boundary between the nonchaotic region for $\lambda < \lambda_*$ and the chaotic for $\lambda > \lambda_*$ (this is to be understood as being suitably modified by the appearance of the nonchaotic windows for some specific intervals in $\lambda > \lambda_*$). At $\lambda_*$ the generated fractal Cantor set $\Lambda$ is an attractor as it attracts almost every initial point $x_0$ so that the successive images $x^n = f^n(x_0)$ converge toward the Cantor set $\Lambda$. In (f) the chaotic attractor for

[Figure 16, panels (a)–(d), each plotted on the unit square: (a) λ = 3.5699456, iterates 2001–2004; (b) λ = 3.575, iterates 2001–2004; (c) λ = 3.66, iterates 2001–2004; (d) λ = 3.8, iterates 2001–2002.]
Fig. 16. Chaotic attractors for different values of $\lambda$. For the logistic map the usual bifurcation diagram (e) shows the chaotic attractors for $\lambda \ge \lambda_* = 3.5699456$, while (a)–(d) display the graphical limits for four values of $\lambda$ chosen for the Cantor set and 4-, 2-, and 1-piece attractors, respectively. In (f) the attractor $[0, 1]$ (where the dotted lines represent odd iterates and the solid lines even iterates of $f$) disappears if $f$ is reflected about the x-axis. The function $f_f(x)$ is given by

$$ f_f(x) = \begin{cases} 2(1 + x)/3, & 0 \le x < 1/2 \\ 2(1 - x), & 1/2 \le x \le 1. \end{cases} $$

[Figure 16 (continued), panels (e) and (f): (e) bifurcation diagram of the logistic map, with λ₄ = 3.449, λ∗ and 4 marked on the λ-axis; (f) first 12 iterates of f_f.]
Fig. 16. (Continued)

the piecewise continuous function on $[0, 1]$

$$ f_f(x) = \begin{cases} \dfrac{2(1 + x)}{3}, & 0 \le x < \dfrac{1}{2} \\[2ex] 2(1 - x), & \dfrac{1}{2} \le x \le 1, \end{cases} $$

is $[0, 1]$ where the dotted lines represent odd iterates and the full lines even iterates of $f$; here the attractor disappears if the function is reflected about the x-axis.

4.2. Why chaos? A preliminary inquiry

The question as to why a natural system should evolve chaotically is both interesting and relevant, and this section attempts to advance a plausible answer to this inquiry that is based on the connection between topology and convergence contained in the Corollary to Theorem A.1.5. Open sets are groupings of elements that govern convergence of nets and filters, because the required property of being either eventually or frequently in (open) neighborhoods of a point determines the eventual behavior of the net; recall in this connection the unusual convergence characteristics in cofinite and cocountable spaces. Conversely for a given convergence characteristic of a class of nets, it is possible to infer the topology of the space that is responsible for this convergence, and it is this point of view that we adopt here to investigate the question of this subsection: recall that our Definitions 4.1 and 4.2 were based on purely algebraic set-theoretic arguments on ordered sets, just as the role of the choice of an appropriate problem-dependent basis was highlighted at the end of Sec. 2. Chaos as manifest in its attractors is a direct consequence of the increasing nonlinearity of the map with increasing iteration; we reemphasize that this is only a necessary condition so that the increasing nonlinearities of Figs. 12 and 15 eventually lead to stable states and not to chaotic instability. Under the right conditions as enunciated following Fig. 10, chaos appears to be the natural outcome of the difference in the behavior of a function $f$ and its inverse $f^-$ under their successive applications. Thus $f = f f^- f$ allows $f$ to take advantage of its multi-inverse to generate all possible equivalence classes that are available, a feature not accessible to $f^- = f^- f f^-$. As we have seen in the foregoing, equivalence classes of fixed points, stable and unstable, are of defining significance in determining the ultimate behavior of an evolving dynamical system and as the eventual (as also frequent) character of a filter or net in a set is dictated by open neighborhoods of points of the set, it is postulated that chaoticity on a set $X$ leads to a reformulation of the open sets of $X$ to equivalence classes generated by the evolving map $f$, see Example 2.4(3). Such a redefinition of open sets of equivalence classes allows the evolving system to temporally access an ever increasing number of states even though the equivalent fixed points are not fixed under iterations of $f$ except for the parent of the class, and can be considered to be the governing criterion for the cooperative or collective behavior


of the system. The predominance of the role of $f^-$ in $f = f f^- f$ in generating the equivalence classes (that is exploiting the many-to-one character) of $f$ is reflected as limit multis for $f$ (i.e. constant $f^-$ on $R_+$) in $f^- = f^- f f^-$; this interpretation of the dynamics of chaos is meaningful as graphical convergence leading to chaos is a result of pointwise biconvergence of the sequence of iterates of the functions generated by $f$. But as $f$ is a noninjective function on $X$ possessing the property of increasing nonlinearity in the form of increasing noninjectivity with iteration, various cycles of disjoint equivalence classes are generated under iteration, see for example Fig. 9(a) for the tent map. A reference to Fig. shows that the basic set $X_B$, for a finite number $n$ of iterations of $f$, contains the parent of each of these open equivalent sets in the domain of $f$, with the topology on $X_B$ being the corresponding $p$-images of these disjoint saturated open sets of the domain. In the limit of infinite iterations of $f$ leading to the multifunction $\mathcal{M}$ (this is the $f^{\infty}$ of Sec. 4.1), the generated open sets constitute a basis for a topology on $D(f)$ and the basis for the topology of $R(f)$ are the corresponding $\mathcal{M}$-images of these equivalent classes. It is our contention that the motive force behind evolution toward a chaos, as defined by Definition 4.1, is the drive toward a state of the dynamical system that supports ininality of the limit multi $\mathcal{M}$; see Appendix A.2 with the discussions on Fig. and Eq. (26) in Sec. 2. In the limit of infinite iterations therefore, the open sets of the range $R(f) \subseteq X$ are the multi images that graphical convergence generates at each of these inverse-stable fixed points. $X$ therefore has two topologies imposed on it by the dynamics of $f$: the first of equivalence classes generated by the limit multi $\mathcal{M}$ in the domain of $f$ and the second as $\mathcal{M}$-images of these classes in the range of $f$. Quite clearly these two topologies need not be the same; their intersection therefore can be defined to be the chaotic topology on $X$ associated with the chaotic map $f$ on $X$. Neighborhoods of points in this topology cannot be arbitrarily small as they consist of all members of the equivalence class to which any element belongs; hence a sequence converging to any of these elements necessarily converges to all of them, and the eventual objective of chaotic dynamics is to generate a topology in $X$ with respect to which elements of the set can be grouped together in as large equivalence classes as possible in the sense that if a net converges simultaneously to points $x \neq y \in X$ then $x \sim y$: $x$ is of course equivalent to itself while $x, y, z$ are equivalent to each other iff they are simultaneously in every open set in which the net may eventually belong. This hallmark of chaos can be appreciated in terms of a necessary obliteration of any separation property that the space might have originally possessed, see property (H3) in Appendix A.3. We reemphasize that a set in this chaotic context is required to act in a dual capacity depending on whether it carries the initial or final topology under $\mathcal{M}$.

This preliminary inquiry into the nature of chaos is concluded in the final section of this paper.

5. Graphical Convergence Works

We present in this section some real evidence in support of our hypothesis of graphical convergence of functions in Multi(X, Y). The example is taken from neutron transport theory, and concerns the discretized spectral approximation [Sengupta, 1988, 1995] of Case's singular eigenfunction solution of the monoenergetic neutron transport equation [Case & Zweifel, 1967]. The neutron transport equation is a linear form of the Boltzmann equation that is obtained as follows. Consider the neutron-moderator system as a mixture of two species of gases each of which satisfies a Boltzmann equation of the type

$$ \left(\frac{\partial}{\partial t} + v_i \cdot \nabla\right) f_i(r, v, t) = \sum_j \int dv' \int dv_1 \int dv_1' \, W_{ij}(v_i \to v'; v_1 \to v_1') \, \{ f_i(r, v', t) f_j(r, v_1', t) - f_i(r, v, t) f_j(r, v_1, t) \} $$

where

$$ W_{ij}(v_i \to v'; v_1 \to v_1') = |v - v_1| \, \sigma_{ij}(v - v', v_1 - v_1'), $$

$\sigma_{ij}$ being the cross-section of interaction between species $i$ and $j$. Denote neutrons by subscript 1 and the background moderator with which the neutrons interact by 2, and make the assumptions that

(i) The neutron density $f_1$ is much less compared with that of the moderator $f_2$ so that the terms $f_1 f_1$ and $f_1 f_2$ may be neglected in the neutron and moderator equations, respectively.

(ii) The moderator distribution $f_2$ is not affected by the neutrons. This decouples the neutron and moderator equations and leads to an equilibrium


Maxwellian $f_M$ for the moderator while the neutrons are described by the linear equation

$$ \left(\frac{\partial}{\partial t} + v \cdot \nabla\right) f(r, v, t) = \int dv' \int dv_1 \int dv_1' \, W_{12}(v \to v'; v_1 \to v_1') \, \{ f(r, v', t) f_M(v_1') - f(r, v, t) f_M(v_1) \}. $$

This is now put in the standard form of the neutron transport equation [Williams, 1971]

$$ \left(\frac{1}{v}\frac{\partial}{\partial t} + \hat{\Omega} \cdot \nabla + S(E)\right) \Phi(r, E, \hat{\Omega}, t) = \int d\hat{\Omega}' \int dE' \, S(r, E' \to E; \hat{\Omega}' \cdot \hat{\Omega}) \, \Phi(r, E', \hat{\Omega}', t), $$

where $E = mv^2/2$ is the energy and $\hat{\Omega}$ the direction of motion of the neutrons. The steady state, monoenergetic form of this equation is Eq. (A.53)

$$ \mu \frac{\partial \Phi(x, \mu)}{\partial x} + \Phi(x, \mu) = \frac{c}{2} \int_{-1}^{1} \Phi(x, \mu') \, d\mu', \qquad 0 < c < 1, \quad -1 \le \mu \le 1 $$

and its singular eigenfunction solution for $x \in (-\infty, \infty)$ is given by Eq. (A.56)

$$ \Phi(x, \mu) = a(\nu_0) e^{-x/\nu_0} \phi(\mu, \nu_0) + a(-\nu_0) e^{x/\nu_0} \phi(-\nu_0, \mu) + \int_{-1}^{1} a(\nu) e^{-x/\nu} \phi(\mu, \nu) \, d\nu; $$

see Appendix A.4 for an introductory review of Case's solution of the one-speed neutron transport equation.

The term "eigenfunction" is motivated by the following considerations. Consider the eigenvalue equation

$$ (\mu - \nu) F_\nu(\mu) = 0, \qquad \mu \in V(\mu), \quad \nu \in \mathbb{R} \tag{63} $$

in the space of multifunctions $\mathrm{Multi}(V(\mu), (-\infty, \infty))$, where $\mu$ is in either of the intervals $[-1, 1]$ or $[0, 1]$ depending on whether the given boundary conditions for Eq. (A.53) are full-range or half-range. If we are looking only for functional solutions of Eq. (63), then the unique function $F$ that satisfies this equation for all possible $\mu \in V(\mu)$ and $\nu \in \mathbb{R} - V(\mu)$ is $F_\nu(\mu) = 0$ which means, according to Table 1, that the point spectrum of $\mu$ is empty and $(\mu - \nu)^{-1}$ exists for all $\nu$. When $\nu \in V(\mu)$, however, this inverse is not continuous and we show below that in $\mathrm{Map}(V(\mu), 0)$, $\nu \in V(\mu)$ belongs to the continuous spectrum of $\mu$. This distinction between the nature of the inverses depending on the relative values of $\mu$ and $\nu$ suggests a wider "non-function" space in which to look for the solutions of operator equations, and in keeping with the philosophy embodied in Fig. of treating inverse problems in the space of multifunctions, we consider all $F_\nu \in \mathrm{Multi}(V(\mu), \mathbb{R})$ satisfying Eq. (63) to be eigenfunctions of $\mu$ for the corresponding eigenvalue $\nu$, leading to the following multifunctional solution of (63)

$$ F_\nu(\mu) = \begin{cases} (V(\mu), 0) & \text{if } \nu \notin V(\mu) \\ (V(\mu) - \nu, 0) \cup (\nu, \mathbb{R}) & \text{if } \nu \in V(\mu), \end{cases} $$

where $V(\mu) - \nu$ is used as a shorthand for the interval $V(\mu)$ with $\nu$ deleted. Rewriting the eigenvalue equation (63) as $\mu_\nu(F_\nu(\mu)) = 0$ and comparing this with Fig. , allows us to draw the correspondences

$$ \begin{aligned} f &\Leftrightarrow \mu_\nu \\ X \text{ and } Y &\Leftrightarrow \{F_\nu \in \mathrm{Multi}(V(\mu), \mathbb{R}) : F_\nu \in D(\mu_\nu)\} \\ f(X) &\Leftrightarrow \{0 : 0 \in Y\} \\ X_B &\Leftrightarrow \{0 : 0 \in X\} \\ f^- &\Leftrightarrow \mu_\nu^-. \end{aligned} \tag{64} $$

Thus a multifunction in $X$ is equivalent to 0 in $X_B$ under the linear map $\mu_\nu$, and we show below that this multifunction is in fact the Dirac delta "function" $\delta_\nu(\mu)$, usually written as $\delta(\mu - \nu)$. This suggests that in $\mathrm{Multi}(V(\mu), \mathbb{R})$, every $\nu \in V(\mu)$ is in the point spectrum of $\mu$, so that discontinuous functions that are pointwise limits of functions in function space can be replaced by graphically converged multifunctions in the space of multifunctions. Completing the equivalence class of 0 in Fig. , gives the multifunctional solution of Eq. (63).

From a comparison of the definition of ill-posedness (Sec. 2) and the spectrum (Table 1), it is clear that $L_\lambda(x) = y$ is ill-posed iff

(1) $L_\lambda$ not injective ⇔ $\lambda \in P\sigma(L_\lambda)$, which corresponds to the first row of Table 1.
(2) $L_\lambda$ not surjective ⇔ the values of $\lambda$ correspond to the second and third columns of Table 1.
(3) $L_\lambda$ is bijective but not open ⇔ $\lambda$ is either in $C\sigma(L_\lambda)$ or $R\sigma(L_\lambda)$ corresponding to the second row of Table 1.

We verify in the three steps below that for $X = L_1[-1, 1]$ of integrable functions, $\nu \in V(\mu) = [-1, 1]$ belongs to the continuous spectrum of $\mu$.


Table 1. Spectrum of linear operator $L \in \mathrm{Map}(X)$. Here $L_\lambda := L - \lambda$ satisfies the equation $L_\lambda(x) = 0$, with the resolvent set $\rho(L)$ of $L$ consisting of all those complex numbers $\lambda$ for which $L_\lambda^{-1}$ exists as a continuous operator with dense domain. Any value of $\lambda$ for which this is not true is in the spectrum $\sigma(L)$ of $L$, that is further subdivided into three disjoint components of the point, continuous and residual spectra according to the criteria shown in the table.

                                          R(L_λ)
L_λ             L_λ^{-1}           R = X      Cl(R) = X    Cl(R) ≠ X

Not injective   ···                Pσ(L)      Pσ(L)        Pσ(L)
Injective       Not continuous     Cσ(L)      Cσ(L)        Rσ(L)
Injective       Continuous         ρ(L)       ρ(L)         Rσ(L)

(a) $R(\mu_\nu)$ is dense, but not equal to $L_1$. The set of functions $g(\mu) \in L_1$ such that $\mu_\nu^{-1} g \in L_1$ cannot be the whole of $L_1$. Thus, for example, the piecewise constant function $g = \text{const} \neq 0$ on $|\mu - \nu| \le \delta$, $\delta > 0$, and 0 otherwise is in $L_1$ but not in $R(\mu_\nu)$ as $\mu_\nu^{-1} g \notin L_1$. Nevertheless for any $g \in L_1$, we may choose the sequence of functions

$$ g_n(\mu) = \begin{cases} 0, & \text{if } |\mu - \nu| \le 1/n \\ g(\mu), & \text{otherwise} \end{cases} $$

in $R(\mu_\nu)$ to be eventually in every neighborhood of $g$ in the sense that $\lim_{n \to \infty} \int_{-1}^{1} |g - g_n| = 0$.

(b) The inverse $(\mu - \nu)^{-1}$ exists but is not continuous. The inverse exists because, as noted earlier, 0 is the only functional solution of Eq. (63). Nevertheless, although the net of functions

$$ \delta_{\nu\varepsilon}(\mu) = \frac{1}{\tan^{-1}\bigl((1 + \nu)/\varepsilon\bigr) + \tan^{-1}\bigl((1 - \nu)/\varepsilon\bigr)} \times \frac{\varepsilon}{(\mu - \nu)^2 + \varepsilon^2}, \qquad \varepsilon > 0 $$

is in the domain of $\mu_\nu$ because $\int_{-1}^{1} \delta_{\nu\varepsilon}(\mu) \, d\mu = 1$ for all $\varepsilon > 0$,

$$ \lim_{\varepsilon \to 0} \int_{-1}^{1} |\mu - \nu| \, \delta_{\nu\varepsilon}(\mu) \, d\mu = 0 $$

implying that $(\mu - \nu)^{-1}$ is unbounded.

Taken together, (a) and (b) show that functional solutions of Eq. (63) lead to state 2–2 in Table 1; hence $\nu \in [-1, 1] = C\sigma(\mu)$.

(c) The two integral constraints in (b) also mean that $\nu \in C\sigma(\mu)$ is a generalized eigenvalue of $\mu$ which justifies calling the graphical limit $\delta_{\nu\varepsilon}(\mu) \xrightarrow{G} \delta_\nu(\mu)$ a generalized, or singular, eigenfunction, see Fig. 17 which clearly indicates the convergence of the net of functions.$^{26}$

[Figure 17, panels (a) and (b): graphs plotted for −0.5 ≤ x ≤ 0.5, with vertical range 0 to 20 in (a) and −32 to 32 in (b).]

Fig. 17. Graphical convergence of: (a) Poisson kernel $\delta_\varepsilon(x) = \varepsilon/\pi(x^2 + \varepsilon^2)$ and (b) conjugate Poisson kernel $P_\varepsilon(x) = x/(x^2 + \varepsilon^2)$ to the Dirac delta and principal value, respectively; the graphs, each for a definite $\varepsilon$-value, converge to the respective limits as $\varepsilon \to 0$.

From the fact that the solution Eq. (A.56) of the transport equation contains an integral involving the multifunction $\phi(\mu, \nu)$, we may draw an interesting physical interpretation. As the multi appears everywhere on $V(\mu)$ (i.e. there are no chaotic orbits but only the multifunctions that produce them), we have here a situation typical of maximal ill-posedness characteristic of chaos: note that both the functions comprising $\phi_\varepsilon(\mu, \nu)$ are non-injective. As the solution (A.56) involves an integral over all $\nu \in V(\mu)$, the singular eigenfunctions (that collectively may be regarded as representing a chaotic substate of the system represented by the solution of the neutron transport equation) combine with the functional components $\phi(\pm\nu_0, \mu)$ to produce the well-defined, non-chaotic, experimental end result of the neutron flux $\Phi(x, \mu)$.

The solution (A.56) is obtained by assuming $\Phi(x, \mu) = e^{-x/\nu} \phi(\mu, \nu)$ to get the equation for $\phi(\mu, \nu)$ to be $(\mu - \nu)\phi(\mu, \nu) = -c\nu/2$ with the normalization $\int_{-1}^{1} \phi(\mu, \nu) \, d\mu = 1$. As $\mu_\nu^{-1}$ is not invertible in $\mathrm{Multi}(V(\mu), \mathbb{R})$ and $\mu_{\nu B} : X_B \to f(X)$ does not exist, the alternate approach of regularization was adopted in [Sengupta, 1988, 1995] to rewrite $\mu_\nu \phi(\mu, \nu) = -c\nu/2$ as $\mu_{\nu\varepsilon} \phi_\varepsilon(\mu, \nu) = -c\nu/2$ with $\mu_{\nu\varepsilon} := \mu - (\nu + i\varepsilon)$ being a net of bijective functions for $\varepsilon > 0$; this is a consequence of the fact that for the multiplication operator every non-real $\lambda$ belongs to the resolvent set of the operator. The family of solutions of the latter equation is given by [Sengupta, 1988, 1995]

$$ \phi_\varepsilon(\nu, \mu) = \frac{c\nu}{2} \frac{\nu - \mu}{(\mu - \nu)^2 + \varepsilon^2} + \frac{\lambda_\varepsilon(\nu)}{\pi_\varepsilon} \frac{\varepsilon}{(\mu - \nu)^2 + \varepsilon^2} \tag{65} $$

where the required normalization $\int_{-1}^{1} \phi_\varepsilon(\nu, \mu) \, d\mu = 1$ gives

$$ \lambda_\varepsilon(\nu) = \frac{\pi_\varepsilon}{\tan^{-1}\bigl((1 + \nu)/\varepsilon\bigr) + \tan^{-1}\bigl((1 - \nu)/\varepsilon\bigr)} \times \left( 1 - \frac{c\nu}{4} \ln \frac{(1 + \nu)^2 + \varepsilon^2}{(1 - \nu)^2 + \varepsilon^2} \right) \xrightarrow{\varepsilon \to 0} \pi\lambda(\nu) $$

with

$$ \pi_\varepsilon = \varepsilon \int_{-1}^{1} \frac{d\mu}{\mu^2 + \varepsilon^2} = 2 \tan^{-1}\left(\frac{1}{\varepsilon}\right) \xrightarrow{\varepsilon \to 0} \pi. $$

These discretized equations should be compared with the corresponding exact ones of Appendix A.4. We shall see that the net of functions (65) converges graphically to the multifunction Eq. (A.55) as $\varepsilon \to 0$.

In the discretized spectral approximation, the singular eigenfunction $\phi(\mu, \nu)$ is replaced by $\phi_\varepsilon(\mu, \nu)$, $\varepsilon \to 0$, with the integral in $\nu$ being replaced by an appropriate sum. The solution Eq. (A.58) of the physically interesting half-space $x \ge 0$ problem then reduces to [Sengupta, 1988, 1995]

$$ \Phi_\varepsilon(x, \mu) = a(\nu_0) e^{-x/\nu_0} \phi(\mu, \nu_0) + \sum_{i=1}^{N} a(\nu_i) e^{-x/\nu_i} \phi_\varepsilon(\mu, \nu_i), \qquad \mu \in [0, 1] \tag{66} $$

where the nodes $\{\nu_i\}_{i=1}^{N}$ are chosen suitably. This discretized spectral approximation to Case's solution has given surprisingly accurate numerical results for a set of properly chosen nodes when compared with exact calculations. Because of its involved nature [Case & Zweifel, 1967], the exact calculations are basically numerical which leads to nonlinear integral equations as part of the solution procedure. To appreciate the enormous complexity of the exact treatment of the half-space problem, we recall that the complete set of eigenfunctions $\{\phi(\mu, \nu_0), \{\phi(\mu, \nu)\}_{\nu \in [0, 1]}\}$ are orthogonal with respect to the half-range weight function $W(\mu)$ of half-range theory, Eq. (A.61), that is expressed only in terms of solution of the nonlinear integral equation Eq. (A.62). The solution of a half-space problem then evaluates the coefficients $\{a(\nu_0), a(\nu)_{\nu \in [0, 1]}\}$ from the appropriate half-range (that is $0 \le \mu \le 1$) orthogonality integrals satisfied by the eigenfunctions $\{\phi(\mu, \nu_0), \{\phi(\mu, \nu)\}_{\nu \in [0, 1]}\}$ with respect to the weight $W(\mu)$, see Appendix A.4 for the necessary details of the half-space problem in neutron transport theory.

As may be appreciated from this brief introduction, solutions to half-space problems are not simple and actual numerical computations must rely a great deal on tabulated values of the X-function.

26 The technical definition of a generalized eigenvalue is as follows. Let $L$ be a linear operator such that there exists in the domain of $L$ a sequence of elements $(x_n)$ with $\|x_n\| = 1$ for all $n$. If $\lim_{n \to \infty} \|(L - \lambda)x_n\| = 0$ for some $\lambda \in \mathbb{C}$, then this $\lambda$ is a generalized eigenvalue of $L$, the corresponding eigenfunction $x_\infty$ being a generalized eigenfunction.
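
The two integral constraints on the net $\delta_{\nu\varepsilon}$ in step (b), which are also the conditions of the footnote with $L = \mu$ and $\lambda = \nu$ in the $L_1$ norm, can be checked numerically. The following is a rough sketch under stated assumptions: the composite midpoint rule, the number of subintervals and the sample values $\nu = 0.3$ and $\varepsilon \in \{0.1, 0.01, 0.001\}$ are illustrative choices, not taken from the paper.

```python
import math


def delta_eps(mu, nu, eps):
    """The kernel eps/((mu - nu)^2 + eps^2), normalized to unit integral on [-1, 1]."""
    norm = math.atan((1.0 + nu) / eps) + math.atan((1.0 - nu) / eps)
    return eps / (((mu - nu) ** 2 + eps ** 2) * norm)


def midpoint_integral(g, a=-1.0, b=1.0, n=100_000):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))


def constraints(nu, eps):
    """Return (integral of delta, integral of |mu - nu| * delta) over [-1, 1]."""
    mass = midpoint_integral(lambda mu: delta_eps(mu, nu, eps))
    moment = midpoint_integral(lambda mu: abs(mu - nu) * delta_eps(mu, nu, eps))
    return mass, moment


if __name__ == "__main__":
    nu = 0.3  # an arbitrary point of V(mu) = [-1, 1]
    for eps in (0.1, 0.01, 0.001):
        mass, moment = constraints(nu, eps)
        print(f"eps = {eps}: integral of delta = {mass:.6f}, "
              f"integral of |mu - nu| * delta = {moment:.6f}")
```

The first integral stays at 1 while the second shrinks with $\varepsilon$, which is the behavior that makes $\nu$ a generalized eigenvalue of the multiplication operator.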


Self-consistent calculations of sample benchmark problems performed by the discretized spectral approximation in a full-range adaption of the half-range problem described below that generate all necessary data, independent of numerical tables, with the quadrature nodes $\{\nu_i\}_{i=1}^{N}$ taken at the zeros of Legendre polynomials show that the full-range formulation of this approximation [Sengupta, 1988, 1995] can give very accurate results not only of integrated quantities like the flux $\Phi$ and leakage of particles out of the half space, but also of basic "raw" data like the extrapolated end point

$$ z_0 = \frac{c\nu_0}{4} \int_0^1 \frac{\nu}{N(\nu)} \left( 1 + \frac{c\nu^2}{1 - \nu^2} \right) \ln \frac{\nu_0 + \nu}{\nu_0 - \nu} \, d\nu \tag{67} $$

and of the X-function itself. Given the involved nature of the exact theory, it is our contention that the remarkable accuracy of these basic data, some of which is reproduced in Table 2, is due to the graphical convergence of the net of functions

$$ \phi_\varepsilon(\mu, \nu) \xrightarrow{G} \phi(\mu, \nu) $$

shown in Fig. 18; here $\varepsilon = 1/\pi N$ so that $\varepsilon \to 0$ as $N \to \infty$. By this convergence, the delta function and principal values in $[-1, 1]$ are the multifunctions $([-1, 0), 0) \cup (0, [0, \infty)) \cup ((0, 1], 0)$ and $\{1/x\}_{x \in [-1, 0)} \cup (0, (-\infty, \infty)) \cup \{1/x\}_{x \in (0, 1]}$ respectively. Tables 2 and 3, taken from [Sengupta, 1988] and [Sengupta, 1995], show respectively the extrapolated end point and X-function by the full-range adaption of the discretized spectral approximation for two different half-range problems denoted as Problems A and B defined as

[Figure 18, panels (a) and (b): (a) c = 0.3; (b) c = 0.9; the curve labeled N = 1000 is marked in each panel.]
Fig. 18. Rational function approximations φε (µ, ν) of the singular eigenfunction φ(µ, ν) at four different values of ν. N = 1000
denotes the “converged” multifunction φ, with the peaks at the specific ν-values chosen.
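
The quadrature nodes and the regularization parameter used with Fig. 18 can be generated without numerical tables. The sketch below is an illustration, not the author's code: it assumes that the nodes are the zeros of the Legendre polynomial $P_N$ shifted from $(-1, 1)$ to $(0, 1)$, an assumption that reproduces the $N = 6$ values of $\nu_i$ listed in Table 3 to four decimals, and it takes $\varepsilon = 1/\pi N$ from the text. The function names and the Newton iteration are illustrative choices.

```python
import math


def legendre_and_derivative(n, x):
    """Return P_n(x) and P_n'(x) for n >= 1 using the three-term recurrence."""
    p_prev, p = 1.0, x
    for k in range(2, n + 1):
        p_prev, p = p, ((2 * k - 1) * x * p - (k - 1) * p_prev) / k
    dp = n * (x * p - p_prev) / (x * x - 1.0)
    return p, dp


def legendre_zeros(n, tol=1e-14):
    """Zeros of P_n on (-1, 1), in increasing order, found by Newton iteration."""
    zeros = []
    for i in range(1, n + 1):
        x = math.cos(math.pi * (i - 0.25) / (n + 0.5))  # standard starting guess
        for _ in range(100):
            p, dp = legendre_and_derivative(n, x)
            step = p / dp
            x -= step
            if abs(step) < tol:
                break
        zeros.append(x)
    return sorted(zeros)


def nodes_on_unit_interval(n):
    """Legendre zeros shifted to (0, 1); the shift is an assumption of this sketch."""
    return [(1.0 + x) / 2.0 for x in legendre_zeros(n)]


def epsilon(n):
    """Regularization parameter eps = 1/(pi * N) quoted in the text."""
    return 1.0 / (math.pi * n)


if __name__ == "__main__":
    for n in (6, 10):
        nodes = ", ".join(f"{v:.4f}" for v in nodes_on_unit_interval(n))
        print(f"N = {n}: eps = {epsilon(n):.5f}, nodes = {nodes}")
```

With $N = 1000$, as in Fig. 18, this gives $\varepsilon \approx 3.2 \times 10^{-4}$, which indicates how narrow the peaks of $\phi_\varepsilon$ are at the "converged" stage.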


1
P roblem A Equation : µΦx + Φ = (c/2) −1 Φ(x, µ )dµ , x ≥ 0
Boundary condition : Φ(0, µ) = 0 for µ ≥ 0
Asymptotic condition : Φ → e−x/ν0 φ(µ, ν0 ) as x → ∞ .
1
P roblem B Equation : µΦx + Φ = (c/2) −1 Φ(x, µ )dµ , x≥0
Boundary condition : Φ(0, µ) = 1 for µ ≥ 0
Asymptotic condition : Φ → 0 as x → ∞ .
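The two numerical ingredients named above, the nodes at the zeros of Legendre polynomials and the smoothing parameter ε = 1/(πN), can be illustrated with a minimal sketch. It is not the author's code: the function names are hypothetical, the nodes are assumed to be the zeros of P_N mapped from [−1, 1] to [0, 1], and the Poisson and conjugate Poisson kernels are used only as generic stand-ins for the delta function and the principal value, since the exact form of φ_ε(µ, ν) is given in the cited papers and not reproduced here.

```python
import numpy as np

def half_range_nodes(N):
    """Zeros of the Legendre polynomial P_N, mapped from [-1, 1] to [0, 1]."""
    x, _ = np.polynomial.legendre.leggauss(N)  # nodes returned in ascending order
    return 0.5 * (x + 1.0)

def poisson(x, eps):
    """Poisson kernel: a smooth approximation of the Dirac delta."""
    return eps / (np.pi * (x**2 + eps**2))

def conjugate_poisson(x, eps):
    """Conjugate Poisson kernel: a smooth approximation of the principal value of 1/x."""
    return x / (x**2 + eps**2)

def kernel_summary(N):
    """Quantities showing how the kernels sharpen as eps = 1/(pi N) -> 0."""
    eps = 1.0 / (np.pi * N)
    return {
        "eps": eps,
        # closed-form integral of the Poisson kernel over [-1, 1]; tends to 1
        "mass": (2.0 / np.pi) * np.arctan(1.0 / eps),
        # height at x = 0 is 1/(pi eps) = N, so the graph fills (0, [0, inf)) in the limit
        "peak": poisson(0.0, eps),
        # extreme value 1/(2 eps) reached at x = eps; by oddness the values near 0 sweep (-inf, inf)
        "pv_extreme": conjugate_poisson(eps, eps),
        # away from 0 the kernel is close to 1/x (here 1/0.5 = 2)
        "pv_at_half": conjugate_poisson(0.5, eps),
    }

if __name__ == "__main__":
    print(np.round(half_range_nodes(6), 4))
    for N in (10, 100, 1000):
        print(N, kernel_summary(N))
```

For N = 6 the computed nodes agree with the ν_i column of Table 3 to the four digits quoted there, while the growing peak and the growing extreme value are the numerical counterparts of the vertical segments (0, [0, ∞)) and (0, (−∞, ∞)) of the limiting multifunctions.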

The full −1 ≤ µ ≤ 1 range form of the half 0 ≤ µ ≤ 1 range discretized spectral approximation replaces the exact integral boundary condition at x = 0 by a suitable quadrature sum over the values of ν taken at the zeros of Legendre polynomials; thus the condition at x = 0 can be expressed as

$$ \psi(\mu) = a(\nu_0)\phi(\mu,\nu_0) + \sum_{i=1}^{N} a(\nu_i)\phi_\varepsilon(\mu,\nu_i), \qquad \mu \in [0,1], \qquad (68) $$

where ψ(µ) = Φ(0, µ) is the specified incoming radiation incident on the boundary from the left, and the half-range coefficients a(ν_0), {a(ν)}_{ν∈[0,1]} are to be evaluated using the W-function of Appendix A.4. We now exploit the relative simplicity of the full-range calculations by replacing Eq. (68) by Eq. (69) following, where the coefficients {b(ν_i)}_{i=0}^N are used to distinguish the full-range coefficients from the half-range ones. The significance of this change lies in the overwhelming simplicity of the full-range weight function µ as compared to the half-range function W(µ), and the resulting simplicity of the orthogonality relations that follow, see Appendix A.4.

Table 2. Extrapolated end-point z_0.

                        cz_0
  c      N = 2      N = 6      N = 10     Exact
  0.2    0.78478    0.78478    0.78478    0.7851
  0.4    0.72996    0.72996    0.72996    0.7305
  0.6    0.71535    0.71536    0.71536    0.7155
  0.8    0.71124    0.71124    0.71124    0.7113
  0.9    0.71060    0.71060    0.71061    0.7106

The basic data of z_0 and X(−ν) are then completely generated self-consistently [Sengupta, 1988, 1995] by the discretized spectral approximation from the full-range adaption

$$ \sum_{i=0}^{N} b_i\,\phi_\varepsilon(\mu,\nu_i) = \psi_+(\mu) + \psi_-(\mu), \qquad \mu \in [-1,1],\ \nu_i \ge 0 \qquad (69) $$

of the discretized boundary condition Eq. (68), where ψ_+(µ) is by definition the incident flux ψ(µ) for µ ∈ [0, 1] and 0 if µ ∈ [−1, 0], while

$$ \psi_-(\mu) = \begin{cases} \displaystyle\sum_{i=0}^{N} b_i^-\,\phi_\varepsilon(\mu,\nu_i) & \text{if } \mu \in [-1,0],\ \nu_i \ge 0 \\ 0 & \text{if } \mu \in [0,1] \end{cases} $$

is the emergent angular distribution out of the medium. Equation (69) corresponds to the full-range µ ∈ [−1, 1], ν_i ≥ 0 form

$$ b(\nu_0)\phi(\mu,\nu_0) + \int_0^1 b(\nu)\phi(\mu,\nu)\,d\nu = \psi_+(\mu) + b^-(\nu_0)\phi(\mu,\nu_0) + \int_0^1 b^-(\nu)\phi(\mu,\nu)\,d\nu \qquad (70) $$

of boundary condition (A.59) with the first and second terms on the right having the same interpretation as for Eq. (69). This full-range simulation merely states that the solution (A.58) of Eq. (A.53) holds for all µ ∈ [−1, 1], x ≥ 0, although it was obtained, unlike in the regular full-range case, from the given radiation ψ(µ) incident on the boundary at x = 0 over only half the interval µ ∈ [0, 1]. To obtain the simulated full-range coefficients {b_i} and {b_i^-} of the half-range problem, we observe that there are effectively only half the number of coefficients as compared to a normal full-range problem because ν is now only over half the full interval. This allows us to generate two sets of equations from (70) by integrating with respect to µ ∈ [−1, 1] with ν in the half intervals [−1, 0] and [0, 1] to obtain the two sets of coefficients b^- and b, respectively. Accordingly we get from Eq. (69) with j = 0, 1, . . . , N the sets of equations

$$ (\psi, \phi_{j-})^{(+)}_\mu = -\sum_{i=0}^{N} b_i^-\,(\phi_{i+}, \phi_{j-})^{(-)}_\mu $$

$$ b_j = \frac{1}{N_j}\left[ (\psi, \phi_{j+})^{(+)}_\mu + \sum_{i=0}^{N} b_i^-\,(\phi_{i+}, \phi_{j+})^{(-)}_\mu \right] \qquad (71) $$

where (φ_{j±})_{j=1}^N represents (φ_ε(µ, ±ν_j))_{j=1}^N, φ_{0±} = φ(µ, ±ν_0), the (+) and (−) superscripts are used to denote the integrations with respect to µ ∈ [0, 1] and µ ∈ [−1, 0] respectively, and (f, g)_µ denotes the usual inner product in [−1, 1] with respect to the full-range weight µ. While the first set of N + 1 equations gives b_i^-, the second set produces the required b_j from these "negative" coefficients. By equating these calculated b_i with the exact half-range expressions for a(ν) with respect to W(µ) as outlined in Appendix A.4, it is possible to find numerical values of z_0 and X(−ν). Thus from the second of Eq. (A.64), {X(−ν_i)}_{i=1}^N is obtained with b_{iB} = a_{iB}, i = 1, . . . , N, which is then substituted in the second of Eq. (A.63) with X(−ν_0) obtained from a_A(ν_0) according to Appendix A.4, to compare the respective a_{iA} with the calculated b_{iA} from (71). Finally the full-range coefficients of Problem A can be used to obtain the X(−ν) values from the second of Eqs. (A.63) and compared with the exact tabulated values as in Table 3. The tabulated values of cz_0 from Eq. (67) show a consistent deviation from our calculations of Problem A according to a_A(ν_0) = −exp(−2z_0/ν_0). Since the X(−ν) values of Problem A in Table 3 also need the same b_{0A} as input that was used in obtaining z_0, it is reasonable to conclude that the "exact" numerical integration of z_0 is inaccurate to the extent displayed in Table 2.

Table 3. X(−ν) by the full-range method.

                             X(−ν)
  c     N     ν_i       Problem A    Problem B    Exact
  0.2   2     0.2133    0.8873091    0.8873091    0.887308
              0.7887    0.5826001    0.5826001    0.582500
  0.6   6     0.0338    1.3370163    1.3370163    1.337015
              0.1694    1.0999831    1.0999831    1.099983
              0.3807    0.8792321    0.8792321    0.879232
              0.6193    0.7215240    0.7215240    0.721524
              0.8306    0.6239109    0.6239109    0.623911
              0.9662    0.5743556    0.5743556    0.574355
  0.9   10    0.0130    1.5971784    1.5971784    1.597163
              0.0674    1.4245314    1.4245314    1.424532
              0.1603    1.2289940    1.2289940    1.228995
              0.2833    1.0513750    1.0513750    1.051376
              0.4255    0.9058140    0.9058410    0.905842
              0.5744    0.7934295    0.7934295    0.793430
              0.7167    0.7102823    0.7102823    0.710283
              0.8397    0.6516836    0.6516836    0.651683
              0.9325    0.6136514    0.6136514    0.613653
              0.9870    0.5933988    0.5933988    0.593399

From these numerical experiments and Fig. 18 we may conclude that the continuous spectrum [−1, 1] of the position operator µ acts as the D_+ points in generating the multifunctional Case singular eigenfunction φ(µ, ν). Its rational approximation φ_ε(µ, ν), in the context of the simple simulated full-range computations of the complex half-range exact theory of Appendix A.4, clearly demonstrates the utility of graphical convergence of a sequence of functions to a multifunction. The totality of the multifunctions φ(µ, ν) for all ν in Figs. 18(c) and 18(d) endows the problem with the character of maximal ill-posedness that is characteristic of chaos. This chaotic signature of the transport equation is however latent, as the experimental output Φ(x, µ) is well-behaved and regular. This important example shows how nature can use hidden and complex chaotic substates to generate order through a process of superposition.

6. Does Nature Support Complexity?

The question of this section is basic in the light of the theory of chaos presented above, as it may be reformulated as the inquiry of what makes nature support chaoticity in the form of increasing non-injectivity of an input–output system. It is the purpose of this section to exploit the connection between spectral theory and the dynamics of chaos that has been presented in the previous section.
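The phrase "increasing non-injectivity" can be made concrete with a small numerical sketch. The tent map used here is an illustrative assumption chosen for simplicity, not an example taken from this section, and the helper names are hypothetical. The code counts how many inputs x in [0, 1] are sent to the same output y by the n-th iterate of the map, by counting sign changes of t^n(x) − y on a fine grid.

```python
import numpy as np

def tent(x):
    """Tent map on [0, 1]; chosen only as a simple two-to-one example."""
    return 1.0 - np.abs(2.0 * x - 1.0)

def count_preimages(n, y, grid_points=200_001):
    """Count solutions x in [0, 1] of tent^n(x) = y via sign changes on a grid.

    Valid while the grid spacing is much smaller than 2**(-n).
    """
    x = np.linspace(0.0, 1.0, grid_points)
    fx = x
    for _ in range(n):
        fx = tent(fx)
    d = fx - y
    # each simple root of tent^n(x) - y shows up as one sign change
    return int(np.sum(d[:-1] * d[1:] < 0))

if __name__ == "__main__":
    y = 1.0 / np.pi  # a generic output value that does not fall on a grid point
    for n in range(1, 7):
        print(n, count_preimages(n, y))
```

Each iteration doubles the number of inputs that lead to the same output, so the inverse of the iterate becomes set-valued on an ever finer scale; this is the sense in which the input–output relation grows more non-injective.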


Since linear operators on finite dimensional spaces do not possess continuous or residual spectra, spectral theory on infinite dimensional spaces essentially involves the limiting behavior to infinite dimensions of the familiar matrix eigenvalue–eigenvector problem. As always this means extensions, dense embeddings and completions of the finite dimensional problem that show up as generalized eigenvalues and eigenvectors. In its usual form, the goal of nonlinear spectral theory consists [Appell et al., 2000] in the study of T_λ^{-1} for nonlinear operators T_λ that satisfy more general continuity conditions, like differentiability and Lipschitz continuity, than the simple boundedness that is sufficient for linear operators. The following generalization of the concept of the spectrum of a linear operator to the nonlinear case is suggestive. For a nonlinear map, λ need not appear only in a multiplying role, so that an eigenvalue equation can be written more generally as a fixed-point equation

$$ f(\lambda; x) = x $$

with a fixed point corresponding to the eigenfunction of a linear operator and an "eigenvalue" being the value of λ for which this fixed point appears. The correspondence of the residual and continuous parts of the spectrum is, however, less trivial than for the point spectrum. This is seen from the following two examples [Roman, 1975]. Let Ae_k = λ_k e_k, k = 1, 2, . . . be an eigenvalue equation with e_j being the jth unit vector. Then (A − λ)e_k := (λ_k − λ)e_k = 0 iff λ = λ_k, so that {λ_k}_{k=1}^∞ ∈ Pσ(A) are the only eigenvalues of A. Consider now (λ_k)_{k=1}^∞ to be a sequence of real numbers that tends to a finite λ*; for example, let A be a diagonal matrix having 1/k as its diagonal entries. Then λ* belongs to the continuous spectrum of A because (A − λ*)e_k = (λ_k − λ*)e_k with λ_k → λ* implies that (A − λ*)^{-1} is an unbounded linear operator and λ* a generalized eigenvalue of A. In the second example Ae_k = e_{k+1}/(k + 1), it is not difficult to verify that: (a) the point spectrum of A is empty, (b) the range of A is not dense because it does not contain e_1, and (c) A^{-1} is unbounded because Ae_k → 0. Thus the generalized eigenvalue λ* = 0 in this case belongs to the residual spectrum of A. In either case, lim_{j→∞} e_j is the corresponding generalized eigenvector that enlarges the trivial null space N(L_{λ*}) of the generalized eigenvalue λ*. In fact in these two and the Dirac delta example of Sec. 5 of continuous and residual spectra, the generalized eigenfunctions arise as the limits of a sequence of functions whose images under the respective L_λ converge to 0; recall the definition of footnote 26. This observation generalizes to the dense extension Multi_|(X, Y) of Map(X, Y) as follows. If x ∈ D_+ is not a fixed point of f(λ; x) = x, but there is some n ∈ N such that f^n(λ; x) = x, then the limit n → ∞ generates a multifunction at x as was the case with the delta function in the previous section and the various other examples that we have seen so far in the earlier sections.

One of the main goals of investigations on the spectrum of nonlinear operators is to find a set in the complex plane that has the usual desirable properties of the spectrum of a linear operator [Appell et al., 2000]. In this case, the focus has been to find a suitable class of operators C(X) with T ∈ C(X), such that the resolvent set is expressed as

$$ \rho(T) = \{\lambda \in \mathbb{C} : (T_\lambda \text{ is } 1:1) \text{ and } (\mathrm{Cl}(R(T_\lambda)) = X) \text{ and } (T_\lambda^{-1} \in C(X) \text{ on } R(T_\lambda))\} $$

with the spectrum σ(T) being defined as the complement of this set. Among the classes C(X) that have been considered, beside spaces of continuous functions C(X), are linear boundedness B(X), Frechet differentiability C^1(X), Lipschitz continuity Lip(X), and Granas quasiboundedness Q(X), where Lip(X) specifically takes into account the nonlinearity of T to define

$$ \|T\|_{\mathrm{Lip}} = \sup_{x \neq y} \frac{\|T(x) - T(y)\|}{\|x - y\|}, \qquad |T|_{\mathrm{lip}} = \inf_{x \neq y} \frac{\|T(x) - T(y)\|}{\|x - y\|} \qquad (72) $$

that are plain generalizations of the corresponding norms of linear operators. Plots of f_λ^-(y) = {x ∈ D(f − λ) : (f − λ)x = y} for the functions f : R → R

$$ f_{\lambda a}(x) = \begin{cases} -1 - \lambda x, & x < -1 \\ (1-\lambda)x, & -1 \le x \le 1 \\ 1 - \lambda x, & 1 < x \end{cases} $$

$$ f_{\lambda b}(x) = \begin{cases} -\lambda x, & x < 1 \\ (1-\lambda)x - 1, & 1 \le x \le 2 \\ 1 - \lambda x, & 2 < x \end{cases} $$

$$ f_{\lambda c}(x) = \begin{cases} -\lambda x, & x < 1 \\ \sqrt{x-1} - \lambda x, & 1 \le x \end{cases} $$

$$ f_{\lambda d}(x) = \begin{cases} (x-1)^2 + 1 - \lambda x, & 1 \le x \le 2 \\ (1-\lambda)x, & \text{otherwise} \end{cases} $$

$$ f_{\lambda e}(x) = \tan^{-1}(x) - \lambda x $$

$$ f_{\lambda f}(x) = \begin{cases} 1 - 2\sqrt{-x} - \lambda x, & x < -1 \\ (1-\lambda)x, & -1 \le x \le 1 \\ 2\sqrt{x} - 1 - \lambda x, & 1 < x \end{cases} $$

taken from [Appell et al., 2000] are shown in Fig. 19. It is easy to verify that the Lipschitz and linear upper and lower bounds of these maps are as in Table 4.

Fig. 19. Inverses of f_λ = f − λ, panels (a)–(f). The λ-values are shown on the graphs.

The point spectrum defined by

$$ P\sigma(f) = \{\lambda \in \mathbb{C} : (f - \lambda)x = 0 \text{ for some } x \neq 0\} $$

is the simplest to calculate. Because of the special role played by the zero element 0 in generating the point spectrum in the linear case, the bounds m‖x‖ ≤ ‖Lx‖ ≤ M‖x‖ together with Lx = λx imply Cl(Pσ(L)) = [‖L‖_b, ‖L‖_B] (where the subscripts denote the lower and upper bounds in Eq. (72), and which sometimes is taken to be a descriptor of the point spectrum of a nonlinear operator), as can be seen in Table 5 and verified from Fig. 19. The remainder of the spectrum, as the complement of the resolvent set, is more difficult to find. Here the convenient characterization of the resolvent of a continuous linear operator as the set of all sufficiently large λ that satisfy |λ| > M is of little significance as, unlike for a linear operator, the non-existence of an inverse is not just due to the set {f^{-1}(0)}, which happens to be the only way a linear map can fail to be injective. Thus the map defined piecewise as α + 2(1 − α)x for 0 ≤ x < 1/2 and 2(1 − x) for 1/2 ≤ x ≤ 1, with 0 < α < 1, is not invertible on its range although {f^-(0)} = 1. Comparing Fig. 19 and Table 4, it is seen that in cases (b)–(d), the intervals [|f|_b, ‖f‖_B] are subsets of the λ-values for which the respective maps are not injective; this is to be compared with (a), (e) and (f) where the two sets are the same. Thus the linear bounds are not good indicators of the uniqueness properties of solutions of nonlinear equations, for which the Lipschitzian bounds are seen to be more appropriate.
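The distinction between the linear and the Lipschitz bounds can be checked numerically. The following is a minimal sketch and not part of the original computations: it takes f_a and f_b to be the λ = 0 members of the families defined above, approximates the linear bounds by the extreme values of |f(x)|/|x| on a finite grid, and approximates the Lipschitz bounds by the extreme difference quotients between adjacent grid points. The grid window and step are arbitrary assumptions.

```python
import numpy as np

def f_a(x):
    """f_a: -1 for x < -1, x on [-1, 1], 1 for x > 1."""
    return np.clip(x, -1.0, 1.0)

def f_b(x):
    """f_b: 0 for x < 1, x - 1 on [1, 2], 1 for x > 2."""
    return np.clip(x - 1.0, 0.0, 1.0)

def bounds(f, half_width=100.0, step=0.01):
    """Grid estimates of the linear and Lipschitz lower/upper bounds of f."""
    n = int(round(2 * half_width / step)) + 1
    x = np.linspace(-half_width, half_width, n)
    y = f(x)
    mask = np.abs(x) > 1e-9                  # exclude x = 0 from the ratio |f(x)|/|x|
    ratio = np.abs(y[mask]) / np.abs(x[mask])
    slope = np.abs(np.diff(y) / np.diff(x))  # adjacent difference quotients
    return {
        "lin_lower": ratio.min(), "lin_upper": ratio.max(),
        "lip_lower": slope.min(), "lip_upper": slope.max(),
    }

if __name__ == "__main__":
    for name, f in (("f_a", f_a), ("f_b", f_b)):
        print(name, bounds(f))
```

For f_b the estimates reproduce the row of Table 4: the linear bounds are 0 and 1/2, the Lipschitz bounds 0 and 1. For f_a the lower linear bound on a finite window equals the reciprocal of the window half-width and only tends to the tabulated value 0 as the window grows.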


Table 4. Bounds on the functions of Fig. 19.

  Function    |f|_b        ‖f‖_B    |f|_lip    ‖f‖_Lip
  f_a         0            1        0          1
  f_b         0            1/2      0          1
  f_c         0            1/2      0          ∞
  f_d         2(√2 − 1)    ∞        0          2
  f_e         0            1        0          1
  f_f         0            1        0          1

Table 5. Lipschitzian and point spectra of the functions of Fig. 19.

  Function    σ_Lip(f)    Pσ(f)
  f_a         [0, 1]      (0, 1]
  f_b         [0, 1]      [0, 1/2]
  f_c         [0, ∞)      [0, 1/2]
  f_d         [0, 2]      [2(√2 − 1), 1]
  f_e         [0, 1]      (0, 1)
  f_f         [0, 1]      (0, 1)

In view of the above, we may draw the following conclusions. If we choose to work in the space of multifunctions Multi(X, T), with T the topology of pointwise biconvergence, when all functional relations are (multi)invertible on their ranges, we may make the following definition for the net of functions f(λ; x) satisfying f(λ; x) = x.

Definition 6.1. Let f(λ; ·) ∈ Multi(X, T) be a function. The resolvent set of f is given by

$$ \rho(f) = \{\lambda : (f(\lambda;\cdot)^{-1} \in \mathrm{Map}(X, T)) \wedge (\mathrm{Cl}(R(f(\lambda;\cdot))) = X)\}, $$

and any λ not in ρ is in the spectrum of f.

Thus apart from multifunctions, λ ∈ σ(f) also generates functions on the boundary of functional and non-functional relations in Multi(X, T). While it is possible to classify the spectrum into point, continuous and residual subsets, as in the linear case, it is more meaningful for nonlinear operators to consider λ as being either in the boundary spectrum Bdy(σ(f)) or in the interior spectrum Int(σ(f)), depending on whether or not the multifunction f(λ; ·)^- arises as the graphical limit of a net of functions in either ρ(f) or Rσ(f). This is suggested by the spectra arising from the second row of Table 1 (injective L_λ and discontinuous L_λ^{-1}) that lies sandwiched in the λ-plane between the two components arising from the first and third rows, see [Naylor & Sell, 1971, Sec. 6.6], for example. According to this simple scheme, the spectral set is a closed set with its boundary and interior belonging to Bdy(σ(f)) and Int(σ(f)), respectively. Table 6 shows this division for the examples in Fig. 19. Because 0 is no more significant than any other point in the domain of a nonlinear map in inducing non-injectivity, the division of the spectrum into the traditional sets would be as shown in Table 6; compare also with the conventional linear point spectrum of Table 5. In this nonlinear classification, the point spectrum consists of any λ for which the inverse f(λ; ·)^- is set-valued, irrespective of whether this is produced at 0 or not, while the continuous and residual spectra together comprise the boundary spectrum. Thus a λ can be both at the point and the continuous or residual spectra, which need not be disjoint. The continuous and residual spectra are included in the boundary spectrum, which may also contain parts of the point spectrum.

Table 6. Nonlinear spectra of functions of Fig. 19. Compare the present point spectra with the usual linear spectra of Table 5.

  Function    Int(σ(f))    Bdy(σ(f))    Pσ(f)      Cσ(f)     Rσ(f)
  f_a         (0, 1)       {0, 1}       [0, 1]     {1}       {0}
  f_b         (0, 1)       {0, 1}       [0, 1]     {1}       {0}
  f_c         (0, ∞)       {0}          [0, ∞)     {0}       ∅
  f_d         (0, 2)       {0, 2}       (0, 2)     {0, 2}    ∅
  f_e         (0, 1)       {0, 1}       (0, 1)     {1}       {0}
  f_f         (0, 1)       {0, 1}       (0, 1)     {0, 1}    ∅

Example 6.1. To see how these concepts apply to linear mappings, consider the equation (D − λ)y(x) = r(x) where D = d/dx is the differential

operator on L^2[0, ∞), and let λ be real. For λ ≠ 0, the unique solution of this equation in L^2[0, ∞) is

$$ y(x) = \begin{cases} e^{\lambda x}\left( y(0) + \displaystyle\int_0^x e^{-\lambda x'} r(x')\,dx' \right), & \lambda < 0 \\[2ex] e^{\lambda x}\left( y(0) - \displaystyle\int_x^\infty e^{-\lambda x'} r(x')\,dx' \right), & \lambda > 0 \end{cases} $$

showing that for λ > 0 the inverse is functional so that λ ∈ (0, ∞) belongs to the resolvent of D. However, when λ < 0, apart from the y = 0 solution (since we are dealing with a linear problem, only r = 0 is to be considered), e^{λx} is also in L^2[0, ∞) so that all such λ are in the point spectrum of D. For λ = 0 and r ≠ 0, the two solutions are not necessarily equal unless ∫_0^∞ r(x) dx = 0, so that the range R(D − I) is a subspace of L^2[0, ∞). To complete the problem, it is possible to show [Naylor & Sell, 1971] that 0 ∈ Cσ(D), see Example 2.2; hence the continuous spectrum forms at the boundary of the functional solution for the resolvent-λ and the multifunctional solution for the point spectrum. With a slight variation of the problem to y(0) = 0, all λ < 0 are in the resolvent set, while for λ > 0 the inverse is bounded but must satisfy y(0) = ∫_0^∞ e^{−λx} r(x) dx = 0 so that Cl(R(D − λ)) ≠ L^2[0, ∞). Hence λ > 0 belong to the residual spectrum. The decomposition of the complex λ-plane for these and some other linear spectral problems taken from [Naylor & Sell, 1971] is shown in Fig. 20. In all cases, the spectrum due to the second row of Table 1 acts as a boundary between that arising from the first and third rows, which justifies our division of the spectrum for a nonlinear operator into the interior and boundary components. Compare with Example 2.2.

Fig. 20. Spectra of some linear operators in the complex λ-plane. (a) Left shift (. . . , x_{−1}, x_0, x_1, . . .) → (. . . , x_0, x_1, x_2, . . .) on l^2(−∞, ∞), (b) right shift (x_0, x_1, x_2, . . .) → (0, x_0, x_1, . . .) on l^2[0, ∞), (c) left shift (x_0, x_1, x_2, . . .) → (x_1, x_2, x_3, . . .) on l^2[0, ∞) of sequence spaces, (d) d/dx on L^2(−∞, ∞), (e) d/dx on L^2[0, ∞) with y(0) = 0, and (f) d/dx on L^2[0, ∞). The residual spectrum in (b) and (e) arises from block (3–3) in Table 1, i.e. L_λ is one-to-one and L_λ^{-1} is bounded on non-dense domains in l^2[0, ∞) and L^2[0, ∞), respectively. The continuous spectrum therefore marks the boundary between two functional states, as in (a) and (e), now with dense and non-dense domains of the inverse operator.

From the basic representation of the resolvent operator (1 − f)^{-1}

$$ 1 + f + f^2 + \cdots + f^i + \cdots $$

in Multi(X), if the iterates of f converge to a multifunction for some λ, then that λ must be in the spectrum of f, which means that the control parameter of a chaotic dynamical system is in its spectrum. Of course, the series can sum to a multi even otherwise: take f_λ(x) to be identically x with λ = 1, for example, to get 1 ∈ Pσ(f). A comparison of Tables 1 and 5 reveals that in case (d), for example, 0 and 2 belong to the Lipschitz spectrum because although f_d^{-1} is not Lipschitz continuous, ‖f‖_Lip = 2. It should also be noted that the boundary between the functional resolvent and multifunctional spectral set is formed by the graphical convergence of a net of resolvent functions, while the multifunctions in the interior of the spectral set evolve graphically independent of the functions in the resolvent. The chaotic states forming the boundary of the functional and multifunctional subsets of Multi(X) mark the transition from the less efficient functional state to the more efficient multifunctional one.

These arguments also suggest the following. The countably many outputs arising from the non-injectivity of f(λ; ·) corresponding to a given input can be interpreted to define complexity because in a nonlinear system each of these possibilities constitutes an experimental result in itself that may not be combined in any definite predetermined manner. This is in sharp contrast to linear systems where a linear combination, governed by the initial conditions, always generates a unique end result; recall also the combination offered by the singular generalized eigenfunctions of neutron transport theory. This multiplicity of possibilities that have no definite combinatorial property is the basis of the diversity of nature, and is possibly responsible for Feigenbaum's "historical prejudice" [Feigenbaum, 1992], see Prelude 2. Thus order represented by the functional resolvent passes over to complexity of the countably multifunctional interior spectrum via the uncountably multifunctional boundary that is a prerequisite for chaos. We may now strengthen our hypothesis offered at the end of the previous section in terms of the examples of Figs. 19 and 20, that nature uses chaoticity as an intermediate step to the attainment of states that would otherwise be inaccessible to it. Well-posedness of a system is an extremely inefficient way of expressing a multitude of possibilities as this requires a different input for every possible output. Nature chooses to express its myriad manifestations through the multifunctional route leading either to averaging as in the delta function case or to a countable set of well-defined states, as in the examples of Fig. 19 corresponding to the interior spectrum. Of course it is no distraction that the multifunctional states arise respectively from f_λ and f_λ^- in these examples as f is a function on X that is under the influence of both f and its inverse. The functional resolvent is, for all practical purposes, only a tool in this structure of nature.

The equation f(x) = y is typically an input–output system in which the inverse images at a functional value y_0 represent a set of input parameters leading to the same experimental output y_0; this

is stability characterized by a complete insensitivity of the output to changes in input. On the other hand, a continuous multifunction at x_0 is a signal for a hypersensitivity to input because the output, which is a definite experimental quantity, is a choice from the possibly infinite set {f(x_0)} made by a choice function which represents the experiment at that particular point in time. Since there will always be finite differences in the experimental parameters when an experiment is repeated, the choice function (that is the experimental output) will select a point from {f(x_0)} that is representative of that experiment and which need not bear any definite relation to the previous values; this is instability and signals sensitivity to initial conditions. Such a state is of high entropy as the number of available states f_C({f(x_0)}), where f_C is the choice function, is larger than a functional state represented by the singleton {f(x_0)}.

Epilogue

The most passionate advocates of the new science go so far as to say that twentieth-century science will be remembered for just three things: relativity, quantum mechanics and chaos. Chaos, they contend, has become the century's third great revolution in the physical sciences. Like the first two revolutions, chaos cuts away at the tenets of Newton's physics. As one physicist put it: "Relativity eliminated the Newtonian illusion of absolute space and time; quantum theory eliminated the Newtonian dream of a controllable measurement process; and chaos eliminates the Laplacian fantasy of deterministic predictability." Of the three, the revolution in chaos applies to the

universe we see and touch, to objects at human scale. . . . There has long been a feeling, not always expressed openly, that theoretical physics has strayed far from human intuition about the world. Whether this will prove to be fruitful heresy, or just plain heresy, no one knows. But some of those who thought that physics might be working its way into a corner now look to chaos as a way out.
[Gleick, 1987]

Acknowledgments

It is a pleasure to thank the referees for recommending an enlarged Tutorial and Review revision of the original submission Graphical Convergence, Chaos and Complexity, and Professor Leon O. Chua for suggesting a pedagogically self-contained, jargonless version accessible to a wider audience for the present form of the paper. Financial assistance during the initial stages of this work from the National Board for Higher Mathematics is also acknowledged.

References

Alligood, K. T., Sauer, T. D. & Yorke, J. A. [1997] Chaos, An Introduction to Dynamical Systems (Springer-Verlag, NY).
Appell, J., DePascale, E. & Vignoli, A. [2000] "A comparison of different spectra for nonlinear operators," Nonlin. Anal. 40, 73–90.
Brown, R. & Chua, L. O. [1996] "Clarifying chaos: Examples and counterexamples," Int. J. Bifurcation and Chaos 6, 219–249.
Campbell, S. I. & Mayer, C. D. [1979] Generalized Inverses of Linear Transformations (Pitman Publishing Ltd., London).
Case, K. M. & Zweifel, P. F. [1967] Linear Transport Theory (Addison-Wesley, MA).
de Souza, H. G. [1997] "Opening address," in The Impact of Chaos on Science and Society, eds. Grebogi, C. & Yorke, J. A. (United Nations University Press, Tokyo), pp. 384–386.
Devaney, R. L. [1989] An Introduction to Chaotic Dynamical Systems (Addison-Wesley, CA).
Falconer, K. [1990] Fractal Geometry (John Wiley, Chichester).
Feigenbaum, M. [1992] "Foreword," Chaos and Fractals: New Frontiers of Science (Springer-Verlag, NY), pp. 1–7.
Gallagher, R. & Appenzeller, T. [1999] "Beyond reductionism," Science 284, p. 79.
Gleick, J. [1987] Chaos: The Amazing Science of the Unpredictable (Viking, NY).
Goldenfeld, N. & Kadanoff, L. P. [1999] "Simple lessons from complexity," Science 284, 87–89.
Korevaar, J. [1968] Mathematical Methods, Vol. 1 (Academic Press, NY).
Naylor, A. W. & Sell, G. R. [1971] Linear Operator Theory in Engineering and Science (Holt, Rinehart and Winston, NY).
Peitgen, H.-O., Jurgens, H. & Saupe, D. [1992] Chaos and Fractals: New Frontiers of Science (Springer-Verlag, NY).
Robinson, C. [1999] Dynamical Systems: Stability, Symbolic Dynamics and Chaos (CRC Press LLC, Boca Raton).
Roman, P. [1975] Some Modern Mathematics for Physicists and other Outsiders (Pergamon Press, NY).
Sengupta, A. [1995a] "A discretized spectral approximation in neutron transport theory. Some numerical considerations," J. Stat. Phys. 51, 657–676.
Sengupta, A. [1995b] "Full range solution of half-space neutron transport problem," ZAMP 46, 40–60.
Sengupta, A. [1997] "Multifunction and generalized inverse," J. Inverse and Ill-Posed Problems 5, 265–285.
Sengupta, A. & Ray, G. G. [2000] "A multifunctional extension of function spaces: Chaotic systems are maximally ill-posed," J. Inverse and Ill-Posed Problems 8, 232–353.
Stuart, A. M. & Humphries, A. R. [1996] Dynamical Systems and Numerical Analysis (Cambridge University Press).
Tikhonov, A. N. & Arsenin, V. Y. [1977] Solutions of Ill-Posed Problems (V. H. Winston, Washington D.C.).
Waldrop, M. M. [1992] Complexity: The Emerging Science at the Edge of Order and Chaos (Simon and Schuster).
Willard, S. [1970] General Topology (Addison-Wesley, Reading, MA).
Williams, M. M. R. [1971] Mathematical Methods of Particle Transport Theory (Butterworths, London).

Appendix

This Appendix gives a brief overview of some aspects of topology that are necessary for a proper understanding of the concepts introduced in this work.

A.1. Convergence in Topological Spaces: Sequence, Net and Filter

In the theory of convergence in topological spaces, countability plays an important role. To understand the significance of this concept, some preliminaries are needed.


The notion of a basis, or base, is a familiar one in analysis: a base is a subcollection of a set which may be used to construct, in a specified manner, any element of the set. This simplifies the statement of a problem since a smaller number of elements of the base can be used to generate the larger class of every element of the set. This philosophy finds application in topological spaces as follows.

Among the three properties (N1)–(N3) of the neighborhood system Nx of Tutorial 4, (N1) and (N2) are basic in the sense that the resulting subcollection of Nx can be used to generate the full system by applying (N3); this basic neighborhood system, or neighborhood (local) base Bx at x, is characterized by

(NB1) x belongs to each member B of Bx.

(NB2) The intersection of any two members of Bx contains another member of Bx: B1, B2 ∈ Bx ⇒ (∃B ∈ Bx : B ⊆ B1 ∩ B2).

Formally, compare Eq. (18),

Definition A.1.1. A neighborhood (local) base Bx at x in a topological space (X, 𝒰) is a subcollection of the neighborhood system Nx having the property that each N ∈ Nx contains some member of Bx. Thus

Bx ≝ {B ∈ Nx : x ∈ B ⊆ N for each N ∈ Nx}    (A.1)

determines the full neighborhood system

Nx = {N ⊆ X : x ∈ B ⊆ N for some B ∈ Bx}    (A.2)

reciprocally as all supersets of the basic elements.

The entire neighborhood system Nx, which is recovered from the base by forming all supersets of the basic neighborhoods, is trivially a local base at x; non-trivial examples are given below.

The second example of a base, consisting as usual of a subcollection of a given collection, is the topological base TB that allows the specification of the topology on a set X in terms of a smaller collection of open sets.

Definition A.1.2. A base TB in a topological space (X, 𝒰) is a subcollection of the topology 𝒰 having the property that each U ∈ 𝒰 contains some member of TB. Thus

TB ≝ {B ∈ 𝒰 : B ⊆ U for each U ∈ 𝒰}    (A.3)

determines reciprocally the topology 𝒰 as

𝒰 = {U ⊆ X : U = ⋃_{B ∈ TB} B}.    (A.4)

This means that the topology on X can be reconstructed from the base by taking all possible unions of members of the base, and a collection of subsets of a set X is a topological base iff Eq. (A.4) of arbitrary unions of elements of TB generates a topology on X. This topology, which is the coarsest (that is the smallest) that contains TB, is obviously closed under finite intersections. Since the open set Int(N) is a neighborhood of x whenever N is, Eq. (A.2) and the definition Eq. (17) of Nx implies that the open neighborhood system of any point in a topological space is an example of a neighborhood base at that point, an observation that has often led, together with Eq. (A.3), to the use of the term “neighborhood” as a synonym for “non-empty open set”. The distinction between the two however is significant as neighborhoods need not necessarily be open sets; thus while not necessary, it is clearly sufficient for the local basic sets B to be open in Eqs. (A.1) and (A.2). If Eq. (A.2) holds for every x ∈ N, then the resulting Nx reduces to the topology induced by the open basic neighborhood system Bx as given by Eq. (18).

In order to check if a collection of subsets TB of X qualifies to be a basis, it is not necessary to verify properties (T1)–(T3) of Tutorial 4 for the class (A.4) generated by it because of the properties (TB1) and (TB2) below whose strong affinity to (NB1) and (NB2) is formalized in Theorem A.1.1.

Theorem A.1.1. A collection TB of subsets of X is a topological basis on X iff

(TB1) X = ⋃_{B ∈ TB} B. Thus each x ∈ X must belong to some B ∈ TB which implies the existence of a local base at each point x ∈ X.

(TB2) The intersection of any two members B1 and B2 of TB with x ∈ B1 ∩ B2 contains another member of TB: (B1, B2 ∈ TB) ∧ (x ∈ B1 ∩ B2) ⇒ (∃B ∈ TB : x ∈ B ⊆ B1 ∩ B2).

This theorem, together with Eq. (A.4) ensures that a given collection of subsets of a set X satisfying (TB1) and (TB2) induces some topology on X; compared to this is the result that any collection of subsets of a set X is a subbasis for some topology on X. If X, however, already has a topology 𝒰 imposed on it, then Eq. (A.3)


must also be satisfied in order that the topology generated by TB is indeed 𝒰. The next theorem connects the two types of bases of Definitions A.1.1 and A.1.2 by asserting that although a local base of a space need not consist of open sets and a topological base need not have any reference to a point of X, any subcollection of the base containing a point is a local base at that point.

Theorem A.1.2. A collection of open sets TB is a base for a topological space (X, 𝒰) iff for each x ∈ X, the subcollection

Bx = {B ∈ 𝒰 : x ∈ B ∈ TB}    (A.5)

of basic sets containing x is a local base at x.

Proof. Necessity. Let TB be a base of (X, 𝒰) and N be a neighborhood of x, so that x ∈ U ⊆ N for some open set U = ⋃_{B ∈ TB} B and basic open sets B. Hence x ∈ B ⊆ N shows, from Eq. (A.1), that B ∈ Bx is a local basic set at x.

Sufficiency. If U is an open set of X containing x, then the definition of local base Eq. (A.1) requires x ∈ B(x) ⊆ U for some basic set B(x) in Bx; hence U = ⋃_{x ∈ U} B(x). By Eq. (A.4) therefore, TB is a topological base for X.

Because the basic sets are open, (TB2) of Theorem A.1.1 leads to the following physically appealing paraphrase of Theorem A.1.2.

Corollary. A collection TB of open sets of (X, 𝒰) is a topological base that generates 𝒰 iff for each open set U of X and each x ∈ U there is an open set B ∈ TB such that x ∈ B ⊆ U; that is iff

x ∈ U ∈ 𝒰 ⇒ (∃B ∈ TB : x ∈ B ⊆ U).

Example A.1.1. Some examples of local bases in R are intervals of the type (x − ε, x + ε), [x − ε, x + ε] for real ε, (x − q, x + q) for rational q, (x − 1/n, x + 1/n) for n ∈ Z+, while for a metrizable space with the topology induced by a metric d, each of the following is a local base at x ∈ X: Bε(x; d) := {y ∈ X : d(x, y) < ε} and Dε(x; d) := {y ∈ X : d(x, y) ≤ ε} for ε > 0, Bq(x; d) for Q ∋ q > 0 and B1/n(x; d) for n ∈ Z+. In R2, two neighborhood bases at any x ∈ R2 are the disks centered at x and the set of all squares at x with sides parallel to the axes. Although these bases have no elements in common, they are nevertheless equivalent in the sense that they both generate the same (usual) topology in R2. Of course, the entire neighborhood system at any point of a topological space is itself a (less useful) local base at that point. By Theorem A.1.2, Bε(x; d), Dε(x; d), ε > 0, Bq(x; d), Q ∋ q > 0 and B1/n(x; d), n ∈ Z+, for all x ∈ X are examples of bases in a metrizable space with topology induced by a metric d.

In terms of local bases and bases, it is now possible to formulate the notions of first and second countability as follows.

Definition A.1.3. A topological space is first countable if each x ∈ X has some countable neighborhood base, and is second countable if it has a countable base.

Every metrizable space (X, d) is first countable as both {B(x, q)}_{Q ∋ q > 0} and {B(x, 1/n)}_{n ∈ Z+} are examples of countable neighborhood bases at any x ∈ (X, d); hence Rn is first countable. It should be clear that every second countable space is first countable and that a countable first countable space is always second countable; a common example of an uncountable first countable space that is also second countable is provided by Rn. Metrizable spaces need not be second countable: any uncountable set having the discrete topology is an example.

Example A.1.2. The following is an important example of a space that is not first countable as it is needed for our pointwise biconvergence of Sec. 3. Let Map(X, Y) be the set of all functions between the uncountable spaces (X, 𝒰) and (Y, 𝒱). Given any integer I ≥ 1, and any finite collection of points (xi)_{i=1}^I of X and of open sets (Vi)_{i=1}^I in Y, let

B((xi)_{i=1}^I; (Vi)_{i=1}^I) = {g ∈ Map(X, Y) : (g(xi) ∈ Vi)(i = 1, 2, . . . , I)}    (A.6)

be the functions in Map(X, Y) whose graphs pass through each of the sets (Vi)_{i=1}^I at (xi)_{i=1}^I, and let TB be the collection of all such subsets of Map(X, Y) for every choice of I, (xi)_{i=1}^I, and (Vi)_{i=1}^I. The existence of a unique topology T (the topology of pointwise convergence on Map(X, Y)) that is generated by the open sets B of the collection TB now follows because

(TB1) is satisfied: For any f ∈ Map(X, Y) there must be some x ∈ X and a corresponding V ⊆ Y such that f(x) ∈ V, and


(TB2) is satisfied because

B((si)_{i=1}^I; (Vi)_{i=1}^I) ∩ B((tj)_{j=1}^J; (Wj)_{j=1}^J) = B((si)_{i=1}^I, (tj)_{j=1}^J; (Vi)_{i=1}^I, (Wj)_{j=1}^J)

implies that a function simultaneously belonging to the two open sets on the left must pass through each of the points defining the open set on the right.

We now demonstrate that (Map(X, Y), T) is not first countable by verifying that it is not possible to have a countable local base at any f ∈ Map(X, Y). If this is not indeed true, let B_f^I((xi)_{i=1}^I; (Vi)_{i=1}^I) = {g ∈ Map(X, Y) : (g(xi) ∈ Vi)_{i=1}^I}, which denotes those members of TB that contain f with Vi an open neighborhood of f(xi) in Y, be a countable local base at f, see Theorem A.1.2. Since X is uncountable, it is now possible to choose some x* ∈ X different from any of the (xi)_{i=1}^I (for example, let x* ∈ R be an irrational for rational (xi)_{i=1}^I), and let f(x*) ∈ V* where V* is an open neighborhood of f(x*). Then B(x*; V*) is an open set in Map(X, Y) containing f; hence from the definition of the local base, Eq. (A.1), or equivalently from the Corollary to Theorem A.1.2, there exists some (countable) I ∈ N such that f ∈ B^I ⊆ B(x*; V*). However,

f*(x) = yi ∈ Vi, if x = xi and 1 ≤ i ≤ I;
f*(x) = y* ∉ V*, if x = x*;
f*(x) arbitrary, otherwise

is a simple example of a function on X that is in B^I (as it is immaterial as to what values the function takes at points other than those defining B^I), but not in B(x*; V*). From this it follows that a sufficient condition for the topology of pointwise convergence to be first countable is that X be countable.

Even though it is not first countable, (Map(X, Y), T) is a Hausdorff space when Y is Hausdorff. Indeed, if f, g ∈ (Map(X, Y), T) with f ≠ g, then f(x) ≠ g(x) for some x ∈ X. But then as Y is Hausdorff, it is possible to choose disjoint open sets Vf and Vg at f(x) and g(x) respectively; the basic open sets B(x; Vf) and B(x; Vg) are then disjoint neighborhoods of f and g in Map(X, Y).

With this background on first and second countability, it is now possible to go back to the question of nets, filters and sequences. Technically, a sequence on a set X is a map x : N → X from the set of natural numbers to X; instead of denoting this in the usual functional manner of x(i) with i ∈ N, it is the standard practice to use the notation (xi)_{i ∈ N} for the terms of a sequence. However, if the space (X, 𝒰) is not first countable (and as seen above this is not a rare situation), it is not difficult to realize that sequences are inadequate to describe convergence in X simply because a sequence can have only countably many values whereas the space may require uncountably many neighborhoods to completely define the neighborhood system at a point. The resulting uncountable generalizations of a sequence in the form of nets and filters are achieved through a corresponding generalization of the index set N to the directed set D.

Definition A.1.4. A directed set D is a preordered set for which the order ⪯, known as a direction of D, satisfies

(a) α ∈ D ⇒ α ⪯ α (that is ⪯ is reflexive).
(b) α, β, γ ∈ D such that (α ⪯ β ∧ β ⪯ γ) ⇒ α ⪯ γ (that is ⪯ is transitive).
(c) α, β ∈ D ⇒ ∃γ ∈ D such that (α ⪯ γ ∧ β ⪯ γ).

While the first two properties are obvious enough, the third, which replaces antisymmetry, ensures that for any finite number of elements of the directed set, there is always a successor (upper bound). Examples of directed sets can be both straightforward, as any totally ordered set like N, R, Q, or Z and all subsets of a set X under the superset or subset relation (that is (P(X), ⊇) or (P(X), ⊆)) that are directed by their usual ordering, and not quite so obvious as the following examples, which are significantly useful in dealing with convergence questions in topological spaces, amply illustrate.

The neighborhood system

D_N = {N : N ∈ Nx}

at a point x ∈ X, directed by the reverse inclusion direction ⪯ defined as

M ⪯ N ⇔ N ⊆ M for M, N ∈ Nx,    (A.7)

is a fundamental example of a natural direction of Nx. In fact while reflexivity and transitivity are clearly obvious, (c) follows because for any M, N ∈ Nx, M ⪯ M ∩ N and N ⪯ M ∩ N. Of course, this direction is not a total ordering on Nx. A more naturally useful directed set in convergence theory is

D_Nt = {(N, t) : (N ∈ Nx)(t ∈ N)}    (A.8)

under its natural direction

(M, s) ⪯ (N, t) ⇔ N ⊆ M for M, N ∈ Nx;    (A.9)

D_Nt is more useful than D_N because, unlike the latter, D_Nt does not require a simultaneous choice


of points from every N ∈ Nx that implicitly involves a simultaneous application of the Axiom of Choice; see Example A.1.3 below. The general indexed variation

D_Nβ = {(N, β) : (N ∈ Nx)(β ∈ D)(xβ ∈ N)}    (A.10)

of Eq. (A.8), with natural direction

(M, α) ≤ (N, β) ⇔ (α ⪯ β) ∧ (N ⊆ M),    (A.11)

often proves useful in applications as will be clear from the proofs of Theorems A.1.3 and A.1.4.

Definition A.1.5 (Net). Let X be any set and D a directed set. A net χ : D → X in X is a function on the directed set D with values in X.

A net, to be denoted as χ(α), α ∈ D, is therefore a function indexed by a directed set. We adopt the convention of denoting nets in the manner of functions and do not use the sequential notation χα that can also be found in the literature. Thus, while every sequence is a special type of net, χ : Z → X is an example of a net that is not a sequence.

Convergence of sequences and nets is described most conveniently in terms of the notions of being eventually in and frequently in every neighborhood of points. We describe these concepts in terms of nets which apply to sequences with obvious modifications.

Definition A.1.6. A net χ : D → X is said to be

(a) Eventually in a subset A of X if its tail is in A: (∃β ∈ D) : (∀γ ⪰ β)(χ(γ) ∈ A).
(b) Frequently in a subset A of X if for any index β ∈ D, there is a successor index γ ∈ D such that χ(γ) is in A: (∀β ∈ D)(∃γ ⪰ β) : (χ(γ) ∈ A).

It is not difficult to appreciate that

(i) A net eventually in a subset is also frequently in it but not conversely,
(ii) A net eventually (respectively, frequently) in a subset cannot be frequently (respectively, eventually) in its complement.

With these notions of eventually in and frequently in, convergence characteristics of a net may be expressed as follows.

Definition A.1.7. A net χ : D → X converges to x ∈ X if it is eventually in every neighborhood of x, that is

(∀N ∈ Nx)(∃µ ∈ D)(χ(ν ⪰ µ) ∈ N).

The point x is known as the limit of χ and the collection of all limits of a net is the limit set

lim(χ) = {x ∈ X : (∀N ∈ Nx)(∃Rβ ∈ Res(D))(χ(Rβ) ⊆ N)}    (A.12)

of χ, with the set of residuals Res(D) in D given by

Res(D) = {Rα ∈ P(D) : Rα = {β ∈ D for all β ⪰ α ∈ D}}.    (A.13)

The net adheres at x ∈ X (footnote 27) if it is frequently in every neighborhood of x, that is

((∀N ∈ Nx)(∀µ ∈ D))((∃ν ⪰ µ) : χ(ν) ∈ N).

The point x is known as the adherent of χ and the collection of all adherents of χ is the adherent set of the net, which may be expressed in terms of the cofinal subset of D

Cof(D) = {Cα ∈ P(D) : Cα = {β ∈ D for some β ⪰ α ∈ D}}    (A.14)

(thus a subset Dα of D is cofinal in D iff it intersects every residual in D), as

adh(χ) = {x ∈ X : (∀N ∈ Nx)(∃Cβ ∈ Cof(D))(χ(Cβ) ⊆ N)}.    (A.15)

This recognizes, in keeping with the limit set, each subnet of a net to be a net in its own right, and is equivalent to

adh(χ) = {x ∈ X : (∀N ∈ Nx)(∀Rα ∈ Res(D))(χ(Rα) ∩ N ≠ ∅)}.    (A.16)

Intuitively, a sequence is eventually in a set A if it is always in it after a finite number of terms (of course, the concept of a finite number of terms is unavailable for nets; in this case the situation may be described by saying that a net is eventually in A if its tail is in A) and it is frequently in A if it always returns to A to leave it again. It can be shown that a net is eventually (resp. frequently) in a set iff it is not frequently (resp. eventually) in its complement.

The following examples illustrate graphically the role of a proper choice of the index set D in the description of convergence.

Footnote 27. This is also known as a cluster point; we shall, however, use this new term exclusively in the sense of the elements of a derived set, see Definition 2.3.
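
Definition A.1.6 and the complement rule quoted above can be checked mechanically on a small example. The Python sketch below is an illustration only, not part of the original argument: the six-element index set `D`, the preorder `leq` (which makes the indices 4 and 5 mutually comparable, so the tail of the net never settles) and the alternating net `chi` are all assumptions chosen to imitate an alternating sequence on a finite directed set.

```python
from itertools import combinations

def is_directed(D, leq):
    """Definition A.1.4: reflexive, transitive, any two elements have an upper bound."""
    reflexive = all(leq(a, a) for a in D)
    transitive = all(leq(a, c) for a in D for b in D for c in D
                     if leq(a, b) and leq(b, c))
    bounded = all(any(leq(a, g) and leq(b, g) for g in D) for a in D for b in D)
    return reflexive and transitive and bounded

def residual(D, leq, alpha):
    """R_alpha of Eq. (A.13): all indices that follow alpha."""
    return [b for b in D if leq(alpha, b)]

def eventually_in(chi, D, leq, A):
    """Definition A.1.6(a): some residual is mapped entirely into A."""
    return any(all(chi(g) in A for g in residual(D, leq, b)) for b in D)

def frequently_in(chi, D, leq, A):
    """Definition A.1.6(b): every residual contains an index mapped into A."""
    return all(any(chi(g) in A for g in residual(D, leq, b)) for b in D)

# Hypothetical example data (assumptions, not from the paper).
D = list(range(6))

def leq(a, b):
    # Preorder by rank min(., 4): indices 4 and 5 each precede the other.
    return min(a, 4) <= min(b, 4)

def chi(a):
    # Alternating net with values in X = {1, -1}.
    return 1 if a % 2 == 0 else -1

X = frozenset({1, -1})
subsets = [frozenset(c) for r in range(len(X) + 1)
           for c in combinations(sorted(X), r)]

# (i) eventually in A implies frequently in A.
implication_holds = all(frequently_in(chi, D, leq, A)
                        for A in subsets if eventually_in(chi, D, leq, A))

# Eventually in A iff not frequently in the complement X - A.
duality_holds = all(eventually_in(chi, D, leq, A)
                    == (not frequently_in(chi, D, leq, X - A))
                    for A in subsets)
```

On this example the net is frequently, but not eventually, in {1} and in {−1}, which is the finite analogue of a net that adheres at two points without converging to either. A finite directed set with a single largest element would not show the difference, which is why the preorder here deliberately fails antisymmetry at the top.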

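
The directed set D_Nt of Eq. (A.8), with the direction of Eq. (A.9), can also be written out explicitly when X is finite. The sketch below is a hedged illustration: the three-point space `X`, the topology `OPENS` and the helper names are assumptions introduced here, and the net `chi(N, t) = t` is simply the projection onto the second component. It checks that D_Nt is directed in the sense of Definition A.1.4, that reverse inclusion on Nx is not a total order, and that the net converges to x in the sense of Definition A.1.7.

```python
from itertools import combinations

# Hypothetical finite space (an assumption for illustration only).
X = frozenset({0, 1, 2})
OPENS = [frozenset(), frozenset({0}), frozenset({1}), frozenset({0, 1}), X]

def powerset(S):
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def neighborhoods(x):
    """N_x: every subset of X that contains an open set containing x."""
    return [N for N in powerset(X) if any(x in U and U <= N for U in OPENS)]

def leq(a, b):
    """Direction (A.9): (M, s) precedes (N, t) iff N is a subset of M."""
    return b[0] <= a[0]

def is_directed(D, leq):
    """Definition A.1.4: reflexive, transitive, any two elements have an upper bound."""
    reflexive = all(leq(a, a) for a in D)
    transitive = all(leq(a, c) for a in D for b in D for c in D
                     if leq(a, b) and leq(b, c))
    bounded = all(any(leq(a, g) and leq(b, g) for g in D) for a in D for b in D)
    return reflexive and transitive and bounded

def converges_to(chi, D, leq, x):
    """Definition A.1.7: the net is eventually in every neighborhood of x."""
    return all(
        any(all(chi(nu) in N for nu in D if leq(mu, nu)) for mu in D)
        for N in neighborhoods(x)
    )

x = 0
N_x = neighborhoods(x)
D_Nt = [(N, t) for N in N_x for t in sorted(N)]   # Eq. (A.8)

def chi(index):
    # The net chi(N, t) = t.
    return index[1]

# Reverse inclusion on N_x is a direction but not a total order.
totally_ordered = all(M <= N or N <= M for M in N_x for N in N_x)
```

Because every index (N, t) already carries its own point t ∈ N, no separate choice of a point from each neighborhood has to be made when the net is defined, which is the feature of D_Nt that the text contrasts with D_N.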

Example A.1.3. (1) Let γ ∈ D. The eventually constant net χ(δ) = x for δ ⪰ γ converges to x.

(2) Let Nx be a neighborhood system at a point x in X and suppose that the net (χ(N))_{N ∈ Nx} is defined by

χ(M) ≝ s ∈ M;    (A.17)

here the directed index set D_N is ordered by the natural direction (A.7) of Nx. Then χ(N) → x because given any x-neighborhood M ∈ D_N, it follows from

M ⪯ N ∈ D_N ⇒ χ(N) = t ∈ N ⊆ M    (A.18)

that a point in any subset of M is also in M; χ(N) is therefore eventually in every neighborhood of x.

(3) This slightly more general form of the previous example provides a link between the complementary concepts of nets and filters that is considered below. For a point x ∈ X, and M, N ∈ Nx with the corresponding directed set D_Ms of Eq. (A.8) ordered by its natural order (A.9), the net

χ(M, s) ≝ s    (A.19)

converges to x because, as in the previous example, for any given (M, s) ∈ D_Ns, it follows from

(M, s) ⪯ (N, t) ∈ D_Ms ⇒ χ(N, t) = t ∈ N ⊆ M    (A.20)

that χ(N, t) is eventually in every neighborhood M of x. The significance of the directed set D_Nt of Eq. (A.8), as compared to D_N, is evident from the net that it induces without using the Axiom of Choice: For a subset A of X, the net χ(N, t) = t ∈ A indexed by the directed set

D_Nt = {(N, t) : (N ∈ Nx)(t ∈ N ∩ A)}    (A.21)

under the direction of Eq. (A.9), converges to x ∈ X with all such x defining the closure Cl(A) of A. Furthermore taking the directed set to be

D_Nt = {(N, t) : (N ∈ Nx)(t ∈ N ∩ (A − {x}))}    (A.22)

which, unlike Eq. (A.21), excludes the point x that may or may not be in the subset A of X, induces the net χ(N, t) = t ∈ A − {x} converging to x ∈ X, with the set of all such x yielding the derived set Der(A) of A. In contrast, Eq. (A.21) also includes the isolated points t = x of A so as to generate its closure. Observe how neighborhoods of a point, which define convergence of nets and filters in a topological space X, double up here as index sets to yield a self-consistent tool for the description of convergence.

As compared with sequences where the index set is restricted to positive integers, the considerable freedom in the choice of directed sets, as is abundantly borne out by the two preceding examples, is not without its associated drawbacks. Thus as a trade-off, the wide range of choice of the directed sets may imply that induction methods, so common in the analysis of sequences, need no longer apply to arbitrary nets.

(4) The non-convergent nets (actually these are sequences)

(a) (1, −1, 1, −1, . . .) adheres at 1 and −1 and

(b) xn = n, if n is odd; xn = 1 − 1/(1 + n), if n is even

adheres at 1 for its even terms, but is unbounded in the odd terms.

A converging sequence or net is also adhering but, as examples (4) show, the converse is false. Nevertheless it is true, as again is evident from examples (4), that in a first countable space where sequences suffice, a sequence (xn) adheres to x iff some subsequence (x_{n_m})_{m ∈ N} of (xn) converges to x. If the space is not first countable this has a corresponding equivalent formulation for nets with subnets replacing subsequences as follows.

Let (χ(α))_{α ∈ D} be a net. A subnet of χ(α) is the net ζ(β) = χ(σ(β)), β ∈ E, where σ : (E, ≤) → (D, ⪯) is a function that captures the essence of the subsequential mapping n → n_m in N by satisfying

(SN1) σ is an increasing order-preserving function: it respects the order of E: σ(β) ⪯ σ(β′) for every β ≤ β′ ∈ E, and

(SN2) For every α ∈ D there exists a β ∈ E such that α ⪯ σ(β).

These generalize the essential properties of a subsequence in the sense that (1) Even though the index sets D and E may be different, it is necessary that the values of E be contained in D, and (2) There are arbitrarily large α ∈ D such that χ(α = σ(β)) is a value of the subnet ζ(β) for some β ∈ E. Recalling the first of the order relations Eq. (38) on Map(X, Y), we will denote a subnet ζ of χ by ζ ⪯ χ.

We now consider the concept of filter on a set X that is very useful in visualizing the behavior of sequences and nets, and in fact filters constitute an alternate way of looking at convergence questions in topological spaces. A filter ℱ on a set X


is a collection of nonempty subsets of X satisfying properties (F1)–(F3) below that are simply those of a neighborhood system Nx without specification of the reference point x.

(F1) The empty set ∅ does not belong to ℱ,
(F2) The intersection of any two members of a filter is another member of the filter: F1, F2 ∈ ℱ ⇒ F1 ∩ F2 ∈ ℱ,
(F3) Every superset of a member of a filter belongs to the filter: (F ∈ ℱ) ∧ (F ⊆ G) ⇒ G ∈ ℱ; in particular X ∈ ℱ.

Example A.1.4

(1) The indiscrete filter is the smallest filter on X.
(2) The neighborhood system Nx is the important neighborhood filter at x on X, and any local base at x is also a filter-base for Nx. In general for any subset A of X, {N ⊆ X : A ⊆ Int(N)} is a filter on X at A.
(3) The collection of all subsets of X containing a point x ∈ X is the principal filter FP(x) on X at x. More generally, if ℱ consists of all supersets of a nonempty subset A of X, then ℱ is the principal filter FP(A) = {N ⊆ X : A ⊆ N} at A. Adjoining the empty set to these filters gives the p-inclusion and A-inclusion topologies on X, respectively. The single element sets {{x}} and {A} are particularly simple examples of filter-bases that generate the principal filters at x and A.
(4) For an uncountable (resp. infinite) set X, all cocountable (resp. cofinite) subsets of X constitute the cocountable (resp. cofinite or Fréchet) filter on X. Again, adding to these filters the empty set gives the respective topologies.

Like the topological and local bases TB and Bx respectively, a subclass of ℱ may be used to define a filter-base FB that in turn generates ℱ on X, just as it is possible to define the concepts of limit and adherence sets for a filter to parallel those for nets that follow straightforwardly from Definition A.1.7, taken with Definition A.1.11.

Definition A.1.8. Let (X, T) be a topological space and ℱ a filter on X. Then

lim(ℱ) = {x ∈ X : (∀N ∈ Nx)(∃F ∈ ℱ)(F ⊆ N)}    (A.23)

and

adh(ℱ) = {x ∈ X : (∀N ∈ Nx)(∀F ∈ ℱ)(F ∩ N ≠ ∅)}    (A.24)

are respectively the sets of limit points and adherent points of ℱ (footnote 28).

A comparison of Eqs. (A.12) and (A.16) with Eqs. (A.23) and (A.24) respectively demonstrates their formal similarity; this inter-relation between filters and nets will be made precise in Definitions A.1.10 and A.1.11 below. It should be clear from the preceding two equations that

lim(ℱ) ⊆ adh(ℱ),    (A.26)

with a similar result

lim(χ) ⊆ adh(χ)    (A.27)

holding for nets because of the duality between nets and filters as displayed by Definitions A.1.9 and A.1.10 below, with the equality in Eqs. (A.26) and (A.27) being true (but not characterizing) for ultrafilters and ultranets respectively, see Example 4.2(3) for an account of this notion. It should be clear from the equations of Definition A.1.8 that

adh(ℱ) = {x ∈ X : (∃ a finer filter 𝒢 ⊇ ℱ on X)(𝒢 → x)}    (A.28)

consists of all the points of X to which some finer filter 𝒢 (in the sense that ℱ ⊆ 𝒢 implies every element of ℱ is also in 𝒢) converges in X; thus

adh(ℱ) = ⋃ lim(𝒢 : 𝒢 ⊇ ℱ),

which corresponds to the net-result of Theorem A.1.5 below, that a net χ adheres to x iff there is some subnet of χ that converges to x in X. Thus if ζ ⪯ χ is a subnet of χ and ℱ ⊆ 𝒢 is a filter coarser than 𝒢 then

lim(χ) ⊆ lim(ζ),    lim(ℱ) ⊆ lim(𝒢),
adh(ζ) ⊆ adh(χ),    adh(𝒢) ⊆ adh(ℱ);

Footnote 28. The restatement

ℱ → x ⇔ Nx ⊆ ℱ    (A.25)

of Eq. (A.23) that follows from (F3), and sometimes taken as the definition of convergence of a filter, is significant as it ties up the algebraic filter with the topological neighborhood system to produce the filter theory of convergence in topological spaces. From the defining properties of ℱ it follows that for each x ∈ X, Nx is the coarsest (that is smallest) filter on X that converges to x.


a filter 𝒢 finer than a given filter ℱ corresponds to a subnet ζ of a given net χ. The implication of this correspondence should be clear from the association between nets and filters contained in Definitions A.1.10 and A.1.11.

A filter-base in X is a non-empty family (Bα)_{α ∈ D} = FB of subsets of X characterized by

(FB1) There are no empty sets in the collection FB: (∀α ∈ D)(Bα ≠ ∅)

(FB2) The intersection of any two members of FB contains another member of FB: Bα, Bβ ∈ FB ⇒ (∃B ∈ FB : B ⊆ Bα ∩ Bβ);

hence any class of subsets of X that does not contain the empty set and is closed under finite intersections is a base for a unique filter on X; compare the properties (NB1) and (NB2) of a local basis given at the beginning of this Appendix. Similar to Definition A.1.1 for the local base, it is possible to define

Definition A.1.9. A filter-base FB in a set X is a subcollection of the filter ℱ on X having the property that each F ∈ ℱ contains some member of FB. Thus

FB ≝ {B ∈ ℱ : B ⊆ F for each F ∈ ℱ}    (A.29)

determines the filter

ℱ = {F ⊆ X : B ⊆ F for some B ∈ FB}    (A.30)

reciprocally as all supersets of the basic elements.

This is the smallest filter on X that contains FB and is said to be the filter generated by its filter-base FB; alternatively FB is the filter-base of ℱ. The entire neighborhood system Nx, the local base Bx, Nx ∩ A for x ∈ Cl(A), and the set of all residuals of a directed set D are among the most useful examples of filter-bases on X, A and D respectively. Of course, every filter is trivially a filter-base of itself, and the singletons {{x}}, {A} are filter-bases that generate the principal filters FP(x) and FP(A) at x, and A respectively.

Paralleling the case of topological subbase TS, a filter subbase FS can be defined on X to be any collection of subsets of X with the finite intersection property (as compared with TS where no such condition was necessary, this represents the fundamental point of departure between topology and filter) and it is not difficult to deduce that the filter generated by FS on X is obtained by taking all finite intersections FS∧ of members of FS followed by their supersets FSΣ∧. ℱ(FS) := FSΣ∧ is the smallest filter on X that contains FS and is the filter generated by FS.

Equation (A.24) can be put in the more useful and transparent form given by

Theorem A.1.3. For a filter ℱ in a space (X, T)

adh(ℱ) = ⋂_{F ∈ ℱ} Cl(F) = ⋂_{B ∈ FB} Cl(B),    (A.31)

and dually adh(χ), are closed sets.

Proof. Follows immediately from the definitions for the closure of a set Eq. (20) and the adherence of a filter Eq. (A.24). As always, it is a matter of convenience to use the basic filters FB instead of ℱ to generate the adherence set.

It is in fact true that the limit sets lim(ℱ) and lim(χ) are also closed sets of X; the arguments involving ultrafilters are omitted.

Similar to the notion of the adherence set of a filter is its core, a concept that, unlike the adherence, is purely set-theoretic, being the infimum of the filter, and is not linked with any topological structure of the underlying (infinite) set X. It is defined as

core(ℱ) = ⋂_{F ∈ ℱ} F.    (A.32)

From Theorem A.1.3 and the fact that the closure of a set A is the smallest closed set that contains A, see Eq. (25) at the end of Tutorial 4, it is clear that in terms of filters

A = core(FP(A))
Cl(A) = adh(FP(A)) = core(Cl(FP(A)))    (A.33)

where FP(A) is the principal filter at A; thus the core and adherence sets of the principal filter at A are equal respectively to A and Cl(A), a classic example of equality in the general relation Cl(⋂ Aα) ⊆ ⋂ Cl(Aα), but both are empty, for example, in the case of an infinitely decreasing family of rationals centered at any irrational (leading to a principal filter-base of rationals at the chosen irrational). This is an important example demonstrating that the infinite intersection of a non-empty family of (closed) sets with the finite intersection property may be empty, a situation that cannot arise on a finite set or an infinite compact set. Filters on


X with an empty core are said to be free, and are fixed otherwise: notice that by its very definition filters cannot be free on a finite set, and a free filter represents an additional feature that may arise in passing from finite to infinite sets. Clearly (adh(ℱ) = ∅) ⇒ (core(ℱ) = ∅), but as the important example of the rational space in the reals illustrates, the converse need not be true. Another example of a free filter of the same type is provided by the filter-base {[a, ∞) : a ∈ R} in R. Both these examples illustrate the important property that a filter is free iff it contains the cofinite filter, and the cofinite filter is the smallest possible free filter on an infinite set. The free cofinite filter, as these examples illustrate, may be typically generated as follows. Let A be a subset of X, x ∈ Bdy_{X−A}(A), and consider the directed set Eq. (A.21) to generate the corresponding net in A given by χ(N ∈ Nx, t) = t ∈ A. Quite clearly, the core of any Fréchet filter based on this net must be empty as the point x does not lie in A. In general, the intersection is empty because if it were not so then the complement of the intersection (which is an element of the filter) would be infinite in contravention of the hypothesis that the filter is Fréchet. It should be clear that every filter finer than a free filter is also free, and any filter coarser than a fixed filter is fixed.

Nets and filters are complementary concepts and one may switch from one to the other as follows.

Definition A.1.10. Let ℱ be a filter on X and let D_Fx = {(F, x) : (F ∈ ℱ)(x ∈ F)} be a directed set with its natural direction (F, x) ⪯ (G, y) ⇒ (G ⊆ F). The net χℱ : D_Fx → X defined by

χℱ(F, x) = x

is said to be associated with the filter ℱ, see Eq. (A.20).

Definition A.1.11. Let χ : D → X be a net and Rα = {β ∈ D : β ⪰ α ∈ D} a residual in D. Then

FBχ ≝ {χ(Rα) : Res(D) → X for all α ∈ D}

is the filter-base associated with χ, and the corresponding filter ℱχ obtained by taking all supersets of the elements of FBχ is the filter associated with χ.

FBχ is a filter-base in X because χ(⋂ Rα) ⊆ ⋂ χ(Rα), that holds for any functional relation, proves (FB2). It is not difficult to verify that

(i) χ is eventually in A ⇒ A ∈ ℱχ, and
(ii) χ is frequently in A ⇒ (∀Rα ∈ Res(D))(A ∩ χ(Rα) ≠ ∅) ⇒ A ∩ ℱχ ≠ ∅.

Limits and adherences are obviously preserved in switching between nets (respectively, filters) and the filters (respectively, nets) that they generate:

lim(χ) = lim(ℱχ), adh(χ) = adh(ℱχ)    (A.34)

lim(ℱ) = lim(χℱ), adh(ℱ) = adh(χℱ).    (A.35)

The proofs of the two parts of Eq. (A.34), for example, go respectively as follows. x ∈ lim(χ) ⇔ χ is eventually in Nx ⇔ (∀N ∈ Nx)(∃F ∈ ℱχ) such that (F ⊆ N) ⇔ x ∈ lim(ℱχ), and x ∈ adh(χ) ⇔ χ is frequently in Nx ⇔ (∀N ∈ Nx)(∀F ∈ ℱχ)(N ∩ F ≠ ∅) ⇔ x ∈ adh(ℱχ); here F is a superset of χ(Rα).

Some examples of convergence of filters are

(1) Any filter on an indiscrete space X converges to every point of X.
(2) Any filter on a space that coincides with its topology (minus the empty set, of course) converges to every point of the space.
(3) For each x ∈ X, the neighborhood filter Nx converges to x; this is the smallest filter on X that converges to x.
(4) The indiscrete filter ℱ = {X} converges to no point in the space (X, {∅, A, X − A, X}), but converges to every point of X − A if X has the topology {∅, A, X} because the only neighborhood of any point in X − A is X which is contained in the filter.

One of the most significant consequences of convergence theory of sequences and nets, as shown by the two theorems and the corollary following, is that this can be used to describe the topology of a set. The proofs of the theorems also illustrate the close inter-relationship between nets and filters.

Theorem A.1.4. For a subset A of a topological space X,

Cl(A) = {x ∈ X : (∃ a net χ in A)(χ → x)}.    (A.36)

Proof. Necessity. For x ∈ Cl(A), construct a net χ → x in A as follows. Let Bx be a topological local base at x, which by definition is the collection of all open sets of X containing x. For each β ∈ D, the sets

Nβ = ⋂_{α ⪯ β} {Bα : Bα ∈ Bx}


form a nested decreasing local neighborhood filter base at x. With respect to the directed set D Nβ = {(Nβ, β) : (β ∈ D)(xβ ∈ Nβ)} of Eq. (A.10), define the desired net in A by

χ(Nβ, β) = xβ ∈ Nβ ∩ A

where the family of non-empty decreasing subsets Nβ ∩ A of X constitute the filter-base in A as required by the directed set D Nβ. It now follows from Eq. (A.11) and the arguments in Example A.1.3(3) that xβ → x; compare the directed set of Eq. (A.21) for a more compact, yet essentially identical, argument. Carefully observe the dual roles of Nx as a neighborhood filter base at x.
Sufficiency. Let χ be a net in A that converges to x ∈ X. For any Nα ∈ Nx, there is a Rα ∈ Res(D) of Eq. (A.13) such that χ(Rα) ⊆ Nα. Hence the point χ(α) = xα of A belongs to Nα so that A ∩ Nα ≠ ∅ which means, from Eq. (20), that x ∈ Cl(A).

Corollary. Together with Eqs. (20) and (22), it follows that

Der(A) = {x ∈ X : (∃ a net ζ in A − {x})(ζ → x)}    (A.37)

The filter forms of Eqs. (A.36) and (A.37)

Cl(A) = {x ∈ X : (∃ a filter F on X)(A ∈ F)(F → x)}
Der(A) = {x ∈ X : (∃ a filter F on X)(A − {x} ∈ F)(F → x)}    (A.38)

then follows from Eq. (A.25) and the finite intersection property (F2) of F so that every neighborhood of x must intersect A (respectively A − {x}) in Eq. (A.38) to produce the converging net needed in the proof of Theorem A.1.3.

We end this discussion of convergence in topological spaces with a proof of the following theorem which demonstrates the relationship that “eventually in” and “frequently in” bears with each other; Eq. (A.39) below is the net-counterpart of the filter equation (A.28).

Theorem A.1.5. If χ is a net in a topological space X, then x ∈ adh(χ) iff some subnet ζ(β) = χ(σ(β)) of χ(α), with α ∈ D and β ∈ E, converges in X to x; thus

adh(χ) = {x ∈ X : (∃ a subnet ζ ⪯ χ in X)(ζ → x)}.    (A.39)

Proof. Necessity. Let x ∈ adh(χ). Define a subnet function σ : D Nα → D by σ(Nα, α) = α where D Nα is the directed set of Eq. (A.10): (SN1) and (SN2) are quite evidently satisfied according to Eq. (A.11). Proceeding as in the proof of the preceding theorem it follows that xβ = χ(σ(Nα, α)) = ζ(Nα, α) → x is the required converging subnet that exists from Eq. (A.15) and the fact that χ(Rα) ∩ Nα ≠ ∅ for every Nα ∈ Nx, by hypothesis.
Sufficiency. Assume now that χ has a subnet ζ(Nα, α) that converges to x. If χ does not adhere at x, there is a neighborhood Nα of x not frequented by it, in which case χ must be eventually in X − Nα. Then ζ(Nα, α) is also eventually in X − Nα so that ζ cannot be eventually in Nα, a contradiction of the hypothesis that ζ(Nα, α) → x.29

Equations (A.36) and (A.39) imply that the closure of a subset A of X is the class of X-adherences of all the (sub)nets of X that are eventually in A. This includes both the constant nets yielding the isolated points of A and the non-constant nets leading to the cluster points of A, and implies the following physically useful relationship between convergence and topology that can be used as defining criteria for open and closed sets having a more appealing physical significance than the original definitions of these terms. Clearly, the term “net” is justifiably used here to include the subnets too.
The following corollary of Theorem A.1.5 summarizes the basic topological properties of sets in terms of nets (respectively, filters).

Corollary. Let A be a subset of a topological space X. Then

(1) A is closed in X iff every convergent net of X that is eventually in A actually converges to a point in A (respectively, iff the adhering

29
In a first countable space, while the corresponding proof of the first part of the theorem for sequences is essentially the same
as in the present case, the more direct proof of the converse illustrates how the convenience of nets and directed sets may
require more general arguments. Thus if a sequence (xi )i∈N has a subsequence (xik )k∈N converging to x, then a more direct
line of reasoning proceeds as follows. Since the subsequence converges to x, its tail (xik )k≥j must be in every neighborhood
N of x. But as the number of such terms is infinite whereas {ik : k < j} is only finite, it is necessary that for any given n ∈ N,
cofinitely many elements of the sequence (xik )ik ≥n be in N . Hence x ∈ adh((xi )i∈N ).


points of each filter-base on A all belong to A). Thus no X-convergent net in a closed subset may converge to a point outside it.
(2) A is open in X iff every convergent net of X that converges to a point in A is eventually in A. Thus no X-convergent net outside an open subset may converge to a point in the set.
(3) A is closed-and-open (clopen) in X iff every convergent net of X that converges in A is eventually in A and conversely.
(4) x ∈ Der(A) iff some net (respectively, filter-base) in A − {x} converges to x; this clearly eliminates the isolated points of A and x ∈ Cl(A) iff some net (respectively, filter-base) in A converges to x.

Remark. The differences in these characterizations should be fully appreciated: If we consider the cluster points Der(A) of a net χ in A as the resource generated by χ, then a closed subset of X can be considered to be selfish as it keeps all its resources to itself: Der(A) ∩ A = Der(A). The opposite of this is a donor set that donates all its generated resources to its neighbor: Der(A) ∩ X − A = Der(A), while for a neutral set, both Der(A) ∩ A ≠ ∅ and Der(A) ∩ X − A ≠ ∅ implying that the convergence resources generated in A and X − A can be deposited only in the respective sets. The clopen sets (see diagram 2–2 of Fig. 22) are of some special interest as they are boundary-less so that no net-resources can be generated in this case as any such limits are required to be simultaneously in the set and its complement.

Example A.1.2. (Continued). This continuation of Example A.1.2 illustrates how sequential convergence is inadequate in spaces that are not first countable like the uncountable set with cocountable topology. In this topology, a sequence can converge to a point x in the space iff it has only a finite number of distinct terms, and is therefore eventually constant. Indeed, let the complement

G def= X − F,    F = {xi : xi ≠ x, i ∈ N}

of the countably closed sequential set F be an open neighborhood of x ∈ X. Because a sequence (xi)i∈N in X converges to a point x ∈ X iff it is eventually in every neighborhood (including G) of x, the sequence represented by the set F cannot converge to x unless it is of the uncountable type30

(x0, x1, . . . , xI, xI+1, xI+1, . . .)    (A.40)

with only a finite number I of distinct terms actually belonging to the closed sequential set F = X − G, and xI+1 = x. Note that as we are concerned only with the eventual behavior of the sequence, we may discard all distinct terms from G by considering them to be in F, and retain only the constant sequence (x, x, . . .) in G. In comparison with the cofinite case that was considered in Sec. 4, the entire countably infinite sequence can now lie outside a neighborhood of x thereby enforcing the eventual constancy of the sequence. This leads to a generalization of our earlier cofinite result in the sense that a cocountable filter on a cocountable space converges to every point in the space.
It is now straightforward to verify that for a point x0 in an uncountable cocountable space X

(a) Even though no sequence in the open set G = X − {x0} can converge to x0, yet x0 ∈ Cl(G) since the intersection of any (uncountable) open neighborhood U of x0 with G, being an uncountable set, is not empty.
(b) By Corollary 1 of Theorem A.1.5, the uncountable open set G = X − {x0} is also closed in X because if any sequence (x1, x2, . . .) in G converges to some x ∈ X, then x must be in G as the sequence must be eventually constant in order for it to converge. But this is a contradiction as G cannot be closed since it is not countable.31 By the same reckoning, although {x0} is not an open set because its complement is not countable, nevertheless it follows from Eq. (A.40) that should any sequence converge to the only point x0 of this set, then it must eventually be in {x0} so by Corollary 2 of the same theorem, {x0} becomes an open set.
(c) The identity map 1 : X → Xd, where Xd is X with discrete topology, is not continuous because the inverse image of any singleton of Xd is not open in X. Yet if a sequence converges in X to x, then its image (1(x)) = (x) must actually converge to x in Xd because a sequence converges in a discrete space, as in the cofinite or cocountable spaces, iff it is eventually constant; this is so because each element of a discrete space being clopen is boundary-less.

30
This is uncountable because interchanging any two eventual terms of the sequence does not alter the sequence.
31
Note that {x} is a 1-point set but (x) is an uncountable sequence.


This pathological behavior of sequences in a dropping all basic open sets that do not inter-
non Hausdorff, non first countable space does not sect. Then a (coarser) topology can be gener-
arise if the discrete indexing set of sequences is re- ated from this base by taking all unions, and
placed by a continuous, uncountable directed set a filter by taking all supersets according to
like R for example, leading to nets in place of se- Eq. (A.30). For any given filter this expression
quences. In this case the net can be in an open set may be used to extract a subclass F B as a base
without having to be constant valued in order to for F.
converge to a point in it as the open set can be de-
fined as the complement of a closed countable part A.2. Initial and Final Topology
of the uncountable net. The careful reader could
The commutative diagram of Fig. contains four
not have failed to notice that the burden of the
sub-diagrams X − XB − f (X), Y − XB − f (X),
above arguments, as also of that in the example
X − XB − Y and X − f (X) − Y . Of these, the first
following Theorem 4.6, is to formalize the fact that
two are especially significant as they can be used to
since a closed set is already defined as a countable
conveniently define the topologies on X B and f (X)
(respectively finite) set, the closure operation cannot −1
from those of X and Y , so that fB , fB and G
add further points to it from its complement, and
have some desirable continuity properties; we recall
any sequence that converges in an open set in these
that a function f : X → Y is continuous if inverse
topologies must necessarily be eventually constant
images of open sets of Y are open in X. This sim-
at its point of convergence, a restriction that no
ple notion of continuity needs refinement in order
longer applies to a net. The cocountable topology
that topologies on XB and f (X) be unambiguously
thus has the very interesting property of filtering
defined from those of X and Y , a requirement that
out a countable part from an uncountable set, as
leads to the concepts of the so-called final and initial
for example the rationals in R.
topologies. To appreciate the significance of these
new constructs, note that if f : (X, U) → (Y, V) is a
This example serves to illustrate the hard truth
continuous function, there may be open sets in X
that in a space that is not first countable, the sim-
that are not inverse images of open — or for that
plicity of sequences is not enough to describe its
matter of any — subset of Y , just as it is possible
topological character, and in fact “sequential con-
for non-open subsets of Y to contribute to U. When
vergence will be able to describe only those topolo-
the triple {U, f, V} are tuned in such a manner
gies in which the number of (basic) neighborhoods
that these are impossible, the topologies so gener-
around each point is no greater than the number
ated on X and Y are the initial and final topologies
of terms in the sequences”, [Willard, 1970]. It is
respectively; they are the smallest (coarsest) and
important to appreciate the significance of this in-
largest (finest) topologies on X and Y that make
terplay of convergence of sequences and nets (and
f : X → Y continuous. It should be clear that every
of continuity of functions of Appendix A.1) and the
image and preimage continuous function is contin-
topology of the underlying spaces.
uous, but the converse is not true.
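The coarsest/finest characterization of the initial and final topologies can be checked by hand on finite sets. The following is only a minimal sketch under stated assumptions: the function f, the sets X and Y, the topologies U_top and V_top, and all helper names are hypothetical choices made for illustration and do not come from the paper. The initial topology is taken as the collection of all preimages of open sets of Y, and the final topology as the collection of all subsets of Y whose preimage is open in X.

```python
from itertools import combinations

def powerset(points):
    """All subsets of a finite set, as frozensets."""
    pts = list(points)
    return [frozenset(c) for r in range(len(pts) + 1) for c in combinations(pts, r)]

def preimage(f, V):
    """Inverse image of V under f, where f is a dict defined on its whole domain."""
    return frozenset(x for x in f if f[x] in V)

def initial_topology(f, V_top):
    """Coarsest topology on the domain making f continuous: preimages of open sets."""
    return {preimage(f, V) for V in V_top}

def final_topology(f, U_top, Y):
    """Finest topology on Y making f continuous: subsets whose preimage is open."""
    return {V for V in powerset(Y) if preimage(f, V) in U_top}

def is_continuous(f, U_top, V_top):
    """f is continuous iff the preimage of every open set of Y is open in X."""
    return all(preimage(f, V) in U_top for V in V_top)

# Hypothetical example: f is neither injective nor surjective.
X = frozenset({1, 2, 3})
Y = frozenset({"a", "b", "c"})
f = {1: "a", 2: "a", 3: "b"}
V_top = {frozenset(), frozenset({"a"}), Y}                    # a topology on Y
U_top = {frozenset(), frozenset({1, 2}), frozenset({3}), X}   # a topology on X

IT = initial_topology(f, V_top)   # depends only on f and V_top
FT = final_topology(f, U_top, Y)  # depends only on f and U_top
```

In this example f is continuous with respect to U_top and V_top, and the sketch gives IT ⊆ U_top and V_top ⊆ FT, in line with the description of the two topologies as the smallest on X and the largest on Y that keep f continuous. The point "c" lies outside f(X), so every subset of Y − f(X) is open in FT.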
A comparison of the defining properties (T1)–
Let sat(U ) := f − f (U ) ⊆ X be the saturation
(T3) of topology T with (F1)–(F3) of that of the
of an open set U of X and comp(V ) := f f − (V ) =
filter F, shows that a filter is very close to a topol-
V ∩ f (X) ⊆ Y be the component of an open set V
ogy with the main difference being with regard to
of Y on the range f (X) of f . Let Usat , Vcomp de-
the empty set which must always be in T but never
note respectively the saturations U sat = {sat(U ) :
in F. Addition of the empty set to a filter yields
U ∈ U} of the open sets of X and the components
a topology, but removal of the empty set from a
Vcomp = {comp(V ) : V ∈ V} of the open sets of Y
topology need not produce the corresponding fil-
whenever these are also open in X and Y respec-
ter as the topology may contain nonintersecting
tively. Plainly, Usat ⊆ U and Vcomp ⊆ V.
sets.
The distinction between the topological and Definition A.2.1. For a function e : X → (Y, V),
filter-bases should be carefully noted. Thus the preimage or initial topology of X based on
(generated by) e and V is
(a) While the topological base may contain the def
empty set, a filter-base cannot. IT{e; V} = {U ⊆ X : U = e− (V ) if V ∈ Vcomp } ,
(b) From a given topology, form a common base by (A.41)


while for q : (X, U) → Y , the image or final topol- (a) f is continuous iff g is continuous,
ogy of Y based on (generated by) U and q is (b) f is preimage continuous iff U1 = IT{g; V}.
def
FT{U; q} = {V ⊆ Y : q − (V ) = U if U ∈ Usat }. As we need the second part of these theorems in
(A.42) our applications, their proofs are indicated below.
The special significance of the first parts is that they
Thus, the topology of (X, IT{e; V}) consists of, and
ensure the converse of the usual result that the com-
only of, the e-saturations of all the open sets of
position of two continuous functions is continuous,
e(X), while the open sets of (Y, FT{U; q}) are the
namely that one of the components of a composition
q-images in Y (and not just in q(X)) of all the q-
is continuous whenever the composition is so.
saturated open sets of X.32 The need for defining
(A.41) in terms of Vcomp rather than V will be-
Proof of Theorem A.2.1. If f be image continuous,
come clear in the following. The subspace topol-
V1 = {V1 ⊆ Y1 : f − (V1 ) ∈ U1 } and U1 = {U1 ⊆ X1 :
ogy IT{i; U} of a subset A ⊆ (X, U) is a basic ex-
q − (U1 ) ∈ U} are the final topologies of Y1 and X1
ample of the initial topology by the inclusion map
based on the topologies of X1 and X, respectively.
i : X ⊇ A → (X, U), and we take its generalization
Then V1 = {V1 ⊆ Y1 : q − f − (V1 ) ∈ U} shows that h
e : (A, IT{e; V}) → (Y, V) that embeds a subset A
is image continuous.
of X into Y as the prototype of a preimage continu-
Conversely, when h is image continuous, V 1 =
ous map. Clearly the topology of Y may also contain
{V1 ⊆ Y1 : h− (V1 )} ∈ U} = {V1 ⊆ Y1 :
open sets not in e(X), and any subset in Y − e(X)
q − f − (V1 )} ∈ U}, with U1 = {U1 ⊆ X1 : q − (U1 ) ∈
may be added to the topology of Y without alter-
U}, proves f − (V1 ) to be open in X1 and thereby f
ing the preimage topology of X: open sets of Y not
to be image continuous.
in e(X) may be neglected in obtaining the preimage
topology as e− (Y −e(X)) = ∅. The final topology on
a quotient set by the quotient map Q : (X, U) → Proof of Theorem A.2.2.If f be preimage
X/ ∼, which is just the collection of Q-images of the continuous, V1 = {V1 ⊆ Y1 : V1 = e− (V ) if V ∈ V}
Q-saturated open sets of X, known as the quotient and U1 = {U1 ⊆ X1 : U1 = f − (V1 ) if V1 ∈ V1 }
topology of X/ ∼, is the basic example of the image are the initial topologies of Y1 and X1 respectively.
topology and the resulting space (X/ ∼, FT{U; Q}) Hence from U1 = {U1 ⊆ X1 : U1 = f − e− (V ) if V ∈
is called the quotient space. We take the generaliza- V} it follows that g is preimage continuous.
tion q : (X, U) → (Y, FT{U; q}) of Q as the proto- Conversely, when g is preimage continuous,
type of an image continuous function. U1 = {U1 ⊆ X1 : U1 = g − (V ) if V ∈ V} =
The following results are specifically useful in {U1 ⊆ X1 : U1 = f − e− (V ) if V ∈ V} and
dealing with initial and final topologies; compare V1 = {V1 ⊆ Y1 : V1 = e− (V ) if V ∈ V} show
the corresponding results for open maps given later. that f is preimage continuous.

Theorem A.2.1. Let (X, U) and (Y1 , V1 ) be topo- Since both Eqs. (A.41) and (A.42) are in terms
logical spaces and let X1 be a set. If f : X1 → of inverse images (the first of which constitutes a
(Y1 , V1 ), q : (X, U) → X1 , and h = f ◦ q : direct, and the second an inverse, problem) the im-
(X, U) → (Y1 , V1 ) are functions with the topology age f (U ) = comp(V ) for V ∈ V is of interest as
U1 of X1 given by FT{U; q}, then it indicates the relationship of the openness of f
(a) f is continuous iff h is continuous, with its continuity. This, and other related concepts
(b) f is image continuous iff V1 = FT{U; h}. are examined below, where the range space f (X) is
always taken to be a subspace of Y . Openness of
Theorem A.2.2. Let (Y, V) and (X1 , U1 ) be topo- a function f : (X, U) → (Y, V) is the “inverse” of
logical spaces and let Y1 be a set. If f : (X1 , U1 ) → continuity, when images of open sets of X are re-
Y1 , e : Y1 → (Y, V) and g = e ◦ f : (X1 , U1 ) → quired to be open in Y ; such a function is said to be
(Y, V) are function with the topology V 1 of Y1 given open. Following are two of the important properties
by IT{e; V}, then of open functions.

32
We adopt the convention of denoting arbitrary preimage and image continuous functions by e and q respectively even though
they are not injective or surjective; recall that the embedding e : X ⊇ A → Y and the association q : X → f (X) are 1 : 1 and
onto respectively.


(1) If f : (X, U) → (Y, f (U)) is an open function, continuous. Indeed, from its injectivity and conti-
then so is f : (X, U) → (f (X), IT{i; f (U)}). nuity, inverse images of all open subsets of Y are
The converse is true if f (X) is an open set of Y ; saturated-open in X, and openness of f ensures that
thus openness of f : (X, U) → (f (X), f (U)) these are the only open sets of X the condition of
implies that of f : (X, U) → (Y, V) whenever injectivity being required to exclude non-saturated
f (X) is open in Y such that f (U ) ∈ V for U ∈ sets from the preimage topology. It is therefore pos-
U. The truth of this last assertion follows eas- sible to rewrite Eq. (A.41) as
ily from the fact that if f (U ) is an open set of
f (X) ⊂ Y, then necessarily f (U ) = V f (X) U ∈ IT{e; V} ⇔ e(U ) = V if V ∈ Vcomp , (A.43)
for some V ∈ V, and the intersection of two
open sets of Y is again an open set of Y . and to compare it with the following criterion for an
(2) If f : (X, U) → (Y, V) and g : (Y, V) → (Z, W) injective, open-continuous map f : (X, U) → (Y, V)
are open functions then g ◦ f : (X, U) → (Z, W) that necessarily satisfies sat(A) = A for all A ⊆ X
is also open. It follows that the condition in
(1) on f (X) can be replaced by the require-
U ∈ U ⇔ ({{f (U )}U ∈U = Vcomp )∧(f −1 (V )|V ∈V ∈ U).
ment that the inclusion i : (f (X), IT{i; V}) →
(Y, V) be an open map. This interchange of (A.44)
f (X) with its inclusion i: f (X) → Y into Y
is a basic result that finds application in many Final Topology. Since it is necessarily produced
situations. on the range R(q) of q, the final topology is often
considered in terms of a surjection. This however
Collected below are some useful properties of is not necessary as, much in the spirit of the ini-
the initial and final topologies that we need in this tial topology, Y − q(X) ≠ ∅ inherits the discrete
work. topology without altering anything, thereby allow-
ing condition (A.42) to be restated in the following
Initial Topology. In Fig. 21(b), consider Y 1 = more transparent form
h(X1 ), e → i and f → h : X1 → (h(X1 ),
IT{i; V}). From h− (B) = h− (B h(X1 )) for any V ∈ FT{U; q} ⇔ V = q(U ) if U ∈ Usat , (A.45)
B ⊆ Y , it follows that for an open set V of Y ,
h− (Vcomp ) = h− (V ) is an open set of X1 which, if and to compare it with the following criterion for
the topology of X1 is IT{h; V}, are the only open a surjective, open-continuous map f : (X, U) →
sets of X1 . Because Vcomp is an open set of h(X1 ) in (Y, V) that necessarily satisfies f B = B for all
its subspace topology, this implies that the preim- B⊆Y
age topologies IT{h; V} and IT{h ; IT{i; V}} of
X1 generated by h and h are the same. Thus the
preimage topology of X1 is not affected if Y is re- V ∈ V ⇔ (Usat = {f − (V )}V ∈V ) ∧ (f (U )|U ∈ U ∈ V).
placed by the subspace h(X1 ), the part Y − h(X1 ) (A.46)
contributing nothing to IT{h; V}.
A preimage continuous function e : X → (Y, V) As may be anticipated from Fig. 21, the final topol-
is not necessarily an open function. Indeed, if U = ogy does not behave as well for subspaces as the ini-
e− (V ) ∈ IT{e; V}, it is almost trivial to verify tial topology does. This is so because in Fig. 21(a)
along the lines of the restriction of open maps to the two image continuous functions h and q are
its range, that e(U ) = ee− (V ) = e(X) ∩ V , V ∈ V, connected by a preimage continuous inclusion f ,
is open in Y (implying that e is an open map) iff whereas in Fig. 21(b) all the three functions are
e(X) is an open subset of Y (because finite in- preimage continuous. Thus quite like open func-
tersections of open sets are open). A special case tions, although image continuity of h : (X, U) →
of this is the important consequence that the re- (Y1 , FT{U; h}) implies that of h : (X, U) →
striction e : (X, IT{e; V}) → (e(X), IT{i; V}) of (h(X), IT{i; FT{U; h})) for a subspace h(X) of
e : (X, IT{h; V}) → (Y, V) to its range is an open Y1 , the converse need not be true unless — en-
map. Even though a preimage continuous map need tirely like open functions again — either h(X) is
not be open, it is true that an injective, continu- an open set of Y1 or i : (h(X), IT{i; FT{U; h})) →
ous and open map f : X → (Y, V) is preimage (X, FT{U; h}) is an open map. Since an open


[Figure: commutative diagrams, panels (a) and (b).]

Fig. 21. Continuity in final and initial topologies.

preimage continuous map is image continuous, this The following is a slightly more general form of
makes i : h(X) → Y1 an ininal function and hence the restriction on the inclusion that is needed for
all the three legs of the commutative diagram image image continuity to behave well for subspaces of Y .
continuous.
Like preimage continuity, an image continuous Theorem A.2.3. Let q : (X, U) → (Y, FT{U; q})
function q : (X, U) → Y need not be open. How- be an image continuous function. For a subspace B
ever, although the restriction of an image continu- of (Y, FT{U; q}),
ous function to the saturated open sets of its domain FT{IT{j; U}; q } = IT{i; FT{U; q}}
is an open function, q is unrestrictedly open iff the
saturation of every open set of X is also open in X. where q : (q − (B), IT{j; U}) → (B, FT{IT{j; U};
In fact it can be verified without much effort that a q }), if either q is an open map or B is an open set
continuous, open surjection is image continuous. of Y .
Combining Eqs. (A.43) and (A.45) gives the fol-
In summary we have the useful result that an
lowing criterion for ininality
open preimage continuous function is image con-
U and V ∈ IFT{Usat ; f ; V} tinuous and an open image continuous function is
preimage continuous, where the second assertion
⇔ ({f (U )}U ∈Usat = V)(Usat = {f − (V )}V ∈V ) ,
follows on neglecting non-saturated open sets in X;
(A.47) this is permitted in as far as the generation of the
final topology is concerned, as these sets produce
which reduces to the following for a homeomor-
the same images as their saturations. Hence an im-
phism f that satisfies both sat(A) = A for A ⊆ X
age continuous function q : X → Y is preimage
and f B = B for B ⊆ Y
continuous iff every open set in X is saturated with
U and V ∈ HOM{U; f ; V} respect to q, and a preimage continuous function
⇔ (U = {f −1 (V )}V ∈V )({f (U )}U ∈U = V) e : X → Y is image continuous iff the e-image of
every open set of X is open in Y .
(A.48)

and compares with A.3. More on Topological Spaces
U and V ∈ OC{U; f ; V} This Appendix — which completes the review
of those concepts of topological spaces begun in
⇔ (sat(U ) ∈ U : {f (U )}U ∈U = Vcomp ) Tutorial 4 that are needed for a proper understand-
∧(comp(V ) ∈ V : {f − (V )}V ∈V = Usat ) ing of this work — begins with the following sum-
(A.49) mary of the different possibilities in the distribu-
tion of Der(A) and Bdy(A) between sets A ⊆ X
for an open-continuous f . and its complement X − A, and follows it up with a


few other important topological concepts that have been used, explicitly or otherwise, in this paper.

Definition A.3.1 (Separation, Connected Space). A separation (disconnection) of X is a pair of mutually disjoint nonempty open (and therefore closed) subsets H1 and H2 such that X = H1 ∪ H2. A space X is said to be connected if it has no separation, that is, if it cannot be partitioned into two open or two closed non-empty subsets. X is separated (disconnected) if it is not connected.

It follows from the definition that for a disconnected space X the following are equivalent statements.
(a) There exists a pair of disjoint non-empty open subsets of X that cover X,
(b) There exists a pair of disjoint non-empty closed subsets of X that cover X,
(c) There exists a pair of disjoint non-empty clopen subsets of X that cover X,
(d) There exists a non-empty, proper, clopen subset of X.

By a connected subset is meant a subset of X that is connected when provided with its relative topology making it a subspace of X. Thus any connected subset of a topological space must necessarily be contained in any clopen set that might intersect it: if C and H are respectively connected and clopen subsets of X such that C ∩ H ≠ ∅, then C ⊂ H because C ∩ H is a non-empty clopen set in C which must contain C because C is connected.
For testing whether a subset of a topological space is connected, the following relativized form of (a)–(d) is often useful.

Lemma A.3.1. A subset A of X is disconnected iff there are disjoint open sets U and V of X satisfying

U ∩ A ≠ ∅ ≠ V ∩ A such that A ⊆ U ∪ V, with U ∩ V ∩ A = ∅    (A.50)

or there are disjoint closed sets E and F of X satisfying

E ∩ A ≠ ∅ ≠ F ∩ A such that A ⊆ E ∪ F, with E ∩ F ∩ A = ∅.    (A.51)

Thus A is disconnected iff there are disjoint clopen subsets in the relative topology of A that cover A.

Lemma A.3.2. If A is a subspace of X, a separation of A is a pair of disjoint nonempty subsets H1 and H2 of A whose union is A, neither of which contains a cluster point of the other. A is connected iff there is no separation of A.

Proof. Let H1 and H2 be a separation of A so that they are clopen subsets of A whose union is A. As H1 is a closed subset of A it follows that H1 = ClX(H1) ∩ A, where ClX(H1) ∩ A is the closure of H1 in A; hence ClX(H1) ∩ H2 = ∅. But as the closure of a subset is the union of the set and its adherents, an empty intersection signifies that H2 cannot contain any of the cluster points of H1. A similar argument shows that H1 does not contain any adherent of H2.
Conversely suppose that neither H1 nor H2 contains an adherent of the other: ClX(H1) ∩ H2 = ∅ and ClX(H2) ∩ H1 = ∅. Hence ClX(H1) ∩ A = H1 and ClX(H2) ∩ A = H2 so that both H1 and H2 are closed in A. But since H1 = A − H2 and H2 = A − H1, they must also be open in the relative topology of A.

Following are some useful properties of connected spaces.

(c1) The closure of any connected subspace of a space is connected. In general, every B satisfying
A ⊆ B ⊆ Cl(A)
is connected. Thus any subset of X formed from A by adjoining to it some or all of its adherents is connected so that a topological space with a dense connected subset is connected.
(c2) The union of any class of connected subspaces of X with nonempty intersection is a connected subspace of X.
(c3) A topological space is connected iff there is a covering of the space consisting of connected sets with nonempty intersection. Connectedness is a topological property: Any space homeomorphic to a connected space is itself connected.
(c4) If H1 and H2 form a separation of X and A is any connected subset of X, then either A ⊆ H1 or A ⊆ H2.

While the real line R is connected, a subspace of R is connected iff it is an interval in R.
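For a finite topological space, statement (d) above and the relativized test for subsets can be tried out directly. The sketch below is an illustration only: the space X, its topology top, the subsets A and B, and the helper names are assumptions chosen for the example and are not taken from the paper. A space is reported as connected when its only clopen subsets are the empty set and the whole space; a subset is tested in its relative topology.

```python
def subspace_topology(top, A):
    """Relative topology on A: traces U ∩ A of the open sets U of the ambient space."""
    A = frozenset(A)
    return {U & A for U in top}

def clopen_sets(X, top):
    """Open sets whose complement in X is also open."""
    X = frozenset(X)
    return {U for U in top if (X - U) in top}

def is_connected(X, top):
    """Connected iff there is no non-empty, proper, clopen subset (statement (d))."""
    X = frozenset(X)
    return all(U == X or not U for U in clopen_sets(X, top))

# Hypothetical finite space with separation H1 = {1, 2}, H2 = {3, 4}.
X = frozenset({1, 2, 3, 4})
H1 = frozenset({1, 2})
H2 = frozenset({3, 4})
top = {frozenset(), frozenset({1}), H1, H2, frozenset({1, 3, 4}), X}

A = frozenset({1, 2})   # relative topology {∅, {1}, {1, 2}}: connected
B = frozenset({2, 3})   # relative topology is discrete: disconnected

print(is_connected(X, top))                        # X itself
print(is_connected(A, subspace_topology(top, A)))  # subset A
print(is_connected(B, subspace_topology(top, B)))  # subset B
```

Here the connected subset A lies wholly inside H1, as property (c4) requires, while B meets both H1 and H2 and is split by them.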


[Figure: 3 × 3 array of diagrams. Rows classify A and columns classify X − A as 1. Donor, 2. Selfish (Closed), 3. Neutral; column 2 corresponds to A open and row 2 to X − A open. Legend: Der(A), BdyX−A(A), Der(X − A), BdyA(X − A).]

Fig. 22. Classification of a subset A of X relative to the topology of X. The derived set of A may intersect both A and
X − A (row 3), may be entirely in A (row 2), or may be wholly in X − A (row 1). A is closed iff Bdy(A) ⊆ A (row 2), open
iff Bdy(A) ⊆ X − A (column 2), and clopen iff Bdy(A) = ∅ when the derived sets of both A and X − A are contained in the
respective sets. An open set, beside being closed, may also be neutral or donor.
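The boundary criteria in the caption of Fig. 22 can be verified exhaustively on a small finite space. What follows is a sketch under assumptions rather than anything prescribed by the paper: the space X, the topology top and the function names are hypothetical. Closure and derived set are computed from open neighborhoods, Bdy(A) is taken as Cl(A) ∩ Cl(X − A), and classify labels A by where Der(A) falls.

```python
from itertools import combinations

def closure(X, top, A):
    """Points all of whose open neighborhoods meet A."""
    return frozenset(x for x in X if all(U & A for U in top if x in U))

def derived(X, top, A):
    """Cluster points of A: every open neighborhood meets A - {x}."""
    return frozenset(x for x in X if all(U & (A - {x}) for U in top if x in U))

def boundary(X, top, A):
    """Bdy(A) = Cl(A) ∩ Cl(X - A)."""
    return closure(X, top, A) & closure(X, top, X - A)

def classify(X, top, A):
    """Label A by the location of Der(A), following the rows of Fig. 22."""
    d = derived(X, top, A)
    if not d:
        return "no cluster points"
    if d <= A:
        return "selfish"
    if d <= X - A:
        return "donor"
    return "neutral"

# Hypothetical nested topology on three points.
X = frozenset({1, 2, 3})
top = {frozenset(), frozenset({1}), frozenset({1, 2}), X}
all_subsets = [frozenset(c) for r in range(len(X) + 1)
               for c in combinations(sorted(X), r)]

for A in all_subsets:
    bdy = boundary(X, top, A)
    # closed iff Bdy(A) ⊆ A, open iff Bdy(A) ⊆ X − A
    print(sorted(A), "closed:", bdy <= A, "open:", bdy <= X - A,
          "type:", classify(X, top, A))
```

In this example {1} is a donor, {2, 3} is selfish and {1, 3} is neutral, and the two boundary tests agree with direct membership of A or X − A in top for every subset.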

The important concept of total disconnectedness introduced below needs the following

Definition A.3.2 (Component). A component C∗ of a space X is a maximally (with respect to inclusion) connected subset of X.

Thus a component is a connected subspace which is not properly contained in any larger connected subspace of X. The maximal element need not be unique as there can be more than one component of a given space, and a “maximal” criterion rather than “maximum” is used as a component need not contain every connected subset of X; it simply must not be contained in any other connected subset of X. Components can be constructively defined as follows: Let x ∈ X be any point. Consider the collection of all connected subsets of X to which x belongs. Since {x} is one such set, the collection is non-empty. As the intersection of the collection is non-empty, its union is a non-empty connected set C. This is the largest connected set containing x and is therefore a component containing x and we have

(C1) Let x ∈ X. The unique component of X containing x is the union of all the connected subsets of X that contain x. Conversely any non-empty connected subset A of X is contained


in that unique component of X to which each of the points of A belong. Hence a topological space is connected iff it is the unique component of itself.
(C2) Each component C∗ of X is a closed set of X: By property (c1) above, Cl(C∗) is also connected and from C∗ ⊆ Cl(C∗) it follows that C∗ = Cl(C∗). Components need not be open sets of X: an example of this is the space of rationals Q in reals in which the components are the individual points which cannot be open in R; see Example 2 below.

Table 7. Separation properties of some useful spaces.

Space                        T0   T1   T2
Discrete                     ✓    ✓    ✓
Indiscrete                   ×    ×    ×
R, standard                  ✓    ✓    ✓
left/right ray               ✓    ×    ×
Infinite cofinite            ✓    ✓    ×
Uncountable cocountable      ✓    ✓    ×
x-inclusion/exclusion        ✓    ×    ×
A-inclusion/exclusion        ×    ×    ×

(C3) Components of X are equivalence classes of
(X, ∼) with x ∼ y iff they are in the same
component: while reflexivity and symmetry
are obvious enough, transitivity follows be- (2, 3) can be enlarged into bigger connected subsets
cause if x, y ∈ C1 and y, z ∈ C2 with C1 , of X.
C2 connected subsets of X, then x and z As connected spaces, the empty set and the sin-
are in the set C1 C2 which is connected by gleton are considered to be degenerate and any con-
property c(2) above as they have the point y nected subspace with more than one point is non-
in common. Components are connected dis- degenerate. At the opposite extreme of the largest
joint subsets of X whose union is X (i.e. they possible component of a space X which is X itself,
form a partition of X with each point of X are the singletons {x} for every x ∈ X. This leads
contained in exactly one component of X) to the extremely important notion of a
such that any connected subset of X can
be contained in only one of them. Because
Definition A.3.3 (Totally disconnected space). A
a connected subspace cannot contain in it
space X is totally disconnected if every pair of dis-
any clopen subset of X, it follows that every
tinct points in it can be separated by a disconnec-
clopen connected subspace must be a compo-
tion of X.
nent of X.

Even when a space is disconnected, it is always X is totally disconnected iff the components in
possible to decompose it into pairwise disjoint con- X are single points with the only nonempty con-
nected subsets. If X is a discrete space this is the nected subsets of X being the one-point sets: If
only way in which X may be decomposed into con- x = y ∈ A ⊆ X are distinct points of a sub-
nected pieces. If X is not discrete, there may be set A of X then A = (A H1 ) (A H2 ), where
other ways of doing this. For example, the space X = H1 H2 with x ∈ H1 and y ∈ H2 is a discon-
nection of X (it is possible to choose H 1 and H2 in
X = {x ∈ R : (0 ≤ x ≤ 1) ∨ (2 x 3)}
this manner because X is assumed to be totally dis-
has the following distinct decomposition into three connected), is a separation of A that demonstrates
connected subsets: that any subspace of a totally disconnected space
1 1 7 7 with more than one point is disconnected.
X = 0, ,1 2, ,3 A totally disconnected space has interesting
2 2 3 3
physically appealing separation properties in terms
∞
1 1 of the (separated) Hausdorff spaces; here a topo-
X = {0} , (2, 3)
n+1 n logical space X is Hausdorff, or T2 , iff each two
n=1
distinct points of X can be separated by disjoint
X = [0, 1] (2, 3) . neighborhoods, so that for every x = y ∈ X, there
Intuition tells us that only in the third of these de- are neighborhoods M ∈ Nx and N ∈ Ny such that
compositions have we really broken up X into its M N = ∅. This means that for any two distinct
connected pieces. What distinguishes the third from points x = y ∈ X, it is impossible to find points that
the other two is that neither of the pieces [0, 1] or are arbitrarily close to both of them. Among the


properties of Hausdorff spaces, the following need to be mentioned.

(H1) X is Hausdorff iff for each x ∈ X and any point y ≠ x, there is a neighborhood N of x such that y ∉ Cl(N). This leads to the significant result that for any x ∈ X the closed singleton
\[ \{x\} = \bigcap_{N \in \mathcal{N}_x} \mathrm{Cl}(N) \]
is the intersection of the closures of any local base at that point, which in the language of nets and filters (Appendix A.1) means that a net in a Hausdorff space cannot converge to more than one point in the space and the adherent set adh(N_x) of the neighborhood filter at x is the singleton {x}.
(H2) Since each singleton is a closed set, each finite set in a Hausdorff space is also closed in X. Unlike a cofinite space, however, there can clearly be infinite closed sets in a Hausdorff space.
(H3) Any point x in a Hausdorff space X is a cluster point of A ⊆ X iff every neighborhood of x contains infinitely many points of A, a fact that has led to our mental conditioning of the points of a (Cauchy) sequence piling up in neighborhoods of the limit. Thus suppose for the sake of argument that although some neighborhood of x contains only a finite number of points of A, x is nonetheless a cluster point of A. Then there is an open neighborhood U of x such that U ∩ (A − {x}) = {x_1, ..., x_n} is a finite closed set of X not containing x, and U ∩ (X − {x_1, ..., x_n}), being the intersection of two open sets, is an open neighborhood of x not intersecting A − {x}, implying thereby that x ∉ Der(A); in fact U ∩ (X − {x_1, ..., x_n}) is simply {x} if x ∈ A or belongs to Bdy_{X−A}(A) when x ∈ X − A. Conversely if every neighborhood of a point of X intersects A in infinitely many points, that point must belong to Der(A) by definition.

Weaker separation axioms than Hausdorffness are those of T_0, respectively T_1, spaces in which for every pair of distinct points at least one, respectively each one, has some neighborhood not containing the other; the accompanying table is a listing of the separation properties of some useful spaces.

It should be noted that as none of the properties (H1)–(H3) need neighborhoods of both points simultaneously, it is sufficient for X to be T_1 for the conclusions to remain valid.

From its definition it follows that any totally disconnected space is a Hausdorff space and is therefore both a T_1 and a T_0 space as well. Conversely, if a Hausdorff space has a base of clopen sets then it is totally disconnected; this is so because if x and y are distinct points of X, then the assumed property of x ∈ H ⊆ M for every M ∈ N_x and some clopen set H yields X = H ∪ (X − H) as a disconnection of X that separates x and y ∈ X − H; note that the assumed Hausdorffness of X allows M to be chosen so as not to contain y.

Example A.3.1

(1) Every indiscrete space is connected; every subset of an indiscrete space is connected. Hence if X is empty or a singleton, it is connected. A discrete space is connected iff it is either empty or is a singleton; the only connected subsets in a discrete space are the degenerate ones. This is an extreme case of lack of connectedness, and a discrete space is the simplest example of a totally disconnected space.
(2) Q, the set of rationals considered as a subspace of the real line, is (totally) disconnected because the set of all rationals larger than a given irrational r is a clopen set in Q, and
\[ \mathbb{Q} = \big((-\infty, r) \cap \mathbb{Q}\big) \cup \big(\mathbb{Q} \cap (r, \infty)\big), \qquad r \text{ an irrational}, \]
is the union of two disjoint clopen sets in the relative topology of Q. The sets (−∞, r) ∩ Q and Q ∩ (r, ∞) are clopen in Q because neither contains a cluster point of the other. Thus, for example, any neighborhood of the second must contain the irrational r in order to be able to cut the first, which means that any neighborhood of a point in either of the relatively open sets cannot be wholly contained in the other. The only connected sets of Q are one-point subsets consisting of the individual rationals. In fact, a connected piece of Q, being a connected subset of R, is an interval in R, and a nonempty interval cannot be contained in Q unless it is a singleton. It needs to be noted that the individual points of the rational line are not (cl)open because any open subset of R that contains a rational must also contain others different from


it. This example shows that a space need not be discrete for each of its points to be a component and thereby for the space to be totally disconnected.
In a similar fashion, the set of irrationals is (totally) disconnected because the set of all irrationals larger than a given rational is an example of a clopen set in R − Q.
(3) The p-inclusion (A-inclusion) topology is connected; a subset in this topology is connected iff it is degenerate or contains p. For, a subset inherits the discrete topology if it does not contain p, and the p-inclusion topology if it contains p.
(4) The cofinite (cocountable) topology on an infinite (uncountable) space is connected; a subset in a cofinite (cocountable) space is connected iff it is degenerate or infinite (countable).
(5) Removal of a single point may render a connected space disconnected and even totally disconnected. In the former case, the point removed is called a cut point and in the second, it is a dispersion point. Any real number is a cut point of R, which does not have any dispersion point.
(6) Let X be a topological space. Considering components of X as equivalence classes by the equivalence relation ∼ with Q : X → X/∼ denoting the quotient map, X/∼ is totally disconnected: As Q⁻([x]) is connected for each [x] ∈ X/∼ in a component class of X, and as any open or closed subset A ⊆ X/∼ is connected iff Q⁻(A) is open or closed, it must follow that A can only be a singleton.

The next notion of compactness in topological spaces provides an insight into the role of non-empty adherent sets of filters that lead in a natural fashion to the concept of attractors in the dynamical systems theory that we take up next.

Definition A.3.4 (Compactness). A topological space X is compact iff every open cover of X contains a finite subcover of X.

This definition of compactness has a useful equivalent contrapositive reformulation: For any given collection of open sets of X, if none of its finite subcollections cover X, then the entire collection also cannot cover X. The following theorem is a statement of the fundamental property of compact spaces in terms of adherences of filters in such spaces, the proof of which uses this contrapositive characterization of compactness.

Theorem A.2.1. A topological space X is compact iff each class of closed subsets of X with finite intersection property has non-empty intersection.

Proof. Necessity. Let X be a compact space. Let F = {F_α}_{α∈D} be a collection of closed subsets of X with the finite intersection property (FIP), and let G = {X − F_α}_{α∈D} be the corresponding open sets of X. If {G_i}_{i=1}^{N} is a non-empty finite subcollection from G, then {X − G_i}_{i=1}^{N} is the corresponding non-empty finite subcollection of F. Hence from the assumed finite intersection property of F, it must be true that
\[ X - \bigcup_{i=1}^{N} G_i = \bigcap_{i=1}^{N} (X - G_i) \neq \emptyset \qquad \text{(DeMorgan's Law)}, \]
so that no finite subcollection of G can cover X. Compactness of X now implies that G too cannot cover X and therefore
\[ \bigcap_{\alpha} F_\alpha = \bigcap_{\alpha} (X - G_\alpha) = X - \bigcup_{\alpha} G_\alpha \neq \emptyset. \]
The proof of the converse is a simple exercise of reversing the arguments involving the two equations in the proof above.

Our interest in this theorem and its proof lies in the following corollary (which essentially means that for every filter F on a compact space the adherent set adh(F) is not empty), from which it follows that every net in a compact space must have a convergent subnet.

Corollary. A space X is compact iff for every class A = (A_α) of nonempty subsets of X with FIP,
\[ \mathrm{adh}(\mathcal{A}) = \bigcap_{A_\alpha \in \mathcal{A}} \mathrm{Cl}(A_\alpha) \neq \emptyset. \]

The proof of this result for nets given by the next theorem illustrates the general approach in such cases, which is all that is basically needed in dealing with attractors of dynamical systems; compare Theorem A.1.3.

Theorem A.3.2. A topological space X is compact iff each net in X adheres to X.

Proof. Necessity. Let X be a compact space, χ : D → X a net in X, and R_α the residual of α in the directed set D. For the filter-base (FB_χ(R_α))_{α∈D} of nonempty, decreasing, nested subsets of X associated with the net χ, compactness of X requires from

\[ \bigcap_{\alpha \preceq \delta} \mathrm{Cl}(\chi(R_\alpha)) \supseteq \chi(R_\delta) \neq \emptyset, \]
that the uncountably intersecting subset
\[ \mathrm{adh}(FB_\chi) := \bigcap_{\alpha \in D} \mathrm{Cl}(\chi(R_\alpha)) \]
of X be non-empty. If x ∈ adh(FB_χ) then because x is in the closure of χ(R_β), it follows from Eq. (20) that N ∩ χ(R_β) ≠ ∅ ³³ for every N ∈ N_x, β ∈ D. Hence χ(γ) ∈ N for some γ ⪰ β so that x ∈ adh(χ); see Eq. (A.16).
Sufficiency. Let χ be a net in X that adheres at x ∈ X. From any class F of closed subsets of X with FIP, construct as in the proof of Theorem A.1.4, a decreasing nested sequence of closed subsets C_β = ⋂_{α ⪯ β ∈ D} {F_α : F_α ∈ F} and consider the directed set DC_β = {(C_β, β) : (β ∈ D)(x_β ∈ C_β)} with its natural direction (A.11) to define the net χ(C_β, β) = x_β in X; see Definition A.1.10. From the assumed adherence of χ at some x ∈ X, it follows that N ∩ F ≠ ∅ for every N ∈ N_x and F ∈ F. Hence x belongs to the closed set F so that x ∈ adh(F); see Eq. (A.24). Hence X is compact.

Using Theorem A.1.5 that specifies a definite criterion for the adherence of a net, this theorem reduces to the useful formulation that a space is compact iff each net in it has some convergent subnet. An important application is the following: Since every decreasing sequence (F_m) of nonempty sets has FIP (because ⋂_{m=1}^{M} F_m = F_M for every finite M), every decreasing sequence of nonempty closed subsets of a compact space has nonempty intersection. For a complete metric space this is known as the Nested Set Theorem, and for [0, 1] and other compact subspaces of R as the Cantor Intersection Theorem.³⁴

For subspaces A of X, it is the relative topology that determines as usual compactness of A; however the following criterion renders this test in terms of the relative topology unnecessary and shows that the topology of X itself is sufficient to determine compactness of subspaces: A subspace K of a topological space X is compact iff each open cover of K in X contains a finite cover of K.

A proper understanding of the distinction between compactness and closedness of subspaces, which often causes much confusion to the non-specialist, is expressed in the next two theorems. As a motivation for the first, that establishes that not every subset of a compact space need be compact, mention may be made of the subset (a, b) of the compact closed interval [a, b] in R.

Theorem A.3.3. A closed subset F of a compact space X is compact.

Proof. Let G be an open cover of F so that an open cover of X is G ∪ (X − F), which because of compactness of X contains a finite subcover U. Then U − (X − F) is a finite collection of G that covers F.

It is not true in general that a compact subset of a space is necessarily closed. For example, in an infinite set X with the cofinite topology, let F be an infinite subset of X with X − F also infinite. Then although F is not closed in X, it is nevertheless compact because X is compact. Indeed, let G be an open cover of X and choose any non-empty G_0 ∈ G. If G_0 = X then {G_0} is the required finite cover of X. If this is not the case, then because X − G_0 = {x_i}_{i=1}^{n} is a finite set, there is a G_i ∈ G with x_i ∈ G_i for each 1 ≤ i ≤ n, and therefore {G_i}_{i=0}^{n} is the finite cover that demonstrates the compactness of the cofinite space X. Compactness of F now follows because the subspace topology on F is the induced cofinite topology from X. The distinguishing feature of this topology is that it, like the cocountable, is not Hausdorff: If U and V are any two nonempty open sets of X, then they cannot be disjoint as the complements of the open sets can only be finite and if U ∩ V were to be indeed empty, then
\[ X = X - \emptyset = X - (U \cap V) = (X - U) \cup (X - V) \]

³³ This is of course a triviality if we identify each χ(R_β) (or F in the proof of the converse that follows) with a neighborhood N of X that generates a topology on X.
³⁴ Nested-set theorem. If (E_n) is a decreasing sequence of nonempty, closed, subsets of a complete metric space (X, d) such that lim_{n→∞} dia(E_n) = 0, then there is a unique point
\[ x \in \bigcap_{n=0}^{\infty} E_n. \]
The uniqueness arises because the limiting condition on the diameters of E_n implies, from property (H1), that (X, d) is a Hausdorff space.


would be a finite set. An immediate fallout of this is that in an infinite cofinite space, a sequence (x_i)_{i∈N} (and even a net) with x_i ≠ x_j for i ≠ j behaves in an extremely unusual way: It converges, as in the indiscrete space, to every point of the space. Indeed if x ∈ X, where X is an infinite set provided with its cofinite topology, and U is any neighborhood of x, any infinite sequence (x_i)_{i∈N} in X must be eventually in U because X − U is finite, and ignoring the initial set of its values lying in X − U in no way alters the ultimate behavior of the sequence (note that this implies that the filter induced on X by the sequence agrees with its topology). Thus x_i → x for any x ∈ X is a reflection of the fact that there are no small neighborhoods of any point of X, with every neighborhood being almost the whole of X, except for a null set consisting of only a finite number of points. This is in sharp contrast with Hausdorff spaces where, although every finite set is also closed, every point has arbitrarily small neighborhoods that lead to unique limits of sequences. A corresponding result for cocountable spaces can be found in Example A.1.2, continued.

This example of the cofinite topology motivates the following “converse” of the previous theorem.

Theorem A.3.4. Every compact subspace of a Hausdorff space is closed.

Proof. Let K be a non-empty compact subset of X, Fig. 23, and let x ∈ X − K. Because of the separation of X, for every y ∈ K there are disjoint open subsets U_y and V_y of X with y ∈ U_y, and x ∈ V_y. Hence {U_y}_{y∈K} is an open cover for K, and from its compactness there is a finite subset A of K such that K ⊆ ⋃_{y∈A} U_y with V = ⋂_{y∈A} V_y an open neighborhood of x; V is open because each V_y is a neighborhood of x and the intersection is over finitely many points y of A. To prove that K is closed in X it is enough to show that V is disjoint from K: If there is indeed some z ∈ V ∩ K then z must be in some U_y for y ∈ A. But as z ∈ V it is also in V_y, which is impossible as U_y and V_y are to be disjoint. This last part of the argument in fact shows that if K is a compact subspace of a Hausdorff space X and x ∉ K, then there are disjoint open sets U and V of X containing x and K.

Fig. 23. Closedness of compact subsets of a Hausdorff space.

The last two theorems may be combined to give the obviously important

Corollary. In a compact Hausdorff space, closedness and compactness of its subsets are equivalent concepts.

In the absence of Hausdorffness, it is not possible to conclude from the assumed compactness of the space that every point to which the net may converge actually belongs to the subspace.

Definition A.3.5. A subset D of a topological space (X, U) is dense in X if Cl(D) = X. Thus the closure of D is the largest open subset of X, and every neighborhood of any point of X contains a point of D not necessarily distinct from it; refer to the distinction between Eqs. (20) and (22).

Loosely, D is dense in X iff every point of X has points of D arbitrarily close to it. A self-dense (dense in itself) set is a set without any isolated points; hence A is self-dense iff A ⊆ Der(A). A closed self-dense set is called a perfect set so that a closed set A is perfect iff it has no isolated points. Accordingly
\[ A \text{ is perfect} \iff A = \mathrm{Der}(A), \]
means that the closure of a set without any isolated points is a perfect set.

Theorem A.3.5. The following are equivalent statements.
(1) D is dense in X.
(2) If F is any closed set of X with D ⊆ F, then F = X; thus the only closed superset of D is X.
(3) Every nonempty (basic) open set of X cuts D; thus the only open set disjoint from D is the empty set ∅.
(4) The exterior of D is empty.
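The four statements can be read off directly in a familiar case. The following worked instance is an illustration added here and is not part of the original argument; it assumes X = R with its usual topology and D = Q, and uses Ext(D) := X − Cl(D):

\[
\begin{aligned}
&(1)\ \ \mathrm{Cl}(\mathbb{Q}) = \mathbb{R}, \text{ since every real number is a limit of rationals;}\\
&(2)\ \ \mathbb{Q} \subseteq F,\ F \text{ closed} \ \Rightarrow\ \mathbb{R} = \mathrm{Cl}(\mathbb{Q}) \subseteq \mathrm{Cl}(F) = F;\\
&(3)\ \ (a, b) \cap \mathbb{Q} \neq \emptyset \text{ for every basic open interval } (a, b),\ a < b;\\
&(4)\ \ \mathrm{Ext}(\mathbb{Q}) = \mathbb{R} - \mathrm{Cl}(\mathbb{Q}) = \emptyset.
\end{aligned}
\]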


Fig. 24. Shows the distinction between isolated, nowhere dense and Cantor sets; the panels are (a) A is isolated, (b) A is nwd, (c) C is Cantor. Topologically, the Cantor set can be described as a perfect, nowhere dense, totally disconnected and compact subset of a space. (b) The closed nowhere dense set Cl(A) is the boundary of its open complement. Here downward and upward inclined hatching denote respectively Bdy_A(X − A) and Bdy_{X−A}(A).

Proof. (3) If U indeed is a non-empty open set of X with U ∩ D = ∅, then D ⊆ X − U ≠ X leads to the contradiction X = Cl(D) ⊆ Cl(X − U) = X − U ≠ X, which also incidentally proves (2). From (3) it follows that for any open set U of X, Cl(U) = Cl(U ∩ D) because if V is any open neighborhood of x ∈ Cl(U) then V ∩ U is a nonempty open set of X that must cut D so that V ∩ (U ∩ D) ≠ ∅ implies x ∈ Cl(U ∩ D). Finally, Cl(U ∩ D) ⊆ Cl(U) completes the proof.

Definition A.3.6. (a) A set A ⊆ X is said to be nowhere dense in X if Int(Cl(A)) = ∅ and residual in X if Int(A) = ∅.

A is nowhere dense in X iff
\[ \mathrm{Bdy}(X - \mathrm{Cl}(A)) = \mathrm{Bdy}(\mathrm{Cl}(A)) = \mathrm{Cl}(A) \]
so that
\[ \mathrm{Cl}(X - \mathrm{Cl}(A)) = (X - \mathrm{Cl}(A)) \cup \mathrm{Cl}(A) = X \]
from which it follows that
\[ A \text{ is nwd in } X \iff X - \mathrm{Cl}(A) \text{ is dense in } X \]
and
\[ A \text{ is residual in } X \iff X - A \text{ is dense in } X. \]

Thus A is nowhere dense iff Ext(A) := X − Cl(A) is dense in X, and in particular, a closed set is nowhere dense in X iff its complement is open dense in X, with open-denseness being complementarily dual to closed-nowhere denseness. The rationals in the reals is an example of a set that is residual but not nowhere dense. The following are readily verifiable properties of subsets of X.

(1) A set A ⊆ X is nowhere dense in X iff it is contained in its own boundary, iff it is contained in the closure of the complement of its closure, that is A ⊆ Cl(X − Cl(A)). In particular a closed subset A is nowhere dense in X iff A = Bdy(A), that is iff it contains no open set.
(2) From M ⊆ N ⇒ Cl(M) ⊆ Cl(N) it follows, with M = X − Cl(A) and N = X − A, that a nowhere dense set is residual, but a residual set need not be nowhere dense unless it is also closed in X.
(3) Since Cl(Cl(A)) = Cl(A), Cl(A) is nowhere dense in X iff A is.
(4) For any A ⊆ X, both Bdy_A(X − A) := Cl(X − A) ∩ A and Bdy_{X−A}(A) := Cl(A) ∩ (X − A) are residual sets and as Fig. 22 shows Bdy_X(A) = Bdy_{X−A}(A) ∪ Bdy_A(X − A) is the union of these two residual sets. When A is closed (or open) in X its boundary, consisting of the only component Bdy_A(X − A) (or Bdy_{X−A}(A)) as shown by the second row (or column) of the figure, being a closed set of X is also nowhere dense in X; in fact a closed nowhere dense set is always the boundary of some open set. Otherwise, the boundary components of the two residual parts (as in the donor–donor, donor–neutral, neutral–donor and neutral–neutral cases) need not be individually closed in X (although their union is) and their union is a residual set that need not be nowhere dense in X: the union of two nowhere dense sets is nowhere dense but the union of a residual and a nowhere dense set is a residual set. One way in which a two-component boundary can be nowhere dense is by having Bdy_A(X − A) ⊇ Der(A) or Bdy_{X−A}(A) ⊇ Der(X − A), so that it is effectively in one piece rather than in two, as shown in Fig. 24(b).
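The characterizations above can be checked mechanically on a small example. The sketch below is an illustration only: the four-point space and its chain of open sets are hypothetical choices made here (they do not come from the text), and the helper names are likewise invented. It verifies, for every subset A, that A is nowhere dense iff X − Cl(A) is dense, that A is residual iff X − A is dense, property (1), and the two-piece decomposition of the boundary in property (4).

```python
from itertools import combinations

# Hypothetical four-point space. Its open sets form a chain, so arbitrary
# unions and finite intersections of open sets are again open.
X = frozenset({0, 1, 2, 3})
OPEN = [frozenset(), frozenset({0, 1}), frozenset({0, 1, 2}), X]


def interior(A):
    """Union of all open sets contained in A."""
    result = frozenset()
    for G in OPEN:
        if G <= A:
            result |= G
    return result


def closure(A):
    """Cl(A) as the complement of the interior of the complement."""
    return X - interior(X - A)


def is_dense(A):
    return closure(A) == X


def is_nowhere_dense(A):
    return not interior(closure(A))   # Int(Cl(A)) is empty


def is_residual(A):
    return not interior(A)            # Int(A) is empty


def subsets(S):
    items = sorted(S)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def check_all():
    """Run the checks over every subset of X; returns True if all hold."""
    for A in subsets(X):
        assert is_nowhere_dense(A) == is_dense(X - closure(A))
        assert is_residual(A) == is_dense(X - A)
        # Property (1): nwd iff A lies in Cl(X - Cl(A)).
        assert is_nowhere_dense(A) == (A <= closure(X - closure(A)))
        # Property (2): every nowhere dense set is residual.
        if is_nowhere_dense(A):
            assert is_residual(A)
        # Property (4): the boundary splits into the part inside A and
        # the part inside X - A.
        bdy = closure(A) & closure(X - A)
        bdy_in_a = closure(X - A) & A               # Bdy_A(X - A)
        bdy_in_complement = closure(A) & (X - A)    # Bdy_{X-A}(A)
        assert bdy == bdy_in_a | bdy_in_complement
    return True


if __name__ == "__main__":
    print(check_all())
```

In this particular space the set {0} plays the role that the rationals play in the reals: it has empty interior, yet its closure is all of X, so it is residual without being nowhere dense, whereas {3} is closed with empty interior and hence nowhere dense.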


Theorem A.3.6. A is nowhere dense in X iff each non-empty open set of X has a non-empty open subset disjoint from Cl(A).

Proof. If U is a non-empty open set of X, then U_0 = U ∩ Ext(A) ≠ ∅ as Ext(A) is dense in X; U_0 is the open subset that is disjoint from Cl(A). It clearly follows from this that each non-empty open set of X has a non-empty open subset disjoint from a nowhere dense set A.

What this result (which follows just from the definition of nowhere dense sets) actually means is that no point in Bdy_{X−A}(A) can be isolated in it.

Corollary. A is nowhere dense in X iff Cl(A) does not contain any non-empty open set of X iff any nonempty open set that contains A also contains its closure.

Example A.3.2. Each finite subset of R^n is nowhere dense in R^n; the set {1/n}_{n=1}^{∞} is nowhere dense in R. The Cantor set C is nowhere dense in [0, 1] because every neighborhood of any point in C must contain, by its very construction, a point with 1 in its ternary representation. That the interior and the interior of the closure of a set are not necessarily the same is seen in the example of the rationals in the reals: The set of rational numbers Q has empty interior because any neighborhood of a rational number contains irrational numbers (so also is the case for irrational numbers) and R = Int(Cl(Q)) ⊇ Int(Q) = ∅ justifies the notion of a nowhere dense set.

The following properties of C can be taken to define any subset of a topological space as a Cantor set; set-theoretically it should be clear from its classical middle-third construction that the Cantor set consists of all points of the closed interval [0, 1] whose infinite triadic (base 3) representation, expressed so as not to terminate with an infinite string of 1’s, does not contain the digit 1. Accordingly, any end point of the infinite set of closed intervals whose intersection yields the Cantor set is represented by a repeating string of either 0 or 2, while a non end point has every other arbitrary collection of these two digits. Recalling that any number in [0, 1] is a rational iff its representation in any base is terminating or recurring (thus any decimal that neither repeats nor terminates but consists of all possible sequences of all possible digits represents an irrational number), it follows that both rationals and irrationals belong to the Cantor set.

Fig. 25. Construction of the classical 1/3-Cantor set. The end points of C_3, for example, in increasing order are: |0, 1/27|; |2/27, 1/9|; |2/9, 7/27|; |8/27, 1/3|; |2/3, 19/27|; |20/27, 7/9|; |8/9, 25/27|; |26/27, 1|. C_i is the union of 2^i pairwise disjoint closed intervals each of length 3^(−i) and the non-empty infinite intersection C = ⋂_{i=0}^{∞} C_i is the adherent Cantor set of the filter-base of closed sets {C_0, C_1, C_2, ...}.

(C1) C is totally disconnected. If possible, let C have a component containing points a and b with a < b. Then [a, b] ⊆ C ⇒ [a, b] ⊆ C_i for all i. But this is impossible because we may choose i large enough to have 3^(−i) < b − a so that a and b must belong to two different members of the pairwise disjoint closed 2^i subintervals each of length 3^(−i) that constitute C_i. Hence

[a, b] is not a subset of any C_i ⇒ [a, b] is not a subset of C.

(C2) C is perfect so that for any x ∈ C every neighborhood of x must contain some other point of C. Supposing to the contrary that the singleton {x} is an open set of C, there must be an ε > 0 such that in the usual topology of R
\[ \{x\} = C \cap (x - \varepsilon, x + \varepsilon). \tag{A.52} \]
Choose a positive integer i large enough to satisfy 3^(−i) < ε. Since x is in every C_i, it must be in one of the 2^i pairwise disjoint closed intervals [a, b] ⊂ (x − ε, x + ε) each of length 3^(−i) whose union is C_i. As [a, b] is an interval, at least one of the end points of [a, b] is different from x, and since an end point belongs to C, C ∩ (x − ε, x + ε) must also contain this point thereby violating Eq. (A.52).


(C3) C is nowhere dense because each neighborhood of any point of C intersects Ext(C); see Theorem A.3.6.
(C4) C is compact because it is a closed subset contained in the compact subspace [0, 1] of R, see Theorem A.3.3. The compactness of [0, 1] follows from the Heine-Borel Theorem which states that any subset of the real line is compact iff it is both closed and bounded with respect to the Euclidean metric on R.

Compare (C1) and (C2) with the essentially similar arguments of Example A.3.1(2) for the subspace of rationals in R.

A.4. Neutron Transport Theory

This section introduces the reader to the basics of the linear neutron transport theory where graphical convergence approximations to the singular distributions, interpreted here as multifunctions, led to the study of this paper. The one-speed (that is mono-energetic) neutron transport equation in one dimension and plane geometry, is
\[ \mu \frac{\partial \Phi(x, \mu)}{\partial x} + \Phi(x, \mu) = \frac{c}{2} \int_{-1}^{1} \Phi(x, \mu')\, d\mu', \qquad 0 < c < 1, \quad -1 \le \mu \le 1 \tag{A.53} \]
where x is a non-dimensional physical space variable that denotes the location of the neutron moving in a direction θ = cos⁻¹(µ), Φ(x, µ) is a neutron density distribution function such that Φ(x, µ)dxdµ is the expected number of neutrons in a distance dx about the point x moving at constant speed with their direction cosines of motion in dµ about µ, and c is a physical constant that will be taken to satisfy the restriction shown above. Case’s method starts by assuming the solution to be of the form Φ_ν(x, µ) = e^(−x/ν) φ(µ, ν) with a normalization integral constraint of ∫_{−1}^{1} φ(µ, ν)dµ = 1 to lead to the simple equation
\[ (\nu - \mu)\phi(\mu, \nu) = \frac{c\nu}{2} \tag{A.54} \]
for the unknown function φ(ν, µ). Case then suggested, see [Case & Zweifel, 1967], the non-simple complete solution of this equation to be
\[ \phi(\mu, \nu) = \frac{c\nu}{2} P \frac{1}{\nu - \mu} + \lambda(\nu)\delta(\nu - \mu), \tag{A.55} \]
where λ(ν) is the usual combination coefficient of the solutions of the homogeneous and non-homogeneous parts of a linear equation, P(·) is a principal value and δ(x) the Dirac delta, to lead to the full-range −1 ≤ µ ≤ 1 solution valid for −∞ < x < ∞
\[ \Phi(x, \mu) = a(\nu_0) e^{-x/\nu_0} \phi(\mu, \nu_0) + a(-\nu_0) e^{x/\nu_0} \phi(-\nu_0, \mu) + \int_{-1}^{1} a(\nu) e^{-x/\nu} \phi(\mu, \nu)\, d\nu \tag{A.56} \]
of the one-speed neutron transport equation (A.53). Here the real ν_0 and ν satisfy respectively the integral constraints
\[ \frac{c\nu_0}{2} \ln \frac{\nu_0 + 1}{\nu_0 - 1} = 1, \qquad |\nu_0| > 1 \]
\[ \lambda(\nu) = 1 - \frac{c\nu}{2} \ln \frac{1 + \nu}{1 - \nu}, \qquad \nu \in [-1, 1], \]
with
\[ \phi(\mu, \nu_0) = \frac{c\nu_0}{2} \frac{1}{\nu_0 - \mu} \]
following from Eq. (A.55).

It can be shown [Case & Zweifel, 1967] that the eigenfunctions φ(ν, µ) satisfy the full-range orthogonality condition
\[ \int_{-1}^{1} \mu \phi(\nu, \mu) \phi(\nu', \mu)\, d\mu = N(\nu) \delta(\nu - \nu'), \]
where the odd normalization constants N are given by
\[ N(\pm\nu_0) = \int_{-1}^{1} \mu \phi^2(\pm\nu_0, \mu)\, d\mu = \pm \frac{c\nu_0^3}{2} \left( \frac{c}{\nu_0^2 - 1} - \frac{1}{\nu_0^2} \right) \qquad \text{for } |\nu_0| > 1, \]
and
\[ N(\nu) = \nu \left[ \lambda^2(\nu) + \left( \frac{\pi c \nu}{2} \right)^2 \right] \qquad \text{for } \nu \in [-1, 1]. \]
With a source of particles ψ(x_0, µ) located at x = x_0 in an infinite medium, Eq. (A.56) reduces to the boundary condition, with µ, ν ∈ [−1, 1],
\[ \psi(x_0, \mu) = a(\nu_0) e^{-x_0/\nu_0} \phi(\mu, \nu_0) + a(-\nu_0) e^{x_0/\nu_0} \phi(-\nu_0, \mu) + \int_{-1}^{1} a(\nu) e^{-x_0/\nu} \phi(\mu, \nu)\, d\nu \tag{A.57} \]


for the determination of the expansion coefficients a(±ν_0), {a(ν)}_{ν∈[−1,1]}. Use of the above orthogonality integrals then leads to the complete solution of the problem to be
\[ a(\nu) = \frac{e^{x_0/\nu}}{N(\nu)} \int_{-1}^{1} \mu \psi(x_0, \mu) \phi(\mu, \nu)\, d\mu, \qquad \nu = \pm\nu_0 \text{ or } \nu \in [-1, 1]. \]
For example, in the infinite-medium Green's function problem with x_0 = 0 and ψ(x_0, µ) = δ(µ − µ_0)/µ, the coefficients are a(±ν_0) = φ(µ_0, ±ν_0)/N(±ν_0) when ν = ±ν_0, and a(ν) = φ(µ_0, ν)/N(ν) for ν ∈ [−1, 1].

For a half-space 0 ≤ x < ∞, the obvious reduction of Eq. (A.56) to 0 ≤ µ ≤ 1,
\[ \Phi(x, \mu) = a(\nu_0) e^{-x/\nu_0} \phi(\mu, \nu_0) + \int_{0}^{1} a(\nu) e^{-x/\nu} \phi(\mu, \nu)\, d\nu \tag{A.58} \]
with boundary condition, µ, ν ∈ [0, 1],
\[ \psi(x_0, \mu) = a(\nu_0) e^{-x_0/\nu_0} \phi(\mu, \nu_0) + \int_{0}^{1} a(\nu) e^{-x_0/\nu} \phi(\mu, \nu)\, d\nu, \tag{A.59} \]
leads to an infinitely more difficult determination of the expansion coefficients due to the more involved nature of the orthogonality relations of the eigenfunctions in the half-interval [0, 1] that now reads for ν, ν′ ∈ [0, 1] [Case & Zweifel, 1967]
\[
\begin{aligned}
\int_{0}^{1} W(\mu) \phi(\mu, \nu') \phi(\mu, \nu)\, d\mu &= \frac{W(\nu) N(\nu)}{\nu}\, \delta(\nu - \nu') \\
\int_{0}^{1} W(\mu) \phi(\mu, \nu_0) \phi(\mu, \nu)\, d\mu &= 0 \\
\int_{0}^{1} W(\mu) \phi(\mu, -\nu_0) \phi(\mu, \nu)\, d\mu &= c \nu \nu_0 X(-\nu_0) \phi(\nu, -\nu_0) \\
\int_{0}^{1} W(\mu) \phi(\mu, \pm\nu_0) \phi(\mu, \nu_0)\, d\mu &= \mp \left( \frac{c\nu_0}{2} \right)^2 X(\pm\nu_0) \\
\int_{0}^{1} W(\mu) \phi(\mu, \nu_0) \phi(\mu, -\nu)\, d\mu &= \frac{c^2 \nu \nu_0}{4} X(-\nu) \\
\int_{0}^{1} W(\mu) \phi(\mu, \nu') \phi(\mu, -\nu)\, d\mu &= \frac{c\nu'}{2} (\nu_0 + \nu) \phi(\nu', -\nu) X(-\nu)
\end{aligned}
\tag{A.60}
\]
where the half-range weight function W(µ) is defined as
\[ W(\mu) = \frac{c\mu}{2(1 - c)(\nu_0 + \mu) X(-\mu)} \tag{A.61} \]
in terms of the X-function
\[ X(-\mu) = \exp \left\{ -\frac{c}{2} \int_{0}^{1} \frac{\nu}{N(\nu)} \left[ 1 + \frac{c\nu^2}{1 - \nu^2} \right] \ln(\nu + \mu)\, d\nu \right\}, \]
that is conveniently obtained from a numerical solution of the nonlinear integral equation
\[ \Omega(-\mu) = 1 - \frac{c\mu}{2(1 - c)} \int_{0}^{1} \frac{\nu_0^2 (1 - c) - \nu^2}{(\nu_0^2 - \nu^2)(\mu + \nu) \Omega(-\nu)}\, d\nu \tag{A.62} \]
to yield
\[ X(-\mu) = \frac{\Omega(-\mu)}{\mu + \nu_0 \sqrt{1 - c}}, \]
and X(±ν_0) satisfy
\[ X(\nu_0) X(-\nu_0) = \frac{\nu_0^2 (1 - c) - 1}{2(1 - c) \nu_0^2 (\nu_0^2 - 1)}. \]
Two other useful relations involving the W-function are given by ∫_0^1 W(µ)φ(µ, ν_0)dµ = cν_0/2 and ∫_0^1 W(µ)φ(µ, ν)dµ = cν/2.

The utility of these full- and half-range orthogonality relations lies in the fact that a suitable class of functions of the type that is involved here can always be expanded in its terms, see [Case & Zweifel, 1967]. An example of this for a full-range problem has been given above; we end this introduction to the generalized (traditionally known as singular in neutron transport theory) eigenfunction method with two examples of the application of half-range orthogonality integrals to the half-space problems A and B of Sec. 5.

Problem A (The Milne Problem). In this case there is no incident flux of particles from outside the medium at x = 0, but for large x > 0 the neutron distribution inside the medium behaves like e^(x/ν_0) φ(−ν_0, µ). Hence the boundary condition


(A.59) at x = 0 reduces to
\[ -\phi(\mu, -\nu_0) = a_A(\nu_0) \phi(\mu, \nu_0) + \int_{0}^{1} a_A(\nu) \phi(\mu, \nu)\, d\nu, \qquad \mu \ge 0. \]
Use of the fourth and third equations of Eq. (A.60) and the explicit relation Eq. (A.61) for W(µ) gives respectively the coefficients
\[
\begin{aligned}
a_A(\nu_0) &= \frac{X(-\nu_0)}{X(\nu_0)} \\
a_A(\nu) &= -\frac{1}{N(\nu)}\, c (1 - c) \nu_0^2 \nu X(-\nu_0) X(-\nu).
\end{aligned}
\tag{A.63}
\]
The extrapolated end point z_0 of Eq. (67) is related to a_A(ν_0) of the Milne problem by a_A(ν_0) = −exp(−2z_0/ν_0).

Problem B (The Constant Source Problem). Here the boundary condition at x = 0 is
\[ 1 = a_B(\nu_0) \phi(\mu, \nu_0) + \int_{0}^{1} a_B(\nu) \phi(\mu, \nu)\, d\nu, \qquad \mu \ge 0, \]
which leads, using the integral relations satisfied by W, to the expansion coefficients
\[
\begin{aligned}
a_B(\nu_0) &= -\frac{2}{c \nu_0 X(\nu_0)} \\
a_B(\nu) &= \frac{1}{N(\nu)} (1 - c) \nu (\nu_0 + \nu) X(-\nu),
\end{aligned}
\tag{A.64}
\]
where X(±ν_0) are related to Problem A as
\[
\begin{aligned}
X(\nu_0) &= \frac{1}{\nu_0} \sqrt{\frac{\nu_0^2 (1 - c) - 1}{2 a_A(\nu_0)(1 - c)(\nu_0^2 - 1)}} \\
X(-\nu_0) &= \frac{1}{\nu_0} \sqrt{\frac{a_A(\nu_0) \left[ \nu_0^2 (1 - c) - 1 \right]}{2(1 - c)(\nu_0^2 - 1)}}.
\end{aligned}
\]

This brief introduction to the singular eigenfunction method should convince the reader of the great difficulties associated with half-space, half-range methods in particle transport theory; note that the X-functions in the coefficients above must be obtained from numerically computed tables. In contrast, full-range methods are more direct due to the simplicity of the weight function µ, which suggests the full-range formulation of half-range problems presented in Sec. 5. Finally it should be mentioned that this singular eigenfunction method is based on the theory of singular integral equations.
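The full-range quantities of this section are simple enough to check numerically. The sketch below is an illustration under stated assumptions, not the procedure used in the paper: it finds the discrete eigenvalue ν_0 > 1 from the constraint (cν_0/2) ln[(ν_0 + 1)/(ν_0 − 1)] = 1 by bisection, confirms that the normalization integral of φ(µ, ν_0) equals 1, and compares the closed form for N(ν_0) with a direct quadrature of ∫ µ φ²(ν_0, µ) dµ over [−1, 1]. The function names, the composite Simpson rule and the sample values of c are all choices made here for illustration.

```python
import math


def dispersion(nu0, c):
    """Residual of (c*nu0/2) * ln((nu0 + 1)/(nu0 - 1)) = 1."""
    return 0.5 * c * nu0 * math.log((nu0 + 1.0) / (nu0 - 1.0)) - 1.0


def solve_nu0(c):
    """Bisection for the root nu0 > 1; assumes 0 < c < 1.

    The residual is positive just above 1 and tends to c - 1 < 0 as
    nu0 grows, so a sign change can always be bracketed unless the root
    is too close to 1 to resolve in double precision (very small c).
    """
    if not 0.0 < c < 1.0:
        raise ValueError("c must satisfy 0 < c < 1")
    lo, hi = 1.0 + 1e-12, 2.0
    if dispersion(lo, c) <= 0.0:
        raise ValueError("root too close to 1 for double precision")
    while dispersion(hi, c) > 0.0:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if dispersion(mid, c) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def phi0(mu, nu0, c):
    """Discrete eigenfunction phi(mu, nu0) = (c*nu0/2) / (nu0 - mu)."""
    return 0.5 * c * nu0 / (nu0 - mu)


def simpson(f, a, b, n=20000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0


def n_closed_form(nu0, c):
    """N(nu0) = (c*nu0^3/2) * (c/(nu0^2 - 1) - 1/nu0^2)."""
    return 0.5 * c * nu0**3 * (c / (nu0**2 - 1.0) - 1.0 / nu0**2)


def n_quadrature(nu0, c):
    """N(nu0) from its defining integral over -1 <= mu <= 1."""
    return simpson(lambda mu: mu * phi0(mu, nu0, c) ** 2, -1.0, 1.0)


if __name__ == "__main__":
    for c in (0.5, 0.9):   # sample values chosen for illustration
        nu0 = solve_nu0(c)
        print(c, nu0, n_closed_form(nu0, c), n_quadrature(nu0, c))
```

Nothing comparable is attempted here for the half-range case, where X(−µ) has to come from Eq. (A.62) or from tables; that asymmetry is the practical point made in the closing paragraph.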
